Oblivious RAM Done Right (2): [GO96]

Reading:

  1. Software Protection and Simulation on Oblivious RAMs” by Rafail Ostrovsky

1. Introduction:

This paper starts from the problem of software protection, here we mean “protection against illegitimate duplication”, i.e. ensuring there is no efficient method for creating executable copies of the software. Purely software-based solution doesn’t exist, so hardware must involve. Our task is to minimize the participation of hardware, so that the solution is less expensive.

An existing solution is “software-hardware-package (SH-package)“, that is, selling a physically shielded CPU together with an encrypted program. An instruction cycle of such CPU will be fetching -> decrypting -> executing an instruction. Every ATM machine has such a protected chip.

However, above solution is not good enough, since it doesn’t hide the “access pattern“, i.e. the addresses of the memory cells accessed during the execution are not kept secret. The lack of such property may create trouble, e.g. an adversary could learn the loop structure of the program, which is important in certain scenario. If we come up with a solution that hides the access pattern, we consider it as secure. Informally, “hiding access pattern” means no PPT adversary can distinguish these two cases:

Case 1: The adversary is experimenting with the real CPU that runs the encrypted program.

Case 2: The adversary is experimenting with a fake CPU that runs dummy program, such as “while True do skip”. After the same number of steps as in case 1, the CPU writes exactly the same outputs as in case 1 into memory.

To protect software to such level of security, the cost is “speed”, called software protection overhead. Informally, software protection overhead means the number of steps the protected program makes for each each step of the program. Our task is to minimize this overhead.

2. Model and Definitions:

A machine is oblivious if the sequence in which it accesses memory locations is equivalent for any two inputs with the same running time.

3. Reducing Software Protection to Oblivious Simulation of RAMs:

4. Square Root Solution

5. Hierarchical Solution

6. Lower Bound

Computational Complexity

Readings:

  1. Algorithm Design” by Kleinberg and Tardos
  2. Introduction to the Theory of Computation” by Michael Sipser
  3. UCLA CS 282A Notes by Rafail Ostrovsky

You don’t have to memorize the precise definition for these computational models, since most of the time we won’t use details in their defitions to construct proofs. However, a poor understanding on these topics creates barriers when reading paper, so it is neccessary to at least know what they are talking about.

1. Turing Machine:

Turing machines mimic the function of a computer, but it uses an infinitely-long tape as memory, so it is rather an ideal model. Turing machines can read and write, and use a “read-write head” to point to a specific location on the tape to access the content of that location. For a brief introduction, please refer to this page.

2. Polynomial Time (P):

P class contains all decision problems that a deterministic single-tape Turing machine can solve in polynomial time。Generally speaking, P contains problems of time complexity O(n^k), k is a natrual number. We can assume a modern computer is able to solve a problem of class P, but note that this “easy-to-compute” impression comes from the difficulty of NP class. A problem of complexity O(n^1000000) is definitely not easy to solve.

3. Non-Deterministic Polynomial Time (NP):

NP class contains all problems that are polynomially verifiable。For example, given integers p and q, it is easy to verify whether x = p * q is a composite or not, simply by doing multiplication. However, if we are given an integer x and we are asked to find a factorization of x into p and q, this task can be very difficult to do. Let’s call this problem the “composite problem”, and we say composite problem is polynomially verifiable but not polynomially decidable, so it is of class NP, not P. Similarly, we can assume NP problems are unsolvable on modern computer, although some of them are easy to solve in special cases.

4. P versus NP:

From 2 and 3, we can define P and NP in the following way:

  1. P = the class of languages for which membership can be decided quickly (in polynomial time)
  2. NP = the class of languages for which membership can be verified quickly (in polynomial time)

If P equals NP, then all polynomially verifiable problems are polynomially decidable,As a result, one-way function doesn’t exist,and we are unable to build most of the modern cryptographic theory without the existence of one-way function. Hence P does not equal NP is a foundamental assumption that we make all the time.

5. NP-Completeness:

6. Randomized Algorithm:

7. Probabilistic Polynomial Time (PPT):

8. Randomized Polynomial Time (RP):

9. Bounded Probabilistic Polynomial Time (BPP):

10. Non-Uniform Polynomial Time (P/Poly):

11. Oracle Machines:

An oracle machine can be viewed as a Turing machine + a black box structure, called an oracle. An oracle can answer any decision problem in a single operation. The decision problem can be of any complexity class, even undecidable problem, so oracle machine is a rather ideal model.

12. Random-Access Machine (RAM):

RAM is a class of register machines, can be viewed as a counting machine + the ability of indirect addressing. RAM has registers labelled R1,R2,R3,R4 and etc,Each register stores an integer。RAM has a control unit, which contains a program. RAM uses a program counter (PC) points to and execute certain statement in the program。Here is a graph of the RAM structure.

13. Some notes:

  1. log function has a change of base formula, so any log function can be changed to another base by multiplying a constant。As a result,O(log n) can represent log function with any base, so we don’t need to specify what kind of base number we are working with, and replace everything with logn.