Week 0: Introduction and Basic Data Structures

This 10-week data structures and algorithms series is going to be an essential project on my blog. I am preparing for a Google interview, which is the main reason I am writing it. Taking notes lets me record my journey and review the knowledge I learn along the way, and I hope it also serves as a resource for those of you preparing for future coding interviews.

A little background about myself: I am studying at UCLA, majoring in pure mathematics with a research orientation toward cryptography. I am a Python person, so Python is the official programming language on my website. I haven't received a formal CS education besides theoretical cryptography in school, which means I have gone through all of this coding interview preparation by self-studying. However, I don't consider it a disadvantage; in my opinion, self-studying can be even more effective than taking courses in college. So don't worry if you have little background in CS. As long as you follow my plan, I guarantee you can at least reach my level.

At the very beginning, you need an easy read to get started. Your first data structures and algorithms book should be entertaining and straightforward, just for the sake of your mental health. Please don't choose books like CLRS as your first book, unless you think you are von Neumann. As a starting point, I suggest you go with Grokking Algorithms. This book covers basic data structures too, so the title is a bit misleading. The code examples are either in Python or pseudocode. After reading it, I highly recommend buying this book without hesitation: Problem Solving with Algorithms and Data Structures Using Python. As I write this line, the price on Amazon is $40. It's definitely worth the money: it is comprehensive, easy to read, and simply enjoyable. More importantly, it uses Python. Writing Python in a coding interview will save you tons of time and energy, and you don't need to worry about integer overflow, memory allocation, and that kind of thing. However, I don't recommend using the code listed in this book directly in practice. For each topic, please refer to my posts and use the templates in them.

While reading these books, if you find LeetCode hard to follow and the problems a little difficult, I recommend checking out this program on educative.io. The first two courses (the data structure refresher and hacking the coding interview) can give you some practice alongside the reading. Note that the system design course in it is very helpful, so make sure you do that too. Spending at least 20 hours on system design is a good investment, even if you are a new grad (Google has started asking system design questions in internship interviews). Knowing basic system design concepts will make you more confident in an interview.

After these prerequisites, you can start reading my posts and doing LeetCode. The main focus of this series is the "medium" problems on LeetCode, which cover around 80% of a coding interview. Most people can handle medium problems with a proper study plan, so you should spend most of your time on them. Hard problems are also likely to appear if you are interviewing with big companies such as Google or Facebook. In any case, if you are aiming for a higher goal, it is always necessary to put in greater effort.

Enough talking; let's get to the point. Here is a long list of practice problems that you should do on LeetCode before moving to Week 1. These problems will give you a big picture of what to expect in my posts. I will only write solutions for the important problems, that is, the ones that appear in interviews more frequently than others or whose concepts matter for future study.

Time Complexity:

Make sure you know and understand the time complexity of operations on basic data structures, such as append, pop, and remove. The table can be found here:

Big-O Cheat Sheet

Why is it important? Suppose you want to optimize the code inside a loop to O(1). If you use the pop(0) method on a list to pop out the first element and assume the operation is O(1), your whole complexity analysis will be off. When you call pop(0), the method deletes the first element and shifts every remaining element left by one index, so it's O(n). These subtle details can be crucial for the correctness of algorithm design.
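To make the difference concrete, here is a small sketch contrasting list.pop(0) with collections.deque, whose popleft() runs in O(1):

```python
from collections import deque

# list.pop(0) must shift every remaining element left by one, so it is O(n).
items = [1, 2, 3, 4]
first = items.pop(0)        # items is now [2, 3, 4]

# deque.popleft() removes from the left end in O(1).
dq = deque([1, 2, 3, 4])
first_dq = dq.popleft()     # dq is now deque([2, 3, 4])
```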

List vs Linked List:

A Python list is a dynamic array that can hold different data types, so it's very different from a C array. This flexibility has a trade-off: a Python list takes more memory than a C array. So if you have a huge data set with only one data type and only need simple operations on it, you should go with a C-style array, available as the array module in Python. Insertion and deletion in a standard list take O(n) time in the worst case, since the elements after the insertion/deletion point must be shifted to keep the elements contiguous, which is exactly what allows O(1) random access. Do not confuse accessing with searching: during accessing, you pass in an index; during searching, you pass in a target value. Thus the best you can hope for when searching is O(\log n), using binary search on a sorted list, not O(1). In contrast, insertion and deletion in a linked list take O(1) time once you hold a reference to the node, since the only real operation is modifying pointers. Random access in a linked list takes O(n) time in the worst case, since we have to traverse from the head node and check each node one by one. Consider this trade-off when you need to pick a list or a linked list as the primary data structure for your task.

In the big picture, all data structures fall into these two categories: a data structure is either a special list or a special linked list. For example, a hash table is a special list and a tree is a special linked list. You will understand this better after learning all the data structures.
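To make the contrast concrete, here is a minimal singly linked list sketch in Python (the ListNode, build, and to_list names are my own, not a standard API):

```python
class ListNode:
    """One node of a singly linked list: a value plus a pointer."""
    def __init__(self, val):
        self.val = val
        self.next = None

def build(values):
    """Build a linked list from an iterable and return its head (or None)."""
    head = tail = None
    for v in values:
        node = ListNode(v)
        if head is None:
            head = tail = node
        else:
            tail.next = node   # O(1): just rewire a pointer
            tail = node
    return head

def to_list(head):
    """Traverse from the head, O(n), collecting values into a Python list."""
    out = []
    while head:
        out.append(head.val)
        head = head.next
    return out
```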

LeetCode #206. Reverse Linked List (easy) (memorize it)

LeetCode #24. Swap Nodes in Pairs (medium)

LeetCode #141. Linked List Cycle (easy) (memorize it)

LeetCode #142. Linked List Cycle II (medium)

LeetCode #25. Reverse Nodes in k-Group (hard)

More Problems:

LeetCode: Linked List

Stack and Queue:

Stack:

In Python, a stack is represented by a list; that is, we can initialize a stack with stack = []. A list supports append and pop at its right end, which perfectly follows the "First In, Last Out" (FILO) principle of a stack. As a fun fact, note that collections.deque() can be used as a stack too: if you only use the append() and pop() methods, the resulting data structure is FILO. I don't recommend using a deque as a stack, though, since it supports other methods like popleft(), and that can be problematic if you accidentally call the wrong one.
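A minimal list-as-stack sketch:

```python
stack = []
stack.append(1)    # push
stack.append(2)
stack.append(3)
top = stack.pop()  # pop removes the most recently pushed element (FILO)
```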

Queue:

In contrast, a queue is "First In, First Out" (FIFO), a mirror data structure of the stack. In Python, we always want to use queue = collections.deque() to realize the functionality of a queue. This may seem like more than necessary, but the other options are problematic, and here is why. If you use a regular list, insertion or deletion at the left end takes O(n) time, which is far from ideal. If you use queue.Queue, you get a class designed for communication between threads, which carries locking overhead you don't need in an interview. A deque avoids both of these disadvantages. A deque is implemented as a doubly linked list at a low level, so randomly accessing an element in the middle takes O(n) time due to traversal; hence we want to use the append and pop operations only, since they take O(1) time. Let's make a convention here to use append() and popleft() to represent a queue: visually, elements enter the queue from the right end and leave from the left end. In later posts you will see that the queue is heavily linked with BFS.
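Following the convention above (append() to enqueue, popleft() to dequeue), a minimal sketch:

```python
from collections import deque

queue = deque()
queue.append('a')        # enqueue at the right end
queue.append('b')
first = queue.popleft()  # dequeue from the left end: FIFO
```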

Sample Problems:

LeetCode #20. Valid Parentheses (easy) (memorize it)

LeetCode #225. Implement Stack using Queues (easy)

LeetCode #232. Implement Queue using Stacks (easy)

More Problems:

LeetCode: Queue & Stack

Priority Queue:
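This topic gets its own post later; as a quick sketch, the standard way to get a priority queue in Python is the heapq module, which maintains a binary min-heap on top of a plain list:

```python
import heapq

heap = []
for x in [5, 1, 4, 2]:
    heapq.heappush(heap, x)     # O(log n) per insertion

smallest = heapq.heappop(heap)  # O(log n); always removes the minimum
# For "Kth largest" style problems, keep a min-heap of size k instead.
```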

Sample Problems:

LeetCode #703. Kth Largest Element in a Stream (easy)

LeetCode #239. Sliding Window Maximum (hard)

Hash Table and Set:

In Python, a hash table is represented by a dictionary. It supports search, insertion, and deletion in O(1) average time, which sounds almost too good to be true. The trade-off is that a dictionary does not keep its keys in sorted order. If you want to store elements in an ordered structure, you can use a binary search tree, which (when balanced) supports search, insertion, and deletion in O(\log n) time. All operations on both data structures can decay to O(n) time in the worst case, and both have O(n) space complexity.

Also, make sure you understand the theory behind hash tables, such as why a hash table is fast for lookup (it uses a hash function for indexing), how to design a proper hash function (it should be fast and cause minimal hash collisions), and how to handle hash collisions when they occur (open addressing and chaining).
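To illustrate chaining, here is a toy sketch (the class and method names are my own, and this is not production code):

```python
class ChainedHashTable:
    """Toy hash table using separate chaining: each slot holds a bucket list."""
    def __init__(self, capacity=8):
        self.buckets = [[] for _ in range(capacity)]

    def _bucket(self, key):
        # hash() maps the key to an integer; modulo picks a bucket index
        return self.buckets[hash(key) % len(self.buckets)]

    def put(self, key, value):
        bucket = self._bucket(key)
        for i, (k, _) in enumerate(bucket):
            if k == key:
                bucket[i] = (key, value)  # overwrite an existing key
                return
        bucket.append((key, value))       # collision: chain in the same bucket

    def get(self, key):
        for k, v in self._bucket(key):    # scan the chain, O(chain length)
            if k == key:
                return v
        raise KeyError(key)
```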

A set is an unordered collection of unique elements, which means it contains no duplicates. It is often used to record the elements that have been visited, for example, the set seen in Two Sum and BFS.
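For instance, here is a sketch of the classic one-pass hash map solution to Two Sum; here seen maps values to indices rather than being a plain set:

```python
def two_sum(nums, target):
    """Return indices of the two numbers that add up to target, else []."""
    seen = {}  # value -> index where we saw it
    for i, x in enumerate(nums):
        if target - x in seen:           # O(1) average lookup
            return [seen[target - x], i]
        seen[x] = i
    return []
```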

Sample Problems:

LeetCode #242. Valid Anagram (easy)

LeetCode #1. Two Sum (easy) (memorize it)

LeetCode #15. 3Sum (medium) (memorize it)

LeetCode #18. 4Sum (medium)

More Problems:

LeetCode: Hash Table

Binary Tree and Binary Search Tree (BST):

A binary tree has a natural divide-and-conquer flavor, since a problem on a binary tree almost always reduces to subproblems on its left and right subtrees. Thus, most binary tree questions have easy recursive solutions. However, you should memorize the iterative solutions too, in case a follow-up question comes after.

The performance of a BST was mentioned in the last section, in contrast with the hash table. One important thing to note is the definition of a BST: every node in the left subtree must have a smaller value than the root, and every node in the right subtree must have a larger value than the root. A very common mistake is to compare only the left child and right child with the root, which is a misunderstanding of the definition.
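A sketch of the bound-passing recursion that avoids this mistake (my own helper, one possible approach to LeetCode #98):

```python
class TreeNode:
    def __init__(self, val):
        self.val = val
        self.left = None
        self.right = None

def is_valid_bst(root, low=float('-inf'), high=float('inf')):
    """Each node must lie strictly between bounds inherited from ALL of its
    ancestors; comparing a node only with its two children is not enough."""
    if root is None:
        return True
    if not low < root.val < high:
        return False
    return (is_valid_bst(root.left, low, root.val) and
            is_valid_bst(root.right, root.val, high))
```

For example, a tree with root 10, right child 15, and 15's left child 6 passes the child-only comparison, but fails here because 6 sits in the right subtree of 10.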

Sample Problems:

LeetCode #98. Validate Binary Search Tree (medium)

LeetCode #236. Lowest Common Ancestor of a Binary Tree (medium)

LeetCode #235. Lowest Common Ancestor of a Binary Search Tree (easy)

More problems:

LeetCode: Binary Tree

LeetCode: Binary Search Tree

Week 2: BFS and Topological Sorting

Breadth-first search (BFS) appears in interviews A LOT. It is simple and versatile, so it is worth spending some time to tackle it. One cool thing about BFS is that we can solve most BFS problems using two templates, which I will show you later. If you study really hard, it is possible to master BFS in a single day (expect 8 hours of effort). BFS uses a queue to keep track of the elements waiting to be searched. In Python, we use collections.deque() to realize this functionality. We also need a set to record the nodes that have already been visited; if we don't, the algorithm may fall into an infinite loop.

When do we use BFS? There are three major scenarios:

  1. Graph Traversal (level order traversal, connected components, topological sorting)
  2. Finding shortest path in simple graph (every edge has equal weight)
  3. Iteration solution for all possible results (will appear in later posts)

There are two types of BFS: without level order and with level order. We will begin with the simpler case, BFS without level order.

BFS without Level Order:

If we are not asked to record level order, we simply visit the neighbors of the head, then the neighbors of those neighbors, and so on, one by one. If we are asked to record level order, we insert an additional for loop inside the while loop to traverse all nodes at a certain level before moving on to the next, level by level. That is the only difference, a simple modification of the code. Let's take a look at the simpler case first:

"""
BFS without level order
"""
from collections import deque

seen = set()
queue = deque()

seen.add(start)
queue.append(start)
while len(queue):
    head = queue.popleft()
    for neighbor in head.neighbors:
        if neighbor not in seen:
            seen.add(neighbor)
            queue.append(neighbor)

One thing to note: for a graph BFS, seen and queue always come in pairs. We initialize them together, and whenever one of them is updated, the other needs to be updated too. You may use some silly phrase like "have you seen the queue today?" to help memorization. If we run BFS on a binary tree, we don't need seen, because trees contain no cycles, so there is no danger of an infinite loop. Also, note that in a binary tree the neighbors of a node are its left child and right child (if any), so the for loop is replaced by two if statements.

Sample Problems:

LeetCode #133. Clone Graph (medium)

Given a reference of a node in a connected undirected graph, return a deep copy (clone) of the graph. Each node in the graph contains a val (int) and a list (List[Node]) of its neighbors.

Example:

Input:
{"$id":"1","neighbors":[{"$id":"2","neighbors":[{"$ref":"1"},{"$id":"3","neighbors":[{"$ref":"2"},{"$id":"4","neighbors":[{"$ref":"3"},{"$ref":"1"}],"val":4}],"val":3}],"val":2},{"$ref":"4"}],"val":1}

Explanation:
Node 1's value is 1, and it has two neighbors: Node 2 and 4.
Node 2's value is 2, and it has two neighbors: Node 1 and 3.
Node 3's value is 3, and it has two neighbors: Node 2 and 4.
Node 4's value is 4, and it has two neighbors: Node 1 and 3.

Note:

  1. The number of nodes will be between 1 and 100.
  2. The undirected graph is a simple graph, which means no repeated edges and no self-loops in the graph.
  3. Since the graph is undirected, if node p has node q as neighbor, then node q must have node p as neighbor too.
  4. You must return the copy of the given node as a reference to the cloned graph.

Solution:
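The solution was left blank here, so below is one possible BFS solution following the template above; the Node class mirrors LeetCode's definition:

```python
from collections import deque

class Node:
    def __init__(self, val, neighbors=None):
        self.val = val
        self.neighbors = neighbors if neighbors is not None else []

class Solution:
    def cloneGraph(self, node):
        """BFS over the original graph, mapping each node to its clone."""
        if node is None:
            return None
        clones = {node: Node(node.val)}   # original node -> cloned node
        queue = deque([node])
        while queue:
            head = queue.popleft()
            for neighbor in head.neighbors:
                if neighbor not in clones:
                    clones[neighbor] = Node(neighbor.val)
                    queue.append(neighbor)
                # wire the cloned edge; each directed edge is handled once
                clones[head].neighbors.append(clones[neighbor])
        return clones[node]
```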

BFS with level order:

As mentioned, now we need to insert an additional for loop into the code. Here is how it works:

"""
BFS with level order
"""
from collections import deque

seen = set()
queue = deque()

seen.add(start)
queue.append(start)
while queue:
    size = len(queue)
    for _ in range(size):
        head = queue.popleft()
        for neighbor in head.neighbors:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)

Sample Problems:

LeetCode #102. Binary Tree Level Order Traversal (medium)

Given a binary tree, return the level order traversal of its nodes' values (i.e., from left to right, level by level).

For example:
Given binary tree [3,9,20,null,null,15,7],

    3
   / \
  9  20
    /  \
   15   7

return its level order traversal as:

[
  [3],
  [9,20],
  [15,7]
]

Solution:

# Definition for a binary tree node.
# class TreeNode:
#     def __init__(self, x):
#         self.val = x
#         self.left = None
#         self.right = None

class Solution:
    def levelOrder(self, root: TreeNode) -> List[List[int]]:
        if not root:
            return []
        
        queue = collections.deque([root])
        level_order = []
        
        while queue:
            level = []
            size = len(queue)
            for _ in range(size):
                node = queue.popleft()
                level.append(node.val)
                if node.left:
                    queue.append(node.left)
                if node.right:
                    queue.append(node.right)
            level_order.append(level)
            
        return level_order

Next, we are going to introduce topological sorting. Topological sorting is a hidden gem that many people forget to prepare for coding interviews. The topic takes only a few pages in standard algorithm textbooks, but it is a rather important one that most tech companies want you to know.

Time Complexity:

In short, the time complexity of BFS is O(|V|+|E|), where |V| is the number of nodes and |E| is the number of edges. Why? The while loop executes at most once per vertex, since we use the set seen to avoid revisiting any node; this gives O(|V|). The for loop inside the while loop executes at most once per edge, since each node is popped from the queue at most once, and we visit its neighbors through its edges only when it is popped; this gives O(|E|). Combining the two loops gives O(|V|+|E|).

Topological Sorting:

Topological sorting works for directed acyclic graphs (DAGs) only, and the goal is to find a topological order of the given graph. The graph may have more than one topological order, and we only need to find one of them. By topological order, we mean a linear ordering of the nodes such that for every directed edge (u, v), node u comes before node v in the ordering.

Algorithm:

  1. Record the in-degrees of all nodes, and initialize an empty list topological_order.
  2. Append all nodes with in-degree 0 to a queue; these are the start nodes.
  3. Pop nodes from the queue and feed them to topological_order one by one. Each time we release a node, we visit all its neighbors and subtract 1 from each neighbor's in-degree.
  4. If any node's in-degree drops to 0 after this subtraction, append that node to the queue.
  5. When the queue becomes empty, stop the while loop and return topological_order.

Template:

"""
Definition for a Directed graph node
class DirectedGraphNode:
    def __init__(self, x):
        self.label = x
        self.neighbors = []
"""

import collections

class Solution:
    """
    @param graph: A list of Directed graph node
    @return: A list of integer
    """
    def topSort(self, graph):
        node_to_indegree = self.get_indegree(graph)

        # BFS
        topological_order = []
        start_nodes = [n for n in graph if node_to_indegree[n] == 0]
        queue = collections.deque(start_nodes)

        while queue:
            node = queue.popleft()
            topological_order.append(node)
            for neighbor in node.neighbors:
                node_to_indegree[neighbor] -= 1
                if node_to_indegree[neighbor] == 0:
                    queue.append(neighbor)
                
        return topological_order
    
    # Record indegree of all nodes
    def get_indegree(self, graph):
        node_to_indegree = {x: 0 for x in graph}

        for node in graph:
            for neighbor in node.neighbors:
                node_to_indegree[neighbor] += 1
                
        return node_to_indegree

Follow-up: Bidirectional BFS

LeetCode Time!

Week 1: Basic Algorithms

Continuing from last week, we are going to go over the basic algorithms that appear frequently in coding interviews. Note that the content below is just an overview; we will discuss each topic later in full detail. This week, just learn what they are and roughly how they work, in BFS manner.

Recursion and Divide and Conquer:

LeetCode #50. Pow(x, n) (medium)

LeetCode #169. Majority Element (easy)

More problems can be found here:

LeetCode: Recursion I

LeetCode: Recursion II

Greedy Algorithm:

A greedy algorithm is not a specific algorithm; maybe we should just call it the "greedy approach". If you design a perfect algorithm using the greedy approach for one problem, most likely it won't apply to another problem, even if the two look similar. That is, the greedy approach is not universal, and it requires quite a bit of cleverness. As a result, this category doesn't appear in coding interviews that often, which means you don't have to study this topic especially hard. Your task this week is just to read through and memorize the following solutions, and you are good to go.

LeetCode #169: Majority Element

Given an array of size n, find the majority element. The majority element is the element that appears more than ⌊ n/2 ⌋ times.

You may assume that the array is non-empty and the majority element always exist in the array.

Example 1:

Input: [3,2,3]
Output: 3

Example 2:

Input: [2,2,1,1,1,2,2]
Output: 2

Algorithm:

The main idea is that if we eliminate every pair consisting of the majority element and a non-majority element, the majority element is still the majority of what remains, since it appears more than ⌊ n/2 ⌋ times. The algorithm is the following: assume the first element of nums is the majority element, and initialize count to 1. Loop over the rest of nums. If the current element equals the candidate, increment count; otherwise, decrement count. Whenever count drops to 0, take the current element as the new candidate and keep looping. The code is here:

Solution:

class Solution:
    def majorityElement(self, nums: List[int]) -> int:
        majority = nums[0]
        count = 1
        for i in range(1, len(nums)):
            if count == 0:
                # the old candidate has been fully cancelled out;
                # adopt the current element as the new candidate
                majority = nums[i]
                count = 1
            elif majority == nums[i]:
                count += 1
            else:
                count -= 1
        return majority

Analysis:

This algorithm uses a for loop that runs n - 1 times, and all operations inside the loop take O(1) time, so the overall running time is O(n). It is also an in-place algorithm, so it takes O(1) extra space.

LeetCode #321: Create Maximum Number

LeetCode #122. Best Time to Buy and Sell Stock II (easy)

BFS and Topological Sorting:

DFS:

Rabin-Karp Algorithm:

LeetCode #28. Implement strStr() (easy)
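This section is a placeholder for a later post; as a preview, here is a minimal Rabin–Karp sketch (the base and modulus are arbitrary choices of mine) that can solve problems like #28:

```python
def rabin_karp(haystack, needle, base=256, mod=10**9 + 7):
    """Return the first index where needle occurs in haystack, else -1.
    A rolling polynomial hash lets us slide the window in O(1) per step."""
    n, m = len(haystack), len(needle)
    if m == 0:
        return 0
    if m > n:
        return -1
    high = pow(base, m - 1, mod)      # weight of the window's leading char
    target = window = 0
    for i in range(m):
        target = (target * base + ord(needle[i])) % mod
        window = (window * base + ord(haystack[i])) % mod
    for i in range(n - m + 1):
        # on a hash match, compare the actual strings to rule out collisions
        if window == target and haystack[i:i + m] == needle:
            return i
        if i < n - m:                 # roll the window one character right
            window = (window - ord(haystack[i]) * high) % mod
            window = (window * base + ord(haystack[i + m])) % mod
    return -1
```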

Manacher’s Algorithm:

Ad Hoc Multi-Input Functional Encryption (aMIFE)

Readings:

  1. Functional Encryption: Definitions and Challenges
  2. Multi-Input Functional Encryption
  3. Ad Hoc Multi-Input Functional Encryption

Overview:

In this paper, the authors prove two main results:

  1. aMIFE for general functionality can be constructed from standard MIFE + general FE + function re-runnable two-round MPC.
  2. As a special case, aMIFE for the inner product functionality can be constructed from a standard assumption: Learning With Errors (LWE).

To figure out how it is achieved, we need to start from the building blocks:

Functional Encryption (FE):

Consider a cloud image storage service. Law enforcement may require the cloud to search for images containing a particular face. To protect users' privacy (if there is still any), the cloud needs a restricted secret key that decrypts only the target image, revealing only the target face in the clear while keeping all other parts of the image blurred. In this scenario, the secret key reveals only a function of the plaintext image. Traditional public key cryptography is an "all or nothing" scheme: either we can decrypt and read the entire plaintext, or we cannot decrypt it and learn nothing about the plaintext. Here functional encryption (FE) comes to the rescue. Just as in traditional encryption, in FE, ciphertexts can be generated with an encryption key. However, each decryption key is associated with a function f, and decrypting an encryption of m using this key yields not m but f(m). Informally, the security definition of FE requires that nothing more than f(m) can be learned from the encryption of m and the decryption key for f.

Multi-Input Functional Encryption (MIFE):

Note that FE only allows computing a function f on a single ciphertext. Moving one step further, what if we want to compute a function defined over multiple plaintexts, given their ciphertexts each encrypted under a different key? Now we need to introduce MIFE. In MIFE, decryption keys allow computing joint functions of the different plaintexts underlying multiple ciphertexts. That is, decryption takes a key for a function f and ciphertexts c_1, ... , c_n encrypting m_1, ... , m_n, and outputs f(m_1, ... , m_n). The interaction pattern of MIFE can be summarized in the following steps:

  1. Suppose l users are involved. This number l is sent to a trusted global setup, which outputs a master secret key MSK and a set of l encryption keys \{EK_1, EK_2, ..., EK_l\}, one for each user.
  2. The global setup picks a function f and generates a decryption key DK_f using MSK and f; the subscript indicates that this decryption key is associated with f.
  3. The global setup publishes the public parameters to all users and distributes each encryption key EK_i to the user with index i. Note that this step can be done in parallel with step 2.
  4. Each user encrypts their plaintext m_i with encryption key EK_i to get a ciphertext c_i.
  5. All users send their ciphertexts to the aggregator, who computes the result y = f(m_1, m_2, ..., m_l) using the decryption key DK_f.

Motivation:

The motivation of aMIFE comes from these two drawbacks of the standard MIFE presented above:

  1. In MIFE, encryption and decryption keys are generated via a global setup procedure run by an external entity called the "key authority". This key authority can decrypt all ciphertexts gathered from users, and it also picks which function f is computed. Therefore, it must be trusted for the scheme to remain secure.
  2. In MIFE, the function arity must be declared at setup time, so the scheme does not support a dynamic setting in which users can join or leave.

Challenges:

Where is the challenge in fixing these two drawbacks?

MPC:

For eliminating the key authority, a natural idea is to use MPC to replace it. However, naively using MPC as setup would introduce interaction between users, which MIFE doesn’t allow. Hence we can try Two-round MPC.

Two-round MPC:

Here is how it works:

  1. Round One: Assume the number of users n and the function f to be computed, of arity n, are predetermined. Each user feeds their index i and plaintext m into the first round of MPC, and gets a first round message \rho^{(1)} and a secret s.
  2. Round Two: All users publish their first round messages \rho^{(1)}. Using all of these messages, together with their own secret s, each user computes a second round message \rho^{(2)}.
  3. Compute Result: All users send their second round messages to an aggregator, and the aggregator computes the final result y.

However, naively using a two-round MPC does not suffice, as it introduces two issues:

  1. This template publishes fresh public parameters PP for each DK_f, but MIFE requires publishing PP only once.
  2. EK_i is not known given just the first round MPC messages, so users are not able to encrypt.

Solution to 1: Function re-runnable two-round MPC:

We require the first round MPC message to be sent only once and reused for all subsequent second round messages; this property is called "function re-runnable". Moreover, we require the number of users to be declared during the second round of MPC, instead of when the first round is sent; this property is called "function delayed".

Solution to 2: Delaying Encryption:

We allow the users to delay encryption until EK_i is known. During setup, each user runs an FE scheme to get an FE master key. This master key is used to compute the first round MPC message. Additionally, each user encrypts their input m_i using FE.Encrypt.

aMIFE Interaction Pattern:

We want each user to run a local setup, which replaces the trusted global setup of standard MIFE. In addition, we want the number of users to be undetermined at setup time, so that users can join or leave the computation. For simplicity, we present aMIFE in the private-key setting:

  1. Each user i runs a local setup to generate some public parameters and a private encryption key EK_i.
  2. Each user publishes their public parameters and encrypts their input m_i using their private encryption key only, nothing else.
  3. Using their private encryption keys, users can issue partial decryption keys to the aggregator. Each partial decryption key is generated from the public parameters of the l - 1 other users.
  4. If those l - 1 other users also issue the matching partial decryption keys for f to the aggregator, then the aggregator can decrypt the ciphertexts c_1, c_2, ..., c_l using all the partial decryption keys PDK_{1,f}, PDK_{2,f}, ..., PDK_{l,f}. The final result is y = f(m_1, m_2, ..., m_l).

But how can the global setup be replaced by local setups? The idea is to nest FE, MIFE, and two-round MPC together.

aMIFE for General Functionalities:

Now we are ready to present the complete aMIFE scheme for general functionalities:

  1. Local Setup: Each user runs an FE scheme to generate an FE master secret key MSK_{FE}. This MSK_{FE} is fed into the first round of MPC to get the first round protocol message \rho^{(1)} and a secret s. Set the public parameter PP = \rho^{(1)} and the master secret key MSK = (MSK_{FE}, s). Note that the global setup of standard MIFE is bypassed using FE and the first round of MPC.
  2. Key Generation: Suppose l users are chosen to participate in the computation; these l users publish their public parameters and master secret keys. Each user runs the second round of MPC to generate a second round message \rho^{(2)}, which serves as the partial decryption key PDK_i.
  3. Encryption:
  4. Decryption:

“Ad hoc Friendliness”:

We say a MIFE scheme is “ad hoc friendly” if it satisfies the following three conditions:

  1. Decentralized Setup: The encryption keys EK_i and the partial master secret key MSK_i for each user can be generated locally.
  2. Local Encryption: Each user is able to encrypt using their encryption key EK_i and plaintext x, nothing else. The encryption process is fully independent of the number of users n.
  3. Piecewise Master Secret Key: If we restrict the master secret key MSK = \{MSK_1, MSK_2, ..., MSK_n\} to some subset S with |S| = l, the subset has the same distribution as a master secret key generated for functions of arity l.

Yes, There is an Oblivious RAM Lower Bound!

Written by Chenyang Yu, for CS 282B final presentation.

Readings:

  1. Simons Institute Talk: "Oblivious RAM" by Boyle
  2. "Software protection and simulation on oblivious RAMs" by Goldreich and Ostrovsky, which we will refer to as [GO96]
  3. "Is There an Oblivious RAM Lower Bound?" by Boyle and Naor
  4. Simons Institute Talk: "Yes, There is an Oblivious RAM Lower Bound!" by Larsen
  5. "Yes, There is an Oblivious RAM Lower Bound!" by Larsen and Nielsen

In particular, number 5 is the paper that I am presenting (picked from CRYPTO 2018).

Terminology:

Online: The content of the operation sequence is given to the ORAM one at a time.

Offline: The ORAM is given the entire operation sequence ahead of simulation time.

Bandwidth overhead: The multiplicative factor of extra memory blocks that must be accessed to hide the operation sequence. The formula will be given later. When we say an "upper bound" or "lower bound" for ORAM, we always mean a bound on the amortized bandwidth overhead.

Balls in bins: Data items are modeled as "balls", CPU registers and server-side memory are modeled as "bins", and the set of allowed data operations consists only of moving balls between bins; in other words, shuffling.

Active: Allows the server to perform untrusted computation on behalf of the user.

Passive: Does not allow the server to perform untrusted computation on behalf of the user.

Overview:

ORAM was first introduced for hiding memory access patterns in a software protection context. The main performance metric of an ORAM is its bandwidth overhead. In lectures, we studied the square root solution, the hierarchical solution, and so on, in great detail. Those constructions were built to push down the upper bound. Today, we are going to discuss the lower bound. [GO96] already gives an \Omega(\log n) lower bound. Can we say anything about the lower bound under weaker assumptions?

Directions:

There are three directions to consider for further exploring the lower bound:

  1. The [GO96] lower bound applies only to "balls in bins" algorithms, so we may consider removing this assumption. Boyle and Naor removed it and showed that this would result in super-linear lower bounds for sorting circuits. To overcome this barrier, they proposed the notion of an "online" ORAM. They also argued that most implementations of ORAM fall into the "online" category.
  2. The [GO96] lower bound holds even for offline ORAM. Boyle and Naor pointed out that most practical constructions of ORAM are online, so we may hope to find a better lower bound for online ORAM only.
  3. The [GO96] lower bound crucially relies on the statistical closeness of the access pattern distributions for any two inputs. In many cases, statistical security is stronger than necessary, so we may consider relaxing this assumption to computational security, which is sufficient in real-world applications. Note that in [GO96], the authors point out that in practice one can implement such an ORAM using PRFs, so the construction is only computationally secure. But the lower bound was proved assuming access to a random oracle, so the proof itself relies on statistical closeness. Hence by "relaxing this assumption", I mean for proving the lower bound, not for the implementation.

Altogether, in this paper we prove a lower bound for non-“balls in bins” ORAM that is online, passive, and only computationally secure. That is, the ORAM algorithm can be arbitrary, the adversary must be efficient, and the bound holds only for online ORAM.

Definitions and Notations:

An online ORAM supports the following two operations:

  1. write(a, data): Store data in the block of address a, where a \in [n] and data \in \{0,1\}^{r}.
  2. read(a): Return the contents of the block of address a.

We refer to blocks at the server as “memory cells”, where each cell contains w bits. We refer to blocks in the ORAM array as “entries”, where each entry contains r bits. Note that this r is just the length of “data” as defined above. We also let the client memory have m bits. Note that if after N accesses the ORAM makes M probes, then the amortized bandwidth overhead is Mw/Nr.
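The overhead formula is simple enough to state as a one-line function. The numbers in the example are hypothetical, chosen only to make the arithmetic visible:

```python
def bandwidth_overhead(M: int, w: int, N: int, r: int) -> float:
    """Amortized bandwidth overhead Mw/Nr: total bits probed at the
    server (M probes of w-bit cells) divided by the bits the user
    actually asked for (N operations on r-bit entries)."""
    return (M * w) / (N * r)

# Hypothetical numbers: 1,000 operations on 64-bit entries that cost
# 10,000 probes of 64-bit cells give a 10x overhead.
print(bandwidth_overhead(M=10_000, w=64, N=1_000, r=64))  # 10.0
```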

Let y be an operation sequence: y := (op_{1}, ... , op_{M}), where op_{i} \in \{write, read\}. Let A(y) := (A(op_{1}), A(op_{2}), ... , A(op_{M})) denote the memory access pattern to the server from the online ORAM, where each A(op_{i}) represents the list of addresses of the memory cells accessed at the server while processing operation op_{i}.

An online ORAM is secure in the following sense: for any two operation sequences y and z of the same length M, the access patterns A(y) and A(z) are computationally indistinguishable.

Main Theorem (informal):

Theorem: In such setting, any online ORAM must have an expected amortized bandwidth overhead of \Omega(log(nr/m)) on sequences of \Theta(n) operations.

Comment: Note that this theorem holds in the random oracle model, requires only computational indistinguishability, holds for any w, and allows arbitrary representations of the data in memory. For the natural setting of parameters r \leq m \leq n^{1-\epsilon}, for an arbitrarily small constant \epsilon > 0, the lower bound simplifies to \Omega(log n).
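The simplification is immediate: since r \geq 1 and m \leq n^{1-\epsilon},

```latex
\frac{nr}{m} \;\ge\; \frac{n}{m} \;\ge\; \frac{n}{n^{1-\epsilon}} \;=\; n^{\epsilon}
\qquad\Longrightarrow\qquad
\log\frac{nr}{m} \;\ge\; \epsilon \log n \;=\; \Omega(\log n).
```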

Proof Strategy:

Note that the above definition is very close to the definition of an oblivious data structure, which is defined as follows:

Definition: In the array maintenance problem, we must maintain any array B of n r-bit entries under the following two operations:

  1. write(a, data): Set the contents of B[a] to data, where a \in [n] and data \in \{0,1\}^{r}.
  2. read(a): Return the contents of B[a].
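To make the interface concrete, here is a plain (deliberately non-oblivious) Python solution to the array maintenance problem. An ORAM must offer exactly this interface while also hiding which addresses are accessed:

```python
class Array:
    """Non-oblivious array maintenance: n entries of r bits each,
    supporting write(a, data) and read(a)."""

    def __init__(self, n: int, r: int):
        self.r = r
        self.B = [0] * n          # each entry holds an r-bit value

    def write(self, a: int, data: int):
        assert 0 <= data < 2 ** self.r   # data must fit in r bits
        self.B[a] = data

    def read(self, a: int) -> int:
        return self.B[a]

arr = Array(n=8, r=16)
arr.write(3, 1234)
print(arr.read(3))  # 1234
```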

Hence the idea is to prove a lower bound for oblivious data structures solving the array maintenance problem, and then use the same technique to prove the same lower bound for online ORAM.

Models:

Data structure lower bounds are typically proved in the cell probe model of Yao. The model is very similar to the standard RAM, the difference being that computation is free and we only pay for memory accesses. To meet our goal, we define our model as the “oblivious cell probe model”. On top of this model, we will use the “information transfer” method, which comes from the paper “Logarithmic Lower Bounds in the Cell-Probe Model” by Pătraşcu and Demaine.

The first page of the handout contains a diagram of this model. Each operation can be a read or a write, and each operation op makes read and write probes to the server memory. The sequence of probes made during op is called the access pattern of op and is written A(op). Words in the server memory are called cells; each cell is w bits. The ORAM is restricted to m bits of storage between two operations, but during an operation it can use unlimited storage.

Yao’s Garbled Circuit: An Introduction

Citation:

  1. “A Gentle Introduction to Yao’s Garbled Circuits” by Sophia Yakoubov
  2. UCLA CS 282A Notes by Rafail Ostrovsky, part 15

In this post, we will discuss some of the very basics of Yao’s Garbled Circuit, as it is indeed an important prerequisite that appears a lot in other protocols.

Problem:

Consider two people, Ginny and Evan. They wish to figure out whether they should collaborate, but neither wants the other person to learn his or her own attitude.

Setup:

In other words, let Ginny hold a bit g and Evan hold a bit e. The bit has value 1 if its holder agrees to collaborate, and value 0 otherwise. They want to design a protocol such that both of them learn the value of g \wedge e at the end of the protocol, but neither Ginny nor Evan learns the other person’s bit during the protocol. In this scenario, we call Ginny the “garbler”, who is in charge of “garbled gate generation”. We call Evan the “evaluator”, who is in charge of “garbled gate evaluation”.

Garbled Gate Generation (by Ginny):

The idea is to design a “garbled gate” and then generalize it to a garbled circuit. Garbling basically means obfuscating the boolean gate’s truth table. Ginny picks four random strings as “labels”, one for each of g = 0, g = 1, e = 0, and e = 1.

step 1: encryption

We pick a label from the g side and a label from the e side to form a label pair. Now we have four label pairs, and together they represent the boolean truth table. Each label pair is fed to a key derivation function H; the output H(label 1, label 2) is a symmetric encryption key, and the key is used to encrypt the corresponding output of g \wedge e. Hence the garbled outputs (ciphertexts) take the form Enc(H(label 1, label 2), g \wedge e) after obfuscation.

step 2: permutation

Simply permute the garbled outputs, so that the garbled gate appears to have random order.

Garbled Gate Evaluation (by Evan):

Evan now needs to decrypt the ciphertext corresponding to the real values of g and e, so he needs to receive those two labels with the real values of g and e from Ginny. We call these two labels “label(g)” and “label(e)” respectively.

step 1: sending label(g)

Since Ginny knows g, she knows which label is label(g). Since labels are just random strings picked by Ginny, Evan is not able to learn anything about g from label(g) itself. So Ginny could just send her label(g) to Evan.

step 2: sending label(e)

Since Ginny doesn’t know e this time, she doesn’t know which label is label(e). She can’t send both labels corresponding to e to Evan, because then Evan could decrypt two ciphertexts and learn more than g \wedge e. Evan can’t just ask for the one he wants, since he doesn’t want Ginny to learn the value of e. Hence they use “oblivious transfer (OT)” to pass label(e).
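Putting generation and evaluation together, here is a minimal Python sketch of a single garbled AND gate. The SHA-256 key derivation and the zero-padding validity check are illustrative choices of mine, not part of any standard construction, and in the real protocol Evan would receive label(e) via OT rather than seeing the whole label table:

```python
import hashlib
import os
import random

def kdf(label_g: bytes, label_e: bytes) -> bytes:
    """Key derivation H: hash the label pair down to a 16-byte pad."""
    return hashlib.sha256(label_g + label_e).digest()[:16]

def enc(pad: bytes, bit: int) -> bytes:
    # Plaintext = output bit followed by 15 zero bytes; the zero padding
    # lets the evaluator recognize the one row he can actually decrypt.
    pt = bytes([bit]) + bytes(15)
    return bytes(a ^ b for a, b in zip(pad, pt))

def dec(pad: bytes, ct: bytes):
    pt = bytes(a ^ b for a, b in zip(pad, ct))
    if pt[1:] == bytes(15) and pt[0] in (0, 1):
        return pt[0]
    return None  # wrong key: the padding check fails

def garble_and_gate():
    """Ginny's side: pick labels, encrypt each row of the AND truth
    table under the corresponding label pair, then permute the rows."""
    labels_g = {b: os.urandom(16) for b in (0, 1)}
    labels_e = {b: os.urandom(16) for b in (0, 1)}
    table = [enc(kdf(labels_g[g], labels_e[e]), g & e)
             for g in (0, 1) for e in (0, 1)]
    random.shuffle(table)  # step 2: permutation
    return labels_g, labels_e, table

def evaluate(label_g: bytes, label_e: bytes, table):
    """Evan's side: try every row; only the right one decrypts validly."""
    pad = kdf(label_g, label_e)
    for ct in table:
        bit = dec(pad, ct)
        if bit is not None:
            return bit

labels_g, labels_e, table = garble_and_gate()
print(evaluate(labels_g[1], labels_e[1], table))  # 1 (= 1 AND 1)
```

Note that Evan learns only the output bit: the two labels he holds decrypt exactly one row, and the random shuffle prevents him from inferring g or e from the row's position.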

From Gates to Circuits:

In reality, Ginny and Evan may wish to compute much more complicated functions using this protocol. In this case, Ginny garbles the entire circuit of the function instead of a single gate: the garbled gate becomes a garbled circuit.

Oblivious RAM Done Right (1): Overview

Reading:

  1. UCLA CS 282A Notes

In this section, we will discuss some preliminaries for the study of Oblivious RAM (ORAM). We will clarify the definition of security of remote data storage problem, and provide some basic solutions to the problem.

1. Definition of Security (of Remote Data Storage)

Consider two companies A and B. A runs a cloud storage service, and B is considering using this service. B would like to do so because leasing remote storage from A is much cheaper than buying machines and storing data locally. However, B doesn’t trust A completely: B is afraid of losing privacy after uploading its data to A. In this scenario, we call company A the “cloud” and company B the “user”. Here is how we define what it means to be secure:

  1. Privacy: cloud is not able to read user’s data in clear.
  2. Mutation of data: cloud is not able to change user’s data.
  3. Loss of data: user is able to check if any data is lost or missing.
  4. Re-introduction of old data: user is able to check if any old data is introduced as new data by cloud at some later date.
  5. Invisible Access Pattern: If user performs a series of read/write operations, cloud is not able to learn the sequence of user’s operations.

If the above criteria hold, we say the remote data storage service is secure. Here is how we achieve each of them:

  1. Privacy: Encryption
  2. Mutation of data: Digital Signature Scheme
  3. Loss of data or re-introduction of old data: Hash functions and Merkle trees
  4. Invisible Access Pattern: ORAM
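As a quick illustration of item 3, here is a toy Merkle-tree sketch of my own (not code from the notes): the user keeps only the root hash locally, and can later check that blocks returned by the cloud are exactly the ones uploaded.

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_root(blocks):
    """Hash the leaves pairwise up to a single root (an odd-length
    level is padded by repeating its last hash)."""
    level = [h(b) for b in blocks]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        level = [h(level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
    return level[0]

blocks = [b"block0", b"block1", b"block2", b"block3"]
root = merkle_root(blocks)                    # user stores only this

# Later, the user re-downloads the blocks and checks integrity:
assert merkle_root(blocks) == root                   # data intact
assert merkle_root([b"evil"] + blocks[1:]) != root   # tampering detected
```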

In the following blogs, we focus on 4, the study of ORAM. For a trivial solution, suppose the user stores N pieces of data in the cloud; on every access, the cloud passes all the data back to the user one piece at a time, the user holds them in local memory, performs the read/write on the relevant piece, re-encrypts everything, and passes it all back to the cloud. This is the worst solution to the problem, with overhead O(N). Next, we present a solution with overhead O((logM)^4).
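The trivial solution can be sketched in a few lines. This is my own toy code: the SHA-256 keystream stands in for real encryption so the sketch stays self-contained, and a fresh nonce per round ensures the re-uploaded ciphertexts look unrelated to the old ones:

```python
import hashlib
import os

def keystream(key: bytes, nonce: bytes, n: int) -> bytes:
    """Toy XOR keystream from SHA-256 (placeholder for real encryption)."""
    out, counter = b"", 0
    while len(out) < n:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:n]

def enc(key: bytes, nonce: bytes, pt: bytes) -> bytes:
    return bytes(a ^ b for a, b in zip(pt, keystream(key, nonce, len(pt))))

dec = enc  # XOR is its own inverse

class TrivialORAM:
    """O(N)-overhead solution: every access downloads, touches, and
    re-encrypts the entire store, so the cloud sees an identical
    access pattern no matter which index was used."""

    def __init__(self, data):
        self.key = os.urandom(32)
        self._upload(data)

    def _upload(self, data):
        self.nonce = os.urandom(16)      # fresh nonce each round
        self.cloud = [enc(self.key, self.nonce + i.to_bytes(4, "big"), d)
                      for i, d in enumerate(data)]

    def access(self, op, i, new=None):
        # Download everything and decrypt locally.
        data = [dec(self.key, self.nonce + j.to_bytes(4, "big"), c)
                for j, c in enumerate(self.cloud)]
        result = data[i]
        if op == "write":
            data[i] = new
        self._upload(data)               # re-encrypt and re-upload all N pieces
        return result

oram = TrivialORAM([b"aa", b"bb", b"cc"])
oram.access("write", 1, b"zz")
print(oram.access("read", 1))  # b'zz'
```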

2. A Poly-log Solution:

Data Structure:

We refer to the user’s data as “words”. A word is a 2-tuple (i, w_{i}), where i is the location of the data and w_{i} is the actual data at location i. The cloud creates arrays A_{i} of length 2^{i}, where 2 \leq i \leq \log N (so the last buffer is long enough to hold all N words): the first array has length 4, the second has length 8, the third has length 16, and so on. Each array is called a “buffer”, and each entry of an array is called a “bucket”. Each bucket holds logM words. For each buffer A_{i}, the user uses a secret seed and a pseudo-random function (PRF) to generate a hash function. These hash functions obfuscate the location of each piece of data, so the cloud is not able to learn anything from the storage pattern. The last (longest) buffer is called the “Storage Array”; it holds all N pieces of data that the user uploaded at the beginning.
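The layout above can be sketched as follows. This is a sketch under my own simplifying assumptions: HMAC-SHA256 stands in for the PRF, and we only show how buffers are sized and how a word's address maps to a bucket, not the full read/write procedure:

```python
import hashlib
import hmac
import math

def make_buffers(N: int, M: int):
    """Buffers A_i of length 2^i for i = 2 .. log2(N); each slot is a
    bucket that can hold up to log2(M) words (capacity not enforced here)."""
    top = max(2, math.ceil(math.log2(N)))
    return {i: [[] for _ in range(2 ** i)] for i in range(2, top + 1)}

def bucket_index(seed: bytes, level: int, addr: int) -> int:
    """PRF-based hash for buffer A_level: which bucket address addr maps
    to. (HMAC-SHA256 as the PRF is my own illustrative choice.)"""
    mac = hmac.new(seed, f"{level}:{addr}".encode(), hashlib.sha256).digest()
    return int.from_bytes(mac, "big") % (2 ** level)

buffers = make_buffers(N=16, M=64)
seed = b"secret seed"
print(sorted(buffers))                          # [2, 3, 4]: lengths 4, 8, 16
print(0 <= bucket_index(seed, 4, addr=7) < 16)  # True
```

Because the mapping is keyed by the user's secret seed, the cloud sees which buckets are touched but cannot connect them to logical addresses.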

Initialization:

Accessing Data (Read/Write):