Sunday, November 11, 2007

Interview Questions

After attending a lot of interviews I found that most interviewers are obsessed with Hashmap and like to test your understanding on the algorithm complexity and how hashmap reduces the complexity to O(n). The list problem shown below is an interesting problem.

The List Problem


Its a search problem where you need to find if a small list is contained in a bigger list. The example below shows the goal of the problem.
Say you have two lists:

smallList = "red", "green";
largeList = "red", "green", "blue";

smallList is contained in largeList for the above data.

If you change the smallList like
smallList = "red", "green", "green";
smallList is not contained in largeList. The order of the strings does not matter. What is the best way to implement a function which will return a boolean value indicating if the smallList is contained in largeList.

Solution 1: worst case implementation
for (int i=0; i < smallList.size;i++) {
found = false;
for (int j=0;j < largeList.size;j++) {
if (smallList[i] == largeList[j]) {
largeList.remove(i);
found = true;
// slight improvement would be
// break; here
}
}
if (found == false) {
return false;
}
}
return true;

Complexity is O(n pow 2)
memory usage is good - since it just uses a boolean variable called found

Solution 2: Similar to solution 1 - if the method largeList.remove is not available
largeListMatches = new boolean[largeList.size]; //assume all the values are initialized to false

for (int i=0; i < smallList.size;i++) {
found = false;
for (int j=0;j < largeList.size;j++) {
if (smallList[i] == largeList[j]) {
if (largeListMatches[j] == true) {
continue;
}
largeListMatches[j] = true;
found = true;
}
}
if (found == false) {
return false;
}
}
return true;

Complexity will be the same - around O(n pow 2). Memory usage will be more since an array of boolean (of size largeArray) is used. So the memory required will be sizeof LargeArray + sizeof smallArray + 1byte * length of LargeArray + 1 byte for the boolean found variable

Solution 3: Use a hash map to hash the large list
for (int j=0;j < largeList.size;j++) {
if (value = largeHash.get(largeList[j])) {
largeHash.set(largeList[j], ++value);
} else {
largeHash.set(largeList[j], 1);
}
}
for (int i=0; i < smallList.size;i++) {
if (value = largeHash.get(smallList[i])) {
if (value == 0) {
return false;
} else {
largeHash.set(largeList[j], --value);
}
} else {
return false;
}

}
return false;

Complexity will be reduced to O(n) + O(n) which is O(n). Memory usage will be really high. For an array list of high value like 10000, etc this method will be infeasible. This is the most efficient algorithm - but assumes the system has a high amount of memory
Memory required is 2*(sizeof LargeList) + sizeof (smallList) + integer array of length largeList

Solution 4: Use binary search to find the elements. the largeList has to be sorted first. Any of the sorts like quick sort or merge sort can be used - complexity of O(nlogn). Binary search has a complexity of O(logn) - but it is for each element of small list. O(m*logn)
Overall complexity is O(nlogn) + O(m*logn) = O(nlogn)
Memory usage is equal to the size of largeList + smallList

quicksort(largeList); //or mergesort(largeList);
for (int i=0; i < smallList.size;i++) {
if (binarySearch(smallList[i], largeList) == true) {
largeList.remove(smallList[i]);
} else {
return false;
}
}
return true;

Tree problem

I don't remember the exact question - but here it is:
You have a class called Data Node.
class DataNode {
int id;
int parentId;

DataNode parentNode;
DataNode[] childNodes;
}

Define an efficient method called
public DataNode updateTree(DataNode[] dataNodes)
which will update all the parentNode and childNodes assuming that the id and parentId of all the nodes are already set. The parentId of the root node is set to zero.

Though the question is designed badly asking you to write a method inside a node - it was interesting otherwise. An example will things make more clear

Suppose you have a tree like

18
5 3

The DataNodes array will have
18, 0,NULL, NULL
5, 18, NULL, NULL
3,18, NULL, NULL


The goal of the question is to update the references parentNode and childNodes. So instead of NULL you will be able to navigate the tree structure.

Solution:
The O(n pow 2) solution is easy to design. The other solution is something to do with the thing I said in the top of the article :)

1 comment:

Mg9H said...

For the List problem, solution #4 can be improved to reduce the constant hidden under big-Oh for nlogn:


if smallList.size() > largeList.size() return false;

sort(smallList);
sort(largeList);

int i = 0;
int j = 0;
while (i < smallList.size()) {
// find smallList [i]
while (j < largeList.size() && largeList [j] < smallist [i]) ++j;

// element not found, break
if (j >= largeList.size() || !largeList [j].equals(smallList [i])) return false;

// otherwise, increase both indices
++i, ++j;
}

return true;

The run time of this solution would be O(mlogm) + O (nlogn) + O (m+n) which is sitll O (nlogn) but the constant for the nlogn part is now mainly depending on the sorting algorithm. Whilst solution #4 in your post, the constant for nlogn would be C (from the sort algorithm) + 1 (for the binary search). Not to mention that each call to method "remove" on a list, could lead to a O(n) runtime depending on which kind of list is being used.