Tuesday, March 24, 2020

HackerRank > Interview Preparation Kit > Graphs > Find the nearest clone

Link to problem statement on HackerRank

We are given a tree that each of its nodes have some color and also a wantedColor. We should find the distance between two nearest nodes with wantedColor.

First approach: DFS on Tree
What happened in my mind : After I read the problem statement, I tried different examples on paper. I excluded the obvious case where number of wantedColors is less than 2, then the answer is -1.
First I was trying to solve it using BFS, starting from all nodes with wantedColor, but I couldn't finalize this approach. Then I thought to myself this looks like finding diameter of the tree. Diameter is the longest distance between any two nodes in a tree, in this problem we want the shortest distance that connects any two wantedColors. Then I tried to make the solution clear. I defined int dfs(node, parent) which returns the distance of node to the nearest node with color wantedColor. The initial value for dfs(node, parent) would be zero if node's color is wantedColor, otherwise it would be infinity. Later after I do dfs for all children of node I would update this value.
Then I call dfs for all of children of node and keep those answers in a vector, and sort them increasingly. Other than node itself, the nearest node with wantedColor to it would be childDistance[0] + 1, because childDistance[0] is the nearest node with wantedColor to that child and (+1) for the edge between node and that child. This is how the return value for dfs(node, parent) is calculated.
For the final answer we have some global variable called minDistance which is updated in each dfs, but how?


If node has no child, we update nothing.
If node has at least one child and node's color is wantedColor, the distance between node and left wantedColor is a candidate for the shortest path between two wantedColors, so I update minDistance.
If node has at least two children, then the value returned by child1 and child2 + 2 (2 edges between child1<->node<->child2) is another candidate to update minDistance. (You see it as the green line in picture.)

For more details about approach one please refer to my code on github.



Second approach : lots of BFSs :D (passes the test data, but is not correct)
Let's do what has been asked us in a straightforward way. From each node with color wantedColor do a BFS and as soon as you see another node with color wantedColor break the loop of BFS. If the number of nodes with wantedColor is 2, it's ok, right? because we do 2 BFSs and each checks at most all the edges, as the number of nodes with wantedColor increases, we break sooner in the BFS loop.
Right now after writing this paragraph, I tried to calculate the order of this approach. This is how I approached calculating the order, I tried to count the number of times one node is added to the queue. The answer is huge number of times :D.


This is the anti-test for this approach. If we start from each green node, all of the other nodes would be added to the q before another green is added to the queue. That would be 500,000 x 500,000 which is 250,000,000,000, which is not good to pass the time limit.

For more details on this "wrong" approach please look at my code on github.



Third approach : one BFS
Another approach is to do only one BFS, but instead of starting from one source, we should start from all nodes with wantedColor. When we take out node X of the queue and want to check it's neighbors, if we find some neighbor Y which Y != parent[X], then it's the first time two paths starting from two different wantedColors meet, that's it, this is the shortest path two wantedColors can have from each other.

Here is my code for the third approach on github.



No comments:

Post a Comment

USACO - Prime Palindromes

I just skimmed the problem statement and panicked of the high boundary of the input, but something inside told me don't worry everyth...