Friday, September 16, 2011

No. 04 - Paths with Specified Sum in Binary Tree


Question: All nodes along children pointers from root to leaf nodes form a path in a binary tree. Given a binary tree and a number, please print out all of paths where the sum of all nodes value is same as the given number. The node of binary tree is defined as:

struct BinaryTreeNode
{
    int                    m_nValue;
    BinaryTreeNode*        m_pLeft; 
    BinaryTreeNode*        m_pRight;
};

For instance, if inputs are the binary tree in Figure 1 and a number 22, two paths with be printed: One is the path contains node 10 and 12, and the other contains 10, 5 and 7.
Figure 1: A binary tree with two paths where the sum of all nodes value is 22: One is the path contains node 10 and 12, and the other contains node 10, 5 and 7
Analysis: Path in a binary tree is a new concept for many candidates, so it is not a simple question for them. We may try to find the hidden rules with concrete examples. Let us take the binary tree in Figure 1 as an example.

Since paths always start from a root node, we traverse from a root node in a binary tree. We have three traversal orders, pre-order, in-order and post-order, and we firstly visit a root node with pre-order traversal.

According to the pre-order traversal process on the binary tree in Figure 1, we visit the node 5 after visiting the root node 10. Since a binary tree node does not have a pointer to its parent node, we do not know what nodes have been visited when we reach the node 5 if we do not save the visited node on the path. Therefore, we could insert the current node into a path when we reach it during traversal. The path contains two nodes with value 10 and 5 when we are visiting the node 5. Then we insert the node 4 into the path too when we reach it. We have reached a leaf node, and the sum of three nodes in the path is 19. Since it is not same as the input number 22, current path is not a qualified one.
We should continue to traverse other nodes. Before visiting other nodes, we should return back to the node 5, and then reach the node 7. It can be noticed that node 4 is no longer in the path from node 10 to node 7, so we should remove it from the path. When we are visiting node 7, we insert it into the path, which contains three nodes now. Since the sum of value of these three nodes is 22, the path is qualified.

Lastly we are going to visit the node 12. We should return back to node 5 then back to node 10 before we visit node 12. When we return back from a child node to its parent node, we remove the child node from the path. When we reach the node 12 eventually, the path contains two nodes, one is node 10 and the other is node 12. Since the sum of value of these two nodes is 22, the path is qualified too.

We can summarize the whole process above with the Table 1:

Step
Operation
Is it a leaf?
Path
Sum of nodes value
1
Visit node 10
No
node 10
10
2
Visit node 5
No
node 10, node 5
15
3
Visit node 4
Yes
node 10, node 5, node 4
19
4
Return to node 5

node 10, node 5
15
5
Visit node 7
Yes
node 10, node 5, node 7
22
6
Return to node 5

node 10, node 5
15
7
Return to node 10

node 10
10
8
Visit node 12
Yes
node 10, node 12
22
Table 1: The process to traverse the binary tree in Figure 1

It is time to summarize some rules with the example above. When we visit a node with pre-order traversal order, we insert it into the path, and accumulate its value to sum.  When the node is a leaf and sum is same as the input number, the path is qualified and we need to print it. We continue to visit its children if the current node is not a leaf. After we finish visiting the current node, a recursive function will return back to its parent node automatically, so we should remove the current node from the path before a function returns to make sure the nodes in the path is same as the path from root node to its parent node. The data structure to save paths should be a stack, because paths should be consistent to the recursion status and recursion is essentially pushing and popping in a call stack.
Here is some sample code for this problem:

void FindPath(BinaryTreeNode* pRoot, int expectedSum)
{
    if(pRoot == NULL)
        return;

    std::vector<int> path;
    int currentSum = 0;
    FindPath(pRoot, expectedSum, path, currentSum);
}

void FindPath
(
    BinaryTreeNode*   pRoot,       
    int               expectedSum, 
    std::vector<int>& path,        
    int               currentSum
)
{
    currentSum += pRoot->m_nValue;
    path.push_back(pRoot->m_nValue);

// Print the path is the current node is a leaf
// and the sum of all nodes value is same as expectedSum
    bool isLeaf = pRoot->m_pLeft == NULL && pRoot->m_pRight == NULL;
    if(currentSum == expectedSum && isLeaf)
    {
        printf("A path is found: ");
        std::vector<int>::iterator iter = path.begin();
        for(; iter != path.end(); ++ iter)
            printf("%d\t", *iter);
       
        printf("\n");
    }

    // If it is not a leaf, continue visition its children
    if(pRoot->m_pLeft != NULL)
        FindPath(pRoot->m_pLeft, expectedSum, path, currentSum);
    if(pRoot->m_pRight != NULL)
        FindPath(pRoot->m_pRight, expectedSum, path, currentSum);

    // Before returning back to its parent, remove it from path,
    path.pop_back();
}

In the code above, we save path with a vector in STL actually. We use function push_back to insert a node and pop_back to remove a node to assure it is “First in Last out” in a path. The reason we don’t utilize a stack in STL is that we can only get an element at the top of a stack, but we need to get all nodes when we print a path. Therefore, std::stack is not the best choice for us. 

The discussion about this problem is included in my book <Coding Interviews: Questions, Analysis & Solutions>, with some revisions. You may find the details of this book on Amazon.com, or Apress.
 
The author Harry He owns all the rights of this post. If you are going to use part of or the whole of this ariticle in your blog or webpages,  please add a reference to http://codercareer.blogspot.com/. If you are going to use it in your books, please contact me (zhedahht@gmail.com) . Thanks.

12 comments:

  1. Do you have any idea if the path does not have to start at the root ?

    ReplyDelete
  2. This comment has been removed by the author.

    ReplyDelete
  3. I think if the argument currentSum is passed by value instead of by reference, then you do not need the second last line of the code currentSum -= pRoot->m_nValue.

    ReplyDelete
    Replies
    1. Thanks for your feedback. I've revised the code accordingly.

      Delete
  4. Hi Harry,

    It seems that even if you get the desired Path by calling FindPath() for LeftChild, you will call this function again for RightChild.

    This should not happen in the success scenario.

    ReplyDelete
    Replies
    1. It's not a BST. The example makes it look like it, but the article's been talking about binary tree.

      Delete
  5. // Before returning back to its parent, remove it from path,
    path.pop_back();.....

    Do we need to also subtract the value from CurrentSum as we leave a node and move....?

    ReplyDelete
    Replies
    1. Hi Muthu,

      We don't need to subtract the value of CurrentSum as it's taken care by recursion stack..! :-)

      Delete
  6. Its simply superb.Feeling good to share the link to practice c# interview questions

    @ http://skillgun.com

    ReplyDelete
  7. Hi,
    One issue that i see here is, if current sum becomes equal to expected sum before the leaf node, then it will not be printed.

    We need to have
    if(current_sum == expected_sum) Print Path.

    ReplyDelete
  8. This comment has been removed by the author.

    ReplyDelete
  9. According to the problem statement - "All nodes along children pointers from root to leaf nodes form a path in a binary tree." Until we hit upon the leaf, the path is not complete according to the definition of path in the problem and hence a pre-matured sum is not considered within the scope of the solution.

    ReplyDelete