Friday, October 21, 2011

No. 10 - K-th Node from End


Problem: Get the Kth node from end of a linked list. It counts from 1 here, so the 1st node from end is the tail of list. 

For instance, given a linked list with 6 nodes, whose value are 1, 2, 3, 4, 5, 6, its 3rd node from end is the node with value 4.

A node in the list is defined as:

struct ListNode
{
    int       m_nValue;
    ListNode* m_pNext;
};

Analysis: In a list with n nodes, its kth node from end should be the (n-k+1)th node from its head. Therefore, if we know the number of nodes n in a list, we can get the required node with n-k+1 steps from its head. How to get the number n? It is easy if we scan the whole list from beginning to end.

The solution above needs to scan a list twice: We get the total number of nodes with the first scan, and reach the kth node from end with the second scan. Unfortunately, interviewers usually expect a solution which only scans a list once.

We have a better solution to get the kth node from end with two pointers. Firstly we move a pointer (denoted as P1) k-1 steps beginning from the head of a list. And then we move another pointer (denoted as P2) beginning from the head, and continue moving the P1 forward at same speed. Since the distance of these two pointers is always k-1, P2 reaches the kth node from end when P1 reaches the tail of a list. It scans a list only once, and it is more efficient than the previous solution.

Figure 1: Get the 3rd node from end of a list with 6 nodes

It simulates the process to get the 3rd node from end of a list with 6 nodes in Figure 1. We firstly move P1 2 steps (2=3-1) to reach the 3rd node (Figure 1-a). Then P2 points to the head of a list (Figure 1-b). We move two pointers at the same speed, when the P1 reaches the tail, what P2 points is the 3rd node from end (Figure 1-c).

The sample code of the solutions with two pointers is shown below:

ListNode* FindKthToTail(ListNode* pListHead, unsigned int k)
{
    if(pListHead == NULL || k == 0)
        return NULL;

    ListNode *pAhead = pListHead;
    ListNode *pBehind = NULL;

    for(unsigned int i = 0; i < k - 1; ++ i)
    {
        if(pAhead->m_pNext != NULL)
            pAhead = pAhead->m_pNext;
        else
        {
            return NULL;
        }
    }

    pBehind = pListHead;
    while(pAhead->m_pNext != NULL)
    {
        pAhead = pAhead->m_pNext;
        pBehind = pBehind->m_pNext;
    }

    return pBehind;
}

The discussion about this problem is included in my book <Coding Interviews: Questions, Analysis & Solutions>, with some revisions. You may find the details of this book on Amazon.com, or Apress.

The author Harry He owns all the rights of this post. If you are going to use part of or the whole of this ariticle in your blog or webpages,  please add a reference to http://codercareer.blogspot.com/. If you are going to use it in your books, please contact me (zhedahht@gmail.com) . Thanks. 

12 comments:

  1. if there are total 5 nodes instead of six then how will your method work?

    ReplyDelete
    Replies
    1. That would be a question for your interviewer. It would depend on what behavior was desired. I'd say return null myself, that would indicate no such node exists.

      Delete
  2. "We get the total number of nodes with the first scan, and reach the kth node from end with the second scan. Unfortunately, interviewers usually expect a solution which only scans a list once."

    And unfortunately obscuring the algorithm does not make it any faster. Obviously you need to scan the list once, simply because you need to find the end (the size). Then with a second iteration you go the required amount of steps to the final node. Now, if you count the computational steps, how is that any different to using two pointers "at the same time"? You can't use two pointers at the same time. If you use two pointers you still scan the list twice, isn't that obvious? What you do is simply doing both computations at once, with the full cost and absolutely no optimization and maybe even worse performance, due to potential caching and code flow issues.

    As Knuth put it: premature optimization is the root of all evil...

    ReplyDelete
    Replies
    1. Not to mention that using linkedlists in context of optimization is a bit ironic :D. If your interviewer asks you for optimizations, tell him not to use linkedlists and he will be pleasently surprised, especially if you can also explain the: Why?!

      Delete
    2. Damn, why can't you edit comments here :D... I actually looked at your code. It sounded much more obscure in the description so in this instance, you may have one advantage of writing it like this and that is for low "K" your CPU cache will still have the delayed list entries of the second pointer in the cache and thus your argument of "doing all at once" might actually fly, but not with your explanation and also only for small "K".

      Delete
  3. Like someone said above, this method basically implements traversing twice and only gives the illusion of one traversal; except in the case of a small value of K. My advice would be to tell the interviewer that if such a requirement is known beforehand, we can create an additional head node which will store the number of nodes in the list. When a new node is added, we can simply incement this count. That way we need to traverse the list only once and the problem is solved while designing the list itself.

    ReplyDelete
  4. In some languages you can get the size of the list I mean the number of nodes by simple statements using sizeof kind of functions. That would save one traversal.

    It could be good idea to use the arrays as a temp storage for elements while first pass and then use the indexes for kth element.

    ReplyDelete
  5. In some languages you can get the size of the list I mean the number of nodes by simple statements using sizeof kind of functions. That would save one traversal.

    It could be good idea to use the arrays as a temp storage for elements while first pass and then use the indexes for kth element.

    ReplyDelete
  6. When we have 5 we can use (n-k+1)-1 right but still we have to traverse for size which makes it two computations unless we are using some language like Java as Asu commented.

    So now:

    if(n.size == even){
    return (n-k+1)-1;
    }else{
    return (n-k+1);
    }

    I know I added one more computation just to keep it more generic for all lists.

    ReplyDelete
  7. We can use only one loop to traverse the list and find the node.
    Means, We will iterate the loop i < k && list!= NULL and in that we will increment one ptr by 1 ans second ptr by 2. then we the get the kth node.

    ReplyDelete
  8. We can use only one loop to traverse the list and find the node.
    Means, We will iterate the loop i < k && list!= NULL and in that we will increment one ptr by 1 ans second ptr by 2. then we the get the kth node.

    ReplyDelete
  9. Ref. Kunal's suggestion - "We can use only one loop to traverse the list and find the node" is not a bad but unfortunately the fact (problem) remains !
    Please understand that this is just a mere aberration in some sense and does nothing in effect to get rid of second traversal since the second pointer is indispensable here too.
    But nevertheless, the simplest and yet elegant approach going with the original solution given by Harry is quite ornamental. The other solutions that suggest to maintain the addresses of individual nodes as we traverse along the first time and so on still steal away the coding elegance and not to mention the overheads to maintain in my opinion.

    ReplyDelete