Saturday, October 29, 2011

No. 16 - Maximal Length of Incremental Subsequences


Problem: Given an unsorted array, find the max length of subsequence in which the numbers are in incremental order.

For example: If the input array is {7, 2, 3, 1, 5, 8, 9, 6}, a subsequence with the most numbers in incremental order is {2, 3, 5, 8, 9} and the expected output is 5.

Analysis: We try to get the maximum length of all increasing subsequences ending with each element in the array.

For instance, when we analyze the maximum length of all increasing subsequences ending with number 5, we compare it with each number before it. The first number 7 is greater than 5. If the subsequence comes from 7, 7 cannot be included into the subsequence. Therefore, its length is 1. The following three numbers 2, 3 and 1 in the input array before 5 are less than 5. If the previous number of 5 in the increasing subsequence is 2, the maximum incremental length is 2. If the previous number of 5 in the increasing subsequence is 3, the maximum incremental length is 3 (with number 2, 3, and 5) since the maximum length of all subsequences ending with number 3 is 2 (with number 2 and 3). Similarly, its maximum length is 2 if the previous number is number 1. The whole process can be summarized in Table 1.


7
2
3
1
5
8
9
6
7
1
NA
NA
NA
NA
NA
NA
NA
2
1
1
NA
NA
NA
NA
NA
NA
3
1
2
1
NA
NA
NA
NA
NA
1
1
1
1
1
NA
NA
NA
NA
5
1
2
3
2
1
NA
NA
NA
8
2
2
3
2
4
1
NA
NA
9
2
2
3
2
4
5
1
NA
6
1
2
3
2
4
1
1
1
Table 1: Process to get the maximum length of increasing subsequences in array {7, 2, 3, 1, 5, 8, 9, 6}. Numbers in each row indicate length of subsequences ending with row title with previous number in the column title. Highlighted numbers with yellow background are the maximum lengths.  The highlighted number 5 with yellow background and red font is the maximum length of all incremental subsequences in the input array.

We are ready to write code with the detailed analysis above. The following is sample code:

int LengthOfIncreasing(int* data, int length)
{
    if(data == NULL || length < 0)
        return 0;

    int* longest = new int[length];
    longest[0] = 1;

    int max = 0;
    for(int i = 1; i < length; ++ i)
    {
        max = 0;
        for(int j = 0; j < i; ++ j)
        {
            if(data[j] < data[i] && longest[j] > max)
                max = longest[j];
        }

        longest[i] = max + 1;
    }

    max = 0;
    for(int i = 0; i < length; ++ i)
        if(longest[i] > max)
            max = longest[i];
                                                                  
    return max;   
}

Even though a 2-D matrix is needed to analyze the process, only a 1-D array is necessary in our code to store the maximum length of subsequences ending with each number in the input array.

If we are familiar with dynamic programming, we can get a direct solution quickly in a formal way. We can define a function f(i) indicating the maximum length of all subsequences ending with the ith number in the input array (denoted as A[i]). We define another function g(i, j) which stands for the maximum length of incremental subsequences ending with A[i] coming from A[j]. The required output is max(f(i)) where 0<=i<n and n is the length of array. We can get f(i) with the following equation:

Actually, same code will be implemented based on this equation. The function f(i) is longest[i] in the code above. Therefore, it is identical to the previous solution.

The discussion about this problem is included in my book <Coding Interviews: Questions, Analysis & Solutions>, with some revisions. You may find the details of this book on Amazon.com, or Apress.
 
The author Harry He owns all the rights of this post. If you are going to use part of or the whole of this ariticle in your blog or webpages,  please add a reference to http://codercareer.blogspot.com/. If you are going to use it in your books, please contact me (zhedahht@gmail.com) . Thanks.   

15 comments:

  1. Hello, I am wondering why do you even need this 2-D matrix, when this problem can be solved using just one-dimensional arrays? Isn't it a memory overhead?

    Please see my C# code without 2D arrays below

    var a = new int[] { 7, 2, 3, 1, 5, 8, 9, 6 };
    var b = new int[a.Length];
    var prev = new int[a.Length];
    var maxLength = 0;
    var maxIndex = -1;

    b[0] = 1;
    prev[0] = -1;

    for (int i = 1; i < a.Length; i++)
    {
    b[i] = 1;
    prev[i] = -1;

    for (int j = 0; j < i; j++)
    {
    if (a[i] > a[j] && b[j] + 1 > b[i])
    {
    b[i] = b[j] + 1;
    prev[i] = j;
    }
    }

    if (b[i] > maxLength)
    {
    maxLength = b[i];
    maxIndex = i;
    }
    }

    var sb = new StringBuilder();
    var currentIndex = maxIndex;

    while (currentIndex != -1)
    {
    sb.Insert(0, a[currentIndex] + " ");
    currentIndex = prev[currentIndex];
    }

    Console.WriteLine("Max length = {0}", maxLength);
    Console.WriteLine(sb.ToString().Trim());


    And thanks for the blog, btw :)

    ReplyDelete
  2. Thank your Uladzimir Pashkevich for your comments. In my post, only a 1-D auxilary array (longest) is used.

    ReplyDelete
  3. just FYI, there is an algorithm that solves LIS in O(nlogn) time and O(n) space, check it out on wiki http://en.wikipedia.org/wiki/Longest_increasing_subsequence#Efficient_algorithms

    ReplyDelete
  4. Note that the example completely changes the problem. The use of the word 'sequence' is incredibly misleading; you're actually looking for SUBSETS starting at the ith node, not SEQUENCES. Finding sequences is trivial (simply walk the array tracking the longest sequential series of increasing numbers). When the problem becomes about SUBSETS, it becomes much more complicated.

    ReplyDelete
  5. ... you also have a memory leak.

    ReplyDelete
  6. Yes, the word subsequence is misleading

    ReplyDelete
  7. Java Program with O(n) - Single for loop without 2D matrix


    import java.util.LinkedList;
    import java.util.List;

    public class FindMaxLengthIncrementalArray {

    /**
    * @param args
    */
    public static void main(String[] args) {
    /*
    * For example: If the input array is {7, 2, 3, 1, 5, 8, 9, 6}, a
    * subsequence with the most numbers in incremental order is {2, 3, 5,
    * 8, 9} and the expected output is 5
    */
    int[] array = { 7, 2, 3, 1, 5, 8, 9, 6 };
    int max = -1;
    List subList = new LinkedList();
    for (int i = 0; i < array.length - 1;) {
    System.out.println(subList);
    if (array[i] > max && array[i] < array[i + 1]) {
    subList.add(array[i]);
    subList.add(array[i + 1]);
    max = array[i + 1];
    i += 2;
    } else if (array[i] > max && ((i + 1) == (array.length - 1))) {
    subList.add(array[i]);
    break;
    } else
    i++;
    }
    System.out.println(subList);
    }
    }

    ReplyDelete
    Replies
    1. Hey Edwin, your algorithm only works with some (very limited) subsets, not to be harsh but you missed a lot of possibly good inputs in it. You shouldn't be comparing two elements at a time, or at least you need more cases. What about an element that you should take, that is in front of an element that is very large, i.e. {1, 10, 2, 3, 4}? Your algorithm would take return 2, for the set {1, 10}.
      You also are failing to account for sequences that may be desirable but very spread out as well. For instance an input array that yours would not work with is this: {7, 2, 3, 1, 5, 4, 8, 5, 9, 6, 7 }. You can't actually do a O(n) solution to this, best is O(nlogn).

      Delete
  8. I can write O(1) without auxiliary memory
    //No. 16 - Maximal Length of Incremental Subsequences
    int MaxLength(int []arr)
    {
    int maxL = 1;
    int maxC = 1;
    for(int i = 1;i < arr.Length;i ++)
    {
    if(a[i] > a[i - 1])
    {
    maxC ++;
    }
    else
    {
    maxL = maxC > maxL ? maxC : maxL;
    maxC = 1;
    }
    }
    return maxL;
    }

    ReplyDelete
  9. This comment has been removed by the author.

    ReplyDelete
  10. This is a "C" code. Works fine though.

    #include
    #include
    void main()
    {
    int ar[10]={7,2,3,1,5,8,9},i,j,max=0,c=0;
    clrscr();
    for(i=0;i"<"6;i++)
    { c=1;
    for(j=i+1;j"<"7;j++)
    {
    if(ar[i]"<"ar[j])
    c++;
    }
    if(c">"max)
    max=c;
    }
    printf("%d",max);
    getch();
    }

    ReplyDelete
  11. import java.io.*;
    import java.util.*;
    import java.text.*;
    import java.math.*;
    import java.util.*;
    import java.io.*;
    class enterclass
    {
    public static void main(String[] args) throws Exception
    {
    BufferedReader br=new BufferedReader(new InputStreamReader(System.in));
    String line=br.readLine();
    int N=Integer.parseInt(line);
    int arr[]=new int[N];
    int i=0;
    while(i!=N)
    {arr[i]=Integer.parseInt(br.readLine());
    i++;
    }
    Arrays.sort(arr);
    int count[]=new int[N];
    int j=0;
    for(int y=0;y<N;y++){count[y]=1;}
    int m=0;
    int a=arr[j];
    for(int l=1;l<N;l++)
    {a=a+l;
    for(int k=0;k<N;k++)
    {if(arr[k]==a)
    {++count[j];
    break;}
    else if(k==N-1){
    if(j<N-1)
    {a=arr[j++];

    l=0;}}
    }
    }

    Arrays.sort(count);
    System.out.println(count[N-1]+1);


    }
    }

    ReplyDelete
  12. this code is proper work for any number of size of array...u can easily find out Maximal Length of Incremental Subsequences...it's dyanamic algorithm

    ReplyDelete
  13. Hi,
    Does i make sense to use DP when we can traverse a "unary tree" i.e. the recursion has only on linear branch, and no bifurcations?

    public static int maxLenght(int[] arr) {
    return fml(arr, 0, Integer.MAX_VALUE,0, 0);
    }

    private static int fml(int[] arr, int idx, int last, int lenght, int maxReached) {
    if (idx == arr.length) {
    if (lenght > maxReached) {
    maxReached = lenght;
    }
    return maxReached;
    }

    if (last > arr[idx]) {
    if (lenght > maxReached) {
    maxReached = lenght;
    }
    lenght = 1;
    }


    return fml(arr, idx + 1, arr[idx], lenght + 1, maxReached);
    }

    ReplyDelete