Thursday, January 16, 2014

In a given 2D binary array, find the area of the biggest rectangle which is filled with 1



   Given array

   1 1 0 1 0 1
   1 1 0 1 1 1
   0 1 1 1 1 1
   0 1 1 1 1 1
   0 0 1 1 1 0
   0 0 0 0 0 0

   Consider each block in the array is an unit and he columns in the array as hanging towers.
   Modify the array in a way that, on each row, the number denotes the height of each column.


   1 1 0 1 0 1
   2 2 0 2 1 2
   0 3 1 3 2 3
   0 4 2 4 3 4
   0 0 3 5 4 0
   0 0 0 0 0 0


   Basic Rules to find the biggest rectangle:
   1. For any tower, the boundaries on each side will be decided by its smaller towers. It means
       that all the towers on each side till we find a smaller tower will be included for the calculation.
   2. Area for the subset equals to (the height of the smaller tower) X (no. of towers in the subset)

   Calculate the area of each subset for each level in the array from top to bottom.

   Level[0] [set 0]              Level[0] [set 1]              Level[0] [set 2]

   1 1 0 1 0 1                   1 1 0 1 0 1                   1 1 0 1 0 1
   ---                                 -                                 -
   2 towers with 1 unit each     1 tower with 1 unit           1 tower with 1 unit
   Maximum Area=2x1=2            Maximum Area=1x1=1            Maximum Area=1x1=1

   Level[0] Maximum Area = 2
   -------------------------------------------------------------------------------
   
   Level[1] [set 0]              Level[1] [set 1]

   2 2 0 2 1 2                   2 2 0 2 1 2
   ---                                 -----
   2 towers with 2 units each    1 tower with 1 unit
   Maximum Area=2x2=4            2 towers with 2 units each
                                 Maximum Area=3x1=3

   For [set 1], even though we have 2 towers with 2 units each in this set, we cannot consider them 
   as a single unit because of the smaller unit's presence in between. So, each tower will be consider 
   as a different subset. But, for the tower with 1 unit has bigger towers around it. So, all the units 
   will be taken (but only one unit from each bigger towers) for the area calculation.

   Level[0] Maximum Area = 4
   -------------------------------------------------------------------------------

   and so on...



int findBigBox (int row, int column, int *matrix)
{
  int max_area = 0;
  int *score = (int *) calloc (row+1, sizeof(int));
  if (NULL != score)
  {
    for (int i=0; i<row; i++)
    {
      for (int j=0; j<column; j++)
      {
        /* Process only if we find 1 */
        if (*(matrix+(i*column)+j) != 0)
        {
          int count = 0;
          int start = j;
          /* Iterate through all the consecutive ones. */
          for (;*(matrix+(i*column)+j) != 0 && j<column; j++)
          {
            /* If it is not the first row, add this element 
               with the previous row element at the same column */
            if (i != 0)
            {
              *(matrix+(i*column)+j) += *(matrix+((i-1)*column)+j);
            }
            *(score+*(matrix+(i*column)+j)) = 1;
          }
          
          getScore (row, column, start, j-1, matrix+(i*column), score);

          for (int s=1; s<=row; s++)
          {
            if (*(score+s))
            {
              int area = s * *(score+s);
              if (max_area < area)
                max_area = area;
            }
          }
          /* Reset the score array to use for the next subset */
          memset (score, 0, (row+1)*sizeof(int));
        }
      }
    }
  }
  free (score);
  
  return max_area;
}
Here the process of finding the number of towers on each subset for each height is call SCORE. This SCORE process can be represented as a tree, which has the following rules, 1. Node structure -> Height -> Score -> Next -> Left -> Right 2. Same numbers on the same set which are non-contiguous will be added as the [Next] node. 3. Same numbers on the same set which are contiguous will increment the [Score] in the first node. 4. Taller tower on its left in the set will be added to the [Left] node (child). 5. Taller tower on its right in the set will be added to the [Right] node (child). 6. [Score] = [Left]->[score] + [Right]->[score] + no of the same [Height] in the set. Example: For the below given set, 3 3 2 3 2 3 1 3 1 3 2 3 1. --------- 2. --------- 3. --------- | H=3|S=1 | | H=3|S=2 | | H=2|S=3 | --------- --------- --------- /L --------- | H=3|S=2 | --------- In the third step, we move the number 2 as the root and 3 as its left as 3 falls on 2's left in the set and 3 is bigger than 2. 4. --------- | H=2|S=4 | --------- /L \R --------- --------- | H=3|S=2 | | H=3|S=1 | --------- --------- We move the number 3 as its right as 3 falls on 2's right in the set 5. --------- N --------- | H=2|S=5 | ---- | H=2|S=0 | --------- --------- /L \R --------- --------- | H=3|S=2 | | H=3|S=1 | --------- --------- We move the number 2 to the [Next] of the existing 2 as they are on the same set still. (Which means it is the ROOT node) 6. --------- N --------- | H=2|S=6 | ---- | H=2|S=0 | --------- --------- /L \R \R --------- --------- --------- | H=3|S=2 | | H=3|S=1 | | H=3|S=1 | --------- --------- --------- This 3 falls on the right side of the second 2, and increase the count on the first 2 7. --------- | H=1|S=7 | --------- /L --------- N --------- | H=2|S=6 | ---- | H=2|S=0 | --------- --------- /L \R \R --------- --------- --------- | H=3|S=2 | | H=3|S=1 | | H=3|S=1 | --------- --------- --------- and so on... Finally, traverse the tree to find the Maximum SCORE for each height.
void getScore (int row, int column, int start, int end, int *matrix_row, int *score)
{
  /* Intervals to be processed to calculate the score for the number */
  int *currentQ = (int *) calloc (column, sizeof(int));
  /* Intervals to be updated for the next number */
  int *nextQ = (int *) calloc (column, sizeof(int));

  /* Interval counts */
  int currentCount = 1;
  int nextCount = 0;

  /* Start with the whole given set */
  currentQ[0] = start;
  currentQ[1] = end;

  for (int i=1; i<=row; i++)
  {
    if (*(score+i))
    {
      int max = 0;
      for (int c=0; c<currentCount; c++)
      {
        int tmp_score = *(currentQ+(c*2)+1) - *(currentQ+(c*2)) + 1;

        /* Update the temporary max */
        if (max < tmp_score)
        {
          max = tmp_score;
        }

        /* Update the intervals for the next bigger number */
        for (int k=*(currentQ+(c*2)); k<=*(currentQ+(c*2)+1); k++)
        {
          /* Find the start of an interval */
          while (*(matrix_row+k) == i && k<=*(currentQ+(c*2)+1))
          {
            k++;
          }
          
          /* End of the current set */
          if (k > *(currentQ+(c*2)+1))
            break;
          
          *(nextQ+(nextCount*2)) = k;

          /* Find the end of that interval */
          while (*(matrix_row+k) != i && k<=*(currentQ+(c*2)+1)) k++;

          if (k > *(currentQ+(c*2)+1)) // End of the current set
          {
            *(nextQ+(nextCount*2)+1) = k-1;
          }
          else
          {
            *(nextQ+(nextCount*2)+1) = k-1;
          }
          nextCount++;
        }
      }

      int *tmp = currentQ;
      currentQ = nextQ;
      nextQ = tmp;
      
      currentCount = nextCount;
      nextCount = 0;
    
      /* Update the score */
      *(score+i) = max;
    }
  }
}