Describe and analyze an algorithm that finds the maximum-area rectangular pattern that appears more than once in a given bitmap. Specifically, given a two-dimensional array M[1 .. n, 1 .. n] of bits as input, your algorithm should output the area of the largest repeated rectangular pattern in M. For example, given the bitmap shown on the left in the figure below, your algorithm should return the integer 195, which is the area of the 15 x 13 doggo. (Although it doesn’t happen in this example, the two copies of the repeated pattern might overlap.) Image: enter image description here
1 Answers
Using term rectangle instead of sub-matrix.
Brute Force Approach O(n^8)
Take two distinct positions A and B in matrix. Then consider all rectangles in the matrix with A as top left corner. Similarly consider all rectangles with B as top left corner. Suppose we have a rectangle with top-left corner A of length l and breadth b, then take corresponding rectangle with top left corner B, length l, breath b (Assuming both such rectangles exist). If every bit of both rectangles match, then we have a pattern of area of lb repeating and if lb is greater than previously seen maximum area, update the maximum area.
Repeat this procedure for all pairs of A and B.
Time Complexity : Without making analysis complex, let us assume that for every point P in matrix,for all rectangles with P as top left corner, possible maximum length of rectangle is n (But this is not true, for example consider point in the last row and last column of matrix). Let us make similar assumption for maximum possible breath of rectangle.
Then from product rule in counting, there are n^2 possible rectangles with point P as their top left corner.
This estimate is not so bad because according to this, there are total n^4 rectangles possible, as there are n^2 points in the rectangle and we have n^2 rectangles per point.
Actual answer is ( (n+1) Choose 2 ) * ( (n+1) Choose 2 ), which is in the order of n^4
So in total, for each pair, we'd compare n^2 rectangles. And similar to above, let us estimate that there are n^2 points in each rectangle for easy calculations. So for a pair of points A and B, we'll have O(n^4) comparisons because for each rectangle, we need to check all of their corresponding points.
And there are ( n^2 Choose 2 ) pairs of points in total, which is of order n^4. So we'd have overall time complexity of O(n^8).
Avoiding Repeated Calculations O(n^6)
We see that we are repeatedly checking same area again and again. For example consider two rectangles with top-left corners at A and B respectively, with length l and breadth b. Again when we check another corresponding rectangles with top-left corners at A and B with length (l+1) and breadth b, we are again checking rectangle of length l and breadth b.
So we use memoization to avoid repeated calculations.
Assume length of a rectangle is measured horizontally and breadth vertically measured.
Consider rectangles of length l+1 and breadth b with top-left corners at A and B. Now we need to compare if all values of corresponding points in rectangles match. A_r, B_r be points to right of A and B respectively. Then if rectangles have same values for all corresponding points, rectangles of length l, breadth b with top left corners A and B must repeat. Similarly rectangles of length l and breadth b with top-left corners at A_r and B_r must match.
Consider below figure :
Time Complexity By above procedure, it'd take O(1) time to compare two rectangles. So compared to above case, time complexity is reduced by a factor of n^2 (which was time required to compare rectangles in earlier case). So O(n^6) time in this case.
Let us represent (P, m, n) as rectangle with top-left corner P, length m and breadth m.
Removing unnecessary calculations O(n^5)
In the above approach, even if we know that if rectangles (A, l, b) and (B, l, b) do not match we again compare the rectangles (A, l+1, b) and (B, l+1, B).
So suppose now we have rectangles with top-left corners A and B each of length l, then what is maximum possible breadth of the rectangles?
If all the corresponding elements in top row of each rectangle do not match, then answer is 0. But if all corresponding rows match, then the answer is 1 + (max.breadth of rectangles of length l but with top-left corners below A and B).
And similar to above approach this calculation would take O(1) time, as we need breadth that of (A, B, l-1) and that of below rectangles with length l.
Refer to this answer Largest Square Block for more clarity.
Time Complexity For each pair of points in the matrix, we have to store max. breadth for length 1, 2, ...n. And there are order of n^4 such pairs. And we can check obtain each value in O(1) time. So overall time complexity is O(n^5). And finding max. area for each length of rectangle for each pair is obvious here.
Further Optimization O(n^4)
Imagine you have two copies of bitmaps. One is glued to the ground and other can move on top the glued one. Let us call fixed one as base and other one as moving bitmap.
Now top-left corner of moving bitmap can be on one of (n^2 - 1) points of base, except the case where moving bitmap sits on top of base. Now in each case, there are some points left-out i.e for some points on base bitmap, there will not be a point of moving one on top of it and vice-versa. Assume top-left corner of moving bitmap needs to have an element of base bitmap below it.
Now take one instance of these (n^2 - 1) configurations. And for all points on moving bitmap which have a point of base bitmap below them, let us construct a new matrix such that it contains "Y" if top and bottom element of both the bitmaps are same, else it'd contain "N". And remember, the size of matrix is same as no of elements on moving bitmap which have an element of base bitmap below them.
Then, maximum area of repeated pattern for that portions of base and moving bitmaps would be maximum solid block area containing all Y's, which can be done in O(n^2) time.
Our required answer is maximum of answers in all these (n^2 - 1) configurations.
Refer to Largest rectangle containing all Y's for further clarity
Time Complexity For each configuration, constructing new matrix of "Y" and "N"'s would take O(n^2) time and maximum area calculation also O(n^2) time. And there are (n^2- 1) such configurations, so overall O(n^4) time.
Note Jeff Erickson has stated that the answer can be computed in O(n^3 polylog n) time. I don't know how to do that. It would be good if anyone posts such algorithm.
- 41
- 6