Guides & Tutorials

**ftvfatboy** · 08-03-2006, 03:26 PM

Suboptimal Search Methods

--------------------------------------------------------------------------------

Suboptimal Search Methods

Video compression includes many steps and computations, so researchers have been looking for optimizations and faster algorithms, especially for steps that involve many calculations. One such step is the search for a block C in the previous frame to match a given block B in the current frame. An exhaustive search is time-consuming, so it pays to look for suboptimal search methods that search just some of the many overlapping candidate blocks. These methods do not always find the best match, but can generally speed up the entire compression process while incurring only a small loss of compression efficiency.

Signature-Based Methods: Such a method performs a number of steps, restricting the number of candidate blocks in each step. In the first step, all the candidate blocks are searched using a simple, fast distortion measure such as pel difference classification. Only the best matched blocks are included in the next step, where they are evaluated by a more restrictive distortion measure, or by the same measure but with a smaller parameter. A signature method may involve several steps, using different distortion measures in each.

Distance-Diluted Search: We know from experience that fast-moving objects look blurred in an animation, even if they are sharp in all the frames. This suggests a way to lose data. We may require a good block match for slow-moving objects, but allow for a worse match for fast-moving ones. The result is a block matching algorithm that searches all the blocks close to B, but fewer and fewer blocks as the search gets farther away from B. Figure 6.12a shows how such a method may work for maximum displacement parameters dx = dy = 6. The total number of blocks C being searched goes from (2dx + 1)·(2dy+1) = 13×13 = 169 to just 65, less than 39%!

Locality-Based Search: This method is based on the assumption that once a good match has been found, even better matches are likely to be located near it (remember that the blocks C searched for matches highly overlap). An obvious algorithm is to start searching for a match in a sparse set of blocks, then use the best-matched block C as the center of a second wave of searches, this time in a denser set of blocks. Figure 6.12b shows two waves of search, the first considers widely spaced blocks, selecting one as the best match. The second wave searches every block in the vicinity of the best match.

Quadrant Monotonic Search: This is a variant of locality-based search. It starts with a sparse set of blocks C that are searched for a match. The distortion measure is computed for each of those blocks, and the result is a set of distortion values. The idea is that the distortion values increase as we move away from the best match. By examining the set of distortion values obtained in the first step, the second step may predict where the best match is likely to be found. Figure 6.13 shows how a search of a region of 4×3 blocks suggests a well-defined direction in which to continue searching. This method is less reliable than the previous ones since the direction proposed by the set of distortion values may lead to a local best block, whereas the best block may be located elsewhere.

Dependent Algorithms: As has been mentioned before, motion in a frame is the result of either camera movement or object movement. If we assume that objects in the frame are bigger than a block, we conclude that it is reasonable to expect the motion vectors of adjacent blocks to be correlated. The search algorithm can therefore start by estimating the motion vector of a block B from the motion vectors that have already been found for its neighbors, then improve this estimate by comparing B to some candidate blocks C. This is the basis of several dependent algorithms, which can be spatial or temporal.

More Quadrant Monotonic Search Methods: The following suboptimal block matching methods use the main assumption of the quadrant monotonic search method.

Two-Dimensional Logarithmic Search: This multistep method reduces the search area in each step until it shrinks to one block. We assume that the current block B is located at position (a, b) in the current frame. This position becomes the initial center of the search. The algorithm uses a distance parameter d that defines the search area. This parameter is user-controlled with a default value. The search area consists of the (2d + 1)×(2d + 1) blocks centered on the current block B.

Three-Step Search: This is somewhat similar to the two-dimensional logarithmic search. In each step it tests eight blocks, instead of four, around the center of search, then halves the step size. If s = 3 initially, the algorithm terminates after three steps, hence its name.

Orthogonal Search: This is a variation of both the two-dimensional logarithmic search and the three-step search. Each step of the orthogonal search involves a horizontal and a vertical search. The step size s is initialized to (d + 1)/2, and the block at the center of the search and two candidate blocks located on either side of it at a distance of s are searched. The location of smallest distortion becomes the center of the vertical search, where two candidate blocks above and below the center, at distances of s, are searched. The best of these locations becomes the center of the next search. If the step size s is 1, the algorithm terminates and returns the best block found in the current step. Otherwise, s is halved, and a new, similar set of horizontal and vertical searches is performed.

One-at-a-Time Search: In this type of search there are again two steps, a horizontal and a vertical. The horizontal step searches all the blocks in the search area whose y coordinates equal that of block B (i.e., that are located on the same horizontal axis as B). Assuming that block H has the minimum distortion among those, the vertical step searches all the blocks on the same vertical axis as H and returns the best of them. A variation repeats this on smaller and smaller search areas.

Cross Search: All the steps of this algorithm, except the last one, search the five blocks at the edges of a multiplication sign “×”. The step size is halved in each step until it gets down to 1. At the last step, the plus sign “+” is used to search the areas located around the top-left and bottom-right corners of the preceding step. This has been a survey of quadrant monotonic search methods. We follow with an outline of two advanced search methods.

Hierarchical Search Methods: Hierarchical methods take advantage of the fact that block matching is sensitive to the block size. A hierarchical search method starts with large blocks and uses their motion vectors as starting points for more searches with smaller blocks. Large blocks are less likely to stumble on a local maximum, while a small block generally produces a better motion vector. A hierarchical search method is thus computationally intensive, and the main point is to speed it up by reducing the number of operations. This can be done in several ways as follows:
1. In the initial steps, when the blocks are still large, search just a sample of blocks. The resulting motion vectors are not the best, but they are only going to be used as starting points for better ones.
2. When searching large blocks, skip some of the pixels of a block. The algorithm may, for example, use just one-quarter of the pixels of the large blocks, one half of the pixels of smaller blocks, and so on.
3. Select the block sizes such that the block used in step i is divided into several (typically four or nine) blocks used in the following step. This way a single motion vector calculated in step i can be used as an estimate for several better motion vectors in step i + 1.

Multidimensional Search Space Methods: These methods are more complex. When searching for a match for block B, such a method looks for matches that are rotations or zooms of B, not just translations. A multidimensional search space method may also find a block C that matches B but has different lighting conditions. This is useful when an object moves among areas that are illuminated differently. All the methods discussed so far compare two blocks by comparing the luminance values of corresponding pixels. Two blocks B and C that contain the same objects but differ in luminance would be declared different by such methods.
When a multidimensional search space method finds a block C that matches B but has different luminance, it may declare C the match of B and append a luminance value to the compressed frame B. This value (which may be negative) is added by the decoder to the pixels of the decompressed frame, to bring them back to their original values. A multidimensional search space method may also compare a block B to rotated versions of the candidate blocks C. This is useful if objects in the video presentation may be rotated in addition to being moved. The algorithm may also try to match a block B to a block C containing a scaled version of the objects in B. If, for example, B is of size 8×8 pixels, the algorithm may consider blocks C of size 12×12, shrink each to 8×8, and compare it to B. This kind of block search involves many extra operations and comparisons. We say that it increases the size of the search space significantly, hence the name multidimensional search space. It seems that at present there is no multidimensional search space method that can account for scaling, rotation, and changes in illumination and also be fast enough for practical use.

Thread: Guides & Tutorials

Thread Tools

Display

Hybrid View

Bookmarks

Bookmarks

Posting Permissions