Hardware Implementation of Adaptive Search Range Assignment for High-Performance HEVC

Inhan Hwang1, Kwangki Ryoo1*

1 Graduate School of Information & Communications, Hanbat National University, 125 Dongseodaero, Yuseong-Gu, Daejeon 34158, Republic of Korea

In this paper, we propose an adaptive search range allocation algorithm for a high-performance HEVC encoder and a hardware architecture suited to the proposed algorithm.
In the existing method, to improve prediction performance, the motion vectors of the neighboring blocks are used as prediction vector candidates, and a search range of a predetermined size is allocated around the one candidate with the minimum difference from the current motion vector. The proposed algorithm reduces computation time by shrinking the search range: depending on the pattern of the motion vectors of the four surrounding blocks, the search range is assigned a rectangular or octagonal shape. Moreover, because all four motion vectors are used, more precise prediction is possible. Realizing the algorithm in a form suitable for hardware effectively reduces hardware area and computation time.

Keywords: Motion estimation, Motion vector, Inter prediction, Search range.

1. INTRODUCTION

In recent times, video processing and communication technologies have improved rapidly. As a result, image compression standards with higher performance than H.264/AVC (a conventional image compression standard) are in high demand for high-resolution image applications. As resolution increases, the amount of data required to represent the image digitally rises sharply. Figure 1 shows the amount of data required when images of various resolutions are represented in 8-bit depth, RGB format and transmitted at 30 frames per second.

High-Efficiency Video Coding (HEVC) is a next-generation video compression standard developed jointly by the Moving Picture Experts Group (MPEG) and the Video Coding Experts Group (VCEG) [1].
HEVC encodes and decodes in units of CUs (Coding Units), which are organized in a quad-tree structure with a maximum size of 64×64.

*Email Address: [email protected]

Fig. 1. Amount of data required for various image resolutions

The basic unit of the prediction mode is the PU (Prediction Unit), and one CU is divided into a range of PUs [2], as shown in Figure 2.

Fig. 2. HEVC hierarchical coding structure

In this paper, we use the motion vectors of the four blocks neighboring the current PU to assign a new rectangular or octagonal search range. Eliminating the unnecessary area saves a large amount of computation time. Accurate prediction remains possible because the search area is allocated based on the surrounding motion vectors. In addition, the algorithm is realized in hardware, which effectively reduces hardware area and computation time [3].
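To make the growth in raw data volume mentioned at the start of this section concrete, the following minimal sketch computes the uncompressed bit rate for 8-bit RGB video at 30 frames per second. The function name and the list of resolutions are illustrative assumptions; the resolutions are common examples and are not read from Figure 1.

```python
def raw_bitrate_bps(width: int, height: int,
                    bit_depth: int = 8, channels: int = 3, fps: int = 30) -> int:
    """Uncompressed bit rate in bits per second for one video stream."""
    bits_per_frame = width * height * channels * bit_depth
    return bits_per_frame * fps


# Illustrative resolutions (assumed, not taken from Figure 1).
RESOLUTIONS = {
    "1280x720": (1280, 720),
    "1920x1080": (1920, 1080),
    "3840x2160": (3840, 2160),
}

if __name__ == "__main__":
    for name, (w, h) in RESOLUTIONS.items():
        gbps = raw_bitrate_bps(w, h) / 1e9
        print(f"{name}: {gbps:.2f} Gbit/s uncompressed")
```

Under these assumptions, 1920×1080 needs about 1.49 Gbit/s before compression, and doubling both dimensions multiplies the rate by four.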
2. ADAPTIVE SEARCH RANGE

In existing motion estimation, to improve prediction performance, the motion vectors of the neighboring blocks are configured as prediction candidates, and a search area of a predetermined size is allocated around the one candidate with the minimal difference from the current motion vector. Because of this, the computational load is very high, and this computation occupies more than 96% of the actual encoding time [4].

To reduce computational complexity, motion search in the reference pictures does not examine every block.
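The cost of a fixed-size search area can be illustrated with a generic full-search block matcher. This is a minimal sketch of textbook SAD-based full search, not the authors' implementation or the HM code; the frame layout, function names, and the synthetic test pattern are all assumptions made for illustration.

```python
from typing import List, Optional, Tuple

Frame = List[List[int]]  # frame[y][x] holds one luma sample


def sad(cur: Frame, ref: Frame, bx: int, by: int, dx: int, dy: int, n: int) -> int:
    """Sum of absolute differences between the n x n block at (bx, by) in
    `cur` and the block displaced by (dx, dy) in `ref`."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def full_search(cur: Frame, ref: Frame, bx: int, by: int, n: int,
                search_range: int) -> Tuple[Tuple[int, int], int, int]:
    """Test every displacement in a square +/-search_range window.
    Returns (best motion vector, its SAD, number of candidates evaluated)."""
    height, width = len(ref), len(ref[0])
    best_mv: Optional[Tuple[int, int]] = None
    best_cost = 0
    evaluated = 0
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            x0, y0 = bx + dx, by + dy
            # Skip candidates that fall outside the reference frame.
            if x0 < 0 or y0 < 0 or x0 + n > width or y0 + n > height:
                continue
            cost = sad(cur, ref, bx, by, dx, dy, n)
            evaluated += 1
            if best_mv is None or cost < best_cost:
                best_mv, best_cost = (dx, dy), cost
    if best_mv is None:
        raise ValueError("no candidate lies inside the reference frame")
    return best_mv, best_cost, evaluated


def make_test_frames(size: int = 16, shift: Tuple[int, int] = (2, 1)) -> Tuple[Frame, Frame]:
    """Synthetic frames (assumed pattern): `cur` is `ref` displaced by `shift`."""
    def pattern(x: int, y: int) -> int:
        return (7 * x + 13 * y) % 256
    ref = [[pattern(x, y) for x in range(size)] for y in range(size)]
    cur = [[pattern(x + shift[0], y + shift[1]) for x in range(size)] for y in range(size)]
    return cur, ref
```

A window of ±R contains (2R+1)² candidates, each costing n² absolute differences. With an illustrative R = 64, that is 129 × 129 = 16,641 candidates per block, which is why shrinking or reshaping the window has a direct effect on encoding time.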
Instead, on the assumption that the motion information of the current PU is similar to that of its neighboring blocks, the motion vectors of the neighboring blocks are used [5], as shown in Figure 3.

Fig. 3. Neighboring blocks for current block computation

Using the motion vectors of the neighboring blocks, the search range is allocated as shown in Figure 4 [6]. If all four motion vectors are zero motion vectors, the motion of the current block is likely to be small, so an octagon-type search range is allocated. If the horizontal components of all the motion vectors are zero, the current block is likely to move vertically, so a vertical rectangle search range is allocated. If the vertical components of all the motion vectors are zero, a horizontal rectangle search range is allocated.
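The allocation rules of this section, including the diagonal and full-search cases described next, can be sketched as a small classifier. This is a sketch under stated assumptions rather than the authors' implementation: the paper does not define the "oblique line" test, so here a diagonal type is chosen only when the product of the horizontal and vertical components has the same nonzero sign for all four neighbors, and the mapping of that sign to Left or Right Diagonal is arbitrary. All names are hypothetical.

```python
from enum import Enum
from typing import Sequence, Tuple

MotionVector = Tuple[int, int]  # (horizontal, vertical) component


class SearchRangeType(Enum):
    OCTAGON = "octagon"
    VERTICAL_RECT = "vertical_rectangle"
    HORIZONTAL_RECT = "horizontal_rectangle"
    RIGHT_DIAGONAL = "right_diagonal"
    LEFT_DIAGONAL = "left_diagonal"
    FULL = "full"


def classify_search_range(neighbor_mvs: Sequence[MotionVector]) -> SearchRangeType:
    """Pick a search range shape from the four neighboring motion vectors."""
    if len(neighbor_mvs) != 4:
        raise ValueError("exactly four neighboring motion vectors are expected")

    all_x_zero = all(mv[0] == 0 for mv in neighbor_mvs)
    all_y_zero = all(mv[1] == 0 for mv in neighbor_mvs)

    if all_x_zero and all_y_zero:
        return SearchRangeType.OCTAGON          # little motion expected
    if all_x_zero:
        return SearchRangeType.VERTICAL_RECT    # motion is purely vertical
    if all_y_zero:
        return SearchRangeType.HORIZONTAL_RECT  # motion is purely horizontal

    # Assumption: "oblique" means every neighbor moves along the same diagonal.
    products = [mv[0] * mv[1] for mv in neighbor_mvs]
    if all(p < 0 for p in products):
        return SearchRangeType.RIGHT_DIAGONAL   # sign-to-name mapping is assumed
    if all(p > 0 for p in products):
        return SearchRangeType.LEFT_DIAGONAL    # sign-to-name mapping is assumed

    return SearchRangeType.FULL                 # vectors spread in all directions
```

The tests run from the most restrictive shape to the least, so the full search range is reached only when none of the narrower shapes applies.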
If the motion vectors lie along an oblique line, a search range is allocated in the Right Diagonal or Left Diagonal form. If none of the above cases is satisfied, that is, the motion vectors are spread in all directions, the full search range is allocated.

Fig. 4. A search range according to a motion vector

3. PROPOSED ADAPTIVE SEARCH RANGE HARDWARE ARCHITECTURE

Figure 5 shows the overall block diagram of the proposed adaptive search range hardware architecture. The overall structure includes a CLK Gen module for distributing the necessary clocks, a MEM Ctrl module for receiving pixels from the memory, a TComDataCU module for obtaining motion vector values by storing input pixel information, and a TEncSearch module for allocating a search range using the motion vectors.

Fig. 5. Adaptive search range hardware structure

4. RESULTS AND COMPARISON

To evaluate the performance improvement of the proposed algorithm, we verified it using the lowdelay_P_main10 configuration in HM16.9 (HEVC reference software), and the encoding time saving is calculated using Equation 1. The verification results are shown in Table 1: the average encoding speed is improved by 36%, BDPSNR increased by 0.1% and BDBitrate decreased by 0.8%.

$$\Delta TS(\%) = \frac{TS_{HM} - TS_{propose}}{TS_{HM}} \times 100 \qquad (1)$$

The simulation shows the results for the tested 32×32 CU size operation. The proposed adaptive search range allocation algorithm is based on motion vectors. Figure 6 shows the simulation result of the TComDataCU module, which determines the search range based on the surrounding motion vectors. SearchRange_type is the search area determined by the TComDataCU module.

Fig. 6. Motion vector search range simulation result

Figure 7 shows the simulation result of the TEncSearch module, which controls the address of the search area according to SearchRange_type.

Fig. 7. Address controller simulation result

Table 1. Proposed algorithm verification results

Sequence        QP  HM16.9                    Proposed                  BD rate  BD PSNR  ΔTS (%)
                    Bitrate  PSNR  Time       Bitrate  PSNR  Time
                    (kbps)   (dB)             (kbps)   (dB)
RaceHorses      22  1948.6   40    2450.7     1947.4   40    1965.7     -0.001   0.03     19.7
                27  1002.7   36.1  2303.9     1006.6   36.2  1824.9                       20.7
                32  514.1    23.8  2190.9     514.1    32.8  1710.5                       21.9
                37  263.7    30    2042.2     265.1    30    1537.8                       24.6
BQSquare        22  3609.5   39.9  2419.3     3615.6   39.9  1765.2     -0.002   0.05     27
                27  1633     56.9  2218.6     1635.9   35.9  1543.6                       30.4
                32  828.7    32.9  1989.6     828.7    32.9  1329.6                       33.1
                37  446      30.2  1890.4     446      30.2  1222.8                       35.3
BasketballPass  22  1271     42.1  1888       1268.5   42.1  765.3      0.006    -0.112   59.4
                27  664.3    38.6  1830       664.8    38.6  733                          59.9
                32  344.4    35.4  1770.8     342.2    35.4  754.2                        57.4
                37  180.1    32.7  1695.8     179.7    32.7  817.5                        51.7

5. CONCLUSIONS

This paper proposes an adaptive search range allocation method for the HEVC standard. A new search range is allocated based on the motion vectors of neighboring blocks, and a suitable hardware structure is described. The proposed adaptive search range allocation was applied to the HEVC standard reference software HM-16.9, and the hardware was designed in Verilog HDL and verified using the ModelSim SE-64 10.1c simulator. As a result, the average encoding speed improved by 36%, BDPSNR increased by 0.1% and BDBitrate decreased by 0.8%.
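As a cross-check of the reported speed-up, Equation 1 can be applied to the Time columns of Table 1. The sketch below is only an illustration: the function and variable names are hypothetical, and the timing values are copied from Table 1.

```python
from typing import List, Tuple


def time_saving_percent(ts_hm: float, ts_proposed: float) -> float:
    """Equation 1: encoding-time saving relative to HM, in percent."""
    if ts_hm <= 0:
        raise ValueError("reference encoding time must be positive")
    return (ts_hm - ts_proposed) / ts_hm * 100.0


# (sequence, QP, HM16.9 time, proposed time), copied from Table 1.
TABLE1_TIMES: List[Tuple[str, int, float, float]] = [
    ("RaceHorses", 22, 2450.7, 1965.7),
    ("RaceHorses", 27, 2303.9, 1824.9),
    ("RaceHorses", 32, 2190.9, 1710.5),
    ("RaceHorses", 37, 2042.2, 1537.8),
    ("BQSquare", 22, 2419.3, 1765.2),
    ("BQSquare", 27, 2218.6, 1543.6),
    ("BQSquare", 32, 1989.6, 1329.6),
    ("BQSquare", 37, 1890.4, 1222.8),
    ("BasketballPass", 22, 1888.0, 765.3),
    ("BasketballPass", 27, 1830.0, 733.0),
    ("BasketballPass", 32, 1770.8, 754.2),
    ("BasketballPass", 37, 1695.8, 817.5),
]


def mean_time_saving(rows: List[Tuple[str, int, float, float]]) -> float:
    """Average of the per-row time savings, in percent."""
    savings = [time_saving_percent(hm, prop) for _, _, hm, prop in rows]
    return sum(savings) / len(savings)


if __name__ == "__main__":
    for name, qp, hm, prop in TABLE1_TIMES:
        print(f"{name:15s} QP{qp}: {time_saving_percent(hm, prop):5.1f} %")
    print(f"mean: {mean_time_saving(TABLE1_TIMES):.1f} %")
```

The per-row results agree with the ΔTS column of Table 1 to within 0.1 percentage points, and their mean comes out a little under 37%, consistent with the reported average improvement of 36%.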