# Dynamic voltage and frequency scaling scheme for an adaptive LDPC decoder using SNR estimation

*EURASIP Journal on Wireless Communications and Networking*
**volume 2013**, Article number: 255 (2013)

## Abstract

In this paper, we propose a low-power adaptive low-density parity check (LDPC) decoder that utilizes dynamic voltage and frequency scaling to reduce power consumption. Most existing adaptive LDPC decoders have focused only on the decoding performance based on the signal-to-noise ratio (SNR) estimation. However, significant idle power is consumed when the decoder awaits the next frame after processing a frame. In mobile communication standards such as China Mobile Multimedia Broadcasting and Digital Video Broadcasting Satellite Second Generation, adaptive coding and modulation has been adopted. Thus, it is possible to reduce the power consumption efficiently by using the SNR estimation. In this paper, we apply a customized frequency selection scheme and a variable voltage generation scheme to an adaptive LDPC decoder to reduce the dynamic power consumption. The proposed schemes result in a reduction of 44% in the energy consumption of an LDPC decoder implemented using 0.18-μm complementary metal-oxide-semiconductor technology.

## 1 Introduction

Today, the need for reliable, high-rate transmission is increasing as 4G mobile communication systems offer various multimedia services. Since data transmission in mobile environments at rates above 100 Mbps is common, the demand for efficient error correction codes has been growing rapidly. Turbo codes were regarded as the best channel coding method before low-density parity check (LDPC) codes started to draw attention. Because LDPC codes have a larger minimum distance than Turbo codes and exhibit very good bit error rate (BER) curves, LDPC codes are being studied actively in the area of next-generation data communication [1].

LDPC codes are linear block codes that were originally devised by Gallager in the 1960s [2]. However, the codes were impractical to implement in hardware in those days, so they were largely ignored. About 30 years later, MacKay and Neal revisited LDPC codes and rediscovered their excellent properties thanks to the development of communication and integrated circuit technologies [3]. In 2001, Chung et al. showed that LDPC codes can approach within 0.0045 dB of the Shannon limit [4]. Many iterative LDPC decoding schemes are based on the Sum-Product algorithm [5] because the algorithm can be fully parallelized, resulting in high-speed decoding [6]. LDPC codes have been adopted by mobile communication standards such as China Mobile Multimedia Broadcasting (CMMB) [7] and Digital Video Broadcasting Satellite Second Generation (DVB-S2) [8].

Several studies have addressed low-power LDPC decoding, and they are based on the fact that the dynamic power consumption of a module is proportional to the amount of switching activity. In [9, 10], adjusting either the maximum number of iterations or the quantization level according to the estimated SNR value is proposed in order to reduce the decoding power consumption. In [11], it was shown that dynamic voltage and frequency scaling (DVFS) could be effective in reducing the power consumption while maintaining the BER performance of an LDPC decoder. In that work, the number of parity-check errors after a predetermined number of iterations is used to estimate the remaining number of iterations until decoder termination, which in turn determines the voltage and the clock frequency for the remaining iterations. Unlike the previous works, this paper uses the estimated SNR value to determine the supply voltage and the operating frequency of an LDPC decoder. An SNR estimator using a pilot signal is already employed in adaptive coding and modulation (ACM) [8] for DVB-S2. Thus, the proposed DVFS scheme can be applied to such standards with negligible overhead.

In this paper, binary LDPC codes for CMMB over an additive white Gaussian noise (AWGN) channel with BPSK modulation are considered. We determine the correlation between the throughput and the power consumption of an LDPC decoder with respect to the channel SNR. In DVFS schemes, the supply voltage control is crucial since the achievable maximum clock speed of a circuit depends on the supply voltage level. For example, if the maximum clock speed achievable at the supply voltage is lower than the required operating clock frequency, a timing violation may occur. To apply the proposed DVFS scheme to an LDPC decoder, we designed a DVFS controller composed of a frequency selector and a supply voltage generator. A look-up table is used for the frequency selector, and the supply voltage generator is based on the variable supply voltage scheme suggested in [12]. Various performance evaluations were conducted to make sure that slowing the decoder through DVFS would not violate the throughput requirement of the CMMB standard. With the addition of the controller, the size is increased by 2.2%, while the power consumption is reduced by up to 44% for CMMB codes.

Our contributions to low-power LDPC decoding are summarized as follows. First, we show that we can increase the power efficiency of the LDPC decoding with negligible area overhead since almost all of the ACM schemes employ a hardware module for the SNR estimation. Second, we show the power savings when a DVFS scheme is used for a standard code such as that of CMMB. Third, the proposed DVFS scheme can be applied to any standard because the proposed DVFS controller is independent of specific ACM structures. Finally, the proposed LDPC decoder is designed as an ASIC decoder including a DVFS controller, and we used commercial tools to evaluate its quality.

This paper consists of five sections. In the following section, we address the theoretical background regarding LDPC decoding, including the main characteristics of an adaptive LDPC decoder and the conventional structure of the ACM decoding flow. In Section 3, we find the operating clock frequency that satisfies the throughput requirement of CMMB and generate the supply voltage for the required operating clock frequency. The implementation and experimental results are presented in Section 4, and Section 5 concludes this paper.

## 2 Background

### 2.1 LDPC decoding algorithm

LDPC codes are very long and are often randomly generated, but their decoding is simpler than that of other codes with comparable error correction capabilities. An LDPC code is a linear block code, and the decoding is carried out using a parity-check matrix called the H-matrix. When an H-matrix contains a fixed number of 1s in each row and a fixed number of 1s in each column, the LDPC code is called regular. The rows and the columns of an H-matrix represent the parity-check equations and the code symbols, respectively. A Tanner graph is often used to represent the equivalent information [13]. A Tanner graph is a bipartite graph whose two node sets represent the check nodes and the bit nodes. The check nodes correspond to the rows of the H-matrix, and the bit nodes correspond to the columns of the H-matrix. For instance, when the H-matrix for (7,4) Hamming codes is given as shown in (1), Figure 1 shows the equivalent Tanner graph.
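The row/column to check-node/bit-node correspondence can be illustrated with a short sketch. The matrix below is one common systematic parity-check matrix for a (7,4) Hamming code; it is an assumption chosen for illustration and may differ from the matrix given in (1). The helper names `tanner_graph` and `syndrome` are likewise hypothetical.

```python
# One common parity-check matrix of a (7,4) Hamming code (assumption:
# the matrix in (1) of the paper is not reproduced here and may differ).
H = [
    [1, 1, 0, 1, 1, 0, 0],
    [1, 0, 1, 1, 0, 1, 0],
    [0, 1, 1, 1, 0, 0, 1],
]


def tanner_graph(h):
    """Return (check_to_bits, bit_to_checks) adjacency lists of matrix h.

    Check node i is connected to bit node j exactly when h[i][j] == 1.
    """
    check_to_bits = [[j for j, v in enumerate(row) if v] for row in h]
    bit_to_checks = [[] for _ in h[0]]
    for i, bits in enumerate(check_to_bits):
        for j in bits:
            bit_to_checks[j].append(i)
    return check_to_bits, bit_to_checks


def syndrome(h, word):
    """Evaluate every parity-check equation (row of h) on a binary word."""
    return [sum(a & b for a, b in zip(row, word)) % 2 for row in h]


check_to_bits, bit_to_checks = tanner_graph(H)
print(check_to_bits)   # edges seen from the check nodes (rows)
print(bit_to_checks)   # edges seen from the bit nodes (columns)
print(syndrome(H, [1, 0, 0, 0, 1, 1, 0]))  # all-zero for a valid codeword
```

Note that this small Hamming matrix is not regular in the sense defined above, since its column weights differ; it only serves to show how the graph is read off the matrix.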

Most practical LDPC decoders are based on a message-passing concept called belief propagation (BP). BP is carried out by passing messages that express the degree of belief that a bit is 0 or 1 between adjacent check nodes and bit nodes. Based on the delivered messages, each node attempts to decode its own value. If the decoded value turns out to contain an error, the decoding process is repeated up to a predefined number of times. LDPC decoding consists of initialization, check node processing, bit node processing, and tentative decision and parity check operations, and the entire algorithm is described in Table 1. After initialization, the remaining steps are repeated iteratively. When the parity check equation $HC^{T} = 0$ is satisfied, the decoding process is terminated with success. Otherwise, the decoding process is iterated until the predetermined maximum iteration count is reached.
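The iterative flow described above can be sketched as follows. This is a minimal illustration, not the algorithm of Table 1: it uses the plain min-sum approximation of the check node update with a flooding schedule, a log-likelihood ratio (LLR) convention in which positive values favor bit 0, and a small Hamming matrix as a stand-in for an LDPC H-matrix. The function name `min_sum_decode` and all parameters are assumptions made for this example.

```python
def min_sum_decode(h, llr, max_iters=50):
    """Flooding min-sum decoding of channel LLRs (positive favors bit 0).

    Returns (hard_bits, iterations_used, success).
    Assumes every check node is connected to at least two bit nodes.
    """
    rows = [[j for j, v in enumerate(r) if v] for r in h]
    # Initialization: bit-to-check messages start as the channel LLRs.
    b2c = {(i, j): llr[j] for i, bits in enumerate(rows) for j in bits}
    c2b = {}
    bits_hat = [0 if x >= 0 else 1 for x in llr]
    for it in range(1, max_iters + 1):
        # Check node processing: sign product and minimum magnitude of
        # the incoming messages, excluding the target edge.
        for i, bits in enumerate(rows):
            for j in bits:
                others = [b2c[(i, k)] for k in bits if k != j]
                sign = -1 if sum(m < 0 for m in others) % 2 else 1
                c2b[(i, j)] = sign * min(abs(m) for m in others)
        # Bit node processing: channel LLR plus all incoming check messages.
        total = list(llr)
        for (i, j), m in c2b.items():
            total[j] += m
        for (i, j) in b2c:
            b2c[(i, j)] = total[j] - c2b[(i, j)]
        # Tentative decision and parity check (early termination on success).
        bits_hat = [0 if t >= 0 else 1 for t in total]
        if all(sum(bits_hat[j] for j in bits) % 2 == 0 for bits in rows):
            return bits_hat, it, True
    return bits_hat, max_iters, False


# Stand-in parity-check matrix (a (7,4) Hamming code, for illustration only).
H = [
    [1, 1, 0, 1, 1, 0, 0],
    [1, 0, 1, 1, 0, 1, 0],
    [0, 1, 1, 1, 0, 0, 1],
]
# Codeword 1000110 sent with BPSK; bit 1 arrives weakly on the wrong side.
llr = [-2.0, -0.5, 2.0, 2.0, -2.0, -2.0, 2.0]
decoded, iterations, ok = min_sum_decode(H, llr)
print(decoded, iterations, ok)
```

The tentative decision and the parity check sit at the end of every iteration here; skipping them in early iterations is the refinement discussed in Section 3.1.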

### 2.2 Adaptive coding and modulation

In the ACM architecture, the data from a base station is transmitted after the channel coding, interleaving, and modulation are processed. The receiver first estimates the channel state from the received signal and then sends the estimation result back to the sender. The channel state estimation is typically performed based on the SNR value. The sender determines the modulation and coding scheme (MCS) [8] based on this information and adaptively applies channel coding, interleaving, and modulation methods according to the channel state for the upcoming transmission. ACM techniques typically result in better transmission rates with smaller error rates than typical coding and modulation techniques, since the proper MCS level is determined based on the estimated channel state. In ACM, accurate channel state estimators are important [14]. Currently, standards for mobile multimedia services such as CMMB, DVB-S2, and DVB-T2 employ ACM techniques. Figure 2 shows an ACM structure when the SNR estimation is used to determine the proper MCS level. In this paper, we use the SNR estimation to make decisions regarding the supply voltage level and the operation frequency to satisfy the throughput requirement of CMMB, and we propose an adaptive structure for a dynamic operating frequency and supply voltage scaling.

### 2.3 Dynamic voltage and frequency scaling

The power consumption in a complementary metal-oxide-semiconductor (CMOS) circuit is computed by (2) [15].

In (2), *P*_{switching} is the dynamic power consumption due to switching activities, *P*_{SC} is the dynamic power consumption due to the short circuit current (*I*_{SC}), and *P*_{leakage} is the static power consumption due to the leakage current. The value of *P*_{switching} is proportional to the operating frequency and the square of the supply voltage, since the voltage change Δ*V* is approximately equal to the supply voltage, *V*_{dd}. In the equation for *P*_{switching}, *α* and *C*_{L} are constant, and they represent the node transition factor and the loading capacitance, respectively. Hence, DVFS is very effective in reducing *P*_{switching}. DVFS has been commonly used to reduce the dynamic power consumption of both high-performance desktop CPUs and mobile embedded CPUs. In this paper, we apply DVFS to reduce the power consumption of LDPC decoders by adding a DVFS controller to dynamically determine the proper level of the supply voltage and the corresponding operating frequency.
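For reference, the three terms described above match the standard CMOS power decomposition given in [15]. The display below is a sketch of that standard form, not a verbatim copy of (2); the symbols $f_{clk}$ (clock frequency) and $I_{leakage}$ (leakage current) are notation assumed here.

$$
P_{total} = P_{switching} + P_{SC} + P_{leakage}
          = \alpha\, C_{L}\, \Delta V\, V_{dd}\, f_{clk} + I_{SC}\, V_{dd} + I_{leakage}\, V_{dd}
$$

With $\Delta V \approx V_{dd}$, the switching term reduces to

$$
P_{switching} \approx \alpha\, C_{L}\, V_{dd}^{2}\, f_{clk},
$$

which makes the linear dependence on frequency and the quadratic dependence on supply voltage explicit.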

## 3 Proposed low-power LDPC decoder

In this section, we explain a low-power LDPC decoder with the ACM LDPC decoding flow shown in Figure 3. This new LDPC decoder is based on the adaptive LDPC decoder previously proposed in [16], to which we added a DVFS controller to provide dynamically adjustable operating frequencies and the corresponding supply voltages. To find the optimal operating point for CMMB codes, we carried out various analyses regarding the correlation between the operating frequency and the supply voltage with respect to a certain SNR value. The DVFS controller determines the operating frequency (*f*_{ext}) for a given SNR value and generates a supply voltage (*V*_{DDL}) such that the adaptive LDPC decoder operates without any timing violation. The next three subsections describe the basic LDPC decoder used in this paper, the frequency selection scheme, and the variable supply voltage generator.

### 3.1 Adaptive low-power LDPC decoder

The proposed LDPC decoder is based on the adaptive LDPC decoder proposed in [16]. A typical LDPC decoding flow when an LDPC decoder is used in the ACM architecture is shown in Figure 4. The sender inserts a frame called the start of frame (SOF) into the encoded signal, and the receiver estimates the SNR. The estimated SNR information is provided to the LDPC decoder to determine whether or not the parity check and the tentative decision will be conducted. An accurate estimation of the SNR is crucial for the proposed adaptive parity check scheme to be effective. To estimate the SNR, we used the signal-to-noise variance (SNV) algorithm as described in (3), which is a special case of the maximum likelihood (ML) estimator [17]. It is well known that the SNV algorithm gives a reliable estimation of the SNR for short SOFs. In the SNV algorithm, the SNR estimation is carried out by computing the correlation between a predetermined test-symbol vector **c** and the corresponding vector of received symbols **r** when *N* test symbols are transferred. To evaluate the accuracy of SNV, we conducted experiments with an SOF of 26 symbols as used in DVB-S2. Figure 5 shows the mean square error (MSE) of the estimated SNR value using SNV, where the MSE is computed by (4). The value of the MSE is less than 2 when the real SNR is greater than 1 dB. Since SNR estimation is necessary for all ACM schemes, power reduction techniques utilizing an SNR estimation do not require any additional hardware resources. Therefore, the power reduction based on the SNR estimation is an efficient way to reduce power consumption with little hardware overhead.

In (3), *Re{•}* denotes the real part of a complex quantity, and *r*_{m}, *c*_{m}, and *N* represent the *m*th received symbol, the *m*th test symbol, and the length of the SOF, respectively.

In (4), $\widehat{\rho}$ is an estimate of the SNR, and *ρ* is the true SNR.
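As a rough illustration of pilot-based SNR estimation, the sketch below implements one common data-aided estimator: the signal amplitude is taken from the mean of Re{r_m c_m*}, and the noise power is what remains of the received power. This form is an assumption and is not guaranteed to match (3) term by term; it also assumes unit-modulus test symbols. The function names are hypothetical, and whether the MSE in (4) is taken in linear or dB units is not stated in the text, so `mean_square_error` leaves that choice to the caller.

```python
import math


def estimate_snr(received, pilots):
    """Data-aided SNR estimate (linear scale) from complex symbol lists.

    Assumes unit-modulus pilots and more received power than signal power.
    """
    n = len(pilots)
    # Signal amplitude: mean correlation between received and test symbols.
    amplitude = sum((r * c.conjugate()).real
                    for r, c in zip(received, pilots)) / n
    total_power = sum(abs(r) ** 2 for r in received) / n
    noise_power = total_power - amplitude ** 2
    return amplitude ** 2 / noise_power


def to_db(ratio):
    """Convert a linear power ratio to decibels."""
    return 10.0 * math.log10(ratio)


def mean_square_error(estimates, true_value):
    """Average squared deviation of the estimates from the true value."""
    return sum((e - true_value) ** 2 for e in estimates) / len(estimates)


# Hand-built example: pilots +1, -1, +1, -1 received with a fixed offset.
pilots = [1 + 0j, -1 + 0j, 1 + 0j, -1 + 0j]
received = [1.5 + 0j, -0.5 + 0j, 1.5 + 0j, -0.5 + 0j]
snr = estimate_snr(received, pilots)
print(snr, to_db(snr))
```

In this hand-built case the amplitude estimate is 1.0 and the residual power is 0.25, so the estimate is a ratio of 4 (about 6 dB).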

We analyzed the performance of the proposed LDPC decoder for CMMB at various SNR values. Table 2 summarizes the results of a simulation of 1,000,000 frames of data for the rate-1/2 CMMB LDPC code with a block length of 9,216 and a dimension of 4,608 [7]. For each SNR value, we assessed the average number of iterations and the minimum number of iterations with the maximum iteration count set to 50. According to Table 2, a smaller iteration count is sufficient for a higher SNR, while a larger iteration count is needed for a lower SNR. Hence, the tentative decision and the parity check need not be computed until the minimum iteration count for the estimated SNR value has been reached. For instance, when the SNR is 2.5 dB, the minimum iteration count is 4. Therefore, until the LDPC decoding has repeated four times, the parity check and tentative decision operations can be skipped. As a result, both the decoding time and the power consumption are improved. In this paper, we further improve the power efficiency of the previous LDPC decoder by applying DVFS. The former research focused on the power efficiency when the SNR was low, while our approach can reduce the power consumption even when the SNR values are high. Because a high SNR value leaves more room to slow the computation without violating the throughput requirement, we can reduce power consumption by using a low clock speed and a low supply voltage.
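The skipping rule can be sketched as a simple schedule. The interpretation used here is an assumption: since some frame was observed to converge at the minimum iteration count, the check is taken to run from that iteration onward and to be skipped only before it. The function name `check_schedule` is hypothetical.

```python
def check_schedule(min_iters, max_iters):
    """Yield (iteration, run_parity_check) pairs for one frame.

    The tentative decision and parity check are skipped for iterations
    below min_iters and executed from min_iters onward (assumed reading).
    """
    for iteration in range(1, max_iters + 1):
        yield iteration, iteration >= min_iters


# Minimum iteration count 4 is the value quoted in the text for 2.5 dB;
# the maximum of 6 is shortened here only to keep the printout small.
schedule = list(check_schedule(min_iters=4, max_iters=6))
print(schedule)
skipped = sum(1 for _, run in schedule if not run)
print(skipped)
```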

### 3.2 Frequency scaling scheme based on SNR estimation

The system net load data rate for the 1/2-rate LDPC code specified in the CMMB standard is 10.852 Mbps [7]. The proposed LDPC decoder operates at 185 MHz when the supply voltage is 1.8 V. When the maximum iteration count is set to 20, the throughput is 14.32 Mbps. Since the real performance of LDPC decoders depends on the channel condition, we carried out a simulation on 1,000,000 frames to evaluate the LDPC decoder performance with respect to the maximum number of iterations. The results are summarized in Table 3. The LDPC decoding is composed of the initialization and the repeated iterations. The amount of computation in each iteration is constant; therefore, every iteration takes the same number of clock cycles. We estimated how the throughput varies with the operating clock speed and with the maximum iteration count required at each SNR. When the clock speed is 185 MHz, the throughput is 11.10 Mbps, which satisfies the requirement (10.852 Mbps) for CMMB. In Table 3, the entries in boldface indicate the minimum clock speed that satisfies the throughput requirement for CMMB with the corresponding iteration count. For iteration counts of 24, 25, and 26, the minimum clock speed that satisfies the requirement for CMMB is 185 MHz.
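Because every iteration costs the same number of cycles, the throughput at a given clock speed and iteration count follows from a simple cycle budget. The sketch below illustrates that relation and the selection of the slowest clock that still meets the CMMB requirement. The information length (4,608 bits) and the required rate (10.852 Mbps) come from the text; the cycle counts `INIT_CYCLES` and `CYCLES_PER_ITER` and the candidate clock list are hypothetical placeholders, so the numbers produced are not those of Table 3.

```python
INFO_BITS = 4608          # information bits per frame (from the text)
REQUIRED_MBPS = 10.852    # CMMB net data rate (from the text)
INIT_CYCLES = 2000        # hypothetical initialization cost in cycles
CYCLES_PER_ITER = 2880    # hypothetical cost of one decoding iteration


def throughput_mbps(clock_mhz, max_iters):
    """Worst-case throughput when every frame uses max_iters iterations."""
    cycles_per_frame = INIT_CYCLES + max_iters * CYCLES_PER_ITER
    # bits * (1e6 cycles/s per MHz) / cycles = bit/s; the 1e6 factors
    # cancel when the result is expressed in Mbps.
    return INFO_BITS * clock_mhz / cycles_per_frame


def min_clock_mhz(max_iters, candidates_mhz):
    """Slowest candidate clock meeting the requirement, or None."""
    for clock in sorted(candidates_mhz):
        if throughput_mbps(clock, max_iters) >= REQUIRED_MBPS:
            return clock
    return None


CANDIDATES_MHZ = [25, 50, 85, 125, 185]  # hypothetical clock set
for iters in (10, 26, 30):
    print(iters, min_clock_mhz(iters, CANDIDATES_MHZ))
```

Returning `None` models the situation in which no available clock speed can sustain the required rate for the given iteration count.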

The frequency scaling scheme of this paper selects the minimum clock speed that satisfies the requirement of the CMMB standard based on the estimated SNR. The minimum clock speed required at each maximum iteration count is listed in Table 3, and the minimum clock speed for each estimated SNR is shown in Figure 6. When the SNR of a channel is less than 1.5 dB, the LDPC decoder proposed in this paper does not satisfy the throughput requirement of CMMB. Therefore, for the SNR values of a channel greater than or equal to 2 dB, the minimum clock speeds are indicated. In the existing adaptive LDPC decoders, the clock frequency is fixed at 185 MHz regardless of the SNR estimation. On the other hand, in the proposed design, the clock frequency of 85 MHz satisfies the throughput requirement for CMMB when the SNR is 2.5 dB.
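A look-up-table frequency selector of this kind might be sketched as follows. Only two values are taken from the text: 85 MHz at 2.5 dB and the 185 MHz full-speed clock. The other thresholds and clock speeds are hypothetical placeholders, as is the function name `select_clock_mhz`. Rounding the estimated SNR down to the nearest table entry is a design choice made here so that the selected clock is never slower than the entry at or below the estimate allows.

```python
import bisect

MAX_CLOCK_MHZ = 185                  # full-speed clock (from the text)
SNR_THRESHOLDS_DB = [2.0, 2.5, 3.0]  # ascending; 2.0 and 3.0 are placeholders
CLOCK_MHZ = [125, 85, 60]            # 85 MHz at 2.5 dB is from the text


def select_clock_mhz(snr_db):
    """Pick the clock for the largest threshold not exceeding snr_db.

    Below the first threshold the decoder falls back to full speed.
    """
    index = bisect.bisect_right(SNR_THRESHOLDS_DB, snr_db) - 1
    return MAX_CLOCK_MHZ if index < 0 else CLOCK_MHZ[index]


for snr_db in (1.0, 2.5, 2.7, 5.0):
    print(snr_db, select_clock_mhz(snr_db))
```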

### 3.3 Variable supply voltage generator

To implement the aforementioned frequency selection scheme that sets the operating clock speed for LDPC codes, the supply voltage level should be set accordingly to avoid timing violations in decoder operation. It is important to reduce the voltage level in such a way that the voltage is low enough to reduce power consumption but high enough to provide the required clock speed. To adjust the voltage level dynamically but stably, a variable supply voltage block was implemented as shown in Figure 7. This variable supply voltage block generates the output voltage (*V*_{DDL}) that guarantees that the target circuit can operate without any timing violations when the target circuit has the 'Critical Path Replica' as the critical path with respect to the external clock (*f*_{ext}). When the frequency determined by the frequency scaling scheme is provided to the voltage supply block, the proper supply voltage is generated so that the LDPC decoder operates without any timing violations. Kuroda et al. originally proposed the voltage supply block, and a detailed discussion is found in [12].

## 4 Experimental results

To evaluate the performance of the proposed adaptive LDPC decoder using DVFS, an LDPC decoder for (3,6) regular LDPC codes with a length of 9,216 was implemented together with a DVFS controller. The implementation has a partially parallel structure with 16 sets of processing units to satisfy the throughput requirement of CMMB, as shown in Figure 8. To reduce the amount of memory usage, an address generation unit was used. A detailed discussion of this unit can be found in [18]. A table look-up was used to adjust the coefficients of a modified Min-Sum algorithm and to determine whether or not the parity check processing and the tentative decision processing would be carried out. The critical path of the proposed adaptive LDPC decoder is found at the minimum value finder inside the check node processing unit. Thus, we used the minimum value finder as the 'Critical Path Replica' inside the variable supply voltage block. The proposed adaptive LDPC decoder with the DVFS controller was synthesized using Synopsys' Design Compiler with Chartered Semiconductor's 0.18-μm CMOS cell library. The total size of the synthesized circuit, including the SNR estimator, was 316 K NAND2-equivalent gates, and other information regarding the synthesized circuit is summarized in Table 4.

Before we measured the set of available clock speeds and supply voltage levels, we calculated how much power consumption could be saved based on (2) with respect to various voltage supply levels and the corresponding clock frequencies. By decreasing both the clock speed and the supply voltage as shown in Figure 9, the power consumption can be reduced. The feasible minimum clock speed for each SNR value is estimated by throughput analyses, and the supply voltage generator in the DVFS controller automatically generates the corresponding supply voltage. However, the LDPC decoding time increases as we reduce the clock speed. Hence, the actual energy dissipation to decode a code block mainly depends on the voltage scaling. This means that both voltage scaling and a frequency change are needed to reduce the energy consumption.
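The argument that energy per block depends mainly on voltage can be checked with a first-order calculation that keeps only the switching term of (2), taking power as proportional to *f*·*V*² and decoding time as inversely proportional to *f* for a fixed cycle count. Short-circuit and leakage power are ignored, so this sketch is not expected to reproduce the measured savings reported below. The reference point (185 MHz, 1.8 V) and the scaled point (85 MHz, 1.1 V) are taken from the text; the function names are hypothetical.

```python
F_REF_MHZ = 185.0   # full-speed clock (from the text)
V_REF = 1.8         # nominal supply voltage (from the text)


def switching_power_ratio(f_mhz, vdd):
    """Switching power relative to the reference point (P ~ f * V^2)."""
    return (f_mhz / F_REF_MHZ) * (vdd / V_REF) ** 2


def switching_energy_ratio(f_mhz, vdd):
    """Switching energy per block relative to the reference point.

    For a fixed number of cycles the decoding time grows as F_REF / f,
    so the frequency dependence cancels out of the energy.
    """
    return switching_power_ratio(f_mhz, vdd) * (F_REF_MHZ / f_mhz)


# Frequency scaling alone lowers power but leaves energy per block unchanged.
print(switching_power_ratio(85.0, 1.8), switching_energy_ratio(85.0, 1.8))
# Scaling the voltage as well lowers the energy per block.
print(switching_power_ratio(85.0, 1.1), switching_energy_ratio(85.0, 1.1))
```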

The set of feasible clock speeds and supply voltage levels for the proposed DVFS controller was obtained by running a Synopsys HSPICE simulation, and the results are listed in Table 5. The clock speed was determined through SNR estimation, and the supply voltage that would not incur any timing violations was measured at a 50-mV interval. The lowest level of the supply voltage was chosen in such a way that the propagation delay of the 'Critical Path Replica' is less than the clock cycle time with a 20% timing margin. For instance, when the estimated SNR is 2.5 dB, the proposed LDPC decoder operates with a clock frequency of 85 MHz and a supply voltage of 1.1 V. According to [11], the power dissipated by the DVFS controller is less than 10 mW. Therefore, the maximum energy dissipated by the controller during the decoding of a single block is 3.45 μJ when the decoder operates in the SNR range between 1.5 and 6 dB. When the SNR is 2.5 dB, the energy consumed by the proposed LDPC decoder in decoding one block is 235.43 mJ. Hence, we conclude that the energy overhead due to the DVFS controller is negligibly small.
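The voltage search described above (50-mV steps, 20% timing margin) can be mimicked in software with a simple delay model. Everything about the model is an assumption: the alpha-power-law delay expression, the threshold voltage, the exponent, and the scale constant `K_NS` are hypothetical and are not fitted to the HSPICE results of Table 5, so the voltages returned are illustrative only. The margin is interpreted here as requiring 1.2 times the replica delay to fit within one clock period.

```python
K_NS = 3.7        # hypothetical delay scale factor in nanoseconds
VTH_MV = 500      # hypothetical threshold voltage in millivolts
ALPHA = 1.5       # hypothetical velocity-saturation exponent


def replica_delay_ns(vdd_mv):
    """Alpha-power-law delay model of the critical path replica (assumed)."""
    vdd = vdd_mv / 1000.0
    vth = VTH_MV / 1000.0
    return K_NS * vdd / (vdd - vth) ** ALPHA


def min_supply_mv(clock_mhz, step_mv=50, v_max_mv=1800, v_min_mv=600,
                  margin=0.20):
    """Lowest supply voltage (mV) meeting timing with the given margin.

    Steps down from v_max_mv and stops at the first timing failure,
    which is valid because the modeled delay grows as voltage drops.
    Returns None if timing fails even at v_max_mv.
    """
    period_ns = 1000.0 / clock_mhz
    best = None
    vdd_mv = v_max_mv
    while vdd_mv >= v_min_mv:
        if replica_delay_ns(vdd_mv) * (1.0 + margin) > period_ns:
            break
        best = vdd_mv
        vdd_mv -= step_mv
    return best


for clock in (185, 85, 400):
    print(clock, min_supply_mv(clock))
```

In hardware, the scheme of [12] closes this loop with a feedback circuit around the replica rather than a table search; the loop above only illustrates the timing criterion.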

For a given code block, the amount of power dissipation was measured with respect to various SNR values, and the results are shown in Figure 10. Figure 10 compares the power dissipation of the proposed design with DVFS against that of the work proposed in [16], where only early termination and computation skips (parity check and tentative decision operations) for low SNR values were applied. Synopsys' Power Compiler was used to measure the power consumption. Since the energy dissipated by the DVFS controller in decoding a code block was 3.45 μJ, the overhead due to the DVFS controller was negligible. The power savings for SNR values greater than 1.5 dB are shown in Figure 11. The power savings increase steadily up to 44% as the SNR rises to 2.5 dB; however, they decrease slightly or remain at similar ratios when the SNR exceeds 2.5 dB. This is mainly because the voltage scaling changes only slightly, from 1.1 to 0.9 V. As we mentioned earlier, the energy consumption of LDPC decoding using DVFS strongly depends on the voltage scaling, because the execution time increases as the clock speed decreases. As a result, we can reduce the energy consumed in decoding a single code block by up to 44% using the proposed scheme.

## 5 Conclusion and future work

In this paper, we proposed a low-power LDPC decoder for China Mobile Multimedia Broadcasting (CMMB) that utilizes DVFS. The proposed decoder consists of an adaptive LDPC decoder and a DVFS controller. The design has a partially parallel structure with 16 sets of processing units to satisfy the throughput requirement of CMMB and operates adaptively according to the SNR estimation. Based on the observation that the maximum decoding iteration counts differ according to the SNR values, we obtained the set of operating frequencies and supply voltage levels for which the proposed decoder would operate to satisfy the throughput requirement of the CMMB standard. A DVFS controller was added to the design to adjust the clock frequency and the supply voltage level, but the area overhead and the power consumption overhead due to the controller turned out to be negligibly small, 2.3% and 0.002%, respectively. For SNR values greater than 2 dB, the power consumption was reduced by up to 44%, and we showed that the LDPC decoder for CMMB using DVFS becomes even more power-efficient as the SNR value increases from 1.5 to 4 dB. Future work includes methods for more accurate SNR estimation for finer-grained DVFS control.

## References

1. Park JG, Lee CH: Architecture of an LDPC Decoder for DVB-S2 using reuse. *J Inst Electron Eng Korea* 2006, 43 SD: 31-37.
2. Gallager RG: Low density parity check codes. *IRE Trans. Information Theory* 1962, IT-8: 21-28.
3. MacKay DJC, Neal RM: Near Shannon limit performance of low density parity check codes. *IEEE Electronics Letters* 1996, 32(18):1645-1646. 10.1049/el:19961141
4. Chung S-Y, Forney GD Jr, Richardson TJ, Urbanke R: On the design of low-density parity-check codes within 0.0045 dB of the Shannon limit. *IEEE Communication Letters* 2001, 5(2):58-60.
5. Chen J, Dholakia A, Eleftheriou E, Fossorier MPC, Hu X: Reduced-complexity decoding of LDPC codes. *IEEE Trans. Communication* 2005, 53(8):1288-1299. 10.1109/TCOMM.2005.852852
6. Shuang W, Cheng S, Qiang W: A parallel decoding algorithm of LDPC codes using CUDA. In *Signals, Systems and Computers, 2008 42nd Asilomar Conference*. Pacific Grove, CA; 2008:171-175.
7. Interfax China: *China releases mobile TV industrial standard. Retrieved on 2007-04-14*. 2006.
8. ETSI, Digital Video Broadcasting (DVB): *Second generation framing structure, channel coding and modulation systems for broadcasting, interactive services, news gathering and other broadband satellite applications. ETSI EN 302 307 v1.1.2*. 2006.
9. Kim S, Sobelman GE, Lee H, Nicole R: Adaptive quantization in min-sum based irregular LDPC decoder. In *Circuit and Systems*. Seattle, WA: ISCAS; 2008:536-539.
10. Darabiha A, Carusone AC, Kschischang FR: Power reduction techniques for LDPC decoders. *IEEE J. Solid State Circuits* 2008, 43(8):1835-1845.
11. Wang W, Choi G, Gunnam KK: Low-power VLSI Design of LDPC Decoder Using DVFS for AWGN Channels. In *VLSI Design, 2009 22nd International Conference*. New Delhi; 2009:51-56.
12. Kuroda T, Suzuki K, Mita S, Fujita T, Yamane F, Sano F, Chiba A, Watanabe Y, Matsuda K, Maeda T, Sakurai T, Furuyama T: Variable supply-voltage scheme for low-power high-speed CMOS digital design. *IEEE J. Solid State Circuits* 1998, 33(3):454-462. 10.1109/4.661211
13. Tanner R: A recursive approach to low complexity codes. *IEEE Trans. Information Theory* 1981, 27(5):533-547. 10.1109/TIT.1981.1056404
14. Albertazzi G, Cioni S, Corazza GE, Neri M, Pedone R, Salmi P, Vanelli-Coralli A, Villanti M: On the adaptive DVB-S2 physical layer: design and performance. *IEEE Wireless Communication* 2005, 12(6):62-68. 10.1109/MWC.2005.1561946
15. Chandrakasan AP, Sheng S, Brodersen RW: Low-power CMOS digital design. *IEEE J. Solid State Circuits* 1992, 27(4):473-484. 10.1109/4.126534
16. Park J-Y, Chung K-S: An adaptive low-power LDPC decoder using SNR estimation. *EURASIP J Wireless Commun Netw* 2011, 2011: 48. 10.1186/1687-1499-2011-48
17. Pauluzzi DR, Beaulieu NC: A comparison of SNR estimation techniques in the AWGN channel. In *IEEE Pacific Rim Conference on Communications, Computers, and Signal Processing*. Victoria; 1995:36-39.
18. Lee S-J, Park J-Y, Chung K-S: Memory efficient multi-rate regular LDPC decoder for CMMB. *IEEE Trans. Consumer Electronics* 2009, 55(4):1866-1874.

## Acknowledgments

This research was supported by the MSIP (Ministry of Science, ICT & Future Planning), Korea, under the ITRC (Information Technology Research Center) support program supervised by the NIPA (National IT Industry Promotion Agency) (NIPA-2013-H0301-13-1011).

## Additional information

### Competing interests

The authors declare that they have no competing interests.

### Keywords

- LDPC decoder
- SNR estimation
- DVFS