# FPGA Implementation of An Area Efficient Digit-Serial FIR Filter's Using CSE and GB Algorithm

Dr. FazalNoor Basha<sup>1</sup>, K.Vamsi Krishna<sup>2</sup>

<sup>1</sup>.Associate Professor, K L University, Vaddeswaram, Guntur, A.P <sup>2</sup> M.Tech Student, K L University, Vaddeswaram, Guntur, A.P E-mail: <sup>1</sup> fazalnoorbasha@kluniversity.in, <sup>2</sup>vamsikrish26na@gmail.com

### Abstract

Many efficient algorithms have been proposed to reduce the area occupied by an FIR filters using multiple constant multiplications (MCM) implemented by shifting-adding techniques. In this paper a new approach has been implemented which reduce the hardware of an FIR filters by using common sub expression and Graph Based algorithm. Experimental results indicate that the new algorithm gives better results than the previously proposed algorithms in terms of area at gate-level and yields alternative low-complexity designs relatively to the bit-parallel design.

**Keywords:** Verilog, Multiple Constant Multiplication(MCM), finite impulse response, Low complexity, CSE algorithm, CSD.

### Introduction

In digital signal processing (DSP) systems finite impulse response (FIR) filters have very much importance since their characteristics in linear-phase and feed-forward implementations make them very useful for building stable high performance filter architectures. The direct and transposed form FIR filter logic diagrams are illustrated in Fig.1(a) and 1(b). As shown in figure both architectures have similar complexity in hardware, and the performance and power efficiency.

The architecture of a multiplier of the digital FIR filter[3] in its transposed form is shown in [Fig. 1(b)], where the multiplication of filter inputs with the filter coefficients is realized, due to the significant impact on the complexity and performance of the design because a large number of constant multiplications are required. This is generally know as the multiple constant multiplications (MCM) operation[4] and is also a central operation and performance bottleneck in many other DSP systems such as fast Fourier transforms, discrete cosine transforms (DCT's) and ECC codes.



Figure 1: Design of FIR Filter. (A) Direct Form With Generic Multipliers



Figure 1: Design of FIR filter. (a) Transposed form with generic multipliers

Multiple constant multiplications is involved to generate constant multiplication in Digital Signal Processing, ECC codes, MIMO system applications.

In most of the applications complete use of multipliers are not needed. As the coefficients are constant then constant multiplication can be used. So depending upon the constant coefficients MCM architecture can be constructed, the designed architecture can be called as many times it required. There are two methods to implement Constant multiplication either by digit serial design or digit parallel design. Digit-parallel design of constant multiplier needs external wire for shifting. Hence it occupies more area while implementation takes place in FPGA or any other ASIC. So digit serial design overcomes area constrain with acceptable delay timing.

Multiplication done with constants is known as constant multiplication. This process is used mostly in filter functions. There are two different types of constant multiplications such as Single Constant Multiplication (SCM) and Multiple Constant Multiplication. The input sample is multiplied with single specific coefficient to produce output is called SCM. Canonical Signed Digit (CSD)[7] or Minimal Signed Digit[8] number system is used to implement SCM multipliers.

Input samples are multiplied with multiple coefficients to produce multiple outputs is known as MCM. Multiplication is a process of shift and adds operation[1]. Constant multiplier design includes the number of adders, subtractors and shifters according to the coefficient pair.

FIR filter output can be obtained by multiplication of input sample and impulse response. Direct form and transposed form implementations are two forms of FIR filter implementations. Instead of direct form, transposed form is most effective. The Multiplication process takes place in multiplier block. Thus transposed form

29868

multiplier blocks in FIR filter will replace by MCM architecture also known as shift and add architecture.

FIR filter gives guaranteed stable, feed forward and linear phase. In FIR filter impulse response is equal to the number of coefficients. When compared to Infinite Impulse Response (IIR) filters it is not the case. Thus this FIR filter implementation is sometimes called as multiplier less digit based recoding method.

Objective Main objective is to eliminate multiplier block and introducing MCM architecture in digit serial FIR filter for the reduction of multiplication in the form of shift and add operations.



Figure 2: MCM operation

In Figure.2 X denotes input sample;  $\alpha 1$  and  $\alpha 2$  are coefficients; and Y1 and Y2 are the filter outputs.

Architecture with the same input which is multiplied by a set of constant coefficients is said to be MCM. Outputs Y1, Y2 are produced by multiplying Serial[2] input data with two pair of coefficients.



Figure 3: Digit-Serial operation

Bit-serial operation and Digit-serial operation are the two cases for serial addition, which is shown in Figure 3, while Digit-Serial operation needs less number of delay elements such as flip-flop for addition or subtraction when compared to Bit-serial operation.

### **Proposed Work**

Common sub expression algorithm is used in design of MCM architecture with partial product sharing algorithm (Digit based recoding). In proposed method, the Graph

Based (GB) algorithm is used for this purpose. In proposed design 16 tap FIR filter have been designed. The sixteen constants are taken as coefficient pair. According to principle of MCM constant multiplication[9] is performed by number of shifting and adding operation.



Figure 4: Flow Chart For Proposed System Design



**Figure 5:** Shift-add's implementation of 29x and 43x (a) Without partial product sharing and with partial product sharing (b) Exact CSE algorithm (c) Exact GB algorithm

| Now:                                  |          | 23.1 |  |          |   |       |     |
|---------------------------------------|----------|------|--|----------|---|-------|-----|
| 1000 ns                               |          | 0    |  | 2        | D | 1     | 40  |
| ■ VFIR_existing_tb_v/filter_out[15:0] | 802      | 0    |  |          |   |       | 802 |
| 🞵 cik                                 | 0        |      |  |          |   |       |     |
| 🞵 rst                                 | 0        |      |  |          |   |       |     |
| 🎟 駴 data[7:0]                         | 8'h02    |      |  |          |   | 8'h02 |     |
| 🖽 🚮 filter_out[15:0]                  | 802      | 0    |  |          |   |       | 802 |
| 🖬 🚮 a10[15:0]                         | 58       |      |  | <u> </u> |   | 58    |     |
| 🖬 🚮 a11[15:0]                         | 64       |      |  |          |   | 64    |     |
| 🖬 🚮 a12[15:0]                         | 86       |      |  |          |   | 86    |     |
| 🖬 🚮 a13[15:0]                         | 128      |      |  |          |   | 128   |     |
| 🖬 🚮 a14[15:0]                         | 256      |      |  |          |   | 256   |     |
| 🖬 🚮 a15[15:0]                         | 8        |      |  |          |   |       |     |
| 🖽 🚮 a1[15:0]                          | 6        |      |  |          |   |       |     |
| 🖽 🚮 a2[15:0]                          | 8        |      |  |          |   | 8     |     |
| 🖽 👧 a3[15:0]                          | 10       |      |  |          |   | 10    |     |
| 🖽 🚮 a4[15:0]                          | 16       |      |  |          |   | 16    |     |
| + al a5[15:0]                         | -18<br>► |      |  |          |   | 18    |     |

## **Simulation Results**

## Figure 6: Output For An FIR Filter With Existing Algorithm

| Now:                           |                 |   |   |   |    |     | 36 | .9  |
|--------------------------------|-----------------|---|---|---|----|-----|----|-----|
| 1000 ns                        |                 | 0 |   | 2 | 10 |     |    | 40  |
| FIR_proposed_tb/filter_o[10:0] | 802             |   | 0 | X |    |     |    | 802 |
| <mark>ð</mark> ∏ clk           | 1               |   |   |   |    |     |    |     |
| on rst                         | 0               |   |   |   |    |     |    |     |
| 🖿 🚮 x[1:0]                     | 2               |   |   |   |    | 2   |    |     |
| 🖿 🚮 x15[8:0]                   | 30              |   |   |   |    | 30  |    |     |
| 🖽 🚮 x16[8:0]                   | 32              |   |   |   |    | 32  |    |     |
| 🖿 🚮 x18[8:0]                   | 36              |   |   |   |    | 36  |    |     |
| 🖿 🚮 x23[8:0]                   | 46              |   |   |   |    | 46  |    |     |
| 🖿 🚮 x32[8:0]                   | 64              |   |   |   |    | 64  |    |     |
| 🖿 🚮 x29[8:0]                   | 58              |   |   |   |    | 58  |    |     |
| 🖿 🚮 x43[8:0]                   | 86              |   |   |   |    | 86  |    |     |
| 🛨 🚮 x64[8:0]                   | 128             |   |   |   |    | 128 |    |     |
| ■ 🚮 x_4[8:0]                   | 8               |   |   |   |    | 8   |    |     |
| ▪ <mark>⊘4</mark> x1[8:0]      | 2               |   |   |   |    | 2   |    |     |
| 🖬 🚮 x3[8:0]                    | 6               |   |   |   |    | 6   |    |     |
| + M x4(8:01                    | <b>8</b><br>∢ ► | • |   |   |    | 8   |    |     |

Figure 7: Output For An FIR Filter With Proposed Algorithm

Simulation results have been observed by using Modelsim simulator and synthesis results observed by XILINX ISE tool.

### FIR\_EXISTING\_X Partition Summary

No partition information was found.

| Logic Utilization          | Used | Available | Utilization |  |
|----------------------------|------|-----------|-------------|--|
| Number of Slices           | 110  | 2448      | 4%          |  |
| Number of Slice Flip Flops | 105  | 4896      | 2%          |  |
| Number of 4 input LUTs     | 161  | 4896      | 3%          |  |
| Number of bonded IOBs      | 26   | 108       | 24%         |  |
| Number of MULT18×18SIOs    | 7    | 12        | 58%         |  |
| Number of GCLKs            | 1    | 24        | 4%          |  |

# Figure 8: Device Utilization Report of An Existing System

| FIR | PROPOSED | X Partition | Summary |
|-----|----------|-------------|---------|
|     |          |             |         |

No partition information was found.

| Device Utilization Summary (estimated values) |      |           |             |  |  |
|-----------------------------------------------|------|-----------|-------------|--|--|
| Logic Utilization                             | Used | Available | Utilization |  |  |
| Number of Slices                              | 47   | 2448      | 1%          |  |  |
| Number of Slice Flip Flops                    | 17   | 4896      | 0%          |  |  |
| Number of 4 input LUTs                        | 79   | 4896      | 1%          |  |  |
| Number of bonded IOBs                         | 14   | 108       | 12%         |  |  |
| Number of GCLKs                               | 1    | 24        | 4%          |  |  |

Figure 9: Device Utilization Report of An Proposed System

# **Comparion Results**

| Algorithm             | Area(slices) |
|-----------------------|--------------|
| Normal multiplication | 110          |
| CSE and GB algorithm  | 47           |



Graphical representation of area reduction

### Conclusion

The design of digit serial FIR filter was implemented with low complexity MCM architecture[10]. The simulation results have been observed by Modelsim simulator. Also synthesized by XILINX XST tool and implemented on SPARTAN 3E FPGA. Device utilization values of both algorithms are compared. Hence this MCM approach drastically reduces the system complexity and area. The future enhancement of this paper is to design MCM architecture with more coefficient pairs for FIR filter implementation.

### References

- [1] Kenny Johansson, Oscar Gustafsson, Andrew G. D., and Lars Wanhammar, "Algorithm to reduce the number of shifts and additions in multiplier blocks using serial arithmetic" IEEE MELECON, pp. 197-200, 2004.
- [2] Levent Aksoy, Cristiano Lazzari, Eduardo Costa, Paulo Flores, José Monteiro., "Efficient shift-adds design of digit-serial multiple constant multiplications" GLSVLSI'11,2011.
- [3] Levent Aksoy, Cristiano Lazzari Eduardo Costa, Paulo Flores, José Monteiro, "Design of digit-serial FIR filters: algorithms, architectures, and a CAD tool" IEEE Trans. on Very Large Scale Integration (VLSI) Systems, Vol. 21, No. 3, pp 498-511, 2013.
- [4] Levent Aksoy, Eduardo.D.Costa, Paulo Flores, José Monteiro., "Exact and approximate algorithms for the optimization of area and delay in multiple constant multiplications" IEEE Transactions on Computer-Aided Design of Integrated Circuits And Systems, Vol. 27, No. 6, pp. 1013-1026, 2008.
- [5] Miodrag potkonjak, Mani.B.S., Anantha.P.C., "Multiple constant multiplications efficient and versatile framework and algorithms for

exploring common subexpression elimination" IEEE Trans. Computer Aided-Design of Integrated Circuits and Systems, VOL.15,NO.2, pp 151-165, 1996.

- [6] Yuen H.A.H., Chi U.L., Hing-K.K., Ngai Wong, "Global optimization of common subexpressions for multiplierless synthesis of multiple constant multiplications" IEEE Explore, pp.119-124, 2008.
- [7] R. Hartley, "Subexpression sharing in filters using canonic signed digit multipliers," IEEE Trans. Circuits Syst. II, Exp. Briefs, vol. 43, no. 10,pp. 677–688, Oct. 1996.
- [8] I.-C. Park and H.-J. Kang, "Digital filter synthesis based on minimal signed digit representation," in Proc. DAC, pp. 468–473, 2001.
- [9] A. Dempster, M. Macleod, "Use of minimum-adder multiplier blocks in FIR digital filters," IEEE Trans. Circuits Syst. II, Exp. Briefs, vol. 42, no. 9, pp. 569–577, Sep. 1995.
- [10] Y. Voronenko, M. Püschel, "Multiplierless multiple constant multiplication," ACM Trans. Algor., vol. 3, no. 2, pp. 1–39, May, 2007.