This paper presents a configurable FFT architecture for the variable-length, multistreaming Wi-Max wireless standard. The architecture processes one stream of 2048-pt FFT, two streams of 1024-pt FFT, or four streams of 512-pt FFT. It consists of 11 SDF pipelined stages, and a radix-2 butterfly is computed in each stage. The sampling frequency of the system is varied in accordance with the FFT length. The word length and buffer length in each stage are configurable depending on the FFT length. A latch-free clock gating technique is used to reduce power consumption.
The architecture is synthesized for a Virtex-6 XCVLX760 FPGA. Experimental results show that the architecture achieves the throughput required by the Wi-Max standard and that the design offers additional features compared to previous approaches. The design uses 1% of the total available FPGA resources, and a maximum clock frequency of 313.67 MHz was achieved.
SDF: Single Delay Feedback
DIT: Decimation in Time
WiMax: Worldwide Interoperability for Microwave Access
OFDM: Orthogonal Frequency Division Multiplexing
MIMO: Multiple Input Multiple Output
DFT: Discrete Fourier Transform
FPGA: Field Programmable Gate Array
WLAN: Wireless Local Area Network
WPAN: Wireless Personal Area Network
WMAN: Wireless Metropolitan Area Network
ROM: Read Only Memory
DLL: Delay Locked Loop
In recent years, wireless systems have been developed to increase the transmission rate. Wi-Max is a wireless standard that combines MIMO and OFDM systems, and this combination lets it achieve higher data transmission rates. MIMO uses multiple spatial streams to increase performance.
The FFT is the signal processing algorithm at the heart of OFDM. To transmit multiple spatial streams, the FFT must process multiple data streams, and handling multiple data streams conventionally requires multiple FFT processors. Using multiple processors consumes more hardware resources and increases power consumption. In addition, the Wi-Max wireless standard transmits data in different defined channel bandwidths (5 MHz, 10 MHz, 20 MHz). This requires a variable FFT length, i.e. the FFT is scaled to the specified channel bandwidth to maintain a constant carrier spacing.
To fulfil these requirements of the standard, a reconfigurable architecture that supports both variable length and multistreaming at the same time is needed, which calls for research on pipelined FFT architectures. Hence, this paper proposes a reconfigurable FFT architecture that supports variable length and multistreaming simultaneously.
It consists of a modified SDF, radix-2 FFT architecture. The latch-free clock gating technique is used to clock the modules only when they are required, which reduces power consumption. For multistream processing, the coefficient storage is organized so as to reduce the number of memory accesses.
II PROPOSED ARCHITECTURE
Figure 1 shows the proposed architecture of the variable-length, multistream FFT. The architecture is a modified SDF pipelined architecture. The computation in each stage is performed by a radix-2 butterfly.
Fig 1: Proposed Architecture of FFT
DIT is the FFT algorithm used. The DIT algorithm has S stages, so the computed FFT has N = 2^S points. DIT is used because the initial stages of the pipelined architecture can then be shared between schemes with different FFT lengths and multiple streams. To extend the design into a multistream processing architecture, the interleaving of the input streams and the sampling frequency of the system are changed as listed in Table 1.
Table 1: Wi-Max channel bandwidths and FFTs
The FFT schemes for the input x(i) to the architecture are shown in Figure 2. The number of stages is log2 N, where N is the FFT length. For example, a 2048-pt FFT needs log2 2048 = 11 stages; hence the architecture consists of 11 stages.
Fig 2: FFT input schemes for the proposed architecture
The architecture can process one stream of 2048-pt FFT, two interleaved streams of 1024-pt FFT, or four interleaved streams of 512-pt FFT at 22.8 MHz; one stream of 1024-pt FFT or two streams of 512-pt FFT at 11.4 MHz; and a single stream of 512-pt FFT at 5.71 MHz. Clock gating is used to reduce power consumption. For example, a 1024-pt FFT needs log2 1024 = 10 stages, so ten stages are sufficient and the 11th stage is powered down. Because the DIT algorithm is used, the rotations are calculated at the input of the lower edge of the butterfly.
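The scheme list and the stage-gating rule described above can be tabulated with a short script. This is only an illustrative sketch: the names `SCHEMES`, `active_stages` and `gated_stages` are hypothetical, and the frequencies are copied from the paragraph above rather than derived from the standard.

```python
TOTAL_STAGES = 11  # log2(2048), the number of pipeline stages stated in the text

# (FFT length, number of interleaved streams) -> sampling frequency in MHz,
# transcribed from the paragraph above.
SCHEMES = {
    (2048, 1): 22.8,
    (1024, 2): 22.8,
    (512, 4): 22.8,
    (1024, 1): 11.4,
    (512, 2): 11.4,
    (512, 1): 5.71,
}


def active_stages(fft_length):
    """Number of radix-2 stages needed for an N-point FFT, i.e. log2(N)."""
    stages = fft_length.bit_length() - 1
    if fft_length < 2 or 2 ** stages != fft_length:
        raise ValueError("FFT length must be a power of two")
    return stages


def gated_stages(fft_length):
    """Stage numbers that are not needed and can be clock-gated."""
    return list(range(active_stages(fft_length) + 1, TOTAL_STAGES + 1))


for (length, streams), f_mhz in SCHEMES.items():
    print(f"{streams} x {length}-pt at {f_mhz} MHz: "
          f"{active_stages(length)} active stages, gated {gated_stages(length)}")
```

For the 1024-pt case the script reports ten active stages with stage 11 gated, which matches the example in the text.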
III HARDWARE IMPLEMENTATION
A single stage of the proposed pipeline architecture is shown in Figure 3. Each pipelined stage consists of a radix-2 butterfly element, a complex multiplier, a complex coefficient memory, and data management units. This results in a total of 11 butterfly elements and ten complex multipliers for the full design. Additionally, the design contains a centralized control unit to synchronize the data flow.
Fig 3: A modified SDF stage of the proposed architecture
The butterfly unit consists of a complex adder and a complex subtractor, where a complex value is the combination of real and imaginary parts. The butterfly unit produces the sum and difference outputs, each with real and imaginary parts. The butterfly unit output is defined by the equations below.
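For reference, a radix-2 butterfly built from one complex adder and one complex subtractor can be written in the standard form below. The symbols are assumptions introduced here for illustration: $a$ is the upper input and $b$ is the lower input after rotation by the twiddle factor, as in a DIT arrangement.

$$X = a + b, \qquad Y = a - b$$

$$X_r = a_r + b_r, \quad X_i = a_i + b_i, \qquad Y_r = a_r - b_r, \quad Y_i = a_i - b_i$$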
The basic entity of the implemented butterfly unit is illustrated in Figure 4. The above process is synchronized on the rising edge of the clock.
Fig 4: Butterfly entity
The output of the complex multiplier is defined by the equation below. A direct implementation of this equation requires four multipliers. By rearranging the two equations for the real and imaginary parts, one multiplier can be replaced with three adders.
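One well-known rearrangement that trades one multiplier for three extra adders is sketched below. Whether it is the exact form used in this design is an assumption, and the function names are hypothetical. The inputs are (a + jb) and (c + jd), given as integers to mimic fixed-point hardware.

```python
def complex_mult_direct(a, b, c, d):
    """(a + jb)(c + jd) with four multiplications and two additions."""
    return a * c - b * d, a * d + b * c


def complex_mult_three(a, b, c, d):
    """Same product with three multiplications and five additions."""
    k1 = c * (a + b)
    k2 = a * (d - c)
    k3 = b * (c + d)
    real = k1 - k3  # ac + bc - bc - bd = ac - bd
    imag = k1 + k2  # ac + bc + ad - ac = ad + bc
    return real, imag


print(complex_mult_direct(3, 4, 5, -2))
print(complex_mult_three(3, 4, 5, -2))
```

Both calls give the same real and imaginary parts. The three-multiplier form uses five additions instead of two, which accounts for the three additional adders.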
Fig 5: Entity of the buffer
Figure 5 shows the buffer used in this design. The buffer is built from shift registers, with two separate registers to store the real and imaginary values. The buffer depth differs between the stages of the pipelined design. The buffer input is the same for all stages, but the buffer output differs for each stage: it is selected from one of three locations, N/2^s, N/(2*2^s) or N/(4*2^s), for 2048-pt, 1024-pt or 512-pt FFTs respectively.
Table 2: Buffer length for the pipelined stages.
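The three output locations can be evaluated per stage with a few lines of code. This sketch assumes N = 2048 in all three expressions and ignores any extra depth that stream interleaving may need. The function name `buffer_tap` is hypothetical.

```python
MAX_FFT_LENGTH = 2048  # N in the expressions above (assumed)
TOTAL_STAGES = 11


def buffer_tap(fft_length, stage):
    """Buffer output location for a stage: N/2^s, N/(2*2^s) or N/(4*2^s).

    Returns 0 when the stage is not needed for the given FFT length.
    """
    scale = MAX_FFT_LENGTH // fft_length  # 1, 2 or 4
    return MAX_FFT_LENGTH // (scale * 2 ** stage)


for s in range(1, TOTAL_STAGES + 1):
    print(s, buffer_tap(2048, s), buffer_tap(1024, s), buffer_tap(512, s))
```

Under these assumptions the tap runs from 1024 words at stage 1 down to one word at stage 11 for the 2048-pt case, which agrees with the stated range of one to 1024 words.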
Table 2 lists the buffer length for the various stages of the proposed architecture. The depth of the buffer can be varied to store from a minimum of one word up to 1024 words. The input data is written to the first register and shifted to the consecutive registers. A ROM is used as the coefficient memory; it is a single-port memory with a clock enable. The size of the ROM in each stage is 2^(s-1), where s is the stage number. Two ROMs are used to store the real and imaginary parts separately. On the positive clock edge the ROM outputs a word, but the output is not registered. The twiddle factors to be stored were generated using a MATLAB function. The data word length and the twiddle factor word length are the same. An up-counter is used to address the ROM; its output width depends on the stage where it is used, and its output is reset when the counter reaches its maximum value.
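The ROM contents for one stage could be generated along the following lines. Python stands in here for the MATLAB function mentioned above. The 16-bit signed scaling, the twiddle convention exp(-j*2*pi*k/2^s) and the name `twiddle_rom` are assumptions made for illustration.

```python
import cmath
import math

WORD_LENGTH = 16  # assumed equal to the data word length


def twiddle_rom(stage, word_length=WORD_LENGTH):
    """Return (real_rom, imag_rom), each holding 2^(stage-1) signed integers."""
    size = 2 ** (stage - 1)
    scale = 2 ** (word_length - 1) - 1  # largest positive signed value
    real_rom, imag_rom = [], []
    for k in range(size):
        w = cmath.exp(-2j * math.pi * k / (2 * size))
        real_rom.append(round(w.real * scale))
        imag_rom.append(round(w.imag * scale))
    return real_rom, imag_rom


real_rom, imag_rom = twiddle_rom(3)
print(real_rom)
print(imag_rom)
```

Stage 1 then holds a single trivial coefficient, and stage 11 holds 1024 coefficients in each of the two ROMs.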
A centralized control unit is used to control the counters and to synchronize and control the data flow path. The control unit consists of an 11-bit up-counter. The counter is enabled when the input arrives, and it is reset to its initial value when its maximum value is reached. In each stage, a multiplexer switches the input every N/2^s clock cycles; hence each bit of the counter output can be connected directly to the select pin of the multiplexer in the corresponding stage. The control unit also provides the clock gating conditions that power down the stages. The FFT length and the number of streams determine the configuration of the counters in the control unit. The control signals associated with the output switching multiplexer are shown in Table 3.
Table 3: Control signals associated with the output switching multiplexer
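The link between the counter bits and the multiplexer select pins can be modelled in software. The sketch below assumes the 2048-pt configuration and assumes that stage s taps counter bit 11 - s; the bit assignment actually used is the one given in Table 3. The name `mux_select` is hypothetical.

```python
COUNTER_BITS = 11
N = 2 ** COUNTER_BITS  # 2048


def mux_select(count, stage):
    """Select signal of a stage, taken from a single bit of the up-counter."""
    return (count >> (COUNTER_BITS - stage)) & 1


# Bit (11 - s) toggles every 2^(11 - s) = N / 2^s clock cycles.
for s in (1, 2, 11):
    period = N // 2 ** s
    trace = [mux_select(c, s) for c in range(4 * period)]
    print(f"stage {s}: switches every {period} cycles, first samples {trace[:4]}")
```

No comparator or decoder is needed for the select signals in this model, because each stage reads one counter bit directly.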
IV VARIABLE LENGTH AND MULTI STREAMING
The proposed architecture supports both variable length and parallel processing. Figure 6 presents a signal flow graph for a scheme with one stream of 16-pt FFT and two streams of 8-pt FFT, computed with DIT by the proposed architecture. The 16-pt and two 8-pt FFTs are chosen for simplicity of explanation. Four stages of the proposed pipelined architecture are used. In the signal flow graph, x[n]_16 represents the 16-pt input data sequence and X[k]_16 represents the corresponding FFT output samples. x[n]_8 represents the two streams of 8-pt input data, and X[k]_8 is the FFT output of the two streams. To compute the 16-pt FFT, the complete signal flow graph is used by joining stage 4 to stage 3.
In the case of two 8-pt FFTs, the inputs indexed 0, 1, 2, 3, 4, 5, 6, 7 are the data samples of the first 8-pt FFT stream, and the inputs indexed 0', 1', 2', 3', 4', 5', 6', 7' are the data samples of the second stream. The input streams are interleaved and given as input to the system. In the signal flow graph, the solid lines indicate the data flow of the first stream and the dashed lines indicate the data flow of the second 8-pt FFT. As three stages are required to calculate an 8-pt FFT, the output X[k]_8 is taken from the third stage. The outputs are also interleaved. It can be observed from the flow graph that the twiddle factors in a stage are the same irrespective of the FFT length or the number of streams.
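The stage sharing described above can be checked with a small software model of the DIT recursion. This is a behavioural sketch only and does not reproduce the pipeline's sample ordering or timing. The function names and the test data are hypothetical.

```python
import cmath


def dit_fft(x):
    """Recursive radix-2 decimation-in-time FFT (length must be a power of two)."""
    n = len(x)
    if n == 1:
        return list(x)
    even = dit_fft(x[0::2])
    odd = dit_fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]  # rotation on the lower edge
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def dft(x):
    """Direct DFT, used only as a reference."""
    n = len(x)
    return [sum(x[m] * cmath.exp(-2j * cmath.pi * k * m / n) for m in range(n))
            for k in range(n)]


stream_a = [complex(i, 0) for i in range(8)]
stream_b = [complex(0, 7 - i) for i in range(8)]
interleaved = [v for pair in zip(stream_a, stream_b) for v in pair]

# Three stages on the interleaved input yield the two 8-pt FFTs.
fft_a = dit_fft(interleaved[0::2])
fft_b = dit_fft(interleaved[1::2])

# Joining a fourth stage turns the same intermediate results into a 16-pt FFT.
fft_16 = [0j] * 16
for k in range(8):
    t = cmath.exp(-2j * cmath.pi * k / 16) * fft_b[k]
    fft_16[k] = fft_a[k] + t
    fft_16[k + 8] = fft_a[k] - t

print(max(abs(p - q) for p, q in zip(fft_16, dft(interleaved))))
```

The printed value is the largest deviation from the direct 16-pt DFT and should be at floating-point rounding level. The first three stages are identical in both modes, so only the final stage decides whether the result is one 16-pt FFT or two 8-pt FFTs.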
Fig 6: Variable length DIT FFT signal flow graph.
V AREA AND PERFORMANCE RESULT
To analyse the dynamic performance of the implemented FFT architecture, its VHDL implementation was synthesised using the Xilinx ISE tool. A Virtex-6 low-power FPGA was the target device. The Virtex-6 FPGA was the latest on the market at the time of this work and is specialised for low-power applications. The device is fabricated using a 40 nm CMOS process technology. The architecture was synthesised with the following specifications and settings in the tool.
The FPGA clock is 25 MHz.
The input and output word length of the samples is 16 bits.
The synthesis tool can optimise the architecture either for speed or for area to minimise hardware resources.
Table 4 summarises the device utilization of the proposed architecture after synthesis, out of the 118,560 slices available on the Xilinx Virtex-6 FPGA. Timing analysis using the timing analyzer in the Xilinx ISE tool indicates a maximum clock frequency of 313.67 MHz.
Table 4: FPGA device utilization
VI POWER CONSUMPTION
Power estimation is an important aspect of system design. The power consumed by a circuit largely affects its dynamic performance. The total power consumed by a design implemented in the FPGA is the sum of two quantities, namely static power and dynamic power. The transistor leakage current of the device results in static power consumption, whereas the dynamic power of a circuit is associated with the design activity, the switching input/output nodes and the clock frequencies used in the FPGA. The dynamic power of a circuit is calculated using the equation below.
Dynamic power = $0.5\,\alpha\, f_{clk}\, C_L\, V_{DD}^{2}$
In the above equation, $\alpha$ is the switching activity, $C_L$ is the node capacitance, $V_{DD}$ is the supply voltage and $f_{clk}$ is the clock frequency. The power consumption for the various FFT schemes is shown in Fig 7, and the comparison of variable-length and multistreaming designs is given in Table 5.
Fig 7: Power consumption for various FFT schemes
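The dependence of dynamic power on switching activity, clock frequency, node capacitance and supply voltage can be explored numerically. The component values in the sketch below are hypothetical placeholders chosen only to show the scaling. They are not measurements from this design.

```python
def dynamic_power(alpha, f_clk_hz, c_load_farads, v_dd_volts):
    """Dynamic power in watts: 0.5 * alpha * f_clk * C_L * V_DD^2."""
    return 0.5 * alpha * f_clk_hz * c_load_farads * v_dd_volts ** 2


# Hypothetical values for illustration only.
ALPHA = 0.2      # switching activity
C_LOAD = 1e-9    # farads
V_DD = 1.0       # volts

for f_mhz in (22.8, 11.4, 5.71):
    p = dynamic_power(ALPHA, f_mhz * 1e6, C_LOAD, V_DD)
    print(f"{f_mhz} MHz -> {p * 1e3:.3f} mW")

# A clock-gated module has no switching activity, so its dynamic power is zero.
print(dynamic_power(0.0, 22.8e6, C_LOAD, V_DD))
```

The expression is linear in clock frequency, so halving the sampling frequency halves the dynamic power in this model. Gating the clock of an unused stage removes its contribution entirely.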
Table 5: Comparison of variable length and the multistreaming
• A reconfigurable FFT architecture covering all the cases of the Wi-Max wireless OFDM standard was proposed, designed and verified in this thesis work.
• The FFT requirements were tabulated after a brief study of the OFDM wireless standards. The FFT architecture was designed based on the Wi-Max FFT requirements table.
• The Decimation-in-Time FFT algorithm was used. The architecture was designed with eleven modified Single Delay Feedback stages. Each stage calculates a radix-2 FFT.
• The FFT architecture is reconfigurable for variable length and multiple streams. The architecture processes a single stream of 2048-pt FFT, up to two streams of 1024-pt FFT or up to four streams of 512-pt FFT.
• The architecture processes continuous streams of data.
• The architecture is power efficient. The clock gating technique was used to reduce power consumption by powering down individual modules (butterflies and complex multipliers) or a complete stage when not in use.
• As far as is known, the architecture proposed in this thesis work is the first architecture to cover all the cases of the Wi-Max wireless standard.