Floating-Point FFT Processor Radix 2 Core


White Paper
Floating-Point FFT Processor
(IEEE 754 Single Precision) Radix 2 Core
WP-FFTRDX2-1.0
Introduction
The floating-point fast Fourier transform (FFT) processor calculates FFTs with IEEE 754 single precision (1 sign bit, 8 exponent bits, and 23 mantissa bits) accuracy. The processor uses extended precision arithmetic (1 sign bit, 8 exponent bits, and 32 mantissa bits) internally for higher accuracy. The floating-point FFT processor radix 2 core can implement an FFT of any power-of-2 length. It is optimized for the Stratix™ device family and takes advantage of the Stratix DSP blocks and M-RAM blocks.
You can parameterize the number of points at compile time. There are two versions of the core in the package: one is optimized for internal device memory and the other for memory external to the device. Top-level reference designs, with open source code, allow you to integrate the core into your system.
The core is designed for simulation with the ModelSim simulator (version 5.6a) and synthesis using the Quartus® II software. Testbench generation and analysis utilities (MATLAB-based) are included for verifying the core in both environments.
Parameters and Ports
Altera® provides two top-level system reference designs: FFTTOPA2.VHD and FFTTOPB2.VHD. The parameters can be set via the PARMS.VHD file. The FFT processor cores are FFT2A and FFT2B, respectively.
Table 1 shows the parameters.
Table 2 shows the input signals.
Table 1. Parameters
Parameter Description
POINTS The number of points in the transform: any power of 2, value 16 or higher.
DATAINDELAY Set to 0 for FFTTOPA2. In FFTTOPB2, any value 2 or greater; defines the latency between the core output and the data memory input.
DATAOUTDELAY Set to 0 for FFTTOPA2. In FFTTOPB2, any value 2 or greater; defines the latency between the data memory output and the core input.
TWIDINOUTDELAY Set to 0 for FFTTOPA2. In FFTTOPB2, any value 2 or greater; defines the latency between the FFT control output and the FFT twiddle data input.
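The rules in Table 1 can be checked before compiling. The following Python sketch is only an illustration of those rules; the function names and the idea of a pre-compile check are assumptions and are not part of the core package.

```python
def is_power_of_two(n: int) -> bool:
    """Return True when n is a positive power of 2."""
    return n > 0 and (n & (n - 1)) == 0


def check_parameters(design: str, points: int, datain: int,
                     dataout: int, twid: int) -> list:
    """Return a list of rule violations (empty when the set is valid).

    `design` is "FFTTOPA2" or "FFTTOPB2"; the delay arguments stand for
    DATAINDELAY, DATAOUTDELAY, and TWIDINOUTDELAY (hypothetical helper).
    """
    problems = []
    if not (is_power_of_two(points) and points >= 16):
        problems.append("POINTS must be a power of 2, 16 or higher")
    delays = {"DATAINDELAY": datain, "DATAOUTDELAY": dataout,
              "TWIDINOUTDELAY": twid}
    for name, value in delays.items():
        if design == "FFTTOPA2" and value != 0:
            problems.append(name + " must be 0 for FFTTOPA2")
        elif design == "FFTTOPB2" and value < 2:
            problems.append(name + " must be 2 or greater for FFTTOPB2")
    return problems
```

For example, check_parameters("FFTTOPB2", 256, 2, 3, 2) returns an empty list, while a POINTS value of 1000 is reported as invalid.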
Table 2. Input Signals
Signal Description
SYSCLK SYSCLK is the system clock. All memory access and processing are at the system clock rate.
RESET The RESET input is active high, and prepares the FFT processor for another FFT operation.
GO The GO input enables FFT processing when high. It must be held high during calculation of the FFT.
REALIN[32:1], IMAGIN[32:1] Real and imaginary data input ports, in IEEE 754 single precision floating-point format. The sign bit is bit 32, the exponent occupies bits 24 to 31, and the mantissa occupies the 23 LSBs.
TWREAL[32:1], TWIMAG[32:1] Real and imaginary twiddle input ports, in IEEE 754 single precision floating-point format. The sign bit is bit 32, the exponent occupies bits 24 to 31, and the mantissa occupies the 23 LSBs.
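The bit layout of the data and twiddle ports can be illustrated on the host side. The Python sketch below splits a value into the three IEEE 754 single precision fields and rebuilds it; the function names are hypothetical, and the 1-based port bit numbers in the comments follow Table 2.

```python
import struct


def float_to_fields(value: float):
    """Split a value into (sign, exponent, mantissa) of its float32 encoding."""
    (word,) = struct.unpack(">I", struct.pack(">f", value))
    sign = (word >> 31) & 0x1        # port bit 32
    exponent = (word >> 23) & 0xFF   # port bits 31 down to 24
    mantissa = word & 0x7FFFFF       # port bits 23 down to 1
    return sign, exponent, mantissa


def fields_to_float(sign: int, exponent: int, mantissa: int) -> float:
    """Rebuild the float32 value from its three fields."""
    word = (sign << 31) | (exponent << 23) | mantissa
    (value,) = struct.unpack(">f", struct.pack(">I", word))
    return value
```

Values that are not exactly representable in single precision are rounded by struct.pack before the fields are extracted.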
Table 3 shows the output signals.
Architecture
The FFT processor core consists of a processing core for a radix 2 FFT. It does not include memory for data, intermediate storage, or twiddles, or any interfaces required to load and unload data memory. Altera provides two reference designs, FFTTOPA2.VHD and FFTTOPB2.VHD, as examples of how to implement the memory interfaces. A DOS utility, TWFP1.EXE, is provided to generate memory initialization files (Intel HEX format) for the on-chip ROM containing the twiddle factors.
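As an illustration of what a twiddle ROM holds, the Python sketch below computes the twiddle factors of a radix 2 FFT and their single precision encodings. It is not a reimplementation of TWFP1.EXE: the forward-transform sign convention exp(-j2πk/N), the choice of N/2 factors, and the function names are assumptions, and the sketch does not produce Intel HEX records or reproduce the layout of the generated files.

```python
import math
import struct


def float32_hex(value: float) -> str:
    """Return the 8-digit hex word of value rounded to IEEE 754 single precision."""
    (word,) = struct.unpack(">I", struct.pack(">f", value))
    return "%08X" % word


def radix2_twiddles(points: int):
    """Return (real, imag) of exp(-2j*pi*k/points) for k = 0 .. points/2 - 1."""
    factors = []
    for k in range(points // 2):
        angle = 2.0 * math.pi * k / points
        # 0.0 - sin(...) avoids a negative zero for k = 0.
        factors.append((math.cos(angle), 0.0 - math.sin(angle)))
    return factors


def twiddle_words(points: int):
    """Return (real_hex, imag_hex) pairs, one per twiddle factor."""
    return [(float32_hex(re), float32_hex(im))
            for re, im in radix2_twiddles(points)]
```

For a 16-point transform this yields eight factors, starting with 1.0 + j0.0.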
The data RAM and twiddle ROM memories are synchronous memories, and the data RAM is also dual-port (one read port and one write port). For a ROM, an external asynchronous memory can be made to appear synchronous by adding another stage of delay to the input and output of the ROM.
For the data RAM, this is more difficult, as the WE pulse generally requires a setup and hold time for both the data and address inputs to the RAM. However, there are several commercially available synchronous dual-port RAMs that can be used.
Any delays added to external memories, whether to make asynchronous memories appear synchronous, or to improve fitting or performance of the system, can be compensated for by the FFT core through the DATAINDELAY, DATAOUTDELAY, and TWIDINOUTDELAY parameters. In the FFTTOPB2.VHD reference design, pipelining only occurs between the twiddle ROM and the FFT inputs, which keeps the reference design simple. In an actual design, the pipelining could be split in any ratio before and after the ROM, as long as TWIDINOUTDELAY is set to the sum of both latencies.
The additional pipeline delays can also be used to convert the FFT from using dual-port memories to single-port memories. In this case, two separate banks of memories are required.
The FFT uses a decimation in frequency (DIF) algorithm; hence the input data samples are loaded into the memory in natural order, and the transform is stored in bit or digit reversed order. The reference designs include functions to read out the transformed data in natural order.
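The relationship between natural-order input and bit-reversed output can be seen in a small software model. The Python sketch below is a generic textbook radix 2 DIF FFT in double precision; it does not model the core's extended precision arithmetic, its butterfly scheduling, or its memory addressing, and all function names are hypothetical.

```python
import cmath


def bit_reverse(index: int, bits: int) -> int:
    """Reverse the lowest `bits` bits of index."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)
        index >>= 1
    return result


def fft_dif_radix2(samples):
    """Radix 2 DIF FFT; input in natural order, output in bit-reversed order."""
    data = list(samples)
    n = len(data)  # assumed to be a power of 2
    span = n // 2
    while span >= 1:
        for start in range(0, n, 2 * span):
            for k in range(span):
                w = cmath.exp(-2j * cmath.pi * k / (2 * span))
                a = data[start + k]
                b = data[start + k + span]
                data[start + k] = a + b               # sum path
                data[start + k + span] = (a - b) * w  # difference path, then twiddle
        span //= 2
    return data


def to_natural_order(data):
    """Reorder a bit-reversed spectrum into natural order."""
    bits = len(data).bit_length() - 1
    return [data[bit_reverse(i, bits)] for i in range(len(data))]
```

Calling to_natural_order(fft_dif_radix2(x)) gives the same result as a direct DFT of x, which mirrors the natural-order readout that the reference designs provide.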
Compiling the FFT Processor
This section details the actions to compile the FFT processor.
Setting Parameters
To set the parameters, perform the following steps:
1. Specify the number of points in the PARMS.VHD file. If you are using the FFTTOPA2.VHD reference design, set DATAINDELAY, DATAOUTDELAY, and TWIDINOUTDELAY to 0.
or
If you are using the FFTTOPB2.VHD reference design, set each delay parameter to a value of 2 or greater (see Table 1).
Table 3. Output Signals
Signal Description
READADD[addwidth:1] Data read address output.
WRITEADD[addwidth:1] Data write address output.
TWIDADD[addwidth:1] Twiddle read address output.
REALOUT[32:1], IMAGOUT[32:1] Real and imaginary data output ports, in IEEE 754 single precision floating-point format. The sign bit is bit 32, the exponent occupies bits 24 to 31, and the mantissa occupies the 23 least significant bits (LSBs).
WRITE This signal is high when valid data exists on the WRITEADD, REALOUT, and IMAGOUT ports.
DONE When DONE is high, the FFT processor has completed processing, and data can be read out of the FFT, after which a new data set can be loaded in.
2. Generate the HEX files to initialize the twiddle memories, using the TWFP1.EXE DOS utility. Call the utility by typing the following command:
TWFP1 Points 2
The 2 optimizes the twiddle memories for radix 2 operation.
Three files are generated: WREAL.HEX, WIMAG.HEX, and WQ.HEX. The WREAL.HEX and WIMAG.HEX files are flat files containing every twiddle factor required by a radix 2 FFT, and are used by the FFTTOPB2.VHD reference design. The WQ.HEX file is an area-optimized file that is used by the FFTTOPA2.VHD reference design.
Memory Requirements
The core has the following memory requirements:
■64 × POINTS data RAM bits
■8 × POINTS twiddle ROM bits
For example, a 1K FFT requires 72 Kbits of memory.
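The 1K figure follows directly from the two formulas: 64 × 1024 + 8 × 1024 = 73,728 bits, or 72 Kbits. A minimal Python sketch of the same arithmetic for other transform sizes (the function name is hypothetical):

```python
def memory_bits(points: int):
    """Return (data RAM bits, twiddle ROM bits, total bits) for a transform size."""
    data_ram = 64 * points     # 64 x POINTS data RAM bits
    twiddle_rom = 8 * points   # 8 x POINTS twiddle ROM bits
    return data_ram, twiddle_rom, data_ram + twiddle_rom
```

The 64 bits per point are consistent with one 32-bit real word and one 32-bit imaginary word per sample.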
Compiling with ModelSim
The core has been verified with ModelSim 5.6a. The following libraries from the EDA/SIM_LIB directory in the Quartus II software are required: 220PACK.VHD, 220MODEL.VHD, and ALTERA_MF.VHD (or ALTERA_MF_93.VHD). The core can be compiled with VHDL 87 or VHDL 93 language support, except for the testbenches, which require VHDL 93 support.
The hierarchy (from bottom up as required by ModelSim) is:
■PARMSUB.VHD
■PARMS.VHD
■LS1.VHD
■RS1.VHD
■CLZ1.VHD
■SELONE.VHD
■FPM1.VHD
■ALU1.VHD
■DFT2.VHD
■BFLY2.VHD
■CTL2.VHD
■FFT2A.VHD (or FFT2B.VHD)
■TWMEM2.VHD (or TWEXT2.VHD)
■FFTTOPA2.VHD (or FFTTOPB2.VHD)
The testbench created by the included MATLAB utilities is TB_FFTFP.VHD.
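The bottom-up order above can be turned into a compile script. The Python sketch below only builds the ordered file list and the matching ModelSim vcom command lines; it assumes that the first-named files (FFT2A, TWMEM2, FFTTOPA2) belong to one variant and the alternatives to the other, that a working library already exists, and that the required Altera libraries have been compiled. The function names are hypothetical.

```python
# Files shared by both reference designs, in bottom-up order.
COMMON_FILES = [
    "PARMSUB.VHD", "PARMS.VHD", "LS1.VHD", "RS1.VHD", "CLZ1.VHD",
    "SELONE.VHD", "FPM1.VHD", "ALU1.VHD", "DFT2.VHD", "BFLY2.VHD",
    "CTL2.VHD",
]

# Variant-specific files: "A" for FFTTOPA2, "B" for FFTTOPB2 (assumed pairing).
VARIANT_FILES = {
    "A": ["FFT2A.VHD", "TWMEM2.VHD", "FFTTOPA2.VHD"],
    "B": ["FFT2B.VHD", "TWEXT2.VHD", "FFTTOPB2.VHD"],
}


def compile_order(variant: str):
    """Return the source files in bottom-up compile order."""
    return COMMON_FILES + VARIANT_FILES[variant]


def vcom_commands(variant: str):
    """Return one vcom command line per file, using VHDL 93 mode."""
    return ["vcom -93 " + name for name in compile_order(variant)]
```

The testbench TB_FFTFP.VHD would be compiled last, after the top-level file.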
Synthesizing with the Quartus II Software
The core has been verified with Quartus II integrated synthesis. The core achieves the following example push-button result using Quartus II version 2.1 for POINTS = 1024 in a Stratix EP1S10F484C5 device:
■2704 LCs
■2 DSP blocks
■72 Kbits memory
■166 MHz
The hierarchy (from top down as required by the Quartus II software) is:
■PARMSUB.VHD
■PARMS.VHD
■FFTTOPA2.VHD (or FFTTOPB2.VHD)
■TWMEM2.VHD (or TWEXT2.VHD)
■FFT2A.VHD (or FFT2B.VHD)
■CTL2.VHD
■BFLY2.VHD
■DFT2.VHD
■ALU1.VHD
■FPM1.VHD
■SELONE.VHD
■CLZ1.VHD
■RS1.VHD, LS1.VHD
Reference Designs
Altera provides two reference designs, which you may use as given or modify to meet your requirements.
FFTTOPA2
FFTTOPA2 is a reference design that uses on-board memory for both the data RAM and twiddle ROM. It uses an FFT processor, FFT2A.VHD, as the core of the design, and the generic Stratix memory, altsyncram, to implement both the data RAM and twiddle ROM.
FFTTOPA2 includes ports to write data into the FFT system, and ports to read data out of the FFT system.
The actions of the ports may also be seen from the testcase generated by the MATLAB utilities FFTFPTB1.M (for ModelSim) or FFTFPVEC1.M (for Quartus II). For more information, see “Testing”.
Table 4 shows the FFTTOPA2 input signals.
Table 4. FFTTOPA2 Input Signals
Signal Description
SYSCLK Main system clock.
RESET Resets FFT processor, active high.
GO Enables FFT processor, active high.
LOAD When high, enables write into data memory. When low, data and address inputs are ignored.
UNLOAD When high, enables read out of data memory. When low, address inputs are ignored.
LOADADD Address used for writing data into data memory bank.
UNLOADADD Address used for reading data out of data memory bank.
REALSIGNIN Sign bit for real data input.
REALEXPIN Exponent for real data input.
REALMANIN Mantissa for real data input.
IMAGSIGNIN Sign bit for imaginary data input.
IMAGEXPIN Exponent for imaginary data input.
IMAGMANIN Mantissa for imaginary data input.
Table 5 shows the FFTTOPA2 output signals.
Table 5. FFTTOPA2 Output Signals
Signal Description
REALSIGNOUT Sign bit for real data output.
REALEXPOUT Exponent for real data output.
REALMANOUT Mantissa for real data output.
IMAGSIGNOUT Sign bit for imaginary data output.
IMAGEXPOUT Exponent for imaginary data output.
IMAGMANOUT Mantissa for imaginary data output.
DONE When high, FFT processing is complete and data may be read out of the core.
FFTTOPB2
FFTTOPB2 has an almost identical structure and ports to FFTTOPA2. FFTTOPB2 is designed to work with memory outside the device.
The only difference in the system design between the two examples is the additional level of pipelining in and out of all of the memory components in FFTTOPB2. This additional pipelining allows better timing between the FFT processor inside the Altera device and the memory components external to the device. The FFT processor automatically compensates for the additional latency, which is controlled using the DATAINDELAY, DATAOUTDELAY, and TWIDINOUTDELAY parameters.
For more information on the parameters, see “Parameters and Ports”.
Testing
This section details the actions to test the FFT processor.
Example Test Flow
Note: The following example does not assume any directory structure. You may have to copy files between directories, i.e., the MATLAB work or mapped directories, the ModelSim work or mapped directories, and the Quartus II work or mapped directories.
There are four MATLAB utilities for testing the core: two for use with the ModelSim simulator, and two for use with the Quartus II software.
ModelSim Testing
FFTFPTB1.M is a MATLAB utility that creates a VHDL testbench for both reference designs. The format of the utility call is:
FFTFPTB1 (input_vector, radix, datadelay)
where the datadelay parameter is the sum of DATAINDELAY and DATAOUTDELAY.
To create an input vector in MATLAB with completely random data, perform the following example (for a 256-point FFT):
1. Enter the following values at the MATLAB command prompt:
rr = rand(1,256); % create a random vector (real parts)
ss = rand(1,256); % create a random vector (imaginary parts)
vec = rr + 1i * ss; % combine into a complex input vector
