Saturday, September 3, 2011

FFT Core Settings vs. Resources and Performance

Core Generator can be used to quickly find out the resource estimate, latency, and maximum throughput of the FFT core based on the current configuration of the core. This makes it very easy to do trade-off analyses between different FFT architectures, bit widths, output orders, etc.

The Implementation Tab displays the resource estimates as DSP48 and 18Kb BRAM counts (note that for device families with 36Kb BRAMs, each BRAM primitive can be used as two 18Kb BRAMs):

The Latency Tab displays the number of cycles required and the latency for the current transform length:

The numbers on the “Implementation” and “Latency” tabs are updated as soon as settings are changed on the configuration GUI. 

The table below shows the resource estimates and latency numbers for a 1024-point FFT with a 250 MHz clock, 16-bit inputs and outputs, and different architectures and output orders:
As shown in the table above, different settings, especially different architectures, have a big impact on hardware resources and latency. One nice feature is the “Automatically Select” architecture option, which makes the tool choose the smallest implementation that meets the specified “Target Data Throughput” and “Target Clock Frequency”. The selected architecture and its resource estimate and latency are displayed on the “Implementation” and “Latency” tabs (see the snapshot below).
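The idea behind automatic selection can be sketched as follows. This is only an illustration of the concept, not Core Generator's actual algorithm, and the cycle counts below are rough assumptions for a 1024-point transform, not vendor figures:

```python
# Illustrative sketch (NOT the Core Generator algorithm): pick the smallest
# FFT architecture that still meets a throughput target at a given clock.
# Architectures are ordered from smallest to largest; the cycles-per-transform
# figures are rough assumptions for N = 1024, not vendor data.

N = 1024
ARCHITECTURES = [
    # (name, assumed cycles per transform)
    ("Radix-2 Lite Burst",  3 * N * 10),  # roughly 3*N*log2(N)
    ("Radix-2 Burst",       N * 10),      # roughly N*log2(N)
    ("Radix-4 Burst",       N * 5),       # roughly N*log4(N) stages
    ("Pipelined Streaming", N),           # one sample per clock, back to back
]

def pick_architecture(target_throughput_msps, clock_mhz, n=N):
    """Return the smallest architecture meeting the throughput target."""
    transforms_needed = target_throughput_msps * 1e6 / n
    for name, cycles in ARCHITECTURES:
        transforms_achieved = clock_mhz * 1e6 / cycles
        if transforms_achieved >= transforms_needed:
            return name
    raise ValueError("No architecture meets the target at this clock rate")
```

For example, with a 250 MHz clock, asking for the full 250 Msps forces the streaming architecture, while a modest 10 Msps target lets a much smaller burst implementation be chosen.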


  1. I've tried the FFT with 32k & 64k points. It takes a lot of time to simulate. With ISim I have to wait for an eternity. ModelSim is better but still takes a long time. Do you suggest any alternatives?

  2. Are you running the full version of ISim or ModelSim, or the lite version included in ISE WebPack?

  3. For a 32k- or 64k-point FFT, it does take a long time to simulate. I suggest you use the FFT C-model or HW co-simulation to simulate the FFT core.
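    As a lightweight alternative, a floating-point NumPy model can serve as a quick golden reference to check hardware output files against, instead of waiting on long RTL simulations. This is only an illustration, not the Xilinx C-model:

    ```python
    # Sketch: NumPy as a fast golden reference for a large FFT. This is a
    # floating-point stand-in for checking hardware output, not the
    # bit-accurate Xilinx C-model.
    import numpy as np

    def golden_fft(stimulus, n=32768):
        """Compute a floating-point reference FFT for comparison."""
        x = np.asarray(stimulus[:n], dtype=np.complex128)
        return np.fft.fft(x, n)

    def max_error(hw_output, reference):
        """Worst-case absolute deviation between hardware and reference."""
        return float(np.max(np.abs(np.asarray(hw_output) - reference)))
    ```

    Running the 32k-point reference takes milliseconds, so the slow HDL simulation is only needed once the stimulus and expected spectrum are pinned down.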

  4. Hi! I used Unscaled mode when implementing a 256-point FFT with the above IP core.
    The result I obtained doesn't exactly match MATLAB's and is slightly offset by some integer value for every output. For example, it's 1980 in the FPGA simulation versus 1968 in the MATLAB output.
    I've also observed that I'm getting Event_Data_In_Channel_Halt, so I guess this may have something to do with the wrong output.
    I've posted on the Xilinx Forums elaborating on it...
    I'd be truly obliged if you could suggest something...
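    One possible source of small mismatches like 1980 vs. 1968 is that the core uses fixed-point twiddle factors while MATLAB computes in double precision. The sketch below only demonstrates that effect in isolation, assuming 16-bit twiddles (an assumption for illustration, not the core's documented internal width), and does not rule out other causes such as the input-channel halt:

    ```python
    # Illustrative sketch: small FFT output differences can arise from
    # fixed-point twiddle-factor quantization alone. We compare a
    # double-precision DFT against one whose twiddles are rounded to
    # 16-bit fixed point (an assumed width, for illustration only).
    import numpy as np

    def dft(x, twiddle_bits=None):
        n = len(x)
        k = np.arange(n)
        w = np.exp(-2j * np.pi * np.outer(k, k) / n)  # twiddle matrix
        if twiddle_bits is not None:
            # Round each twiddle component to the given fixed-point width.
            scale = 2 ** (twiddle_bits - 1)
            w = (np.round(w.real * scale) + 1j * np.round(w.imag * scale)) / scale
        return w @ x

    rng = np.random.default_rng(0)
    x = rng.integers(-2**15, 2**15, 256).astype(float)  # 16-bit-style input
    exact = dft(x)
    quantized = dft(x, twiddle_bits=16)
    # The two outputs differ slightly even though the input is identical:
    print(np.max(np.abs(exact - quantized)))
    ```

    If the observed offset is much larger than what twiddle quantization can explain, the Event_Data_In_Channel_Halt (input data not supplied fast enough) is the more likely culprit and worth resolving first.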