Better Pipelined Accumulator

As mentioned in a previous article, there is a better way to perform a pipelined addition or accumulation.  This method relies on the fact that an addition can be efficiently broken up across multiple cycles.

Figure: Improved Pipeline Accumulator

The figure above shows the basic circuit for a two-cycle pipelined accumulator.  The accumulation can be broken across two cycles efficiently because the underlying addition can: the low half of the word is added in one cycle, and its registered carry feeds the high half in the next.  In fact, the same circuit could easily be modified to perform pipelined addition, though pipelined addition doesn’t have feedback in the first place.
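To make the carry handoff concrete, here is a minimal bit-accurate sketch in Python. It is a software model for illustration only, not the Verilog from this post. The class name, register names, and the even split of the word are my own assumptions.

```python
class TwoCycleAccumulator:
    """Software model of an accumulator split into a low and a high half.

    All names here are hypothetical; they do not come from pipe_acc.v.
    """

    def __init__(self, width=16):
        self.lo_bits = width // 2
        self.hi_bits = width - self.lo_bits
        self.lo_mask = (1 << self.lo_bits) - 1
        self.hi_mask = (1 << self.hi_bits) - 1
        self.acc_lo = 0    # low-half accumulator register
        self.carry = 0     # registered carry out of the low half
        self.x_hi_d = 0    # high half of the input, delayed one cycle
        self.acc_hi = 0    # high-half accumulator register
        self.acc_lo_d = 0  # low half delayed one cycle to realign the output

    def clock(self, x):
        """Apply one clock edge with input x; return the registered output."""
        # Compute every next-state value from the current registers first,
        # the way non-blocking assignments behave in hardware.
        lo_sum = self.acc_lo + (x & self.lo_mask)
        hi_sum = self.acc_hi + self.x_hi_d + self.carry
        next_acc_lo_d = self.acc_lo
        next_x_hi_d = (x >> self.lo_bits) & self.hi_mask

        # Commit the new register values.
        self.acc_lo = lo_sum & self.lo_mask
        self.carry = lo_sum >> self.lo_bits
        self.acc_hi = hi_sum & self.hi_mask
        self.x_hi_d = next_x_hi_d
        self.acc_lo_d = next_acc_lo_d
        return (self.acc_hi << self.lo_bits) | self.acc_lo_d


acc = TwoCycleAccumulator(width=16)
xs = [0xFFFF, 0x0001, 0x1234, 0x00FF, 0xABCD, 0x0000]
outs = [acc.clock(x) for x in xs]
```

In this model the output seen after a given clock is the running sum of all inputs except the most recent one, which is the extra cycle of latency paid for splitting the carry chain.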

Figure: Array

To brush up on my Verilog skills, I’ve decided to write a parameterizable pipelined accumulator module.  The code has a definite VHDL feel to it.  It might be a bit more verbose than is needed.

The code works by creating an array of cells, PIPELINE stages long and NSTAGES tall.  This fits naturally into a nested generate.  Interestingly, Verilog’s nested generates have a different syntax than the normal generates.  As shown, an accumulator is generated when ii = jj; otherwise a register delay is generated.
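The array can also be sketched as a small Python model. This is my own reading of the structure, not a translation of the Verilog: I assume ii indexes the column (pipeline stage), jj indexes the row (slice of the word), each accumulator registers its carry for the next row, and the word width divides evenly into NSTAGES slices. The function name and its arguments are hypothetical.

```python
def make_pipe_acc(width, nstages, pipeline):
    """Return a clock(x) function modelling the register/accumulator grid.

    Assumed layout: row jj carries slice jj of the word, column ii is the
    pipeline stage. The cell at ii == jj accumulates; all others delay.
    """
    assert width % nstages == 0 and pipeline >= nstages
    bits = width // nstages
    mask = (1 << bits) - 1
    regs = [[0] * pipeline for _ in range(nstages)]
    carry = [0] * nstages  # registered carry out of each row's accumulator

    def clock(x):
        nonlocal regs, carry
        new_regs = [row[:] for row in regs]
        new_carry = carry[:]
        for jj in range(nstages):
            for ii in range(pipeline):
                # Column 0 takes its slice of the input; later columns take
                # the previous column's registered value.
                if ii == 0:
                    din = (x >> (jj * bits)) & mask
                else:
                    din = regs[jj][ii - 1]
                if ii == jj:
                    # Accumulator cell: add the delayed slice plus the carry
                    # registered by the row below on the previous clock.
                    cin = carry[jj - 1] if jj > 0 else 0
                    total = regs[jj][ii] + din + cin
                    new_regs[jj][ii] = total & mask
                    new_carry[jj] = total >> bits
                else:
                    # Plain register delay.
                    new_regs[jj][ii] = din
        regs, carry = new_regs, new_carry
        # The last column of every row holds time-aligned slices.
        return sum(regs[jj][-1] << (jj * bits) for jj in range(nstages))

    return clock


clock = make_pipe_acc(width=12, nstages=3, pipeline=4)
xs = [0xFFF, 0x001, 0x000, 0x123, 0x0F0, 0xABC, 0x000, 0x000, 0x000]
outs = [clock(x) for x in xs]
```

Under these assumptions, the registers to the left of the diagonal skew each slice so it meets its carry, and the registers to the right realign the slices, so the full sum of an input appears `pipeline - 1` clocks after the clock that captured it.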

I may change the code in the future to move the accumulators further right.  This should allow for efficient packing into DSP tiles, at the cost of fewer routing registers at the output.

Another feature of the code is the ability to select how the pipelining should be done.  It can be set up to advance on every clock cycle (“SHIFT_ON_CLK”), or to advance only when the En input is asserted (“SHIFT_ON_EN”).  The design also allows more pipeline stages than the minimum required.
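The difference between the two modes is easiest to see on a bare delay line. The sketch below is a simplified Python model under my own assumptions about the semantics: in SHIFT_ON_CLK the registers shift on every clock regardless of En, and in SHIFT_ON_EN they hold unless En is high. The function name and arguments are hypothetical.

```python
def run_delay_line(samples, enables, depth, mode):
    """Clock a depth-stage register chain and return its output per clock.

    mode is "SHIFT_ON_CLK" (shift every clock) or "SHIFT_ON_EN"
    (shift only on clocks where the enable is high).
    """
    assert mode in ("SHIFT_ON_CLK", "SHIFT_ON_EN")
    regs = [0] * depth
    outs = []
    for x, en in zip(samples, enables):
        if mode == "SHIFT_ON_CLK" or en:
            regs = [x] + regs[:-1]  # every register takes its neighbour
        outs.append(regs[-1])
    return outs


samples = [1, 2, 3, 4, 5, 6]
enables = [1, 0, 1, 1, 0, 1]
on_clk = run_delay_line(samples, enables, depth=2, mode="SHIFT_ON_CLK")
on_en = run_delay_line(samples, enables, depth=2, mode="SHIFT_ON_EN")
print(on_clk)  # [0, 1, 2, 3, 4, 5]
print(on_en)   # [0, 0, 1, 3, 3, 4]
```

In the first mode the latency is a fixed number of clocks; in the second it is a fixed number of enabled clocks, so samples presented while En is low never enter the chain.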

The code is intended as an example of reusable code.  In addition to making heavy use of parameters and generates, it includes a regression testbench that runs multiple test cases.  If the pipe_acc.v file is modified, this makes it much easier to re-test different widths, pipeline stages, and pipeline types.

Furthermore, the testbench uses an LFSR to generate the En cadence.  This helps expose the strange issues that can occur in pipelines, such as cases where En-delays and Clk-delays are intermixed in a dangerous manner.
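For readers who have not used an LFSR as a stimulus source, here is a minimal Python sketch of the idea. The 16-bit width, the tap positions (the widely used polynomial x^16 + x^14 + x^13 + x^11 + 1), and the seed are my assumptions; the testbench in this post may use a different LFSR.

```python
def lfsr_enables(count, seed=0xACE1):
    """Return `count` pseudo-random enable bits from a 16-bit Fibonacci LFSR.

    Width, taps, and seed are illustrative assumptions, not taken from the
    testbench described in this post.
    """
    state = seed & 0xFFFF
    assert state != 0, "an all-zero state would lock the LFSR up"
    bits = []
    for _ in range(count):
        bits.append(state & 1)  # use the low bit as En for this clock
        feedback = (state ^ (state >> 2) ^ (state >> 3) ^ (state >> 5)) & 1
        state = (state >> 1) | (feedback << 15)
    return bits


en = lfsr_enables(32)
```

Because the sequence is deterministic for a given seed, a failing regression run can be reproduced exactly, while the irregular gaps between enables still exercise the pipeline with many different spacings.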

The code is provided below, and has a regression test for a few test cases:
