Often, things that sound like they should work well actually don’t. This relates to one of my rules: “avoid excessively creative code.”
In this case, I decided to compare the fancy “carry-lookahead” adder to whatever fabric-based adder would be inferred by the addition operator in VHDL. The carry-lookahead adder is often touted as a high-performance adder. The results for an FPGA certainly didn’t show this.
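For reference, an inferred adder needs very little code. The following is a minimal sketch, not the exact code used for these measurements: the entity name, the `WIDTH` generic, and the choice to register both inputs and the output are assumptions made for illustration.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical wrapper: names and register placement are illustrative only.
entity inferred_adder is
  generic (
    WIDTH : positive := 64  -- assumed parameter; 16, 32, 48, or 64 in the comparison
  );
  port (
    clk : in  std_logic;
    a   : in  unsigned(WIDTH - 1 downto 0);
    b   : in  unsigned(WIDTH - 1 downto 0);
    sum : out unsigned(WIDTH - 1 downto 0)
  );
end entity inferred_adder;

architecture rtl of inferred_adder is
  -- Input registers so the measured path is register-to-register.
  signal a_r, b_r : unsigned(WIDTH - 1 downto 0) := (others => '0');
begin
  process (clk)
  begin
    if rising_edge(clk) then
      a_r <= a;
      b_r <= b;
      -- The "+" operator lets the synthesis tool choose the adder structure.
      -- The carry-out is dropped here; the result wraps modulo 2**WIDTH.
      sum <= a_r + b_r;
    end if;
  end process;
end architecture rtl;
```

The whole adder is the single `a_r + b_r` expression; everything else is port and register plumbing.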
I did a quick comparison of speed and area for 16b, 32b, 48b, and 64b adders in a Xilinx Virtex-6 (-1 speed grade). The Virtex-6 is a very nice FPGA, and the 16b and 32b inferred adders actually met 500MHz timing after PAR, at least for my very simple top-level. The 48b and 64b adders came in at 449MHz and 349MHz respectively.
The logically superior carry-lookahead adder did not fare as well. None of the adders could meet 500MHz — not even the 16b adder. The 64b adder came in at 259MHz. Further, the 64b version used 220 LUTs compared to the inferred adder’s 64 LUTs.
The cause was easy to see in the worst-case paths. The 64b carry-lookahead’s worst path was 20% logic delay and 80% routing delay. The 64b inferred adder’s worst path was the reverse: 80% logic and 20% routing. While the carry-lookahead used fewer levels of logic, it wasn’t able to make good use of the fast carry chains in the FPGA, so most of its delay ended up in routing.
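To show why a lookahead structure tends to land in general-purpose LUTs and routing, here is a sketch of a textbook 4-bit lookahead block. It is an assumption about the general shape of such code, not the adder that was actually benchmarked; the entity name `cla4` and its ports are hypothetical.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical 4-bit carry-lookahead block (textbook form, for illustration).
entity cla4 is
  port (
    a, b : in  std_logic_vector(3 downto 0);
    cin  : in  std_logic;
    sum  : out std_logic_vector(3 downto 0);
    cout : out std_logic
  );
end entity cla4;

architecture rtl of cla4 is
  signal g, p : std_logic_vector(3 downto 0);  -- per-bit generate / propagate
  signal c    : std_logic_vector(4 downto 0);  -- carry into each bit, plus carry-out
begin
  g <= a and b;
  p <= a xor b;

  -- Every carry is computed directly from g, p, and cin instead of rippling.
  c(0) <= cin;
  c(1) <= g(0) or (p(0) and c(0));
  c(2) <= g(1) or (p(1) and g(0)) or (p(1) and p(0) and c(0));
  c(3) <= g(2) or (p(2) and g(1)) or (p(2) and p(1) and g(0))
          or (p(2) and p(1) and p(0) and c(0));
  c(4) <= g(3) or (p(3) and g(2)) or (p(3) and p(2) and g(1))
          or (p(3) and p(2) and p(1) and g(0))
          or (p(3) and p(2) and p(1) and p(0) and c(0));

  sum  <= p xor c(3 downto 0);
  cout <= c(4);
end architecture rtl;
```

Each carry equation is a wide function of many inputs. A synthesis tool will typically spread these across several LUTs connected by general routing, and a wider adder needs many such blocks plus another lookahead level between them.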
The lesson from all this is that some things seem like a good idea until you try them. Had a designer started with the carry-lookahead adder, not only would the code be harder to read, but it would also use more area and run slower.
Note: for large adders, DSP48 slices can also be used. That was not the purpose of this article.
