Reed Solomon Decoder, Berlekamp-Massey

Given a sequence of syndromes, the Berlekamp-Massey algorithm determines an LFSR that could generate this sequence.  This helps explain how the syndromes can both find and correct errors.  The locations of errors are based on the pattern behind the syndromes.  The Berlekamp-Massey algorithm is used to find this pattern.

Berlekamp-Massey Steps

Example of the Berlekamp-Massey Algorithm

The Berlekamp-Massey algorithm uses very similar hardware compared to the syndrome calculator.  But it is significantly more difficult to write HDL to perform.  While the syndrome calculation was straightforward, but the BM algorithm is iterative and required different values be computed in each iteration.  This makes highly parallel computation more difficult.

The algorithm also adds a new element into the design — the large summation.  Instead of a multiply-accumulate, as in the syndrome calculations, an array of multiplies is then added together using a large N-input XOR gate.

The algorithm is shown below:

Init:
L = 0
l = 1
dm = 1
c(x) = 1 = c_0
p(x) = 1
Iterations:  for i in 1 to k loop
  d = sum(a=0 to L)(c_a * S_{i-a})
  if d = 0 then
    l = l + 1
  elsif 2*i >= L then
    c(x) = c(x) + d dm^-1 p(x) x^l
    l = l + 1
  else
    tmp(x) = c(x)
    c(x) = c(x) + d dm^-1 p(x) x^l
    p(x) = tmp(x)
    dm = d
    l = 1

For the fpga implementation, there are three main calculations:

  1. d = \Sigma_{a=0}^{L} c_a S_{i-a}
  2. c(x) = c(x) + d (d_m^{-1} p(x) x^l)
  3. d_m^{-1} p(x)

The last of which is used to avoid the need for a three input multiplier.  These are the input that are applied to the multiplier array.  The current LFSR is stored in an shift register, and is conceptually shifted upward, with the higher degree terms at the bottom.  This allows c(x) to be multiplied by the correct syndromes in the discrepancy calculation.   It also allows p(x) to be stored in a register and remained unchanged and represent p(x) x^l in the next iterations.

The Berlekamp-Massey implementation uses a larger amount of resources.  Its possible that my implementation could be improved by increasing the resource sharing, and using certain retiming methods.  The relevant code is VHDL: Reed-Solomon Decoder Berlekamp-Massey.

This entry was posted in FPGA and tagged . Bookmark the permalink.

Comments are closed.