Space is at a premium on FPGAs. Unlike GPUs, which run the same instruction on many cores at once, each part of an FPGA operates on only one thing at a time, so data essentially streams through the logic. To get good performance, code has to be unrolled wherever possible...
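
To make "unrolled" concrete, here is a rough sketch in plain C of the same dot product written rolled and then unrolled by four. The function names `dot_rolled` and `dot_unrolled4` are made up for illustration, and the unroll factor of 4 is an arbitrary assumption. On a CPU this is just a loop transformation; the point for an FPGA is that a high-level synthesis tool could, in principle, turn the four independent multiply-accumulates into four copies of the hardware working side by side, which is where the extra space goes. Many such tools expose an unroll directive instead of requiring this by hand, so treat this as a picture of the idea rather than how any particular toolchain wants it written.

```c
#include <stddef.h>
#include <stdint.h>

/* Rolled form: one multiply-accumulate per loop iteration. */
int32_t dot_rolled(const int32_t *a, const int32_t *b, size_t n) {
    int32_t acc = 0;
    for (size_t i = 0; i < n; i++) {
        acc += a[i] * b[i];
    }
    return acc;
}

/* Unrolled by 4 (hypothetical factor): four independent accumulators,
 * so the four multiply-accumulates in the loop body do not depend on
 * each other and could be mapped to four parallel hardware units. */
int32_t dot_unrolled4(const int32_t *a, const int32_t *b, size_t n) {
    int32_t acc0 = 0, acc1 = 0, acc2 = 0, acc3 = 0;
    size_t i = 0;

    /* Main body: handle four elements per iteration. */
    for (; i + 4 <= n; i += 4) {
        acc0 += a[i]     * b[i];
        acc1 += a[i + 1] * b[i + 1];
        acc2 += a[i + 2] * b[i + 2];
        acc3 += a[i + 3] * b[i + 3];
    }

    /* Tail: handle the leftover elements when n is not a multiple of 4. */
    for (; i < n; i++) {
        acc0 += a[i] * b[i];
    }

    return acc0 + acc1 + acc2 + acc3;
}
```

Both versions compute the same result; the unrolled one trades more logic (space) for more work done per step, which is the tension between limited space and performance described above.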