CF Reconfigurable Computing Array


Category: Coprocessor

Created: August 20, 2003

Updated: January 27, 2020

Other project properties

Development Status: Stable

WishBone compliant: No

WishBone version: n/a

License: n/a


Cores are generated from Confluence, a modern logic design language. Confluence is a simple yet highly expressive language that compiles into Verilog, VHDL, and C. See for more info. Several cores are provided in Verilog, VHDL, and C. If you don't see the configuration you need, chances are we can easily generate it for you.

The Reconfigurable Computing Array (RCA) is a platform for dynamic reconfigurable computing. RCA consists of a fine-grained array of reconfigurable "square" logic tiles. Similar to an FPGA CLB, a tile can be programmed to perform a wide variety of functions.


Overview

Unlike FPGAs, RCA has no routing fabric. Rather, all tiles communicate directly with their nearest neighbors: north, south, west, and east. Because a tile's inputs are registered, the lack of routing fabric prevents the end-to-end combinatorial logic design that is possible with general-purpose FPGAs.

However, the advantage of "hard-wiring" tiles is twofold: greater logic density and improved speed. First, FPGAs consume 80-90% of their area on routing; only about 10% yields useful logic in the form of CLBs. Without the routing fabric, it is possible that RCA can increase logic density by a factor of 10. Second, because signals are registered across tile boundaries, timing is deterministic and constant. Furthermore, since tiles are fine-grained, clock rates into the GHz range should be possible.

The goal of this project is to develop an understanding of optimal tile architecture trade-offs and RCA compiler technology.

Tile Structure

A tile is square, having four 1-bit inputs and four 1-bit outputs named north, south, west, and east. An array is a collection of tiles organized like a checkerboard, each side connecting to an adjacent tile. For instance, the east output of a tile on the left plugs into the west input of a tile on the right.

In terms of tile architecture, there are several possibilities. The initial architecture is based on 3-to-1 look-up tables (LUTs). There are four LUTs per tile, one for each direction, and each LUT has three 8-to-1 multiplexers for input data selection.

[Figure: tile architecture, with only the north datapath shown]

Top Level Interface and Array Configuration

At the top level, RCA has 4 input data buses and 4 output data buses: an input and an output bus for each side of the array (N, S, W, E). Bit 0 of "north_i", "north_o", "south_i", and "south_o" corresponds to the westernmost tile. Likewise, bit 0 of "west_i", "west_o", "east_i", and "east_o" corresponds to the northernmost tile.
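The nearest-neighbor wiring and the top-level bus bit ordering can be sketched in a few lines of Python. This is only an illustrative model, not code from the project: the function name `input_source` and the tuple format it returns are hypothetical, and it assumes that row 0 is the northernmost row and column 0 is the westernmost column, which matches the bit-0 convention stated for the buses.

```python
def input_source(row, col, side, rows, cols):
    """Return what drives the `side` input of tile (row, col).

    Hypothetical helper for illustration only.
    Result is ("tile", r, c, out_side) when a neighbouring tile's output
    drives the input, or ("bus", bus_name, bit) when the tile sits on the
    array edge and is driven by a top-level input bus.
    Assumes row 0 is northernmost and column 0 is westernmost.
    """
    if side == "north":
        if row == 0:
            return ("bus", "north_i", col)      # bit 0 = westernmost tile
        return ("tile", row - 1, col, "south")  # neighbour above, its south output
    if side == "south":
        if row == rows - 1:
            return ("bus", "south_i", col)
        return ("tile", row + 1, col, "north")
    if side == "west":
        if col == 0:
            return ("bus", "west_i", row)       # bit 0 = northernmost tile
        return ("tile", row, col - 1, "east")   # east output of the tile on the left
    if side == "east":
        if col == cols - 1:
            return ("bus", "east_i", row)
        return ("tile", row, col + 1, "west")
    raise ValueError("side must be north, south, west, or east")
```

For a 2x3 array, `input_source(1, 2, "west", 2, 3)` names the east output of tile (1, 1), while `input_source(0, 0, "north", 2, 3)` names bit 0 of "north_i".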
All tile interconnection registers are synchronized on the "clock_main_c" clock.

In addition to the data buses, the configuration bus handles the programming and reconfiguration of the array. Configuration is synchronized on the "clock_config_c" clock. Each datapath within each tile is addressable. Configuration addressing is as follows (msb on the left):

- ConfigAddr = {RowSelect, ColSelect, DirSelect}
- RowSelect of 0 corresponds to the northernmost row.
- ColSelect of 0 corresponds to the westernmost column.
- DirSelect: 00=north, 01=south, 10=west, 11=east.

The configuration data is 18 bits wide. It defines the LUT function, the input MUX selection, and the output MUX selection for a specific tile datapath. The following defines the configuration data format:

- ConfigData[17] : Output Select (0=direct, 1=registered)
- ConfigData[16:14] : Input Select 2
- ConfigData[13:11] : Input Select 1
- ConfigData[10:8] : Input Select 0
  - 000=north_in
  - 001=south_in
  - 010=west_in
  - 011=east_in
  - 100=north_state
  - 101=south_state
  - 110=west_state
  - 111=east_state
- ConfigData[7:0] : LUT data {f(7), f(6), f(5), f(4), f(3), f(2), f(1), f(0)}

Routing and Function

With no routing fabric, data routing is performed in the configuration of each tile. Because every tile input is registered, designs on RCA are micro-pipelined. To simplify pipeline data alignment, each tile output can come directly from the LUT or be delayed 1 cycle through an output register.

With each tile having 4 independent datapaths (N, S, W, E), function and routing can be grouped onto the same tile. For instance, a function can be performed from West and South to East, while at the same time data is routed from North to South. Note that the South input and South output are separate datapaths.

Embedded Extensions

As with platform FPGAs, RCA can benefit from specialized embedded components, such as block RAM, hardware multipliers, and processors.
Implementing embedded components is possible by replacing internal tile groups with hard IP.
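As a worked illustration of the configuration format described above, the following Python sketch packs a ConfigAddr and an 18-bit ConfigData word and evaluates the LUT for one datapath. It is a minimal model under stated assumptions, not the project's tooling. The function names are hypothetical, and the width of ColSelect is passed in as `col_bits` because it depends on the array size. The sketch also assumes that all three Input Select fields share the same 3-bit source encoding, and that the LUT index is formed as {input 2, input 1, input 0} with input 0 as the least significant bit. The output register (ConfigData[17]) is encoded but not simulated.

```python
DIRS = {"north": 0b00, "south": 0b01, "west": 0b10, "east": 0b11}

# Input select encoding, 000 through 111, in the order the format lists it.
SOURCES = ["north_in", "south_in", "west_in", "east_in",
           "north_state", "south_state", "west_state", "east_state"]


def config_addr(row, col, direction, col_bits):
    """Pack ConfigAddr = {RowSelect, ColSelect, DirSelect}, msb on the left.

    col_bits (width of ColSelect) is an assumption supplied by the caller.
    """
    if not 0 <= col < (1 << col_bits):
        raise ValueError("col does not fit in col_bits")
    return (row << (col_bits + 2)) | (col << 2) | DIRS[direction]


def config_data(lut, sel0, sel1, sel2, registered=False):
    """Pack the 18-bit ConfigData word for one tile datapath."""
    if not 0 <= lut <= 0xFF:
        raise ValueError("lut must be an 8-bit value")
    return ((int(registered) << 17)
            | (SOURCES.index(sel2) << 14)
            | (SOURCES.index(sel1) << 11)
            | (SOURCES.index(sel0) << 8)
            | lut)


def lut_output(data, signals):
    """Evaluate the LUT of one datapath for the given 1-bit signal values.

    Assumes index = {in2, in1, in0}; bit i of ConfigData[7:0] holds f(i).
    """
    in0 = signals[SOURCES[(data >> 8) & 0b111]]
    in1 = signals[SOURCES[(data >> 11) & 0b111]]
    in2 = signals[SOURCES[(data >> 14) & 0b111]]
    index = (in2 << 2) | (in1 << 1) | in0
    return (data >> index) & 1


# Example: east datapath computes west_in XOR south_in (input 2 is ignored).
# f(i) = bit0(i) XOR bit1(i) gives LUT data 0b01100110 = 0x66.
xor_word = config_data(0x66, sel0="west_in", sel1="south_in", sel2="north_in")
xor_addr = config_addr(row=1, col=2, direction="east", col_bits=2)
```

Here `xor_word` is 0xA66 and `xor_addr` is 0b11011. Calling `lut_output(xor_word, {"north_in": 0, "south_in": 1, "west_in": 0})` returns 1, which mirrors the "West and South to East" function example from the Routing and Function section.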