Soft Multiprocessor Architecture on FPGA


Category: System on Chip

Created: August 28, 2006

Updated: January 27, 2020

Language: VHDL

Other project properties

Development Status: Beta

WishBone compliant: No

WishBone version: n/a

License: GPL


Soft multiprocessors on FPGAs are becoming more attractive as design cost and NRE (non-recurring engineering) cost soar in the deep-submicron age, especially for high-performance computing applications. However, designing a multiprocessor becomes time-consuming and error-prone as the number of processors grows. To make this easier, I am designing a tool (BlazeCluster) that generates a multiprocessor architecture on an FPGA, consisting of Xilinx MicroBlaze, PowerPC and open-source processor cores, from a simple top-level script.

The tool is written in Perl. On most Linux installations the Perl interpreter is already present; on Windows XP you can install ActivePerl. The generated EDK project consists of XMP, MHS, MSS and UCF files, which can be synthesized with Xilinx EDK 7.1 and ISE 7.1. The simulator used is the ModelSim 6.1 starter version.

It can also be used as a rapid prototyping tool for high-performance computing (HPC) applications on FPGA.


- flexible Soft Multiprocessor on FPGA
- various interconnections, including DPRAM, FIFO, bus, DMA
- on-chip communication monitoring and profiling
- verified with real applications (for example, JPEG codec, Mandelbrot set)


A brief tutorial and case study is available on the download page (http://opencores.orgproject,mpdma,SoftwareMultiprocessoronFPGA20070608.pdf), together with my master's thesis.


1. Multiprocessor Testbench
1) single processor testbench *
2) four-processor testbench (not-optimized) *

2. Automatic multiprocessor template generator (BlazeCluster)
1) simple generator for MicroBlaze only *
2) support PowerPC *
3) add area estimation *
4) generator for other boards
5) generate software libraries for communication

3. Additional component
1) Performance counter for MicroBlaze *
2) DMA controller
3) message interface

4. Verification
1) JPEG encoder *
2) Mandelbrot set on FPGA *



1. 2006/10/23 STEP1-2 Designed a testbench that runs a JPEG encoder on four MicroBlaze processors. The four processors communicate with each other via FSL links. The design is not optimized, so there is not much performance improvement.

2. 2007/05/05 STEP2-1 BlazeCluster v0.1. It can generate a MicroBlaze multiprocessor architecture for the Virtex2vp30 FPGA on the XUPV2P board from a single script.

Usage: just run the tool. It then generates the MHS, MSS and UCF files from system.js in the same directory. The script syntax is designed to resemble natural language, so you can easily understand it from the several examples attached.

3. 2007/06/08 STEP2-2 BlazeCluster v0.14. It now supports PowerPC. In addition, most of the Perl code has been rewritten to move to an object-oriented model.

4. 2007/07/14 STEP2-3 BlazeCluster v0.15. Supports area estimation. This is helpful when designing a large system, because you can find out whether your design fits the chip before the time-consuming implementation step. The result is written to a log file. Note that LUT packing is not taken into account, so the slice estimate can be 10%-25% too large. The BRAM and multiplier estimates are quite accurate.
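
To make the slice caveat in the v0.15 entry concrete, the following minimal sketch turns a reported slice estimate into the range of slice counts one might expect after implementation, assuming the estimate overshoots the real figure by 10% to 25%. The function names, the example numbers and the capacity of 13,696 slices (the figure usually quoted for the XC2VP30) are illustrative assumptions; none of this is code or output from BlazeCluster.

```python
# Hypothetical helper, not part of BlazeCluster.
XC2VP30_SLICES = 13696  # assumed device capacity; check the data sheet


def expected_slice_range(estimated, over_min=0.10, over_max=0.25):
    """Return (low, high) slice counts if the estimate is 10%-25% too large."""
    low = estimated / (1.0 + over_max)   # estimate overshoots by 25%
    high = estimated / (1.0 + over_min)  # estimate overshoots by 10%
    return low, high


def fit_verdict(estimated, capacity=XC2VP30_SLICES):
    """Classify an estimate against the device capacity."""
    low, high = expected_slice_range(estimated)
    if estimated <= capacity:
        return "fits"            # fits even without any correction
    if high <= capacity:
        return "likely fits"     # fits even with the smallest correction
    if low <= capacity:
        return "uncertain"       # fits only if the overshoot is large
    return "does not fit"


if __name__ == "__main__":
    for est in (12000, 15000, 16000, 18000):
        print(est, fit_verdict(est))
```

A verdict of "uncertain" is the case where running the full implementation flow is still the only way to know.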

5. 2007/09/06 STEP3-1 BlazeCluster v0.17. Adds a performance counter for MicroBlaze to facilitate profiling on the multiprocessor. A usage example can be found in the Mandelbrot set on FPGA testbench. Measurements are accurate to within two cycles per timer operation. More at http://opencores.orgproject,performance_counter,overview
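
As a rough illustration of how readings from such a counter might be post-processed on the host, the sketch below computes an elapsed cycle count from two raw readings and attaches a worst-case error of two cycles per timer operation. The 32-bit counter width, the 100 MHz clock and all function names are assumptions made for this example; they are not taken from the performance counter core.

```python
# Hypothetical post-processing of raw counter readings.
COUNTER_BITS = 32           # assumed counter width
CLOCK_HZ = 100_000_000      # assumed processor clock (100 MHz)
CYCLES_PER_TIMER_OP = 2     # worst-case error per timer operation (from the text)


def elapsed_cycles(start, stop, bits=COUNTER_BITS):
    """Cycles between two readings, tolerating one counter wrap-around."""
    return (stop - start) % (1 << bits)


def elapsed_with_error(start, stop, timer_ops=2):
    """Return (cycles, worst_case_error) for a start/stop measurement.

    A start and a stop are two timer operations, so the default
    worst-case error is 2 * 2 = 4 cycles.
    """
    return elapsed_cycles(start, stop), CYCLES_PER_TIMER_OP * timer_ops


def cycles_to_us(cycles, clock_hz=CLOCK_HZ):
    """Convert a cycle count to microseconds."""
    return cycles * 1e6 / clock_hz


if __name__ == "__main__":
    cycles, err = elapsed_with_error(100, 1100)
    print(f"{cycles} +/- {err} cycles = {cycles_to_us(cycles)} us")
```

For short code regions the fixed error matters; for regions of thousands of cycles it becomes negligible.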

6. 2007/09/06 STEP4-2 Mandelbrot set on a soft multiprocessor on FPGA. The system consists of one PowerPC and eight MicroBlaze processors with FPUs, running on the XUPV2P board. The profiling results show that the load is evenly distributed across the eight MicroBlaze processors. The overall performance is not very impressive, however: it is 4 times slower than a 2.0 GHz PC. Any feedback on improving performance is welcome.
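
The project description does not say how the image was divided among the eight processors. As a host-side sketch only, the code below compares two hypothetical schemes, contiguous row blocks and interleaved rows, by summing the escape-time iteration counts each worker would have to compute. The image region, grid size, iteration limit and all names are assumptions chosen for illustration.

```python
# Hypothetical load-balance model; not the code running on the FPGA.

def escape_iterations(cx, cy, max_iter):
    """Iterations of z = z*z + c before |z| exceeds 2 (capped at max_iter)."""
    zx = zy = 0.0
    for i in range(max_iter):
        if zx * zx + zy * zy > 4.0:
            return i
        zx, zy = zx * zx - zy * zy + cx, 2.0 * zx * zy + cy
    return max_iter


def row_costs(width=64, height=64, max_iter=50):
    """Total iteration count per image row over [-2, 1] x [-1.5, 1.5]."""
    costs = []
    for r in range(height):
        cy = -1.5 + (r + 0.5) * 3.0 / height
        cost = 0
        for c in range(width):
            cx = -2.0 + (c + 0.5) * 3.0 / width
            cost += escape_iterations(cx, cy, max_iter)
        costs.append(cost)
    return costs


def worker_loads(costs, workers, interleaved):
    """Sum row costs per worker for interleaved or contiguous-block rows."""
    loads = [0] * workers
    block = len(costs) // workers
    for r, cost in enumerate(costs):
        w = r % workers if interleaved else min(r // block, workers - 1)
        loads[w] += cost
    return loads


def imbalance(loads):
    """Ratio of the busiest worker to the average (1.0 is perfect balance)."""
    return max(loads) / (sum(loads) / len(loads))


if __name__ == "__main__":
    costs = row_costs()
    print("blocks:     ", imbalance(worker_loads(costs, 8, False)))
    print("interleaved:", imbalance(worker_loads(costs, 8, True)))
```

Because the rows near the middle of the image contain most of the set, interleaving rows spreads the expensive rows across all workers, which is one way an even load such as the one reported could be obtained.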

Note that you need to copy the Xilinx plb-vga-controller pcores and drivers into the project directory in order to implement the Mandelbrot design. They can be extracted from