Stochastic simulation of large-scale biochemical reaction networks with thousands of reactions is important for systems biology and medicine, since it will enable in silico experimentation with genome-scale reconstructed networks. FPGA-based stochastic simulation accelerators can exploit parallelism, but have been limited in the size of the biomodels they can handle. We present a high-performance, scalable System on Chip architecture for implementing Gibson and Bruck's Next Reaction Method (NRM) efficiently in reconfigurable hardware. Our MPSoC uses aggressive pipelining at the core level and combines many cores into a Network on Chip (NoC) to execute stochastic repetitions of complex biomodels in parallel, each with up to 4K reactions. The performance of our NRM core depends only on the average outdegree of the biomodel's Dependencies Graph (DG) and not on the number of DG nodes (reactions). By adding cores to the NoC, the system's performance scales linearly and reaches GCycles/sec levels. We show that a medium-size FPGA running at ~200 MHz delivers high speedup gains relative to a popular and efficient software simulator running on a very powerful workstation PC.
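To illustrate why the work per simulation step is tied to the outdegree of the Dependencies Graph, the following is a minimal software sketch of the Next Reaction Method. It is not the hardware design described here: the names `build_dependency_graph`, `propensity`, `draw`, and `simulate`, the reaction encoding as `(reactants, products, rate)` tuples, and the lazy-deletion heap are all assumptions made for illustration. As a further simplification, a reaction whose propensity was zero gets a freshly drawn firing time instead of reusing a stored one as in the original method.

```python
import heapq
import math
import random

def build_dependency_graph(reactions):
    """DG edge i -> j when firing i changes a reactant of j (i -> i always)."""
    graph = []
    for i, (reac_i, prod_i, _) in enumerate(reactions):
        changed = {s for s in set(reac_i) | set(prod_i)
                   if prod_i.get(s, 0) != reac_i.get(s, 0)}
        deps = {j for j, (reac_j, _, _) in enumerate(reactions)
                if changed & set(reac_j)}
        deps.add(i)
        graph.append(sorted(deps))
    return graph

def propensity(reaction, state):
    """Mass-action propensity: rate times the number of reactant combinations."""
    reactants, _, rate = reaction
    a = rate
    for species, n in reactants.items():
        a *= math.comb(state[species], n)
    return a

def draw(a, rng):
    """Exponential waiting time; a reaction with zero propensity never fires."""
    return rng.expovariate(a) if a > 0 else math.inf

def simulate(reactions, state, t_end, seed=0):
    rng = random.Random(seed)
    state = dict(state)
    graph = build_dependency_graph(reactions)
    a = [propensity(r, state) for r in reactions]
    tau = [draw(ai, rng) for ai in a]           # absolute putative firing times
    heap = [(tau_i, i) for i, tau_i in enumerate(tau)]
    heapq.heapify(heap)
    fired = updates = 0
    while heap:
        t, mu = heapq.heappop(heap)
        if t != tau[mu]:
            continue                            # stale entry, superseded by an update
        if t > t_end:
            break                               # also stops when nothing can fire
        reactants, products, _ = reactions[mu]
        for species, n in reactants.items():
            state[species] -= n
        for species, n in products.items():
            state[species] = state.get(species, 0) + n
        fired += 1
        # Only the DG successors of mu are touched: work per step ~ outdegree.
        for j in graph[mu]:
            a_old, a[j] = a[j], propensity(reactions[j], state)
            if j == mu or a_old == 0 or a[j] == 0:
                tau[j] = t + draw(a[j], rng)    # fresh draw (simplification)
            else:
                tau[j] = t + (a_old / a[j]) * (tau[j] - t)  # rescale old time
            heapq.heappush(heap, (tau[j], j))
            updates += 1
    return state, fired, updates
```

For a two-reaction chain A -> B -> C, `simulate([({"A": 1}, {"B": 1}, 1.0), ({"B": 1}, {"C": 1}, 1.0)], {"A": 50, "B": 0, "C": 0}, t_end=1e9)` performs one propensity update per DG edge leaving the fired reaction, so the `updates` counter grows with the outdegree of the fired node rather than with the total number of reactions. The heap operations in this software sketch still cost time logarithmic in the number of reactions; the sketch makes no attempt to model the pipelining or the NoC.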