[BiO BB] Implementing HMM models in Hardware (FPGA)

John Jakson johnjakson at yahoo.com
Sun Sep 14 03:00:07 EDT 2003

Hi Venu

Interesting to see others interested in applying FPGAs to Bioinformatics. FPGAs don't get much mention here.

I am not convinced the Bio industry really cares for EE solutions it doesn't understand. Linux clusters are bad enough, but what the hell are FPGAs? As an EE VLSI/FPGA hardhat visiting the BioWorld show, held here in Boston not that long ago, all I saw was disinterest and plenty of tower server racks. Not one HW company showed up with anything but Linux clusters or the SGI/IBM/HP/... equivalent. TimeLogic & the 1 or 2 other (defunct?) accelerator companies were no-shows. Talking with the floor folks, I found no interest in, or basic understanding of, possible HW alternatives.

The issue comes down to how the problem is stated and how it can be implemented in a solution that most Bio SW types can understand. That means whatever the engine is, it must just run C code, simple as that, preferably the free stuff from NCBI. That always leads to the same solution: clusters of ever faster & ever hotter farms of today's x86. Any rational computer scientist knows this is crazy and that dedicated HW should be built. TimeLogic says it very well on their web site. In crypto, video or DSP processing, it is relatively easy to turn C code into HW since the problems are all math intensive and the code is likely created by the same EEs.

It may come as a surprise to SW types, but HW is routinely modeled in C. That code is used only to double check the design, which is written in a decent HW description language like Verilog or VHDL, both of which are implicitly parallel languages. There is usually some formal mathematical model as well, often written in Matlab for the real heavy stuff. It's also interesting that the Matlab code is usually floating point intensive & the final ASIC/FPGA solutions are not expected to produce identical results, since HW is best built integer fashion. One might regard the current Bio C codes as just simulations of HW that hasn't been built yet, since few know how to recode them in a HW language. TimeLogic did a few but not in a way that can be easily duplicated across the industry.

To turn C code into really fast HW requires understanding what the C code is really doing and having permission to make subtle but harmless changes to it to allow the really big speed ups. That means eliminating floating point. If the Bio author of such SW is also a HW expert (of which there are probably only a handful or even 0 in the whole world) then equivalent algorithms could be used that are relatively simple to map onto HW structures. I don't see the Bio world hiring too many HW EEs either; we are far too different culturally and we usually don't have PhDs, esp not from the right schools.
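To make "eliminating floating point" concrete, here is a minimal sketch of a Viterbi pass over a toy 2-state HMM done entirely in integers. The struct layout, the function name viterbi_int, the size limits and the score scaling (something like round(100 * log2(p)), computed once offline) are all assumptions for illustration, not taken from any NCBI or TimeLogic code. Once probabilities become scaled log scores, the inner loop is only adds and compares, which is the kind of harmless change that maps well onto HW.

```c
#include <assert.h>
#include <stdint.h>

#define N_STATES  2
#define N_SYMBOLS 2
#define MAX_LEN   64

/* Hypothetical model: every entry is a pre-scaled integer log score,
 * e.g. round(100 * log2(p)), computed once offline in floating point. */
typedef struct {
    int32_t start[N_STATES];
    int32_t trans[N_STATES][N_STATES];  /* trans[from][to] */
    int32_t emit[N_STATES][N_SYMBOLS];  /* emit[state][symbol] */
} IntHmm;

/* Integer-only Viterbi: adds and compares, no multiplies, no floats.
 * Writes the best state path into path[0..len-1] and returns its score. */
int32_t viterbi_int(const IntHmm *h, const uint8_t *obs, int len, uint8_t *path)
{
    int32_t v[N_STATES], next[N_STATES];
    uint8_t back[MAX_LEN][N_STATES];

    assert(len > 0 && len <= MAX_LEN);
    assert(obs[0] < N_SYMBOLS);
    for (int s = 0; s < N_STATES; s++)
        v[s] = h->start[s] + h->emit[s][obs[0]];

    for (int t = 1; t < len; t++) {
        assert(obs[t] < N_SYMBOLS);
        for (int s = 0; s < N_STATES; s++) {
            /* Pick the best predecessor of state s (first one wins ties). */
            int best_prev = 0;
            int32_t best = v[0] + h->trans[0][s];
            for (int p = 1; p < N_STATES; p++) {
                int32_t cand = v[p] + h->trans[p][s];
                if (cand > best) { best = cand; best_prev = p; }
            }
            next[s] = best + h->emit[s][obs[t]];
            back[t][s] = (uint8_t)best_prev;
        }
        for (int s = 0; s < N_STATES; s++)
            v[s] = next[s];
    }

    /* Best final state, then walk the back pointers to recover the path. */
    int last = 0;
    for (int s = 1; s < N_STATES; s++)
        if (v[s] > v[last]) last = s;
    int32_t score = v[last];

    for (int t = len - 1; t >= 0; t--) {
        path[t] = (uint8_t)last;
        if (t > 0) last = back[t][last];
    }
    return score;
}
```

The result will not match a floating point version bit for bit, because the scores are rounded once up front. In real HW the accumulator width would be chosen from the score scale and the longest sequence expected; the 32 bits used here are just a convenient assumption.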

There are other ways to turn C code into HW, such as using a C based HW language like HandelC, which is based on Occam & CSP. And there's the clue. If the SW is broken up into the constituent parallel processes that are naturally there but impossible to describe in plain C, then it becomes almost trivial to map those parallel processes onto FPGA fabric or even something like a Transputer farm. The only difference is the granularity. FPGAs are hot today but can only readily be engineered by HW types because their most efficient use requires detailed understanding of pipelines, combinatorial logic and basic cpu design. Transputers, if they still existed, would be the natural way to go because they are amenable to both SW & HW engineers, though they still worked best when SW & HW were both understood. Occam was just a way to describe parallel processes that described HW in a funny syntax. Transputers only died out because the implementation fell far behind x86 performance and was single sourced & underfunded. Most Transputer projects & users ultimately switched to standard DSPs & FPGAs, leaving the SW user base behind.
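As one illustration of that decomposition, here is a sketch that assumes a Smith-Waterman local alignment with a linear gap penalty as the kernel; the choice of kernel, the scoring constants and all names are my assumptions, picked only because the structure is familiar. Each query residue gets its own small process (a cell), the database streams past one residue per clock, and every cell talks only to its left neighbour. Plain C cannot say "all cells step at once", so the loop walks the cells right to left to imitate registers that all update on the same clock edge. A conventional sequential version is included for comparison.

```c
#include <assert.h>
#include <string.h>

#define MAX_PE 64
enum { MATCH = 2, MISMATCH = -1, GAP = 1 };

typedef struct {
    char q;     /* query residue wired into this cell */
    char d;     /* database residue register, forwarded to the right */
    int valid;  /* 1 when d holds a real residue */
    int h;      /* this cell's score from the previous clock: H(i, j-1) */
    int diag;   /* left neighbour's score two clocks ago: H(i-1, j-1) */
} Pe;

static int max2(int a, int b) { return a > b ? a : b; }

/* One cell per query residue; the database streams through the chain. */
int sw_systolic(const char *query, const char *db)
{
    size_t n = strlen(query), m = strlen(db);
    Pe pe[MAX_PE] = {0};
    int best = 0;

    assert(n > 0 && n <= MAX_PE);
    for (size_t i = 0; i < n; i++)
        pe[i].q = query[i];

    for (size_t t = 0; t + 1 < n + m; t++) {      /* n + m - 1 clocks */
        /* Right to left, so each cell still sees its neighbour's OLD
         * registers, as if every cell were clocked simultaneously. */
        for (size_t k = n; k-- > 0; ) {
            int in_valid = (k == 0) ? (t < m) : pe[k - 1].valid;
            char in_d = (k == 0) ? (in_valid ? db[t] : 0) : pe[k - 1].d;
            int in_h = (k == 0) ? 0 : pe[k - 1].h;  /* H(i-1, j) */

            if (in_valid) {
                int s = (pe[k].q == in_d) ? MATCH : MISMATCH;
                int h = max2(0, pe[k].diag + s);
                h = max2(h, in_h - GAP);
                h = max2(h, pe[k].h - GAP);
                pe[k].h = h;
                best = max2(best, h);
            }
            pe[k].diag = in_h;
            pe[k].d = in_d;
            pe[k].valid = in_valid;
        }
    }
    return best;
}

/* The usual sequential dynamic programming loop, for comparison. */
int sw_reference(const char *query, const char *db)
{
    size_t n = strlen(query), m = strlen(db);
    int prev[MAX_PE + 1] = {0}, cur[MAX_PE + 1] = {0};
    int best = 0;

    assert(n <= MAX_PE);
    for (size_t j = 0; j < m; j++) {
        for (size_t i = 1; i <= n; i++) {
            int s = (query[i - 1] == db[j]) ? MATCH : MISMATCH;
            int h = max2(0, prev[i - 1] + s);
            h = max2(h, cur[i - 1] - GAP);   /* H(i-1, j) */
            h = max2(h, prev[i] - GAP);      /* H(i, j-1) */
            cur[i] = h;
            best = max2(best, h);
        }
        memcpy(prev, cur, sizeof prev);
    }
    return best;
}
```

Both functions should return the same best score for the same inputs. The point of the first form is that each cell needs only a handful of registers and one neighbour link, so the same description could in principle sit on FPGA fabric at fine grain or on a farm of small cpus at coarse grain.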

Another approach would be to use one of the cpu farms on a chip such as those from Clearspeed, PicoChip or BOPs (RIP), who have developed RISC cpus that can be replicated up to 420 times on a chip running at 100MHz plus. It will be interesting to see if those devices can escape cell phone basestations.

So I have taken my passive interest in this subject back to the drawing board to recreate a modern FPGA hosted Transputer that would naturally execute sequential C code, parallel Occam code & even Verilog code. That means that if code can be partially migrated from seq C to par Occam style C (ie HandelC) and then to Verilog (a C'ish HW language), the same code still runs on the same cpu (but a little slower perhaps). Extra process scheduling HW is needed to support very fine grained concurrency in a modern Transputer and also a logic simulator. The big pay off is that properly parallelized code, once in Verilog form, still runs either as compiled source code on a farm of cpus using message passing and links, or it can be synthesized with industry standard HW tools back onto the FPGA fabric for the desired speed ups. In effect, sequential procedures in C code can be morphed into on chip HW coprocessors using the reconfigurable features of many FPGAs. Stable FPGA coprocessor e[...] can then be turned into much faster and cheaper ASICs in return for nasty upfront NRE. Such solutions could go much farther than current TimeLogic products for many industries besides Bio.

Xilinx & Intel can give us a clue here. A cluster cpu node based on a P4 at say 3GHz might run to $2K per node depending on what's there, even though the fastest P4 chip is always say $600. An FPGA RISC cpu node based on MicroBlaze runs at maybe only 125MHz but will cost about $1.40 per node in volume plus extra support. Now if the cpu can be farmed by adding those Transputer extensions, the 24x clock difference doesn't look so bad compared to the est 400+ fold cpu cost difference. Also a lot of slower cpus, each with local RLDRAM, don't have the memory latency that P4s suffer from, ie 1 DRAM cycle is a few cpu cycles instead of hundreds, and distributed bandwidth is much easier to manage.
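Spelling that arithmetic out as a small sketch: the constants are the rough figures quoted above, the function names are made up, and raw clock rate is used as a crude stand-in for throughput, which is a big assumption since it ignores work per cycle, memory behaviour and the "extra support" cost.

```c
#include <assert.h>

/* Rough figures quoted in the post; all are estimates. */
#define P4_CLOCK_MHZ         3000.0
#define P4_CHIP_USD           600.0
#define MICROBLAZE_CLOCK_MHZ  125.0
#define MICROBLAZE_NODE_USD     1.40

static double ratio(double a, double b)
{
    assert(b > 0.0);
    return a / b;
}

/* How many times faster the P4 clock is: 3000 / 125. */
double clock_ratio(void)
{
    return ratio(P4_CLOCK_MHZ, MICROBLAZE_CLOCK_MHZ);
}

/* How many times more the P4 chip costs: 600 / 1.40. */
double cost_ratio(void)
{
    return ratio(P4_CHIP_USD, MICROBLAZE_NODE_USD);
}

/* MHz per dollar of the soft cpu relative to the P4 chip. */
double mhz_per_dollar_gain(void)
{
    return ratio(cost_ratio(), clock_ratio());
}
```

With these inputs the clock ratio is 24 and the cost ratio is a little under 430, so on clock per dollar alone the farm of small cpus comes out roughly 18 times ahead, provided the problem really does split across that many nodes.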

It's also interesting to see the changes at TimeLogic: the departure of Jim and the merger with a company that, as far as I can see, has no obvious HW background.


John Jakson

sorry for long rant
