My review of O'Reilly's latest clusters book published at HPCwire (http://www.tgc.com/hpcwire.html):

> 'Crazy Talk' Clutters New Cluster Book
> Glen Otero, Linux Prophet
>
> When my colleagues and I heard that O'Reilly was releasing another cluster book ("High Performance Linux Clusters with OSCAR, Rocks, openMosix & MPI"), we knew it would not turn out well. One of my colleagues even said, "It's going to be written by some guy that doesn't know anything and [gets all excited] over clusters."
>
> Why such a pessimistic prediction?
>
> For one, it was uttered by the same cluster expert that O'Reilly ignored while producing their first cluster book debacle several years ago. When told that their first book ("Building Linux Clusters" by David Spector) should be scrapped and rewritten, O'Reilly ignored their reviewers. The advice only came from the knowledgeable folks at VA Linux, *the* cluster company at that time. But what does VA Linux know? It's O'Reilly, they obviously know better.
>
> The first O'Reilly cluster book was a complete disaster. I wrote a scathing review of it for Linux Journal in 2000. Completely void of anything useful, the book and included software were simply not finished. It was like reading a rough draft. Totally embarrassed, and suddenly void of hubris, O'Reilly apologized to its audience and pulled the book from print.
>
> Not satisfied to sit around pointing fingers and complaining, I told O'Reilly I would help them with their next cluster book attempt, if there even was one. Before long, I signed a contract to write a clusters book for O'Reilly. But in their infinite wisdom, they didn't like the first few chapters that I submitted. Although I had gotten other cluster experts to review what I had written, O'Reilly didn't bother to get any experts to review what I was writing. They just didn't like it, so they dismissed it out of hand.
> Needless to say, the "we know better" attitude was back, and that ended the contract.
>
> Which brings us to the present day. This latest cluster book suffers from the same brain-damaged, hubris-driven process at O'Reilly. Just like the first book, it's written by a virtual unknown in the cluster community (Joseph D. Sloan) and comes across as having been written in a vacuum.
>
> Let's start with the book's title, "High Performance Linux Clusters with OSCAR, Rocks, openMosix & MPI." There's nothing high-performance about this book because there's no discussion of using any high performance networks like Myrinet, InfiniBand, or Quadrics outside of four paragraphs on page 40. There are so many ill-informed sweeping generalizations made about cluster networks on that page that I threw the book against the wall when I read them. For example, Quadrics and InfiniBand are clearly established networking technologies, not merely "emerging," as the author believes. Sloan obviously hasn't attended a Supercomputing conference in the last several years. Unfortunately, the rest of the book is rife with inaccurate cluster oversimplifications and incorrect definitions of terms like single system image (SSI) and virtual machine interface (VMI). The "beginner's guide" design of the book is no excuse for inaccuracies and oversimplifications.
>
> In my eyes, this book was doomed for the trash after page 8. Sloan states that the term "Beowulf" is a politically charged term that would be avoided in the book. That is the most ridiculous thing I have ever heard. It's impossible to take that comment seriously, especially since the author doesn't even take the time to properly define a Beowulf. For these reasons alone, I can't take this book seriously. I've thrown back my share of adult beverages with Don Becker, and trust me when I say that the political nature of Beowulf has never come up.
> Adding to the confusion, the phrase "more traditional Beowulf-style cluster" is then used on page 63. I hope now you'll understand why I think this book is schizophrenic at best.
>
> Defining a Beowulf shouldn't have been too difficult for Sloan. He could have used a term that he introduced on page 10, "asymmetric cluster." But I guess it's too much to ask that the Beowulf project, Tom Sterling and Don Becker's brainchild that started the high performance cluster phenomenon, be properly described and defined in a clusters book. By the way, I've never heard the term "asymmetric architecture" used when describing clusters. And, outside this book, you won't either.
>
> After page 8, it's apparent that the author has nothing original to offer and is going to regurgitate what has already been written about clusters. There is absolutely no value in this because the online documentation for all of the cluster projects covered by the author is far more informative than what is included in the book. For example, while screenshots of a cluster install are included in the online Rocks documentation, they are omitted in the book. Furthermore, after regurgitating much of the online Rocks documentation, the author doesn't offer any additional helpful hints or troubleshooting advice. As someone who runs a company that provides and supports cluster software based on Rocks, I can tell you that there are plenty of pitfalls that should have been mentioned.
>
> This underscores my major complaint with this book. There's nothing new, nothing novel and no real help offered. Everything is just laid out superficially in front of the reader for them to make the right cluster decision. The book should guide the cluster decision-making process, but it only offers a bunch of questions -- with no substantial answers.
>
> Sloan even admits on page 91 that there is a very detailed set of installation instructions for OSCAR, including screenshots, available online. So why is this book necessary again? Oh yeah, the author is supposed to help the reader decide if OSCAR, or any cluster toolkit for that matter, is right for the reader. Unfortunately, no help of any kind is offered.
>
> The typos and omissions weren't rampant this time, but the errors I found on pages 76, 123, 127, 130, and 136 provided nasty flashbacks of the first O'Reilly book. Good thing I resigned myself to do a shot of tequila after every typo I found. It dulled the pain this book inflicted.
>
> OK. "Part I -- An Introduction to Clusters" is just inaccurate and infuriating. "Part II -- Getting Started Quickly" contains recycled and reformatted content easily found for free online. "Part III -- Building Custom Clusters" isn't really about building custom clusters, but looks more closely at some software that was glossed over in Parts I & II. While I don't agree with the inclusion of the parallel virtual file system (PVFS) and the omission of Sun Grid Engine in Part III, I'm sure this can be chalked up to one of the tough decisions the author had to make, like the omission of PVM and Condor from the book. "Part IV -- Cluster Programming" is actually a very good introduction to programming, debugging, and profiling MPI programs.
>
> It's obvious that this book has no clear identity. It's like a 5th grader's book report: a lifeless facsimile of what's been read, totally void of originality, wisdom or topic advancement. But it's a quick read because it uses small words.
>
> Should I be this harsh? After all, cluster computing is a complex subject where the answer to most questions is "it depends."
> However, I believe that O'Reilly owed us an excellent book after their first cluster gaffe, so I'm disappointed that O'Reilly took the easy way out by reorganizing and watering down documentation that is available elsewhere. Even the content in the exemplary Part IV can be found in several other places. It's just a lot less technical and intimidating here.
>
> There are better ways to write a clusters book. I know because I've read several cluster book outlines by members of the cluster intelligentsia that would have been better than this offering. So I'm not going easy on O'Reilly, no matter how good their intentions. The cluster community has a difficult enough time assisting people with clusters without books like this dynamiting the proverbial cluster well. The statement on page 28, "...benchmarking is probably a meaningless activity and waste of time," is just plain wrong and demonstrates a glaring lack of cluster understanding.
>
> If you really want to learn about clusters, pick up a copy of Sterling's "Beowulf Cluster Computing with Linux," 2nd edition, or check out Warewulf, Rocks, OSCAR, openMosix, and ClusterWorld online. You could join a mailing list, like the Beowulf mailing list, and subscribe to ClusterWorld Magazine. This is where the creators and maintainers of all that is clustering hang out, announce, debate, rant, create, lurk, help, and publish. If you want to be part of clustering's future, then you'll check out the community's Cluster Agenda and attend this year's ClusterWorld conference.
>
> =================================================
> Glen Otero received his Ph.D. in Microbiology and Immunology from UCLA in 1995 and immediately escaped to the more temperate climes and better surf in San Diego. After some research on the molecular and cellular biology of HIV and herpes viruses at the Salk Institute for Biological Studies, Glen left the wet lab research bench in 1999.
> Although he left the research bench, he didn't leave science altogether, traveling all the way across the street to the San Diego Supercomputer Center (SDSC) for a stint at the Protein Data Bank. It was while at SDSC that Glen had his Linux clusters and bioinformatics epiphany. Soon after that illuminating event, Glen founded Linux Prophet, a bioinformatics consultancy specializing in the implementation, design, and deployment of Linux Beowulf clusters in the life sciences. Late in 2002, Linux Prophet evolved into Callident, a Linux cluster software and high performance computing company.

Glen Otero Ph.D.
Linux Prophet