RE: CASP vs. The Tower of Babel



 Hi!
 So why is there no collaboration at the code level between computational
 scientists?
 I don't want to over-simplify the problem, but one important reason IMHO
 is the fact that traditionally computational scientists have been
 _lousy_ programmers. While the technology to make interoperable
 application components has been there for some time (I'm particularly
 thinking of OO technology), computational scientists have been too
 inexperienced in software engineering techniques to exploit this
 technology.
 As a concrete example: It would be almost trivial to design an OO
 framework for Energy Minimization (or Molecular Dynamics or Threading,
 ...) where the force-field would be just one component that you "plug"
 into the framework. The same is true for such algorithmic components as
 minimization strategies, potentials of mean force, integration
 algorithms, Monte Carlo moves, Genetic Algorithms, etc.
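 To make the "plug-in" idea concrete, here is a minimal sketch in Python.
 All names (ForceField, Minimizer, HarmonicForceField, SteepestDescent)
 are hypothetical and invented for illustration; they are not taken from
 MMTK, TINKER or any other real package, and the harmonic "force field"
 is only a toy stand-in for a real one. The point is just that the
 minimizer sees nothing but an interface, so either component can be
 swapped without touching the other.

```python
from abc import ABC, abstractmethod
from typing import List, Tuple

Coords = List[float]


class ForceField(ABC):
    """Hypothetical interface: anything that yields an energy and its gradient."""

    @abstractmethod
    def energy_and_gradient(self, x: Coords) -> Tuple[float, Coords]:
        ...


class Minimizer(ABC):
    """Hypothetical interface: a minimization strategy over any ForceField."""

    @abstractmethod
    def minimize(self, ff: ForceField, x0: Coords, steps: int) -> Coords:
        ...


class HarmonicForceField(ForceField):
    """Toy force field: E = 0.5 * k * sum((x_i - m_i)^2)."""

    def __init__(self, k: float, minimum: Coords) -> None:
        self.k = k
        self.minimum = list(minimum)

    def energy_and_gradient(self, x: Coords) -> Tuple[float, Coords]:
        diffs = [xi - mi for xi, mi in zip(x, self.minimum)]
        energy = 0.5 * self.k * sum(d * d for d in diffs)
        gradient = [self.k * d for d in diffs]
        return energy, gradient


class SteepestDescent(Minimizer):
    """Fixed-step steepest descent; knows nothing about the force field's internals."""

    def __init__(self, step_size: float = 0.1) -> None:
        self.step_size = step_size

    def minimize(self, ff: ForceField, x0: Coords, steps: int = 100) -> Coords:
        x = list(x0)
        for _ in range(steps):
            _, grad = ff.energy_and_gradient(x)
            # Move each coordinate a fixed fraction down the gradient.
            x = [xi - self.step_size * gi for xi, gi in zip(x, grad)]
        return x


if __name__ == "__main__":
    ff = HarmonicForceField(k=1.0, minimum=[1.0, -2.0])
    print(SteepestDescent(step_size=0.1).minimize(ff, [0.0, 0.0], steps=200))
```

 A different force field or a different strategy (conjugate gradients, a
 Monte Carlo move set, ...) would simply be another subclass handed to
 the same framework code.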
 But is the resulting code efficient? It's certainly a lot slower than
 FORTRAN! (For language reasons and because you lose opportunities for
 source-level optimizations if components have to be truly independent of
 each other.) But I would argue that, firstly, speed of execution does not
 matter at this level (we are not talking about production code here).
 And secondly, of what use is speed of execution if you cannot get the
 functionality you want (interoperability of algorithms, in this case)
 with reasonable effort?
 But things _are_ getting better: MMTK springs to mind, and also the
 OMG effort to standardize interfaces for biological sequence analysis
 (although this is not an academic effort). And of course there are
 programs which are not designed with OO techniques and consequent
 interoperability in mind, but which are simply well written (TINKER, for
 instance).
 (Regarding unwillingness to give away source code at all: With Java you
 can even give away the "object code" (byte code), maybe after running it
 through an obfuscator, and it will work on any Java platform...)
 	cheers,
 	gerald
 |--------------------------------------------------------------------|
  Gerald Loeffler - Bioinformatics Scientist
  Boehringer Ingelheim R&D Vienna
  Email: Gerald.Loeffler ^at^ vienna.at
  Phone: +43 676 3289588 (and +43 1 80105 634)
  Fax:   +43 1 80105 683
  Smail: Bender+Co, Dr. Boehringer-Gasse 5-11, A-1121 Vienna, Austria
 > -----Original Message-----
 > From:	Gabriel Berriz [SMTP:berriz ^at^ potato.harvard.edu]
 > Sent:	Sunday, September 20, 1998 7:53 PM
 > To:	chemistry ^at^ www.ccl.net
 > Subject:	CCL:CASP vs. The Tower of Babel
 >
 >
 >
 > I study the statistical mechanics of protein folding using minimal
 > computer models.  I find that my specialty has a Tower of Babel
 > (Babble?) problem, and perhaps the same is true of the whole field of
 > computational protein studies, or even of all of computational
 > chemistry.  It has to do with checking and building on the results of
 > others.  I was an experimentalist in cellular immunology for a few
 > years before switching to my current field, and I recall that trading
 > reagents, libraries, strains, was quite common in that field.  I often
 > unpacked little vials shipped in dry ice, and bearing some precious
 > mutant; typically, after some quick tests, I was up and running with
 > the new stuff.  Not much was required from the source of the samples
 > (a couple of concentrations, buffers used, maybe some growth
 > conditions here and there...).
 >
 > *Nothing* like this happens in my current field.  I'm not sure why,
 > but I have a few guesses.  For one thing, programs are a pain to port
 > across systems (portability of code is not a criterion for
 > publication).  More important, in most cases I don't want the program
 > just to use as a black box.  On the contrary, my interest in the
 > program is usually in how it implements a model; I want not only to
 > reproduce published results, but also to tweak the conditions, and to
 > extend the experiments.  This invariably requires that I understand
 > enough of the code to hack away in it, and here's where I hit the
 > biggest wall.  It takes me too long to understand the code written by
 > my *labmates*, let alone that written by some unknown graduate student
 > 5 years ago half a world away.  (Again, clarity of code is, for the
 > most part, not a criterion for publication).  So, typically I conclude
 > that either I re-implement the idea from scratch, which is usually
 > something I can't afford, or else I drop the matter altogether.
 > (Incidentally, in the few cases I've tried to get source code from
 > other labs, I've received such unequivocal, resounding, unapologetic
 > refusals, that I must conclude my request was deemed to be bad
 > manners.)
 >
 > It is, in my opinion, a very serious problem; it reduces the field to
 > a collection of largely independent efforts, deprived of one of the
 > greatest strengths of the scientific method, namely, the ability to
 > test and build upon the work of others.  I wonder if others feel
 > similarly.
 >
 > I think this frustrating situation was what ultimately gave rise to
 > the biennial structure prediction competition CASP (Critical
 > Assessment of techniques for protein Structure Prediction), in which
 > participants put their structure prediction programs through the fire
 > test of predicting some recently solved protein structures prior to
 > their publication.  This skips over the problem of understanding the
 > programs and the models devised by others by focusing on "objective
 > results".  Faced with this clear prize, the field has naturally
> responded by adopting an increasingly heuristic attitude: whatever
> works, however ad hoc or poorly understood, throw it in there.  If you
> lose, no one will care, and if you hit the CASP jackpot, then
 > "there's no arguing with success!"
 >
 > Well, I guess that's *one* way to deal with our Tower of Babel
 > problem, but I wonder where this leaves the science...  I'm relatively
 > new to this field, though, and I wonder what others with more
 > experience feel about these issues.
 >
 > Best wishes,
 >
 > Gabriel Berriz
 > Department of Chemistry and Chemical Biology
 > Harvard University
 > berriz ^at^ potato.harvard.edu
 > For best results, replace the word potato by chasma in my address.
 >
 >