> From jle@world.std.com Tue Dec 10 19:50:02 1991
> Date: Tue, 10 Dec 91 19:52:19 -0500
> From: jle@world.std.com (Joe M Leonard)
> To: chemistry-request@ccl.net
> Subject: Long C vs. Fortran summary

Jan,

Here's a summary of the C vs. Fortran posting I did last week. As you can see, it set a record - 38 replied on the first day!

Joe
jle@world.std.com

---

David Bernholdt (bernhold@qtp.ufl.edu) writes:
All of the new stuff in Tektronix's CAChe software is done in C. The preexisting applications (ZINDO, MOPAC, etc.) are in Fortran as ever. Based on recurring discussion threads in Usenet's comp.lang.fortran, there is no consensus yet among scientific programmers in general that C is much superior and everyone should go off and rewrite everything. Probably more _new_ scientific applications are now being written in C than previously. I have heard arguments for performance tradeoffs in both directions depending on the application.

Martin Norin (martin@link.sunet.se) writes:
To my knowledge, most applications in organic chemistry are using C (carbon). Only a few are using B (boron).

jle: He couldn't resist, and neither could I...

John Duchowski (duchow@ucrac1.ucr.edu) writes (summary):
Folks use C for portability (both among workstations and to PC's) and it sure beats using BASIC. He pointed out that a recent book on laboratory automation was written in BASIC, even though most commercial machines use C.

Bill Ross (ross@zeno.mmwb.uscf.edu) writes:
I have been busy replacing Amber's MDANAL trajectory analysis program with one in C called CARNAL (jle: hey, I like it...). A new AMBER front end w/X graphical interface called Leap is written in C.

Steve Sidner (ssidner@unmc.edu) writes:
Last time I talked to the Tripos people, Sybyl was about 1,500,000 lines of C. It shows. Their science may be debatable sometimes, but the user interface, integration and rapid availability on new platforms sure isn't.
As you may know, there are FORTRAN to C converters for anybody wanting to laterally arabesque. I think it was Joe Sventek who said, "FORTRAN is just portable assembly language" (jle: I have a button that says that C combines the flexibility of assembly language with the power of assembly language).

jle: Many others have reiterated Steve's comment regarding Tripos's use of C.

Berkley Shands (berkley@wubs.wustl.edu) writes:
C is something you swim in :-) You jump in with both feet and tread water forever... We re-wrote Systematic Search (ala SYBYL Search) in C from a FORTRAN backend. It took about two man years, with most of the data structures altered to take the C data structures into account. The nicest thing was dynamic memory allocation. We are now at 33,xxx lines of C, with a MOTIF interface. The program set should soon be available from Tripos (ala SYBYL again). It does ring searches on linear chains and N membered rings, Pharmacophore searches (all in the 1 degree (or floating fractional degree) range) and allows interactive control over the search. I'll grant you that most C compilers are now very smart about optimizing calculations. The SGI compiler fails to do common sub-expression removal or constant folding. The SUN slows down when you use hardware floating point (snicker snicker). Alas, FORTRAN was less portable than C, and didn't do forking or MIMD applications well.

Yvonne Martin (martin@cmda.abbott.com) writes:
We write our new programs in C. The Daylight software is also in C as far as I know.

Mike Peterson (system@alchemy.chem.utoronto.ca) writes:
We have one group and one other postdoc here who do almost all their programming in C - they're looking at chaos in general, and cellular automata are often used as models and they often want "visualization" via X windows. They sometimes call FORTRAN routines though for heavy duty number crunching like numerical integration of differential equations.
One problem for some people is the lack of the COMPLEX datatype (and the library routines that go with it), though this is not a widespread issue.

Jan Radomski (radomski@mond1.ccrc.uga.edu) writes:
We sure will, did and do this. Entirely in C (and C++). Without talking to my boss first, I'm not free to disclose the actual nature of this (as usual, it is meant to be HOT STUFF)!

Bob Goldstein (U09872@uicvm.uic.edu) writes:
You wanted to know about development in C, so I'm putting in my vote. I wrote a simulation/data fitting program for chemical equilibria in C (now available to the public for free) and a program for simulating dynamic light scattering from flexible macromolecules in C. (In both cases the availability of yacc for making an input parser to handle the input files was quite useful. Also the ability to dynamically allocate memory in a portable way allowed me to move these programs easily among machines.) I am now fooling around with some novel energy minimization schemes aimed at proteins in C for similar reasons.

In my experience, existing FORTRAN/chemistry programmers don't like C because of (1) inertia, (2) vectorization, (3) terse syntax, (4) interfacing to existing FORTRAN code. And it is quite common to believe that a second programming language is as hard to learn as the first, although this is usually not true, especially for languages as similar as C and FORTRAN.

jle: two points: first, I'd argue that learning to program is far more difficult than learning a programming language, and second, I think many folks will have heartburn with the comment that C and FORTRAN are similar...

As to interfacing to existing code, there exist FORTRAN-to-C translators, at least one of which is free and accessible over the net from netlib. You can ftp to netlib and get the source for f2c, or just put the line "execute f2c" at the top of a fortran source file and mail it to netlib@research.att.com and the translated C code will be returned.
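
As a rough illustration of the interfacing point above: code translated by f2c is commonly reached from C by appending an underscore to the routine name and passing every argument by address. The sketch below assumes that convention. The routine `scale_` and the wrapper `scale_vector` are hypothetical names, and the routine body is written by hand in C so the example is self-contained; it is not actual f2c output.

```c
#include <stddef.h>

/* Hypothetical stand-in for a translated Fortran routine
 *     SUBROUTINE SCALE(N, A, X)
 * Assumed convention: trailing underscore on the name and every
 * argument passed by address. Real f2c output uses typedefs from
 * f2c.h (integer, doublereal); plain C types are used here so the
 * sketch compiles on its own. */
int scale_(long *n, double *a, double *x)
{
    long i;
    for (i = 0; i < *n; ++i)
        x[i] *= *a;
    return 0;
}

/* C-side wrapper that hides the by-reference convention from callers. */
void scale_vector(size_t n, double a, double *x)
{
    long fn = (long)n;   /* Fortran INTEGER argument, passed by address */
    scale_(&fn, &a, x);
}
```

A caller would then write `scale_vector(3, 2.0, v)` for a `double v[3]` and never see the underscore or the pointers. The exact name mangling and integer width vary between compilers and platforms, which is presumably part of why respondents call the C/FORTRAN interface not very portable.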

Helmut Heller (heller@lisboa.ks.uiuc.edu) writes:
I am currently porting my molecular dynamics program EGO from occam II to C to widen its hardware platform. Originally, it was developed for a "parallel" Transputer machine, but with the advent of more commercial parallel machines (CM-5, Touchstone, Delta, networks of workstations) it seems necessary to convert it to C as occam II is only supported on Transputer systems. Andreas Windemut of our group has written an MD program, called MD, entirely in C. Both programs are in the public domain and are available by anonymous ftp from lisboa.ks.uiuc.edu. For EGO, the Transputer-based program, we also have a guest account which provides access to a 6 Transputer system over the internet. If you would like more information, please let me know.

Clarke Earley (cearly@vax1.umkc.edu) writes:
While I would not consider my contribution in this area an ongoing programming development project, I have written a relatively simple program (ca. 7000 lines) for the display of molecular pictures in C (with a few minor assembly routines). The program is named MOLYROO, and is available from QCPE. This program was developed for use on IBM-compatible PC's using Microsoft C due to the fact that this package included a reasonably complete set of graphics routines that was not available with any of the FORTRAN compilers I was using at the time (This is no longer the case).

I found C to be a very powerful language, and it has a number of features that I found very useful in writing this program, particularly the use of data structures. However, while (because?) I have had no formal training in C, I found programming in this language somewhat more difficult than using FORTRAN, particularly in the DEBUGGING stages of development. I also did not find the language to be as portable as advocates claim.

In attempting to learn the language, I ran a simple benchmark to compare the performance of C vs. FORTRAN in numerical calculations.
I found that C was noticeably slower than FORTRAN (I don't remember the exact numbers but I think it may have been slower by a factor of ca. 1.25-2.0). Similar results were obtained on both a PC using Microsoft C and on a VAX. I would be interested to know the results of other more accurate/representative benchmarks.

Mike Whitbeck (whitbeck@wheeler.wrc.unr.edu) writes:
I use both C and FORTRAN. In the past I even used a little PASCAL and PL1. I am now looking into C++. It has been my experience that C is much more powerful a language than FORTRAN, especially with respect to memory allocation and structured programming. Unfortunately, FORTRAN seems to almost always run MUCH faster. At first this seems odd since undoubtedly the FORTRAN compiler is written in C and does more work (larger object files - strong variable typing...). I believe the speed discrepancy is attributable to the very high level of optimization (for speed) in most FORTRAN compilers. As an example, I have a PD FORTRAN for an ATARI ST that produces executables which run twice as fast as those from a $300 'optimized' C compiler! But the executable file size is 5-10 times larger. Carefully hand-optimizing the C code for speed evens things up.

Translating FORTRAN to C? Yes, I've done that - it mostly ends up a complete rewrite. Some things are much easier to code in C than FORTRAN. I have also used the f2c translator (netlib@research.att.com, simtel20). When using the translator I comment out all the I/O stuff and then hand code it back into the C output, at which time I change all the function names and type declarations to their ANSI C forms, eliminating the need for FORTRAN library calls. By the way, I don't think C is more portable than FORTRAN.

Peter Shenkin (shenkin@avogadro.barnard.columbia.edu) writes:
I do protein molecular modeling, writing my code in C. Also, Jan Hermans recently told me that he now uses C exclusively, and is rewriting his FORTRAN codes in C.
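
Several respondents above single out dynamic memory allocation and data structures as the main attraction of C over FORTRAN 77, where array sizes are fixed at compile time. The following is a minimal sketch of what that looks like for a molecule whose atom count is only known at run time. The `Atom` and `Molecule` types and the function names are hypothetical and are not taken from any of the programs mentioned in this summary.

```c
#include <stdlib.h>

typedef struct {
    char   symbol[3];   /* element symbol, e.g. "C" or "Cl" */
    double x, y, z;     /* Cartesian coordinates */
} Atom;

typedef struct {
    size_t natoms;
    Atom  *atoms;       /* sized at run time, not at compile time */
} Molecule;

/* Allocate a molecule with room for natoms zero-initialized atoms.
 * Returns NULL if memory could not be obtained. */
Molecule *molecule_create(size_t natoms)
{
    Molecule *m = malloc(sizeof *m);
    if (m == NULL)
        return NULL;
    m->atoms = calloc(natoms, sizeof *m->atoms);
    if (m->atoms == NULL && natoms > 0) {
        free(m);
        return NULL;
    }
    m->natoms = natoms;
    return m;
}

/* Release a molecule created by molecule_create; NULL is accepted. */
void molecule_destroy(Molecule *m)
{
    if (m != NULL) {
        free(m->atoms);
        free(m);
    }
}
```

Grouping the coordinates and the element symbol in one `struct` is the kind of thing a FORTRAN 77 program would typically handle with several parallel arrays dimensioned for the largest expected system.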

Richard Macdonald (richard@iris26.biosym.com) writes:
It probably is obvious, but the big commercial chemistry software houses probably use C. We certainly do at Biosym, although we have a lot of FORTRAN code, some that we will rewrite in C and some that we won't.

Gary Kedziora (kedziora@sodium.mps.ohio-state.edu) writes:
I'd be interested in hearing what people say about using C in chemistry applications. I was considering writing a multireference perturbation theory program in C, but decided against it because nobody in our group (Shavitt group) knows C, so maintenance would be tough. Also, I'd like to use a lot of existing FORTRAN routines and the C/FORTRAN interface is not very portable (I'm told).

Ray Cline (rec@sandia.llnl.gov) writes:
My major interest is in the application of parallel processing to computational chemistry (MD, Monte Carlo, QMC, path integrals, etc.). In general, anything that I write from scratch is done in C (and in the future, probably C++). The features of C that I find particularly useful are the availability of STANDARD dynamic memory allocation and bit manipulation routines and the ability to define useful data structures. The data structures are particularly useful in parallel programming when it is necessary to encapsulate data into messages for communication.

There has always been a debate about the speed of FORTRAN vs. other languages. Modern compilers that use common backend optimizers for all languages have made this argument meaningless (jle: is this really true, given the difficulty of recognizing language constructions in FORTRAN vs. C in the frontend?). The added fact that most programming tools are being developed for C and C++, not FORTRAN, makes C program development more efficient.

The major problem that I have is that when other people give me programs, they tend to be in FORTRAN. The tools available for FORTRAN to C conversion (f2c and forc) are not fool-proof.
This is only a minor problem, since I do most of my code development from scratch.

Mike Colvin (colvin@lll-crg.llnl.gov) writes:
The bulk of the massively parallel QC codes at Sandia are written in C. Our main motivation for using C has been portability and the availability of data structures. We've also been programming in C-WEB (C with TeX documentation interwoven with the source code) and are experimenting with C++.

Richard Gililan (reg@chem.ucsd.edu) writes:
All the Wilson group simulation code (mostly molecular dynamics) is written in C. We have contemplated moving to C++ but that would involve more time than we're willing to spend at this point. I've never had any real problem finding the numerical routines we need. Our group relies heavily on Numerical Recipes in C but, on occasion, I have used more exotic routines from elsewhere (netlib, etc.). Links to FORTRAN are trivial.

Matt Clark (matt@tripos.com) writes:
At Tripos, all development is in C for portability, and for the more advanced capabilities such as dynamic memory use and data structures.

Leif Laaksonen (laaksone@convex.csc.fi) writes:
What do you mean by chemistry code? If you mean graphics code then most of the code is written in C or C++ (I believe) (jle: I expect this, having written a BUNCH of graphics codes in FORTRAN for a VAX a while ago - the language's not really set up for that...) but if you refer to the so-called number crunching then most of the code is still in FORTRAN (I believe).

As far as I know there is work going on for rewriting old MD/MM codes into C. I would be happy to see how close good C coding comes to old FORTRAN coding in the sense of computational efficiency. The tests I made on our Cray showed that good vector FORTRAN code is still much faster than my C code. Does anybody know if this is still the case? On normal workstations the C code should be pretty close to FORTRAN code.
I don't know what people think but as an old FORTRAN programmer I now do everything in C and I really would never change back to FORTRAN.

Stephen Daleman (cpsdale@vm.uoguelph.ca) writes (summary):
1. What's taught in the classroom is entirely different from what's being done in the real world.
2. FORTRAN is still used for number crunching due to the strength of the support libraries.
3. C (and PASCAL) provide the data structures to construct large applications.
4. FORTRAN9X might provide tough competition to C (when it appears).
5. If new ideas are to be expressed, old style thinking and paradigms will have to be abandoned.

Doug Allan (allan@helios.tn.cornell.edu) writes:
C, after all, stands for Chemistry... Are you asking for numerical programs being developed in C, or anything like graphics, etc.? (jle: see above, non-graphics). I think programs like DGauss, DMol, deMon, GAUSSIAN90, etc. will continue to be developed in FORTRAN, at least as far as I know, while interfaces to actual machines (windows, frontends, etc.) seem to be written in C (such as Insight of Biosym, I think). My own coding effort remains restricted to FORTRAN, with a few splashes of C where FORTRAN just can't cope.

Probably the first codes to be done in C will be those closest to the machine and with the least numerically intensive demands, i.e. relatively simple molecular mechanics or molecule manipulation programs which produce screen pictures. The existence of an IMSL-type facility in C might push more numerically intensive codes in the direction of C (jle: do we want multi-platform codes or codes tailored to each platform - convenience vs. speed?).

Don Kinghorn (don%kinghorn.wsu.edu@yoda.eecs.wsu.edu) writes:
I'm doing nonadiabatic 4-particle variational calculations using an explicitly correlated basis of gaussian geminals.
I'm coding everything in C: integral formulas, matrix elements, a couple of secular equation solvers, and optimization routines using simulated annealing and conjugate gradients. The problem is essentially a HUGE nonlinear optimization problem with a messy cost function. The main reason I chose C is the development environment I have available, a NeXT workstation (without a FORTRAN compiler). I can do algorithm development and prototyping in Mathematica and Maple, and then code everything in C, all in a very friendly environment. When I have everything written and tested, I'll port it to our 3090 under AIX (jle: ouch!) for large basis-set calculations.

I don't know if there is any advantage in using C for calculations like these other than the convenience factor that I mentioned. In fact, it would have been easier to use Fortran subroutine libraries rather than writing the code myself, but I'm doing this calculation as a master's thesis and I wanted to gain programming experience in C. A place where C might have an advantage over FORTRAN for scientific programming is implementing code for distributed or parallel architectures, the advantage being that C is such a nice systems programming language, especially in a UNIX environment.

Mark Thompson (d3f012@gator.pnl.gov) writes (summary):
I have developed Argus, an electronic structure code written in C: (long list of features, comments and a few publications). The port of Argus to the Intel iPSC/860 parallel computer is about halfway finished.