From: Olasunkanmi Lukman Olawale <walecomuk|a|yahoo.co.uk>
To: CCL
Subject: CCL:G: Share of experience, software and hardware
Date: Tue, 10 Dec 2013 12:21:00 +0000 (GMT)

Sent to CCL by: Olasunkanmi Lukman Olawale [walecomuk]^[yahoo.co.uk]

Dear Ibrahim,
This page may help: www.pqs-chem.com.
Best Regards.
 
Olasunkanmi Lukman Olawale

Current Address:
Department of Chemistry,
Obafemi Awolowo University,
Ile-Ife, Osun State.
Nigeria.
+234-0-80-52401564 Or +234-0-80-67161091



On Monday, 9 December 2013, 15:55, Daniel Jana dfjana . gmail.com <owner-chemistry_+_ccl.net> wrote:

Sent to CCL by: Daniel Jana [dfjana_-_gmail.com]
Hello,

Strategy 1 - I see no problem with having multiple users using the
same computer. Of course, physically it's hard... you tend to only
have one keyboard and screen after all. However, once you have the
computers on the network, nothing prevents other users from connecting
(I assume we are talking about Linux workstations) via SSH and
launching calculations. Through a combination of job priority and
appropriate choice of the number of cores, the machine can be used for
regular work (checking journals on the web, reading/writing papers,
the occasional video on YouTube because no one works 100% of the time)
while running jobs.
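
As a rough illustration of the "job priority plus number of cores" idea, here is a minimal Python sketch. It assumes a Linux workstation with the standard `nice` utility; the two reserved cores and the `g09 water.com` command line are only placeholders for whatever you actually run.

```python
import os
import shlex


def cores_for_jobs(total_cores: int, reserved_for_desktop: int = 2) -> int:
    """Cores a background job may use, leaving some for interactive work."""
    return max(1, total_cores - reserved_for_desktop)


def low_priority_command(command: list[str], niceness: int = 19) -> list[str]:
    """Prefix a command with `nice` so it yields the CPU to desktop use."""
    if not 0 <= niceness <= 19:
        raise ValueError("unprivileged users can only set niceness 0..19")
    return ["nice", "-n", str(niceness)] + list(command)


if __name__ == "__main__":
    n = cores_for_jobs(os.cpu_count() or 1)
    # Placeholder job: replace with the real program and input file.
    job = low_priority_command(["g09", "water.com"])
    print(f"use at most {n} cores; launch with: {shlex.join(job)}")
```

The number of cores still has to be requested in the input of the program itself; the helper only suggests a value that leaves the person at the keyboard some room.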
The easiest way to manage the software will be having an NFS-shared
partition with all the software installed. This means you only install
the software in one place, rather than locally on every machine.
In this scenario, users typically check for available workstations and
run their jobs directly on the machines. You can always go the
extra mile and install a scheduler so they can submit jobs to a
queue and have them run on the first available machine. But that may be too
much to learn in the beginning.
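
Before a scheduler is in place, "checking for available workstations" can be as simple as comparing load averages. The sketch below assumes you have already collected the 1-minute load of each machine (for instance by reading `/proc/loadavg` over SSH); the host names, loads and the 0.5 load-per-core threshold are made up for the example.

```python
from typing import Optional


def pick_idle_host(load_by_host: dict[str, float],
                   cores_by_host: dict[str, int],
                   max_load_per_core: float = 0.5) -> Optional[str]:
    """Return the host with the lowest load per core, or None if all are busy."""
    best_host = None
    best_ratio = max_load_per_core
    for host, load in load_by_host.items():
        ratio = load / cores_by_host[host]
        if ratio < best_ratio:
            best_host, best_ratio = host, ratio
    return best_host


if __name__ == "__main__":
    # Hypothetical workstations and their 1-minute load averages.
    loads = {"ws01": 7.8, "ws02": 0.4, "ws03": 3.1}
    cores = {"ws01": 8, "ws02": 8, "ws03": 8}
    print(pick_idle_host(loads, cores))
```

This is the job a real scheduler does for you, with queues and fairness on top, so treat it as a stopgap.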

Strategy 2 - Obviously having a cluster is the ideal solution, but I'm
not sure you'll get far with that budget (perhaps if you buy computers
from a regular shop rather than rack-ready hardware, making it a bit
cheaper). You will still have a lot of computers in the same room
producing heat. You have to at least consider the possibility that
part of that money will go to buying and installing AC. If you go with
strategy 1 you will probably have the students spread over several
rooms, so the problem becomes less obvious. And buying a rack-ready
cluster will also mean buying a rack. With such a small budget it may
end up not being a negligible part of your budget.
Concerning Amber/Gaussian... those two codes have different
capabilities when it comes to scaling to many cores. My personal
feeling is Gaussian scales poorly beyond 8 cores and poorly over more
than one machine. Amber, on the other hand, should scale more or less
linearly for at least a few hundred cores. This means that if
you plan to have a cluster for Gaussian there's not much of a need for
Infiniband, while for Amber it does make sense (because running it
over Ethernet does impact the performance substantially). Of course,
with a total budget of 40 kUSD, talking about Infiniband is probably a
bit stupid.
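
One way to get a feeling for why some codes stop benefiting from more cores is Amdahl's law, a textbook simplification brought in here only for illustration. The parallel fractions below are invented, not measurements of Gaussian or Amber, and the model ignores communication cost, which is exactly where the interconnect comes in.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Ideal speedup on `cores` cores when only part of the work is parallel."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


if __name__ == "__main__":
    # Illustrative fractions only; real codes must be benchmarked.
    for label, fraction in (("90% parallel", 0.90), ("99.9% parallel", 0.999)):
        for cores in (8, 64, 256):
            speedup = amdahl_speedup(fraction, cores)
            print(f"{label:>15} on {cores:>3} cores: {speedup:6.1f}x")
```

Benchmarking your own typical jobs on one machine at 1, 2, 4 and 8 cores will tell you more than any model.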

I'd say in the beginning strategy 1 makes more sense. You still need
computers for the students anyway... no point having a cluster if
users can't connect to it. You can also start slowly learning the
tools needed to run a cluster (e.g. learning NFS in the beginning to
share the software; later on installing NIS to manage the users
centrally, rather than having to create all users on all
workstations; later on installing a scheduler so that jobs can be
automatically submitted to remote machines). It's true that it's not
trivial to manage a cluster, so taking baby steps is probably the best way
to go about it. When you feel more comfortable with it, perhaps even
having one or two students capable of dealing with all the tools
a cluster needs, you could then think about
acquiring a cluster, potentially with a few other groups so you could
make an investment giving you a cluster with 30 or 40 nodes.
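
To give an idea of what the scheduler step looks like in practice, here is a small sketch that writes a batch script for SLURM (one scheduler among several). The `#SBATCH` options are standard ones; the job name and the `g09 < water.com > water.log` command line are hypothetical examples.

```python
def slurm_script(job_name: str, command: str, cpus: int = 8, hours: int = 24) -> str:
    """Build the text of a single-node SLURM batch script."""
    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={job_name}",
        "#SBATCH --nodes=1",
        "#SBATCH --ntasks=1",
        f"#SBATCH --cpus-per-task={cpus}",
        f"#SBATCH --time={hours:02d}:00:00",
        f"#SBATCH --output={job_name}.%j.log",  # %j expands to the job id
        "",
        command,
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical Gaussian job; adapt the command to your installation.
    print(slurm_script("water", "g09 < water.com > water.log", cpus=8, hours=12))
```

The resulting text would be saved to a file and handed to `sbatch`, after which the scheduler picks the machine.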

When it comes to software: I would avoid purchasing
non-scientific software as much as possible. Why spend money on an OS, when Linux
(provided you and your students have either the skills or the time to
learn it) costs nothing and is probably the best solution? Once your
students are accustomed to the shell, they can start working on
scripts that make their life easier (e.g. by parsing the output files
and extracting only the relevant bits, rather than having to do it all
by hand). Linux and related tools will cover most of your needs (even
if you go for a cluster, NFS, DHCP, SSH, NIS, are all readily
available and there's plenty of information on how to get them to
work). And if you are considering a cluster, chances are you'll
need to learn Linux anyway. At least for the cluster, you need a
scheduler to manage the jobs of the users (although, as I mentioned
earlier, it may even make sense with the workstations). Lately I've
been inclined to use SLURM. Torque feels a bit abandoned, SGE split
into so many things after Sun got bought by Oracle that I don't even
know which version to install. I could name a few others but some of
the ones I've tried just felt too bad to be put into production. SLURM
is a young project, it has some quirks, but it seems a good bet for
the near future. Compilers: in the beginning you can certainly work
with the GNU compilers (gcc, gfortran, ...) that come with Linux. Most of
the codes you need to compile will work with those. You'll definitely
need to install BLAS and LAPACK. Perhaps they will be available from
the Linux distribution you choose. But it would be best to compile
them locally, for optimal performance. FFTW 2 and 3 will also be
important, but you'll figure that out quickly. However, in the long
run, consider purchasing the Intel compilers and MKL. The codes compiled
with those are often faster than those compiled with GNU compilers.
With a limited number of machines, efficiency may be the best upgrade
you can get.
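
As an example of the kind of output-parsing script mentioned above, the sketch below pulls the SCF energies out of a Gaussian log, relying on the usual "SCF Done:" lines. The sample text and its energies are invented for the example; a real script would read the log file instead.

```python
import re

# Matches lines such as " SCF Done:  E(RHF) =  -75.98  A.U. after 10 cycles".
SCF_DONE = re.compile(
    r"SCF Done:\s+E\((?P<method>[^)]+)\)\s*=\s*(?P<energy>-?\d+\.\d+)"
)


def scf_energies(text: str) -> list[tuple[str, float]]:
    """Return (method, energy in hartree) for every SCF Done line, in order."""
    return [(m.group("method"), float(m.group("energy")))
            for m in SCF_DONE.finditer(text)]


# Made-up excerpt standing in for a real log file.
SAMPLE = """\
 SCF Done:  E(RHF) =  -75.9821000000     A.U. after   10 cycles
 (other output)
 SCF Done:  E(RHF) =  -75.9854000000     A.U. after    8 cycles
"""

if __name__ == "__main__":
    for method, energy in scf_energies(SAMPLE):
        print(f"{method}: {energy:.10f}")
```

For a geometry optimization the last entry is normally the one of interest, so `scf_energies(text)[-1]` is often all a student needs.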

As for vendors, I feel I cannot give you a good answer. Certainly the
best vendor in Egypt is not the same as here (read best in whatever
way you want, from cheapest to the one giving the best customer
service).

I hope this helps,
Daniel

PS - Please consider a backup solution. You may go with strategy 1 for
now, but it serves no purpose to have all those computers and risk
losing months of work because a hard drive died. Consider buying a
machine with several disks and several times the capacity of the
individual computers, and automating backups of the workstations. It
can even be the machine where you install the software, to reduce
costs. Bonus points if you manage to have it in a separate location
(e.g. a server room on the other side of the campus). This way you
avoid losing both the backups and the workstations when a fire burns your
lab or when someone steals some computers overnight. It may seem that
you can think about this later, but from personal experience and
anecdotal evidence, people only think about backups when it's too
late, when they already need them.
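
Automating the backups can start very small. The sketch below only builds the `rsync` command that a nightly job on the backup machine could run for each workstation; the host name and paths are hypothetical, and it assumes password-less SSH from the backup machine to the workstations.

```python
def rsync_backup_command(host: str, source_dir: str, backup_root: str) -> list[str]:
    """Build an rsync command mirroring host:source_dir into backup_root/host/."""
    source = f"{host}:{source_dir.rstrip('/')}/"
    destination = f"{backup_root.rstrip('/')}/{host}/"
    # -a keeps permissions and timestamps; --delete makes the copy a mirror.
    return ["rsync", "-a", "--delete", source, destination]


if __name__ == "__main__":
    # Hypothetical workstations; run the printed commands from cron.
    for workstation in ("ws01", "ws02", "ws03"):
        print(" ".join(rsync_backup_command(workstation, "/home", "/backup")))
```

A plain mirror like this protects against a dead disk but not against a file deleted by mistake the day before, so keeping dated snapshots is a natural next step.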

On 8 December 2013 20:55, Mahmoud A. A. Ibrahim
m.ibrahim[A]compchem.net <owner-chemistry%a%ccl.net> wrote:
>
> Sent to CCL by: "Mahmoud A. A. Ibrahim" [m.ibrahim^compchem.net]
> Dear Colleagues
> We ask you kindly to share your experience with us.
> Nowadays, we are establishing a new computational chemistry lab and aiming to
> purchase some hardware.
> The budget is not high. It is around 40,000$.
> We have two strategies:
> 1- Purchase good workstations with the available budget. The problem is that
> only one user will use the workstation, i.e. we need a workstation per each
> student. If there is any way to make many users use the same workstation
> at the same time, please share your knowledge with us and let us know.
> 2- Purchase a small HPC which can be upgraded in the near future (just add
> more processors and storage disks). I prefer this strategy which makes us
> able to increase our facilities in the future very easily without getting rid
> of the old ones. But, we don't have professional technicians here at the
> current time, and our colleagues say that it is not easy to manage a small
> HPC to handle your jobs.
> We need your experience and let us know if you were us which one you would
> purchase (workstations or small HPC).
> It would be nice of you if you let us know what all hardware and software
> you need to purchase starting from operating system up to the software
> responsible for handling the jobs and compilers. As well in case of purchase
> HPC/workstations, which company you would recommend.
> For your information, we are aiming to run Gaussian calculations and AMBER
> simulations at the current time.
> Finally, we thank you deeply in advance for your support.
> Sincerely;
> M. Ibrahim
> P.S. we read many posts on CCL regarding the hardware but because of the fast
> growing up of technology we are afraid we missed something around. We do
> apologize for any inconvenience caused.
> --
> Mahmoud A. A. Ibrahim
> Editor, Journal of Organic and Biomolecular Simulations (JOBS), Science
> Publications
> Group Leader, CompChem Lab, Chemistry Department,
> Faculty of Science, Minia University, Minia 61519, Egypt.
> Email: m.ibrahim()compchem.net
>        m.ibrahim()mu.edu.eg
> Website: www.compchem.net
>

