CCL:G: Share of experience, software and hardware
- From: Olasunkanmi Lukman Olawale
<walecomuk:+:yahoo.co.uk>
- Subject: CCL:G: Share of experience, software and hardware
- Date: Tue, 10 Dec 2013 12:21:00 +0000 (GMT)
Dear Ibrahim,
This page may help: www.pqs-chem.com.
Best Regards.
Olasunkanmi Lukman Olawale
________________________________
Current Address:
Department of Chemistry,
Obafemi Awolowo University,
Ile-Ife, Osun State.
Nigeria.
+234-0-80-52401564 Or +234-0-80-67161091
________________________________
On Monday, 9 December 2013, 15:55, Daniel Jana dfjana . gmail.com
<owner-chemistry_+_ccl.net> wrote:
Sent to CCL by: Daniel Jana [dfjana_-_gmail.com]
Hello,
Strategy 1 - I see no problem with having multiple users using the
same computer. Of course, physically it's hard... you tend to only
have one keyboard and screen after all. However, once you have the
computers on the network, nothing prevents other users from connecting
(I assume we are talking about Linux workstations) via SSH and
launching calculations. Through a combination of job priority and
appropriate choice of the number of cores, the machine can be used for
regular work (checking journals on the web, reading/writing papers,
the occasional video on youtube because no one works 100% of the time)
while running jobs.
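
As a rough sketch of the "job priority plus number of cores" idea above:
the helper below leaves a couple of cores free for the person sitting at
the machine and wraps the calculation in `nice`. The program name
`my_calculation`, the input file and the choice of two reserved cores
are placeholders, not recommendations.

```python
import os


def background_job_command(program_args, total_cores=None,
                           reserved_cores=2, niceness=19):
    """Return (cores_for_job, command) for a low-priority background job."""
    if total_cores is None:
        total_cores = os.cpu_count() or 1
    # Leave some cores for the desktop user, but always use at least one.
    job_cores = max(1, total_cores - reserved_cores)
    # `nice -n 19` requests the lowest scheduling priority on Linux.
    command = ["nice", "-n", str(niceness)] + list(program_args)
    return job_cores, command


# Hypothetical program and input names, for illustration only.
cores, cmd = background_job_command(["my_calculation", "input.dat"],
                                    total_cores=8)
print(cores, " ".join(cmd))
```

The core count still has to be handed to the program itself (for
Gaussian, for instance, through the %NProcShared line of the input);
`nice` only lowers the priority, it does not limit the cores used.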
The easiest way to manage the software will be having a NFS-shared
partition with all the software installed. This means you only install
the software in one place, rather than locally in every machine.
In this scenario, users typically check for available workstations and
run their jobs directly on the machines. You can always go the
extra-mile and install a scheduler so they can submit jobs to the
queue and it runs on the first available machine. But that may be too
much to learn in the beginning.
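
Before a scheduler is in place, "checking for available workstations"
could be scripted along these lines. This is only a sketch: the host
names and load readings are made up, and it assumes the text of
/proc/loadavg has already been collected from each machine (e.g. with
`ssh <host> cat /proc/loadavg`).

```python
def parse_loadavg(text):
    """Return the 1-minute load average from the contents of /proc/loadavg."""
    return float(text.split()[0])


def least_loaded(loads_by_host, cores_by_host):
    """Pick the host with the lowest load per core."""
    return min(loads_by_host,
               key=lambda host: loads_by_host[host] / cores_by_host[host])


# Hypothetical readings from three workstations.
readings = {
    "ws01": "7.95 7.80 7.10 9/412 12345",
    "ws02": "0.15 0.20 0.18 1/398 2231",
    "ws03": "3.50 3.10 2.90 4/405 8812",
}
cores = {"ws01": 8, "ws02": 8, "ws03": 8}

loads = {host: parse_loadavg(text) for host, text in readings.items()}
print(least_loaded(loads, cores))
```

A real scheduler does much more (queues, reservations, fair share), so
this is only a stopgap for a handful of machines.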
Strategy 2 - Obviously having a cluster is the ideal solution, but I'm
not sure you'll go far with that budget (perhaps if you buy computers
from a regular shop and not rack-ready hardware, making it a bit
cheaper). You will still have a lot of computers in the same room
producing heat. You have to at least consider the possibility that
part of that money will go to buying and installing AC. If you go with
strategy 1 you probably will have the students spread over several
rooms so the problem becomes less obvious. And buying a rack-ready
cluster will also mean buying a rack. With such a small budget, the rack
may end up being a non-negligible part of it.
Concerning Amber/Gaussian... those two codes have different
capabilities when it comes to scaling to many cores. My personal
feeling is Gaussian scales poorly beyond 8 cores and poorly over more
than one machine. Amber, on the other hand, should be more or less
linearly scaling for at least a few hundred cores. This means that if
you plan to have a cluster for Gaussian there's not much of a need for
Infiniband, while for Amber it does make sense (because running it
over Ethernet does impact the performance substantially). Of course,
with a total budget of 40 kUSD, talking about Infiniband is probably a
bit stupid.
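
One way to picture why two codes can scale so differently is Amdahl's
law, speedup = 1 / ((1 - p) + p / n), where p is the fraction of the run
that parallelises and n is the number of cores. The sketch below uses
purely illustrative values of p; they are not measurements of Gaussian
or Amber, and the formula ignores communication cost, which is exactly
where the interconnect (Ethernet vs. Infiniband) comes in.

```python
def amdahl_speedup(parallel_fraction, cores):
    """Ideal speedup on `cores` cores when `parallel_fraction` of the
    work parallelises perfectly and the rest stays serial."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


# Illustrative parallel fractions only, not benchmarks of any real code.
for label, p in [("p = 0.999", 0.999), ("p = 0.90", 0.90)]:
    for n in (8, 16, 64):
        print(f"{label}, {n:3d} cores: speedup {amdahl_speedup(p, n):.1f}")
```

With a sizeable serial part the curve flattens quickly, so adding cores
or nodes beyond that point buys very little.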
I'd say in the beginning strategy 1 makes more sense. You still need
computers for the students anyway... no point having a cluster if
users can't connect to it. You can also slowly start learning the
tools needed to have a cluster (e.g. learning NFS in the beginning to
share the software; later on installing NIS to manage the users
centrally, rather than having to install all users on all
workstations; later on installing a scheduler so that jobs can be
automatically submitted to remote machines). It's true that it's not
trivial to manage one, so taking baby steps is probably the best way
to go at it. When you feel more comfortable with it, perhaps even
having one or two students capable of dealing with all the needed
tools that make a cluster, perhaps you could then think about
acquiring a cluster, potentially with a few other groups so you could
make an investment giving you a cluster with 30 or 40 nodes.
When it comes to software: I would avoid purchasing non-scientific
software as much as possible. Why spend money on an OS, when Linux
(provided you and your students have either the skills or the time to
learn them) costs nothing and is probably the best solution? Once your
students are accustomed to the shell, they can start working on
scripts that make their life easier (e.g. by parsing the output files
and extracting only the relevant bits, rather than having to do it all
by hand). Linux and related tools will cover most of your needs (even
if you go for a cluster, NFS, DHCP, SSH, NIS, are all readily
available and there's plenty of information on how to get them to
work). And if you are considering a cluster, chances are you'll
need to learn Linux anyway. At least for the cluster, you need a
scheduler to manage the jobs of the users (although, as I mentioned
earlier, it may even make sense with the workstations). Lately I've
been inclined to use SLURM. Torque feels a bit abandoned, SGE split
into so many things after Sun got bought by Oracle that I don't even
know which version to install. I could name a few others, but some of
the ones I've tried just felt too bad to be put into production. SLURM
is a young project, it has some quirks, but it seems a good bet for
the near future. Compilers: in the beginning you can certainly work
with GNU compilers (gcc, gfortran, ...), coming with Linux. Most of
the codes you need to compile will work with those. You'll definitely
need to install BLAS and LAPACK. Perhaps they will be available from
the Linux distribution you choose. But it would be best to compile
them locally, for optimal performance. FFTW 2 and 3 will also be
important, but you'll figure that out quickly. However, in the long
run, consider purchasing Intel compilers and MKL. The codes compiled
with those are often faster than those compiled with GNU compilers.
With a limited number of machines, efficiency may be the best upgrade
you can get.
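
The kind of output-parsing script mentioned above could start as small
as this. It is a sketch that assumes the usual "SCF Done:" line of a
Gaussian log file; the sample text and its energies are made up for
illustration.

```python
import re

# Matches lines such as:
#  SCF Done:  E(RB3LYP) =  -76.4089533245     A.U. after   10 cycles
SCF_LINE = re.compile(r"SCF Done:\s+E\((\S+?)\)\s+=\s+(-?\d+\.\d+)")


def scf_energies(log_text):
    """Return every SCF energy (in Hartree) found in the log text, in order."""
    return [float(match.group(2)) for match in SCF_LINE.finditer(log_text)]


# Made-up excerpt standing in for a real log file.
sample_log = """
 SCF Done:  E(RB3LYP) =  -76.4012345678     A.U. after   12 cycles
 some other output
 SCF Done:  E(RB3LYP) =  -76.4089533245     A.U. after   10 cycles
"""

energies = scf_energies(sample_log)
print("steps:", len(energies), "final energy:", energies[-1])
```

For a real file, read it with `open(path).read()` and pass the text to
`scf_energies`; the last value is the energy of the final geometry step.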
As for vendors, I feel I cannot give you a good answer. Certainly the
best vendor in Egypt is not the same as here (read best in whatever
way you want, from cheapest to the one giving the best customer
service).
I hope this helps,
Daniel
PS - Please consider a backup solution. You may go with strategy 1 for
now, but it serves no purpose to have all those computers and risk
losing months of work because a hard drive died. Consider buying a
machine, with several disks and several times the capacity of the
individual computers and automating backups of the workstations. It
can even be the machine where you install the software, to reduce
costs. Bonus points if you manage to have it in a separate location
(e.g. a server room on the other side of the campus). Like this you
avoid losing the backups and the workstations when a fire burns your
lab or when someone steals some computers overnight. It may seem that
you can think about this later, but from personal experience and
anecdotal evidence, people only think about backups when it's too
late, when you already need them.
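
A minimal sketch of automating those backups, assuming rsync over SSH is
available on the backup machine and that the workstations are called
ws01, ws02, ... (hypothetical names, as are the /backup and /home/
paths). The function only builds the commands; running them, for
instance nightly from cron, is left to the reader.

```python
def backup_commands(hosts, backup_root="/backup", source_dir="/home/"):
    """Build one rsync command per workstation, to be run on the backup
    machine. `backup_root` must already exist."""
    commands = []
    for host in hosts:
        destination = f"{backup_root}/{host}/"
        # -a keeps permissions, owners and timestamps;
        # --delete makes the copy mirror deletions on the workstation.
        commands.append(
            ["rsync", "-a", "--delete", f"{host}:{source_dir}", destination]
        )
    return commands


for command in backup_commands(["ws01", "ws02"]):
    print(" ".join(command))
```

Note that a plain mirror also propagates accidental deletions, so
keeping a few older snapshots alongside it is worth considering.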
On 8 December 2013 20:55, Mahmoud A. A. Ibrahim
m.ibrahim[A]compchem.net <owner-chemistry%a%ccl.net> wrote:
>
> Sent to CCL by: "Mahmoud A. A. Ibrahim" [m.ibrahim^compchem.net]
> Dear Colleagues
> We ask you kindly to share your experience with us.
> Nowadays, we are establishing a new computational chemistry lab and aiming to
> purchase some hardware.
> The budget is not high. It is around 40,000$.
> We have two strategies:
> 1- Purchase good workstations with the available budget. The problem is that
> only one user will use the workstation, i.e. we need a workstation per each
> student. If there is any way to make many users to use the same workstation
> at the same time, please share your knowledge with us and let us know.
> 2- Purchase a small HPC which can be upgraded in the near future (just add
> more processors and storage disks). I prefer this strategy which makes us
> able to increase our facilities in the future very easily without getting rid
> of the old ones. But, we don't have professional technicians here at the
> current time, and our colleagues say that it is not easy to manage a small
> HPC to handle your jobs.
> We need your experience and let us know if you were us which one you would
> purchase (workstations or small HPC).
> It would be nice from you if you let us know what all hardware and software
> you need to purchase starting from operating system upto the software
> responsible for handling the jobs and compilers. As well in case of purchase
> HPC/workstations, which company you would recommend.
> For your information, we are aiming to run Gaussian calculations and AMBER
> simulations at the current time.
> Finally, we thank you deeply in advance for your support.
> Sincerely;
> M. Ibrahim
> P.S. we read many posts on CCL regarding the hardware but because of the fast
> growing up of technology we are afraid we missed something around. We do
> apologize for any inconvenience caused.
> --
> Mahmoud A. A. Ibrahim
> Editor, Journal of Organic and Biomolecular Simulations (JOBS), Science
> Publications
> Group Leader, CompChem Lab, Chemistry Department,
> Faculty of Science, Minia University, Minia 61519, Egypt.
> Email: m.ibrahim()compchem.net
> m.ibrahim()mu.edu.eg
> Website: www.compchem.net
>
-= This is automatically added to each message by the mailing script =-
To recover the email address of the author of the message, please change
the strange characters on the top line to the _+_ sign. You can also
E-mail to subscribers: CHEMISTRY_+_ccl.net or use:
http://www.ccl.net/cgi-bin/ccl/send_ccl_message
E-mail to administrators: CHEMISTRY-REQUEST_+_ccl.net or use
http://www.ccl.net/cgi-bin/ccl/send_ccl_message
http://www.ccl.net/chemistry/sub_unsub.shtml
Before posting, check wait time at: http://www.ccl.net
Job: http://www.ccl.net/jobs
Conferences: http://server.ccl.net/chemistry/announcements/conferences/
Search Messages: http://www.ccl.net/chemistry/searchccl/index.shtml
If your mail bounces from CCL with 5.7.1 error, check:
http://www.ccl.net/spammers.txt
RTFI: http://www.ccl.net/chemistry/aboutccl/instructions/