CCL: help needed
- From: "Perry E. Metzger"
<perry*_*piermont.com>
- Subject: CCL: help needed
- Date: Fri, 28 Oct 2005 10:11:47 -0400
Sent to CCL by: "Perry E. Metzger" [perry_._piermont.com]
"Alex. A. Granovsky gran ~~ classic.chem.msu.su"
<owner-chemistry^-^ccl.net> writes:
>> Sadly, you are wrong on this. Windows has a number of very serious
>> design flaws, especially in things like the virtual memory management
>> subsystem, which seriously compromise performance. For example, it is
>> difficult to do the sort of "memory for disk I/O" tradeoff that you do
>> on most Unix systems under Windows because the ability to tune buffer
>> cache policy is extremely limited. It is easy to put Windows into
>> situations where it becomes I/O bound even though there are enormous
>> amounts of memory available for caching.
>
> The statements above are no more than lore; they are definitely not
> based on the results of any serious comparison, and only show that
> people usually do not know about and do not make use of advanced
> Windows memory management and I/O features.
Oh, really?
I've personally conducted extensive benchmarking on this specific
topic, and I've read enormous amounts of Microsoft documentation. The
page cache policy in Windows is utterly primitive. As a result of
this, file pages are evicted from cache long before they need to
be.
You can, of course, set the registry key to tell the box to behave as
a file server, at which point executable pages are evicted from cache
long before they need to be. There is no in-between. There are no
pluggable policies. You can't even tune the one policy that is there.
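For concreteness, that registry switch is the LargeSystemCache value
(assuming I recall the path correctly). From a command prompt:

    reg add "HKLM\SYSTEM\CurrentControlSet\Control\Session Manager\Memory Management" /v LargeSystemCache /t REG_DWORD /d 1 /f

A value of 1 favors the file cache over process working sets (the
"file server" profile); 0 favors programs. That one boolean is
essentially the entire tuning surface.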
About a year ago I conducted a demonstration in which a
single-processor machine running NetBSD (not Linux, but the principle
would apply to Linux as well) rebuilt a large software system many
times faster than a four-processor Windows server. The Windows
machine had more memory, and each of its processors was faster than
the NetBSD machine's one. The reason? The Windows machine could not
keep enough pages in memory to keep its four processors running at
100%. The NetBSD box more or less put everything
it needed into memory once and barely touched the disk again, so its
processor hit 100% and stayed pinned there. If the Windows machine had
been able to do this, it would easily have outperformed the NetBSD
machine, but since it could not, most of its four expensive processors
were sitting idle most of the time.
I can easily conduct a demonstration like this for anyone interested
in the New York area, where I'm based. I've done it multiple times
before, and I can happily do it again. To make it fair, I'll let
whoever wants to tune the Windows machine do so any way they like.
This specific issue comes down to this: Windows does not have a true
unified VM subsystem architecture in which the buffer cache is
properly integrated into the virtual memory subsystem. It also does
not have a flexible, self-tuning mechanism for managing the tradeoff
between using memory for executable pages, data pages and file cache
pages. The result of this is that it does lots of I/O when it doesn't
need to, drastically hurting performance.
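To make "properly integrated" concrete: on a Unix with a unified VM,
any program can hand the kernel per-file cache policy hints through
the standard posix_fadvise(2) call (Linux, among others, implements
it; Windows exposes nothing comparable). A minimal sketch:

    /* fadvise-sketch.c: hint the page cache about our access pattern. */
    #define _XOPEN_SOURCE 600   /* for posix_fadvise on glibc */
    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(int argc, char **argv)
    {
        if (argc != 2) {
            fprintf(stderr, "usage: %s file\n", argv[0]);
            return 1;
        }
        int fd = open(argv[1], O_RDONLY);
        if (fd < 0) {
            perror("open");
            return 1;
        }
        /* We will read sequentially: the kernel can read ahead
         * aggressively and recycle pages behind us. */
        posix_fadvise(fd, 0, 0, POSIX_FADV_SEQUENTIAL);

        /* ... process the file ... */

        /* Done with it: drop its pages so they stop crowding out
         * the cache that other work needs. */
        posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED);
        close(fd);
        return 0;
    }

The specific call matters less than the fact that cache policy is
visible and adjustable from userland at all.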
You can argue all you like about how much nicer Windows is. Perhaps it
is. That is subjective. The OBJECTIVE benchmarks, however, show that
it is trivial to drive a Windows box out of page cache and make it
stall.
> If one were really interested in Windows vs. Linux performance,
> it would be a good idea to use PC GAMESS as the benchmark
No, actually, that isn't a particularly good benchmark. Proper
benchmarking requires that you use a variety of tasks that exercise a
variety of operating conditions. A single program can never be a
good benchmark. In a computational chemistry context, a variety of
loads are needed to properly assess the differences between the two
systems.
> It is no problem at all to create an input file which will bring
> the Linux memory & I/O subsystems down.
That actually is false. If you hit conditions in which the default
settings of the Linux algorithms behave badly, then by a simple
tuning process you can alter the tradeoff between executable, data,
and file pages and optimize your performance.
It may be true that if you don't know what you are doing, you can't
make a Linux box perform properly, but at least there are things you
can do if you do know. With Windows, there is little you can do to
tune it properly even when you know what you are doing.
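To be concrete about what you can do: on a 2.6 Linux kernel the VM
policy knobs are exposed under /proc/sys/vm/. A sketch of the sort of
tuning I mean (the values are illustrative, not recommendations):

    # /etc/sysctl.conf -- illustrative values, tune for your workload
    vm.swappiness = 10          # low value: reclaim file cache before swapping anonymous pages
    vm.vfs_cache_pressure = 50  # below 100: hold on to dentry/inode caches longer
    vm.dirty_ratio = 40         # let more dirty pages accumulate before forced writeback

Load the settings with 'sysctl -p' and measure. The particular
numbers matter less than the fact that the tradeoff is exposed at all.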
> The same is true for Windows to some degree. I personally am not
> aware of any good implementation (and doubt one is possible at
> all) of memory management in any OS for the case of simultaneous
> heavy I/O and memory load.
Then you aren't paying close attention to the work that people have
done in the last 20 years in operating system design.
> Nevertheless, Windows has a much more advanced memory and I/O API
> than Linux
Oh, really? Can you explain, then, why Unix systems easily beat
Windows on high-performance network I/O, why there is no way to tune
the Windows page cache, why Windows doesn't have a unified VM model,
why Windows is so much worse at context switches, and so on?
If Windows is so fast, one might ask why the record for the fastest
TCP transmission rates is not held by a machine running Windows, and
why researchers working on network performance rarely do their work
under Windows.
Sure, Cutler borrowed a lot of the VMS design to build NT. If you
assume the VMS of nearly 20 years ago was the best model on earth of
how to do I/O, I'm sure you're a Windows fan. The benchmarks don't agree.
Anyway, the proof is in the pudding. In the real world, benchmarks
don't favor Windows. Claim otherwise if you like -- I'm happy to show
people if they're in my vicinity.
Perry