CCL: help needed



 Sent to CCL by: "Perry E. Metzger" [perry_._piermont.com]
 "Alex. A. Granovsky gran ~~ classic.chem.msu.su"
 <owner-chemistry^-^ccl.net> writes:
 >> Sadly, you are wrong on this. Windows has a number of very serious
 >> design flaws, especially in things like the virtual memory management
 >> subsystem, which seriously compromise performance. For example, it is
 >> difficult to do the sort of "memory for disk I/O" tradeoff that you do
 >> on most Unix systems under Windows because the ability to tune buffer
 >> cache policy is extremely limited. It is easy to put Windows into
 >> situations where it becomes I/O bound even though there is enormous
 >> amounts of memory available for caching.
 >
 > The statements above are no more than lore, which are definitely not
 > based on the results of any serious comparison, and only show that
 > people usually do not know and do not make use of advanced Windows
 > memory management and I/O features.
 Oh, really?
 I've personally conducted extensive benchmarking on this specific
 topic, and I've read enormous amounts of Microsoft documentation. The
 page cache policy in Windows is utterly primitive. As a result of
 this, file pages are evicted from cache long before they need to
 be.
 You can, of course, set the registry key to tell the box to behave as
 a file server, at which point, executable pages are evicted from cache
 long before they need to be. There is no in between. There are no
 pluggable policies. You can't even tune the policy that is there.
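 For the record, the "behave as a file server" switch is a single
 registry value, LargeSystemCache; to my knowledge this sketch (check
 the path against Microsoft's documentation before applying it) is
 essentially all the cache-policy control you get:

```
[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Session Manager\Memory Management]
; 1 = favor the file cache over process working sets ("file server" mode);
; 0 = the workstation default, favoring process working sets.
"LargeSystemCache"=dword:00000001
```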
 About a year ago I conducted a demonstration in which a single
 processor machine running NetBSD (not Linux, but the principle would
 apply to Linux as well) successfully rebuilt a large software system
 many times faster than a four processor Windows server. The Windows
 machine had more memory, and each individual processor was faster than
 the NetBSD machine's one processor. The reason? The Windows machine
 was unable to keep enough pages in memory to be able to keep its four
 processors running at 100%. The NetBSD box more or less put everything
 it needed into memory once and barely touched the disk again, so its
 processor hit 100% and stayed pinned there. If the Windows machine had
 been able to do this, it would have easily outperformed the NetBSD
 machine, but since it could not, most of its four expensive processors
 were sitting idle most of the time.
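 Anyone can reproduce the underlying effect on a Unix machine in a few
 lines. A rough sketch (Linux/BSD only; it relies on os.posix_fadvise,
 and the file name and size are arbitrary) that evicts a file from the
 page cache and then compares a cold read against a warm, fully cached
 one:

```python
import os
import time

def timed_read(path):
    t0 = time.perf_counter()
    with open(path, "rb") as f:
        data = f.read()
    return time.perf_counter() - t0, len(data)

path = "cache_demo.bin"
size = 16 * 1024 * 1024  # 16 MB is enough to see the difference

with open(path, "wb") as f:
    f.write(os.urandom(size))
    f.flush()
    os.fsync(f.fileno())  # make the pages clean so they can be dropped

# Ask the kernel to evict this file's pages from the page cache, so
# the next read has to go back to disk.
fd = os.open(path, os.O_RDONLY)
os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)
os.close(fd)

cold, n1 = timed_read(path)  # served (mostly) from disk
warm, n2 = timed_read(path)  # served from the page cache
print(f"cold read: {cold:.4f}s  warm read: {warm:.4f}s")
os.remove(path)
```

 The warm read is typically an order of magnitude faster; a build that
 fits its working set in the cache gets the warm number every time.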
 I can easily conduct a demonstration like this for anyone who likes in
 the New York area, where I'm based. I've done it multiple times
 before, and I can happily do it again. To make it fair, I'll let
 whoever wants to tune the Windows machine do so any way they like.
 This specific issue comes down to this: Windows does not have a true
 unified VM subsystem architecture in which the buffer cache is
 properly integrated into the virtual memory subsystem. It also does
 not have a flexible, self-tuning mechanism for managing the tradeoff
 between using memory for executable pages, data pages and file cache
 pages. The result of this is that it does lots of I/O when it doesn't
 need to, drastically hurting performance.
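 The distinction is concrete, not academic. On a Unix with a unified
 VM, a file mapped with mmap and the same file read with read() share
 the very same page-cache pages, which a short Python sketch (file
 name arbitrary) can demonstrate:

```python
import mmap
import os

path = "unified_demo.bin"
with open(path, "wb") as f:
    f.write(b"A" * 4096)  # one page of filler

with open(path, "r+b") as f:
    m = mmap.mmap(f.fileno(), 4096)  # shared mapping of the file
    m[0:5] = b"HELLO"                # store through the mapping...
    with open(path, "rb") as g:
        first = g.read(5)            # ...and read() sees it at once,
    m.close()                        # no msync(): it is the same page.

os.remove(path)
print(first)
```

 With a separate buffer cache, keeping those two views coherent costs
 extra copies and extra bookkeeping on every I/O.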
 You can argue all you like about how much nicer Windows is. Perhaps it
 is. That is subjective. The OBJECTIVE benchmarks, however, show that
 it is trivial to drive a Windows box out of page cache and make it
 stall.
 > If one would be really interested in Windows vs. Linux performance,
 > it is a good idea to use PC GAMESS as the benchmark
 No, actually, that isn't a particularly good benchmark. Proper
 benchmarking requires that you use a variety of tasks that exercise a
 variety of operating conditions. A single program can never be a
 good benchmark. In a computational chemistry context, a variety of
 loads are needed to properly assess the differences between the two
 systems.
 > It is no problem at all to create an input file which will bring
 > the Linux memory & I/O subsystems down.
 That actually is false. If you hit conditions under which the default
 settings of the Linux algorithms perform poorly, a simple tuning
 process lets you alter the tradeoff between executable, data and file
 pages and recover your performance.
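 The "simple tuning process" is the handful of VM sysctls Linux
 exposes, notably vm.swappiness, vm.vfs_cache_pressure and
 vm.dirty_ratio. Reading them needs no privileges (writing does, e.g.
 sysctl -w vm.swappiness=10); a small sketch:

```python
# The tunables that control the executable/data/file-page tradeoff.
TUNABLES = ["vm/swappiness", "vm/vfs_cache_pressure", "vm/dirty_ratio"]

def read_tunable(name):
    try:
        with open(f"/proc/sys/{name}") as f:
            return int(f.read())
    except OSError:
        return None  # not Linux, or not exposed in this environment

for name in TUNABLES:
    print(name.replace("/", "."), "=", read_tunable(name))
```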
 It may be true that if you don't know what you are doing, you can't
 make a Linux box perform properly, but at least there is stuff you can
 do if you know what you're doing. Even if you know what you are doing,
 there is little you can do to tune Windows properly.
 > The same is true for Windows to some degree. I am personally
 > not aware of any (and doubt if it is possible at all) good
 > implementation of memory management in any OS for the case of
 > simultaneous heavy I/O and memory load.
 Then you aren't paying close attention to the work that people have
 done in the last 20 years in operating system design.
 > Nevertheless, Windows has much more advanced memory and I/O API than
 > Linux
 Oh, really? Can you explain, then, why it is that Unix systems easily
 beat Windows on high performance network I/O, why there is no ability
 to tune the Windows page cache, why Windows doesn't have a unified VM
 model, why Windows is so much worse at context switches, etc?
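 Context-switch cost, for one, is trivial to measure with a pipe
 ping-pong in the style of lmbench's lat_ctx. A rough Python sketch
 (Unix only; interpreter overhead inflates the absolute numbers, so
 treat it strictly as a relative comparison between systems):

```python
import os
import time

def pingpong(rounds=2000):
    # Parent and child bounce a one-byte token over two pipes; every
    # hop forces a context switch between the two processes.
    p2c_r, p2c_w = os.pipe()
    c2p_r, c2p_w = os.pipe()
    pid = os.fork()
    if pid == 0:  # child: echo the token back
        for _ in range(rounds):
            os.read(p2c_r, 1)
            os.write(c2p_w, b"x")
        os._exit(0)
    t0 = time.perf_counter()
    for _ in range(rounds):
        os.write(p2c_w, b"x")
        os.read(c2p_r, 1)
    elapsed = time.perf_counter() - t0
    os.waitpid(pid, 0)
    return elapsed / (2 * rounds)  # seconds per hop (two hops/round)

hop = pingpong()
print(f"~{hop * 1e6:.1f} microseconds per hop")
```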
 If Windows is so fast, one might ask why it is that the record for
 fastest TCP transmission rates is not held by Windows hardware, and
 why researchers on networking performance rarely do their work under
 Windows.
 Sure, Cutler stole a lot of VMS code to build NT. If you assume VMS
 nearly 20 years ago was the best model of how to do I/O on earth, I'm
 sure you're a Windows fan. The benchmarks don't agree.
 Anyway, the proof is in the pudding. In the real world, benchmarks
 don't favor Windows. Claim otherwise if you like -- I'm happy to show
 people if they're in my vicinity.
 Perry