This article is based on the latest builds of Windows 2000 Server & Professional RC 1.
Scalability enhancements
Enterprise Memory Architecture
Availability
Blue Screen of Death
In creating Windows 2000, the development team had a number of key
architectural challenges to address:
- Improved scalability
with a requirement to exploit microprocessor advances and Symmetric Multiprocessing
(SMP) technology, exploit large amounts of memory, and exploit high-performance I/O
- Higher availability
with a requirement to allow a higher proportion of routine maintenance tasks to be
completed without down time, provision of network failure detection and automatic
recovery, and improved system monitoring and management
- Greater reliability
with a requirement once again for less configuration and maintenance down time, a
reduction in the number of system crashes or Blue Screen Of Death (BSOD)
events, and improvements in device drivers and tools
- Improved storage management
providing easier management of large amounts of storage, a reduction in down time
when managing storage, support for alternative cost-effective media, and file system
indexing
Scalability enhancements
Starting with scalability enhancements, it is apparent that advances in processor
architecture and high-end server specifications provide a faster and more flexible
platform on which to run. However, today's operating systems will also have more and
more demands placed upon them as a result of increases in the number of processors and
amount of memory that these machines can support. New generations of I/O architecture will
also place additional demands on the OS.
Windows 2000 has been designed from the ground up with these advances in mind, providing
support for higher processor counts and being optimised for SMP. It now includes such
features as a per-processor look-aside cache, improved memory allocation efficiency
(providing a 5 per cent improvement in disk I/O), reduced resource contention, per-process
completion ports (providing a 7 per cent increase in TPC-C throughput), a per-processor
thread pool, and the use of fibers. The latter are lightweight versions of threads, with a
reduced memory cost that will ease porting of software from Unix environments.
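
Fibers are scheduled cooperatively by the application itself rather than pre-emptively by the kernel. The sketch below is only an analogy for that idea, using Python generators instead of the Win32 fiber calls; the names `worker` and `run_round_robin` are made up for illustration and are not part of any Windows API.

```python
from collections import deque


def worker(name, steps):
    """A cooperatively scheduled task: it runs until it chooses to yield."""
    for i in range(steps):
        # Hand control back to the scheduler, reporting what was done.
        yield f"{name}:{i}"


def run_round_robin(tasks):
    """Resume each task in turn until all have finished; return the trace."""
    queue = deque(tasks)
    trace = []
    while queue:
        task = queue.popleft()
        try:
            trace.append(next(task))
            queue.append(task)  # not finished yet, so requeue it
        except StopIteration:
            pass  # task has finished, so drop it
    return trace


trace = run_round_robin([worker("a", 2), worker("b", 3)])
print(trace)  # ['a:0', 'b:0', 'a:1', 'b:1', 'b:2']
```

Each task gives up the processor only at its own `yield`, which is the property that makes this style of scheduling cheap: there is no kernel involvement in the switch.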
CPU and SMP optimisation is also improved by the use of Job Objects that manage groups of
processes as a unit, keeping them separate from other groups. This provides finer control
over processes running on the system and limits possible adverse effects. The file system
cache has been increased from 512MB to 960MB.
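
To make the idea of managing a group of processes as a unit more concrete, here is a toy model of a job with a single memory cap shared by all of its member processes. It is a sketch of the concept only: real job objects are created and configured through Win32 calls such as CreateJobObject and AssignProcessToJobObject, and the `Job` class, its methods and the figures used here are all hypothetical.

```python
class JobLimitExceeded(Exception):
    """Raised when an allocation would push the job past its memory cap."""


class Job:
    """Toy model of a job: a group of processes sharing one memory limit."""

    def __init__(self, name, memory_limit_mb):
        self.name = name
        self.memory_limit_mb = memory_limit_mb
        self.usage_mb = {}  # process id -> memory committed, in MB

    def assign(self, pid):
        """Place a process in the job (a no-op if it is already a member)."""
        self.usage_mb.setdefault(pid, 0)

    def total_mb(self):
        """Memory committed by every process in the job combined."""
        return sum(self.usage_mb.values())

    def allocate(self, pid, mb):
        """Commit memory for one member, enforcing the job-wide limit."""
        if pid not in self.usage_mb:
            raise KeyError(f"process {pid} is not in job {self.name}")
        if self.total_mb() + mb > self.memory_limit_mb:
            raise JobLimitExceeded(f"job {self.name} would exceed its limit")
        self.usage_mb[pid] += mb


job = Job("batch", memory_limit_mb=100)
job.assign(1)
job.assign(2)
job.allocate(1, 60)
job.allocate(2, 30)
try:
    job.allocate(2, 20)  # 60 + 30 + 20 would exceed the 100MB cap
    rejected = False
except JobLimitExceeded:
    rejected = True
print(job.total_mb(), rejected)  # 90 True
```

The point of the model is that the limit applies to the group as a whole, so one runaway process cannot starve processes outside its job.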
Enterprise Memory Architecture
The new Enterprise Memory Architecture (EMA) exploits larger physical memory space by
increasing the maximum addressable memory from 4GB to 64GB on Intel and 32GB on Alpha
platforms. More precisely, on Intel systems the limit is 32GB under Windows 2000 Advanced
Server, and 64GB under Datacentre Server. EMA will work on any Alpha platform, where the
OS will manage the memory, enabling more than one application to use the address space.
Although maximum memory size is greater on Intel systems, it is supported by Xeon
processors only, and the address space is managed by the application, allowing only one
application to use it at a time.
Because we are still effectively working with a 32-bit address space, the additional
memory is divided into multiple 4MB pages handled via the new Page Size Extension feature
on the Xeon processors. This allows Windows 2000 to easily intermix 32-bit and 36-bit
addresses, and reduces the effort needed to develop and support changes in the virtual
memory subsystem. Managing memory above 4GB with large pages is more efficient, providing
better performance, but many will not implement large memory servers until 64-bit Windows
with its more efficient linear address space is finally available. It will
be interesting to see whether Windows 2000 Datacentre Server will ever see the light of
day, given that by the time it appears, 64-bit Windows will be on the horizon.
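
The arithmetic behind the 4GB and 64GB figures is easy to check: a 32-bit address can name 2^32 bytes and a 36-bit address 2^36 bytes. The short sketch below simply works that out; the helper name `addressable_bytes` is illustrative only.

```python
GB = 2 ** 30  # bytes in one gigabyte (binary)


def addressable_bytes(address_bits):
    """Number of distinct byte addresses an address of this width can name."""
    return 2 ** address_bits


flat_32 = addressable_bytes(32)
extended_36 = addressable_bytes(36)
ratio = extended_36 // flat_32

print(flat_32 // GB, extended_36 // GB, ratio)  # 4 64 16
```

In other words, the four extra address bits give sixteen times the physical memory, while each application still sees a 32-bit address space.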
Final tweaks to scalability are in the area of enhanced I/O efficiency. Here, code path
lengths have been reduced to improve the I/O drivers, handle improvements have been made
to allow more handles to be used with reduced contention, context switching has been
reduced in the NTFS file system, and there has been a reduction in contention on spin
locks on SCSI devices.
Availability
Moving on from scalability to availability, Microsoft's big offering in this area
(when it finally appears) is clustering, which it intends to deliver in two phases. The
current phase allows the workload on a pair of servers to fail over automatically (that
is, to transfer from the primary to the secondary server in case of system failure),
thus creating the beginnings of a high availability NT environment.
The second phase, to be delivered with Windows 2000, will extend high availability
clustering by adding support for large, multi-server clusters that share resources and
behave like a single, logical super server. Clients see a cluster as if it were
a single, high-performance, highly reliable server, and services are cluster-wide with the
ability to tolerate component failures. So, should any one server fail, its services are
automatically handled by other members of the cluster. Components (i.e. new servers or
storage devices) can be added to the cluster transparently to users, and existing client
connectivity is not affected by clustered applications. This provides the means to offer
rolling upgrades whilst maintaining continuous service for the user, once again enhancing
the availability of Windows 2000 systems.
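
The failover behaviour described above can be sketched in a few lines. This is a simplified model, not the actual cluster service: the node and service names are invented, and the placement rule (move each orphaned service to the least loaded surviving node) is an assumption made purely for illustration.

```python
def fail_over(assignments, failed, nodes):
    """Return a new service -> node mapping after `failed` leaves the cluster."""
    survivors = [n for n in nodes if n != failed]
    if not survivors:
        raise RuntimeError("no surviving nodes to take over the services")

    # Count how many services each surviving node is already running.
    load = {n: 0 for n in survivors}
    for node in assignments.values():
        if node != failed:
            load[node] += 1

    result = dict(assignments)
    for service in sorted(assignments):
        if assignments[service] == failed:
            # Assumed policy: least loaded survivor, ties broken by name.
            target = min(survivors, key=lambda n: (load[n], n))
            result[service] = target
            load[target] += 1
    return result


nodes = ["node1", "node2", "node3"]
before = {"file": "node1", "print": "node1", "sql": "node2", "web": "node3"}
after = fail_over(before, failed="node1", nodes=nodes)
print(after)
# {'file': 'node2', 'print': 'node3', 'sql': 'node2', 'web': 'node3'}
```

Services that were not running on the failed node stay where they are, which is why existing client connections to them are unaffected.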
The third item in our list of challenges is related closely to the idea of high
availability: greater reliability. What everyone wants from Windows 2000 is fewer
system crashes (the notorious BSOD syndrome) and fewer forced reboots
following minor reconfigurations. Hardware and software configuration and maintenance
should be a whole lot friendlier under Windows 2000 where, it is said, reboots will be
required for only a few major changes. This is a laudable aim, but it has to be said that
I must have hit all of them almost immediately in my testing of Beta 3 and Release
Candidate 1, which does not bode well. In other words, there are still far too many
occasions where a reboot is necessary when it probably should not be. Hopefully, things
will improve further between now and release time.
On the positive side, a reinstall is no longer required when upgrading a server to be a
Domain Controller, which is a huge step forward (though why it should ever have been
necessary in the first place is anyone's guess; probably a hangover from the
bad old LAN Manager days). The number of forced reboots has been reduced by about 50 in
areas such as volume management, configuring network protocols (this is the area where I
still have the occasional problem), settings on PCI and other PnP hardware, and so on. In
theory, the only reboots that will be required going forward are for major events such as
machine name and domain changes, font changes (why?), and Service Pack installs.
Blue Screen of Death
In the past we have all become far more familiar than we would like to be with the
infamous Blue Screen Of Death. The two biggest causes of this ailment are poor driver code
and resource and memory leaks, the eventual BSOD resulting from a serious error detected
by kernel mode code which finds it can do nothing to rectify the problem. Memory leaks
cause vital resources to drain away slowly until performance slows to a crawl or the
system hangs completely. This forces many administrators to perform regular
preventative reboots of the system to restore the missing memory. OK, so it's
not a system crash, but it still results in down time and an interruption to the service
to the end user. Memory leaks have probably done more than anything else to earn NT the
'unreliable' label that it seems to carry - small wonder, then, that attempting
to eradicate the memory leak has been a top priority for this new release.
Most current memory leak problems are almost impossible to identify and even harder to
cure once they have a grip on the system, so new tools will be provided to help identify
and fix leaks as they occur. Prevention is better than cure, of course, and the new job
object allows the imposition of memory limits on a collection of processes. Work is in
hand to tackle the problem of bad drivers too, with improved DDK driver samples and
documentation, enhanced driver testing, driver signing, and the adoption of the new
Windows Driver Model (WDM). Microsoft will also carry out regular testing of major
third-party anti-virus software, another frequent cause of NT problems.
Of course, no one is trying to pretend that there will never be another BSOD. If and when
it does occur (hopefully far less frequently than under NT 4 and previous versions), crash
dumps will be very much quicker than at present, and comprehensive crash dump
analysis tools are being developed to help identify the cause. A Web-based
trouble-shooter will be available for most of the common blue screens, and
application recovery techniques have been improved too.
Hopefully this will have given some insight into the effort that has gone into making
Windows 2000 more scalable, manageable and, above all, reliable. No one is pretending that
Windows 2000 will be without its problems. With almost a complete redesign and rewrite
under the hood, and huge new additions such as Active Directory, it is inevitable
that we will see the odd performance problem, reliability issue, and even the occasional
BSOD in the early days following its release. However, the Windows 2000 architecture
appears to offer a sound platform for the future, with the promise of a faster and more
reliable OS somewhere down the road.