Storage trends
Fibre channel: the SAN enabler
The advantages of fibre channel
Network attached storage
Clients, servers and back end systems
Who's who and where are we?
Storage area networks (SANs) are not just another buzz
phrase; they are set to become as familiar a part of the IT firmament as LANs and WANs.
They represent a killer coincidence of maturing technology and strategic IT direction, a
combination which is always necessary to convert the latest fad into a major building
block for enterprise IT infrastructures.
Storage trends
The strategic IT direction in this case is the trend towards centralised storage as
organisations struggle to cope with proliferating volumes of data attached to different
platforms and distributed to multiple locations. Attempts to build totally consistent
distributed databases have largely failed, and the prevailing mood is to consolidate core
data centrally and distribute copies of it on demand for more local applications such as
datamarts. Allied to this is the trend towards shared storage, where the main operating
systems such as NT, UNIX, NetWare, IBM AS/400 and IBM System/390 all access a common
repository of data rather than having their own directly attached storage peripherals.
Having just one data repository to look after brings obvious operational cost savings, and
also reduces the load on the host CPUs which no longer have to be involved in data access
on behalf of users and other systems. There is also a qualitative advantage in having all
the data in one place. The idea is that the different systems access shared storage
systems via a SAN rather than existing LANs or campus networks. Then, the back-end
movement of data between storage systems and hosts would be independent of the front-end
delivery to clients. This makes it easier to meet performance targets because both the
storage systems and the network that connects them to the various hosts and servers can
be tuned and scaled independently of the increasingly overloaded LANs.
Fibre channel: the SAN enabler
However, the SAN would not have been possible without a key underlying technological
development: fibre channel. This has been a vital enabler for SANs, according to Donald
Madden, UK Storage Product Manager at Compaq. "Over the last few months, SANs have
really come to the fore as the technologies, particularly fibre channel, fall into
place," said Madden. Fibre channel makes it possible to connect storage systems
together in a flexible network, rather than just in point-to-point channels as has been
done with SCSI. Just as with conventional LANs, fibre channel allows storage systems to be
interconnected either in shared networks where in effect systems contend for a common
channel, or in switched mode to provide dedicated bandwidth between any pair of systems.
It is worth noting here that just as storage systems can be connected by conventional
LANs, so fibre channel can be used for networking servers, hosts and clients together as
an alternative to other high speed protocols like ATM. However, this is only really viable
where there is a need for applications involving transmission of very large batches of
data at high speed. As a result, fibre channel's use for this has been confined to
niche applications with particularly high throughput requirements, such as transmitting
high definition video material within film studios. By the same token, fibre channel is
superior to conventional LAN technology for the SAN, where the requirement is to shift
large blocks of data rapidly between storage systems and hosts or servers.
The advantages of fibre channel
Fibre Channel was conceived originally as the sequel to SCSI for connecting storage
systems to each other and to hosts via direct channels. This leads straight to a
misunderstanding, since current fibre channel systems actually retain the existing SCSI
command protocol for transferring blocks of data, carried on top of fibre channel's own
framing. The difference lies at the physical level, where the loop architecture of fibre
channel replaces the bus-like structure of SCSI. In fact, fibre channel as a transport can
carry a number of higher level protocols apart from SCSI's block transfer method,
including IP and ATM. Replacing
your SCSI cables with a SAN based on the fibre channel physical topology and framing but
retaining SCSI at the data transfer level enables existing storage systems to be connected
without change to a SAN while enjoying the various advantages of fibre channel. These are:
Performance. SCSI has been stretched to increase its per channel throughput in
progressive multiples of two from the original 5 MB/s to 80 MB/s with the latest Ultra-2
Wide SCSI. But this is probably the limit, while standard fibre channel offers 100 MB/s
full duplex, i.e. in both directions, and is likely to be stepped up to 200 MB/s and then
400 MB/s.
Cost effective and simple to configure. FC reduces the number of parallel buses
and expensive controllers needed to support large numbers of disk drives. It also reduces
the cabling complexity for linking multiple controllers, because it replaces numerous
point-to-point parallel buses with a single coherent network.
Flexibility and scalability. FC can handle multiple higher level protocols such as
SCSI simultaneously, just as say an Ethernet LAN can. In fact, as it shares the same
physical infrastructure and signalling as Gigabit Ethernet there will be more scope in
future to build cable networks that can be used for either LANs or SANs. Also, fibre
channel offers greater potential for enhancement than SCSI, not just to higher speeds as
already noted, but also to incorporate new media and additional mappings of higher layer
protocols.
Distance. SCSI channel lengths are severely restricted, to at most 25 metres and
less for the higher speed options, which precludes their use as a SAN technology where
storage systems may be kept in their own machine room or data bunker. Fibre channel goes
much further, to 500 metres over 50-micron fibre cable driven by short-wave lasers, and up
to 10 km over nine-micron single-mode fibre driven by long-wave lasers.
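To put the performance figures above in perspective, the following is a minimal sketch that converts the quoted peak rates into idealised transfer times. The 10,000 MB batch size and the function name are assumptions chosen for illustration, and real sustained throughput would be lower than the nominal rates.

```python
# Nominal peak rates in MB/s as quoted in the text; sustained rates are lower.
RATES_MB_S = {
    "Ultra-2 Wide SCSI": 80,
    "Fibre channel (standard)": 100,
    "Fibre channel (projected)": 200,
}


def transfer_seconds(size_mb: float, rate_mb_s: float) -> float:
    """Idealised time to move size_mb megabytes at a constant rate."""
    if rate_mb_s <= 0:
        raise ValueError("rate must be positive")
    return size_mb / rate_mb_s


if __name__ == "__main__":
    size_mb = 10_000  # hypothetical batch size, not a figure from the article
    for name, rate in RATES_MB_S.items():
        print(f"{name}: {transfer_seconds(size_mb, rate):.0f} s")
```

The calculation ignores protocol overhead and contention, so it only shows the relative gap between the interfaces, not what a given installation would achieve.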
Network attached storage
There has also been a move towards Network Attached Storage (NAS), which is sometimes
confused with the SAN. A SAN is a network, while NAS is actually the name for a black box
storage device attached to a conventional LAN. The idea of NAS is that where you have a
server dedicated just to dishing out files to users, performance can be improved by
replacing it with a machine designed specifically for such a task. It is a return to the
dedicated black box. "When using a server just to serve files and not for other
applications, NAS will be more cost effective," said Jan Wrabetz, General Manager of
the network business group at storage system vendor StorageTek. Wrabetz views NAS as part
of a wider trend away from general purpose servers to application specific machines,
driven by increasing workloads and more server-centric computing as with Hydra. "File
access is just one such application, and that could be extended to a Web serving box, just
serving up Web pages," said Wrabetz.
A point to note about a NAS is that it serves clients, while storage subsystems on a SAN
are accessed by hosts and servers. In this sense they do not appear to have anything in
common other than some of the access technology and of course techniques such as RAID to
optimise performance and protect against hardware failure. However, Wrabetz believes that
the two will become part of a common family, with NAS devices eventually being connected
to conventional LANs at the front, still serving clients directly, but also being
interconnected via a SAN at the back. Then a NAS device would in effect be a gateway
bridging conventional LANs with the back end SAN, enabling clients to bypass servers and
have access to all the storage resources of the SAN via a black box dedicated to the task.
In this context it is natural to regard the SAN as a way of distributing storage devices
while making them appear as a single coherent system to clients and other devices on a
LAN. In fact, only the second half of this statement is true, for the SAN actually
conforms to a trend towards centralised storage, as noted earlier. Or, as Astley Gayle,
Business Development Manager for the world's biggest storage system vendor EMC, put
it, "companies have for many years been deploying client/server architecture in a
distributed fashion and so have been breathing out. We're now seeing the trend for
them to breathe in."
Clients, servers and back end systems
It is also possible to view the SAN as an ingredient of three tier systems comprising
clients running presentation services, servers running applications, and back end systems
managing data. In this model the SAN would interconnect the middle tier of application
servers (including NAS devices) with the third tier of back end storage systems. In this
guise the SAN could interconnect servers and hosts with each other as well as to storage
systems. Indeed, SANs have been called secondary backbones to convey the impression of a
back end network interconnecting servers and storage subsystems, with the primary network
comprising the conventional LANs. In fact, as Gayle admitted, SANs actually act as an
enabler for an IT model where applications are distributed as far as servers at least,
while data is centralised and consolidated. "This trend to consolidate data is being
driven by environments like data warehousing, e-commerce, and enterprise applications like
Oracle, SAP, PeopleSoft, and Baan. The reason is simple: operational costs for
managing/deploying distributed systems where you can easily have 200 servers within an
enterprise are prohibitively high now. You can get economies of scale and the best of both
worlds by still distributing your applications, but centralising your storage via
SANs," said Gayle.
The key here is total cost of ownership (TCO), of which storage is an increasingly big
component, now accounting for 50% of the whole for typical enterprises, according to
Gayle. For storage intensive applications such as data warehousing it can be a lot more
than 50%. "The SAN cuts the contribution of storage to TCO by reducing hardware,
environmental complexity and administration, and also lessening the need to have
experienced staff," said Gayle.
Who's who and where are we?
So, at what stage are SANs and who are the key vendors? SANs can be built now, but are
still at the demonstration stage. One of the largest known SANs, comprising 32 clustered
Sun servers and a back end Oracle database, was recently demonstrated by storage software
vendor Veritas in conjunction with a number of hardware companies. The demonstration
featured a large number of storage and network hardware vendors and embraced all the
components of a SAN: disk and tape based dedicated storage arrays, Unix hosts, NT servers,
and fibre channel switches and hubs. Essentially there are two classes of products: the
storage systems themselves, and the physical fibre channel network interconnecting them.
The latter are the new ingredients and come in two essential variants: hubs and switches.
A SAN can be viewed as providing virtual channels between storage systems, and with hubs
the overall capacity is shared between contending systems. Therefore, the capacity
available will vary according to network loading.
Switches on the other hand provide dedicated bandwidth, so that once a channel path has
been established the full capacity will be available. Several specialist vendors, notably
Ancor Communications, are flourishing in the fast growing market for fibre channel
switches and hubs, but other established network system vendors are showing interest. On
the storage side, four of the major players are EMC, IBM, Compaq and Sun Microsystems.
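The difference between hubs and switches described above can be put in rough numbers. The sketch below assumes the standard 100 MB/s fibre channel rate, perfectly even sharing on a hub, and a non-blocking switch; the function names are hypothetical and arbitration overhead is ignored.

```python
LINK_MB_S = 100.0  # standard fibre channel rate quoted earlier in the article


def hub_share(active_nodes: int, link_mb_s: float = LINK_MB_S) -> float:
    """Average bandwidth per node when all nodes contend for one shared loop.

    Simplification: capacity is divided evenly among the active nodes.
    """
    if active_nodes < 1:
        raise ValueError("need at least one active node")
    return link_mb_s / active_nodes


def switch_share(active_nodes: int, link_mb_s: float = LINK_MB_S) -> float:
    """Bandwidth per established path through a switch.

    Simplification: the switch is assumed non-blocking, so each path keeps
    the full link rate however many other nodes are active.
    """
    if active_nodes < 1:
        raise ValueError("need at least one active node")
    return link_mb_s


if __name__ == "__main__":
    for n in (1, 4, 16):
        print(f"{n:2d} nodes: hub {hub_share(n):6.2f} MB/s, "
              f"switch {switch_share(n):6.2f} MB/s")
```

Under these assumptions the hub figure falls as load rises while the switch figure stays constant, which is the behaviour the text attributes to shared and dedicated bandwidth respectively.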
Until recently, storage systems have always been "owned" by a single server or
host to which they are directly attached. This became less efficient with the growth of
enterprise networks where say a user on an NT LAN might want to access data held on a
storage system attached to an IBM mainframe. In this case, the mainframe would have to
extract the data on behalf of the NT server. This consumes processing cycles that would
otherwise be available to direct users of the mainframe and also clogs up the network. The
idea of shared storage is to have systems holding all data accessed on an equal footing by
all types of host and server, via the SAN. The problem here is that each operating system
has its own way of formatting data, so a shared storage system would have to present
data in the form each host or server expects. As an interim step, shared storage systems
are being partitioned into areas reserved for each operating system; this is shared
storage, but not true shared data. It does however spare the hosts from having to serve data to each other. It
also diverts the transmission of data between storage system and the server that requested
it from the LAN to the SAN. Vendors such as EMC predict that true shared data systems
running on SANs will be with us within a few years.
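The interim partitioning approach can be pictured as a simple ownership table, sketched below. This is only an illustration of the idea, not any vendor's implementation: the partition names, sizes and the `visible_partitions` helper are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Partition:
    name: str       # hypothetical label for an area of the shared array
    owner_os: str   # the one operating system this area is reserved for
    size_gb: int


# Illustrative layout of one shared storage system; values are made up.
ARRAY: List[Partition] = [
    Partition("vol_nt", "NT", 200),
    Partition("vol_unix", "UNIX", 500),
    Partition("vol_s390", "System/390", 800),
]


def visible_partitions(host_os: str, partitions: List[Partition]) -> List[str]:
    """Return the partitions a host may use: only those reserved for its OS."""
    return [p.name for p in partitions if p.owner_os == host_os]


if __name__ == "__main__":
    for os_name in ("NT", "UNIX", "System/390"):
        print(os_name, "->", visible_partitions(os_name, ARRAY))
```

Each host sees only its own area, so the hardware is shared but the data is not. True shared data would require the storage system to present the same data in whatever format each host expects.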
Acronyms
ATM - Asynchronous Transfer Mode
NAS - Network Attached Storage
RAID - Redundant Array of Inexpensive (or Independent) Disks
SAN - Storage Area Network
SCSI - Small Computer System Interface