
 


Features - January 1999 - Time of SANs
Philip Hunter discusses the concept of Storage Area Networks.

Storage area networks (SANs) are not just another buzz phrase; they are set to become as familiar a part of the IT firmament as LANs and WANs. They represent a killer coincidence of maturing technology and strategic IT direction, the combination that is always needed to turn the latest fad into a major building block of enterprise IT infrastructure.

Storage trends


The strategic IT direction in this case is the trend towards centralised storage, as organisations struggle to cope with proliferating volumes of data attached to different platforms and distributed across multiple locations. Attempts to build totally consistent distributed databases have largely failed, and the prevailing mood is to consolidate core data centrally and distribute copies of it on demand for more local applications such as datamarts. Allied to this is the trend towards shared storage, where the main operating systems such as NT, UNIX, NetWare, IBM AS/400 and IBM System/390 all access a common repository of data rather than each having its own directly attached storage peripherals. Having just one data repository to look after brings obvious operational cost savings, and also reduces the load on the host CPUs, which no longer have to be involved in data access on behalf of users and other systems. There is also a qualitative advantage in having all the data in one place.

The idea is that the different systems access shared storage systems via a SAN rather than via existing LANs or campus networks. The back-end movement of data between storage systems and hosts is then independent of the front-end delivery to clients. This makes it easier to meet performance targets, because both the storage systems and the network connecting them to the various hosts and servers can be tuned and scaled independently of the increasingly overloaded LANs.

Fibre channel: the SAN enabler


However, the SAN would not have been possible without a key underlying technological development: fibre channel. This has been a vital enabler for SANs, according to Donald Madden, UK Storage Product Manager at Compaq. "Over the last few months, SANs have really come to the fore as the technologies, particularly fibre channel, fall into place," said Madden. Fibre channel makes it possible to connect storage systems together in a flexible network, rather than just in point-to-point channels as has been done with SCSI. Just as with conventional LANs, fibre channel allows storage systems to be interconnected either in shared networks where in effect systems contend for a common channel, or in switched mode to provide dedicated bandwidth between any pair of systems.

It is worth noting here that just as storage systems can be connected by conventional LANs, so fibre channel can be used for networking servers, hosts and clients together as an alternative to other high speed protocols like ATM. However, this is only really viable where there is a need for applications involving transmission of very large batches of data at high speed. As a result, fibre channel’s use for this has been confined to niche applications with particularly high throughput requirements, such as transmitting high definition video material within film studios. By the same token, fibre channel is superior to conventional LAN technology for the SAN, where the requirement is to shift large blocks of data rapidly between storage systems and hosts or servers.

The advantages of fibre channel

Fibre channel was conceived originally as the sequel to SCSI for connecting storage systems to each other and to hosts via direct channels. This leads straight to a misunderstanding, since current fibre channel systems actually use the existing SCSI protocol at the data link level for transmitting blocks of bytes. The difference lies at the physical level, where the loop architecture of fibre channel replaces the bus-like structure of SCSI. In fact, fibre channel as a physical medium can carry a number of higher level protocols apart from SCSI’s block transfer method, including IP and ATM. Replacing your SCSI cables with a SAN based on the fibre channel physical topology and framing, but retaining SCSI at the data transfer level, enables existing storage systems to be connected to a SAN without change while enjoying the various advantages of fibre channel. These are:

Performance. SCSI has been stretched to increase its per-channel throughput in successive doublings from the original 5 MB/s to 80 MB/s with the latest Ultra-2 Wide SCSI. But this is probably the limit, while standard fibre channel offers 100 MB/s full duplex, i.e. 100 MB/s in each direction, and is likely to be stepped up to 200 MB/s and then 400 MB/s.

Cost effective and simple to configure. FC reduces the number of parallel buses and expensive controllers needed to support large numbers of disk drives. It also reduces the cabling complexity for linking multiple controllers, because it replaces numerous point-to-point parallel buses with a single coherent network.

Flexibility and scalability. FC can handle multiple higher level protocols such as SCSI and IP simultaneously, just as, say, an Ethernet LAN can. In fact, as it shares the same physical infrastructure and signalling as Gigabit Ethernet, there will be more scope in future to build cable networks that can be used for either LANs or SANs. Fibre channel also offers greater potential for enhancement than SCSI, not just to higher speeds as already noted, but also to incorporate new media and additional mappings of higher layer protocols.

Distance. SCSI channel lengths are severely restricted, to at most 25 metres and less for the higher speed options, which precludes its use as a SAN technology where storage systems may be kept in their own machine room or data bunker. Fibre channel goes much further: to 500 metres over 50 micron fibre cable driven by short-wave lasers, and up to 10 km over 9 micron single mode fibre driven by long-wave lasers.
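
The performance and distance figures above can be turned into a rough comparison. The Python sketch below is illustrative only: it uses the throughput and cable-length figures quoted in the text and assumes ideal transfers with no protocol overhead. The 10,000 MB batch size and all function and variable names are hypothetical.

```python
# Per-channel throughput figures quoted in the text, in MB/s.
THROUGHPUT_MB_S = {
    "Ultra-2 Wide SCSI": 80,
    "Fibre channel (standard)": 100,
    "Fibre channel (projected)": 200,
}

# Maximum cable lengths quoted in the text, in metres.
MAX_LENGTH_M = {
    "SCSI": 25,
    "Fibre channel, 50 micron fibre, short-wave laser": 500,
    "Fibre channel, 9 micron single mode fibre, long-wave laser": 10_000,
}


def transfer_seconds(size_mb, rate_mb_s):
    """Ideal time to move size_mb megabytes at rate_mb_s (no overhead assumed)."""
    return size_mb / rate_mb_s


def usable_links(distance_m):
    """Return the interconnects whose quoted maximum length covers distance_m."""
    return [name for name, limit in MAX_LENGTH_M.items() if distance_m <= limit]


# Hypothetical workload: a batch of 10,000 MB (roughly 10 GB).
BATCH_MB = 10_000
times = {name: transfer_seconds(BATCH_MB, rate)
         for name, rate in THROUGHPUT_MB_S.items()}

for name, secs in times.items():
    print(f"{name}: {secs:.0f} s")

# A storage bunker 300 metres from the hosts rules out SCSI cabling.
print(usable_links(300))
```

On these assumptions the batch takes 125 seconds over Ultra-2 Wide SCSI and 100 seconds over standard fibre channel, and only the two fibre channel options can reach a machine room 300 metres away.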

Network attached storage

There has also been a move towards Network Attached Storage (NAS), which is sometimes confused with the SAN. A SAN is a network, while NAS is actually the name for a black box storage device attached to a conventional LAN. The idea of NAS is that where you have a server dedicated just to dishing out files to users, performance can be improved by replacing it with a machine designed specifically for that task. It is a return to the dedicated black box. "When using a server just to serve files and not for other applications, NAS will be more cost effective," said Jan Wrabetz, General Manager of the network business group at storage system vendor StorageTek. Wrabetz views NAS as part of a wider trend away from general purpose servers to application specific machines, driven by increasing workloads and more server-centric computing as with Hydra. "File access is just one such application, and that could be extended to a Web serving box, just serving up Web pages," said Wrabetz.

A point to note about a NAS is that it serves clients, while storage subsystems on a SAN are accessed by hosts and servers. In this sense they do not appear to have anything in common other than some of the access technology and of course techniques such as RAID to optimise performance and protect against hardware failure. However, Wrabetz believes that the two will become part of a common family, with NAS devices eventually being connected to conventional LANs at the front, still serving clients directly, but also being interconnected via a SAN at the back. Then a NAS device would in effect be a gateway bridging conventional LANs with the back end SAN, enabling clients to bypass servers and have access to all the storage resources of the SAN via a black box dedicated to the task. In this context it is natural to regard the SAN as a way of distributing storage devices while making them appear as a single coherent system to clients and other devices on a LAN. In fact, only the second half of this statement is true, for the SAN actually conforms to a trend towards centralised storage, as noted earlier. Or, as Astley Gayle, Business Development Manager for the world’s biggest storage system vendor EMC put it, "companies have for many years been deploying client/server architecture in a distributed fashion and so have been breathing out. We’re now seeing the trend for them to breathe in."

Clients, servers and back end systems


It is also possible to view the SAN as an ingredient of three tier systems comprising clients running presentation services, servers running applications, and back end systems managing data. In this model the SAN would interconnect the middle tier of application servers (including NAS devices) with the third tier of back end storage systems. In this guise the SAN could interconnect servers and hosts with each other as well as with storage systems. Indeed, SANs have been called secondary backbones to convey the impression of a back end network interconnecting servers and storage subsystems, with the primary network comprising the conventional LANs. In fact, as Gayle admitted, SANs actually act as an enabler for an IT model where applications are distributed as far as servers at least, while data is centralised and consolidated. "This trend to consolidate data is being driven by environments like data warehousing, e-commerce, and enterprise applications like Oracle, SAP, PeopleSoft, and Baan. The reason is simple: operational costs for managing/deploying distributed systems where you can easily have 200 servers within an enterprise are prohibitively high now. You can get economies of scale and the best of both worlds by still distributing your applications, but centralising your storage via SANs," said Gayle.

The key here is total cost of ownership (TCO), of which storage is an increasingly big component, now accounting for 50% of the whole for typical enterprises, according to Gayle. For storage intensive applications such as data warehousing it can be a lot more than 50%. "The SAN cuts the contribution of storage to TCO by reducing hardware, environmental complexity and administration, and also lessening the need to have experienced staff," said Gayle.

Who’s who and where are we?


So, at what stage are SANs, and who are the key vendors? SANs can be built now, but are still at the demonstration stage. One of the largest known SANs, comprising 32 clustered Sun servers and a back end Oracle database, was recently demonstrated by storage software vendor Veritas in conjunction with a number of hardware companies. The demonstration featured a large number of storage and network hardware vendors and embraced all the components of a SAN: disk and tape based dedicated storage arrays, Unix hosts, NT servers, and fibre channel switches and hubs. Essentially there are two classes of products: the storage systems themselves, and the physical fibre channel network interconnecting them. The latter is the new ingredient and comes in two essential variants: hubs and switches. A SAN can be viewed as providing virtual channels between storage systems, and with hubs the overall capacity is shared between contending systems, so the capacity available to each will vary according to network loading.

Switches, on the other hand, provide dedicated bandwidth, so that once a channel path has been established the full capacity will be available. Several specialist vendors, notably Ancor Communications, are flourishing in the fast-growing market for fibre channel switches and hubs, but other established network system vendors are showing interest. On the storage side, four of the major players are EMC, IBM, Compaq and Sun Microsystems.
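
The difference between hubs and switches can be sketched numerically. The Python fragment below is a simplified model, not a description of any vendor's product: it assumes the 100 MB/s fibre channel rate quoted earlier, an equal split of a hub's shared channel between active systems, and a switch that gives every established path the full rate. The function names are hypothetical.

```python
LINK_MB_S = 100  # standard fibre channel rate quoted earlier in the article


def hub_share(active_systems, link_mb_s=LINK_MB_S):
    """Idealised bandwidth per system when all contend equally for one shared channel."""
    if active_systems < 1:
        raise ValueError("at least one active system is required")
    return link_mb_s / active_systems


def switch_share(active_systems, link_mb_s=LINK_MB_S):
    """Bandwidth per established path on a switch (dedicated, so load independent)."""
    if active_systems < 1:
        raise ValueError("at least one active system is required")
    return link_mb_s


for n in (1, 2, 4, 8):
    print(f"{n} active systems: hub {hub_share(n):.1f} MB/s each, "
          f"switch {switch_share(n):.1f} MB/s each")
```

With a single active system the two are equivalent; with eight systems contending, the hub model leaves each with 12.5 MB/s while the switch model still offers 100 MB/s per path.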

Until recently, storage systems have always been "owned" by a single server or host to which they are directly attached. This became less efficient with the growth of enterprise networks where, say, a user on an NT LAN might want to access data held on a storage system attached to an IBM mainframe. In this case, the mainframe would have to extract the data on behalf of the NT server. This consumes processing cycles that would otherwise be available to direct users of the mainframe, and also clogs up the network. The idea of shared storage is to have systems holding all data accessed on an equal footing by all types of host and server, via the SAN. The problem here is that each operating system has its own way of formatting data, so a shared storage system would have to present data in the form each host or server expects. As an interim step, shared storage systems are being partitioned into areas reserved for each operating system; this is shared storage, but not true shared data. It does, however, spare the hosts from having to serve data to each other. It also diverts the transmission of data between the storage system and the server that requested it from the LAN to the SAN. Vendors such as EMC predict that true shared data systems running on SANs will be with us within a few years.
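
The interim partitioning scheme can be pictured with a small model. The Python sketch below is purely illustrative and does not reflect any real storage product: the class, method and partition names are all hypothetical. It shows one storage system divided into areas, each reserved for a single operating system, so hosts share the hardware but cannot read each other's data.

```python
class PartitionedStore:
    """Toy model of a shared storage system split into per-OS areas."""

    def __init__(self, partitions):
        # partitions maps a partition name to the operating system that owns it.
        self._owner = dict(partitions)
        self._data = {name: {} for name in self._owner}

    def _check(self, host_os, partition):
        # Each area is reserved for one operating system only.
        if self._owner[partition] != host_os:
            raise PermissionError(
                f"{host_os} host may not access partition {partition!r}")

    def write(self, host_os, partition, key, value):
        self._check(host_os, partition)
        self._data[partition][key] = value

    def read(self, host_os, partition, key):
        self._check(host_os, partition)
        return self._data[partition][key]


store = PartitionedStore({"nt_area": "NT", "unix_area": "UNIX"})
store.write("NT", "nt_area", "report", b"quarterly figures")
print(store.read("NT", "nt_area", "report"))

try:
    store.read("UNIX", "nt_area", "report")
except PermissionError as err:
    print("refused:", err)
```

In this model the UNIX host is refused access to the NT area. True shared data would remove that restriction, which would require the storage system to present the same data in the form each operating system expects.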

Acronyms

ATM - Asynchronous Transfer Mode
NAS - Network Attached Storage
RAID - Redundant Arrays of Inexpensive (or Independent) Disks
SAN - Storage Area Network
SCSI - Small Computer System Interface