Tuesday, April 14, 2009

iSCSI vs. NAS

There seems to be a continuing discussion about the value of iSCSI vs. NAS (Network Attached Storage). NAS, in general, has two incarnations: NFS (Network File System), seen mostly in UNIX-type systems, and CIFS (Common Internet File System), seen mostly in Microsoft systems.

The discussion seems to center on treating the iSCSI and NAS technologies as if they were interchangeable. It is true that both technologies can be used for reading and writing storage, and it is also true that NAS filers (or storage controllers) can do everything that an iSCSI storage controller can do, plus more. However, they are fundamentally different in structure and, as a result, differ significantly in the hardware processing capabilities (CPU, memory, etc.) required to support them.

The iSCSI structure is based on the SCSI block protocol, which is generated as a result of application file system calls for reads or writes. The NAS (NFS/CIFS) structure is based on special “client-server” protocols, which are also generated as a result of application file system calls.

In the case of NAS, the file system work is not really done in the client system, but via the NAS (NFS/CIFS) protocol, which invokes various functions in the NAS server’s file system. The file system in the NAS server must then convert these file system functions into SCSI block protocol commands that will in turn access the actual storage device. In other words, NAS moves the function of the physical file system from the client into the NAS server appliance. The same physical file system work needs to be done whether it is done in the client or in the NAS appliance.

Years ago (in 1998), when I first got involved with iSCSI, there was a lot of discussion about whether there was even a need for the iSCSI protocol. After all, the argument went, we have NAS (NFS/CIFS), so why does the world need yet another TCP/IP-based storage access protocol? (At that time the name iSCSI had not even been coined; we called it SCSI over (GE) TCP/IP.) So, to try to fully understand the value (if any) of this potentially new protocol, we set out to measure SCSI over Gigabit Ethernet (GE) TCP/IP vs. NFS over (GE) TCP/IP. The results were startling to us at the time, but were key to our decision to continue with the effort to standardize what came to be known as iSCSI.

At that time there was also a lot of talk about offloading the TCP/IP functions onto an adapter card, and about various other TCP/IP optimizations that would be useful for transporting not only SCSI but also NAS protocols. To fully understand the potential, we looked at three different implementations of the TCP/IP part of the equation: normal TCP/IP implementations (which generally used two buffer-to-buffer copies during processing in the host system), versions that had only one buffer copy in the host system, and versions that had zero buffer copies in the host system (data was fetched from, or placed into, the application memory location directly by an adapter). This last approach became known as a TOE (TCP/IP Offload Engine). A graph of the results of the analysis can be seen in the slide shown below.


The results of this analysis showed that iSCSI (SCSI over GE-TCP/IP) transmit would take about 26% of the processing time of NFS, and iSCSI receive would take about 32% of the processing time of NFS. So a rough general statement might be that iSCSI used about one third of the processing power needed by NFS. (That can be seen by comparing the blue columns with the yellow columns.) The analysis also showed that if a TOE approach was used for both iSCSI and NFS (zero copies), the results would be even more dramatic. In that case the iSCSI transmit became about 8% (1/12th) of the processing time of NFS, and the iSCSI receive became about 6% (1/15th) of the processing time of NFS.
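To make the fractions above easier to compare, here is a small sketch that turns the quoted percentages into "NFS needs N times the processing time" ratios. The numbers are only the rounded figures from the paragraph above, not the original measurement data, and the names `RELATIVE_COST` and `nfs_to_iscsi_ratio` are made up for this illustration.

```python
# Relative target-side processing time of iSCSI versus NFS (NFS = 1.0),
# using the rounded percentages quoted in the text, not raw measurements.
RELATIVE_COST = {
    "standard TCP/IP, transmit": 0.26,
    "standard TCP/IP, receive": 0.32,
    "zero-copy (TOE), transmit": 0.08,
    "zero-copy (TOE), receive": 0.06,
}


def nfs_to_iscsi_ratio(relative_cost: float) -> float:
    """Return how many times more processing time NFS needs than iSCSI."""
    return 1.0 / relative_cost


RATIOS = {case: nfs_to_iscsi_ratio(cost) for case, cost in RELATIVE_COST.items()}

if __name__ == "__main__":
    for case, ratio in RATIOS.items():
        print(f"{case}: NFS needs about {ratio:.1f}x the processing time of iSCSI")
```

Because the percentages are rounded, the reciprocals only roughly match the "1/12th" and "1/15th" quoted above; they should be read as ballpark figures.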

(All of these measurements were based on the same processors, NICs, storage, and Gigabit Ethernet links, and the same amount of file data was transferred. It should probably also be noted that we measured the client-side overhead in the same way but could not find significant differences in the processing time of iSCSI clients vs. NFS clients.)

Nowadays the storage controller has usually replaced the target server that was measured above; however, the comparison between the processor needs of iSCSI vs. NAS (NFS/CIFS) is probably still valid.

Now, at first glance one would think that this was a complete repudiation of NAS by iSCSI; however, that is just not correct. We were not really comparing apples to apples here, because NFS provides other capabilities that are not available with iSCSI, namely the ability to share files between different client systems. NAS also permits whole-file management capabilities, which often simplify the management of storage.

We did not take measurements with CIFS, since we felt the point had been made and the additional protocol elements of CIFS would add even more processing time to the equation. However, like NFS, CIFS provides sharing capabilities, and it includes a built-in locking capability to manage dynamic file updates while sharing.

So the general rule seems to be that if you do not have a data sharing requirement between your clients, then iSCSI is probably the most effective approach. But if you do have data sharing requirements, then an iSCSI approach is probably not appropriate, and a NAS protocol (NFS/CIFS) probably is.
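The rule of thumb above can be written down as a tiny sketch. The `Volume` class, its field names, and the example volumes are hypothetical, and a real installation would of course weigh more than this one question.

```python
from dataclasses import dataclass


@dataclass
class Volume:
    """A hypothetical unit of storage that a client needs to access."""
    name: str
    shared_between_clients: bool


def choose_protocol(volume: Volume) -> str:
    """Apply the rule of thumb: shared data goes to NAS, everything else to iSCSI."""
    return "NAS (NFS/CIFS)" if volume.shared_between_clients else "iSCSI"


if __name__ == "__main__":
    # Example volumes are invented purely for illustration.
    for vol in (Volume("team-documents", True), Volume("database-disk", False)):
        print(f"{vol.name}: {choose_protocol(vol)}")
```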

Applying the above to practical situations of scale out, the slide shown below depicts the issues and nets them down.


This means that an installation that has requirements for some file sharing should probably have some NAS servers/controllers. But since the majority of data on a client is not shared, having an iSCSI storage controller probably makes sense as well.

One important point to consider is that when it comes to scaling, one always needs to look for the possible bottlenecks and apply the best approach to reducing the bottleneck effect. Clearly the overhead in a NAS controller is significantly greater than in an iSCSI controller, so the iSCSI controller should be able to scale better than the NAS approach. But since they each provide different capabilities, an installation should use both NAS and iSCSI, where only the shared data goes to the NAS controller.
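As a back-of-the-envelope illustration of why sending only the shared data to NAS helps with scaling, the sketch below estimates controller processing relative to an all-NAS setup. It assumes that processing cost grows linearly with the amount of data moved and that iSCSI costs roughly one third of NFS, per the rough figure from the measurements described earlier. Both the linear model and the function name are assumptions made for this example, not something that was measured.

```python
def relative_controller_cpu(shared_fraction: float, iscsi_cost: float = 1 / 3) -> float:
    """Estimate controller processing relative to serving all data over NAS (= 1.0).

    shared_fraction: portion of the data that must be shared and so goes to NAS.
    iscsi_cost: assumed iSCSI processing cost relative to NAS (rough one-third figure).
    """
    if not 0.0 <= shared_fraction <= 1.0:
        raise ValueError("shared_fraction must be between 0 and 1")
    nas_cost = 1.0  # NAS is the baseline
    return shared_fraction * nas_cost + (1.0 - shared_fraction) * iscsi_cost


if __name__ == "__main__":
    for fraction in (1.0, 0.5, 0.25, 0.0):
        estimate = relative_controller_cpu(fraction)
        print(f"{fraction:.0%} shared: about {estimate:.2f}x the all-NAS processing")
```

Under these assumptions, an installation where a quarter of the data is shared would need roughly half the controller processing of one that serves everything over NAS.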

Since iSCSI and NFS/CIFS are both IP based protocols, the same physical Ethernet connection can be used to carry both protocols. Therefore, some vendors have implemented what I call dual-dialect Storage Controllers. These are Storage Controllers that can accept either iSCSI or NAS (NFS/CIFS) protocols. In this case one can see that it might be possible to balance the low overhead of iSCSI with the functionality of NFS/CIFS protocols.
…………. John L. Hufferd
