Core Storage Fabric
About our SAN
OIT's Syscore group began deploying its first fibre channel SAN in the spring of 2005 to replace aging (and failing) direct-attached SCSI JBODs that stored AFS volume data. The goal was a modular, redundant SAN that takes advantage of the FC protocol's own routing and fail-over mechanisms. The SAN also lets us take advantage of our two data centers, which are housed in separate buildings on campus, Engineering (ENG) and Public Policy (PUP): we extended the SAN between them by running fiber optics through the utility tunnels that connect the two buildings.
We have several policies regarding the build-out and use of our SAN to ensure that its original design goals are retained:
- Switches, where feasible, are placed in a "stacked" pair which offers two local chassis for hosts physically near the stack to connect to.
- All devices, where feasible, have at least 2 connections to the SAN, each connection to a separate switch, with the hosts' OS providing multipath capabilities. This ensures continued connectivity to the SAN in the event of a single switch failure.
- When a new disk array is required, two are purchased. One array is installed in the ENG data center and the other in the PUP data center. The two arrays are then configured identically in terms of volumes and LUNs. Hosts using those LUNs use their OS's software mirroring capabilities to mirror between the two LUNs (and therefore the arrays), which provides real-time mirroring between our two physically distant data centers.
Please refer to Fig. 1 to see a map of our SAN as it currently is.
As of March 2007, our SAN hosts 60TB of disk storage space, and up to 55TB of tape storage. Approximately 14 initiators are connected to the SAN.
Switching Infrastructure
Our current switching infrastructure comprises eight QLogic SANbox 5200 switches. These eight switches are assembled into four stacks of two switches each, with two 10Gb/s connections between the two switches in a stack. Communication between stacks is via two or more 2Gb/s ISLs. Currently, we have three switch stacks installed in ENG012C, and one switch stack installed in the Public Policy Building (PUP) data center (see above diagram).
Management of these switches is performed via QLogic's SANsurfer Java-based application.
Storage Units
Apple Xserve RAID
We currently have two Apple Xserve RAID (XRAID) 14x400GB units attached to our fabric. Hosts that use these arrays use software mirroring to replicate their data between them. The Apple XRAID units are darned affordable, but their configuration isn't very dynamic. As backend storage for 'building block' applications such as AFS fileservers, however, they're perfect.
One caveat when dealing with them: when you change anything in the configuration, they like to restart their controllers. This means that there's a possibility that some I/O may be lost. It's best to remove the mirrors located on an XRAID from production before dorking around, and then add them back in afterwards. It only takes a couple of hours to resync, but it could take days to recover from data loss.
They are both configured as two RAID5 sets, containing 7 drives each; each RAID5 set is split into 4 500GB LUNs.
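The capacity arithmetic behind that layout can be checked with a short sketch. It assumes nominal decimal gigabytes and ignores any formatting or controller overhead, so the headroom figure is only indicative; the function and constant names are made up for illustration.

```python
def raid5_usable_gb(drives: int, drive_gb: int) -> int:
    """Nominal usable capacity of one RAID5 set.

    RAID5 spends the equivalent of one drive on parity, so the usable
    space is (drives - 1) * drive size.
    """
    if drives < 3:
        raise ValueError("RAID5 needs at least three drives")
    return (drives - 1) * drive_gb


# Figures from this page: 7 x 400GB drives per set, 4 x 500GB LUNs per set.
DRIVES_PER_SET = 7
DRIVE_GB = 400
LUNS_PER_SET = 4
LUN_GB = 500

usable = raid5_usable_gb(DRIVES_PER_SET, DRIVE_GB)  # nominal GB per RAID5 set
allocated = LUNS_PER_SET * LUN_GB                   # GB carved into LUNs
headroom = usable - allocated                       # nominal GB left over

print(f"usable={usable}GB allocated={allocated}GB headroom={headroom}GB")
```

With these nominal numbers the four LUNs fit inside each RAID5 set with some room to spare; the real margin is smaller once drive capacities are expressed in binary units.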
The current LUN layout (identical on both units) is as follows:
| Controller/LUN | User |
| c0/0 | hfs10 |
| c0/1 | hfs11 |
| c0/2 | hfs1 |
| c0/3 | bfs1 |
| c1/0 | hfs12 |
| c1/1 | hfs2 |
| c1/2 | bfs1 |
| c1/3 | bfs1 |
Solaris Multipathing (MPxIO) Configuration for the Xserve RAID
When a Solaris host has multiple fibre channel paths to an Apple Xserve RAID and you are using the stock Solaris qlc driver, you can enable fail-over and load-shared multipathing through the scsi_vhci driver. To enable this, add the following lines to /kernel/drv/scsi_vhci.conf:
device-type-scsi-options-list = "APPLE   Xserve RAID", "symmetric-option";
symmetric-option = 0x1000000;
Be sure to keep the 3 space characters between "APPLE" and "Xserve". Please note that as of this writing, the HBA drivers from QLogic are not supported by MPxIO, so use the qlc driver that comes with Solaris.
No editing of /kernel/drv/scsi_vhci.conf is necessary when adding multipathed Sun StorEdge-branded arrays.
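The three spaces are there because the SCSI INQUIRY vendor ID field is 8 characters wide and space-padded, so "APPLE" (5 characters) is followed by 3 spaces before the product ID. The following is a minimal sketch that builds the entry from the two parts so the padding cannot be lost by hand; the helper names `inquiry_id` and `vhci_entry` are hypothetical, and the output should be compared against the lines shown on this page before use.

```python
def inquiry_id(vendor: str, product: str) -> str:
    """Join vendor and product IDs into the string scsi_vhci.conf matches on.

    The SCSI INQUIRY vendor ID field is 8 bytes wide and space-padded,
    so a shorter vendor name is padded out to 8 characters.
    """
    if len(vendor) > 8:
        raise ValueError("vendor ID is longer than 8 characters")
    return vendor.ljust(8) + product


def vhci_entry(vendor: str, product: str,
               option_name: str = "symmetric-option",
               option_value: int = 0x1000000) -> str:
    """Render the two scsi_vhci.conf lines for one array type."""
    ident = inquiry_id(vendor, product)
    return (
        f'device-type-scsi-options-list = "{ident}", "{option_name}";\n'
        f"{option_name} = 0x{option_value:x};\n"
    )


entry = vhci_entry("APPLE", "Xserve RAID")
print(entry, end="")
```

The sketch only produces text; appending it to /kernel/drv/scsi_vhci.conf is left as a manual step.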
On Solaris 10 x86, MPxIO is enabled by default. On SPARC, however, you have to turn it on. One (of several) ways to do this is to edit /kernel/drv/fp.conf, set the mpxio-disable variable to "no", and reboot. This will enable MPxIO on all fp(7D) fibre channel ports on the system. Be sure to reboot with a reconfigure, and after the system comes up, run devfsadm -C -v to clean up the /dev/(r)dsk/ symlinks that are then invalid.
Note that as of Solaris 10 11/06 (aka Update 3) there is no longer a requirement to add the above entry to /kernel/drv/scsi_vhci.conf, as the Xserve RAID appears to have been popular enough that Sun added the smarts to MPxIO to autodetect it.
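The fp.conf edit described above can be sketched as a small text transformation. This assumes the global setting appears on its own line in the form mpxio-disable="yes"; and deliberately leaves commented-out lines and any per-port entries alone; the function name and the sample fragment are hypothetical.

```python
import re

# Matches an uncommented, line-leading global setting such as:
#   mpxio-disable="yes";
_MPXIO_LINE = re.compile(
    r'^([ \t]*mpxio-disable[ \t]*=[ \t]*)"(?:yes|no)"([ \t]*;.*)$',
    re.MULTILINE,
)


def enable_mpxio(conf_text: str) -> str:
    """Return conf_text with the global mpxio-disable property set to "no"."""
    new_text, count = _MPXIO_LINE.subn(r'\1"no"\2', conf_text)
    if count == 0:
        raise ValueError("no global mpxio-disable line found")
    return new_text


# Hypothetical fragment standing in for /kernel/drv/fp.conf
sample = (
    "# sample fragment, not a real fp.conf\n"
    'mpxio-disable="yes";\n'
)
updated = enable_mpxio(sample)
print(updated, end="")
```

The function works on a string rather than on the file itself, so it can be tried safely; writing the result back, the reconfigure reboot, and the devfsadm -C -v cleanup remain manual steps.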
Sun StorageTek 3511
Each of our two Sun StorageTek 3511s comprises a dual-controller (1GB cache each) 3511 head with 12x250GB SATA drives and an attached 3511 JBOD which holds 12x500GB SATA drives, for a total of 9TB per 3511.
These two 3511s used to provide storage services for our critical non-AFS applications such as MySQL, Oracle, Calendar, LDAP and so on, but in February of 2007 we migrated storage duties for those items to our newer and better-performing StorageTek 6140 arrays.
The 3511s will be reused exclusively for our new mail servers when we migrate our email storage infrastructure off of AFS to Cyrus.
Sun StorageTek 6140
In January 2007, Syscore purchased two dual-controller Sun StorageTek 6140 arrays of 8TB each (16x500GB drives) to take on the growing demands of our non-AFS storage requirements. These arrays have taken over all storage responsibilities which were served by our StorageTek 3511s.
Hosts on the Fabric
The following is a catalog of all the hosts (targets and initiators only) connected to the Core Storage Fabric switches.
| Hostname | Host Type | WWN(s) | Switch/Port | Type/Speed |
| bfs1.afs | Sun V20z | Port0: 21:00:00:e0:8b:81:ac:87 Port1: 21:01:00:e0:8b:a1:ac:87 | ff1-sw2-p8 ff1-sw3-p8 | SW Optical/2Gb |
| hfs1.afs | Sun V20z | Port0: 21:00:00:e0:8b:81:3c:87 Port1: 21:01:00:e0:8b:a1:3c:87 | ff1-sw2-p11 ff1-sw3-p7 | SW Optical/2Gb |
| hfs2.afs | Sun V20z | Port0: 21:00:00:e0:8b:81:e3:86 Port1: 21:01:00:e0:8b:a1:e3:86 | ff1-sw1-p6 ff1-sw4-p6 | SW Optical/2Gb |
| hfs10.afs | Sun V20z | Port0: 21:00:00:e0:8b:1d:c1:b2 Port1: 21:01:00:e0:8b:3d:c1:b2 | ff1-sw2-p5 ff1-sw3-p5 | SW Optical/2Gb |
| hfs11.afs | Sun V20z | Port0: 21:00:00:e0:8b:1d:92:b1 Port1: 21:01:00:e0:8b:3d:92:b1 | ff1-sw1-p6 ff1-sw4-p6 | SW Optical/2Gb |
| hfs12.afs | Sun V20z | Port0: 21:00:00:e0:8b:1d:c2:af Port1: 21:01:00:e0:8b:3d:c2:af | ff1-sw2-p6 ff1-sw3-p6 | SW Optical/2Gb |
| bb-app7 | Dell 2850 | Port0: 20:00:00:06:2b:0a:dc:fc Port1: 20:00:00:06:2b:0a:dc:fd | ff1-sw7-p4 ff1-sw7-p5 | Copper/2Gb |
| bb-db7 | Dell 2850 | Port0: 20:00:00:06:2b:0a:de:30 Port1: 20:00:00:06:2b:0a:de:31 | ff1-sw7-p6 ff1-sw7-p7 | Copper/2Gb |
| bb-app3 | Dell 2650 | Port0: 20:00:00:06:2b:0a:d6:7c Port1: 20:00:00:06:2b:0a:d6:7d | ff1-sw7-p8 ff1-sw7-p9 | Copper/2Gb |
| bb-db3 | Dell 2650 | Port0: ? Port1: ? | ff1-sw7-p10 ff1-sw7-p11 | Copper/2Gb |
| grimm | Sun V210 | Port0: 21:00:00:e0:8b:1d:be:b0 Port1: 21:01:00:e0:8b:3d:be:b0 | ff1-sw2-p10 ff1-sw3-p10 | Optical/2Gb |
| ff1-raid1-top.cmgmt ff1-raid1-bot.cmgmt | Apple Xserve RAID | c0: 60:00:39:30:00:01:04:31 c1: 60:00:39:30:00:01:04:a8 | ff1-sw5-p4 ff1-sw6-p4 | Copper/2Gb |
| ff1-raid2-top.cmgmt ff1-raid2-bot.cmgmt | Apple Xserve RAID | c0: 60:00:39:30:00:01:03:4a c1: 60:00:39:30:00:01:05:7c | ff1-sw2-p4 ff1-sw3-p4 | Copper/2Gb |
| ff1-raid3-c0.cmgmt | Sun StorEdge 3511FC | Port 0: 21:60:00:c0:ff:89:25:53 | ff1-sw5-p9 | Copper/2Gb |
| ff1-raid4-c0.cmgmt | Sun StorEdge 3511FC | Port 0: 21:60:00:c0:ff:89:25:55 | ff1-sw2-p9 | Optical/2Gb |
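The catalog can be checked against the "two connections, each to a separate switch" policy with a short script. This is only a sketch: the switch/port column is transcribed by hand from the table above, the two Xserve RAID entries use shortened names, and anything it flags may reflect a documentation gap or a "where feasible" exception rather than a real cabling problem.

```python
from collections import defaultdict

# Switch/port assignments transcribed from the table above.
# "ff1-raid1" and "ff1-raid2" are shortened names for the Xserve RAID rows.
HOSTS = {
    "bfs1.afs": ["ff1-sw2-p8", "ff1-sw3-p8"],
    "hfs1.afs": ["ff1-sw2-p11", "ff1-sw3-p7"],
    "hfs2.afs": ["ff1-sw1-p6", "ff1-sw4-p6"],
    "hfs10.afs": ["ff1-sw2-p5", "ff1-sw3-p5"],
    "hfs11.afs": ["ff1-sw1-p6", "ff1-sw4-p6"],
    "hfs12.afs": ["ff1-sw2-p6", "ff1-sw3-p6"],
    "bb-app7": ["ff1-sw7-p4", "ff1-sw7-p5"],
    "bb-db7": ["ff1-sw7-p6", "ff1-sw7-p7"],
    "bb-app3": ["ff1-sw7-p8", "ff1-sw7-p9"],
    "bb-db3": ["ff1-sw7-p10", "ff1-sw7-p11"],
    "grimm": ["ff1-sw2-p10", "ff1-sw3-p10"],
    "ff1-raid1": ["ff1-sw5-p4", "ff1-sw6-p4"],
    "ff1-raid2": ["ff1-sw2-p4", "ff1-sw3-p4"],
    "ff1-raid3-c0.cmgmt": ["ff1-sw5-p9"],
    "ff1-raid4-c0.cmgmt": ["ff1-sw2-p9"],
}


def switch_of(port: str) -> str:
    """Strip the port suffix: 'ff1-sw2-p8' becomes 'ff1-sw2'."""
    return port.rsplit("-p", 1)[0]


def single_switch_hosts(hosts: dict) -> list:
    """Hosts whose listed ports all land on one switch (or that list one port)."""
    return sorted(
        name for name, ports in hosts.items()
        if len({switch_of(p) for p in ports}) < 2
    )


def shared_ports(hosts: dict) -> dict:
    """Switch ports that appear under more than one host in the catalog."""
    users = defaultdict(list)
    for name, ports in hosts.items():
        for port in ports:
            users[port].append(name)
    return {port: sorted(names) for port, names in users.items() if len(names) > 1}


print("Single-switch hosts:", single_switch_hosts(HOSTS))
print("Ports listed twice:", shared_ports(HOSTS))
```

Run against the table as written, the first check lists the hosts attached only to ff1-sw7 and the two 3511 entries that show a single controller port, and the second shows two switch ports that are listed for both hfs2.afs and hfs11.afs, which is worth verifying against the switches themselves.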
Future direction of our SAN
In the upcoming FY2008 season, Syscore will be experimenting with different iSCSI solutions and implementing the best one in order to reduce the cost of getting block-level storage resources from our arrays to our servers.
