Projects Archives

March 23, 2005

And so we begin: The AFS server overhaul.

For the past three to four years or so, the AFS servers which house and serve all UMBC users' UNIX home directories, email, and personal web pages have been operating on a mix of Sun UltraSPARC and x86-based hardware platforms running Solaris 8 and Linux. Each server has its own direct-attached SCSI-based storage array, each containing 8 drives of various sizes. Host-based software RAID has kept drive failures from becoming a visible issue to the users whose home volumes are stored on a particular server.

Four years is a fairly respectable period of time for a service to be running non-stop like that. Naturally, though, technology and physical wear and tear progress to the point where it becomes advantageous to seek out an upgrade before the old gets too old and catastrophic failure becomes a distinct possibility and resident boogeyman.

Continue reading "And so we begin: The AFS server overhaul." »

April 13, 2005

AFS Fiber Channel Network Test Plan

As implementation of UMBC's AFS home directory fiber channel network carries on, I have written a test plan to pursue once all the hardware is in place.

AFS Fiber Channel Network Test Plan

April 20, 2005

Live server monitoring roll out

As a side project to the new AFS server upgrade project, I have implemented an SNMP-based server monitoring site using the freely available PHP-based Cacti.

Right now, statistics for each server are basic, covering CPU usage, load average, disk space consumption, and network interface traffic. In time, I will develop SNMP agents and queries that dig deeper into what's happening on our servers, such as process counts (e.g., the number of httpd processes on web servers, imapd/pop3d processes on the mail readers, and so on). I also plan to monitor all ports on our fiber channel switches for traffic and errors.
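As a rough illustration of how interface traffic graphs can be derived from SNMP data: the interface octet counters only ever count upward, so a rate has to be computed from two successive polls. The sketch below assumes 32-bit counters (such as IF-MIB's ifInOctets) and a 300-second polling interval; the function name and the interval are illustrative assumptions, not our Cacti configuration.

```python
COUNTER32_MODULUS = 2 ** 32  # assumption: 32-bit octet counters such as ifInOctets


def octets_to_bits_per_second(previous, current, interval_seconds,
                              modulus=COUNTER32_MODULUS):
    """Turn two successive octet-counter readings into an average bit rate."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    # The modulo absorbs a single counter wrap between the two polls.
    delta_octets = (current - previous) % modulus
    return delta_octets * 8 / interval_seconds


# Hypothetical readings taken 300 seconds apart.
rate = octets_to_bits_per_second(1000, 4000, 300)
```

On a busy link a 32-bit counter can wrap more than once between polls, which this sketch cannot detect; shorter polling intervals or 64-bit counters avoid that.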

Since every server will be running NET-SNMP, we can also utilize this as a (potential) new way to monitor servers using SNMP traps. This, however, is another project for another day.

The ultimate goal, though, is to use this as a capacity planning tool.

You can view the statistics we are collecting by clicking here.

April 27, 2005

Blackboard apppak3

Weblock is now running Blackboard 6.3.something. It was also taken apart, blown out, and had a magnet run across its motherboard. This was an attempt to remove the dirt and metal filings inside the server (from the contractors that put our trays in). Hopefully it will stop crashing now.

May 11, 2005

New MySQL server backup regimen

The new MySQL server is now doing nightly dumps of its databases at 3:30am. It creates a full dump and individual dumps of each database on it in a backup directory, organized by date. This will make reconstituting the entire database server or individual databases easy to do in the event that needs to be done.
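The layout described above could be sketched roughly as follows. This is not the actual backup script: the backup root, the file names, and the function name are all made up for illustration, and the function only builds the mysqldump command lines rather than running them.

```python
from datetime import date
from pathlib import PurePosixPath


def plan_nightly_dumps(databases, backup_root, day):
    """Return (target_dir, commands) for one night's backups.

    Nothing is executed here; each command is an argv list that a
    scheduler could hand to subprocess.run().
    """
    # One directory per date, e.g. <backup_root>/2005-05-11
    target_dir = PurePosixPath(backup_root) / day.isoformat()
    # Full dump of every database on the server.
    commands = [
        ["mysqldump", "--all-databases",
         f"--result-file={target_dir / 'full.sql'}"]
    ]
    # One additional dump per individual database.
    for name in databases:
        commands.append(
            ["mysqldump", "--databases", name,
             f"--result-file={target_dir / (name + '.sql')}"]
        )
    return target_dir, commands


# Hypothetical database names and backup root.
target, cmds = plan_nightly_dumps(["wiki", "blog"], "/backup/mysql",
                                  date(2005, 5, 11))
```

A cron entry of the form `30 3 * * *` would give the 3:30am schedule. Keeping per-database files next to the full dump is what makes restoring a single database straightforward.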

October 27, 2005

Recent goings-on with the Core Storage Fabric

Tonight I split the mirrors on the production AFS servers to upgrade the firmware on the second of the two Apple Xserve RAIDs to version 1.5. While doing that, I also put in the proper LUN masking for the three new AFS servers that I built last night: bfs1, hfs1, and hfs2.

BFS1 will replace two very... um... mature AFS servers which serve out things such as everyone's web content and departmental spaces in AFS. It'll have a total of 1TB of mirrored disk space to allocate out.

HFS1 and HFS2 will replace the last of the old direct-attached SCSI AFS servers which serve everyone's home volumes. When these two come online, the HFS* servers will be 100% on the fabric.

I also spent part of the morning updating the zoning on the FC switches and getting them ready for the pending expansion into the PP building. This also entailed installing longwave SFP transceivers into the switches here in ECS to make the distance across campus.

When the mirrors on the production AFS servers finish syncing tonight, I'll join the mirrors on the three new servers. Then we'll plan the move of volumes to BFS1, kick off the volume balancer script to move the home volumes off the old servers and onto the two new ones, and finally go down and decommission the old servers.
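For readers curious what a volume balancer might look like, here is a minimal sketch. It is not the script we run: the greedy largest-first strategy, the server and volume names, and the single shared partition name are all assumptions for illustration. It only builds `vos move` command lines and does not execute them.

```python
def plan_volume_moves(volumes, old_servers, new_servers, partition="a"):
    """Plan moves of volumes off old servers onto the least-loaded new server.

    volumes: list of (name, size_kb, current_server) tuples.
    Returns a list of `vos move` argv lists; nothing is executed.
    """
    load = {server: 0 for server in new_servers}
    moves = []
    # Place the largest volumes first so the small ones can even things out.
    for name, size_kb, server in sorted(volumes, key=lambda v: v[1], reverse=True):
        if server not in old_servers:
            continue  # already on a server we are keeping
        target = min(new_servers, key=lambda s: load[s])
        load[target] += size_kb
        moves.append([
            "vos", "move", "-id", name,
            "-fromserver", server, "-frompartition", partition,
            "-toserver", target, "-topartition", partition,
        ])
    return moves


# Hypothetical volumes and old server names.
example_volumes = [
    ("user.alice", 500, "old1"),
    ("user.bob", 300, "old2"),
    ("user.carol", 200, "old1"),
    ("user.dave", 100, "hfs1"),
]
planned = plan_volume_moves(example_volumes, {"old1", "old2"}, ["hfs1", "hfs2"])
```

A real balancer would also need to account for space already used on the target servers and for the source partition of each volume, both of which are ignored here.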

November 4, 2005

QLogic fiber channel switches now running in PP

Two new QLogic SANbox 5200 fiber channel switches were configured and installed today in the Public Policy (PP) data center. They are awaiting the installation of the cross-campus fiber pairs, which will allow them to join the Core Storage Fabric in the ECS data center with an aggregate speed of 8Gb/second between the two buildings (2x 2Gb/s, bidirectional).

December 5, 2005

Service capacity upgrades

Today Santa came early for SysCore and we received 9 new Sun V20z 2x1.8GHz Opteron (single core) servers. Over the course of winter break, they will be phased into service in the following roles:

* One will serve as a cold spare. Now that we have ourselves a critical mass of V20zs in our farm, it's prudent to keep a virgin spare around. This server will have an OS load on it, ready to go.

* Six will replace the current cluster of 16 email servers that handle inbound and outbound umbc.edu email traffic. Three will be located in the ECS data center and the other three in the PP data center. An additional 4 servers will also be freed up as we will stop running dedicated Milter (spam and virus filtering) servers and run these services directly on the mail servers themselves. So 20 servers will be turned into 6. Yay for the data center power and cooling budget.

* Two will be used to vastly increase the capacity and speed of the webmail system. This will cure the recent user complaints of slowness with that service. It'll also have redundancy as there is currently no backup for the single server now acting as the webmail server. As with the new email servers, one will be located in ECS and the other in PP.

Some of the hardware these new V20zs replace will be recycled as upgrades for other systems or retained as emergency spares and for other backline uses.

December 23, 2005

Inter-switch link redundancy testing success

This morning, Rob and I tested the redundant links between our fibre channel switches in ECS and PP. Everything went off without a hitch, and that's good considering that we were dealing with live traffic!

We have two dual-switch clusters in ECS. There are two fibre channel links between the clusters, one from each switch. There is an additional single switch in the Blackboard server rack which has one link to each cluster, so it basically straddles the two clusters.

We also have one dual-switch cluster in PP, and each switch in that cluster has a connection to one of the two clusters in ECS.

So we pulled the two direct links between the ECS clusters, and traffic failed over to the Blackboard switch and went through that.

We then pulled the connection to the Blackboard switch, and traffic predictably failed over again, this time going to the switch cluster in PP and then back out from it.

This means it'll take a lot of things going wrong to split the fabric: we can survive two switch failures without splitting it. This testing also confirmed that we have a correct fibre channel zone configuration (fibre channel zones are analogous to VLANs on an Ethernet network).
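The failover sequence above can be sketched as a small graph exercise. This is a simplified model, not a description of how the switches actually route: each switch cluster is treated as a single node, the node names are made up, and a plain breadth-first search stands in for the fabric's own path selection.

```python
from collections import deque

# One entry per inter-switch link; the two direct ECS links are listed separately.
LINKS = [
    ("ecs-a", "ecs-b"), ("ecs-a", "ecs-b"),
    ("blackboard", "ecs-a"), ("blackboard", "ecs-b"),
    ("pp", "ecs-a"), ("pp", "ecs-b"),
]


def find_path(links, start, goal):
    """Breadth-first search; returns a shortest node path, or None if split."""
    neighbours = {}
    for a, b in links:
        neighbours.setdefault(a, []).append(b)
        neighbours.setdefault(b, []).append(a)
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in neighbours.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


# Step 1: pull both direct links between the ECS clusters.
no_direct = [l for l in LINKS if set(l) != {"ecs-a", "ecs-b"}]
# Step 2: also pull the links to the Blackboard switch.
no_blackboard = [l for l in no_direct if "blackboard" not in l]

path_normal = find_path(LINKS, "ecs-a", "ecs-b")
path_step1 = find_path(no_direct, "ecs-a", "ecs-b")
path_step2 = find_path(no_blackboard, "ecs-a", "ecs-b")
```

In this model the fabric only splits once the PP links are removed as well, which matches what we saw during the test.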

January 19, 2006

Storage projects update

I'm happy to report that the second AFS RAID array was installed onto the SAN in the PUP building, and now all of UMBC's user home directories, email storage, and most of the UMBC web content are mirrored across two buildings. This is a huge step for data survivability.

I also installed two new Sun StorEdge 3511 arrays onto the SAN, one in ECS and the other in PUP. Like the AFS RAID arrays, these two will provide mirrored redundancy for Blackboard, Oracle, and MySQL data. The Blackboard servers are on the SAN, too, and next week that stuff will be moved onto those disks.

To come: Research storage arrays will be added to the SAN as well. These will provide several terabytes of storage purely for research projects being conducted on various UMBC clusters as well as hercules and titan. This storage will be served out via NFS to those requiring it.

January 23, 2006

Email Delivery System Upgrades

For the past week or so, I've been rearchitecting and installing a completely new mail delivery, relay, and filtering system here at UMBC. The existing system, while flexible, had a lot of unnecessary overhead, had become hard to maintain, and wasn't up to the task of handling our current email load in a timely manner.

The new system was put into service during Thursday and Friday of last week, and with the exception of some very minor configuration problems, has been working flawlessly.

Continue reading "Email Delivery System Upgrades" »

July 14, 2006

Research storage project == done

Today I finished my research storage project and added a little bonus to it.

Our two research servers, hercules.rs and titan, have been using NFS mounts from ds1.rs for the last several years to store large files associated with the various projects of users of those two machines. ds1.rs was a Dell PE1750 with a PowerVault attached to it with a total space of around 375GB. ds2.rs replaces this with a Sun X4100 running Solaris 10 06/06 with ZFS connected to two Apple Xserve RAIDs and has a total of 4.2TB online.

The bonus is that I also installed an ADS-capable Samba server on ds2.rs, so people's research volumes can also be made available to Windows workstations over SMB/CIFS as well as NFS. Very cool!

Rob is now in the process of moving the RS NIS map server from ds1.rs to ds2, and after that, the old ds1.rs will replace our old cobbled-together USENET news server (yes, we still offer USENET news here :)

November 1, 2006

New IMAP/POP servers and other stuff

The new IMAP/POP servers have arrived and are being installed. These will replace the current servers, which are Dell 2650s running Red Hat. The new servers are Sun X4100 systems with 8GB of RAM and a total of four 2.6GHz CPU cores. They will run Solaris 10 as the operating system.

As a side note, once these servers are installed and put into production (3 are already installed and in production; we still need to install the last 2), our primary core infrastructure (AFS, Mail - including SMTP, IMAP/POP and spam filtering, DNS, LDAP, and Web) will all be running Solaris, with just a few minor systems still running Linux. In time and where appropriate, those systems will also be migrated to Solaris 10. Beyond that, we're looking at the possibility of running Linux systems (such as the GL login servers) in a virtualized environment under Solaris (using BrandZ).

November 27, 2006

What happened to all of my spam?

OIT’s Core Systems group put an implementation of greylisting in place on the central email servers over the Thanksgiving holiday weekend. Greylisting is another method of combating spam, and it exploits the way most spammers currently send their messages: our servers return a temporary error the first time a mail server attempts to deliver to a particular recipient. Well-behaved mail servers simply retry within a short period of time, while spammers typically won’t, so your wanted messages get through and the spam is left on the floor.

Like any anti-spam technique, greylisting does have a negative side effect: a delay, sometimes significant, may be incurred the first time a message is sent to a UMBC recipient from a particular server. After that initial message gets through (following a short timeout period), further messages from that sender via that mail server should flow smoothly to your mailbox. Our servers are configured to accept mail 2 to 5 minutes after the initial attempt is made, but the remote server is responsible for scheduling the retry.
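For the curious, the basic greylisting decision can be sketched in a few lines. This is an illustration only, not our mail server configuration: it assumes the common approach of tracking a (sending server address, sender, recipient) triplet, and the 120-second minimum delay is simply the low end of the window mentioned above. The class and method names are made up.

```python
class Greylist:
    """Minimal in-memory greylist keyed on (client_ip, sender, recipient)."""

    def __init__(self, min_delay_seconds=120):
        self.min_delay_seconds = min_delay_seconds
        self.first_seen = {}  # triplet -> time of the first delivery attempt
        self.passed = set()   # triplets that have already retried successfully

    def check(self, client_ip, sender, recipient, now):
        """Return 'accept' or 'tempfail' for an attempt at time `now` (seconds)."""
        triplet = (client_ip, sender, recipient)
        if triplet in self.passed:
            return "accept"
        # Remember the first attempt; later attempts are measured against it.
        first = self.first_seen.setdefault(triplet, now)
        if now - first >= self.min_delay_seconds:
            self.passed.add(triplet)
            return "accept"
        return "tempfail"  # a well-behaved server will retry later


# Hypothetical addresses and times (in seconds).
greylist = Greylist()
first_try = greylist.check("192.0.2.1", "friend@example.org", "you@umbc.edu", now=0)
retry = greylist.check("192.0.2.1", "friend@example.org", "you@umbc.edu", now=180)
```

The first attempt is turned away with a temporary failure and the retry three minutes later is accepted; after that, the same triplet passes immediately.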

We’d like your feedback on how the greylisting is working.

(read more for details.)

Continue reading "What happened to all of my spam?" »

About Projects

This page contains an archive of all entries posted to OIT SysCore in the Projects category. They are listed from oldest to newest.
