Sunday, July 29, 2007

Planning for high traffic

The most worrying project on my list right now is working out how to get as much grunt as possible to withstand the estimated traffic to the first Steve Irwin Day tribute website since his passing.

Currently the Zoo has a single primary server hosting all sites and a secondary web server sharing the load on a few sites. It's far from a perfect model and I intend to tidy it up as follows.

What I plan to do is install MySQL 5.x on the secondary server and ready it for replication. Then I will dump the contents of the existing MySQL 4.x databases, load them into the new 5.x instance and point all the websites at it. Once I'm satisfied that it's functioning as expected (I'm in the process of testing this on the bench), I will upgrade MySQL 4.x to 5.x on the primary server and begin multi-master replication between the two. This completes stage one of the preparations.
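One detail worth testing on the bench is how two masters avoid handing out the same AUTO_INCREMENT value. The sketch below generates a my.cnf fragment for each master in the loop. The option names are standard MySQL 5.0 settings, but the server IDs, the binary log base name and the function names are assumptions made up for illustration, not the configuration actually in use here.

```python
# Sketch only: generate per-server [mysqld] fragments for a multi-master loop.
# The server IDs and the "mysql-bin" log name are illustrative assumptions.

def master_config(server_id, total_masters):
    """Return a [mysqld] fragment for one master out of total_masters."""
    if not 1 <= server_id <= total_masters:
        raise ValueError("server_id must be between 1 and total_masters")
    lines = [
        "[mysqld]",
        f"server-id = {server_id}",          # must be unique per server
        "log-bin = mysql-bin",               # masters need a binary log
        "log-slave-updates",                 # pass replicated changes along the loop
        f"auto_increment_increment = {total_masters}",
        f"auto_increment_offset = {server_id}",
    ]
    return "\n".join(lines)


def auto_increment_ids(server_id, total_masters, count):
    """IDs this server would generate for its first `count` inserts."""
    return [server_id + n * total_masters for n in range(count)]


if __name__ == "__main__":
    for sid in (1, 2):
        print(master_config(sid, 2))
        print()
```

With two masters, server 1 generates odd IDs and server 2 even ones, so inserts made on both sides do not collide on the key. When a third database server joins the loop later, the increment has to go up to 3 on every master.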

With the databases in place and functioning, I will work out how to replicate all vhosts between the two servers. Plesk, the web control panel operating on both servers, makes this more complex than it really needs to be since I will also need to create the client/domain accounts for each of the replicated sites. When I have worked out what is required I will script the synchronization as much as possible and hopefully end up with near 100% automation. I may need to tap into Plesk's API to do this. This will complete stage two.
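A rough idea of what the file side of that synchronization script could look like is sketched below. It only builds rsync commands and prints them. The vhost root (Plesk's usual default on Linux), the peer host name and the domain names are assumptions for illustration, and it does nothing about creating the Plesk client/domain accounts, which would still need handling separately.

```python
import shlex

# Assumptions: Plesk's usual vhost root on Linux and a made-up peer host name.
VHOST_ROOT = "/var/www/vhosts"
PEER = "web2.example.com"


def rsync_command(domain, peer=PEER, root=VHOST_ROOT, dry_run=True):
    """Build an rsync command that mirrors one vhost directory to the peer."""
    src = f"{root}/{domain}/"
    dest = f"{peer}:{root}/{domain}/"
    # -a keeps permissions and times, -z compresses, --delete removes files
    # on the peer that no longer exist locally, -e ssh tunnels over SSH.
    cmd = ["rsync", "-az", "--delete", "-e", "ssh"]
    if dry_run:
        cmd.append("--dry-run")  # show what would change without touching anything
    return cmd + [src, dest]


if __name__ == "__main__":
    for domain in ["example.org", "example.net"]:  # hypothetical domains
        print(shlex.join(rsync_command(domain)))
```

Defaulting to a dry run is deliberate: with --delete in play, it is safer to read what rsync intends to do before letting it loose on a live vhost.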

With replication of both the databases and general structures taking place between the two existing servers I can now introduce more servers and the greater complexity they will bring.

I will be working towards four application servers just for static content/scripts and a single localised database server. I intend to just have raw boxes running either Red Hat or FreeBSD, with no fancy control panels getting in the way. This will allow me to script everything with simplicity and provide a basic configuration to each server. I will replicate the data from the first of the existing servers to one of the new application servers and from there to each of the remaining three. This is so I can keep the amount of public traffic between the servers to a minimum. I will introduce the new database server into the multi-master replication loop and point the four new application servers at it. This completes stage three.
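The fan-out described above can be written down as a small plan of hops. This is a sketch under two assumptions: the server names are made up, and the hop from the existing server to the first application server is taken to cross the public network while the remaining hops stay on the private one.

```python
def sync_plan(seed, app_servers):
    """Return (source, destination, network) hops for the fan-out.

    The seed (an existing server) pushes to the first application server,
    which then relays the data to each of the remaining servers.
    """
    if not app_servers:
        return []
    relay, *rest = app_servers
    plan = [(seed, relay, "public")]
    plan += [(relay, server, "private") for server in rest]
    return plan


if __name__ == "__main__":
    # Hypothetical host names.
    for src, dst, net in sync_plan("primary", ["app1", "app2", "app3", "app4"]):
        print(f"{src} -> {dst} ({net})")
```

However many application servers are added, only one hop in the plan touches the public side. Pushing from the existing server to all four directly would make that four.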

Once I am satisfied that each application server is connecting to the database server over their private network and that the database server is successfully replicating the databases from the existing two servers I will set up the load balancer to include the four new application servers. We may need to cut over to a new load balancer since the existing one may not support this many servers.

This setup will provide me with six front end servers and three database servers - with a bit of sharing of resources here and there. The following diagram shows what I intend on achieving.



Relevant links:
ONLamp Advanced MySQL Replication Techniques
MySQL 5.0 Manual - Replication
Rsync
SWSoft Plesk - Upgrading MySQL

Saturday, July 21, 2007

These updates are getting scarce...

It's not like many people come here for the regular updates anyway.

Network Storage adventures
To get some kind of data redundancy going here at the Zoo I've been playing around with different NAS based operating systems.

The first I tried was FreeNAS. Using FreeBSD as its base operating system, it provides a modified m0n0wall administration web interface with all sorts of functions for managing storage, services, shares and users. I used it for about two months, and it failed me when I upgraded the system with two more 500GB SATA disks plugged into a Silicon Image 2-port PCI SATA controller. FreeNAS is fairly simple to get up and going - the hardest part was working out the confusing workflow of setting up the physical disks, RAID and partitions.

Given the failure (due to some buggy Sil driver in BSD) I decided to try out OpenFiler instead. I knew that Linux had better support for the Sil chipset - although I probably wouldn't base my whole NAS system around support for a cheap controller card in future. OpenFiler is based upon rPath, which is an embedded Linux-based OS. Its unique package management system, Conary, makes updates quite simple - it's similar to Debian's apt-get.

The OpenFiler installation was rudimentary as far as Linux installations go - it's almost identical to RedHat/Fedora. However it threw up a bunch of obscure and useless error messages while trying to work out the existing partitions - a bit of googling later revealed that it probably didn't understand the GEOM volumes left over from FreeNAS. Once I let it re-initialize the disks it was all fine.

The biggest gripe I had with the whole installation of OpenFiler was that initially I thought it required a network directory service to operate. That was painful, as the only semblance of a directory I had was the AD running on an SBS2003 R2 server. So I tried in vain to get that running, with no success. I did manage to get somewhat farther using an NT domain function, but that only showed groups, not users. It wasn't until I dug through the OpenFiler forums that I found the latest version included its own OpenLDAP directory service.

So I tried updating it using its web interface - no go. It would appear to download and install various components, but what appeared to be a few key components had to be installed 'in the background'. After waiting some time it didn't appear to be doing anything of the sort. So I took to its CLI, worked out how to use the conary package manager and ended up with: # conary updateall --replace-files, which worked.

Once the system was updated I was presented with a few extra LDAP-specific configuration options. I ticked and typed in the appropriate things and was in business with local directory user/group authentication. I created the LVM PVs, VGs and volumes and formatted the resultant 2.4TB partition using Ext3.

I set up the SMB and FTP services and shared it out to the appropriate groups. The extra 'host/network' based access control caught me off guard, but once I worked out how that side of things should work I had workstations backing up to the server using SyncBack in no time.

So far, out of one file server and two departments, I have used up 425GB. That's from about 12 workstations in total. Now for another 10 or so departments and 115 workstations. In future I'll look at its iSCSI and HA support.
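As a back-of-envelope check on whether the 2.4TB volume will cope, the figures above can be extrapolated linearly. This is only a sketch: it assumes the remaining 115 workstations are in addition to the 12 already backed up, that each uses about the same space as the first dozen did on average (file server included), and that 1TB is 1000GB.

```python
USED_GB = 425               # used so far, from the post
WORKSTATIONS_DONE = 12      # "about 12" in the post
WORKSTATIONS_TO_COME = 115  # assumed to be additional machines
VOLUME_GB = 2400            # the 2.4TB volume, taking 1TB = 1000GB


def projected_usage_gb(used_gb, done, remaining):
    """Linear estimate: every remaining workstation uses the average so far."""
    per_workstation = used_gb / done
    return used_gb + per_workstation * remaining


if __name__ == "__main__":
    total = projected_usage_gb(USED_GB, WORKSTATIONS_DONE, WORKSTATIONS_TO_COME)
    print(f"Projected usage: {total:.0f} GB")
    print(f"Fits in the volume: {total <= VOLUME_GB}")
```

On those assumptions the estimate comes out at roughly 4.5TB, close to double the current volume, so more disk (or tighter backup selections) would be needed before every department is on board.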

IPSec encrypted GRE tunnel, MTU settings suitable for PPPoE/A link:
crypto isakmp policy 10
 encr 3des
 authentication pre-share
 group 2
crypto isakmp key shared_key address 1.2.3.4 no-xauth
!
!
crypto ipsec transform-set transform_name ah-sha-hmac esp-3des esp-sha-hmac
 mode transport
!
crypto map map_name 10 ipsec-isakmp
 set peer 1.2.3.4
 set transform-set transform_name
 match address 101
!
!
!
interface Tunnel0
 ip address 192.168.254.1 255.255.255.252
 ip mtu 1500
 keepalive 10 3
 tunnel source Dialer0
 tunnel destination 1.2.3.4
 tunnel key 12345
!
access-list 101 permit gre any any
!
interface Vlan1
 ip tcp adjust-mss 1400
 crypto map map_name