The brief is to provide failover only, because the collaboration server is not “load balancing aware”: it assumes that it is hosted on a single host. To provide rudimentary failover I have set up a method that switches all sessions to another host should the active host fail. Clients then stay on the new host until it in turn fails, and only then are all sessions switched to another host. The key word here is ‘all’, because it’s important to keep every session on the same host.
From a user’s perspective, in the event of an active host outage they will lose connectivity but will be able to log back in straight away and continue until such time as the alternative host fails. This prevents them from being switched over only to be kicked again when the prior host is restored, and also ensures that ALL sessions are sent to a single host rather than spread across multiple hosts, so everyone is in the same chat rooms.
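The intended behaviour can be modelled in a few lines of plain Tcl. This is only an illustrative sketch, not BIGIP code: the `pick_host` procedure, the `record` array and the host names are hypothetical, and the real BIGIP keeps its persistence records internally.

```tcl
# Hypothetical model of one shared persistence record per key.
proc pick_host {key upHosts} {
    global record
    # Reuse the recorded host for as long as it is still available.
    if {[info exists record($key)] && [lsearch -exact $upHosts $record($key)] >= 0} {
        return $record($key)
    }
    # Otherwise pick the first available host and record it for everyone.
    set record($key) [lindex $upHosts 0]
    return $record($key)
}

puts [pick_host 8010 {hostA hostB}]  ;# both up: hostA
puts [pick_host 8010 {hostB}]        ;# hostA has failed: hostB
puts [pick_host 8010 {hostA hostB}]  ;# hostA restored: still hostB
```

The last call is the important one: restoring the original host does not move anybody, so sessions only switch again if the current host fails.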
My first idea was to adapt BIGIP’s Priority Group capability; however, this presented the same problem, in that I could not ‘stick’ the clients to a server. As soon as a server of the same or higher priority was restored, sessions would be sent to it, effectively splitting the chat rooms. Load balancing also takes place between member servers of the same priority.
So I did a bit of digging around and discovered a method of using an iRule to ‘stick’ sessions based upon an arbitrary value; in this case I used the TCP port number.
The iRule is as follows:
CLIENT_ACCEPTED is an event that is triggered when a connection has been established between a client device and the BIGIP.
‘persist uie’ is where I am manipulating the connection persistence, in this case via the Universal Inspection Engine. Here I am simply setting an integer as the persistence key; it can be any number, but I have chosen to use the connecting TCP port number ([TCP::local_port]). Because every client connection presents the same key, all sessions are fixed to a single host, preventing load balancing.
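Going only by the description above, a minimal sketch of such a rule might look like the following. It is an assumption about the rule's shape rather than the tested original, and it leaves out the optional timeout argument to `persist`.

```tcl
when CLIENT_ACCEPTED {
    # Fires once the client-side connection to the BIGIP is established.
    # The virtual server's listening port is used as the persistence key;
    # every client presents the same value, so all sessions share one record.
    persist uie [TCP::local_port]
}
```

Any constant value would serve equally well as the key; the port number is simply a convenient choice.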
The following BIGIP configuration has been tested as working by a business systems analyst, using a combination of application logs, BIGIP statistics and packet captures. He confirmed which traffic was being sent on which ports. Port 8010 is used for the majority of user-generated traffic, which must be kept on a single host. Port 8443 is used to transport application-specific information but does not carry anything that is user generated, and therefore does not require persistence.
The aforementioned iRule is referenced by a ‘Universal Persistence’ profile as follows:
That Universal Persistence profile is then referenced from a Performance Layer 4 type Virtual Server like so:
Another Virtual Server is required for HTTPS traffic; however, this does not require any special configuration and is set up as a typical HTTP type Virtual Server, e.g.:
The above configuration refers to a six member/node pool. Each member runs both the general Blackboard application and the Collaboration Service. We have yet to load test the combination of the application and collaboration services, or how together they influence the way the BIGIP balances load across the members. I am considering using ‘Observed (node)’ instead of the current ‘Observed (member)’ method, since the same nodes are used in multiple pools. At some stage I would also like to look at using Dynamic Ratio, if it can play nicely with persistent connections.
References:
http://devcentral.f5.com/wiki/default.aspx/iRules/CLIENT_ACCEPTED.html
http://devcentral.f5.com/wiki/default.aspx/iRules/persist.html
http://support.f5.com/kb/en-us/archived_products/big-ip/manuals/product/bigip4_5admin/BIGip_uie.html
Also take note of:
http://support.f5.com/kb/en-us/solutions/public/4000/100/sol4166.html