Standard Operational Procedures
Servers are monitored every 5 minutes from multiple external locations using a third-party service. If a server does not respond, email and SMS alerts are sent to the support team. Server event logs are monitored continuously, and Error, Alert and Information events are sent immediately to the support team for diagnosis. Issues identified this way are generally resolved before they become a major problem.
Databases and applications on each server are backed up daily to network-attached storage on a 14-day cycle. The log from each backup is emailed to the server team so that backup windows can be monitored.
Backups are written either to network-attached devices inside the data centre or to Amazon S3 storage. The Amazon option lets the user download the backups independently (this is a read-only service protected by a username and password).
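As an illustration of the 14-day backup cycle described above, the following is a minimal Python sketch that works out which daily backups have aged out of the cycle. It assumes the cycle means that the 14 most recent daily backups are retained; the function name and the date-based bookkeeping are hypothetical and do not describe the actual backup software.

```python
from datetime import date, timedelta

# Length of the backup cycle in days, taken from the procedure text.
CYCLE_DAYS = 14


def expired_backups(backup_dates, today):
    """Return, oldest first, the backup dates that fall outside the cycle.

    Assumption: a backup is kept while it is fewer than CYCLE_DAYS days old,
    so with one backup per day exactly CYCLE_DAYS backups are retained.
    """
    cutoff = today - timedelta(days=CYCLE_DAYS)
    return sorted(d for d in backup_dates if d <= cutoff)


if __name__ == "__main__":
    # Fifteen consecutive daily backups; only the oldest has aged out.
    today = date(2024, 1, 15)
    dates = [date(2024, 1, 1) + timedelta(days=i) for i in range(15)]
    print(expired_backups(dates, today))
```

Under a different reading of the cycle (for example, 14 rotating storage slots that are overwritten in turn), the same cutoff logic would apply to slot timestamps rather than to dated files.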
Quarterly server management reports on the physical hardware are produced at the end of January, April, July and October, detailing major server events during the preceding 3 months. Disk space on both physical and virtual machines is monitored every 4 hours, with the alert level set by default at 80% of capacity. Disk usage is recorded in an Excel spreadsheet for trend analysis.
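The disk space check described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the alert is assumed to fire when usage reaches or exceeds the threshold, and the real monitoring tool is not named in this document.

```python
import shutil

# Default alert level from the procedure text: 80% of capacity.
ALERT_THRESHOLD = 0.80


def usage_fraction(used, total):
    """Return the fraction of capacity in use (0.0 to 1.0)."""
    if total <= 0:
        raise ValueError("total capacity must be positive")
    return used / total


def needs_alert(used, total, threshold=ALERT_THRESHOLD):
    """True when usage is at or above the alert threshold (assumed inclusive)."""
    return usage_fraction(used, total) >= threshold


def check_path(path="/"):
    """Check one mount point using the standard library's shutil.disk_usage."""
    usage = shutil.disk_usage(path)
    return needs_alert(usage.used, usage.total)


if __name__ == "__main__":
    print("alert" if check_path("/") else "ok")
```

A scheduler such as cron or Windows Task Scheduler could run a check like this every 4 hours; the threshold is a parameter so that it can be overridden per server.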
All servers are protected by a Ripe Group managed firewall. Servers behind the firewall are on RFC 1918 compliant private networks, with no access from one subnet to another except via the firewall. The firewall also performs intrusion detection, allowing us to identify probe attempts on the servers and ban suspect IP addresses for a period of 4 hours. This helps to protect the servers from organised attacks.
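For readers unfamiliar with RFC 1918, the sketch below shows in Python how an address can be tested against the three private IPv4 ranges that the RFC defines, and how a 4-hour ban window might be evaluated. The function names are hypothetical and the ban logic is an assumption for illustration only; it does not describe how the firewall itself is implemented.

```python
import ipaddress
from datetime import datetime, timedelta

# The three private IPv4 address blocks defined by RFC 1918.
RFC1918_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]

# Ban period from the procedure text.
BAN_DURATION = timedelta(hours=4)


def is_rfc1918(address):
    """True if the address lies inside one of the RFC 1918 blocks."""
    ip = ipaddress.ip_address(address)
    return any(ip in network for network in RFC1918_NETWORKS)


def ban_expired(banned_at, now):
    """True once the full ban period has elapsed since the ban started."""
    return now - banned_at >= BAN_DURATION


if __name__ == "__main__":
    print(is_rfc1918("192.168.1.10"))  # inside 192.168.0.0/16
    print(is_rfc1918("8.8.8.8"))       # public address
    start = datetime(2024, 1, 1, 12, 0)
    print(ban_expired(start, start + timedelta(hours=3)))
```

The explicit network list is used rather than the `is_private` attribute of the `ipaddress` module, because that attribute also covers ranges outside RFC 1918, such as loopback and link-local addresses.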
Scheduled Maintenance
We apply Microsoft updates on the Friday following the second Tuesday of each month, after consultation with each server owner. When critical patches are released, we aim to apply them as soon as possible after consultation with the server owner. Linux patches are applied at the end of each month after consultation with each server owner. In the event of a server compromise, we take whatever remedial action is jointly considered appropriate. This may include isolating the server from the network while the remedial work is carried out.
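The Microsoft update date follows mechanically from the calendar, so it can be computed in advance when planning consultations with server owners. The Python sketch below is one way to do this; the function names are hypothetical, and the result is only the nominal date before any rescheduling agreed with a server owner.

```python
from datetime import date, timedelta

TUESDAY = 1  # date.weekday() numbers Monday as 0, so Tuesday is 1


def second_tuesday(year, month):
    """Return the second Tuesday of the given month."""
    first_of_month = date(year, month, 1)
    # Days from the 1st to the first Tuesday (0 if the 1st is a Tuesday).
    offset = (TUESDAY - first_of_month.weekday()) % 7
    return first_of_month + timedelta(days=offset + 7)


def microsoft_update_day(year, month):
    """Return the Friday following the second Tuesday of the month."""
    return second_tuesday(year, month) + timedelta(days=3)


if __name__ == "__main__":
    # List the nominal update Fridays for one year.
    for month in range(1, 13):
        print(microsoft_update_day(2024, month).isoformat())
```

For example, in January 2024 the second Tuesday falls on the 9th, so the nominal update day is Friday the 12th.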