Restoration Error

April 16th, 2010

For reasons unknown at this point, the backups that were restored through the night are actually from February of this year.  This is of course a major blow to the hard work the team put in through the night as well as causing further inconvenience for users.

We are presently downloading the offsite backups, which are up to date, and we shall then have to start the restoration of accounts from scratch.

As the offsite backups are already in tar format, the process shouldn’t take as long as it did last night.

Obviously this is a major setback, but we would appreciate your patience and understanding whilst we try to resolve this as soon as possible.

Please note that, to keep accounts in sync, we have deliberately stopped all services on the Linux server whilst we restore from the recent backups.

Update 09:35 – Backups from 15th April are now being downloaded to the server. We expect this to be complete by approximately 10:45.

Update 10:11 – Approximately 5GB of data left to download before we can start restoring. Restoration should start at around 10:30.

Update 10:50 – Currently terminating accounts so that they can be recreated with the correct data.

Update 12:10 – The restoration of accounts has now begun in alphabetical order. Each account takes between 30 and 60 seconds to fully restore, so it is hard for us to give an ETA for completion. As each account is restored, however, it will immediately become active again for web traffic and email.

Update 12:18 – Restoration 15% complete

Update 12:31 – Restoration 25% complete

Update 12:45 – Restoration 30% complete

Update 13:11 – We’re just resolving a minor issue with cPanel and we will then be able to continue with the account restoration.

Update 13:27 – Earlier problems with cPanel have been resolved and we are continuing with the restoration albeit in blocks of 10 domains at a time to prevent server overloads.
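
To illustrate the batching described in the 13:27 update, here is a minimal sketch of splitting a list of domains into blocks of 10. It is not the procedure actually used on the server: the helper name `in_blocks` and the example domain names are assumptions made purely for illustration, and the restore step itself is left as a placeholder.

```python
def in_blocks(items, size=10):
    """Yield consecutive blocks of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]

# Hypothetical domain names, for illustration only.
domains = [f"example{i:02d}.co.uk" for i in range(1, 26)]

for block in in_blocks(sorted(domains)):
    # A real restore step would go here; this sketch only reports the block.
    print(f"Restoring {len(block)} domains: {block[0]} to {block[-1]}")
```

Working through one block before starting the next caps how many restores are in flight at once, which is the point of batching when the aim is to avoid overloading the server.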

Update 13:52 – We are now well over halfway through the restoration process.

Update 14:22 – We are now restoring the last batch of accounts and this should be completed within the hour.  Thank you for your patience during this very frustrating period.

Update 14:51 – There are currently around ten accounts left to restore…

We are aware of an issue where PHP files are not being parsed; instead, the browser is trying to download the file. This is being treated as a high priority and will be dealt with as soon as the restoration task completes.

Update 15:19 – All accounts have now been fully restored. We are currently recompiling Apache/PHP to resolve the above issue.

MySQL Issue on Linux Server One

April 15th, 2010

We are aware of an issue with the MySQL server on Linux Server One and we are working to resolve this as quickly as possible.

Update 18:36 – The server is failing to respond due to excessively high CPU usage. We are attempting to reboot the server now.

Update 18:40 – It looks like the reboot has not resolved the issue; technicians are on-site and are investigating further.

Update 19:29 – We have gained access to the server, but there does appear to be a major issue with the boot disk. We are running a disk repair at the moment to try to resolve the boot issue.

Update 20:24 – Unfortunately it does look like the primary hard disk on the server is fatally corrupted. We will now have to replace the disk, reinstall the OS and cPanel, and then start restoring accounts. Note that this process will take several hours to complete, but we will give you constant updates on progress.

Update 21:05 – The faulty drive has now been replaced and the operating system re-installed. We are currently installing cPanel/WHM.

Update 22:33 – cPanel/WHM has now been re-installed and we are in the process of securing it for the restoration of accounts.

Update 02:14 – We have now commenced restoring user accounts. This will be done alphabetically and we shall proceed as quickly as possible.

Update 02:38 – Accounts “0” to “A” restored

Update 02:58 – “B” accounts restored

Update 03:27 – “C” accounts restored

Update 03:42 – “D” accounts restored

Update 03:53 – “E” accounts restored

Update 04:34 – “F” accounts restored

Update 05:00 – “G” & “H” accounts restored

Update 05:10 – “I” accounts restored

Update 05:12 – “J” accounts restored

Update 05:19 – “K” accounts restored

Update 05:25 – “L” accounts restored

Update 05:41 – “M” accounts restored

Update 05:47 – “N” accounts restored

Update 06:00 – “O” accounts restored

Update 06:12 – “P” accounts restored

Update 06:14 – “Q” accounts restored

Update 06:18 – “R” accounts restored

Update 06:47 – “S” accounts restored

Update 07:04 – “T” accounts restored

Update 07:06 – “U” accounts restored

Update 07:18 – “W” accounts restored

Update 07:23 – “Z” accounts restored

All user accounts have now been restored from the backups taken at around 2am yesterday morning. Would all users please carefully check their accounts and report any issues through the helpdesk: https://www.openmindhosting.co.uk/support/

An RFO (Reason For Outage) will be issued shortly.

Network Issue at LINX

March 17th, 2010

The London Internet Exchange (LINX) is currently suffering from a network issue that is affecting traffic. We have shut down our connections to LINX until they resolve this problem, and traffic has been rerouted over our other peering points in the meantime.

Date: 16/03/2010
Time: 22:30
Effect on service: Increased latency for traffic that would normally reach us via LINX
Duration: Ongoing

The only information that we have from LINX at this time is that they have acknowledged an issue and are investigating the problem. This problem has only affected traffic that reaches our network via LINX, which, thanks to our diverse peering points with many of the major ISPs, has meant minimal impact on incoming and outgoing traffic. The Open Mind Hosting network has remained stable throughout.

Traffic that would normally reach us via LINX will now be rerouted and connect to us via our other peering points. This may result in a slight increase in latency for that traffic, if it involves additional hops. We are now waiting on LINX for updates as to the nature of the problem, likely resolution time and ultimately for an all clear notification.

If anyone is experiencing problems with their service, please do not hesitate to contact us.

MySQL4 Database Server

February 21st, 2010

It looks like we have a disk failure on this server resulting in MySQL4 databases becoming non-operational.

Technicians are currently working on the issue and we shall post an update as soon as we have one…

UPDATE 15:19 – Unfortunately it is confirmed that the drive has failed and we are currently waiting for technicians to replace it. Once this is done we shall be able to restore data from the backups.

UPDATE 16:18 – The disk has now been replaced and we have started to restore data from the backups.  This should take no more than 2 hours to complete.

UPDATE 16:47 – Data has now been restored; we are in the process of reconfiguring the database server.

UPDATE 19:18 – Unfortunately our batch restore script is failing to restore databases correctly so to ensure zero data loss we are having to restore databases one by one.  There are currently 181 databases on the server and each one takes 30-60 seconds to restore.  Databases will be restored in alphabetical order so we appreciate your patience whilst we carry out this work.
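
As a rough guide to what the figures in the 19:18 update imply, the arithmetic can be sketched as below. This is only an estimate under stated assumptions: it takes the 181 databases and the 30 to 60 seconds per database from the update, assumes they are restored strictly one at a time, and ignores any time spent between restores. The function name is illustrative.

```python
# Rough restore-time estimate from the figures in the update above.
DATABASES = 181                     # databases on the server (from the update)
MIN_SECONDS, MAX_SECONDS = 30, 60   # per-database restore time (from the update)

def restore_window_minutes(count, min_s, max_s):
    """Return (best, worst) case total restore time in minutes, one at a time."""
    return count * min_s / 60, count * max_s / 60

best, worst = restore_window_minutes(DATABASES, MIN_SECONDS, MAX_SECONDS)
print(f"Estimated restore time: {best:.1f} to {worst:.1f} minutes")
```

On these assumptions the restore time alone comes to between roughly one and a half and three hours, before allowing for any checks or manual steps between databases.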

UPDATE 20:44 – We are now restoring databases beginning with “H” and are more than 50% through the restoration work.

UPDATE 23:03 – All databases have now been restored. If you experience any further problems, please do not hesitate to get in touch with our support team. We shall be issuing a full RFO (Reason For Outage) tomorrow.

Helm Maintenance

February 8th, 2010

We shall be carrying out some essential maintenance on the Helm control panel this morning so it will be unavailable for approximately 30 minutes.

UPDATE 12:34 – This work has now been completed.

Windows Statistics Server

January 22nd, 2010

We are currently in the process of upgrading our primary Windows statistics server to the latest release of SmarterStats.

During the upgrade users will not be able to access their stats reports. We shall update this post once the work has been completed.

UPDATE 11:21 – This work has now been completed and stats are once again available to all Windows network users.

Upgrade to Windows DNS Servers

January 15th, 2010

We will shortly commence an upgrade to our primary and secondary DNS servers.

The DNS servers will continue to operate throughout the upgrade so there will be no interruption in service. During the upgrade, however, you will not be able to add, edit or remove any DNS records from existing zones within Helm.

We will update this post once the upgrade has been completed.

UPDATE 12:24 – The upgrade has now been completed with zero downtime or loss of data. We hope you will notice an increase in speed with DNS server requests from now on.

Data Centre Outage

January 8th, 2010

We are currently experiencing a major outage at the data centre in which our network is located.

This is affecting all shared/dedicated clients.

Currently we are waiting for the data centre to update us as to the nature of the problem and the expected resolution.

Update 01:45 – The network is now back online; we are communicating with the data centre to find the cause of the outage.

MySQL5 Database Issue

January 4th, 2010

Users have reported connectivity problems with the MySQL5 database server on the Windows network.

We are currently investigating the issue and hope to have it resolved shortly.

Update: This issue has now been resolved.

Billing System Closed

December 31st, 2009

The billing system is currently closed and will remain this way until midday, 1st January 2010.

The reason for the closure is to ensure that all invoices and orders are charged at the correct rate of VAT when the VAT rate changes from 15% to 17.5% at midnight tonight.
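
As a worked illustration of why the changeover matters, the sketch below compares the VAT-inclusive total for the same net amount at the two rates given above. The net amount of 100.00 is a hypothetical figure chosen for the example, and the helper name `gross` is illustrative; it does not describe how the billing system itself calculates invoices.

```python
from decimal import Decimal

OLD_RATE = Decimal("0.15")    # VAT rate until midnight (from the notice)
NEW_RATE = Decimal("0.175")   # VAT rate from 1st January 2010 (from the notice)

def gross(net, rate):
    """Return the VAT-inclusive price, rounded to two decimal places."""
    return (net * (1 + rate)).quantize(Decimal("0.01"))

net_price = Decimal("100.00")  # hypothetical net invoice amount
print(gross(net_price, OLD_RATE))
print(gross(net_price, NEW_RATE))
```

An order processed either side of midnight would therefore be charged a different total for the same net price, which is why the system is held closed across the changeover.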

Once the billing system re-opens, users will be able to once again place new orders and access existing accounts.