The next scheduled vulnerable period (2018-01-09, 07:00-10:00) will be used to perform a software update on the NetApp filers elmer and eldo. Elmer is the main filer directly visible to users, and eldo provides the backing store for virtual machines.
This is a fairly minor update intended to get the filers running the best supported version for our hardware. This software version has been running on the backup filer for several months.
The upgrade process exploits the redundancy inherent in the hardware to minimise disruption. There are two identical controllers, and while one is being upgraded, the other can continue to service clients on its behalf using the redundant paths to the shared pool of discs. Nevertheless, there will be some disruption: Windows and Mac clients using CIFS are likely to be disconnected at least twice during the process, and there will be some short periods during which the NFS service does not respond as it is handed over from one controller to the other.
As with every filer outage, however short, there is a risk of consequential disruption to other services. In particular, it is possible that Xen-based virtual machines will need to be rebooted afterwards if their virtual discs go into a “read-only” state. If this consequential disruption does happen, it may extend later into the day, as the problem is not always immediately apparent.
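Because the read-only state is not always obvious, owners of virtual machines may want to check for it themselves after the update. The following is a minimal sketch only, assuming a Linux guest in which the problem shows up as an `ro` option in `/proc/mounts`; the function name and the list of ignored filesystem types are illustrative and not part of any existing tool.

```python
def read_only_mounts(mounts_text,
                     ignore_fstypes=("proc", "sysfs", "tmpfs", "devtmpfs",
                                     "squashfs", "iso9660")):
    """Return mount points listed as read-only in /proc/mounts-style text.

    Assumption: each line has the usual layout
    "device mount_point fstype options dump pass".
    Filesystem types in ignore_fstypes are skipped because they are
    commonly mounted read-only on purpose.
    """
    result = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        mount_point, fstype, options = fields[1], fields[2], fields[3]
        if fstype in ignore_fstypes:
            continue
        # Options are comma-separated; "ro" marks a read-only mount.
        if "ro" in options.split(","):
            result.append(mount_point)
    return result
```

On a guest this could be called as `read_only_mounts(open("/proc/mounts").read())`. A non-empty result for a filesystem that is normally writable suggests the machine may need a reboot.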
People may like to know that at the time of writing, these filers have been running without interruption for 1113 days and have serviced about half a trillion NFS requests.
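For a sense of scale, those two figures can be turned into an average request rate. This is a rough sketch: "half a trillion" is taken as exactly 5e11 and the load is treated as uniform over the whole period, neither of which is precisely true.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def average_rate(total_requests, uptime_days):
    """Mean requests per second over the given uptime."""
    return total_requests / (uptime_days * SECONDS_PER_DAY)


# Assumed inputs: about 5e11 requests over 1113 days of uptime.
rate = average_rate(5e11, 1113)
print(f"roughly {rate:.0f} NFS requests per second on average")
```

That works out at roughly 5,200 requests per second, averaged across nights and weekends as well as working hours.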