Saturday 17 August 2013

Failover - Streaming Replication


Continuing from the previous posts about streaming replication, we now look at performing failover between servers.

If the master (primary) server fails, the standby server should begin failover procedures.

If the standby server fails, then no failover needs to take place. If the standby server can be restarted, even some time later, then the recovery process can also be restarted immediately, taking advantage of restartable recovery. If the standby server cannot be restarted, then a full new standby server instance should be created.

If the primary server fails and the standby server becomes the new primary, and then the old primary restarts, you must have a mechanism for informing the old primary that it is no longer the primary. This is sometimes known as STONITH (Shoot The Other Node In The Head), which is necessary to avoid situations where both systems think they are the primary, which will lead to confusion and ultimately data loss.

So, switching from the primary to the standby server can be fast, but it takes some time to re-prepare the failover cluster afterwards. Regular switching from primary to standby is useful, since it allows regular downtime on each system for maintenance. It also serves as a test of the failover mechanism, to ensure that it will really work when you need it. Written administration procedures are advised.

To trigger failover of a log-shipping standby server, create a trigger file with the file name and path specified by the trigger_file setting in recovery.conf. If trigger_file is not set, the standby has no trigger file to watch for, so it cannot be promoted to master this way and stays in recovery. That can be useful for, e.g., reporting servers that are only used to offload read-only queries from the primary, not for high availability purposes.

Create the recovery command file, recovery.conf, on the standby server. Among the parameters needed for streaming replication, the following one controls failover:

trigger_file = '/path_to/trigger'
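Creating the trigger file only requires an empty file to appear at the configured path. As a minimal sketch in Python (the function name create_trigger_file is hypothetical, and the sketch assumes it runs on the standby host with permission to write to the directory that holds the trigger file):

```python
from pathlib import Path


def create_trigger_file(trigger_path):
    """Create an empty file at the path configured as trigger_file.

    Hypothetical helper: the standby is assumed to be watching this path
    because trigger_file in its recovery.conf points at it.
    """
    path = Path(trigger_path)
    # The parent directory must already exist; only the file itself is created.
    # The file content does not matter, its presence is the signal.
    path.touch(exist_ok=True)
    return path
```

Calling create_trigger_file('/path_to/trigger') with the same path as in recovery.conf would then signal the standby. The path here is the placeholder from the setting above, not a real location.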

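Once the standby server detects the trigger file, it exits recovery and changes its state to master (temp). Note that the trigger file does not itself shut down the old primary; the old primary is assumed to be down already, or has to be stopped separately. Furthermore, the recovery.conf file is renamed to recovery.done. Once that has happened, we can tell that failover has been performed.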
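The change from recovery.conf to recovery.done can be checked from a script. The following is a rough sketch, not a complete monitoring tool: the function names failover_completed and wait_for_failover are hypothetical, and it assumes that recovery.conf lives in the standby's data directory, whose path you pass in.

```python
import time
from pathlib import Path


def failover_completed(data_dir):
    """Return True if recovery.done exists and recovery.conf is gone."""
    data = Path(data_dir)
    done = data / "recovery.done"
    conf = data / "recovery.conf"
    return done.exists() and not conf.exists()


def wait_for_failover(data_dir, timeout=60.0, interval=1.0):
    """Poll the data directory until failover is visible or the timeout expires.

    timeout and interval are in seconds; both defaults are arbitrary choices.
    """
    deadline = time.monotonic() + timeout
    while True:
        if failover_completed(data_dir):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

A file check like this only shows that the standby has left recovery. It says nothing about whether the old primary is really down, so it does not replace the STONITH step described earlier.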