Friday, 10 October 2014

Identify Replication delay/lag on Streaming replication

As specified prior, sync between Primary and Standby servers can undoubtedly be seen by executing below functions,

On Primary:
SELECT pg_current_xlog_location();

On Standby:
SELECT pg_last_xlog_receive_location();
SELECT pg_last_xlog_replay_location();


furthermore, on a real-traffic extensive database wherein hits after hits onto database is expected and resulting in series of writes/selects will impact on above Hexa-decimal values being represent the state of sync between Primary and Standby Servers.

To Check replication status, use below function by firing this on Primary Node:

PG_STAT_REPLICATION:

Column
Type
Description
pid
integer
Process ID of a WAL sender process
usesysid
oid
OID of the user logged into this WAL sender process
usename
name
Name of the user logged into this WAL sender process
application_name
 text
Name of the application that is connected to this WAL sender
client_addr
inet

IP address of the client connected to this WAL sender.
If this field is null, it indicates that the client is connected via a
Unix socket on the server machine.
client_hostname
text
Host name of the connected client, as reported by a reverse DNS
lookup of client_addr. This field will only be non-null for
IP connections, and only when log_hostname is enabled.
client_port
integer
TCP port number that the client is using for communication with
this WAL sender, or -1 if a Unix socket is used
backend_start
timestamp 
Time when this process was started, i.e., when the client connected
to this WAL sender
state
text
Current WAL sender state
sent_location
text
Last transaction log position sent on this connection
write_location
text
Last transaction log position written to disk by this standby server
flush_location
text
Last transaction log position flushed to disk by this standby server
replay_location
text
Last transaction log position replayed into the database on this
standby server
sync_priority
integer
Priority of this standby server for being chosen as the synchronous
Standby
sync_state
text
Synchronous state of this standby server.


Select appropriate columns to view the replication status:

select client_addr, state, sync_priority, sent_location, write_location, flush_location, replay_location from pg_stat_replication;  

Also, another function would help out in Replication lag behind Stand by and Primary server. 

pg_last_xact_replay_timestamp() 

Using Current time and with above function, we get replication delay as

Kamal=# select now() - pg_last_xact_replay_timestamp() AS replication_delay_time;
 replication_delay_time
-----------------------
 00:00:04.064454
(1 row) 

Parellely, monitor the database size if this replication designed for multiple database.

 SELECT D.DATNAME AS NAME,  PG_CATALOG.PG_GET_USERBYID(D.DATDBA) AS OWNER,
    CASE WHEN PG_CATALOG.HAS_DATABASE_PRIVILEGE(D.DATNAME, 'CONNECT')
        THEN PG_CATALOG.PG_SIZE_PRETTY(PG_CATALOG.PG_DATABASE_SIZE(D.DATNAME))
        ELSE 'NO ACCESS'
    END AS SIZE
FROM PG_CATALOG.PG_DATABASE D
    ORDER BY
    CASE WHEN PG_CATALOG.HAS_DATABASE_PRIVILEGE(D.DATNAME, 'CONNECT')
        THEN PG_CATALOG.PG_DATABASE_SIZE(D.DATNAME)
        ELSE NULL
    END DESC
    LIMIT --;