.\" Man page generated from reStructuredText.
.
.
.nr rst2man-indent-level 0
.
.de1 rstReportMargin
\\$1 \\n[an-margin]
level \\n[rst2man-indent-level]
level margin: \\n[rst2man-indent\\n[rst2man-indent-level]]
-
\\n[rst2man-indent0]
\\n[rst2man-indent1]
\\n[rst2man-indent2]
..
.de1 INDENT
.\" .rstReportMargin pre:
. RS \\$1
. nr rst2man-indent\\n[rst2man-indent-level] \\n[an-margin]
. nr rst2man-indent-level +1
.\" .rstReportMargin post:
..
.de UNINDENT
. RE
.\" indent \\n[an-margin]
.\" old: \\n[rst2man-indent\\n[rst2man-indent-level]]
.nr rst2man-indent-level -1
.\" new: \\n[rst2man-indent\\n[rst2man-indent-level]]
.in \\n[rst2man-indent\\n[rst2man-indent-level]]u
..
.TH "PG_AUTOCTL" "5" "Nov 06, 2022" "2.0" "pg_auto_failover"
.SH NAME
pg_autoctl \- pg_auto_failover Configuration
.sp
Several default settings of pg_auto_failover can be reviewed and changed depending
on the trade\-offs you want to implement in your own production setup. The
settings that you can change have an impact on the following
operations:
.INDENT 0.0
.INDENT 3.5
.INDENT 0.0
.IP \(bu 2
Deciding when to promote the secondary
.sp
pg_auto_failover decides to implement a failover to the secondary node when it
detects that the primary node is unhealthy. Changing the following
settings will have an impact on when the pg_auto_failover monitor decides to
promote the secondary PostgreSQL node:
.INDENT 2.0
.INDENT 3.5
.sp
.nf
.ft C
pgautofailover.health_check_max_retries
pgautofailover.health_check_period
pgautofailover.health_check_retry_delay
pgautofailover.health_check_timeout
pgautofailover.node_considered_unhealthy_timeout
.ft P
.fi
.UNINDENT
.UNINDENT
.IP \(bu 2
Time taken to promote the secondary
.sp
At secondary promotion time, pg_auto_failover waits for the following timeout to
make sure that all pending writes on the primary server made it to the
secondary at shutdown time, thus preventing data loss:
.INDENT 2.0
.INDENT 3.5
.sp
.nf
.ft C
pgautofailover.primary_demote_timeout
.ft P
.fi
.UNINDENT
.UNINDENT
.IP \(bu 2
Preventing promotion of the secondary
.sp
pg_auto_failover implements a trade\-off where data availability trumps service
availability. When the primary node of a PostgreSQL service is detected
to be unhealthy, the secondary is only promoted if it was known to be eligible
at the moment when the primary was lost.
.sp
If \fIsynchronous replication\fP was in use at the moment when
the primary node was lost, then we know we can switch to the secondary
safely, and the WAL lag is 0 in that case.
.sp
If the secondary server had already been detected as unhealthy,
then the pg_auto_failover monitor switches it from the state SECONDARY to
the state CATCHING\-UP and promotion is prevented.
.sp
The following setting still allows the secondary to be promoted, at the
cost of a window of data loss:
.INDENT 2.0
.INDENT 3.5
.sp
.nf
.ft C
pgautofailover.promote_wal_log_threshold
.ft P
.fi
.UNINDENT
.UNINDENT
.UNINDENT
.UNINDENT
.UNINDENT
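.sp
The decisions described above can be sketched in Python. This is a minimal
illustration and not the monitor implementation: the function names and
arguments are hypothetical, the exact comparisons are assumptions, and the
default values are the ones reported by \fBpg_settings\fP in the monitor
section of this page.
.INDENT 0.0
.INDENT 3.5
.sp
.nf
.ft C
```python
# Hypothetical helpers; they only mirror the documented descriptions of the
# settings and are not part of pg_auto_failover.

NODE_CONSIDERED_UNHEALTHY_TIMEOUT_MS = 20000
PROMOTE_WAL_LOG_THRESHOLD_BYTES = 16777216


def node_is_unhealthy(now_ms, last_ping_ms,
                      timeout_ms=NODE_CONSIDERED_UNHEALTHY_TIMEOUT_MS):
    """Mark a node unhealthy if its last ping was over timeout_ms ago."""
    return now_ms > last_ping_ms + timeout_ms


def secondary_may_be_promoted(wal_lag_bytes,
                              threshold=PROMOTE_WAL_LOG_THRESHOLD_BYTES):
    """Allow promotion only when the secondary is within threshold bytes."""
    return wal_lag_bytes <= threshold


# A primary last seen 25 s ago, with a secondary lagging by 1 MiB of WAL.
print(node_is_unhealthy(now_ms=100000, last_ping_ms=75000))
print(secondary_may_be_promoted(wal_lag_bytes=1048576))
```
.ft P
.fi
.UNINDENT
.UNINDENT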
.SH PG_AUTO_FAILOVER MONITOR
.sp
The behavior of the monitor is configured in the PostgreSQL
database where the extension has been deployed:
.INDENT 0.0
.INDENT 3.5
.sp
.nf
.ft C
pg_auto_failover=> select name, setting, unit, short_desc from pg_settings where name ~ \(aqpgautofailover.\(aq;
\-[ RECORD 1 ]\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-
name       | pgautofailover.enable_sync_wal_log_threshold
setting    | 16777216
unit       |
short_desc | Don\(aqt enable synchronous replication until secondary xlog is within this many bytes of the primary\(aqs
\-[ RECORD 2 ]\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-
name       | pgautofailover.health_check_max_retries
setting    | 2
unit       |
short_desc | Maximum number of re\-tries before marking a node as failed.
\-[ RECORD 3 ]\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-
name       | pgautofailover.health_check_period
setting    | 5000
unit       | ms
short_desc | Duration between each check (in milliseconds).
\-[ RECORD 4 ]\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-
name       | pgautofailover.health_check_retry_delay
setting    | 2000
unit       | ms
short_desc | Delay between consecutive retries.
\-[ RECORD 5 ]\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-
name       | pgautofailover.health_check_timeout
setting    | 5000
unit       | ms
short_desc | Connect timeout (in milliseconds).
\-[ RECORD 6 ]\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-
name       | pgautofailover.node_considered_unhealthy_timeout
setting    | 20000
unit       | ms
short_desc | Mark node unhealthy if last ping was over this long ago
\-[ RECORD 7 ]\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-
name       | pgautofailover.primary_demote_timeout
setting    | 30000
unit       | ms
short_desc | Give the primary this long to drain before promoting the secondary
\-[ RECORD 8 ]\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-
name       | pgautofailover.promote_wal_log_threshold
setting    | 16777216
unit       |
short_desc | Don\(aqt promote secondary unless xlog is with this many bytes of the master
\-[ RECORD 9 ]\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-
name       | pgautofailover.startup_grace_period
setting    | 10000
unit       | ms
short_desc | Wait for at least this much time after startup before initiating a failover.
.ft P
.fi
.UNINDENT
.UNINDENT
.sp
You can edit the parameters as usual with PostgreSQL, either in the
\fBpostgresql.conf\fP file or using \fBALTER DATABASE pg_auto_failover SET parameter =
value;\fP commands, then issuing a reload.
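.sp
As a small illustration, the following Python sketch builds the SQL text for
such a change. The helper name \fBalter_setting_sql\fP is hypothetical, the
code assumes integer\-valued settings such as the ones listed above, and it
only prints the commands without connecting to PostgreSQL.
.INDENT 0.0
.INDENT 3.5
.sp
.nf
.ft C
```python
# Hypothetical helper that only builds SQL text; it does not connect to
# PostgreSQL.

def alter_setting_sql(name, value, dbname="pg_auto_failover"):
    """Return the ALTER DATABASE command that sets one monitor parameter."""
    if not name.startswith("pgautofailover."):
        raise ValueError("not a pg_auto_failover monitor setting: " + name)
    return "ALTER DATABASE {} SET {} = {};".format(dbname, name, int(value))


# Check nodes every 10 seconds (the unit is ms), then ask for a reload.
print(alter_setting_sql("pgautofailover.health_check_period", 10000))
print("SELECT pg_reload_conf();")
```
.ft P
.fi
.UNINDENT
.UNINDENT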
.SH PG_AUTO_FAILOVER KEEPER SERVICE
.sp
For an introduction to the \fBpg_autoctl\fP commands relevant to the pg_auto_failover
Keeper configuration, please see \fI\%pg_autoctl config\fP\&.
.sp
An example configuration file looks like the following:
.INDENT 0.0
.INDENT 3.5
.sp
.nf
.ft C
[pg_autoctl]
role = keeper
monitor = postgres://autoctl_node@192.168.1.34:6000/pg_auto_failover
formation = default
group = 0
hostname = node1.db
nodekind = standalone

[postgresql]
pgdata = /data/pgsql/
pg_ctl = /usr/pgsql\-10/bin/pg_ctl
dbname = postgres
host = /tmp
port = 5000

[replication]
slot = pgautofailover_standby
maximum_backup_rate = 100M
backup_directory = /data/backup/node1.db

[timeout]
network_partition_timeout = 20
postgresql_restart_failure_timeout = 20
postgresql_restart_failure_max_retries = 3
.ft P
.fi
.UNINDENT
.UNINDENT
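.sp
A file like the one above can be read with the Python standard library. The
sketch below assumes that the file stays within the INI subset accepted by
\fBconfigparser\fP, which holds for this example but is not guaranteed by
pg_auto_failover. The helper \fBget_option\fP is hypothetical and only
imitates the \fBsection.option\fP addressing.
.INDENT 0.0
.INDENT 3.5
.sp
.nf
.ft C
```python
import configparser

# Excerpt of the example configuration shown above.
EXAMPLE = """
[pg_autoctl]
role = keeper
formation = default
group = 0
hostname = node1.db

[replication]
slot = pgautofailover_standby
maximum_backup_rate = 100M

[timeout]
network_partition_timeout = 20
postgresql_restart_failure_timeout = 20
postgresql_restart_failure_max_retries = 3
"""

config = configparser.ConfigParser()
config.read_string(EXAMPLE)


def get_option(config, path):
    """Look up a value given as section.option (hypothetical helper)."""
    section, option = path.split(".", 1)
    return config.get(section, option)


print(get_option(config, "pg_autoctl.hostname"))
print(config.getint("timeout", "network_partition_timeout"))
```
.ft P
.fi
.UNINDENT
.UNINDENT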
.sp
To output, edit and check entries of the configuration, the following
commands are provided:
.INDENT 0.0
.INDENT 3.5
.sp
.nf
.ft C
pg_autoctl config check [\-\-pgdata <pgdata>]
pg_autoctl config get [\-\-pgdata <pgdata>] section.option
pg_autoctl config set [\-\-pgdata <pgdata>] section.option value
.ft P
.fi
.UNINDENT
.UNINDENT
.sp
The \fB[postgresql]\fP section is discovered automatically by the \fBpg_autoctl\fP
command and is not intended to be changed manually.
.sp
\fBpg_autoctl.monitor\fP
.sp
PostgreSQL service URL of the pg_auto_failover monitor, as given in the output of
the \fBpg_autoctl show uri\fP command.
.sp
\fBpg_autoctl.formation\fP
.sp
A single pg_auto_failover monitor may handle several PostgreSQL formations. The default
formation name \fIdefault\fP is usually fine.
.sp
\fBpg_autoctl.group\fP
.sp
This information is retrieved by the pg_auto_failover keeper when registering a node
with the monitor, and should not be changed afterwards. Use at your own risk.
.sp
\fBpg_autoctl.hostname\fP
.sp
Node \fIhostname\fP used by all the other nodes in the cluster to contact this
node. In particular, if this node is a primary then its standby uses that
address to set up streaming replication.
.sp
\fBreplication.slot\fP
.sp
Name of the PostgreSQL replication slot used in the streaming replication
setup automatically deployed by pg_auto_failover. Replication slots can\(aqt be renamed
in PostgreSQL.
.sp
\fBreplication.maximum_backup_rate\fP
.sp
When pg_auto_failover (re\-)builds a standby node using the \fBpg_basebackup\fP
command, this parameter is given to \fBpg_basebackup\fP to throttle the
network bandwidth used. Defaults to \fB100M\fP, which \fBpg_basebackup\fP
reads as 100 megabytes per second.
.sp
\fBreplication.backup_directory\fP
.sp
When pg_auto_failover (re\-)builds a standby node using the \fBpg_basebackup\fP
command, this parameter is the target directory that receives the copy of
the data from the primary server. When the copy succeeds, the directory is
renamed to \fBpostgresql.pgdata\fP\&.
.sp
The default value is computed from \fB${PGDATA}/../backup/${hostname}\fP and
can be set to any value of your preference. Remember that the directory
renaming is an atomic operation only when both the source and the target of
the copy are on the same filesystem, at least on Unix systems.
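.sp
The default path and the same\-filesystem condition can be checked with a
short Python sketch. It assumes a POSIX system, and both function names are
hypothetical. Note that \fBos.path.normpath\fP simplifies the path textually
and does not resolve symbolic links.
.INDENT 0.0
.INDENT 3.5
.sp
.nf
.ft C
```python
import os
import tempfile


def default_backup_directory(pgdata, hostname):
    """Expand ${PGDATA}/../backup/${hostname} into a normalized path."""
    return os.path.normpath(os.path.join(pgdata, "..", "backup", hostname))


def on_same_filesystem(path_a, path_b):
    """Compare device numbers of two existing paths."""
    return os.stat(path_a).st_dev == os.stat(path_b).st_dev


print(default_backup_directory("/data/pgsql/", "node1.db"))

# Two directories created side by side normally share a filesystem.
with tempfile.TemporaryDirectory() as parent:
    pgdata = os.path.join(parent, "pgsql")
    backup = os.path.join(parent, "backup")
    os.mkdir(pgdata)
    os.mkdir(backup)
    print(on_same_filesystem(pgdata, backup))
```
.ft P
.fi
.UNINDENT
.UNINDENT
.sp
With the values of the example configuration file, the computed default is
the \fBbackup_directory\fP shown there.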
.sp
\fBtimeout\fP
.sp
This section configures how the pg_auto_failover keeper behaves in the
failure scenarios described below.
.sp
\fBtimeout.network_partition_timeout\fP
.sp
Time in seconds after which a failure to communicate with the other
nodes is considered to indicate a network partition. This check is only done
on a PRIMARY server, so the other nodes are both the monitor and the standby.
.sp
When a PRIMARY node is detected to be on the losing side of a network
partition, the pg_auto_failover keeper enters the DEMOTE state and stops the
PostgreSQL instance in order to protect against split brain situations.
.sp
The default is 20s.
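.sp
The check can be pictured with the following Python sketch. It is an
assumption\-laden illustration and not the keeper implementation: the function
name, its arguments and the exact comparison are hypothetical, and it treats
the primary as partitioned only when neither the monitor nor the standby has
been reached within the timeout.
.INDENT 0.0
.INDENT 3.5
.sp
.nf
.ft C
```python
NETWORK_PARTITION_TIMEOUT_S = 20


def primary_should_demote(now_s, last_monitor_contact_s,
                          last_standby_contact_s,
                          timeout_s=NETWORK_PARTITION_TIMEOUT_S):
    """Return True when neither the monitor nor the standby was reachable
    within the last timeout_s seconds (hypothetical helper)."""
    last_contact_s = max(last_monitor_contact_s, last_standby_contact_s)
    return now_s >= last_contact_s + timeout_s


# Both peers silent for at least 25 seconds.
print(primary_should_demote(100, 70, 75))
# The standby answered 5 seconds ago.
print(primary_should_demote(100, 70, 95))
```
.ft P
.fi
.UNINDENT
.UNINDENT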
.sp
\fBtimeout.postgresql_restart_failure_timeout\fP
.sp
\fBtimeout.postgresql_restart_failure_max_retries\fP
.sp
When PostgreSQL is not running, the first thing the pg_auto_failover keeper does is
try to restart it. In case of a transient failure (e.g. file system is full,
or other dynamic OS resource constraint), the best course of action is to
try again for a little while before reaching out to the monitor and asking for
a failover.
.sp
The pg_auto_failover keeper tries to restart PostgreSQL
\fBtimeout.postgresql_restart_failure_max_retries\fP times in a row
(default 3) or up to \fBtimeout.postgresql_restart_failure_timeout\fP
(default 20s) since it detected that PostgreSQL is not running, whichever
comes first.
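.sp
The retry policy can be sketched in Python as follows. This is a minimal
model under stated assumptions, not the keeper code: \fBrestart_with_retries\fP
and its arguments are hypothetical, the one second pause between attempts is
invented for the example, and the timeout is counted from the call instead of
from the moment the failure was detected.
.INDENT 0.0
.INDENT 3.5
.sp
.nf
.ft C
```python
import time


def restart_with_retries(try_restart, max_retries=3, timeout_s=20,
                         clock=time.monotonic, sleep=time.sleep):
    """Call try_restart until it succeeds, or until max_retries attempts or
    timeout_s seconds have been used, whichever comes first.

    Returns True when PostgreSQL came back, False when the caller should
    report the failure to the monitor."""
    deadline = clock() + timeout_s
    attempts = 0
    while attempts < max_retries and clock() < deadline:
        attempts += 1
        if try_restart():
            return True
        sleep(1)  # assumed pause between attempts
    return False


# Simulated restart that succeeds on the third attempt.
outcomes = iter([False, False, True])
print(restart_with_retries(lambda: next(outcomes),
                           sleep=lambda seconds: None))
```
.ft P
.fi
.UNINDENT
.UNINDENT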
.SH AUTHOR
Microsoft
.SH COPYRIGHT
Copyright (c) Microsoft Corporation. All rights reserved.
.\" Generated by docutils manpage writer.
.