.\" -*- mode: troff; coding: utf-8 -*-
.\" Automatically generated by Pod::Man 5.0102 (Pod::Simple 3.45)
.\"
.\" Standard preamble:
.\" ========================================================================
.de Sp \" Vertical space (when we can't use .PP)
.if t .sp .5v
.if n .sp
..
.de Vb \" Begin verbatim text
.ft CW
.nf
.ne \\$1
..
.de Ve \" End verbatim text
.ft R
.fi
..
.\" \*(C` and \*(C' are quotes in nroff, nothing in troff, for use with C<>.
.ie n \{\
.    ds C` ""
.    ds C' ""
'br\}
.el\{\
.    ds C`
.    ds C'
'br\}
.\"
.\" Escape single quotes in literal strings from groff's Unicode transform.
.ie \n(.g .ds Aq \(aq
.el       .ds Aq '
.\"
.\" If the F register is >0, we'll generate index entries on stderr for
.\" titles (.TH), headers (.SH), subsections (.SS), items (.Ip), and index
.\" entries marked with X<> in POD. Of course, you'll have to process the
.\" output yourself in some meaningful fashion.
.\"
.\" Avoid warning from groff about undefined register 'F'.
.de IX
..
.nr rF 0
.if \n(.g .if rF .nr rF 1
.if (\n(rF:(\n(.g==0)) \{\
.    if \nF \{\
.        de IX
.        tm Index:\\$1\t\\n%\t"\\$2"
..
.        if !\nF==2 \{\
.            nr % 0
.            nr F 2
.        \}
.    \}
.\}
.rr rF
.\" ========================================================================
.\"
.IX Title "PT-REPLICA-RESTART 1"
.TH PT-REPLICA-RESTART 1 2025-01-01 "perl v5.40.0" "User Contributed Perl Documentation"
.\" For nroff, turn off justification. Always turn off hyphenation; it makes
.\" way too many mistakes in technical documents.
.if n .ad l
.nh
.SH NAME
pt\-replica\-restart \- Watch and restart MySQL replication after errors.
.SH SYNOPSIS
.IX Header "SYNOPSIS"
Usage: pt-replica-restart [OPTIONS] [DSN]
.PP
pt-replica-restart watches one or more MySQL replication replicas for errors,
and tries to restart replication if it stops.
.SH RISKS
.IX Header "RISKS"
Percona Toolkit is mature, proven in the real world, and well tested,
but all database tools can pose a risk to the system and the database
server.
Before using this tool, please:
.IP \(bu 4
Read the tool's documentation
.IP \(bu 4
Review the tool's known "BUGS"
.IP \(bu 4
Test the tool on a non-production server
.IP \(bu 4
Back up your production server and verify the backups
.SH DESCRIPTION
.IX Header "DESCRIPTION"
pt-replica-restart watches one or more MySQL replication replicas and tries
to skip statements that cause errors. It polls replicas intelligently with an
exponentially varying sleep time. You can specify errors to skip and run the
replicas until a certain binlog position.
.PP
Although this tool can help a replica advance past errors, you should not
rely on it to "fix" replication. If replica errors occur frequently or
unexpectedly, you should identify and fix the root cause.
.SH OUTPUT
.IX Header "OUTPUT"
pt-replica-restart prints a line every time it sees the replica has an error.
By default this line is: a timestamp, connection information, relay_log_file,
relay_log_pos, and last_errno.
You can add more information using the "\-\-verbose" option.
You can suppress all output using the "\-\-quiet" option.
.SH SLEEP
.IX Header "SLEEP"
pt-replica-restart sleeps intelligently between polling the replica. The
sleep time adapts to what each check finds:
.IP \(bu 4
The initial sleep time is given by "\-\-sleep".
.IP \(bu 4
If it checks and finds an error, it halves the previous sleep time.
.IP \(bu 4
If it finds no error, it doubles the previous sleep time.
.IP \(bu 4
The sleep time is bounded below by "\-\-min\-sleep" and above by
"\-\-max\-sleep".
.IP \(bu 4
Immediately after finding an error, pt-replica-restart assumes another error
is very likely to happen next, so it sleeps the current sleep time or the
initial sleep time, whichever is less.
.SH "GLOBAL TRANSACTION IDS"
.IX Header "GLOBAL TRANSACTION IDS"
As of Percona Toolkit 2.2.8, pt-replica-restart supports Global Transaction IDs
introduced in MySQL 5.6.5.
It's important to keep in mind that:
.IP \(bu 4
pt-replica-restart will not skip transactions when multiple replication threads
are being used (replica_parallel_workers > 0). In that case it cannot
determine the GTID of the transaction that failed on a specific replica
thread.
.IP \(bu 4
The default behavior is to skip the next transaction from the replica's source.
Writes can originate on different servers, each with its own UUID.
.Sp
See "\-\-source\-uuid".
.SH "EXIT STATUS"
.IX Header "EXIT STATUS"
An exit status of 0 (sometimes also called a return value or return code)
indicates success. Any other value represents the exit status of the Perl
process itself, or of the last forked process that exited if there were multiple
servers to monitor.
.SH COMPATIBILITY
.IX Header "COMPATIBILITY"
pt-replica-restart should work on many versions of MySQL. Lettercase of many
output columns from SHOW REPLICA STATUS has changed over time, so it treats them
all as lowercase.
.SH OPTIONS
.IX Header "OPTIONS"
This tool accepts additional command-line arguments. Refer to the
"SYNOPSIS" and usage information for details.
.IP \-\-always 4
.IX Item "--always"
Start replicas even when there is no error. With this option enabled,
pt-replica-restart will not let you stop the replica manually, even if you
want to!
.IP \-\-ask\-pass 4
.IX Item "--ask-pass"
Prompt for a password when connecting to MySQL.
.IP \-\-charset 4
.IX Item "--charset"
short form: \-A; type: string
.Sp
Default character set. If the value is utf8, sets Perl's binmode on
STDOUT to utf8, passes the mysql_enable_utf8 option to DBD::mysql, and
runs SET NAMES UTF8 after connecting to MySQL. Any other value sets
binmode on STDOUT without the utf8 layer, and runs SET NAMES after
connecting to MySQL.
.IP \-\-[no]check\-relay\-log 4
.IX Item "--[no]check-relay-log"
default: yes
.Sp
Check the last relay log file and position before checking for replica errors.
.Sp
By default pt-replica-restart will not do anything (it will just sleep)
if neither the relay log file nor the relay log position has changed since
the last check. This prevents infinite loops (i.e. restarting the same
error in the same relay log file at the same relay log position).
.Sp
For certain replica errors, however, this check needs to be disabled by
specifying \f(CW\*(C`\-\-no\-check\-relay\-log\*(C'\fR. Do not do this unless you know what
you are doing!
.IP \-\-config 4
.IX Item "--config"
type: Array
.Sp
Read this comma-separated list of config files; if specified, this must be the
first option on the command line.
.IP \-\-daemonize 4
.IX Item "--daemonize"
Fork to the background and detach from the shell. POSIX
operating systems only.
.IP \-\-database 4
.IX Item "--database"
short form: \-D; type: string
.Sp
Database to use.
.IP \-\-defaults\-file 4
.IX Item "--defaults-file"
short form: \-F; type: string
.Sp
Only read mysql options from the given file. You must give an absolute
pathname.
.IP \-\-error\-length 4
.IX Item "--error-length"
type: int
.Sp
Max length of error message to print. When "\-\-verbose" is set high enough to
print the error, this option will truncate the error text to the specified
length. This can be useful to prevent wrapping on the terminal.
.IP \-\-error\-numbers 4
.IX Item "--error-numbers"
type: hash
.Sp
Only restart this comma-separated list of errors. Makes pt-replica-restart only
try to restart if the error number is in this comma-separated list of errors.
If it sees an error not in the list, it will exit.
.Sp
The error number is in the \f(CW\*(C`last_errno\*(C'\fR column of \f(CW\*(C`SHOW REPLICA STATUS\*(C'\fR.
.IP \-\-error\-text 4
.IX Item "--error-text"
type: string
.Sp
Only restart errors that match this pattern. A Perl regular expression against
which the error text, if any, is matched. If the error text exists and matches,
pt-replica-restart will try to restart the replica.
If it exists but doesn't match, pt-replica-restart will exit.
.Sp
The error text is in the \f(CW\*(C`last_error\*(C'\fR column of \f(CW\*(C`SHOW REPLICA STATUS\*(C'\fR.
.IP \-\-help 4
.IX Item "--help"
Show help and exit.
.IP \-\-host 4
.IX Item "--host"
short form: \-h; type: string
.Sp
Connect to host.
.IP \-\-log 4
.IX Item "--log"
type: string
.Sp
Print all output to this file when daemonized.
.IP \-\-master\-uuid 4
.IX Item "--master-uuid"
type: string
.Sp
This option is deprecated and will be removed in future releases.
Use "\-\-source\-uuid" instead.
.IP \-\-max\-sleep 4
.IX Item "--max-sleep"
type: float; default: 64
.Sp
Maximum sleep seconds.
.Sp
The maximum time pt-replica-restart will sleep before polling the replica again.
This is also the time that pt-replica-restart will wait for all other running
instances to quit if both "\-\-stop" and "\-\-monitor" are specified.
.Sp
See "SLEEP".
.IP \-\-min\-sleep 4
.IX Item "--min-sleep"
type: float; default: 0.015625
.Sp
The minimum time pt-replica-restart will sleep before polling the replica again.
See "SLEEP".
.IP \-\-monitor 4
.IX Item "--monitor"
Whether to monitor the replica (default). Unless you specify \-\-monitor
explicitly, "\-\-stop" will disable it.
.IP \-\-password 4
.IX Item "--password"
short form: \-p; type: string
.Sp
Password to use when connecting.
If password contains commas they must be escaped with a backslash: "exam\e,ple"
.IP \-\-pid 4
.IX Item "--pid"
type: string
.Sp
Create the given PID file. The tool won't start if the PID file already
exists and the PID it contains is different than the current PID. However,
if the PID file exists and the PID it contains is no longer running, the
tool will overwrite the PID file with the current PID. The PID file is
removed automatically when the tool exits.
.IP \-\-port 4
.IX Item "--port"
short form: \-P; type: int
.Sp
Port number to use for connection.
.IP \-\-quiet 4
.IX Item "--quiet"
short form: \-q
.Sp
Suppresses normal output (disables "\-\-verbose").
.IP \-\-recurse 4
.IX Item "--recurse"
type: int; default: 0
.Sp
Watch replicas of the specified server, up to the specified number of servers
deep in the hierarchy. The default depth of 0 means "just watch the replica
specified."
.Sp
pt-replica-restart examines \f(CW\*(C`SHOW PROCESSLIST\*(C'\fR and tries to determine which
connections are from replicas, then connects to them. See "\-\-recursion\-method".
.Sp
Recursion works by finding all replicas when the program starts, then watching
them. If there is more than one replica, \f(CW\*(C`pt\-replica\-restart\*(C'\fR uses \f(CWfork()\fR to
monitor them.
.Sp
This also works if you have configured your replicas to show up in
\f(CW\*(C`SHOW REPLICAS\*(C'\fR. The minimal configuration for this is the \f(CW\*(C`report_host\*(C'\fR
parameter, but there are other "report" parameters as well for the port,
username, and password.
.IP \-\-recursion\-method 4
.IX Item "--recursion-method"
type: array; default: processlist,hosts
.Sp
Preferred recursion method used to find replicas.
.Sp
Possible methods are:
.Sp
.Vb 5
\&  METHOD       USES
\&  ===========  ==================
\&  processlist  SHOW PROCESSLIST
\&  hosts        SHOW REPLICAS (SHOW SLAVE HOSTS before MySQL 8.1)
\&  none         Do not find replicas
.Ve
.Sp
The processlist method is preferred because SHOW REPLICAS is not reliable.
However, the hosts method is required if the server uses a non-standard
port (not 3306). Usually pt-replica-restart does the right thing and finds
the replicas, but you may give a preferred method and it will be used first.
If it doesn't find any replicas, the other methods will be tried.
.IP \-\-run\-time 4
.IX Item "--run-time"
type: time
.Sp
Time to run before exiting. Causes pt-replica-restart to stop after the specified
time has elapsed. Optional suffix: s=seconds, m=minutes, h=hours, d=days; if no
suffix, s is used.
.IP \-\-sentinel 4
.IX Item "--sentinel"
type: string; default: /tmp/pt\-replica\-restart\-sentinel
.Sp
Exit if this file exists.
.IP \-\-slave\-user 4
.IX Item "--slave-user"
type: string
.Sp
This option is deprecated and will be removed in future releases.
Use "\-\-replica\-user" instead.
.IP \-\-slave\-password 4
.IX Item "--slave-password"
type: string
.Sp
This option is deprecated and will be removed in future releases.
Use "\-\-replica\-password" instead.
.IP \-\-replica\-user 4
.IX Item "--replica-user"
type: string
.Sp
Sets the user to be used to connect to the replicas.
This parameter allows you to have a different user with fewer privileges on the
replicas, but that user must exist on all replicas.
.IP \-\-replica\-password 4
.IX Item "--replica-password"
type: string
.Sp
Sets the password to be used to connect to the replicas.
It can be used with \-\-replica\-user and the password for the user must be the same
on all replicas.
.IP \-\-set\-vars 4
.IX Item "--set-vars"
type: Array
.Sp
Set the MySQL variables in this comma-separated list of \f(CW\*(C`variable=value\*(C'\fR pairs.
.Sp
By default, the tool sets:
.Sp
.Vb 1
\&    wait_timeout=10000
.Ve
.Sp
Variables specified on the command line override these defaults. For
example, specifying \f(CW\*(C`\-\-set\-vars wait_timeout=500\*(C'\fR overrides the default
value of \f(CW10000\fR.
.Sp
The tool prints a warning and continues if a variable cannot be set.
.IP \-\-skip\-count 4
.IX Item "--skip-count"
type: int; default: 1
.Sp
Number of statements to skip when restarting the replica.
.IP \-\-source\-uuid 4
.IX Item "--source-uuid"
type: string
.Sp
When using GTID, an empty transaction should be created in order to skip it.
If writes are coming from different nodes in the replication tree above, it is
not possible to know which event from which UUID to skip.
.Sp
By default, transactions from the replica's source (\f(CW\*(AqSource_UUID\*(Aq\fR from
\&\f(CW\*(C`SHOW REPLICA STATUS\*(C'\fR) are skipped.
.Sp
For example, with
.Sp
.Vb 1
\&  source1 \-> replica1 \-> replica2
.Ve
.Sp
When skipping events on replica2 that were written to source1, you must specify
the UUID of source1, else the tool will use the UUID of replica1 by default.
.Sp
See "GLOBAL TRANSACTION IDS".
.IP \-\-sleep 4
.IX Item "--sleep"
type: int; default: 1
.Sp
Initial sleep seconds between checking the replica.
.Sp
See "SLEEP".
.IP \-\-socket 4
.IX Item "--socket"
short form: \-S; type: string
.Sp
Socket file to use for connection.
.IP \-\-stop 4
.IX Item "--stop"
Stop running instances by creating the sentinel file.
.Sp
Causes \f(CW\*(C`pt\-replica\-restart\*(C'\fR to create the sentinel file specified by
"\-\-sentinel". This should have the effect of stopping all running
instances which are watching the same sentinel file. If "\-\-monitor" isn't
specified, \f(CW\*(C`pt\-replica\-restart\*(C'\fR will exit after creating the file. If it is
specified, \f(CW\*(C`pt\-replica\-restart\*(C'\fR will wait the interval given by
"\-\-max\-sleep", then remove the file and continue working.
.Sp
You might find this handy to stop cron jobs gracefully if necessary, or to
replace one running instance with another. For example, if you want to stop
and restart \f(CW\*(C`pt\-replica\-restart\*(C'\fR every hour (just to make sure that it is
restarted every hour, in case of a server crash or some other problem), you
could use a \f(CW\*(C`crontab\*(C'\fR line like this:
.Sp
.Vb 1
\& 0 * * * * pt\-replica\-restart \-\-monitor \-\-stop \-\-sentinel /tmp/pt\-replica\-restartup
.Ve
.Sp
The non-default "\-\-sentinel" will make sure the hourly \f(CW\*(C`cron\*(C'\fR job stops
only instances previously started with the same options (that is, from the
same \f(CW\*(C`cron\*(C'\fR job).
.Sp
See also "\-\-sentinel".
.IP \-\-until\-master 4
.IX Item "--until-master"
type: string
.Sp
This option is deprecated and will be removed in future releases.
Use "\-\-until\-source" instead.
.IP \-\-until\-source 4
.IX Item "--until-source"
type: string
.Sp
Run until this source binary log file and position. Start the replica, and
retry if it fails, until it reaches the given replication coordinates. The
coordinates are the logfile and position on the source, given by
relay_source_log_file, exec_source_log_pos. The argument must be in the format
"file,pos". Separate the filename and position with a single comma and no
space.
.Sp
This will also cause an UNTIL clause to be given to START REPLICA.
.Sp
After reaching this point, the replica should be stopped and pt-replica-restart
will exit.
.IP \-\-until\-relay 4
.IX Item "--until-relay"
type: string
.Sp
Run until this relay log file and position. Like "\-\-until\-source", but in
the replica's relay logs instead. The coordinates are given by relay_log_file,
relay_log_pos.
.IP \-\-user 4
.IX Item "--user"
short form: \-u; type: string
.Sp
User for login if not current user.
.IP \-\-verbose 4
.IX Item "--verbose"
short form: \-v; cumulative: yes; default: 1
.Sp
Adds more information to the output. This flag can be specified multiple
times, e.g. \-v \-v or \-vv. By default (no verbose flag) the tool outputs
connection information, a timestamp, relay_log_file, relay_log_pos,
and last_errno.
One flag (\-v) adds last_error. See also "\-\-error\-length".
Two flags (\-vv) print the current sleep time each time pt-replica-restart
sleeps.
To suppress all output use the "\-\-quiet" option.
.IP \-\-version 4
.IX Item "--version"
Show version and exit.
.IP \-\-[no]version\-check 4
.IX Item "--[no]version-check"
default: yes
.Sp
Check for the latest version of Percona Toolkit, MySQL, and other programs.
.Sp
This is a standard "check for updates automatically" feature, with two
additional features. First, the tool checks its own version and also the
versions of the following software: operating system, Percona Monitoring and
Management (PMM), MySQL, Perl, MySQL driver for Perl (DBD::mysql), and
Percona Toolkit.
Second, it checks for and warns about versions with known problems.
For example, MySQL 5.5.25 had a critical bug and was re-released as 5.5.25a.
.Sp
The tool makes a secure connection to Percona’s Version Check database
server to perform these checks. Each request is logged by the server,
including software version numbers and unique ID of the checked system.
The ID is generated by the Percona Toolkit installation script or when the
Version Check database call is done for the first time.
.Sp
Any updates or known problems are printed to STDOUT before the tool's normal
output. This feature should never interfere with the normal operation of the
tool.
.Sp
For more information, visit .
.SH "DSN OPTIONS"
.IX Header "DSN OPTIONS"
These DSN options are used to create a DSN. Each option is given like
\&\f(CW\*(C`option=value\*(C'\fR. The options are case-sensitive, so P and p are not the
same option. There cannot be whitespace before or after the \f(CW\*(C`=\*(C'\fR and
if the value contains whitespace it must be quoted. DSN options are
comma-separated. See the percona-toolkit manpage for full details.
.IP \(bu 4
A
.Sp
dsn: charset; copy: yes
.Sp
Default character set.
.IP \(bu 4
D
.Sp
dsn: database; copy: yes
.Sp
Default database.
.IP \(bu 4
F
.Sp
dsn: mysql_read_default_file; copy: yes
.Sp
Only read default options from the given file
.IP \(bu 4
h
.Sp
dsn: host; copy: yes
.Sp
Connect to host.
.IP \(bu 4
p
.Sp
dsn: password; copy: yes
.Sp
Password to use when connecting.
If password contains commas they must be escaped with a backslash: "exam\e,ple"
.IP \(bu 4
P
.Sp
dsn: port; copy: yes
.Sp
Port number to use for connection.
.IP \(bu 4
S
.Sp
dsn: mysql_socket; copy: yes
.Sp
Socket file to use for connection.
.IP \(bu 4
u
.Sp
dsn: user; copy: yes
.Sp
User for login if not current user.
.IP \(bu 4
s
.Sp
dsn: mysql_ssl; copy: yes
.Sp
Create SSL connection
.SH ENVIRONMENT
.IX Header "ENVIRONMENT"
The environment variable \f(CW\*(C`PTDEBUG\*(C'\fR enables verbose debugging output to STDERR.
To enable debugging and capture all output to a file, run the tool like:
.PP
.Vb 1
\&   PTDEBUG=1 pt\-replica\-restart ... > FILE 2>&1
.Ve
.PP
Be careful: debugging output is voluminous and can generate several megabytes
of output.
.SH ATTENTION
.IX Header "ATTENTION"
Using \f(CW\*(C`PTDEBUG\*(C'\fR might expose passwords. When debug is enabled, all command
line parameters are shown in the output.
.SH "SYSTEM REQUIREMENTS"
.IX Header "SYSTEM REQUIREMENTS"
You need Perl, DBI, DBD::mysql, and some core packages that ought to be
installed in any reasonably new version of Perl.
.SH BUGS
.IX Header "BUGS"
For a list of known bugs, see .
.PP
Please report bugs at .
Include the following information in your bug report:
.IP \(bu 4
Complete command-line used to run the tool
.IP \(bu 4
Tool "\-\-version"
.IP \(bu 4
MySQL version of all servers involved
.IP \(bu 4
Output from the tool including STDERR
.IP \(bu 4
Input files (log/dump/config files, etc.)
.PP
If possible, include debugging output by running the tool with \f(CW\*(C`PTDEBUG\*(C'\fR;
see "ENVIRONMENT".
.SH DOWNLOADING
.IX Header "DOWNLOADING"
Visit to download the
latest release of Percona Toolkit. Or, get the latest release from the
command line:
.PP
.Vb 1
\&   wget percona.com/get/percona\-toolkit.tar.gz
\&
\&   wget percona.com/get/percona\-toolkit.rpm
\&
\&   wget percona.com/get/percona\-toolkit.deb
.Ve
.PP
You can also get individual tools from the latest release:
.PP
.Vb 1
\&   wget percona.com/get/TOOL
.Ve
.PP
Replace \f(CW\*(C`TOOL\*(C'\fR with the name of any tool.
.SH AUTHORS
.IX Header "AUTHORS"
Baron Schwartz
.SH "ABOUT PERCONA TOOLKIT"
.IX Header "ABOUT PERCONA TOOLKIT"
This tool is part of Percona Toolkit, a collection of advanced command-line
tools for MySQL developed by Percona.
Percona Toolkit was forked from two
projects in June, 2011: Maatkit and Aspersa. Those projects were created by
Baron Schwartz and primarily developed by him and Daniel Nichter. Visit
to learn about other free, open-source
software from Percona.
.SH "COPYRIGHT, LICENSE, AND WARRANTY"
.IX Header "COPYRIGHT, LICENSE, AND WARRANTY"
This program is copyright 2011\-2024 Percona LLC and/or its affiliates,
2007\-2011 Baron Schwartz.
.PP
THIS PROGRAM IS PROVIDED "AS IS" AND WITHOUT ANY EXPRESS OR IMPLIED
WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF
MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.
.PP
This program is free software; you can redistribute it and/or modify it under
the terms of the GNU General Public License as published by the Free Software
Foundation, version 2; OR the Perl Artistic License. On UNIX and similar
systems, you can issue `man perlgpl' or `man perlartistic' to read these
licenses.
.PP
You should have received a copy of the GNU General Public License along with
this program; if not, write to the Free Software Foundation, Inc., 59 Temple
Place, Suite 330, Boston, MA 02111\-1307 USA.
.SH VERSION
.IX Header "VERSION"
pt-replica-restart 3.7.0