.\" -*- mode: troff; coding: utf-8 -*-
.\" Automatically generated by Pod::Man 5.0102 (Pod::Simple 3.45)
.\"
.\" Standard preamble:
.\" ========================================================================
.de Sp \" Vertical space (when we can't use .PP)
.if t .sp .5v
.if n .sp
..
.de Vb \" Begin verbatim text
.ft CW
.nf
.ne \\$1
..
.de Ve \" End verbatim text
.ft R
.fi
..
.\" \*(C` and \*(C' are quotes in nroff, nothing in troff, for use with C<>.
.ie n \{\
.    ds C` ""
.    ds C' ""
'br\}
.el\{\
.    ds C`
.    ds C'
'br\}
.\"
.\" Escape single quotes in literal strings from groff's Unicode transform.
.ie \n(.g .ds Aq \(aq
.el       .ds Aq '
.\"
.\" If the F register is >0, we'll generate index entries on stderr for
.\" titles (.TH), headers (.SH), subsections (.SS), items (.Ip), and index
.\" entries marked with X<> in POD. Of course, you'll have to process the
.\" output yourself in some meaningful fashion.
.\"
.\" Avoid warning from groff about undefined register 'F'.
.de IX
..
.nr rF 0
.if \n(.g .if rF .nr rF 1
.if (\n(rF:(\n(.g==0)) \{\
.    if \nF \{\
.        de IX
.        tm Index:\\$1\t\\n%\t"\\$2"
..
.        if !\nF==2 \{\
.            nr % 0
.            nr F 2
.        \}
.    \}
.\}
.rr rF
.\" ========================================================================
.\"
.IX Title "PT-FINGERPRINT 1"
.TH PT-FINGERPRINT 1 2025-01-01 "perl v5.40.0" "User Contributed Perl Documentation"
.\" For nroff, turn off justification. Always turn off hyphenation; it makes
.\" way too many mistakes in technical documents.
.if n .ad l
.nh
.SH NAME
pt\-fingerprint \- Convert queries into fingerprints.
.SH SYNOPSIS
.IX Header "SYNOPSIS"
Usage: pt-fingerprint [OPTIONS] [FILES]
.PP
pt-fingerprint converts queries into fingerprints. With the \-\-query option,
it converts the option's value into a fingerprint. With no options, it treats
command-line arguments as FILEs and reads and converts semicolon-separated
queries from the FILEs. When FILE is \-, it reads standard input.
.PP
Convert a single query:
.PP
.Vb 1
\& pt\-fingerprint \-\-query "select a, b, c from users where id = 500"
.Ve
.PP
Convert a file full of queries:
.PP
.Vb 1
\& pt\-fingerprint /path/to/file.txt
.Ve
.SH RISKS
.IX Header "RISKS"
Percona Toolkit is mature, proven in the real world, and well tested,
but all database tools can pose a risk to the system and the database
server. Before using this tool, please:
.IP \(bu 4
Read the tool's documentation
.IP \(bu 4
Review the tool's known "BUGS"
.IP \(bu 4
Test the tool on a non-production server
.IP \(bu 4
Back up your production server and verify the backups
.SH DESCRIPTION
.IX Header "DESCRIPTION"
A query fingerprint is the abstracted form of a query, which makes it
possible to group similar queries together. Abstracting a query removes
literal values, normalizes whitespace, and so on. For example, consider
these two queries:
.PP
.Vb 3
\& SELECT name, password FROM user WHERE id=\*(Aq12823\*(Aq;
\& select name, password from user
\& where id=5;
.Ve
.PP
Both of those queries will fingerprint to
.PP
.Vb 1
\& select name, password from user where id=?
.Ve
.PP
Once the query's fingerprint is known, we can then talk about a query as
though it represents all similar queries.
.PP
Query fingerprinting accommodates a great many special cases, which have
proven necessary in the real world. For example, an IN list with 5 literals
is really equivalent to one with 4 literals, so lists of literals are
collapsed to a single one. If you want to understand more about how and why
all of these cases are handled, please review the test cases in the GitHub
repository. If you find something that is not fingerprinted properly, please
submit a bug report with a reproducible test case. The following list of
transformations applied during fingerprinting might not be exhaustive:
.IP \(bu 4
Group all SELECT queries from mysqldump together, even if they are against
different tables. Ditto for all of pt-table-checksum's checksum queries.
.IP \(bu 4
Shorten multi-value INSERT statements to a single \fBVALUES()\fR list.
.IP \(bu 4
Strip comments.
.IP \(bu 4
Abstract the databases in USE statements, so all USE statements are grouped
together.
.IP \(bu 4
Replace all literals, such as quoted strings. For efficiency, the code that
replaces literal numbers is somewhat non-selective, and might replace some
things as numbers when they really are not. Hexadecimal literals are also
replaced. NULL is treated as a literal. Numbers embedded in identifiers are
also replaced, so tables named similarly will be fingerprinted to the same
values (e.g. users_2009 and users_2010 will fingerprint identically).
.IP \(bu 4
Collapse all whitespace into a single space.
.IP \(bu 4
Lowercase the entire query.
.IP \(bu 4
Replace all literals inside of \fBIN()\fR and \fBVALUES()\fR lists with a
single placeholder, regardless of cardinality.
.IP \(bu 4
Collapse multiple identical UNION queries into a single one.
.SH OPTIONS
.IX Header "OPTIONS"
This tool accepts additional command-line arguments. Refer to the
"SYNOPSIS" and usage information for details.
.IP \-\-config 4
.IX Item "--config"
type: Array
.Sp
Read this comma-separated list of config files; if specified, this must be
the first option on the command line.
.IP \-\-help 4
.IX Item "--help"
Show help and exit.
.IP \-\-match\-embedded\-numbers 4
.IX Item "--match-embedded-numbers"
Match numbers embedded in words and replace as single values. This option
causes the tool to be more careful about matching numbers so that words
with numbers, like \f(CW\*(C`catch22\*(C'\fR are matched and replaced as a
single \f(CW\*(C`?\*(C'\fR placeholder. Otherwise the default number matching
pattern will replace
\&\f(CW\*(C`catch22\*(C'\fR as \f(CW\*(C`catch?\*(C'\fR.
.Sp
This is helpful if database or table names contain numbers.
.IP \-\-match\-md5\-checksums 4
.IX Item "--match-md5-checksums"
Match MD5 checksums and replace as single values.
This option causes the tool to be more careful about matching numbers so
that MD5 checksums like \f(CW\*(C`fbc5e685a5d3d45aa1d0347fdb7c4d35\*(C'\fR are
matched and replaced as a single \f(CW\*(C`?\*(C'\fR placeholder. Otherwise, the
default number matching pattern will replace
\&\f(CW\*(C`fbc5e685a5d3d45aa1d0347fdb7c4d35\*(C'\fR as \f(CW\*(C`fbc?\*(C'\fR.
.IP \-\-query 4
.IX Item "--query"
type: string
.Sp
The query to convert into a fingerprint.
.IP \-\-version 4
.IX Item "--version"
Show version and exit.
.SH ENVIRONMENT
.IX Header "ENVIRONMENT"
The environment variable \f(CW\*(C`PTDEBUG\*(C'\fR enables verbose debugging
output to STDERR. To enable debugging and capture all output to a file, run
the tool like:
.PP
.Vb 1
\& PTDEBUG=1 pt\-fingerprint ... > FILE 2>&1
.Ve
.PP
Be careful: debugging output is voluminous and can generate several
megabytes of output.
.SH ATTENTION
.IX Header "ATTENTION"
Using \f(CW\*(C`PTDEBUG\*(C'\fR might expose passwords. When debug is enabled,
all command line parameters are shown in the output.
.SH "SYSTEM REQUIREMENTS"
.IX Header "SYSTEM REQUIREMENTS"
You need Perl, DBI, DBD::mysql, and some core packages that ought to be
installed in any reasonably new version of Perl.
.SH BUGS
.IX Header "BUGS"
For a list of known bugs, see .
.PP
Please report bugs at . Include the following information in your bug
report:
.IP \(bu 4
Complete command line used to run the tool
.IP \(bu 4
Tool "\-\-version"
.IP \(bu 4
MySQL version of all servers involved
.IP \(bu 4
Output from the tool including STDERR
.IP \(bu 4
Input files (log/dump/config files, etc.)
.PP
If possible, include debugging output by running the tool with
\&\f(CW\*(C`PTDEBUG\*(C'\fR; see "ENVIRONMENT".
.SH DOWNLOADING
.IX Header "DOWNLOADING"
Visit to download the latest release of Percona Toolkit.
Or, get the latest release from the command line:
.PP
.Vb 1
\& wget percona.com/get/percona\-toolkit.tar.gz
\&
\& wget percona.com/get/percona\-toolkit.rpm
\&
\& wget percona.com/get/percona\-toolkit.deb
.Ve
.PP
You can also get individual tools from the latest release:
.PP
.Vb 1
\& wget percona.com/get/TOOL
.Ve
.PP
Replace \f(CW\*(C`TOOL\*(C'\fR with the name of any tool.
.SH AUTHORS
.IX Header "AUTHORS"
Baron Schwartz and Daniel Nichter
.SH "ABOUT PERCONA TOOLKIT"
.IX Header "ABOUT PERCONA TOOLKIT"
This tool is part of Percona Toolkit, a collection of advanced command-line
tools for MySQL developed by Percona. Percona Toolkit was forked from two
projects in June, 2011: Maatkit and Aspersa. Those projects were created by
Baron Schwartz and primarily developed by him and Daniel Nichter. Visit
to learn about other free, open-source software from Percona.
.SH "COPYRIGHT, LICENSE, AND WARRANTY"
.IX Header "COPYRIGHT, LICENSE, AND WARRANTY"
This program is copyright 2011\-2024 Percona LLC and/or its affiliates.
.PP
THIS PROGRAM IS PROVIDED "AS IS" AND WITHOUT ANY EXPRESS OR IMPLIED
WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF
MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.
.PP
This program is free software; you can redistribute it and/or modify it
under the terms of the GNU General Public License as published by the Free
Software Foundation, version 2; OR the Perl Artistic License. On UNIX and
similar systems, you can issue `man perlgpl' or `man perlartistic' to read
these licenses.
.PP
You should have received a copy of the GNU General Public License along
with this program; if not, write to the Free Software Foundation, Inc., 59
Temple Place, Suite 330, Boston, MA 02111\-1307 USA.
.SH VERSION
.IX Header "VERSION"
pt-fingerprint 3.7.0