.\" -*- mode: troff; coding: utf-8 -*-
.\" Automatically generated by Pod::Man 5.01 (Pod::Simple 3.43)
.\"
.\" Standard preamble:
.\" ========================================================================
.de Sp \" Vertical space (when we can't use .PP)
.if t .sp .5v
.if n .sp
..
.de Vb \" Begin verbatim text
.ft CW
.nf
.ne \\$1
..
.de Ve \" End verbatim text
.ft R
.fi
..
.\" \*(C` and \*(C' are quotes in nroff, nothing in troff, for use with C<>.
.ie n \{\
.    ds C` ""
.    ds C' ""
'br\}
.el\{\
.    ds C`
.    ds C'
'br\}
.\"
.\" Escape single quotes in literal strings from groff's Unicode transform.
.ie \n(.g .ds Aq \(aq
.el       .ds Aq '
.\"
.\" If the F register is >0, we'll generate index entries on stderr for
.\" titles (.TH), headers (.SH), subsections (.SS), items (.Ip), and index
.\" entries marked with X<> in POD.  Of course, you'll have to process the
.\" output yourself in some meaningful fashion.
.\"
.\" Avoid warning from groff about undefined register 'F'.
.de IX
..
.nr rF 0
.if \n(.g .if rF .nr rF 1
.if (\n(rF:(\n(.g==0)) \{\
.    if \nF \{\
.        de IX
.        tm Index:\\$1\t\\n%\t"\\$2"
..
.        if !\nF==2 \{\
.            nr % 0
.            nr F 2
.        \}
.    \}
.\}
.rr rF
.\" ========================================================================
.\"
.IX Title "XML_GREP 1"
.TH XML_GREP 1 2023-07-25 "perl v5.38.0" "User Contributed Perl Documentation"
.\" For nroff, turn off justification.  Always turn off hyphenation; it makes
.\" way too many mistakes in technical documents.
.if n .ad l
.nh
.SH NAME
xml_grep \- grep XML files looking for specific elements
.SH SYNOPSIS
.IX Header "SYNOPSIS"
.Vb 1
\&  xml_grep [options] <files>
.Ve
.PP
or
.PP
.Vb 1
\&  xml_grep <xpath expression> <files>
.Ve
.PP
By default you can just give \f(CW\*(C`xml_grep\*(C'\fR an XPath expression and
a list of files, and get an XML file with the result.
.PP
This is equivalent to writing
.PP
.Vb 1
\&  xml_grep \-\-group_by_file file \-\-pretty_print indented \-\-cond <xpath expression> <files>
.Ve
.SH OPTIONS
.IX Header "OPTIONS"
.IP \fB\-\-help\fR 4
.IX Item "--help"
brief help message
.IP \fB\-\-man\fR 4
.IX Item "--man"
full documentation
.IP \fB\-\-Version\fR 4
.IX Item "--Version"
display the tool version
.IP "\fB\-\-root\fR <cond>" 4
.IX Item "--root <cond>"
look for and return xml chunks matching <cond>
.Sp
if neither \f(CW\*(C`\-\-root\*(C'\fR nor \f(CW\*(C`\-\-files\*(C'\fR is used then the element(s)
that trigger the \f(CW\*(C`\-\-cond\*(C'\fR option is (are) used. If \f(CW\*(C`\-\-cond\*(C'\fR is not used
then all elements matching the \f(CW\*(C`\-\-root\*(C'\fR condition are returned
.Sp
several \f(CW\*(C`\-\-root\*(C'\fR can be provided
.IP "\fB\-\-cond\fR <cond>" 4
.IX Item "--cond <cond>"
return the chunks (or file names) only if they contain elements matching <cond>
.Sp
several \f(CW\*(C`\-\-cond\*(C'\fR can be provided (in which case they are OR'ed)
.IP \fB\-\-files\fR 4
.IX Item "--files"
return only file names (do not generate an XML output)
.Sp
usage of this option precludes using any of the options that define the XML output:
\&\f(CW\*(C`\-\-root\*(C'\fR, \f(CW\*(C`\-\-encoding\*(C'\fR, \f(CW\*(C`\-\-wrap\*(C'\fR, \f(CW\*(C`\-\-group_by_file\*(C'\fR or \f(CW\*(C`\-\-pretty_print\*(C'\fR
.IP \fB\-\-count\fR 4
.IX Item "--count"
return only the number of matches in each file
.Sp
usage of this option precludes using any of the options that define the XML output:
\&\f(CW\*(C`\-\-root\*(C'\fR, \f(CW\*(C`\-\-encoding\*(C'\fR, \f(CW\*(C`\-\-wrap\*(C'\fR, \f(CW\*(C`\-\-group_by_file\*(C'\fR or \f(CW\*(C`\-\-pretty_print\*(C'\fR
.IP \fB\-\-strict\fR 4
.IX Item "--strict"
without this option parsing errors are reported to STDOUT and the file is skipped
.IP \fB\-\-date\fR 4
.IX Item "--date"
when on (by default) the wrapping element gets a \f(CW\*(C`date\*(C'\fR attribute that gives
the date the tool was run.
.Sp
with \f(CW\*(C`\-\-nodate\*(C'\fR this attribute is not added, which can be useful if you need
to compare 2 runs.
.IP "\fB\-\-encoding\fR <encoding>" 4
.IX Item "--encoding <encoding>"
encoding of the xml output (utf\-8 by default)
.IP "\fB\-\-nb_results\fR <nb>" 4
.IX Item "--nb_results <nb>"
output only <nb> results
.IP \fB\-\-by_file\fR 4
.IX Item "--by_file"
output only <nb> results by file
.IP "\fB\-\-wrap\fR <tag>" 4
.IX Item "--wrap <tag>"
wrap the xml result in the provided tag (defaults to 'xml_grep')
.Sp
If wrap is set to an empty string (\f(CW\*(C`\-\-wrap \*(Aq\*(Aq\*(C'\fR) then the xml result is not wrapped at all.
.IP \fB\-\-nowrap\fR 4
.IX Item "--nowrap"
same as using \f(CW\*(C`\-\-wrap \*(Aq\*(Aq\*(C'\fR: the xml result is not wrapped.
.IP "\fB\-\-descr\fR <string>" 4
.IX Item "--descr <string>"
attributes of the wrap tag (defaults to \f(CW\*(C`version="" date=""\*(C'\fR)
.IP "\fB\-\-group_by_file\fR <optional_tag>" 4
.IX Item "--group_by_file <optional_tag>"
wrap results for each file into a separate element. By default that element
is named \f(CW\*(C`file\*(C'\fR. It has an attribute named \f(CW\*(C`filename\*(C'\fR that gives the name of the
file.
.Sp
the short version of this option is \fB\-g\fR
.IP "\fB\-\-exclude\fR <condition>" 4
.IX Item "--exclude <condition>"
same as using \f(CW\*(C`\-v\*(C'\fR in grep: the elements that match the condition are excluded
from the result, the input file(s) is (are) otherwise unchanged
.Sp
the short form of this option is \fB\-v\fR
.IP "\fB\-\-pretty_print\fR <optional_style>" 4
.IX Item "--pretty_print <optional_style>"
pretty print the output using XML::Twig styles ('\f(CW\*(C`indented\*(C'\fR', '\f(CW\*(C`record\*(C'\fR'
or '\f(CW\*(C`record_c\*(C'\fR' are probably what you are looking for)
.Sp
if the option is used but no style is given then '\f(CW\*(C`indented\*(C'\fR' is used
.Sp
the short form of this option is \fB\-s\fR
.IP \fB\-\-text_only\fR 4
.IX Item "--text_only"
Displays the text of the results, one per line.
.IP \fB\-\-html\fR 4
.IX Item "--html"
Allow HTML input; files are converted using HTML::TreeBuilder
.IP \fB\-\-Tidy\fR 4
.IX Item "--Tidy"
Allow HTML input; files are converted using HTML::Tidy
.SS "Condition Syntax"
.IX Subsection "Condition Syntax"
<cond> is an XPath-like expression as allowed by XML::Twig to trigger handlers.
.PP
examples: 'para', 'para[@compact="compact"]', '*[@urgent]', '*[@urgent="1"]',
\&'para[\fBstring()\fR="WARNING"]'
.PP
see XML::Twig for a more complete description of the syntax
.PP
options are processed by Getopt::Long so they can start with '\-' or '\-\-'
and can be abbreviated (\f(CW\*(C`\-r\*(C'\fR instead of \f(CW\*(C`\-\-root\*(C'\fR for example)
.SH DESCRIPTION
.IX Header "DESCRIPTION"
\&\fBxml_grep\fR does a grep on XML files. Instead of using regular
expressions it uses XPath expressions (in fact the subset of XPath
supported by XML::Twig).
.PP
The results can be the names of the files or XML elements containing
matching elements.
.SH "SEE ALSO"
.IX Header "SEE ALSO"
XML::Twig, Getopt::Long
.SH LICENSE
.IX Header "LICENSE"
This library is free software; you can redistribute it and/or modify
it under the same terms as Perl itself.
.SH AUTHOR
.IX Header "AUTHOR"
Michel Rodriguez
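.SH EXAMPLES
.IX Header "EXAMPLES"
The invocations below are a minimal sketch that uses only the options and
condition syntax described in this page. The file names \f(CW\*(C`doc1.xml\*(C'\fR and
\&\f(CW\*(C`doc2.xml\*(C'\fR are hypothetical, and the output, which depends on the input
documents, is not shown.
.PP
Return the compact paragraphs of two files, with the default grouping and
indentation:
.PP
.Vb 1
\&  xml_grep \*(Aqpara[@compact="compact"]\*(Aq doc1.xml doc2.xml
.Ve
.PP
List only the names of the files that contain an element carrying an
\&\f(CW\*(C`urgent\*(C'\fR attribute:
.PP
.Vb 1
\&  xml_grep \-\-files \-\-cond \*(Aq*[@urgent]\*(Aq doc1.xml doc2.xml
.Ve
.PP
Leave out the \f(CW\*(C`date\*(C'\fR attribute so that the results of two runs can be
compared:
.PP
.Vb 1
\&  xml_grep \-\-nodate \-\-cond \*(Aqpara[string()="WARNING"]\*(Aq doc1.xml
.Ve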