HXSELECT(1) HTML-XML-utils HXSELECT(1)
NAME
hxselect - extract elements or attributes that match a (CSS) selector
SYNOPSIS
hxselect [ -i ] [ -c ] [ -l language ] [ -s separator ] selectors
DESCRIPTION
hxselect reads a well-formed XML document and outputs all elements and
attributes that match one of the CSS selectors that are given as an
argument. For example,
hxselect ol li:first-child
selects the first li (list item in XHTML) in an ol (ordered list).
If there are multiple selectors, they must be separated by commas. For
example,
hxselect p + ul, blockquote ol
selects all ul elements that follow a p and all ol elements that are
descendants of a blockquote element.
The command operates on the standard input.
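Because the document is read from standard input, a typical invocation redirects or pipes a file into the command. The following is a minimal sketch, not an authoritative recipe: the sample document and the variable name doc are hypothetical, and the selector is quoted so that the shell passes it to hxselect unchanged.

```shell
#!/bin/sh
# Hypothetical sample document; the contents are illustrative only.
doc=$(mktemp)
cat > "$doc" <<'EOF'
<html><body>
<ol><li>first</li><li>second</li></ol>
<p>intro</p><ul><li>x</li></ul>
</body></html>
EOF

if command -v hxselect >/dev/null 2>&1; then
    # Select the first li inside each ol; the document arrives on stdin.
    hxselect 'ol li:first-child' < "$doc"
    # Several selectors, separated by commas, in a single quoted argument.
    hxselect 'p + ul, blockquote ol' < "$doc"
else
    echo "hxselect not found; skipping" >&2
fi
```

Quoting matters mostly for selectors that contain characters the shell treats specially, such as '>', '*' or '#'.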
hxselect assumes that class selectors (".foo") refer to an attribute
called "class" and that ID selectors ("#foo") refer to an attribute
called "id".
The experimental attribute node selector '::attr(name)' is supported
and selects the attribute of that name.
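As an illustration of the attribute node selector, the sketch below asks for the href attribute of every a element. The sample document and the variable name links are hypothetical; the exact form of the output is not shown here because it depends on the options in use.

```shell
#!/bin/sh
# Hypothetical sample document; the contents are illustrative only.
links=$(mktemp)
cat > "$links" <<'EOF'
<p><a href="one.html">One</a> <a href="two.html">Two</a></p>
EOF

if command -v hxselect >/dev/null 2>&1; then
    # '::attr(href)' selects the attribute node rather than the element.
    hxselect 'a::attr(href)' < "$links"
else
    echo "hxselect not found; skipping" >&2
fi
```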
Comments and processing instructions are ignored, i.e., they are read
but never written.
OPTIONS
The following options are supported:
-i Match case-insensitively. Useful for HTML and some other
SGML-based languages.
-c Print content only. Without -c, the start and end tags of the
matched element are printed as well; with -c only the
contents of the matched element are printed. If an attribute
rather than an element is selected (::attr() selector), only
the value of the attribute is printed.
-l language
Sets the default language, in case the root element doesn't
have an xml:lang attribute (default: none). Example: -l en
-s separator
A string to print after each match (default: empty). Accepts
C-like escapes. Example: -s '\n\n' to print an empty line
after each match.
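The options above can be combined. The following sketch uses -c to drop the tags of each matched element and -s to put a newline after each match, so that every match lands on its own line. The sample document and the variable name items are hypothetical.

```shell
#!/bin/sh
# Hypothetical sample document; the contents are illustrative only.
items=$(mktemp)
cat > "$items" <<'EOF'
<ul><li>alpha</li><li>beta</li></ul>
EOF

if command -v hxselect >/dev/null 2>&1; then
    # -c prints only the content of each li; -s '\n' is a C-like escape
    # that hxselect expands to a newline after each match.
    hxselect -c -s '\n' 'li' < "$items"
else
    echo "hxselect not found; skipping" >&2
fi
```

The separator is written in single quotes so that the shell leaves the backslash for hxselect to interpret.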
OPERANDS
The following operand is supported:
selectors
One or more comma-separated selectors. Most selectors from CSS
level 3 are supported, with the exception of selectors that
require interaction (e.g., ':active') or layout (e.g.,
':first-line').
BUGS
Case-insensitive matching (option -i) currently works only for ASCII
characters ("a" matches "A"), not for other characters ("â" does not
match "Â").
SEE ALSO
asc2xml(1), xml2asc(1), hxnormalize(1), hxremove(1), UTF-8 (RFC 2279)
7.x 10 Jul 2011 HXSELECT(1)