HXTOC(1) HTML-XML-utils HXTOC(1)
NAME
hxtoc - insert a table of contents in an HTML file
SYNOPSIS
hxtoc [ -x ] [ -l low ] [ -h high ] [ -t ] [ -d ] [ -c class ] [ -f ] [
file-or-URL ]
DESCRIPTION
The hxtoc command reads an HTML file, inserts missing ID attributes in
all H1 to H6 elements between the levels -l and -h (unless the option
-d is in effect, see below). Unless the option -t is given, it also
inserts A elements with NAME attributes, because old browsers do not
recognize ID attributes as target anchors. The output is written to
stdout.
If there is a comment of the form
    <!--toc-->
or a pair of comments
    <!--begin-toc-->
    ...
    <!--end-toc-->
then the comment, or the pair with everything in between, will be
replaced by a table of contents, consisting of a list (UL) of links to
all headers in the document.
The text of headers is copied to this table of contents, including any
inline markup, except that ID attributes, DFN tags and SPAN tags with a
CLASS of "index" are omitted (but the elements' content is copied).
The copied text can optionally be "flattened" first, see option -f.
If a header has a CLASS attribute whose value (or one of whose values)
is the keyword "no-toc", then that header will not appear in the table
of contents.
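The selection rules above (the level range set by -l and -h, plus the
"no-toc" keyword) can be sketched in a few lines of Python. This is only
an illustration of the rules as described here, not hxtoc's actual
implementation: the names HeadingCollector and toc_entries are
hypothetical, the sketch uses Python's standard html.parser module, it
treats the "unlimited" default for -h as 6, and it neither inserts IDs
nor handles option -d.

```python
from html.parser import HTMLParser


class HeadingCollector(HTMLParser):
    """Collect (level, id, text) for headings that would be listed."""

    def __init__(self, low=1, high=6):
        super().__init__()
        self.low, self.high = low, high
        self.entries = []     # finished (level, id, text) tuples
        self._current = None  # [level, id, text parts] while inside a heading

    def handle_starttag(self, tag, attrs):
        # html.parser reports tag names in lower case.
        if len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            level = int(tag[1])
            attrs = dict(attrs)
            classes = (attrs.get("class") or "").split()
            # Keep the heading only if it is within the level range and
            # none of its class values is the keyword "no-toc".
            if self.low <= level <= self.high and "no-toc" not in classes:
                self._current = [level, attrs.get("id"), []]

    def handle_data(self, data):
        if self._current is not None:
            self._current[2].append(data)

    def handle_endtag(self, tag):
        if self._current is not None and tag == f"h{self._current[0]}":
            level, ident, parts = self._current
            self.entries.append((level, ident, "".join(parts).strip()))
            self._current = None


def toc_entries(html, low=1, high=6):
    """Return the headings of `html` that pass the level and no-toc rules."""
    parser = HeadingCollector(low, high)
    parser.feed(html)
    parser.close()
    return parser.entries
```

For example, calling toc_entries(doc, low=2, high=3) on a document keeps
only its H2 and H3 headings, minus any that carry the "no-toc" class.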
OPTIONS
The following options are supported:
-x Use XML conventions: empty elements are written with a slash
at the end: <IMG />
-l low Sets the lowest numbered header to appear in the table of
contents. Default is 1 (i.e., H1).
-h high Sets the highest numbered header to appear in the table of
contents. Default is unlimited.
-t Normally, hxtoc adds both ID attributes and empty A elements
with a NAME attribute and CLASS="bctarget", so that older
browsers that do not understand ID will still find the target.
With this option, the A elements will not be generated.
-c class The generated UL elements in the table of contents will have
a CLASS attribute with the value class. The default is
"toc".
-d Tries to use sectioning elements as targets in the table of
contents instead of H1 to H6. A sectioning element is a DIV,
SECTION, ARTICLE, ASIDE or NAV element that contains at least
one heading element (H1 to H6) or HGROUP. The sectioning
element will be given an ID if it doesn't have one yet. With
this option, the level of any H1 to H6 that is the first
heading of a sectioning element is not determined by its
name, but by the nesting depth of the sectioning elements.
(Any H1 to H6 that are not the first heading of a sectioning
element still have their level implied by their name.)
-f Flatten the text of the table of contents. Without -f, the
contents of header elements are copied to the table of
contents almost unchanged, i.e., including any child elements
and their attributes (except for ID attributes, DFN elements
and certain SPAN elements, as explained above). With -f, the
contents are flattened instead: All child elements are
removed and only their contents are copied to the table of
contents. Additionally, elements with an ALT attribute, such
as IMG, are replaced by the contents of the ALT attribute.
Exception: BDO tags are copied unchanged and elements with a
DIR attribute are replaced by a SPAN with that DIR attribute.
(BDO and DIR may occur in languages written right-to-left.)
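The flattening rules described for option -f can be illustrated with a
small Python sketch. It is an approximation under stated assumptions,
not the code hxtoc uses: the names Flattener and flatten are
hypothetical, the input is assumed to be the content of a single
header, and elements with an ALT attribute are assumed to be empty
elements such as IMG (any content of a non-empty element with ALT would
still be copied here).

```python
from html import escape
from html.parser import HTMLParser

# Elements that have no end tag in HTML.
VOID = {"area", "base", "br", "col", "embed", "hr", "img", "input",
        "link", "meta", "source", "track", "wbr"}


class Flattener(HTMLParser):
    """Flatten header content following the rules listed for -f."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self._open = []  # stack of (tag, text to emit when the tag closes)

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        closing = ""
        if attrs.get("alt") is not None:
            # Replace the element by the contents of its ALT attribute.
            self.out.append(escape(attrs["alt"], quote=False))
        elif tag == "bdo":
            # BDO tags are copied unchanged.
            self.out.append(self.get_starttag_text())
            closing = "</bdo>"
        elif attrs.get("dir") is not None:
            # Any other element with DIR becomes a SPAN with that DIR.
            self.out.append('<span dir="%s">' % escape(attrs["dir"]))
            closing = "</span>"
        # All other tags are dropped; only their contents are kept.
        if tag not in VOID:
            self._open.append((tag, closing))

    def handle_endtag(self, tag):
        if self._open and self._open[-1][0] == tag:
            self.out.append(self._open.pop()[1])

    def handle_data(self, data):
        # Character references were decoded by the parser; re-escape them.
        self.out.append(escape(data, quote=False))


def flatten(fragment):
    """Return the flattened form of one header's content."""
    parser = Flattener()
    parser.feed(fragment)
    parser.close()
    return "".join(parser.out)
```

With this sketch, a header whose content is
Using <code>hxtoc</code> <img src="a.png" alt="logo"> flattens to the
plain text "Using hxtoc logo".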
OPERANDS
The following operand is supported:
file-or-URL
The name or URL of an HTML file. If absent, standard input is
read instead.
DIAGNOSTICS
The following exit values are returned:
0 Successful completion.
> 0 An error occurred in the parsing of the HTML file. hxtoc
will try to correct the error and produce output anyway.
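Because hxtoc may still produce output when it exits with a value
greater than 0, a calling script may want to keep the output and the
exit value separately rather than discard the output on failure. The
following Python sketch shows one way to do that with the standard
subprocess module. The function name run_hxtoc and its parameters are
hypothetical, and the sketch assumes that an hxtoc executable is on the
PATH and that the HTML is passed on standard input.

```python
import subprocess


def run_hxtoc(html, options=(), command=("hxtoc",)):
    """Run hxtoc on `html`; return (exit value, output, error messages).

    A nonzero exit value means a parse error was found, but the
    returned output may still be usable.
    """
    result = subprocess.run(
        [*command, *options],
        input=html,            # hxtoc reads standard input when no file is given
        capture_output=True,
        text=True,
    )
    return result.returncode, result.stdout, result.stderr
```

A call such as run_hxtoc(page, options=("-l", "2", "-t")) passes only
options documented above; the caller can then decide whether an exit
value greater than 0 should be treated as a warning or as a failure.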
SEE ALSO
asc2xml(1), hxnormalize(1), hxnum(1), xml2asc(1)
BUGS
The error recovery for incorrect HTML is primitive.
7.x 10 Jul 2011 HXTOC(1)