HXWLS(1) HTML-XML-utils HXWLS(1)

hxwls - list links in an HTML file

hxwls [ -l ] [ -t ] [ -r ] [ -h ] [ -a ] [ -b base ] [ file ]

The hxwls command reads an HTML file (standard input by default) and prints all the links it finds to standard output. By default, relative URLs are converted to absolute URLs before they are printed.

The following options are supported:

-l
Produce a long listing. Instead of just the URI, hxwls prints three columns: the element name, the value of the REL attribute, and the target URI.
-t
Produce a tuple listing. hxwls prints four columns: the URI of the document itself, the element name, the value of the REL attribute, and the target URI.
-r
Print relative URLs as they are, without converting them to absolute URLs.
-b base
Use base as the initial base URL. If there is a <base> element in the document, it will override the -b option.
-h
Output as HTML. The links are listed as <a> elements.
-a
Convert any IRIs (Internationalized Resource Identifiers) to ASCII-only URIs. Non-ASCII characters in the path of a URI are encoded as %-escaped octets, and non-ASCII characters in the domain name are encoded as Punycode. (Punycode encoding is only available if hxwls is compiled with libidn support.)
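As an illustration of how some of these options might be combined, the following sketch runs hxwls on a small document. The file name sample.html, its contents, and the base URL http://example.org/docs/ are made up for the example, and the guard around the calls is only there so the script does nothing when hxwls is not installed.

```shell
# Write a small sample document; the file name and contents are hypothetical.
cat > sample.html <<'EOF'
<!DOCTYPE html>
<html>
<head>
<title>Sample</title>
<link rel="stylesheet" href="style.css">
</head>
<body>
<p><a href="guide.html">Guide</a> and
<a rel="next" href="http://example.com/page2">page 2</a></p>
</body>
</html>
EOF

if command -v hxwls >/dev/null 2>&1; then
    # Default listing: just the target URIs.
    hxwls sample.html

    # Long listing: element name, REL value, target URI.
    hxwls -l sample.html

    # Resolve relative links against an explicit base URL (assumed here).
    hxwls -b http://example.org/docs/ sample.html

    # Leave relative links exactly as they are written in the document.
    hxwls -r sample.html

    # Read the document from standard input instead of a file operand.
    hxwls -l < sample.html
else
    echo "hxwls is not installed; skipping the example" >&2
fi
```

Because the sample contains no <base> element, the -b value should be the base that applies in the third call.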

The following operand is supported:

file
The name or the URL of an HTML file. If absent, standard input is read instead.

The following exit values are returned:

0
Successful completion.
> 0
An error occurred in the parsing of the HTML file. hxwls will try to correct the error and produce output anyway.
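Since output may be produced even when the exit value is non-zero, a calling script may want to keep the output and only report the status. A minimal sketch follows, assuming a made-up input file messy.html and output file links.txt. Whether this particular fragment triggers a parse error is not guaranteed.

```shell
# Hypothetical input: a fragment with unclosed elements.
printf '%s\n' '<p><a href="a.html">A<p><a href="b.html">B' > messy.html

if ! command -v hxwls >/dev/null 2>&1; then
    # Without this guard, a missing command would look like a parse error.
    echo "hxwls is not installed; skipping the example" >&2
elif hxwls messy.html > links.txt; then
    echo "parsed without errors"
else
    # A non-zero status signals a parse error, but links.txt may still
    # hold whatever links hxwls managed to extract.
    echo "parse errors reported; check links.txt" >&2
fi
```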

asc2xml(1), hxnormalize(1), hxnum(1), xml2asc(1)

10 Jul 2011 7.x