TEXT2PCAP(1)   TEXT2PCAP(1)

text2pcap - Generate a capture file from an ASCII hexdump of packets

text2pcap [ -a ] [ -b 2|8|16|64 ] [ -D ] [ -e <ethertype> ] [ -E <encapsulation type> ] [ -F <file format> ] [ -i <proto> ] [ -l <typenum> ] [ -N <intf-name> ] [ -m <max-packet> ] [ -o hex|oct|dec|none ] [ -q ] [ -r <regex> ] [ -s <srcport>,<destport>,<tag> ] [ -S <srcport>,<destport>,<ppi> ] [ -t <timefmt> ] [ -T <srcport>,<destport> ] [ -u <srcport>,<destport> ] [ -4 <srcip>,<destip> ] [ -6 <srcip>,<destip> ] <infile>|- <outfile>|-

text2pcap -h|--help

text2pcap -v|--version

Text2pcap is a program that reads in an ASCII hex dump and writes the data described into a capture file. text2pcap can read hexdumps with multiple packets in them, and build a capture file of multiple packets. Text2pcap is also capable of generating dummy Ethernet, IP, and UDP, TCP or SCTP headers, in order to build fully processable packet dumps from hexdumps of application-level data only.

Text2pcap can write the file in several output formats. The -F flag can be used to specify the format in which to write the capture file; text2pcap -F provides a list of the available output formats. By default, it writes the packets to outfile in the pcapng file format. Text2pcap also supports compression formats, which can be specified with the --compress option. If that option is not given, the desired compression method, if any, is deduced from the extension of outfile; e.g. if it has the extension '.gz', then the output file is compressed with gzip.

Text2pcap understands a hexdump of the form generated by od -Ax -tx1 -v. In other words, each byte is individually displayed, with spaces separating the bytes from each other. Hex digits can be upper or lowercase.

In normal operation, each line must begin with an offset describing the position in the packet, followed by a colon, space, or tab separating it from the bytes. There is no limit on the width or number of bytes per line, but lines with only hex bytes without a leading offset are ignored (in other words, line breaks should not be inserted in long lines that wrap). Offsets are more than two digits; they are in hex by default, but can also be in octal or decimal - see -o. Each packet must begin with offset zero, and an offset zero indicates the beginning of a new packet. Offset values must be correct; an unexpected value causes the current packet to be aborted and the start of the next packet to be awaited. There is also a single packet mode with no offsets; see -o.

Packets may be preceded by a direction indicator ('I' or 'O') and/or a timestamp if indicated by the command line (see -D and -t). If both are present, the direction indicator precedes the timestamp. The format of the timestamps is specified as a mandatory parameter to -t. If no timestamp is parsed, in the case of the first packet the current system time is used, while subsequent packets are written with timestamps one microsecond later than that of the previous packet.
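The fallback rule for missing timestamps can be sketched in a few lines of Python. This is only an illustration of the rule as stated above, not text2pcap's own code; the function name fill_timestamps and the use of None for "no timestamp parsed" are assumptions made for the example.

```python
from datetime import datetime, timedelta
from typing import List, Optional


def fill_timestamps(parsed: List[Optional[datetime]], now: datetime) -> List[datetime]:
    """Apply the documented fallback to packets whose timestamp is missing (None)."""
    result: List[datetime] = []
    for stamp in parsed:
        if stamp is None:
            if not result:
                # First packet without a timestamp: use the current system time.
                stamp = now
            else:
                # Later packets: one microsecond after the previous packet.
                stamp = result[-1] + timedelta(microseconds=1)
        result.append(stamp)
    return result


t0 = datetime(2019, 5, 14, 19, 4, 57)
stamps = fill_timestamps([None, None, None], now=t0)
```

Here the three packets receive t0, t0 plus one microsecond, and t0 plus two microseconds.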

Other text in the input data is ignored. Any text before the offset is ignored, including email forwarding characters '>'. Any text on a line after the bytes is ignored, e.g. an ASCII character dump (but see -a to ensure that hex digits in the character dump are ignored). Any line where the first non-whitespace character is a '#' will be ignored as a comment. Any lines of text between the bytestring lines are considered preamble; the beginning of the preamble is scanned for the direction indicator and timestamp as mentioned above and otherwise ignored.

Any line beginning with #TEXT2PCAP is a directive and options can be inserted after this command to be processed by text2pcap. Currently there are no directives implemented; in the future, these may be used to give more fine grained control on the dump and the way it should be processed e.g. timestamps, encapsulation type etc.

In general, short of these restrictions, text2pcap is fairly liberal about reading in hexdumps and has been tested with a variety of mangled outputs (including dumps forwarded through email multiple times, with limited line wrap, etc.).

Here is a sample dump that text2pcap can recognize, with optional directional indicator and timestamp:

I 2019-05-14T19:04:57Z
000000 00 0e b6 00 00 02 00 0e b6 00 00 01 08 00 45 00
000010 00 28 00 00 00 00 ff 01 37 d1 c0 00 02 01 c0 00
000020 02 02 08 00 a6 2f 00 01 00 01 48 65 6c 6c 6f 20
000030 57 6f 72 6c 64 21
000036
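When no od(1) is at hand, a dump in this layout can be produced with a short script. The following Python sketch is an assumption-laden illustration (the helper name to_hexdump is made up for this example): it writes a six-digit hex offset, the bytes as two-digit hex values separated by spaces, and a final line holding only the total length, as in the sample above.

```python
def to_hexdump(packet: bytes, width: int = 16) -> str:
    """Render one packet in the layout of `od -Ax -tx1 -v`."""
    lines = []
    for offset in range(0, len(packet), width):
        chunk = packet[offset:offset + width]
        # Six-digit hex offset, then every byte as two lowercase hex digits.
        lines.append(f"{offset:06x} " + " ".join(f"{b:02x}" for b in chunk))
    # Closing line with only the end offset, like the "000036" line above.
    lines.append(f"{len(packet):06x}")
    return "\n".join(lines) + "\n"


dump = to_hexdump(b"Hello World!")
print(dump, end="")
```

Because every packet must begin at offset zero, calling the helper once per packet and concatenating the results yields a multi-packet input file.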

Text2pcap is also capable of scanning a text input file using a custom Perl compatible regular expression that matches a single packet. text2pcap searches the given file (which must end with '\n') for non-overlapping non-empty strings matching the regex. Named capturing subgroups, which must match exactly once per packet, are used to identify fields to import. The following fields are supported in regex mode, one mandatory and three optional:

"data"  Actual captured frame data to import
"time"  Timestamp of packet
"dir"   Direction of packet
"seqno" Arbitrary ID of packet

The 'data' field is the captured data, which must be in a selected encoding: hexadecimal (the default), octal, binary, or base64. It must contain no characters outside the encoding set other than whitespace. The 'time' field is parsed according to the format in the -t parameter. To assign a direction, the first character of the 'dir' field is compared against two sets of characters, which default to "iI<" for inbound and "oO>" for outbound. The 'seqno' field is assumed to be a positive base-10 integer used as an arbitrary ID. An optional field’s information will only be written if the field is present in the regex and if the capture file format supports it. (E.g., the pcapng format supports all three fields, but the pcap format only supports timestamps.)

Here is a sample dump that the regex mode can process with the regex '^(?<dir>[<>])\s(?<time>\d+:\d\d:\d\d.\d+)\s(?<data>[0-9a-fA-F]+)$' along with timestamp format '%H:%M:%S.%f', directional indications of '<' and '>', and hex encoding:

> 0:00:00.265620 a130368b000000080060
> 0:00:00.280836 a1216c8b00000000000089086b0b82020407
< 0:00:00.295459 a2010800000000000000000800000000
> 0:00:00.296982 a1303c8b00000008007088286b0bc1ffcbf0f9ff
> 0:00:00.305644 a121718b0000000000008ba86a0b8008
< 0:00:00.319061 a2010900000000000000001000600000
> 0:00:00.330937 a130428b00000008007589186b0bb9ffd9f0fdfa3eb4295e99f3aaffd2f005
> 0:00:00.356037 a121788b0000000000008a18

The regex is compiled with multiline support, and it is recommended to use the anchors '^' and '$' for best results.
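A regex can be tried out on a few sample lines before it is handed to text2pcap. The Python sketch below is only an approximation: Python's re module is not PCRE, and it spells named groups as (?P<name>...) where the regex above uses (?<name>...). The pattern is otherwise the one shown above, and the sample lines are taken from the dump above.

```python
import re

# Same pattern as in the text, with Python's (?P<name>...) group syntax.
PACKET_RE = re.compile(
    r"^(?P<dir>[<>])\s(?P<time>\d+:\d\d:\d\d.\d+)\s(?P<data>[0-9a-fA-F]+)$",
    re.MULTILINE,  # '^' and '$' match at every line, as in text2pcap's regex mode
)

sample = (
    "> 0:00:00.265620 a130368b000000080060\n"
    "< 0:00:00.295459 a2010800000000000000000800000000\n"
    "> 0:00:00.356037 a121788b0000000000008a18\n"
)

# One (direction, timestamp text, hex data text) tuple per matched packet.
packets = [(m["dir"], m["time"], m["data"]) for m in PACKET_RE.finditer(sample)]

for direction, stamp, data in packets:
    print(direction, stamp, len(data), "hex digits")
```

A line that fails to match is silently skipped by finditer, so comparing the number of matches with the number of input lines is a quick sanity check.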

Text2pcap also allows the user to read in dumps of application-level data and insert dummy L2, L3 and L4 headers before each packet. This allows Wireshark or any other full-packet decoder to handle these dumps. If the encapsulation type is Ethernet, the user can elect to insert Ethernet headers, Ethernet and IP, or Ethernet, IP and UDP/TCP/SCTP headers before each packet. The fake headers can also be used with the Raw IP, Raw IPv4, or Raw IPv6 encapsulations, with the Ethernet header omitted. These encapsulation options can be used in both hexdump mode and regex mode.

When <infile> or <outfile> are '-', standard input or standard output, respectively, are used.

-a

Enables ASCII text dump identification. It allows one to identify the start of the ASCII text dump and not include it in the packet even if it looks like HEX. This parameter has no effect in regex mode.

NOTE: Do not enable it if the input file does not contain the ASCII text dump.

-b 2|8|16|64

Specify the base (radix) of the encoding of the packet data in regex mode. The supported options are 2 (binary), 8 (octal), 16 (hexadecimal), and 64 (base64 encoding), with hex as the default. This parameter has no effect in hexdump mode.

-D

Indicates that the text before each input packet may start with either an I or an O, indicating that the packet is inbound or outbound. If both this flag and the -t flag are used, the directional indicator is expected before the time code. This parameter has no effect in regex mode, where the presence of the <dir> capturing group determines whether direction indicators are expected.

Direction indication is stored in the packet headers if the output format supports it (e.g. pcapng), and is also used when generating dummy headers to swap the source and destination addresses and ports as appropriate.

-e <ethertype>

Include a dummy Ethernet header before each packet. Specify the EtherType for the Ethernet header in hex. Use this option if your dump has Layer 3 header and payload (e.g. IP header), but no Layer 2 encapsulation. Example: -e 0x806 to specify an ARP packet.

For IP packets, instead of generating a fake Ethernet header you can also use -E rawip or -l 101 to indicate raw IP encapsulation. Note that raw IP encapsulation does not work for any non-IP Layer 3 packet (e.g. ARP), whereas generating a dummy Ethernet header with -e works for any sort of L3 packet.
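For orientation, an Ethernet II header is 14 bytes: a 6-byte destination address, a 6-byte source address, and a 2-byte EtherType in network byte order. The Python sketch below builds such a header; the function name is hypothetical, and the two MAC addresses are placeholders copied from the first twelve bytes of the sample dump earlier in this page, not a statement about which addresses text2pcap writes.

```python
import struct


def dummy_ethernet_header(
    ethertype: int,
    dst: bytes = bytes.fromhex("000eb6000002"),  # placeholder destination MAC
    src: bytes = bytes.fromhex("000eb6000001"),  # placeholder source MAC
) -> bytes:
    """Pack destination MAC, source MAC and EtherType (big-endian) into 14 bytes."""
    return struct.pack("!6s6sH", dst, src, ethertype)


header = dummy_ethernet_header(0x0806)  # 0x0806 is the EtherType for ARP
print(header.hex(" "))
```

Prepending this header to each Layer 3 packet gives the frame layout that the -e option is meant to produce.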

-E <encapsulation type>

Sets the packet encapsulation type of the output capture file. text2pcap -E provides a list of the available types; note that not all file formats support all encapsulation types. The default type is ether (Ethernet).

NOTE: This sets the encapsulation type of the output file, but does not translate the packet headers or add additional headers. It is used to specify the encapsulation that matches the input data.

-F <file format>

Sets the file format of the output capture file. Text2pcap can write the file in several formats; text2pcap -F provides a list of the available output formats. The default is the pcapng format.

-h|--help

Print the version number and options and exit.

-i <proto>

Include dummy IP headers before each packet. Specify the IP protocol for the packet in decimal. Use this option if your dump is the payload of an IP packet (i.e. has complete L4 information) but does not have an IP header with each packet. Note that an appropriate Ethernet header is automatically included with each packet as well if the link-layer type is Ethernet. Example: -i 46 to specify an RSVP packet (IP protocol 46). See https://www.iana.org/assignments/protocol-numbers/protocol-numbers.xhtml for the complete list of assigned internet protocol numbers.

-l <typenum>

Sets the packet encapsulation type of the output capture file, using pcap link-layer header type numbers. Default is Ethernet (1). See https://www.tcpdump.org/linktypes.html for the complete list of possible encapsulations. Example: -l 7 for ARCNet packets encapsulated BSD-style.

-m <max-packet>

Set the maximum packet length; the default is 262144. Useful for testing various packet boundaries when only an application level datastream is available. Example:

od -Ax -tx1 -v stream | text2pcap -m1460 -T1234,1234 - stream.pcap

will convert from plain datastream format to a sequence of Ethernet TCP packets.
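The effect on the payload can be pictured as cutting the stream into pieces of at most max-packet bytes. The Python sketch below illustrates only that idea; it assumes a new packet starts as soon as the limit is reached, and split_stream is a made-up helper, not part of text2pcap.

```python
from typing import List


def split_stream(stream: bytes, max_packet: int = 1460) -> List[bytes]:
    """Cut a byte stream into consecutive chunks of at most max_packet bytes."""
    return [stream[i:i + max_packet] for i in range(0, len(stream), max_packet)]


chunks = split_stream(bytes(4000), max_packet=1460)
print([len(c) for c in chunks])  # [1460, 1460, 1080]
```

With -T each such chunk would then receive its own dummy Ethernet, IP and TCP headers.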

-N <intf-name>

Specify a name for the interface included when writing a pcapng format file.

-o hex|oct|dec|none

Specify the radix for the offsets (hex, octal, decimal, or none). Defaults to hex. This corresponds to the -A option for od. This parameter has no effect in regex mode.

NOTE: With -o none, only one packet will be created, ignoring any direction indicators or timestamps after the first byte along with any offsets.

-P <dissector>

Include an EXPORTED_PDU header before each packet. Specify, as a string, the dissector to be called for the packet (DISSECTOR_NAME tag). Use this option if your dump is the payload for a single upper layer protocol (so specifying a link layer type would not work) and you wish to create a capture file without a full dummy protocol stack. Automatically sets the link layer type to Wireshark Upper PDU export. Without this option, if the Upper PDU export link layer type (252) is selected the dissector defaults to "data".

-q

Don’t display the summary of the options selected at the beginning, or the count of packets processed at the end.

-r <regex>

Process the file in regex mode using regex as described above.

NOTE: The regex mode uses memory-mapped I/O and does not work on streams that do not support seeking, like terminals and pipes.

-s <srcport>,<destport>,<tag>

Include dummy SCTP headers before each packet. Specify, in decimal, the source and destination SCTP ports, and verification tag, for the packet. Use this option if your dump is the SCTP payload of a packet but does not include any SCTP, IP or Ethernet headers. Note that appropriate Ethernet and IP headers are automatically also included with each packet. A CRC32C checksum will be put into the SCTP header.

-S <srcport>,<destport>,<ppi>

Include dummy SCTP headers before each packet. Specify, in decimal, the source and destination SCTP ports for the packet; a verification tag of 0 is used. A dummy SCTP DATA chunk header with a payload protocol identifier of ppi is also prepended. Use this option if your dump is the SCTP payload of a packet but does not include any SCTP, IP or Ethernet headers. Note that appropriate Ethernet and IP headers are automatically included with each packet. A CRC32C checksum will be put into the SCTP header.
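CRC32C is the Castagnoli variant of CRC-32. The Python sketch below is a plain bitwise implementation of that checksum, shown only to make the algorithm concrete; it is not text2pcap's code, and how the value is placed into the SCTP header is not covered here.

```python
def crc32c(data: bytes) -> int:
    """Bitwise CRC-32C (Castagnoli), reflected polynomial 0x82F63B78."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            # Shift right one bit; XOR in the polynomial if the dropped bit was set.
            crc = (crc >> 1) ^ (0x82F63B78 if crc & 1 else 0)
    return crc ^ 0xFFFFFFFF


# Standard check value for CRC-32C over the ASCII digits 1 to 9.
print(hex(crc32c(b"123456789")))  # 0xe3069283
```

A table-driven version is much faster; the bitwise form is used here because it is short enough to verify by eye.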

-t <timefmt>

Treats the text before the packet as a date/time code; timefmt is a format string supported by strftime(3), supplemented with the field descriptor '%f' for fractional seconds up to nanoseconds. Example: The time "10:15:14.5476" has the format code "%H:%M:%S.%f". The special format string ISO indicates that the string should be parsed according to the ISO-8601 specification. This parameter is used in regex mode if and only if the <time> capturing group is present.

NOTE: Date/time fields from the current date/time are used as the default for unspecified fields.
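The example format can be tried out in Python, whose strptime accepts a similar '%f' descriptor. This is an approximation under stated assumptions: Python's '%f' stops at microseconds rather than nanoseconds, and the helper parse_time_only (a name invented for this sketch) handles only time-of-day formats, taking the date from a supplied "now" value to mimic the default described in the note.

```python
from datetime import datetime


def parse_time_only(text: str, timefmt: str, now: datetime) -> datetime:
    """Parse a time-of-day string and take the missing date fields from `now`."""
    parsed = datetime.strptime(text, timefmt)
    # strptime alone would default the date to 1900-01-01.
    return datetime.combine(now.date(), parsed.time())


now = datetime(2024, 10, 13, 12, 0, 0)  # stand-in for the current date/time
stamp = parse_time_only("10:15:14.5476", "%H:%M:%S.%f", now)
print(stamp.isoformat())  # 2024-10-13T10:15:14.547600
```

The fraction ".5476" is read as 0.5476 seconds, that is 547600 microseconds, not as 5476 microseconds.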

-T <srcport>,<destport>

Include dummy TCP headers before each packet. Specify the source and destination TCP ports for the packet in decimal. Use this option if your dump is the TCP payload of a packet but does not include any TCP, IP or Ethernet headers. Note that appropriate Ethernet and IP headers are automatically also included with each packet. Sequence numbers will start at 0.

-u <srcport>,<destport>

Include dummy UDP headers before each packet. Specify the source and destination UDP ports for the packet in decimal. Use this option if your dump is the UDP payload of a packet but does not include any UDP, IP or Ethernet headers. Note that appropriate Ethernet and IP headers are automatically also included with each packet. Example: -u1000,69 to make the packets look like TFTP/UDP packets.

-v|--version

Print the full version information and exit.

-4 <srcip>,<destip>

Prepend a dummy IP header with the specified IPv4 source and destination addresses. This option should be accompanied by one of the following options: -i, -s, -S, -T, -u. Use this option to apply "custom" IP addresses. Example: -4 10.0.0.1,10.0.0.2 to use 10.0.0.1 and 10.0.0.2 for all IP packets.

-6 <srcip>,<destip>

Prepend a dummy IP header with the specified IPv6 source and destination addresses. This option should be accompanied by one of the following options: -i, -s, -S, -T, -u. Use this option to apply "custom" IP addresses. Example: -6 2001:db8::b3ff:fe1e:8329,2001:0db8:85a3::8a2e:0370:7334 to use 2001:db8::b3ff:fe1e:8329 and 2001:0db8:85a3::8a2e:0370:7334 for all IP packets.

--compress <type>

Compress the output file using the type compression format. --compress with no argument provides a list of the compression formats supported for writing. The type given takes precedence over the extension of outfile.

--log-level <level>

Set the active log level. Supported levels in lowest to highest order are "noisy", "debug", "info", "message", "warning", "critical", and "error". Messages at each level and higher will be printed, for example "warning" prints "warning", "critical", and "error" messages and "noisy" prints all messages. Levels are case insensitive.

--log-fatal <level>

Abort the program if any messages are logged at the specified level or higher. For example, "warning" aborts on any "warning", "critical", or "error" messages.

--log-domains <list>

Only print messages for the specified log domains, e.g. "GUI,Epan,sshdump". List of domains must be comma-separated. Can be negated with "!" as the first character (inverts the match).

--log-debug <list>

Force the specified domains to log at the "debug" level. List of domains must be comma-separated. Can be negated with "!" as the first character (inverts the match).

--log-noisy <list>

Force the specified domains to log at the "noisy" level. List of domains must be comma-separated. Can be negated with "!" as the first character (inverts the match).

--log-fatal-domains <list>

Abort the program if any messages are logged for the specified log domains. List of domains must be comma-separated.

--log-file <path>

Write log messages and stderr output to the specified file.

od(1), pcap(3), wireshark(1), tshark(1), dumpcap(1), mergecap(1), editcap(1), strftime(3), pcap-filter(7) or tcpdump(8)

This is the manual page for Text2pcap 4.4.1. Text2pcap is part of the Wireshark distribution. The latest version of Wireshark can be found at https://www.wireshark.org.

Original Author
Ashok Narayanan <ashokn[AT]cisco.com>

2024-10-13