.\" -*- mode: troff; coding: utf-8 -*-
.\" Automatically generated by Pod::Man 5.0102 (Pod::Simple 3.45)
.\"
.\" Standard preamble:
.\" ========================================================================
.de Sp \" Vertical space (when we can't use .PP)
.if t .sp .5v
.if n .sp
..
.de Vb \" Begin verbatim text
.ft CW
.nf
.ne \\$1
..
.de Ve \" End verbatim text
.ft R
.fi
..
.\" \*(C` and \*(C' are quotes in nroff, nothing in troff, for use with C<>.
.ie n \{\
. ds C` ""
. ds C' ""
'br\}
.el\{\
. ds C`
. ds C'
'br\}
.\"
.\" Escape single quotes in literal strings from groff's Unicode transform.
.ie \n(.g .ds Aq \(aq
.el .ds Aq '
.\"
.\" If the F register is >0, we'll generate index entries on stderr for
.\" titles (.TH), headers (.SH), subsections (.SS), items (.Ip), and index
.\" entries marked with X<> in POD. Of course, you'll have to process the
.\" output yourself in some meaningful fashion.
.\"
.\" Avoid warning from groff about undefined register 'F'.
.de IX
..
.nr rF 0
.if \n(.g .if rF .nr rF 1
.if (\n(rF:(\n(.g==0)) \{\
. if \nF \{\
. de IX
. tm Index:\\$1\t\\n%\t"\\$2"
..
. if !\nF==2 \{\
. nr % 0
. nr F 2
. \}
. \}
.\}
.rr rF
.\" ========================================================================
.\"
.IX Title "Pegex::Miscellany 3"
.TH Pegex::Miscellany 3 2024-09-01 "perl v5.40.0" "User Contributed Perl Documentation"
.\" For nroff, turn off justification. Always turn off hyphenation; it makes
.\" way too many mistakes in technical documents.
.if n .ad l
.nh
.SH Miscellany
.IX Header "Miscellany"
This document collects material about Pegex that was written for other
documents but seemed out of place there. It may still be useful, so it lives
here for now.
.SH "Pegex Overview"
.IX Header "Pegex Overview"
In the diagram below, there is a simple language called Foo. The diagram shows
how Pegex can take a text grammar defining Foo and generate a parser that can
parse Foo sources into data (abstract syntax trees).
.PP
.Vb 2
\&            Parsing a language called "Foo"
\&                with the Pegex toolset.
\&
\&                                .\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-.
\&    .\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-.      |    Pegex::Compiler    |
\&    |    Foo Language    |      |\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-|   Serialize
\&    |\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-|\-\-\-\-\->| Pegex::Grammar::Pegex |\-\-\-\-\-\-\-\-\-.
\&    | Pegex grammar text |      | Pegex::Receiver       |         |
\&    \*(Aq\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\*(Aq      \*(Aq\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\*(Aq         v
\&    ......................                  |                  .\-\-\-\-\-\-.
\&    |                    |                  | compile()        | YAML |
\&    |foo: verb noun      |                  v                  \*(Aq\-\-\-\-\-\-\*(Aq
\&    |verb: /Hello/       |       .\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-.        .\-\-\-\-\-\-.
\&    |noun: /world/       |       | Foo grammar tree   |        | JSON |
\&    |                    |       \*(Aq\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\*(Aq        \*(Aq\-\-\-\-\-\-\*(Aq
\&    ......................                  |                  .\-\-\-\-\-\-.
\&                                            |                  | Perl |
\&                                            v                  \*(Aq\-\-\-\-\-\-\*(Aq
\&                                 .\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-.      .\-\-\-\-\-\-\-\-.
\&                                 | Pegex::Grammar::Foo |      | Python |
\&                                 |\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-|      \*(Aq\-\-\-\-\-\-\-\-\*(Aq
\&                                 | Pegex::Parser       |       .\-\-\-\-\-.
\&                                 | Pegex::AST::Foo     |       | etc |
\&     .\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-.         \*(Aq\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\*(Aq       \*(Aq\-\-\-\-\-\*(Aq
\&     |  Foo Language   |                    |
\&     |\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-|\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\->| parse()
\&     | Foo source text |                    v
\&     \*(Aq\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\*(Aq        .\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-.
\&     ...................        | Parsed Foo Data Tree |
\&     |Hello world      |        \*(Aq\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\*(Aq
\&     ...................        ........................
\&                                |\- verb: Hello         |
\&                                |\- noun: world         |
\&                                ........................
.Ve
.SH FYI
.IX Header "FYI"
Pegex is self-hosting: the syntax of the Pegex grammar language is itself
defined by a Pegex grammar. This matters because, as with any Pegex-based
language, it makes Pegex easier to port to new programming languages. The
Pegex grammar for Pegex grammars is the pegex.pgx file in the pegex-pgx
repository used under "Self Compilation Tricks" below.
.PP
Pegex was originally inspired by Perl 6 Rules. It also takes ideas from Damian
Conway's Perl 5 module, Regexp::Grammars. Pegex tries to take the best
ideas from these great works, and make them work in as many languages as
possible. That's Acmeism.
.SH "Self Compilation Tricks"
.IX Header "Self Compilation Tricks"
You can have some fun using Pegex to compile itself. First get the Pegex
grammar repo:
.PP
.Vb 2
\& git clone https://github.com/ingydotnet/pegex\-pgx.git
\& cd pegex\-pgx
.Ve
.PP
Then parse and dump the Pegex grammar with Pegex:
.PP
.Vb 1
\& perl \-MXXX \-MPegex \-e \*(AqXXX pegex("pegex.pgx")\->parse("pegex.pgx")\*(Aq
.Ve
.PP
For a different view of the data tree, try:
.PP
.Vb 1
\& perl \-MXXX \-MPegex \-e \*(AqXXX pegex("pegex.pgx", receiver => "Pegex::Tree")\->parse("pegex.pgx")\*(Aq
.Ve
.PP
Finally, to emulate the Pegex compiler, do this:
.PP
.Vb 1
\& perl \-MXXX \-MPegex \-e \*(AqXXX pegex("pegex.pgx", receiver => "Pegex::Pegex::AST")\->parse("pegex.pgx")\*(Aq
.Ve
.PP
The receiver option names a "receiving" class that shapes the parse results
into something useful. Indeed, this is exactly what Pegex::Grammar::Pegex does
internally.
.SH "A Real World Example"
.IX Header "A Real World Example"
TestML is a new Acmeist unit test language. It is well suited to software that
needs to behave equivalently in more than one programming language. In fact,
Pegex itself is tested with TestML!
.PP
TestML has a language specification grammar. The Perl 6 implementation of
TestML uses this grammar, while all other implementations of TestML use a
Pegex grammar.
.PP
In Perl 5, Pegex::Compiler is used to compile the Pegex grammar into a simple
data structure, which can be shown as YAML or precompiled to JSON.
Pegex::Compiler further compiles this into a Perl 5 only grammar tree, which
becomes a Perl module.
.PP
TestML::Parser::Grammar is a subclass of Pegex::Grammar. It can be used to
parse TestML files. TestML::Parser calls the \fBparse()\fR method of the grammar
with a TestML::AST object that receives callbacks when various rules match,
and uses the information to build a TestML::Document object.
.PP
Thus TestML is an Acmeist language written in Pegex. It can be easily ported
to every language where Pegex exists. In fact, it must be ported to those
languages in order to test the new Pegex implementation!