PDF::Builder::Content::Column_docs(3)    User Contributed Perl Documentation

NAME
       PDF::Builder::Content::Column_docs -- column text formatting system

PDF::Builder::Content::Text/column and related routines
       These routines form a sub-library for support of complex columnar
       output with high level markup languages. Currently, a single
       rectangular layout may be defined on a page, to be filled by
       user-defined content. Any content which could not be fit within the
       column confines is returned in an internal array format, and may be
       passed to the next column() call to finish the formatting.

       Future plans call for non-rectangular columns to be definable, as well
       as flow from one column to another on a page, and column balancing.
       Other possible enhancements call for support of non-Western writing
       systems (e.g., bidirectional text, using the HarfBuzz library), proper
       word-splitting and paragraph shaping (possibly using the Knuth-Plass
       algorithm), and additional markup languages.

   column
           ($rc, $next_y, $unused) = $text->column($page, $text, $grfx, $markup, $txt, %opts)

       This method fills out a column of text on a page, returning any unused
       portion that could not be fit, and where it left off on the page.

       Tag names, CSS entries, markup type, etc. are case-sensitive (usually
       lower-case letters only). For example, you cannot give a <P>
       paragraph in HTML or a P selector in CSS styling.

       $page is the page context. Currently, its only use is for page
       annotations for links ('md1' "[]()" and 'html' "<a>"), so if you're
       not using those, you may pass anything such as "undef" for $page if
       you wish.

       $text is the text context, so that various font and text-output
       operations may be performed. It is often, but not necessarily always,
       the same as the object containing the "column" method.

       $grfx is the graphics (gfx) context. It may be a dummy (e.g., undef)
       if no graphics are to be drawn, but graphical items such as the column
       outline ('outline' option) and horizontal rule (<hr>
       in HTML markup) use it. Currently, text-decoration underline (default
       for links, 'md1' "[]()" and 'html' "<a>") or line-through or overline
       use the text context, but may in the future require a valid graphics
       context. Images (when implemented) will require a graphics context.

       $markup is information on what sort of markup is being used to format
       and lay out the column's text:

       'pre'
           The input material has already been processed and is already in
           the desired form. $txt is an array reference to the list of
           hashes. This must be used when you are calling column() a second
           (or later) time to output material left over from the previous
           call. It may also be used when the caller application has already
           processed the text into the appropriate format, and other markup
           isn't being used.

       'none'
           If none is specified, there is no markup in use. At most, a blank
           line or a new text array element specifies a new paragraph, and
           that's it.

           The input $txt is a list (anonymous array reference) of strings,
           each containing one or more paragraphs. A single string may also
           be given. An empty line between paragraphs may be used to separate
           the paragraphs. Paragraphs may not span array elements.

       'md1'
           This specifies a certain flavor of Markdown compatible with
           Text::Markdown. See the full description below. There are other
           flavors of Markdown, so other mdn flavors may be defined in the
           future, such as POD from Perl code.

       'html'
           This specifies that a large subset of HTML markup is used, along
           with some attributes and CSS. Numeric entities (decimal &#nnn; and
           hexadecimal &#xnnn;) are supported, as well as named entities
           (&mdash; for example).

           The input $txt is a list (anonymous array reference) of strings,
           each containing one or more paragraphs and other markup. A single
           string may also be given. Per normal HTML practice, paragraph tags
           should be used to mark paragraphs.
           Note that HTML::TreeBuilder is configured to automatically mark
           top body-level text with paragraph tags, in case you forget to do
           so, although it is probably better to do it yourself, to maintain
           more control over the processing. Separate array elements will
           first be glued together into a single string before processing,
           permitting paragraphs to span array elements if desired.

       Other input formats
           There are other markup languages out there, such as HTML-like
           Pango, nroff-like man page, Markdown-like wikimedia, and Perl's
           POD, that might be supported in the future (provided there are
           supported Perl libraries for them). It is very unlikely that TeX
           or LaTeX will ever be supported, as they both already have
           excellent PDF output.

           PDF::Builder currently only supports the markup languages
           described above. If you want to use something else (e.g., Perl's
           POD, or man format, or even MS Word or some other WYSIWYG format),
           you will need to find a converter utility to convert it to a
           supported flavor of Markdown or HTML. Many such converters already
           exist, so take a look (although you may well have to do some
           cleanup before column() accepts the resulting HTML as input).
           Perhaps in the future, PDF::Builder will directly support
           additional formats, but no promises.

       $txt is the input text: a string, an array reference to multiple
       strings, or an array reference to hashes. See $markup for details.

       %opts are the options. A number of these are, despite the name,
       mandatory.

       'rect' => [x, y, width, height]
           This defines a column as a rectangular area of a given width and
           height (both in points) on the current page. In the future, it is
           expected that more elaborate non-rectangular areas will be
           definable, but for now, a simple rectangle is all that is
           permitted. The column's upper left coordinate is "x, y".
           The top text baseline is assumed to be relative to the UL corner
           (based on the determined line height), and the column outline
           clips that baseline, as it does additional baselines down the page
           (interline spacing is "leading" multiplied by the largest
           "font_size" or image height needed on that line).

           Currently, 'rect' is required, as it is the only column shape
           supported.

       'relative' => [ x, y, scale(s) ]
           'relative' defaults to "[ 0, 0, 1, 1 ]", and allows a column
           outline (currently only 'rect') to be either absolute or relative.
           "x" and "y" are added to each "x,y" coordinate pair, after
           scaling. Scaling values:

           (none)
               The scaling defaults to 1 in both x and y dimensions (no
               change).

           scale (one value)
               The scaling in both the x (width) and y (height) dimensions
               uses this value.

           scale_x, scale_y (two values)
               There are two separate scaling factors for the x dimension
               (width) and y dimension (height).

           This permits a generically-shaped outline to be defined, scaled
           (perhaps not preserving the aspect ratio) and placed anywhere on
           the page. This could save you from having to define
           similarly-shaped columns from scratch multiple times.

           If you want to define a relative outline, the lower left corner
           (whether or not it contains a point, and whether or not it's the
           first one listed) would usually be "0, 0", to have scaling work as
           expected. In other words, your outline template should be in the
           lower left corner of the page.

       'start_y' => $start_y
           If omitted, it is assumed that you want to start at the top of the
           defined column (the maximum "y" value minus the maximum vertical
           extent of this line).

           If used, the normal value is the "next_y" returned from the
           previous column() call. It is the deepest extent reached by the
           previous line (plus leading), and is the top-most point of the new
           first line of this column() call.
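As a rough illustration of the scale-then-offset rule described for the 'relative' option, the following sketch computes where a 'rect' template would land on the page. The helper name place_rect is hypothetical (it is not part of PDF::Builder), and applying the scale factors to a rectangle's width and height in this way is an assumption drawn from the description above, not a copy of the library's internals.

```perl
use strict;
use warnings;

# Hypothetical helper (NOT part of PDF::Builder): apply a 'relative' list
# [x, y], [x, y, scale] or [x, y, scale_x, scale_y] to a 'rect' template
# [x, y, width, height]. Coordinates are scaled first, then offset.
sub place_rect {
    my ($rect, $relative) = @_;
    my ($dx, $dy, $sx, $sy) = @{ $relative // [ 0, 0, 1, 1 ] };
    $dx //= 0;
    $dy //= 0;
    $sx //= 1;      # no scale given: no change
    $sy //= $sx;    # one scale given: use it for both dimensions

    my ($x, $y, $w, $h) = @$rect;
    return [ $x * $sx + $dx, $y * $sy + $dy, $w * $sx, $h * $sy ];
}

# A 200pt wide, 300pt high template whose lower left corner is at 0,0
# (so its upper left corner, the rect's x,y, is at 0,300).
my $template = [ 0, 300, 200, 300 ];

# Double the width, keep the height, and shift the outline by 50,400.
my $placed = place_rect($template, [ 50, 400, 2, 1 ]);
print join(', ', @$placed), "\n";    # prints "50, 700, 400, 300"
```

Keeping the template's lower left corner at "0, 0" means the scale factors only change its size, so the offset alone decides where that corner ends up.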
           Note that the "x" position will be determined by the column shape
           and size (the left-most point of the baseline), so there is no
           place to explicitly set an "x" position to start at.

       'font_size' => $font_size
           This is the starting font size (in points) to be used. Over the
           course of the text, it may be modified by markup. The default is
           12pt. It is in turn overridden by any CSS or HTML font-size
           settings.

           The starting font size may be set in a number of ways. It may be
           inherited from a previous "$text->font(..., font-size)" statement;
           it may be set via the "font_size" option (overriding any font
           method inheritance); it may default to 12pt (if neither explicit
           way is given). For HTML markup, it may of course be modified by
           the "font" tag or by CSS styling "font-size". For Markdown, it may
           be modified by CSS styling.

       'font_info' => $string
           This permits the user to specify the starting font used in
           column() (body font-family, font-style, font-weight, color).

           column() will pick up any font already loaded ("$text->font($font,
           $size);", or using FontManager), and use that as the "current"
           font. If no font has been loaded, and no other instructions are
           given, the FontManager default (core Times-Roman) will be used.

           The "font_info" option for column() may be given to override
           either of the two above methods. You may specify a $string of
           '-fm-' to instruct column() to use the FontManager "default" font
           (Times face core font). Or, you may pick a font face known to
           FontManager (added by user code if not one of the 28 core fonts),
           and optionally give it style and weight: $string of
           'face:style:weight:color'. The style defaults to 'normal'
           (non-italic), or 'normal' or '0' may be given. For italics, use
           'italic' or '1'. The weight defaults to 'normal' (unbolded
           weight), or 'normal' or '0' may be given. For bold (heavy) text,
           use 'bold' or '1'. Finally, a color may be given.
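To make the 'face:style:weight:color' format concrete, here is a minimal sketch of how such a string breaks down into its fields with the defaults described above. The function parse_font_info is hypothetical and only mirrors the documented rules; it is not how column() itself is implemented, and the face name 'Helvetica' is used purely as an example.

```perl
use strict;
use warnings;

# Hypothetical helper (NOT part of PDF::Builder): split a font_info string
# of the form 'face:style:weight:color' and fill in the documented defaults
# for an omitted style or weight.
sub parse_font_info {
    my ($string) = @_;

    # '-fm-' requests the FontManager default font
    return { use_default => 1 } if $string eq '-fm-';

    my ($face, $style, $weight, $color) = split /:/, $string;

    $style  = 'normal' if !defined $style  || $style  eq '' || $style  eq '0';
    $style  = 'italic' if $style eq '1';
    $weight = 'normal' if !defined $weight || $weight eq '' || $weight eq '0';
    $weight = 'bold'   if $weight eq '1';

    # color stays undefined if it was not given
    return { face => $face, style => $style, weight => $weight, color => $color };
}

my $info = parse_font_info('Helvetica:1:bold:blue');
printf "%s / %s / %s / %s\n", @{$info}{qw(face style weight color)};
# prints "Helvetica / italic / bold / blue"
```

With a real column() call, the same string would simply be passed as 'font_info' => 'Helvetica:1:bold:blue', provided the face is already known to FontManager.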
           Finally, the "style" option for column() may be given to override
           any of the above settings, e.g., 'style'=>"body { font-family:...
           }", and set the initial current font.

           Remember that, as with anything font-related that column() does,
           the 'face' (family) used must already be known to FontManager
           (explicitly loaded with add_font() if not one of the 28 core
           fonts). Remember that the first 14 fonts are standard PDF, and the
           second 14 are normally supplied with Windows (but not always with
           other operating systems).

       'marker_width' => $marker_width
       'marker_gap' => $marker_gap
           This is the width of the gutter to the left of a list item, where
           (for the first line of the item) the marker lives. The marker
           contains the symbol (for bulleted/unordered lists) or formatted
           number and "before" and "after" text (for numbered/ordered lists).
           Both have a single space (marker_gap = 1em) before the item text
           starts. The number is a length, in points.

           The default is 1 em (1 times the font_size passed to column()),
           and is not adjusted for any changes of font_size in the markup, so
           that lists are indented consistently. This is usually fine for
           unordered (bulleted) lists and single digit ordered (numbered)
           lists, although you may need to make it wider for two or three
           digit numbered lists. An explicit value passed in is also not
           changed -- the gutter width for the marker will be the same in all
           lists (keeping them aligned). If you plan to have exceptionally
           long markers, such as an ordered list of years in Roman numerals,
           e.g., (MCMXCIX), you may want to make this gutter a bit wider.

           A value may be given for the marker_gap, which is the gap between
           the ($marker_width wide) marker and the start of the list item's
           text. The default is $fs points (1 em), set by the font_size in
           the markup.

           The "list-style-position" CSS property may be given as the
           standard 'outside' (the default) or 'inside', or (extension to
           CSS) to indent the left side of second, third, etc.
           <li> lines to somewhere between the 'inside' and 'outside'
           positions.

           Be sure to consider the "_marker-align" extended property to left,
           center, or right (default) align the marker within the
           "marker_gutter".

       'leading' => $leading
           This is the leading ratio used throughout the column text. The
           "$x, $y" position through "$x + width" is assumed to be the first
           text baseline. The next line down will be "$y -
           $leading*$font_size". If the font_size changes for any reason over
           the course of the column, the baseline spacing (leading *
           font_size) will also change. The default leading ratio is 1.125
           (12.5% added to the font size).

       'para' => [ $indent, $top-margin ]
           When starting a new paragraph, these are the default indentation
           (in points), and the extra vertical spacing for a top margin on a
           paragraph. If omitted, the default is "[ 1*$font_size, 0 ]" (1em
           indent, 0 additional vertical space). Either may be overridden by
           the appropriate CSS settings. An outdent may be defined with a
           negative indentation value. These apply to all $markup types.

           At the top of a column, any top margin (not just for paragraphs)
           is ignored.

       'outline' => "color string"
           You may optionally request that the column be outlined in a given
           color, to aid in debugging fitting problems. This will require
           that the graphics context be provided to column().

       'color' => "color string"
           The color to draw the text (or rule or other graphic) in. The
           default is black (#000000).

       'style' => "CSS styling"
           You may define CSS (selectors and properties lists) to override
           the built-in CSS defaults. These will be applied for the entire
           column() call. You can use this, or "style" tags in 'html', but
           for 'none' or 'md1', you will need to use this method to set
           styling. See also the "font_info=>" option to set initial font
           settings.

           Note that, unlike the "style=" attribute in HTML tags, the
           "style=>" option is formatted like a