.\" -*- mode: troff; coding: utf-8 -*-
.\" Automatically generated by Pod::Man 5.0102 (Pod::Simple 3.45)
.\"
.\" Standard preamble:
.\" ========================================================================
.de Sp \" Vertical space (when we can't use .PP)
.if t .sp .5v
.if n .sp
..
.de Vb \" Begin verbatim text
.ft CW
.nf
.ne \\$1
..
.de Ve \" End verbatim text
.ft R
.fi
..
.\" \*(C` and \*(C' are quotes in nroff, nothing in troff, for use with C<>.
.ie n \{\
.    ds C` ""
.    ds C' ""
'br\}
.el\{\
.    ds C`
.    ds C'
'br\}
.\"
.\" Escape single quotes in literal strings from groff's Unicode transform.
.ie \n(.g .ds Aq \(aq
.el       .ds Aq '
.\"
.\" If the F register is >0, we'll generate index entries on stderr for
.\" titles (.TH), headers (.SH), subsections (.SS), items (.Ip), and index
.\" entries marked with X<> in POD.  Of course, you'll have to process the
.\" output yourself in some meaningful fashion.
.\"
.\" Avoid warning from groff about undefined register 'F'.
.de IX
..
.nr rF 0
.if \n(.g .if rF .nr rF 1
.if (\n(rF:(\n(.g==0)) \{\
.    if \nF \{\
.        de IX
.        tm Index:\\$1\t\\n%\t"\\$2"
..
.        if !\nF==2 \{\
.            nr % 0
.            nr F 2
.        \}
.    \}
.\}
.rr rF
.\" ========================================================================
.\"
.IX Title "Mail::SpamAssassin::Pyzor::Digest::Pieces 3"
.TH Mail::SpamAssassin::Pyzor::Digest::Pieces 3 2024-09-01 "perl v5.40.0" "User Contributed Perl Documentation"
.\" For nroff, turn off justification.  Always turn off hyphenation; it makes
.\" way too many mistakes in technical documents.
.if n .ad l
.nh
.SH NAME
Mail::SpamAssassin::Pyzor::Digest::Pieces \- Pyzor backend logic module
.SH DESCRIPTION
.IX Header "DESCRIPTION"
This module houses backend logic for Mail::SpamAssassin::Pyzor::Digest.
.PP
It reimplements logic found in pyzor's \fIdigest.py\fR module.
.SH FUNCTIONS
.IX Header "FUNCTIONS"
.ie n .SS "$strings_ar = digest_payloads( $EMAIL_MIME )"
.el .SS "\f(CW$strings_ar\fP = digest_payloads( \f(CW$EMAIL_MIME\fP )"
.IX Subsection "$strings_ar = digest_payloads( $EMAIL_MIME )"
This imitates the corresponding object method in \fIdigest.py\fR.
It returns a reference to an array of strings. Each string can be either
a byte string or a character string (e.g., UTF\-8 decoded).
.PP
NB: RFC 2822 stipulates that message bodies should use CRLF
line breaks, not plain LF (nor plain CR). This module therefore converts
any plain CRs in a quoted-printable message body into CRLF. Python does
not do this, so the output of this implementation of
\f(CWdigest_payloads()\fR diverges from that of the Python original. The
difference does not affect the final digest, since line-ending whitespace
gets trimmed regardless, but it must be factored in when comparing the
output of this implementation with the Python output.
.ie n .SS "normalize( $STRING )"
.el .SS "normalize( \f(CW$STRING\fP )"
.IX Subsection "normalize( $STRING )"
This imitates the corresponding object method in \fIdigest.py\fR.
It modifies \f(CW$STRING\fR in place.
.PP
As with the original implementation, if \f(CW$STRING\fR contains (decoded)
Unicode characters, those characters are handled as characters rather
than as bytes. So:
.PP
.Vb 1
\&    $str = "123\exc2\exa0";     # [ c2 a0 ] == \eu00a0, non\-breaking space
\&
\&    normalize($str);
.Ve
.PP
The above will leave \f(CW$str\fR alone, but this:
.PP
.Vb 1
\&    utf8::decode($str);
\&
\&    normalize($str);
.Ve
.PP
\&... will trim the trailing non-breaking space (the last two bytes of the
original string) from \f(CW$str\fR.
.ie n .SS "$yn = should_handle_line( $STRING )"
.el .SS "\f(CW$yn\fP = should_handle_line( \f(CW$STRING\fP )"
.IX Subsection "$yn = should_handle_line( $STRING )"
This imitates the corresponding object method in \fIdigest.py\fR.
It returns a boolean.
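.PP
A minimal usage sketch, assuming the functions are called by their fully
qualified names and that each candidate line is normalized before it is
tested. The variable names and sample strings are illustrative only:
.PP
.Vb 1
\&    use Mail::SpamAssassin::Pyzor::Digest::Pieces ();
\&
\&    # Illustrative input; real lines would come from a message payload.
\&    my @candidates = ( "  first candidate line  ", "second" );
\&
\&    my @kept;
\&    for my $line (@candidates) {
\&        # normalize() modifies its argument in place.
\&        Mail::SpamAssassin::Pyzor::Digest::Pieces::normalize($line);
\&
\&        # Keep only the lines for which the check returns true.
\&        push @kept, $line
\&          if Mail::SpamAssassin::Pyzor::Digest::Pieces::should_handle_line($line);
\&    }
.Ve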
.ie n .SS "$sr = assemble_lines( \e@LINES )"
.el .SS "\f(CW$sr\fP = assemble_lines( \e@LINES )"
.IX Subsection "$sr = assemble_lines( @LINES )"
This assembles a string buffer out of \f(CW@LINES\fR. The string is the buffer
of octets that will be hashed to produce the message digest.
.PP
Each member of \f(CW@LINES\fR is expected to be an \fBoctet string\fR, not a
character string.
.ie n .SS "($main, $sub, $encoding, $checkval) = parse_content_type( $CONTENT_TYPE )"
.el .SS "($main, \f(CW$sub\fP, \f(CW$encoding\fP, \f(CW$checkval\fP) = parse_content_type( \f(CW$CONTENT_TYPE\fP )"
.IX Subsection "($main, $sub, $encoding, $checkval) = parse_content_type( $CONTENT_TYPE )"
.ie n .SS "@lines = splitlines( $TEXT )"
.el .SS "\f(CW@lines\fP = splitlines( \f(CW$TEXT\fP )"
.IX Subsection "@lines = splitlines( $TEXT )"
Imitates Python's \f(CW\*(C`str.splitlines()\*(C'\fR (cf. \f(CW\*(C`pydoc str\*(C'\fR).
.PP
In list context, returns a plain list of lines. In scalar context, returns
the number of items that would be returned in list context.
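.PP
A brief sketch of the two calling contexts, assuming the function is called
by its fully qualified name. \f(CW$text\fR and its contents are illustrative:
.PP
.Vb 1
\&    use Mail::SpamAssassin::Pyzor::Digest::Pieces ();
\&
\&    my $text = "first\ensecond\enthird";
\&
\&    # List context: receive the individual lines.
\&    my @lines = Mail::SpamAssassin::Pyzor::Digest::Pieces::splitlines($text);
\&
\&    # Scalar context: receive only the number of lines.
\&    my $count = Mail::SpamAssassin::Pyzor::Digest::Pieces::splitlines($text);
.Ve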