.\" -*- mode: troff; coding: utf-8 -*-
.\" Automatically generated by Pod::Man 5.0102 (Pod::Simple 3.45)
.\"
.\" Standard preamble:
.\" ========================================================================
.de Sp \" Vertical space (when we can't use .PP)
.if t .sp .5v
.if n .sp
..
.de Vb \" Begin verbatim text
.ft CW
.nf
.ne \\$1
..
.de Ve \" End verbatim text
.ft R
.fi
..
.\" \*(C` and \*(C' are quotes in nroff, nothing in troff, for use with C<>.
.ie n \{\
.    ds C` ""
.    ds C' ""
'br\}
.el\{\
.    ds C`
.    ds C'
'br\}
.\"
.\" Escape single quotes in literal strings from groff's Unicode transform.
.ie \n(.g .ds Aq \(aq
.el       .ds Aq '
.\"
.\" If the F register is >0, we'll generate index entries on stderr for
.\" titles (.TH), headers (.SH), subsections (.SS), items (.Ip), and index
.\" entries marked with X<> in POD.  Of course, you'll have to process the
.\" output yourself in some meaningful fashion.
.\"
.\" Avoid warning from groff about undefined register 'F'.
.de IX
..
.nr rF 0
.if \n(.g .if rF .nr rF 1
.if (\n(rF:(\n(.g==0)) \{\
.    if \nF \{\
.        de IX
.        tm Index:\\$1\t\\n%\t"\\$2"
..
.        if !\nF==2 \{\
.            nr % 0
.            nr F 2
.        \}
.    \}
.\}
.rr rF
.\" ========================================================================
.\"
.IX Title "Xapian::TermGenerator 3"
.TH Xapian::TermGenerator 3 2024-09-01 "perl v5.40.0" "User Contributed Perl Documentation"
.\" For nroff, turn off justification.  Always turn off hyphenation; it makes
.\" way too many mistakes in technical documents.
.if n .ad l
.nh
.SH NAME
Search::Xapian::TermGenerator \- Parses a piece of text and generates terms.
.SH DESCRIPTION
.IX Header "DESCRIPTION"
This module takes a piece of text and parses it to produce words which are
then used to generate suitable terms for indexing.  The terms generated are
suitable for use with Search::Xapian::Query objects produced by the
Search::Xapian::QueryParser class.
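.PP
As a minimal sketch of how the two classes pair up, the following indexes a
sentence and then parses a query with the same stemmer, printing the
document's terms and the parsed query so that they can be compared.  The
sample text and the query string are illustrative only, and STEM_SOME is
assumed to be imported through the :all export tag of Search::Xapian.
.PP
.Vb 1
\&  use strict;
\&  use warnings;
\&  use Search::Xapian qw(:all);
\&
\&  my $stemmer = Search::Xapian::Stem\->new("english");
\&
\&  # Index a sentence into a fresh document.
\&  my $doc = Search::Xapian::Document\->new();
\&  my $tg = Search::Xapian::TermGenerator\->new();
\&  $tg\->set_stemmer($stemmer);
\&  $tg\->set_document($doc);
\&  $tg\->index_text("The cats sat on the mat");
\&
\&  # List the terms that the TermGenerator added to the document.
\&  my $end = $doc\->termlist_end();
\&  for (my $it = $doc\->termlist_begin(); $it != $end; $it++) {
\&      print $it\->get_termname(), "\en";
\&  }
\&
\&  # Parse a query with the same stemmer (illustrative query string).
\&  my $qp = Search::Xapian::QueryParser\->new();
\&  $qp\->set_stemmer($stemmer);
\&  $qp\->set_stemming_strategy(STEM_SOME);
\&  print $qp\->parse_query("cat")\->get_description(), "\en";
.Ve
.PP
Sharing one Search::Xapian::Stem object between the two sides is a design
choice of this sketch; what matters is that both use the same stemming
language, so that the stemmed query term also appears in the document's
term list.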
.SH SYNOPSIS
.IX Header "SYNOPSIS"
.Vb 1
\&  use Search::Xapian;
\&
\&  my $doc = Search::Xapian::Document\->new();
\&  my $tg = Search::Xapian::TermGenerator\->new();
\&  $tg\->set_stemmer(Search::Xapian::Stem\->new("english"));
\&  $tg\->set_document($doc);
\&  $tg\->index_text("The cat sat on the mat");
.Ve
.SH METHODS
.IX Header "METHODS"
.IP new 4
.IX Item "new"
TermGenerator constructor.
.IP "set_stemmer <stemmer>" 4
.IX Item "set_stemmer <stemmer>"
Set the Search::Xapian::Stem object to be used for generating stemmed terms.
.IP "set_stopper <stopper>" 4
.IX Item "set_stopper <stopper>"
Set the Search::Xapian::Stopper object to be used for identifying stopwords.
.IP "set_document <document>" 4
.IX Item "set_document <document>"
Set the Search::Xapian::Document object to index terms into.
.IP get_document 4
.IX Item "get_document"
Get the currently set Search::Xapian::Document object.
.IP "index_text <text> [<wdf_inc> [<prefix>]]" 4
.IX Item "index_text <text> [<wdf_inc> [<prefix>]]"
Indexes the text in string <text>.  The optional parameter <wdf_inc> sets the
wdf (within\-document frequency) increment (default 1).  The optional
parameter <prefix> sets the term prefix to use (default is no prefix).
.IP "index_text_without_positions <text> [<wdf_inc> [<prefix>]]" 4
.IX Item "index_text_without_positions <text> [<wdf_inc> [<prefix>]]"
Just like index_text, but no positional information is generated.  This means
that the database will be significantly smaller, but that phrase searching
and NEAR won't be supported.
.IP "increase_termpos [<delta>]" 4
.IX Item "increase_termpos [<delta>]"
Increase the termpos used by index_text by <delta> (default 100).
.Sp
This can be used to prevent phrase searches from spanning two
unconnected blocks of text (e.g. the title and body text).
.IP get_termpos 4
.IX Item "get_termpos"
Get the current term position.
.IP "set_termpos <termpos>" 4
.IX Item "set_termpos <termpos>"
Set the current term position.
.IP get_description 4
.IX Item "get_description"
Return a description of this object.
.SH REFERENCE
.IX Header "REFERENCE"
.Vb 1
\&  https://xapian.org/docs/sourcedoc/html/classXapian_1_1TermGenerator.html
.Ve
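.SH EXAMPLES
.IX Header "EXAMPLES"
The following is a sketch of how the optional index_text arguments and
increase_termpos might be combined when a title and a body are indexed into
the same document.  The wdf increment of 5, the "S" prefix and the sample
strings are illustrative assumptions, not values prescribed by this module.
.PP
.Vb 1
\&  use strict;
\&  use warnings;
\&  use Search::Xapian;
\&
\&  my $doc = Search::Xapian::Document\->new();
\&  my $tg = Search::Xapian::TermGenerator\->new();
\&  $tg\->set_stemmer(Search::Xapian::Stem\->new("english"));
\&  $tg\->set_document($doc);
\&
\&  # Title: wdf increment of 5 and an "S" term prefix (both assumed).
\&  $tg\->index_text("Cats and mats", 5, "S");
\&
\&  # Leave a gap (default 100) so that a phrase search cannot
\&  # span the end of the title and the start of the body.
\&  $tg\->increase_termpos();
\&
\&  # Body: default wdf increment (1) and no prefix.
\&  $tg\->index_text("The cat sat on the mat");
\&
\&  print "current term position: ", $tg\->get_termpos(), "\en";
.Ve
.PP
Calling index_text_without_positions in place of index_text would make the
increase_termpos call unnecessary, at the cost of phrase and NEAR searching.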