Qizx fe-4.4p2 API
java.lang.Object
  com.qizx.xdm.XMLPullStreamBase
    com.qizx.api.util.fulltext.FullTextSnippetExtractor
public class FullTextSnippetExtractor
Extracts a snippet of text from an XML node, attempting to show the key terms of a full-text query.
As an implementation of the FullTextPullStream interface, this object separates full-text term occurrences from the surrounding plain text, so that the terms can be highlighted by enclosing them in an XML element.
To obtain the results, call the method moveToNextEvent repeatedly. The returned events are of type FT_TERM or TEXT, followed finally by XMLPullStream.END.
Example: the full-text query is . ftcontains 'romeo juliet', and the FullTextSnippetExtractor is used on this document:
<PLAY> <TITLE>The Tragedy of Romeo and Juliet</TITLE> <FM> ...
The events generated could be, in order:
TEXT | text='The Tragedy of ' | wordCount=3 |
FT_TERM | text='Romeo' | wordCount=1, termPosition=0 |
TEXT | text=' and ' | wordCount=1 |
FT_TERM | text='Juliet' | wordCount=1, termPosition=1 |
END | | |
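The pull loop implied by the event list above can be sketched without the Qizx library. This is only an illustration under stated assumptions: ReplayStream is a hypothetical stand-in that replays the events listed above, the numeric event codes are invented for the sketch (the real values of TEXT, FT_TERM and END are not given on this page), and only the method names moveToNextEvent, getText and getTermPosition are taken from this class.

```java
import java.util.ArrayList;
import java.util.List;

/** Stand-in model of the event stream; these are not the Qizx classes. */
final class EventLoopSketch {
    // Hypothetical event codes, chosen only for this sketch.
    static final int END = 0, TEXT = 1, FT_TERM = 2;

    /** One recorded event: its code, its text, and the term rank (-1 for plain text). */
    static final class Event {
        final int type;
        final String text;
        final int termPosition;

        Event(int type, String text, int termPosition) {
            this.type = type;
            this.text = text;
            this.termPosition = termPosition;
        }
    }

    /** Minimal pull stream that replays a fixed list of events, then reports END. */
    static final class ReplayStream {
        private final List<Event> events;
        private int index = -1;

        ReplayStream(List<Event> events) {
            this.events = events;
        }

        int moveToNextEvent() {
            index++;
            return index < events.size() ? events.get(index).type : END;
        }

        String getText() {
            return events.get(index).text;
        }

        int getTermPosition() {
            return events.get(index).termPosition;
        }
    }

    /** Drains the stream, bracketing each query term and tagging it with its rank. */
    static String drain(ReplayStream stream) {
        StringBuilder out = new StringBuilder();
        int event;
        while ((event = stream.moveToNextEvent()) != END) {
            if (event == FT_TERM) {
                out.append('[').append(stream.getText())
                   .append('#').append(stream.getTermPosition()).append(']');
            } else if (event == TEXT) {
                out.append(stream.getText());
            }
        }
        return out.toString();
    }

    /** The four events of the example, in order. */
    static ReplayStream romeoAndJuliet() {
        List<Event> events = new ArrayList<>();
        events.add(new Event(TEXT, "The Tragedy of ", -1));
        events.add(new Event(FT_TERM, "Romeo", 0));
        events.add(new Event(TEXT, " and ", -1));
        events.add(new Event(FT_TERM, "Juliet", 1));
        return new ReplayStream(events);
    }
}
```

Calling EventLoopSketch.drain(EventLoopSketch.romeoAndJuliet()) yields the string "The Tragedy of [Romeo#0] and [Juliet#1]". With the real class, the same loop shape would presumably follow a call to start(node).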
The convenience method makeSnippet provides a simpler way to build a snippet. Example (assuming 'session' is an XQuerySession or a Library):
Expression ftquery = session.compileExpression(". ftcontains 'romeo juliet' all words");
FullTextSnippetExtractor ftx = new FullTextSnippetExtractor(ftquery);
Node snippet = ftx.makeSnippet(node, session.getQName("div"), session.getQName("span"),
                               session.getQName("class"), "ft_");
The result would be a node of the form:
<div>The Tragedy of <span class="ft_0">Romeo</span> and <span class="ft_1">Juliet</span></div>
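To make the role of each makeSnippet argument concrete, here is a minimal self-contained sketch of how such markup could be assembled from text chunks. It is not the Qizx implementation: Chunk, render and escape are hypothetical names, plain strings stand in for QName values, and omitting the attribute when styleAttribute is null is an assumption based on that parameter being described as optional.

```java
import java.util.Arrays;
import java.util.List;

/** Illustrative rendering of snippet markup; not the Qizx implementation. */
final class SnippetMarkupSketch {
    /** A text chunk; termPosition is -1 for plain text, the term rank otherwise. */
    static final class Chunk {
        final String text;
        final int termPosition;

        Chunk(String text, int termPosition) {
            this.text = text;
            this.termPosition = termPosition;
        }

        boolean isTerm() {
            return termPosition >= 0;
        }
    }

    /** Wraps the whole snippet in 'wrapper' and each term in 'hiliter'. */
    static String render(List<Chunk> chunks, String wrapper, String hiliter,
                         String styleAttribute, String stylePrefix) {
        StringBuilder out = new StringBuilder();
        out.append('<').append(wrapper).append('>');
        for (Chunk c : chunks) {
            if (!c.isTerm()) {
                out.append(escape(c.text));
                continue;
            }
            out.append('<').append(hiliter);
            if (styleAttribute != null) {
                // Assumed style value: the prefix followed by the term rank.
                out.append(' ').append(styleAttribute).append("=\"")
                   .append(stylePrefix).append(c.termPosition).append('"');
            }
            out.append('>').append(escape(c.text))
               .append("</").append(hiliter).append('>');
        }
        out.append("</").append(wrapper).append('>');
        return out.toString();
    }

    /** Escapes the two characters that would break element content. */
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;");
    }

    /** The chunks of the Romeo and Juliet title, with term ranks 0 and 1. */
    static List<Chunk> example() {
        return Arrays.asList(
            new Chunk("The Tragedy of ", -1),
            new Chunk("Romeo", 0),
            new Chunk(" and ", -1),
            new Chunk("Juliet", 1));
    }
}
```

With the arguments "div", "span", "class" and "ft_", render reproduces the markup shown above; with a null styleAttribute, the sketch emits bare span elements.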
Field Summary | |
---|---|
static int | GAP: A pseudo-event that represents skipped words |
Fields inherited from interface com.qizx.api.fulltext.FullTextPullStream |
---|
FT_TERM |
Fields inherited from interface com.qizx.api.XMLPullStream |
---|
COMMENT, DOCUMENT_END, DOCUMENT_START, ELEMENT_END, ELEMENT_START, END, PROCESSING_INSTRUCTION, START, TEXT |
Constructor Summary | |
---|---|
FullTextSnippetExtractor(Expression query): Creates a FullTextSnippetExtractor from a compiled expression. |
FullTextSnippetExtractor(com.qizx.queries.FullText.Selection query, FullTextFactory ftFactory): For internal use. |
FullTextSnippetExtractor(String simpleSyntaxQuery, FullTextFactory fulltextFactory, String language): Creates a FullTextSnippetExtractor from a query string using the simple full-text syntax. |
Method Summary | |
---|---|
Node | getCurrentNode(): Returns the current node, if the implementation of this object is able to. |
int | getMaxSnippetSize(): Gets the current maximum number of words in a snippet. |
int | getMaxWorkSize(): Gets the maximum number of words examined to create a snippet. |
QName | getName(): Returns the name of the current element node, or if the node is not an element, returns the name of the parent element. |
int | getQueryTermCount(): Returns the number of terms in the query. |
String[] | getQueryTerms(): Returns the terms of the query as a String array. |
int | getTermPosition(): On a FT_TERM event, returns the rank of the term (word, wildcard) in the full-text query. |
String | getText(): Returns the textual contents of an atomic node. |
int | getWordCount(): On a TEXT or FT_TERM event, returns the number of words in the text chunk. |
Node | makeSnippet(Node node, QName wrapperElement, QName hiliterElement, QName styleAttribute, String stylePrefix): Directly builds a snippet from a source Node. |
int | moveToNextEvent(): Moves the event stream one step forward. |
void | setMaxSnippetSize(int maxSnippetSize): Sets the maximum number of words in a snippet. |
void | setMaxWorkSize(int maxWorkSize): Sets the maximum number of words examined to create a snippet. |
void | start(Node node): Searches snippet components in a source XML document or node. |
Methods inherited from class com.qizx.xdm.XMLPullStreamBase |
---|
getAttributeCount, getAttributeName, getAttributeValue, getCurrentEvent, getDTDName, getDTDPublicId, getDTDSystemId, getEncoding, getInternalSubset, getNamespaceCount, getNamespacePrefix, getNamespaceURI, getTarget, getTextLength |
Methods inherited from class java.lang.Object |
---|
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Methods inherited from interface com.qizx.api.XMLPullStream |
---|
getAttributeCount, getAttributeName, getAttributeValue, getCurrentEvent, getDTDName, getDTDPublicId, getDTDSystemId, getEncoding, getInternalSubset, getNamespaceCount, getNamespacePrefix, getNamespaceURI, getTarget, getTextLength |
Field Detail |
---|
public static final int GAP
Constructor Detail |
---|
public FullTextSnippetExtractor(Expression query) throws EvaluationException
Creates a FullTextSnippetExtractor from a compiled expression. Example:
Expression e = session.compileExpression(". ftcontains 'Romeo Juliet' all words case sensitive");
FullTextSnippetExtractor se = new FullTextSnippetExtractor(e);
The context argument is ignored. Full-text options are taken into account. Example:
Expression e = session.compileExpression("ft:contains('+Romeo +Juliet', <options case='sensitive'/>)");
FullTextSnippetExtractor se = new FullTextSnippetExtractor(e);
Expression e = session.compileExpression("'+Romeo +Juliet'");
FullTextSnippetExtractor se = new FullTextSnippetExtractor(e);
The context argument is ignored. Full-text options are taken into account.
Parameters:
query - a compiled full-text predicate, or a string using the simple full-text syntax.
Throws:
EvaluationException
See Also:
FullTextHighlighter
public FullTextSnippetExtractor(String simpleSyntaxQuery, FullTextFactory fulltextFactory, String language) throws DataModelException
Creates a FullTextSnippetExtractor from a query string using the simple full-text syntax.
Parameters:
simpleSyntaxQuery - a query using the simple full-text syntax.
fulltextFactory - a FullTextFactory used with the language parameter to get a tokenizer (both at compile-time and run-time).
language - language used for the options of the full-text query
Throws:
DataModelException - if the query is incorrect
public FullTextSnippetExtractor(com.qizx.queries.FullText.Selection query, FullTextFactory ftFactory) throws EvaluationException
For internal use.
Throws:
EvaluationException
Method Detail |
---|
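public int getMaxSnippetSize()
Gets the current maximum number of words in a snippet.
public void setMaxSnippetSize(int maxSnippetSize)
Sets the maximum number of words in a snippet.
Parameters:
maxSnippetSize - a positive integer
public int getMaxWorkSize()
Gets the maximum number of words examined to create a snippet.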
public void setMaxWorkSize(int maxWorkSize)
Sets the maximum number of words examined to create a snippet. When the scanned document or node belongs to an indexed XML Library, indexes are used to skip directly to occurrences of full-text terms, thus reducing the workload.
Parameters:
maxWorkSize - a positive integer.
public Node makeSnippet(Node node, QName wrapperElement, QName hiliterElement, QName styleAttribute, String stylePrefix) throws DataModelException
Directly builds a snippet from a source Node.
Parameters:
node - source document
wrapperElement - name of an element used to wrap the whole snippet
hiliterElement - name of an element used to wrap each highlighted term
styleAttribute - optional name (can be null) of an attribute of the hiliter element bearing a style indication
stylePrefix - a prefix for the style indicator, if styleAttribute is used.
Throws:
DataModelException - if there is a problem accessing the input node
public void start(Node node) throws DataModelException
Searches snippet components in a source XML document or node.
Parameters:
node - a node (document or element) from which to extract a snippet.
Throws:
DataModelException - raised by problems accessing the source node
public int moveToNextEvent() throws DataModelException
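Description copied from interface: XMLPullStream
Moves the event stream one step forward.
Specified by:
moveToNextEvent in interface XMLPullStream
Returns:
the code of the next event; the end of the stream is signalled by XMLPullStream.END.
Throws:
DataModelException - may be thrown by the stream implementation if access to the data is impossible (deleted document, closed Library).
public int getTermPosition()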
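Description copied from interface: FullTextPullStream
On a FT_TERM event, returns the rank of the term (word, wildcard) in the full-text query.
Example: in the following query, term 'romeo' has position 0 and term 'juliet' has position 1.
. ftcontains "romeo juliet" all words
Note that excluded terms (following ftnot or not in ) are ignored.
Specified by:
getTermPosition in interface FullTextPullStream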
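public int getWordCount()
Description copied from interface: FullTextPullStream
On a TEXT or FT_TERM event, returns the number of words in the text chunk.
Specified by:
getWordCount in interface FullTextPullStream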
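As an illustration of how a caller might use per-chunk word counts, the sketch below sums them to enforce its own word budget. It makes no claim about how Qizx applies setMaxSnippetSize internally: the whitespace-based wordCount is a hypothetical stand-in for the tokenizer, and takeWithinBudget is an invented helper.

```java
import java.util.ArrayList;
import java.util.List;

/** Client-side word budgeting over text chunks; names here are hypothetical. */
final class WordBudgetSketch {
    /** Counts whitespace-separated words; a crude stand-in for a real tokenizer. */
    static int wordCount(String chunk) {
        String trimmed = chunk.trim();
        return trimmed.isEmpty() ? 0 : trimmed.split("\\s+").length;
    }

    /** Keeps leading chunks for as long as the running word total stays within maxWords. */
    static List<String> takeWithinBudget(List<String> chunks, int maxWords) {
        List<String> kept = new ArrayList<>();
        int used = 0;
        for (String chunk : chunks) {
            int n = wordCount(chunk);
            if (used + n > maxWords) {
                break; // the next chunk would exceed the budget
            }
            kept.add(chunk);
            used += n;
        }
        return kept;
    }
}
```

For the chunks 'The Tragedy of ', 'Romeo', ' and ', 'Juliet' (3, 1, 1 and 1 words), a budget of 5 keeps the first three chunks.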
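public String getText()
Description copied from interface: XMLPullStream
Returns the textual contents of an atomic node.
Specified by:
getText in interface XMLPullStream
Overrides:
getText in class com.qizx.xdm.XMLPullStreamBase
public QName getName()
Description copied from interface: XMLPullStream
Returns the name of the current element node, or if the node is not an element, returns the name of the parent element.
Specified by:
getName in interface XMLPullStream
public int getQueryTermCount()
Description copied from interface: FullTextPullStream
Returns the number of terms in the query.
Specified by:
getQueryTermCount in interface FullTextPullStream
public String[] getQueryTerms()
Description copied from interface: FullTextPullStream
Returns the terms of the query as a String array.
Specified by:
getQueryTerms in interface FullTextPullStream
public Node getCurrentNode()
Description copied from interface: XMLPullStream
Returns the current node, if the implementation of this object is able to.
Specified by:
getCurrentNode in interface XMLPullStream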
© 2010 Axyana Software