Qizx fe-4.4p2 API

com.qizx.api.util.fulltext
Class FullTextHighlighter

java.lang.Object
  extended by com.qizx.xdm.XMLPullStreamBase
      extended by com.qizx.api.util.fulltext.FullTextHighlighter
All Implemented Interfaces:
FullTextPullStream, XMLPullStream

public class FullTextHighlighter
extends com.qizx.xdm.XMLPullStreamBase
implements FullTextPullStream

An implementation of FullTextPullStream that highlights the terms of a full-text query or, more generally, distinguishes these terms from the rest of the XML source.

The source can be a node of the Data Model, or any stream of XML events provided by an XMLPullStream.

The full-text query can be specified in several ways: see the constructors.


Field Summary
 
Fields inherited from interface com.qizx.api.fulltext.FullTextPullStream
FT_TERM
 
Fields inherited from interface com.qizx.api.XMLPullStream
COMMENT, DOCUMENT_END, DOCUMENT_START, ELEMENT_END, ELEMENT_START, END, PROCESSING_INSTRUCTION, START, TEXT
 
Constructor Summary
FullTextHighlighter(Expression query)
          Creates a FullTextHighlighter from a compiled XQuery Expression.
FullTextHighlighter(com.qizx.queries.FullText.Selection query, FullTextFactory fulltextFactory)
          For internal use.
FullTextHighlighter(String[] words, FullTextFactory fulltextFactory, String language)
          Creates a FullTextHighlighter from a list of words.
FullTextHighlighter(String simpleSyntaxQuery, FullTextFactory fulltextFactory, String language)
          Creates a FullTextHighlighter from a query string using the simple full-text syntax.
 
Method Summary
 String extractFirstWords(String text, int count)
          Internal use.
 String extractLastWords(String text, int count)
          Internal use.
 Node getCurrentNode()
          Returns the current node, if the implementation can provide it.
 QName getName()
          Returns the name of the current element node, or if the node is not an element, returns the name of the parent element.
 int getQueryTermCount()
          Returns the number of terms in the query.
 String[] getQueryTerms()
          Returns the terms of the query as a String array.
 String getTarget()
          Returns the target name for a PROCESSING_INSTRUCTION.
 int getTermPosition()
          On an FT_TERM event, returns the rank of the term (word, wildcard) in the full-text query.
 String getText()
          Returns the textual contents of an atomic node.
 int getTextLength()
          Returns the size of the textual contents of an atomic node.
 int getWordCount()
          On a TEXT or FT_TERM event, returns the number of words in the text chunk.
 int moveToNextEvent()
          Moves the event stream one step forward.
 void start(Node node)
          Starts iteration on a Node tree.
 void start(XMLPullStream source)
          Starts iteration using another XML Stream as source.
 
Methods inherited from class com.qizx.xdm.XMLPullStreamBase
getAttributeCount, getAttributeName, getAttributeValue, getCurrentEvent, getDTDName, getDTDPublicId, getDTDSystemId, getEncoding, getInternalSubset, getNamespaceCount, getNamespacePrefix, getNamespaceURI
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 
Methods inherited from interface com.qizx.api.XMLPullStream
getAttributeCount, getAttributeName, getAttributeValue, getCurrentEvent, getDTDName, getDTDPublicId, getDTDSystemId, getEncoding, getInternalSubset, getNamespaceCount, getNamespacePrefix, getNamespaceURI
 

Constructor Detail

FullTextHighlighter

public FullTextHighlighter(Expression query)
                    throws EvaluationException
Creates a FullTextHighlighter from a compiled XQuery Expression.

The expression must be either a compiled full-text predicate or a string using the simple full-text syntax.

Parameters:
query - a compiled full-text predicate, or a string using the simple full-text syntax.
Throws:
EvaluationException

FullTextHighlighter

public FullTextHighlighter(String simpleSyntaxQuery,
                           FullTextFactory fulltextFactory,
                           String language)
                    throws DataModelException
Creates a FullTextHighlighter from a query string using the simple full-text syntax. Example:
FullTextHighlighter hiliter =
  new FullTextHighlighter("+Romeo +Juliet", ftfactory, "en");

Parameters:
simpleSyntaxQuery - a query using the simple full-text syntax.
fulltextFactory - a FullTextFactory used with the language parameter to get a tokenizer (both at compile-time and run-time).
language - language used for the options of the full-text query
Throws:
DataModelException - if the query is incorrect

FullTextHighlighter

public FullTextHighlighter(String[] words,
                           FullTextFactory fulltextFactory,
                           String language)
Creates a FullTextHighlighter from a list of words.

Parameters:
words - an array of words, used as is (no tokenization applied).
fulltextFactory - a FullTextFactory used to get a tokenizer
language - language used for the options of the full-text query

FullTextHighlighter

public FullTextHighlighter(com.qizx.queries.FullText.Selection query,
                           FullTextFactory fulltextFactory)
For internal use.

Method Detail

start

public void start(Node node)
           throws DataModelException
Starts iteration on a Node tree. If the node belongs to an XML Library, the iteration can be optimized using XML Library indexes.

Parameters:
node - the root node of the tree to iterate on
Throws:
DataModelException

start

public void start(XMLPullStream source)
Starts iteration using another XML Stream as source. This version cannot be optimized using XML Library indexes.

Parameters:
source - a pull stream. Text nodes (events of type TEXT) can be split into several sections corresponding to recognized full-text terms and plain text (resp. events FT_TERM and TEXT).
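
The split into FT_TERM and TEXT events described above suggests a simple rendering loop. The following is a minimal sketch, not the library's own code: TermEvents, ReplayStream and HighlightSketch are hypothetical stand-ins so that the example runs without Qizx, and the constant values are invented. With the real class, the same loop would presumably call start(...) first, then moveToNextEvent() and getText(), comparing against XMLPullStream.END and FullTextPullStream.FT_TERM.

```java
// Hypothetical stand-in for the FullTextPullStream methods used below.
// The constant values are invented for this sketch.
interface TermEvents {
    int END = 0, TEXT = 1, FT_TERM = 2;
    int moveToNextEvent();
    String getText();
}

// Replays a fixed list of (event, text) pairs, as a highlighter might emit them.
final class ReplayStream implements TermEvents {
    private final int[] events;
    private final String[] texts;
    private int index = -1;

    ReplayStream(int[] events, String[] texts) {
        this.events = events;
        this.texts = texts;
    }

    public int moveToNextEvent() {
        index++;
        return index < events.length ? events[index] : END;
    }

    public String getText() {
        return index >= 0 && index < texts.length ? texts[index] : null;
    }
}

final class HighlightSketch {
    // Concatenates text chunks, wrapping each FT_TERM chunk in <b>...</b>.
    static String render(TermEvents stream) {
        StringBuilder out = new StringBuilder();
        for (int ev = stream.moveToNextEvent();
             ev != TermEvents.END;
             ev = stream.moveToNextEvent()) {
            if (ev == TermEvents.FT_TERM) {
                out.append("<b>").append(stream.getText()).append("</b>");
            } else if (ev == TermEvents.TEXT) {
                out.append(stream.getText());
            }
            // Other events (elements, comments, ...) are skipped in this sketch.
        }
        return out.toString();
    }
}
```

Feeding the sketch the chunks "O ", "Romeo", ", ", "Juliet" with the second and fourth marked as FT_TERM yields "O <b>Romeo</b>, <b>Juliet</b>".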

getQueryTermCount

public int getQueryTermCount()
Description copied from interface: FullTextPullStream
Returns the number of terms in the query.

Specified by:
getQueryTermCount in interface FullTextPullStream

getQueryTerms

public String[] getQueryTerms()
Description copied from interface: FullTextPullStream
Returns the terms of the query as a String array.

Specified by:
getQueryTerms in interface FullTextPullStream

moveToNextEvent

public int moveToNextEvent()
                    throws DataModelException
Description copied from interface: XMLPullStream
Moves the event stream one step forward.

Specified by:
moveToNextEvent in interface XMLPullStream
Returns:
the next event. If the stream has reached its end, returns XMLPullStream.END.
Throws:
DataModelException - may be thrown by the stream implementation in case access to data is impossible (deleted document, closed Library).

getTermPosition

public int getTermPosition()
Description copied from interface: FullTextPullStream
On an FT_TERM event, returns the rank of the term (word, wildcard) in the full-text query. Depends on the actual implementation of this interface.

Example: in the following query, term 'romeo' has position 0, and term 'juliet' has position 1.

 . ftcontains "romeo juliet" all words
 

Note that excluded terms (following ftnot or not in) are ignored.

Specified by:
getTermPosition in interface FullTextPullStream
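
One possible use of the term rank is to count how often each query term occurs in a document. The sketch below assumes that ranks run from 0 to getQueryTermCount() - 1, as the 'romeo'/'juliet' example suggests. RankedTermEvents, RankedReplay and TermHitCounter are hypothetical stand-ins with invented constant values, so the example runs without Qizx.

```java
// Hypothetical stand-in for the FullTextPullStream methods used below.
// The constant values are invented for this sketch.
interface RankedTermEvents {
    int END = 0, TEXT = 1, FT_TERM = 2;
    int moveToNextEvent();
    int getTermPosition();
    int getQueryTermCount();
}

// Replays a fixed sequence of events with a term rank attached to each one.
final class RankedReplay implements RankedTermEvents {
    private final int[] events;
    private final int[] positions; // rank per event; only read on FT_TERM
    private final int termCount;
    private int index = -1;

    RankedReplay(int[] events, int[] positions, int termCount) {
        this.events = events;
        this.positions = positions;
        this.termCount = termCount;
    }

    public int moveToNextEvent() {
        index++;
        return index < events.length ? events[index] : END;
    }

    public int getTermPosition() {
        return positions[index];
    }

    public int getQueryTermCount() {
        return termCount;
    }
}

final class TermHitCounter {
    // Returns hits[i] = number of FT_TERM events whose term rank is i.
    static int[] countHits(RankedTermEvents stream) {
        int[] hits = new int[stream.getQueryTermCount()];
        for (int ev = stream.moveToNextEvent();
             ev != RankedTermEvents.END;
             ev = stream.moveToNextEvent()) {
            if (ev == RankedTermEvents.FT_TERM) {
                hits[stream.getTermPosition()]++;
            }
        }
        return hits;
    }
}
```

The same rank could also select a distinct color or style per query term when rendering.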

getWordCount

public int getWordCount()
Description copied from interface: FullTextPullStream
On a TEXT or FT_TERM event, returns the number of words in the text chunk. For an FT_TERM, the value returned is 1, because phrases are not recognized as a whole.

Specified by:
getWordCount in interface FullTextPullStream
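
Summing the word counts of successive TEXT and FT_TERM events gives the word offset of each hit, which can help when cutting a snippet around a match. The sketch below illustrates this under stated assumptions: WordCountedEvents, WordCountReplay and HitOffsets are hypothetical stand-ins with invented constant values, so the example runs without Qizx.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the FullTextPullStream methods used below.
// The constant values are invented for this sketch.
interface WordCountedEvents {
    int END = 0, TEXT = 1, FT_TERM = 2;
    int moveToNextEvent();
    int getWordCount();
}

// Replays a fixed sequence of events with a word count attached to each one.
final class WordCountReplay implements WordCountedEvents {
    private final int[] events;
    private final int[] wordCounts;
    private int index = -1;

    WordCountReplay(int[] events, int[] wordCounts) {
        this.events = events;
        this.wordCounts = wordCounts;
    }

    public int moveToNextEvent() {
        index++;
        return index < events.length ? events[index] : END;
    }

    public int getWordCount() {
        return wordCounts[index];
    }
}

final class HitOffsets {
    // For each FT_TERM event, records how many words precede it in the stream.
    static List<Integer> wordOffsets(WordCountedEvents stream) {
        List<Integer> offsets = new ArrayList<>();
        int wordsSoFar = 0;
        for (int ev = stream.moveToNextEvent();
             ev != WordCountedEvents.END;
             ev = stream.moveToNextEvent()) {
            if (ev == WordCountedEvents.FT_TERM) {
                offsets.add(wordsSoFar);
            }
            if (ev == WordCountedEvents.TEXT || ev == WordCountedEvents.FT_TERM) {
                wordsSoFar += stream.getWordCount();
            }
        }
        return offsets;
    }
}
```

For a stream of a 3-word TEXT, an FT_TERM, a 2-word TEXT and another FT_TERM, the recorded offsets are 3 and 6.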

getName

public QName getName()
Description copied from interface: XMLPullStream
Returns the name of the current element node, or if the node is not an element, returns the name of the parent element.

Specified by:
getName in interface XMLPullStream
Returns:
the latest element name

getText

public String getText()
Description copied from interface: XMLPullStream
Returns the textual contents of an atomic node. On PROCESSING_INSTRUCTION, returns the contents without the target name. On element and document events, returns null.

Specified by:
getText in interface XMLPullStream
Overrides:
getText in class com.qizx.xdm.XMLPullStreamBase
Returns:
a String for the direct contents of the current leaf node

getTextLength

public int getTextLength()
Description copied from interface: XMLPullStream
Returns the size of the textual contents of an atomic node.

Specified by:
getTextLength in interface XMLPullStream
Overrides:
getTextLength in class com.qizx.xdm.XMLPullStreamBase
Returns:
the number of characters
See Also:
XMLPullStream.getText()

getTarget

public String getTarget()
Description copied from interface: XMLPullStream
Returns the target name for a PROCESSING_INSTRUCTION.

Specified by:
getTarget in interface XMLPullStream
Overrides:
getTarget in class com.qizx.xdm.XMLPullStreamBase
Returns:
a String which is the target name of the PI

extractFirstWords

public String extractFirstWords(String text,
                                int count)
Internal use.


extractLastWords

public String extractLastWords(String text,
                               int count)
Internal use.


getCurrentNode

public Node getCurrentNode()
Description copied from interface: XMLPullStream
Returns the current node, if the implementation can provide it. Otherwise, returns null.

Specified by:
getCurrentNode in interface XMLPullStream

© 2010 Axyana Software