
TextIndexNG (3.0.6, 2.1.1 stable)

by cms last modified 2009-01-08
Released on 2005-09-13 by ZOPYX for Zope 2 and Zope 3 under the Zope Public License (ZPL); available for all platforms.
Software development stage: stable
TextIndexNG is the next-generation fulltext index for the Zope Catalog and the most feature-complete solution for fulltext indexing under Zope. TextIndexNG V 3 is a completely new implementation based on Zope 3 technologies and can be used either in Zope 2.8 or 2.7 (with Five) or in Zope 3.
Sophisticated search with indexing of PDF and MS Word documents.

New features:

  • multi-field support: one index can index multiple fields/attributes of objects; queries can happen against all fields or a single field
  • multi-language support: one index can index documents in different languages. Queries can be limited to a particular language.
  • configurable converters: external converters for foreign formats like PDF, DOC, etc. can be configured through ZCML
  • custom content-types can be indexed either by implementing the required interfaces or by providing an adapter providing the required interfaces for a given content-type
  • Integrates with Zope 2 and Zope 3

Features:

  • document converters
  • stemmer support for 13 languages
  • similarity search for English text (based on the Levenshtein distance)
  • near search
  • pluggable parsers
  • extended stop words support
  • full integration in ZCatalog
  • test functionality through the ZMI
  • extensible architecture
  • more efficient than the current TextIndex
  • full globbing support (wildcard search)
  • normalization support (e.g. reducing accented characters to their base form)
  • full Unicode awareness
  • Relevance ranking of search results added. Searches are now ranked using an extended cosine measure. The cosine measure is based on a vector model and calculates the document "score" based on the frequency of the query terms inside the document result set.
  • Much faster phrase/near search: the old implementation of TextIndexNG had to perform a very expensive job at query time when a phrase/near search was performed. Re-using the WidCode module of ZCTextIndex made this operation less expensive.
  • Left-truncation added: TextIndexNG can be configured at creation time to support left-truncation (meaning you can search for "*suffix"). Left-truncation is optional because it requires a second, reversed index inside the lexicon and much more memory.
  • optional auto-expansion support: this feature can give better search results when some of the query terms are not found. The index expands a query term "foo" to "foo*" if there was no hit for "foo". This expansion is currently global for the index; it will be available on a per-query basis in a later version. (Auto-expansion will also be extended in a later version to search for similar terms.)
  • improved HTML converter: now using Chris Withers' "Strip-o-Gram" module instead of the Strip-Tag-Parser
  • added converter for text/sgml
  • Similarity search (soundex, metaphone, doublemetaphone) dropped and replaced with a more general, language-independent approach using the Levenshtein distance.
  • range searches like "Fi..Foo"
  • substring searches "substring"
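
The similarity search mentioned in the feature list is based on the Levenshtein distance, i.e. the number of single-character insertions, deletions, and substitutions needed to turn one word into another. The following is a minimal, standalone Python sketch of that idea, not TextIndexNG's own code: the names `levenshtein` and `similar_terms`, the `max_distance` parameter, and the plain list used as a lexicon are all assumptions made for illustration.

```python
def levenshtein(a: str, b: str) -> int:
    """Return the edit distance between a and b."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    # previous[j] holds the distance between the processed prefix of a and b[:j]
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute (or match)
            ))
        previous = current
    return previous[-1]


def similar_terms(term, lexicon, max_distance=1):
    """Hypothetical helper: lexicon words within max_distance edits of term."""
    return sorted(w for w in lexicon if levenshtein(term, w) <= max_distance)


if __name__ == "__main__":
    print(similar_terms("zope", ["zope", "zopes", "hope", "plone"]))
```

Because the distance only counts character edits, it needs no language-specific phonetic rules, which is why it can replace soundex-style matching across languages.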
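
The feature list notes that left-truncation ("*suffix") needs a second, reversed index in the lexicon. One way to see why: a sorted word list answers "prefix*" queries with a binary search, and storing every word reversed turns a suffix query into a prefix query on the reversed list, at the cost of holding each word twice. The sketch below illustrates this under stated assumptions; `ToyLexicon` and its methods are hypothetical names and do not reflect TextIndexNG's actual lexicon API.

```python
import bisect


class ToyLexicon:
    """Hypothetical lexicon holding a forward and a reversed sorted word list."""

    def __init__(self, words):
        self._forward = sorted(set(words))
        # Second index: every word reversed, which roughly doubles memory use.
        self._reversed = sorted(w[::-1] for w in self._forward)

    @staticmethod
    def _prefix_scan(sorted_words, prefix):
        # Words sharing a prefix are contiguous in sorted order,
        # starting at the insertion point of the prefix itself.
        start = bisect.bisect_left(sorted_words, prefix)
        matches = []
        for word in sorted_words[start:]:
            if not word.startswith(prefix):
                break
            matches.append(word)
        return matches

    def right_truncation(self, prefix):
        """Answer a 'prefix*' query from the forward list."""
        return self._prefix_scan(self._forward, prefix)

    def left_truncation(self, suffix):
        """Answer a '*suffix' query via a prefix scan on the reversed list."""
        hits = self._prefix_scan(self._reversed, suffix[::-1])
        return sorted(w[::-1] for w in hits)


if __name__ == "__main__":
    lex = ToyLexicon(["indexing", "searching", "index", "catalog"])
    print(lex.right_truncation("index"))
    print(lex.left_truncation("ing"))
```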