The FulltextIndex class
Abstracts access to full text indexes. Only Whoosh and ElasticSearch are currently supported; support for Solr, Xapian and/or Sphinx may be added later.
- class ferenda.FulltextIndex(location, repos)
- static connect(indextype, location, repos=[])
Open a fulltext index (creating it if it doesn’t already exist).
Parameters: - indextype (str) – Type of fulltext index (“WHOOSH” or “ELASTICSEARCH”)
- location (str) – The file path of the fulltext index.
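A static connect method like the one above acts as a factory: the caller names an index type and gets back a matching backend object. The sketch below shows that dispatch pattern only. ToyIndex, ToyWhooshIndex and ToyElasticIndex are hypothetical stand-ins written for illustration; they are not ferenda's classes and do not touch a real index.

```python
class ToyIndex:
    """Hypothetical stand-in for an index front end (not ferenda code)."""

    def __init__(self, location, repos):
        self.location = location
        self.repos = list(repos)

    @staticmethod
    def connect(indextype, location, repos=()):
        # Map the type string onto a backend class; unknown types fail loudly.
        backends = {"WHOOSH": ToyWhooshIndex, "ELASTICSEARCH": ToyElasticIndex}
        try:
            cls = backends[indextype]
        except KeyError:
            raise ValueError("unsupported index type: %r" % (indextype,))
        return cls(location, repos)


class ToyWhooshIndex(ToyIndex):
    """Stand-in for a file-based backend."""


class ToyElasticIndex(ToyIndex):
    """Stand-in for a server-based backend."""


index = ToyIndex.connect("WHOOSH", "data/whooshindex")
print(type(index).__name__)
```

Keeping the choice of backend behind one call means the rest of the program can use the returned object without knowing which engine is behind it.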
- get_default_schema()
- exists()
Whether the fulltext index exists.
- create(schema, repos)
Creates a fulltext index using the provided schema.
- destroy()
Destroys the index, if created.
- open()
Opens the index so that it can be queried.
- schema()
Returns the schema that is actually in use. A schema is a dict where the keys are field names and the values are any subclass of ferenda.fulltextindex.IndexedType.
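To make the shape of such a schema concrete, here is a small self-contained sketch. The IndexedType class below is a local stand-in, and the subclass names (Identifier, Label, Text) and the boost option are assumptions made for illustration, as is the choice to store instances as values. The field names are taken from the parameters of update().

```python
class IndexedType:
    """Local stand-in for ferenda.fulltextindex.IndexedType."""

    def __init__(self, **options):
        self.options = options

    def __repr__(self):
        return "%s(%r)" % (type(self).__name__, self.options)


class Identifier(IndexedType):
    """Hypothetical type for exact-match keys such as URIs."""


class Label(IndexedType):
    """Hypothetical type for short, displayable strings."""


class Text(IndexedType):
    """Hypothetical type for free text that should be searchable."""


# Keys are field names, values describe how each field is indexed.
schema = {
    "uri": Identifier(),
    "repo": Label(),
    "basefile": Label(),
    "title": Text(boost=4),  # "boost" is an assumed option name
    "identifier": Label(),
    "text": Text(),
}


def validate(schema):
    """Return the sorted field names, or raise if a value has the wrong type."""
    bad = [name for name, kind in schema.items()
           if not isinstance(kind, IndexedType)]
    if bad:
        raise TypeError("not an IndexedType: %s" % ", ".join(sorted(bad)))
    return sorted(schema)


print(validate(schema))
```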
- update(uri, repo, basefile, title, identifier, text, **kwargs)
Insert (or update) a resource in the fulltext index. A resource may be an entire document, but it can also be any part of a document that is referenceable (i.e. a document node that has @typeof and @about attributes). A document with 100 sections can be stored as 100 independent resources, as long as each section has a unique key in the form of a URI.
Parameters: - uri (str) – URI for the resource
- repo (str) – The alias for the document repository that the resource is part of
- basefile (str) – The basefile that contains the resource
- title (str) – User-displayable title of resource (if applicable). Should not contain the same information as identifier.
- identifier (str) – User-displayable short identifier for resource (if applicable)
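The idea of storing a document and its sections as independent resources, each keyed by its own URI, can be sketched with a plain dict. ToyIndex is a hypothetical in-memory stand-in, and the example.org URIs, the "#S1" fragment style and the sample texts are invented for illustration.

```python
class ToyIndex:
    """Hypothetical in-memory stand-in; the URI is the unique key."""

    def __init__(self):
        self.resources = {}

    def update(self, uri, repo, basefile, title, identifier, text, **kwargs):
        fields = dict(uri=uri, repo=repo, basefile=basefile,
                      title=title, identifier=identifier, text=text)
        fields.update(kwargs)
        # Insert, or overwrite an earlier version stored under the same URI.
        self.resources[uri] = fields


index = ToyIndex()
base = "http://example.org/doc/123"

# The document as a whole is one resource ...
index.update(base, "example", "123", "Example Act", "Act 123",
             "Full text of the act")

# ... and each section is another, with its own URI.
sections = [("1", "Scope", "Where the act applies"),
            ("2", "Definitions", "Terms used in the act")]
for number, title, text in sections:
    index.update("%s#S%s" % (base, number), "example", "123",
                 title, "Act 123, section %s" % number, text)

# Updating an existing URI replaces the resource instead of adding one.
index.update(base + "#S1", "example", "123", "Scope (amended)",
             "Act 123, section 1", "Where the amended act applies")

print(len(index.resources))
```

All three resources share the same basefile, so a search hit on a section can still be traced back to the document it came from.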
- commit()
Commit all pending updates to the fulltext index.
- close()
Commits all pending updates and closes the index.
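The relationship between update, commit and close can be illustrated with a buffer of pending writes. This is a toy model under stated assumptions: ToyIndex is hypothetical, and the choice that doccount() ignores uncommitted updates is an assumption of this sketch, not documented behaviour of any backend.

```python
class ToyIndex:
    """Hypothetical stand-in that buffers updates until commit()."""

    def __init__(self):
        self._committed = {}
        self._pending = {}
        self.is_open = True

    def update(self, uri, **fields):
        # Nothing becomes visible yet; the write is only queued.
        self._pending[uri] = dict(fields, uri=uri)

    def commit(self):
        # Make all queued writes visible, then empty the queue.
        self._committed.update(self._pending)
        self._pending.clear()

    def close(self):
        # Closing implies a final commit, so no queued write is lost.
        self.commit()
        self.is_open = False

    def doccount(self):
        # Assumption of this sketch: only committed resources are counted.
        return len(self._committed)


index = ToyIndex()
index.update("http://example.org/doc/1", text="first")
before = index.doccount()

index.commit()
after_commit = index.doccount()

index.update("http://example.org/doc/2", text="second")
index.close()
after_close = index.doccount()

print(before, after_commit, after_close)
```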
- doccount()
Returns the number of currently indexed (non-deleted) documents.
- query(q, **kwargs)
Perform a free text query against the full text index, optionally restricted with parameters for individual fields.
Returns: matching documents, each document as a dict of fields
Return type: list
Note
The kwargs parameters do not yet do anything – only simple full text queries are possible.
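The return shape described above, a list with one dict of fields per matching document, can be sketched as follows. The query function here is a hypothetical stand-in that does naive, case-insensitive substring matching on a "text" field; real backends use their own query syntax and ranking. In line with the note, keyword arguments are accepted but ignored.

```python
def query(documents, q, **kwargs):
    """Return every document whose text contains all terms in q.

    kwargs is accepted for interface compatibility but deliberately unused.
    """
    terms = q.lower().split()
    return [dict(doc) for doc in documents
            if all(term in doc["text"].lower() for term in terms)]


# Invented sample data for illustration.
documents = [
    {"uri": "http://example.org/doc/1", "title": "Fishing",
     "text": "Rules for fishing in lakes"},
    {"uri": "http://example.org/doc/2", "title": "Hunting",
     "text": "Rules for hunting in forests"},
]

narrow = query(documents, "rules fishing")
broad = query(documents, "rules", repo="example")  # repo has no effect

print([doc["title"] for doc in narrow])
print(len(broad))
```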