The Facet class

class ferenda.Facet(rdftype=rdflib.term.URIRef('http://purl.org/dc/terms/title'), label=None, pagetitle=None, indexingtype=None, selector=None, key=None, identificator=None, toplevel_only=None, use_for_toc=None, use_for_feed=None, selector_descending=None, key_descending=None, multiple_values=None, dimension_type=None, dimension_label=None)

Create a facet from the given rdftype and some optional parameters.

Parameters:
- rdftype (rdflib.term.URIRef) – The type of facet being created
- label (str) – A template for the label property of TocPageset objects created from this facet
- pagetitle (str) – A template for the title property of TocPage objects created from this facet
- indexingtype (ferenda.fulltext.IndexedType) – Object specifying how to store the data selected by this facet in the fulltext index
- selector (callable) – A function that takes (row, binding, resource_graph) and returns a string acting as a category of some kind
- key (callable) – A function that takes (row, binding, resource_graph) and returns a string usable for sorting
- toplevel_only (bool) – Whether this facet should be applied to documents only, or to any named (ie. given a URI) fragment of a document.
- use_for_toc (bool) – Whether this facet should be used for TOC generation
- use_for_feed (bool) – Whether this facet should be used for newsfeed generation
- selector_descending (bool) – Whether the values returned by selector should be presented in lexical descending order
- key_descending (bool) – Whether documents, when sorted through the key function, should be presented in reverse order.
- multiple_values (bool) – Whether more than one instance of the rdftype value should be processed (such as multiple keywords each specified by one dcterms:subject triple).
- dimension_type (str) – The general type of this facet – can be "type" (values are rdf:type), "ref" (values are URIs), "year" (values are xsd:datetime or similar), or "value" (values are string literals).
- dimension_label (str) – An alternate label for this facet to be used if the selector logic is more transformative than selectional (ie. if it transforms dates to True or False values depending on whether they’re April 1st, you might set this to “aprilfirst”)
- identificator (callable) – A function that takes (row, binding, resource_graph) and returns an identifier-like string usable as an id string or URL segment.

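The selector and key parameters both expect callables with the (row, binding, resource_graph) signature. The following is a minimal sketch of what such callables might look like. The names decade_selector and date_key are hypothetical and not part of ferenda, and the row is assumed to be a plain dict holding ISO date strings.

```python
# Hypothetical callables (not part of ferenda) that follow the
# (row, binding, resource_graph) signature described above.

def decade_selector(row, binding, resource_graph=None):
    """Categorise a row by the decade of its ISO date, eg. '1850s'."""
    year = int(row[binding][:4])
    return "%ds" % (year // 10 * 10)


def date_key(row, binding, resource_graph=None):
    """Return the ISO date string itself, which sorts chronologically."""
    return row[binding]


row = {"dcterms_title": "A Tale of Two Cities",
       "dcterms_issued": "1859-04-30"}

print(decade_selector(row, "dcterms_issued"))  # 1850s
print(date_key(row, "dcterms_issued"))         # 1859-04-30
```

Because a selector like this transforms the value rather than merely selecting it, it is the kind of case where setting dimension_label (here perhaps to "decade") would make sense.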
If optional parameters aren’t provided, then appropriate values are selected if rdftype is one of some common rdf properties:
facet – description
- rdf:type – Grouped by qname() of the rdf:type of the document, eg. foaf:Document. Not used for toc
- dcterms:title – Grouped by first “sortable” letter, eg for a document titled “The Little Prince” returns “l”. Is used as a facet for the API, but it’s debatable if it’s useful
- dcterms:identifier – Also grouped by first sortable letter. When indexing, the resulting fulltext index field has a high boost value, which increases the chances of this document ranking high when one searches for its identifier.
- dcterms:abstract – Not used for toc
- dc:creator – Should be a free-text (string literal) value
- dcterms:publisher – Should be a URIRef
- dcterms:references
- dcterms:issued – Used for grouping documents published/issued in the same year
- dc:subject – A document can have multiple dc:subjects and all are indexed/processed
- dcterms:subject – Works like dc:subject, but the value should be a URIRef
- schema:free – A boolean value

This class provides a number of classmethods that can be used as arguments to selector and key, eg

>>> from ferenda import Facet
>>> from rdflib import Namespace
>>> MYVOCAB = Namespace("http://example.org/vocab/")
>>> f = Facet(MYVOCAB.enactmentDate, selector=Facet.year)
>>> f.selector({'myvocab_enactmentDate': '2014-07-06'},
...            'myvocab_enactmentDate')
'2014'

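To illustrate how a selector and a key might work together, here is a rough sketch that groups rows by the selector value and sorts each group by the key value, honouring flags named after selector_descending and key_descending. This is not ferenda's TOC code. The functions group_rows, year and issued_key are hypothetical stand-ins, and the second and third rows are made-up data.

```python
from collections import defaultdict


def year(row, binding="dcterms_issued", resource_graph=None):
    # Stand-in for Facet.year: the first four characters of an ISO date
    return row[binding][:4]


def issued_key(row, binding, resource_graph=None):
    # ISO date strings sort chronologically as plain strings
    return row[binding]


def group_rows(rows, binding, selector, key,
               selector_descending=False, key_descending=False):
    """Group rows by selector value, then sort each group by key value."""
    groups = defaultdict(list)
    for row in rows:
        groups[selector(row, binding)].append(row)
    result = []
    for label in sorted(groups, reverse=selector_descending):
        members = sorted(groups[label],
                         key=lambda r: key(r, binding),
                         reverse=key_descending)
        result.append((label, members))
    return result


rows = [
    {"dcterms_title": "Hypothetical Document C", "dcterms_issued": "1861-08-01"},
    {"dcterms_title": "Hypothetical Document B", "dcterms_issued": "1859-11-24"},
    {"dcterms_title": "A Tale of Two Cities", "dcterms_issued": "1859-04-30"},
]

pagesets = group_rows(rows, "dcterms_issued", selector=year,
                      key=issued_key, selector_descending=True)
for label, members in pagesets:
    print(label, [m["dcterms_title"] for m in members])
# 1861 ['Hypothetical Document C']
# 1859 ['A Tale of Two Cities', 'Hypothetical Document B']
```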
classmethod defaultselector(row, binding, resource_graph=None)

This returns row[binding] without any transformation.

>>> row = {"rdf_type": "http://purl.org/ontology/bibo/Book",
...        "dcterms_title": "A Tale of Two Cities",
...        "dcterms_issued": "1859-04-30",
...        "dcterms_publisher": "http://example.org/chapman_hall",
...        "schema_free": "true"}
>>> Facet.defaultselector(row, "dcterms_title")
'A Tale of Two Cities'

classmethod defaultidentificator(row, binding, resource_graph=None)

This returns row[binding] run through a simple slug-like transformation.

>>> row = {"rdf_type": "http://purl.org/ontology/bibo/Book",
...        "dcterms_title": "A Tale of Two Cities",
...        "dcterms_issued": "1859-04-30",
...        "dcterms_publisher": "http://example.org/chapman_hall",
...        "schema_free": "true"}
>>> Facet.defaultidentificator(row, "dcterms_title")
'a-tale-of-two-cities'

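The exact transformation is not spelled out here. As a rough sketch, a slug-like transformation that reproduces the documented result could look like the following. The function slug is hypothetical and is not ferenda's implementation. Note that this version silently drops non-ASCII letters.

```python
import re


def slug(value):
    """Lowercase the value and join its ASCII alphanumeric runs with hyphens."""
    return "-".join(re.findall(r"[a-z0-9]+", value.lower()))


print(slug("A Tale of Two Cities"))  # a-tale-of-two-cities
```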
classmethod year(row, binding='dcterms_issued', resource_graph=None)

This returns the year part of row[binding].

>>> row = {"rdf_type": "http://purl.org/ontology/bibo/Book",
...        "dcterms_title": "A Tale of Two Cities",
...        "dcterms_issued": "1859-04-30",
...        "dcterms_publisher": "http://example.org/chapman_hall",
...        "schema_free": "true"}
>>> Facet.year(row, "dcterms_issued")
'1859'

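For values that are ISO 8601 dates or datetimes, extracting the year can be sketched with the standard library alone. The helper year_of below is hypothetical and only illustrates the idea. It assumes the value starts with a full YYYY-MM-DD date and makes no claim about how ferenda parses the value.

```python
from datetime import date


def year_of(value):
    """Return the year of an ISO 8601 date or datetime string, as a string."""
    # Only the leading YYYY-MM-DD part is parsed, so a time part is ignored
    return str(date.fromisoformat(value[:10]).year)


print(year_of("1859-04-30"))           # 1859
print(year_of("2014-07-06T12:00:00"))  # 2014
```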
classmethod booleanvalue(row, binding='schema_free', resource_graph=None)

Returns True iff row[binding] == "true", False otherwise.

>>> row = {"rdf_type": "http://purl.org/ontology/bibo/Book",
...        "dcterms_title": "A Tale of Two Cities",
...        "dcterms_issued": "1859-04-30",
...        "dcterms_publisher": "http://example.org/chapman_hall",
...        "schema_free": "true"}
>>> Facet.booleanvalue(row, "schema_free")
True

classmethod titlesortkey(row, binding='dcterms_title', resource_graph=None)

Returns a version of row[binding] suitable for sorting. The function title_sortkey() is used for string transformation.

>>> row = {"rdf_type": "http://purl.org/ontology/bibo/Book",
...        "dcterms_title": "A Tale of Two Cities",
...        "dcterms_issued": "1859-04-30",
...        "dcterms_publisher": "http://example.org/chapman_hall",
...        "schema_free": "true"}
>>> Facet.titlesortkey(row, "dcterms_title")
'ataleoftwocities'

classmethod firstletter(row, binding='dcterms_title', resource_graph=None)

Returns the first letter of row[binding], transformed into a sortable string.

>>> row = {"rdf_type": "http://purl.org/ontology/bibo/Book",
...        "dcterms_title": "A Tale of Two Cities",
...        "dcterms_issued": "1859-04-30",
...        "dcterms_publisher": "http://example.org/chapman_hall",
...        "schema_free": "true"}
>>> Facet.firstletter(row, "dcterms_title")
'a'

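The documented results for titlesortkey and firstletter (and the “The Little Prince” example, which yields “l”) are consistent with a transformation that lowercases the string, drops a leading “the ”, and removes non-word characters. The sketch below is an assumption reconstructed from those outputs, not the source of title_sortkey(). Both function names are hypothetical.

```python
import re


def sortkey_sketch(title):
    """Lowercase, drop a leading 'the ', and remove non-word characters."""
    s = title.lower()
    if s.startswith("the "):
        s = s[4:]
    return re.sub(r"\W+", "", s)


def firstletter_sketch(title):
    """First character of the sortable form of the title."""
    return sortkey_sketch(title)[0]


print(sortkey_sketch("A Tale of Two Cities"))   # ataleoftwocities
print(firstletter_sketch("The Little Prince"))  # l
```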
classmethod resourcelabel(row, binding='dcterms_publisher', resource_graph=None)

Lookup a suitable text label for row[binding] in resource_graph.

>>> row = {"rdf_type": "http://purl.org/ontology/bibo/Book",
...        "dcterms_title": "A Tale of Two Cities",
...        "dcterms_issued": "1859-04-30",
...        "dcterms_publisher": "http://example.org/chapman_hall",
...        "schema_free": "true"}
>>> import rdflib
>>> resources = rdflib.Graph().parse(format="turtle", data="""
... @prefix foaf: <http://xmlns.com/foaf/0.1/> .
...
... <http://example.org/chapman_hall> a foaf:Organization;
...     foaf:name "Chapman & Hall" .
...
... """)
>>> Facet.resourcelabel(row, "dcterms_publisher", resources)
'Chapman & Hall'

classmethod sortresource(row, binding='dcterms_publisher', resource_graph=None)

Returns a sortable version of the resource label for row[binding].

>>> row = {"rdf_type": "http://purl.org/ontology/bibo/Book",
...        "dcterms_title": "A Tale of Two Cities",
...        "dcterms_issued": "1859-04-30",
...        "dcterms_publisher": "http://example.org/chapman_hall",
...        "schema_free": "true"}
>>> import rdflib
>>> resources = rdflib.Graph().parse(format="turtle", data="""
... @prefix foaf: <http://xmlns.com/foaf/0.1/> .
...
... <http://example.org/chapman_hall> a foaf:Organization;
...     foaf:name "Chapman & Hall" .
...
... """)
>>> Facet.sortresource(row, "dcterms_publisher", resources)
'chapmanhall'

classmethod term(row, binding='dcterms_publisher', resource_graph=None)

Returns the leaf part of the URI found in row[binding].

>>> row = {"rdf_type": "http://purl.org/ontology/bibo/Book",
...        "dcterms_title": "A Tale of Two Cities",
...        "dcterms_issued": "1859-04-30",
...        "dcterms_publisher": "http://example.org/chapman_hall",
...        "schema_free": "true"}
>>> Facet.term(row, "dcterms_publisher")
'chapman_hall'

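One plausible reading of “leaf part” is whatever follows the last “/” or “#” in the URI. The hypothetical helper below sketches that reading. It is an assumption and may differ from what ferenda does for URIs with trailing separators or query strings.

```python
import re


def leaf(uri):
    """Return the text after the last '/' or '#' in the URI."""
    # Trailing separators are stripped first so the result is never empty
    # for URIs that merely end in '/' or '#'
    return re.split(r"[/#]", uri.rstrip("/#"))[-1]


print(leaf("http://example.org/chapman_hall"))             # chapman_hall
print(leaf("http://www.w3.org/2000/01/rdf-schema#label"))  # label
```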
classmethod qname(row, binding='rdf_type', resource_graph=None)

Returns the qname of the rdf URIref contained in row[binding], as determined by the namespace prefixes registered in resource_graph.

>>> row = {"rdf_type": "http://purl.org/ontology/bibo/Book",
...        "dcterms_title": "A Tale of Two Cities",
...        "dcterms_issued": "1859-04-30",
...        "dcterms_publisher": "http://example.org/chapman_hall",
...        "schema_free": "true"}
>>> import rdflib
>>> resources = rdflib.Graph()
>>> resources.bind("bibo", "http://purl.org/ontology/bibo/")
>>> Facet.qname(row, "rdf_type", resources)
'bibo:Book'
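
The idea of resolving a URI against registered prefixes can be sketched without rdflib by using a plain dict of prefix-to-namespace mappings. The function qname_sketch and the prefixes dict are hypothetical. Returning the URI unchanged when no prefix matches is an assumption of this sketch, not documented behaviour.

```python
def qname_sketch(uri, prefixes):
    """Shorten a URI to prefix:localname using a prefix-to-namespace dict."""
    # Try the longest namespace first so nested namespaces resolve correctly
    candidates = sorted(prefixes.items(),
                        key=lambda item: len(item[1]), reverse=True)
    for prefix, namespace in candidates:
        if uri.startswith(namespace):
            return "%s:%s" % (prefix, uri[len(namespace):])
    return uri  # assumption: unknown namespaces are left as-is


prefixes = {"bibo": "http://purl.org/ontology/bibo/",
            "foaf": "http://xmlns.com/foaf/0.1/"}

print(qname_sketch("http://purl.org/ontology/bibo/Book", prefixes))  # bibo:Book
```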