The ReST API for querying¶
Ferenda tries to adhere to Linked Data principles, which makes it easy
to explain how to get information about any individual document or any
complete dataset (see URLs for retrieving resources). Sometimes,
though, it’s desirable to query for all documents matching particular
criteria, including full text search. Ferenda has a simple API for
this, based on the rinfo-service component of RDL and inspired by the
Linked Data API. This API only provides search/select operations that
return a result list. For information about each individual result in
that list, use the methods described in URLs for retrieving resources.
Note
Many of the things described below can also be done in pure SPARQL. Ferenda does not expose any open SPARQL endpoints to the world, though. If you find the API below lacking in some aspect, it’s certainly possible to directly expose your chosen triplestore’s SPARQL endpoint (as long as you’re using Fuseki or Sesame) to the world.
The default endpoint to query is your main URL + /api/, eg.
http://localhost:8000/api/. The requests always use GET and encode
their parameters in the URL, and the responses are always in JSON
format.
Free text queries¶
The simplest form of query is a free text query that is run against
all text of all documents. Use the parameter q, eg.
http://localhost:8000/api/?q=tail returns all documents (and document
fragments) containing the word “tail”.
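As a minimal sketch of a client for such a query, the following uses only the Python standard library. The helper names build_query_url and search are made up for this illustration and are not part of Ferenda, and the endpoint is assumed to be the example URL used above.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: the API is served at the example endpoint used in this document.
API_ENDPOINT = "http://localhost:8000/api/"


def build_query_url(endpoint, **params):
    """Return the endpoint URL with the parameters encoded in the query string."""
    if not params:
        return endpoint
    return endpoint + "?" + urlencode(params)


def search(endpoint, **params):
    """Send a GET request to the API and decode the JSON response."""
    with urlopen(build_query_url(endpoint, **params)) as response:
        return json.load(response)


# Only builds the URL; call search(API_ENDPOINT, q="tail") against a running server.
print(build_query_url(API_ENDPOINT, q="tail"))
```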
Result lists¶
The result of a query will be a JSON document containing some general properties of the result, and a list of result items, eg:
{
    "current": "/myapi/?q=tail",
    "duration": null,
    "items": [
        {
            "dcterms_identifier": "123(A)",
            "dcterms_issued": "2014-01-04",
            "dcterms_publisher": {
                "iri": "http://example.org/publisher/A",
                "label": "http://example.org/publisher/A"
            },
            "dcterms_title": "Example",
            "matches": {
                "text": "<em class=\"match\">tail</em> end of the main document"
            },
            "rdf_type": "http://purl.org/ontology/bibo/Standard",
            "iri": "http://example.org/base/123/a"
        }
    ],
    "itemsPerPage": 10,
    "startIndex": 0,
    "totalResults": 1
}
Each result item contains all fields that have been indexed (as
specified by your docrepos’ facets, see Grouping documents with
facets), the document URI (as the field iri) and optionally a field
matches that provides a snippet of the matching text.
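A client will typically loop over the items list and pick out the fields it needs. The sketch below does this for a trimmed-down copy of the sample response; the function name summarize is hypothetical, and the code assumes that matches, when present, has a text key as in the sample.

```python
# Trimmed-down copy of the sample response shown in this section.
sample_response = {
    "items": [
        {
            "dcterms_title": "Example",
            "iri": "http://example.org/base/123/a",
            "matches": {
                "text": '<em class="match">tail</em> end of the main document'
            },
        }
    ],
    "totalResults": 1,
}


def summarize(response):
    """Return (iri, title, snippet) for each result item.

    The snippet is None for items without a "matches" field,
    since that field is optional.
    """
    rows = []
    for item in response.get("items", []):
        snippet = item.get("matches", {}).get("text")
        rows.append((item["iri"], item.get("dcterms_title"), snippet))
    return rows


for iri, title, snippet in summarize(sample_response):
    print(iri, title, snippet)
```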
Parameters¶
Any indexed property, as defined by your facets, can be used for
querying. The parameter is the same as the qname for the rdftype with
_ instead of : (dcterms:publisher becomes dcterms_publisher), eg. to
search all documents that have dcterms:publisher set to
http://example.org/publisher/A, use
http://localhost:8000/api/?dcterms_publisher=http%3A%2F%2Fexample.org%2Fpublisher%2FA
You can use * as a wildcard for any string data, eg. the above could
be shortened to
http://localhost:8000/api/?dcterms_publisher=*%2Fpublisher%2FA.
If you have a facet with a set dimension_label, you can use that
label directly as a parameter, eg
http://localhost:8000/api/?aprilfools=true.
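The percent-encoding of parameter values is easy to get wrong by hand. A small sketch, with the hypothetical helpers param_name and property_query, shows one way to derive the parameter name from a qname and encode the value with the standard library. Passing safe="*" keeps the wildcard character unescaped, matching the URLs above.

```python
from urllib.parse import urlencode


def param_name(qname):
    """Turn a qname such as "dcterms:publisher" into "dcterms_publisher"."""
    return qname.replace(":", "_", 1)


def property_query(qname, value):
    """Return an encoded query string for a single indexed property."""
    # safe="*" leaves the wildcard as-is; ":" and "/" are percent-encoded.
    return urlencode({param_name(qname): value}, safe="*")


exact = property_query("dcterms:publisher", "http://example.org/publisher/A")
wildcard = property_query("dcterms:publisher", "*/publisher/A")

print("http://localhost:8000/api/?" + exact)
print("http://localhost:8000/api/?" + wildcard)
```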
Paging¶
By default, the result list only contains 10 results. You can inspect
the properties startIndex and totalResults of the response to find
out if there are more results, and use the special parameter _page to
request subsequent pages of results. You can also request a different
length of the result list through the _pageSize parameter.
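The check for further pages can be sketched as follows. The helper names are hypothetical, and the code assumes that startIndex is the offset of the first item on the current page. This text does not say whether _page numbering starts at 0 or 1, so the page number is left for the caller to choose.

```python
import math


def has_more_results(response):
    """True if there are results beyond the current page.

    Assumption: startIndex is the offset of the first item on this page.
    """
    seen = response["startIndex"] + len(response["items"])
    return seen < response["totalResults"]


def page_count(response):
    """Number of pages needed to hold all results at the current page size."""
    return math.ceil(response["totalResults"] / response["itemsPerPage"])


def paging_params(page, page_size=None):
    """Build the special paging parameters for a request."""
    params = {"_page": page}
    if page_size is not None:
        params["_pageSize"] = page_size
    return params


# A made-up first page: 10 of 25 results returned.
first_page = {
    "items": [{}] * 10,
    "itemsPerPage": 10,
    "startIndex": 0,
    "totalResults": 25,
}
print(has_more_results(first_page), page_count(first_page))
```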
Statistics¶
By requesting the special resource ;stats, eg
http://localhost:8000/api/;stats, you can get a statistics view over
all documents in all your docrepos for each of your defined facets,
including the number of documents for each value of its selector, eg:
{
    "type": "DataSet",
    "slices": [
        {
            "dimension": "rdf_type",
            "observations": [
                {
                    "count": 3,
                    "term": "bibo:Standard"
                }
            ]
        },
        {
            "dimension": "dcterms_publisher",
            "observations": [
                {
                    "count": 1,
                    "ref": "http://example.org/publisher/A"
                },
                {
                    "count": 2,
                    "ref": "http://example.org/publisher/B"
                }
            ]
        },
        {
            "dimension": "dcterms_issued",
            "observations": [
                {
                    "count": 1,
                    "year": "2013"
                },
                {
                    "count": 2,
                    "year": "2014"
                }
            ]
        }
    ]
}
You can also get the same information for the documents in any result
list by setting the special parameter _stats=on.
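For display purposes it is often handier to have the statistics as a plain mapping from dimension to value counts. The sketch below, with the hypothetical function flatten_stats, assumes that every observation carries a count plus exactly one other key naming the value (term, ref or year in the sample above).

```python
# Shortened copy of the sample statistics shown in this section.
sample_stats = {
    "type": "DataSet",
    "slices": [
        {
            "dimension": "rdf_type",
            "observations": [{"count": 3, "term": "bibo:Standard"}],
        },
        {
            "dimension": "dcterms_issued",
            "observations": [
                {"count": 1, "year": "2013"},
                {"count": 2, "year": "2014"},
            ],
        },
    ],
}


def flatten_stats(stats):
    """Map each dimension to a {value: count} dictionary."""
    table = {}
    for slice_ in stats.get("slices", []):
        counts = {}
        for observation in slice_.get("observations", []):
            # Assumption: the single non-"count" key holds the observed value.
            value = next(v for k, v in observation.items() if k != "count")
            counts[value] = observation["count"]
        table[slice_["dimension"]] = counts
    return table


print(flatten_stats(sample_stats))
```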
Ranges¶
For some parameters, particularly those that use datetime values, it’s
useful to specify ranges instead of exact values. By prefixing the
parameter name with min-, max- or year-, it’s possible to do that,
eg. http://localhost:8000/api/?min-dcterms_issued=2012-04-01 to
retrieve all documents that have a dcterms:issued later than
2012-04-01, or http://localhost:8000/api/?year-dcterms_issued=2012
to retrieve all documents that are dcterms:issued during 2012.
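The prefixes can be applied mechanically, as in this sketch. The function range_params is hypothetical. Combining min- and max- in one request is an assumption here, since the text above only shows each prefix on its own.

```python
from urllib.parse import urlencode


def range_params(name, minimum=None, maximum=None, year=None):
    """Build range parameters for an indexed property such as dcterms_issued."""
    params = {}
    if minimum is not None:
        params["min-" + name] = minimum
    if maximum is not None:
        params["max-" + name] = maximum
    if year is not None:
        params["year-" + name] = year
    return params


endpoint = "http://localhost:8000/api/"

# The two examples from the text above.
print(endpoint + "?" + urlencode(range_params("dcterms_issued", minimum="2012-04-01")))
print(endpoint + "?" + urlencode(range_params("dcterms_issued", year=2012)))

# Assumed to work: a closed interval using both prefixes at once.
closed = range_params("dcterms_issued", minimum="2012-04-01", maximum="2012-12-31")
print(endpoint + "?" + urlencode(closed))
```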
Support resources¶
The special resources common.json and terms.json (eg.
http://localhost:8000/api/common.json and
http://localhost:8000/api/terms.json) contain all the extra data
(see Custom common data) and ontologies (see Custom ontologies) that
your repositories use, in JSON-LD format. You can use these to display
user-friendly labels for properties and things in your application.
Legacy mode¶
Ferenda can be made directly compatible with the API used by
rinfo-service (mentioned above) by activating the setting legacyapi,
eg by setting legacyapi = True in ferenda.conf or using the option
--legacyapi on the command line.
Note that this setting is used both during the makeresources step and
when serving the API, eg with the runserver command. If you want to
play with this setting, you’ll need to re-run makeresources --force
with this enabled.
Running makeresources with this setting enabled also installs an API
explorer app, taken from rinfo-service. You can try it out at
http://localhost:8000/rsrc/ui/.