Explore collections of multilingual public procurement data through a RESTful API:
/documents
: list of existing documents
/documents/{id}
: details of a document
/documents/{id}/items
: similar documents
Or search for documents similar to a given text:
/items
: similar documents
- A Swagger-based API is available online at: http://tbfy.librairy.linkeddata.es/search-api
- Get the list of available documents, and filter by language or source, using /documents: http://tbfy.librairy.linkeddata.es/search-api/documents
- Get the content, and additional information, of a document through /documents/{id}: http://tbfy.librairy.linkeddata.es/search-api/documents/jrc32002D0996-en
- Obtain similar documents, regardless of language, through /documents/{id}/items: http://tbfy.librairy.linkeddata.es/search-api/documents/jrc32002D0996-en/items
- To obtain only documents in Spanish, just add lang=es to the query: http://tbfy.librairy.linkeddata.es/search-api/documents/jrc32002D0996-en/items?lang=es
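The GET examples above can also be called from code. The following is a minimal Python sketch using only the standard library. The helper names `similar_documents_url` and `fetch_json` are hypothetical, and the sketch assumes the service replies with JSON; the shape of the reply is not described here, so it is returned undecoded beyond JSON parsing.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import urlopen

# Base URL taken from the examples above.
BASE_URL = "http://tbfy.librairy.linkeddata.es/search-api"


def similar_documents_url(doc_id, lang=None):
    """Build the /documents/{id}/items URL, optionally restricted to one language."""
    url = f"{BASE_URL}/documents/{quote(doc_id, safe='')}/items"
    if lang:
        url += "?" + urlencode({"lang": lang})
    return url


def fetch_json(url, timeout=30):
    """GET a URL and parse the body, assuming the service answers with JSON."""
    with urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With network access, `fetch_json(similar_documents_url("jrc32002D0996-en", lang="es"))` would request the Spanish documents similar to the example document.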
Documents similar to a free text can also be searched. All you have to do is make an HTTP POST request to /items with a JSON body like this:
{
"size": 10,
"source": "jrc",
"text": "Council Directive 9343EEC on the hygiene of foodstuffs as regards the transport of bulk liquid oils and fats by seaText with EEA relevance."
}
To obtain only documents in Spanish, just add "lang": "es" to the JSON:
{
"size": 10,
"source": "jrc",
"text": "Council Directive 9343EEC on the hygiene of foodstuffs as regards the transport of bulk liquid oils and fats by seaText with EEA relevance.",
"lang":"es"
}
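As a rough illustration, the POST query could be prepared in Python as follows. This is a sketch under assumptions: the full address of the free-text search is not spelled out above, so `ITEMS_URL` assumes that /items sits under the same base URL as the GET examples, and the helper names `build_search_request` and `search` are hypothetical. The reply is assumed to be JSON.

```python
import json
from urllib.request import Request, urlopen

# Assumption: the free-text search lives at /items under the same base URL
# as the GET examples; the full address is not given in the text.
ITEMS_URL = "http://tbfy.librairy.linkeddata.es/search-api/items"


def build_search_request(text, size=10, source="jrc", lang=None):
    """Prepare (but do not send) a POST request carrying the JSON query."""
    payload = {"size": size, "source": source, "text": text}
    if lang:
        payload["lang"] = lang  # e.g. "es" to keep only Spanish documents
    return Request(
        ITEMS_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def search(text, **kwargs):
    """Send the request and parse the reply, assuming it is JSON."""
    with urlopen(build_search_request(text, **kwargs), timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))
```

Keeping request construction separate from sending makes it possible to inspect the JSON body before anything goes over the network.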
- Download the latest data dump available at Zenodo: https://doi.org/10.5281/zenodo.3783736
- Unzip it, for example in /tmp. A folder is created per month.
- Download the indexing script. It is implemented in Python, but can easily be ported to other languages: http://tbfy.librairy.linkeddata.es/search-api/src/main/python/index-tenders.py
- Edit it to set the root directory where the documents are. For example, /tmp:
main('/tmp/20*')
As you can see, the directories to be indexed can be filtered in the path itself by adding * characters.
- Run it! That's it.
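To see which folders a pattern such as /tmp/20* would select before running the indexer, a small check like the one below may help. It is only a sketch: it uses shell-style wildcard matching from Python's standard `glob` module, which may or may not be how index-tenders.py expands its path, and the helper name `select_folders` is hypothetical.

```python
import glob
import os


def select_folders(pattern):
    """Return the directories matching a wildcard pattern, sorted by name."""
    return sorted(p for p in glob.glob(pattern) if os.path.isdir(p))
```

For instance, `select_folders('/tmp/20*')` lists every directory under /tmp whose name starts with "20", and skips plain files that happen to match.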
More info here
This tool is part of the librAIry ecosystem, and needs librAIry-API for deployment.
- It can start as a service via docker-compose.yml:
- Or through Maven dependencies:
- Add the JitPack repository to your build file
<repositories>
  <repository>
    <id>jitpack.io</id>
    <url>https://jitpack.io</url>
  </repository>
</repositories>
- Add the dependency
<dependency>
  <groupId>com.github.TBFY</groupId>
  <artifactId>search-API</artifactId>
  <version>last-stable-release-version</version>
</dependency>
Please take a look at our contributing guidelines if you're interested in helping!