R43ples (Revision for triples) is an open source Revision Management Tool for the Semantic Web.
It provides different revisions of named graphs via a SPARQL interface. All information about revisions, changes, commits, branches and tags is stored in additional named graphs beside the original graph in an attached external triple store.
This project provides an enhanced SPARQL endpoint for revision management of named graphs. R43ples uses an internal Jena TDB and acts as an endpoint both for normal SPARQL queries and for revision-enhanced SPARQL queries, named R43ples queries. The R43ples endpoint lets you specify, for each named graph used inside a SPARQL query, the revision that should be queried. All revision information is stored in additional graphs in the attached Jena TDB.
The Javadoc is available at http://plt-tud.github.io/r43ples/site/apidocs/.
A running test server should be available under http://eatld.et.tu-dresden.de:9998/r43ples/sparql
- JDK 1.7
- Maven
sudo apt-get install maven default-jdk
Releases are stored on GitHub. They just have to be unzipped and started with Java:
java -jar r43ples-*-with-dependencies.jar
Debian packages are going to be deployed soon.
Maven is used for compiling and running:
mvn exec:java
Releases can be built with:
mvn assembly:single
Debian packages can be built with:
mvn package:jdeb
There is a configuration file named resources/r43ples.conf. The most important settings are the following:
- database.directory - directory for Jena TDB database
- service.port - port under which R43ples provides its services
- service.uri - URI under which R43ples provides its services
- revision.graph - named graph which is used by R43ples to store revision graph information
- sdd.graph - named graph for storing the SDD
The logging configuration is stored in resources/log4j.properties
The SPARQL endpoint is available at:
[uri]:[port]/r43ples/sparql
The endpoint directly accepts SPARQL queries with HTTP GET parameters for query and format:
[uri]:[port]/r43ples/sparql?query=[]&format=[]
The result format can be specified as the URL parameter format, as the HTTP POST parameter format, or in the HTTP header Accept. Possible values are:
- text/turtle
- application/json
- application/rdf+xml
- text/html
- text/plain
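As a rough illustration, the GET request described above could be assembled in Python as follows. This is only a sketch: the helper name build_query_url and the localhost address are assumptions made for the example, while the query and format parameters come from the endpoint description.

```python
from urllib.parse import urlencode


def build_query_url(endpoint, query, result_format="application/json"):
    """Return the GET URL for a SPARQL or R43ples query.

    endpoint is the full endpoint address ([uri]:[port]/r43ples/sparql).
    """
    # urlencode percent-escapes the query text and the MIME type
    params = urlencode({"query": query, "format": result_format})
    return endpoint + "?" + params


if __name__ == "__main__":
    # Hypothetical local installation; adjust host and port to your setup
    url = build_query_url(
        "http://localhost:9998/r43ples/sparql",
        'SELECT * WHERE { GRAPH <test> REVISION "2" {?s ?p ?o} }',
        "text/turtle",
    )
    print(url)
```

The resulting URL can be opened with any HTTP client. Sending the format in the Accept header instead would only require dropping the format parameter.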
There are some additional keywords which can be used to control the revisions of graphs:
- Create graph
    CREATE GRAPH <graph>
- Select query
    SELECT * WHERE { GRAPH <graph> REVISION "23" {?s ?p ?o} }
- Update query
    USER "mgraube" MESSAGE "test commit" INSERT { GRAPH <test> REVISION "2" { <a> <b> <c> . } }
- Branching
    USER "mgraube" MESSAGE "test commit" BRANCH GRAPH <test> REVISION "2" TO "unstable"
- Tagging
    USER "mgraube" MESSAGE "test commit" TAG GRAPH <test> REVISION "2" TO "v0.3-alpha"
- Merging
    USER "mgraube" MESSAGE "merge example" MERGE GRAPH <test> BRANCH "branch-1" INTO "branch-2"
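When such queries are generated by a client program, small string helpers keep the USER and MESSAGE prefix consistent. The following sketch reproduces the update and branching examples above. The function names are hypothetical, and the backslash escaping of quotes follows ordinary SPARQL string literals, which is assumed (not confirmed) to apply to the R43ples keywords as well.

```python
def _quote(text):
    """Wrap text in double quotes, escaping backslashes and embedded quotes."""
    return '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'


def commit_header(user, message):
    """Build the USER ... MESSAGE ... prefix shared by all committing queries."""
    return "USER {} MESSAGE {}".format(_quote(user), _quote(message))


def insert_query(user, message, graph, revision, triples):
    """Build an update query that inserts triples on top of a revision."""
    return "{} INSERT {{ GRAPH <{}> REVISION {} {{ {} }} }}".format(
        commit_header(user, message), graph, _quote(revision), triples
    )


def branch_query(user, message, graph, revision, branch):
    """Build a query that creates a branch starting at a revision."""
    return "{} BRANCH GRAPH <{}> REVISION {} TO {}".format(
        commit_header(user, message), graph, _quote(revision), _quote(branch)
    )


if __name__ == "__main__":
    print(insert_query("mgraube", "test commit", "test", "2", "<a> <b> <c> ."))
    print(branch_query("mgraube", "test commit", "test", "2", "unstable"))
```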
R43ples has a new option that improves performance: the requested revision is no longer materialized in a temporary graph. Instead, the SPARQL query is rewritten so that the branch and the change sets are joined directly inside the query, taking the order of the change sets into account. This option is still under development and subject to further research.
The option can be enabled by:
OPTION r43ples:SPARQL_JOIN
It currently supports:
- Multiple Graphs
- Multiple TriplePath
- FILTER
- MINUS
For more details, have a look into the doc/ directory.
Without the SPARQL Join option, the algorithms are very simple:
For each named graph 'g' in a query, a temporary graph 'TempGraph_g_r' is generated for the specified revision 'r' according to this formula ('g_x' = fully materialized revision 'x' of graph 'g'):
TempGraph_g_r = g_nearestBranch + SUM[revision i= nearestBranch to r]( deleteSet_g_i - addSet_g_i )
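def select_query(query_string):
    # Build one temporary graph per (graph, revision) pair used in the query
    for (graph, revision) in query_string.get_named_graphs_and_revisions():
        tmp_graph = "tmp-" + graph + "-" + revision
        # Start from the full copy of the graph
        execQuery("COPY GRAPH <" + graph + "> TO GRAPH <" + tmp_graph + ">")
        # Walk back to the requested revision, undoing each change set
        for rev in graph.find_shortest_path_to_revision(revision):
            execQuery("REMOVE GRAPH " + rev.add_set_graph + " FROM GRAPH <" + tmp_graph + ">")
            execQuery("ADD GRAPH " + rev.delete_set_graph + " TO GRAPH <" + tmp_graph + ">")
        # Point the query at the temporary graph (keep the rewritten query)
        query_string = query_string.replace(graph, tmp_graph)
    result = execQuery(query_string)
    # Clean up all temporary graphs
    execQuery("DROP GRAPH <tmp-*>")
    return result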
def update_query(query_string):
    for (graph, revision) in query_string.get_named_graphs_and_revisions():
        newRevision = revision + 1
        execQuery("ADD GRAPH "+ rev.delete_set_graph+" TO GRAPH <tmp-"+graph+"-"+revision+">")
        ...
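The reconstruction of a temporary graph for a select query can also be tried out with plain Python sets, without a triple store. This is a minimal sketch, not the R43ples implementation: the function name materialize_revision is hypothetical, triples are modeled as tuples, and the path is assumed to contain the add and delete sets of every revision that lies between the branch head and the requested revision, newest first.

```python
def materialize_revision(branch_head, path):
    """Rebuild an older revision from the full content of a branch head.

    branch_head: set of triples in the fully materialized branch
    path: list of (add_set, delete_set) pairs, ordered from the branch
          head back towards the requested revision
    """
    graph = set(branch_head)          # corresponds to COPY GRAPH
    for add_set, delete_set in path:
        graph -= add_set              # undo what the revision added
        graph |= delete_set           # restore what the revision deleted
    return graph


if __name__ == "__main__":
    a, b, c, d = ("a", "p", "o"), ("b", "p", "o"), ("c", "p", "o"), ("d", "p", "o")
    # Revision 1 = {a, b}; revision 2 adds c and deletes a;
    # revision 3 (branch head) adds d and deletes b.
    head = {c, d}
    path_to_revision_1 = [({d}, {b}), ({c}, {a})]
    print(sorted(materialize_revision(head, path_to_revision_1)))
```

Copying the branch head first keeps the original set untouched, which mirrors the use of a temporary graph in the pseudocode.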
The following libraries are used in R43ples: