markusgraube/r43ples

R43ples

R43ples (Revision for triples) is an open source Revision Management Tool for the Semantic Web.

It provides different revisions of named graphs via a SPARQL interface. All information about revisions, changes, commits, branches and tags is stored in additional named graphs beside the original graph in an attached external triple store.


This project provides an enhanced SPARQL endpoint for revision management of named graphs. R43ples uses an internal Jena TDB as its triple store and acts as an endpoint both for normal SPARQL queries and for revision-enhanced SPARQL queries, called R43ples queries. For each named graph used inside a SPARQL query, the R43ples endpoint allows specifying the revision that should be queried. All revision information is stored in additional graphs in the attached Jena TDB.

The javadoc can be found at the website under http://plt-tud.github.io/r43ples/site/apidocs/.

A running test server should be available under http://eatld.et.tu-dresden.de:9998/r43ples/sparql

Dependencies

  • JDK 1.7
  • Maven

On Debian-based systems, both can be installed with:

sudo apt-get install maven default-jdk

Releases

Releases are published on GitHub. Unzip a release and start it with Java:

java -jar r43ples-*-with-dependencies.jar

Debian packages are going to be deployed soon.

Compiling

Maven is used for compiling:

mvn exec:java

Releases can be built with:

mvn assembly:single

Debian packages can be built with:

mvn package:jdeb

Configuration

R43ples is configured through the file resources/r43ples.conf. The most important settings are:

  • database.directory - directory for Jena TDB database
  • service.port - port under which R43ples provides its services
  • service.uri - URI under which R43ples provides its services
  • revision.graph - named graph which is used by R43ples to store revision graph information
  • sdd.graph - named graph for storing the SDD
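As an illustration of how these settings fit together, here is a minimal Python sketch. It assumes the file consists of simple key=value lines, which this README does not state, and every value shown is made up for the example rather than taken from the project's defaults. The helper name parse_conf is hypothetical.

```python
def parse_conf(text):
    """Parse key=value lines, skipping blank lines and '#' comments.

    Assumption: r43ples.conf uses this simple format.
    """
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings


# Illustrative values only; they are not the project's defaults.
example = """
database.directory = database/dataset
service.port = 9998
service.uri = http://localhost
revision.graph = http://example.org/r43ples-revisions
sdd.graph = http://example.org/r43ples-sdd
"""

settings = parse_conf(example)
# service.uri and service.port together form the endpoint address.
endpoint = settings["service.uri"] + ":" + settings["service.port"] + "/r43ples/sparql"
print(endpoint)
```

With these example values, the endpoint address follows the [uri]:[port]/r43ples/sparql pattern used by the SPARQL interface.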

The logging configuration is stored in resources/log4j.properties.

Interfaces

The SPARQL endpoint is available at:

[uri]:[port]/r43ples/sparql

The endpoint directly accepts SPARQL queries with HTTP GET parameters for query and format:

[uri]:[port]/r43ples/sparql?query=[]&format=[]
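A minimal sketch of how a client could build such a GET URL with Python's standard library. Only the parameter names query and format come from this README. The base URL is the test server mentioned above, and the helper name build_query_url is hypothetical.

```python
from urllib.parse import urlencode

# Assumption: the public test server mentioned in this README.
ENDPOINT = "http://eatld.et.tu-dresden.de:9998/r43ples/sparql"


def build_query_url(query, fmt="application/json", endpoint=ENDPOINT):
    """Return a GET URL carrying the query and format parameters."""
    # urlencode percent-encodes the SPARQL text so it is safe inside a URL.
    return endpoint + "?" + urlencode({"query": query, "format": fmt})


url = build_query_url('SELECT * WHERE { GRAPH <test> REVISION "2" {?s ?p ?o} }')
print(url)
```

The resulting URL can be opened with any HTTP client, for example urllib.request.urlopen(url).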

Supported Formats

The result format can be specified as the URL parameter format, as the HTTP POST parameter format, or via the HTTP header Accept. The following formats are supported:

  • text/turtle
  • application/json
  • application/rdf+xml
  • text/html
  • text/plain
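The header variant can be sketched as follows. The request is only built, not sent. Sending the query itself as a POST parameter named query is an assumption based on the GET parameters shown earlier, and the endpoint URL and the helper name build_post_request are illustrative.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumption: the public test server mentioned in this README.
ENDPOINT = "http://eatld.et.tu-dresden.de:9998/r43ples/sparql"


def build_post_request(query, accept="text/turtle", endpoint=ENDPOINT):
    """Build (but do not send) a POST request that selects the format via Accept."""
    # Assumption: the query is passed as a form-encoded POST parameter "query".
    body = urlencode({"query": query}).encode("utf-8")
    return Request(endpoint, data=body, headers={"Accept": accept})


request = build_post_request("SELECT * WHERE { GRAPH <test> {?s ?p ?o} }")
print(request.get_method(), request.get_header("Accept"))
```

To actually send the request, pass it to urllib.request.urlopen.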

R43ples keywords

There are some additional keywords which can be used to control the revisions of graphs:

  • Create graph

      CREATE GRAPH <graph>
    
  • Select query

      SELECT * 
      WHERE { 
      	GRAPH <graph> REVISION "23" {?s ?p ?o}
      }
    
  • Update query

      USER "mgraube" MESSAGE "test commit" 
      INSERT {
          GRAPH <test> REVISION "2" {
              <a> <b> <c> .
          }
      }
    
  • Branching

      USER "mgraube"
      MESSAGE "test commit"
      BRANCH GRAPH <test> REVISION "2" TO "unstable"
    
  • Tagging

      USER "mgraube"
      MESSAGE "test commit"
      TAG GRAPH <test> REVISION "2" TO "v0.3-alpha"
    
  • Merging

      USER "mgraube"
      MESSAGE "merge example"
      MERGE GRAPH <test> BRANCH "branch-1" INTO "branch-2"
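
The keyword examples above can be generated programmatically. The following Python sketch only assembles query strings that mirror those examples. The function names are hypothetical, and no escaping of quotes in user input is attempted.

```python
def select_revision(graph, revision):
    """SELECT all triples of a graph at a given revision."""
    return 'SELECT * WHERE { GRAPH <%s> REVISION "%s" {?s ?p ?o} }' % (graph, revision)


def commit_header(user, message):
    """USER/MESSAGE prefix shared by update, branch, tag and merge queries."""
    return 'USER "%s" MESSAGE "%s"' % (user, message)


def branch_query(user, message, graph, revision, name):
    """Create branch `name` starting at `revision` of `graph`."""
    return '%s BRANCH GRAPH <%s> REVISION "%s" TO "%s"' % (
        commit_header(user, message), graph, revision, name)


def tag_query(user, message, graph, revision, name):
    """Tag `revision` of `graph` as `name`."""
    return '%s TAG GRAPH <%s> REVISION "%s" TO "%s"' % (
        commit_header(user, message), graph, revision, name)


def merge_query(user, message, graph, source, target):
    """Merge branch `source` into branch `target` of `graph`."""
    return '%s MERGE GRAPH <%s> BRANCH "%s" INTO "%s"' % (
        commit_header(user, message), graph, source, target)


print(branch_query("mgraube", "test commit", "test", "2", "unstable"))
```

The returned strings can be sent to the endpoint like any other query.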
    

SPARQL Join option

R43ples offers an option that improves performance: the requested revision is no longer materialized in a temporary graph. Instead, the SPARQL query is rewritten so that the branch and the change sets are joined directly inside the query, taking the order of the change sets into account. The option is still under development and the subject of further research.

The option can be enabled by:

OPTION r43ples:SPARQL_JOIN

It currently supports:

  • Multiple Graphs
  • Multiple TriplePath
  • FILTER
  • MINUS

For more details, have a look into the doc/ directory.

Algorithm

Without the SPARQL Join option, the algorithms are very simple:

For each named graph 'g' in a query, a temporary graph 'TempGraph_g_r' is generated for the specified revision 'r' according to this formula ('g_x' = fully materialized revision 'x' of graph 'g'):

    TempGraph_g_r = g_nearestBranch + SUM[revision i = nearestBranch to r]( deleteSet_g_i - addSet_g_i )

Starting from the fully materialized graph of the nearest branch, the change sets are undone one by one until revision 'r' is reached: added triples are removed and deleted triples are restored.

def select_query(query_string):
    # Build one temporary graph per (graph, revision) pair used in the query.
    for (graph, revision) in query_string.get_named_graphs_and_revisions():
        tmp_graph = "tmp-" + graph + "-" + revision
        # Start from the full copy of the graph.
        execQuery("COPY GRAPH <" + graph + "> TO GRAPH <" + tmp_graph + ">")
        # Undo each change set on the path back to the requested revision.
        for rev in graph.find_shortest_path_to_revision(revision):
            execQuery("REMOVE GRAPH " + rev.add_set_graph + " FROM GRAPH <" + tmp_graph + ">")
            execQuery("ADD GRAPH " + rev.delete_set_graph + " TO GRAPH <" + tmp_graph + ">")
        # Point the query at the temporary graph instead of the original one.
        query_string = query_string.replace(graph, tmp_graph)
    result = execQuery(query_string)
    # Clean up all temporary graphs.
    execQuery("DROP GRAPH <tmp-*>")
    return result

def update_query(query_string):
    for (graph, revision) in query_string.get_named_graphs_and_revisions():
        newRevision = revision + 1
        execQuery("ADD GRAPH " + rev.delete_set_graph + " TO GRAPH <tmp-" + graph + "-" + revision + ">")
        ...
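
The reconstruction step from the formula and select_query above can be tried out in memory. The following Python sketch models graphs as sets of triples instead of named graphs in a triple store. The ChangeSet type and the materialize function are hypothetical, and it assumes that exactly the change sets of the revisions newer than the target are undone, newest first.

```python
from dataclasses import dataclass, field


@dataclass
class ChangeSet:
    """Triples added and deleted by one revision (hypothetical helper type)."""
    added: frozenset = field(default_factory=frozenset)
    deleted: frozenset = field(default_factory=frozenset)


def materialize(head, newer_change_sets):
    """Rebuild an older revision from the fully materialized branch head.

    head: set of triples at the branch head.
    newer_change_sets: change sets of all revisions newer than the target,
        ordered from newest to oldest (assumption of this sketch).
    """
    graph = set(head)
    for change in newer_change_sets:
        graph -= change.added    # undo what this revision added
        graph |= change.deleted  # restore what this revision deleted
    return graph


A = ("a", "p", "o")
B = ("b", "p", "o")
C = ("c", "p", "o")

# Revision 0 contains A; revision 1 adds B; revision 2 deletes A and adds C.
rev1 = ChangeSet(added=frozenset({B}))
rev2 = ChangeSet(added=frozenset({C}), deleted=frozenset({A}))
head = {B, C}

print(sorted(materialize(head, [rev2])))        # triples of revision 1
print(sorted(materialize(head, [rev2, rev1])))  # triples of revision 0
```

Copying the head first keeps the materialized branch untouched, which corresponds to the temporary graph in select_query.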

Used libraries and frameworks

Following libraries are used in R43ples:
