Verificado

Validate ontology relationships using Ubergraph as the source of truth. Relationships in this context may be subClassOf axioms between named classes (e.g. 'lymphocyte' subClassOf 'cell') or existential restrictions (e.g. 'enterocyte' part_of some 'intestinal epithelium').

Ubergraph is an RDF triplestore that merges 39 OBO ontologies and includes a precomputed OWL classification and materialised class relationships derived from existential property restrictions. Validation therefore works for any subClassOf relationship or existential restriction, whether directly asserted or inferred (indirect).
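
Because the relationships are materialised as plain triples, checking one of them comes down to asking whether a (subject, relationship, object) triple exists. The sketch below only illustrates that idea and is not verificado's actual implementation: the helper names `expand_curie` and `build_ask_query` are hypothetical, and the code assumes that every prefix other than `rdfs` follows the standard OBO PURL pattern. It builds a SPARQL ASK query string for the 'lymphocyte' subClassOf 'cell' example, using the Cell Ontology IDs CL:0000542 and CL:0000000, and does not send it anywhere.

```python
# Hypothetical helpers for illustration; not part of the verificado API.
PREFIXES = {
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
}
OBO_BASE = "http://purl.obolibrary.org/obo/"


def expand_curie(curie: str) -> str:
    """Expand a CURIE such as 'CL:0000542' or 'rdfs:subClassOf' to a full IRI."""
    prefix, local_id = curie.split(":", 1)
    if prefix in PREFIXES:
        return PREFIXES[prefix] + local_id
    # Assumption: any other prefix is an OBO ontology using the standard PURL pattern.
    return f"{OBO_BASE}{prefix}_{local_id}"


def build_ask_query(subject: str, relationship: str, obj: str) -> str:
    """Return a SPARQL ASK query that tests whether the triple exists."""
    s, p, o = (expand_curie(c) for c in (subject, relationship, obj))
    return f"ASK {{ <{s}> <{p}> <{o}> }}"


# 'lymphocyte' subClassOf 'cell'
query = build_ask_query("CL:0000542", "rdfs:subClassOf", "CL:0000000")
print(query)
```

A query of this shape could be sent to a SPARQL endpoint that serves Ubergraph. A `true` answer would mean the relationship is asserted or inferred there.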

Install

Dependencies

This package depends on Graphviz and OBOGraphviz to represent the validation as a graph.

Graphviz

On macOS:

brew install graphviz

On Linux:

apt install graphviz

For other platforms, please follow the Graphviz installation instructions.

OBOGraphviz

Before installing OBOGraphviz, make sure you have Node.js version >= 14.16 installed. Please follow the Node.js installation instructions to install Node and npm.

Then install the obographviz package globally:

npm install -g obographviz

verificado package

pip install verificado

Configure YAML file

The config file defines the list of relationships the validation should run on. The order of the relationships is significant.

The YAML file must contain the keys relationships and filename. See the example below:

relationships:
  sub_class_of: rdfs:subClassOf
  part_of: BFO:0000050
  connected_to: RO:0001025
  has_soma_location: RO:0002100
  ...

filename: path/to/filename.csv

The input file can be in TSV or CSV format. When using CSV, enclose a label in double quotes if it contains a comma. The following columns are preferred:

| s | slabel | user_slabel | o | olabel | user_olabel |
|---|---|---|---|---|---|
| the subject term ID | the label of the term in column s | optional label for the term given by the user | the object term ID | the label of the term in column o | optional label for the term given by the user |
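
As a minimal sketch of producing such a file, Python's standard `csv` module quotes any field that contains a comma by default, which satisfies the double-quoting requirement for CSV input. The row content is illustrative only: the user label "lymphocyte, generic" is a made-up value chosen to trigger quoting, and the output is written to an in-memory buffer rather than to a real file.

```python
import csv
import io

# Column names taken from the preferred input format.
COLUMNS = ["s", "slabel", "user_slabel", "o", "olabel", "user_olabel"]

# Illustrative row; the user-supplied subject label contains a comma on purpose.
rows = [
    {
        "s": "CL:0000542",
        "slabel": "lymphocyte",
        "user_slabel": "lymphocyte, generic",
        "o": "CL:0000000",
        "olabel": "cell",
        "user_olabel": "cell",
    },
]

# Write to an in-memory buffer; replace with open("...", "w", newline="") for a file.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=COLUMNS)
writer.writeheader()
writer.writerows(rows)

csv_text = buffer.getvalue()
print(csv_text)
```

For a TSV file, passing `delimiter="\t"` to `csv.DictWriter` produces tab-separated output, where commas in labels need no quoting.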

The package also accepts TSV or CSV files that represent a hierarchy. Such a file can have any number of levels, each defined by an ontology term ID and the label of that term. Please check the example in the tests directory.

Add to_be_parsed: true to the yaml file when using this type of file.

relationships:
  sub_class_of: rdfs:subClassOf
  part_of: BFO:0000050
  connected_to: RO:0001025
  has_soma_location: RO:0002100
  ...

filename: path/to/filename.csv
to_be_parsed: true
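
Before running the tool, it can be useful to sanity-check a config. The sketch below assumes the YAML has already been loaded into a Python dictionary by a YAML parser of your choice. The function `check_config` is a hypothetical helper, not part of verificado, and the checks are limited to what is described above: the two required keys and the optional boolean to_be_parsed.

```python
REQUIRED_KEYS = ("relationships", "filename")


def check_config(config: dict) -> list:
    """Return a list of problems found in an already-parsed config (empty if none)."""
    problems = []
    for key in REQUIRED_KEYS:
        if key not in config:
            problems.append(f"missing required key: {key}")
    relationships = config.get("relationships")
    if "relationships" in config and (not isinstance(relationships, dict) or not relationships):
        problems.append("relationships must be a non-empty mapping of name to property ID")
    # to_be_parsed is optional; when present it should be a boolean.
    if not isinstance(config.get("to_be_parsed", False), bool):
        problems.append("to_be_parsed must be true or false")
    return problems


# Python dictionaries keep insertion order, so the order of relationships is preserved.
config = {
    "relationships": {
        "sub_class_of": "rdfs:subClassOf",
        "part_of": "BFO:0000050",
        "connected_to": "RO:0001025",
        "has_soma_location": "RO:0002100",
    },
    "filename": "path/to/filename.csv",
    "to_be_parsed": True,
}

print(check_config(config))
```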

Run verificado CLI

verificado validate --input path/to/config.yaml --output path/to/output.csv

The output.csv file has the same format as filename.csv. It lists the cases where a triple (subject, relationship, object), using the relationships listed in the YAML file, was not found in Ubergraph.
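
To call the validation from a script, the command can be wrapped with Python's standard `subprocess` module. This is a minimal sketch: `build_validate_command` and `run_validation` are hypothetical helper names, only the `--input` and `--output` options shown above are used, and no meaning is assumed for the exit code beyond what the operating system reports.

```python
import subprocess


def build_validate_command(config_path: str, output_path: str) -> list:
    """Assemble the argument list for the documented 'verificado validate' command."""
    return ["verificado", "validate", "--input", config_path, "--output", output_path]


def run_validation(config_path: str, output_path: str) -> int:
    """Run the CLI and return its exit code (requires verificado to be installed)."""
    completed = subprocess.run(build_validate_command(config_path, output_path), check=False)
    return completed.returncode


command = build_validate_command("path/to/config.yaml", "path/to/output.csv")
print(" ".join(command))
```

Passing the arguments as a list avoids shell quoting problems when a path contains spaces.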

List of ontologies available

To find out which ontologies, and which versions of them, are available in Ubergraph, run the following command:

verificado ontologies_version --output filename.json