This repository has been archived by the owner on Jan 24, 2019. It is now read-only.

Project that generates BioSample XML submissions from CEDAR BioSample template instances


metadatacenter-attic/biosample-exporter-deprecated


CEDAR 2 BioSample Converter

This is an experimental CEDAR project to generate BioSample submissions.

This converter takes CEDAR BioSample submission instances and converts them into BioSample XML-based submissions.

The ./src/main/resources/json-schema/ directory contains a CEDAR BioSample template called NCBIBioSampleSubmissionTemplate.json. This template was generated using the CEDAR Template Designer.

The ./src/main/resources/xsd/ directory contains an XML Schema document describing a BioSample submission. It is called BioSampleSubmission.xsd. Two sub-schemas are defined in the files SP.common.xsd and biosample.xsd. The schema files were downloaded from the NCBI site.

The CEDAR Metadata Editor can use the CEDAR submission template to generate a CEDAR instance of a BioSample submission. The ./src/main/resources/json/ directory contains an example instance created using this template. It is called NCBIBioSampleSubmissionInstance1.json. Other instances can be generated using the CEDAR Metadata Editor.

This converter takes these CEDAR BioSample submission instances and generates XML documents conforming to the BioSample submission XML Schema.

These XML documents can then be validated using the NCBI BioSample validator.

Documentation for this service can be found here.

The following is an example curl command to submit XML to this validator:

curl -X POST -d @<Submission XML> https://www.ncbi.nlm.nih.gov/projects/biosample/validate/
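For scripted use, the same POST can be sketched in Python with only the standard library. This is a minimal sketch, not part of this project: the function names `build_validation_request` and `validate_submission` are hypothetical, and it assumes the validator accepts the raw XML as the request body exactly as in the curl command above. One difference to be aware of: `curl -d @file` strips newlines from the file, whereas this sketch sends the bytes unchanged.

```python
import urllib.request

# Validator endpoint taken from the curl command above.
VALIDATOR_URL = "https://www.ncbi.nlm.nih.gov/projects/biosample/validate/"


def build_validation_request(xml_bytes, url=VALIDATOR_URL):
    """Build (but do not send) a POST request carrying the submission XML.

    No Content-Type header is set here, so urllib falls back to
    application/x-www-form-urlencoded when sending, which is also what
    `curl -d` uses by default. Whether the service requires a particular
    content type is an assumption that has not been verified.
    """
    return urllib.request.Request(url, data=xml_bytes, method="POST")


def validate_submission(xml_path, timeout=30.0):
    """Read a submission file, POST it, and return the response XML as text."""
    with open(xml_path, "rb") as handle:
        request = build_validation_request(handle.read())
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Calling `validate_submission` with the path to one of the example submission files should return the validator's XML response as a string, provided the service is reachable.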

Some example submissions can be found in the ./examples directory.

Each submission requires a BioSample project identifier. Our identifier for testing is PRJNA212117.

An XML document is returned from the validator with the validation status. The schema for this XML response can be found here.

Note that, as per the instructions, we have to change the root node from SubmissionStatus to BioSampleValidate. Also, typeFile has to be renamed to typeFile2 to avoid a name collision with an element of the same name in the BioSampleSubmission XML Schema.

A success could look as follows:

<?xml version="1.0" encoding="UTF-8"?>
<BioSampleValidate>
  <Action status="processed-ok" action_id="SUB123456-1" target_db="BioSample">
    <Response status="processed-ok">
      <Message error_code="34" severity="warning" error_source="data">Submission processing may be delayed due to necessary curator review. Please check spelling of organism, current information generated the following error message and will require a taxonomy consult: Organism not found, value 'Midi-chlorian'.</Message>
      <Object target_db="BioSample" object_id="" spuid="MIDI_ISO_9154" spuid_namespace="JEDI-MIDI"/>
    </Response>
  </Action>
</BioSampleValidate>

A failure could look as follows:

<?xml version="1.0" encoding="UTF-8"?>
<BioSampleValidate>
  <Action status="processed-error" action_id="SUB123456-1" target_db="BioSample">
    <Response status="processed-error">
      <Message error_code="62" severity="error-stop" error_source="data">Invalid BioProject accession: PRJNA212117XXXXX. Please provide a valid BioProject accession with format PRJxxxxx.</Message>
      <Object target_db="BioSample" object_id="" spuid="MIDI_ISO_9154" spuid_namespace="JEDI-MIDI"/>
    </Response>
    <Response status="processed-ok">
      <Message error_code="34" severity="warning" error_source="data">Submission processing may be delayed due to necessary curator review. Please check spelling of organism, current information generated the following error message and will require a taxonomy consult: Organism not found, value 'Midi-chlorian'.</Message>
      <Object target_db="BioSample" object_id="" spuid="MIDI_ISO_9154" spuid_namespace="JEDI-MIDI"/>
    </Response>
  </Action>
</BioSampleValidate>
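The two responses above can be told apart programmatically by inspecting the status attribute of each Action element and collecting the Message elements. The following is a minimal sketch using Python's standard library. The function name `summarize_validation` is hypothetical, and treating "processed-ok" as the only success status is an assumption based solely on the two examples shown here, not on the response schema.

```python
import xml.etree.ElementTree as ET


def summarize_validation(response_xml):
    """Summarize a BioSampleValidate response.

    Returns a dict with:
      ok       -- True only if at least one Action exists and every Action
                  has status "processed-ok" (assumption drawn from the
                  example responses, not from the schema)
      messages -- one dict per Message element, in document order
    """
    # Parse from bytes so the XML declaration's encoding is honoured.
    data = response_xml.encode("utf-8") if isinstance(response_xml, str) else response_xml
    root = ET.fromstring(data)

    # The example responses use no XML namespace, so plain tag names suffice.
    actions = root.findall("Action")
    messages = [
        {
            "severity": message.get("severity"),
            "error_code": message.get("error_code"),
            "text": (message.text or "").strip(),
        }
        for message in root.iter("Message")
    ]
    ok = bool(actions) and all(a.get("status") == "processed-ok" for a in actions)
    return {"ok": ok, "messages": messages}
```

Note that a response can be reported as successful and still carry warnings, as in the first example, so callers may want to print the collected messages even when `ok` is true.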

Packages

Information on the overall submission process can be found on the NCBI Submission page and also here and here. If needed, it is possible to log on to the system with Stanford institutional access.

A description of current BioSample attributes can be found here. BioSample defines a set of packages that define attribute groups for certain domains. These are described here.

This converter does not have inbuilt support for packages. Similarly, the CEDAR BioSample submission template does not support them. More work remains to correctly deal with packages.

Human.1.0 Package

A definition of the Human.1.0 package can be found here.

A detailed description of the attributes in this package can be found here.

A description of a submission using the Human.1.0 package can be found here.

An example XML submission using the Human.1.0 package can be found in the ./examples/ subdirectory.

Building and Running

To build this library you must have the following items installed:

Get a copy of the latest code:

git clone https://github.com/metadatacenter/biosample-exporter.git

Change into the biosample-exporter directory:

cd biosample-exporter 

Then build it with Maven:

mvn clean install

To run:

mvn exec:java
