Code for importing ZWD into RNAcentral

Rfam/rfam-zwd-import

ZWD (Zasha Weinberg Database) is a repository containing RNA motif alignments produced by Dr Zasha Weinberg.

Many sequences in ZWD alignments are from environmental samples and cannot be included in Rfam seed alignments because they do not have stable identifiers and NCBI taxids.

To obtain stable identifiers and NCBI taxids for these RNAs, the ZWD sequences are first imported into RNAcentral using the RNAcentral JSON schema.

The generated zwd.json file is the input for the RNAcentral import.

Usage

# build Docker image
docker build -t zwd2rnacentral .

# run Docker container and mount the current directory inside the container
docker run -v "$(pwd)":/data/rnacentral -it zwd2rnacentral bash

# generate JSON file
cd /data/rnacentral && python zwd2rnacentral.py

# validate JSON file against RNAcentral schema
cd /data/rnacentral-data-schema && python2 validate.py /data/rnacentral/zwd.json
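
Before running the schema validation, it can help to confirm that zwd.json parses as JSON at all. The following is a minimal sketch of such a pre-check, not part of this repository: the function name summarize_json is hypothetical, and it makes no assumptions about the fields required by the RNAcentral schema. It uses only the standard library and is written to run under both Python 2 and Python 3.

```python
import io
import json
import sys


def summarize_json(path):
    """Parse a JSON file and return (top-level type name, number of top-level items).

    Hypothetical helper for illustration; it checks only that the file is
    well-formed JSON, not that it conforms to the RNAcentral schema.
    """
    with io.open(path, encoding="utf-8") as handle:
        document = json.load(handle)  # raises ValueError on malformed JSON
    size = len(document) if isinstance(document, (dict, list)) else 1
    return type(document).__name__, size


# Run the check only when a file path is given on the command line.
if __name__ == "__main__" and len(sys.argv) > 1:
    kind, size = summarize_json(sys.argv[1])
    print("{0}: top-level {1} with {2} item(s)".format(sys.argv[1], kind, size))
```

If saved as, say, check_json.py (an assumed file name), it could be run inside the container as python check_json.py /data/rnacentral/zwd.json. A parse error here means the file should be regenerated before attempting schema validation.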

The mapping between the Rfam 14.0 families and ZWD can be found in this Google Doc.