Loading Refseq and GFF into Chado #116

Open
akmg6 opened this issue Sep 25, 2020 · 3 comments
Comments

@akmg6

akmg6 commented Sep 25, 2020

Hi,
Having assembled several genomes, I now need to set up dedicated visualization tools for them.
I installed a Chado database, which is accessible through a dedicated user that I named Chado. I can connect to this database and view the tables and their contents.
When installing Chado, I created my organism, which is now present in the "organism" table.
Now I want to load the sequence of my genome (a FASTA file with the chromosomes) as well as the associated gene annotation (a GFF3 file).
I'm running into problems with both the reference sequence and the GFF3. Since I'm not yet familiar with how Chado works, I don't understand the errors I'm getting.

What I did with my FASTA file:

  1. I converted the FASTA file containing the chromosomes into GenBank format:
    from Bio import SeqIO
    SeqIO.convert("sequences.fa", "fasta", "sequences.genbank", "genbank", molecule_type="DNA")
  2. I converted the GenBank file to GFF3:
    bp_genbank2gff3.pl sequences.genbank
  3. I tried gmod_bulk_load_gff3.pl and got errors from the Adapter.pm file, so I downloaded the recommended patch:
    https://raw.githubusercontent.com/GMOD/Chado/fix_shift_if_defined/chado/lib/Bio/GMOD/DB/Adapter.pm
  4. I launched the load:
    gmod_bulk_load_gff3.pl --gfffile sequences.genbank.gff --organism "organism" --dbname chado

This gave the following error: MSG: no cvterm for region

For the gene annotation file (GFF3), I followed the instructions and got the error: MSG: no cvterm for gene

So I went back to the steps in the installation file that I had initially skipped:

wget http://purl.obolibrary.org/obo/ro/subsets/ro-chado.obo
go2fmt.pl -p obo_text -w xml ro-chado.obo | go-apply-xslt oboxml_to_chadoxml - > obo_text.xml
stag-storenode.pl -d 'dbi:Pg:dbname=chado;host=localhost;port=5432' --user chado --password *** obo_text.xml

And I got errors from stag-storenode.pl:

Redundant argument in sprintf at /cm/shared/apps/Perl_conda/lib/perl5/site_perl/5.22.0/DBIx/DBStag.pm line 3273.
[the warning above was printed 47 times in total]
DBD::Pg::st execute failed: ERROR:  duplicate key value violates unique constraint "cvterm_c1" [for Statement "INSERT INTO cvterm (is_relationshiptype, name, cv_id, dbxref_id, definition) VALUES (?, ?, ?, ?, ?)" with ParamValues: 1='1', 2='precedes', 3='13', 4='2199', 5=''] at /cm/shared/apps/Perl_conda/lib/perl5/site_perl/5.22.0/DBIx/DBStag.pm line 3322.
DBD::Pg::st execute failed: ERROR:  duplicate key value violates unique constraint "cvterm_c1" [for Statement "INSERT INTO cvterm (is_relationshiptype, name, cv_id, dbxref_id, definition) VALUES (?, ?, ?, ?, ?)" with ParamValues: 1='1', 2='precedes', 3='13', 4='2199', 5=''] at /cm/shared/apps/Perl_conda/lib/perl5/site_perl/5.22.0/DBIx/DBStag.pm line 3322.
 at /cm/shared/apps/Perl_conda/lib/perl5/site_perl/5.22.0/DBIx/DBStag.pm line 3332.
        DBIx::DBStag::insertrow(DBIx::DBStag=HASH(0x7db0b0), "cvterm", HASH(0x1374818), "cvterm_id") called at /cm/shared/apps/Perl_conda/lib/perl5/site_perl/5.22.0/DBIx/DBStag.pm line 1928
        DBIx::DBStag::_storenode(DBIx::DBStag=HASH(0x7db0b0), Data::Stag::StagImpl=ARRAY(0x106b630)) called at /cm/shared/apps/Perl_conda/lib/perl5/site_perl/5.22.0/DBIx/DBStag.pm line 1180
        DBIx::DBStag::storenode(DBIx::DBStag=HASH(0x7db0b0), Data::Stag::StagImpl=ARRAY(0x1ac0270)) called at /cm/shared/apps/Perl_conda/bin/stag-storenode.pl line 85
        eval {...} called at /cm/shared/apps/Perl_conda/bin/stag-storenode.pl line 84
        main::store(Data::Stag::BaseHandler=HASH(0x15d7bf8), Data::Stag::StagImpl=ARRAY(0x1ac0270)) called at /cm/shared/apps/Perl_conda/bin/stag-storenode.pl line 134
        main::__ANON__(Data::Stag::BaseHandler=HASH(0x15d7bf8), Data::Stag::StagImpl=ARRAY(0x1ac0270)) called at /cm/shared/apps/Perl_conda/lib/perl5/site_perl/5.22.0/Data/Stag/BaseHandler.pm line 601
        Data::Stag::BaseHandler::end_event(Data::Stag::BaseHandler=HASH(0x15d7bf8), "cvterm") called at /cm/shared/apps/Perl_conda/lib/perl5/site_perl/5.22.0/Data/Stag/BaseHandler.pm line 742
        Data::Stag::BaseHandler::end_element(Data::Stag::BaseHandler=HASH(0x15d7bf8), HASH(0x1aba5b0)) called at /cm/shared/apps/Perl_conda/lib/perl5/site_perl/5.22.0/XML/Parser/PerlSAX.pm line 239
        XML::Parser::PerlSAX::_handle_end(XML::Parser::PerlSAX=HASH(0x15a7938), XML::Parser::Expat=HASH(0x15d3b58), "cvterm") called at /cm/shared/apps/Perl_conda/lib/perl5/site_perl/5.22.0/XML/Parser/PerlSAX.pm line 79
        XML::Parser::PerlSAX::__ANON__(XML::Parser::Expat=HASH(0x15d3b58), "cvterm") called at /cm/shared/apps/Perl_conda/lib/perl5/site_perl/5.22.0//x86_64-linux-thread-multi/XML/Parser/Expat.pm line 474
        XML::Parser::Expat::parse(XML::Parser::Expat=HASH(0x15d3b58), FileHandle=GLOB(0x1825338)) called at /cm/shared/apps/Perl_conda/lib/perl5/site_perl/5.22.0//x86_64-linux-thread-multi/XML/Parser.pm line 187
        eval {...} called at /cm/shared/apps/Perl_conda/lib/perl5/site_perl/5.22.0//x86_64-linux-thread-multi/XML/Parser.pm line 186
        XML::Parser::parse(XML::Parser=HASH(0x15a7d58), FileHandle=GLOB(0x1825338)) called at /cm/shared/apps/Perl_conda/lib/perl5/site_perl/5.22.0/XML/Parser/PerlSAX.pm line 147
        XML::Parser::PerlSAX::parse(XML::Parser::PerlSAX=HASH(0x15a7938), "Handler", Data::Stag::BaseHandler=HASH(0x15d7bf8), "Source", HASH(0x15a79e0)) called at /cm/shared/apps/Perl_conda/lib/perl5/site_perl/5.22.0/Data/Stag/XMLParser.pm line 69
        Data::Stag::XMLParser::parse_fh(Data::Stag::XMLParser=HASH(0x15a7188), FileHandle=GLOB(0x1825338)) called at /cm/shared/apps/Perl_conda/lib/perl5/site_perl/5.22.0/Data/Stag/BaseGenerator.pm line 476
        Data::Stag::BaseGenerator::parse(Data::Stag::XMLParser=HASH(0x15a7188), "-file", "obo_text.xml", "-str", undef, "-fh", undef) called at /cm/shared/apps/Perl_conda/lib/perl5/site_perl/5.22.0/Data/Stag/XMLParser.pm line 58
        Data::Stag::XMLParser::parse(Data::Stag::XMLParser=HASH(0x15a7188), "-file", "obo_text.xml", "-str", undef, "-fh", undef) called at /cm/shared/apps/Perl_conda/lib/perl5/site_perl/5.22.0/Data/Stag/StagImpl.pm line 275
        Data::Stag::StagImpl::parse("Data::Stag", "-format", undef, "-file", "obo_text.xml", "-handler", Data::Stag::BaseHandler=HASH(0x15d7bf8)) called at /cm/shared/apps/Perl_conda/lib/perl5/site_perl/5.22.0/Data/Stag.pm line 181
        Data::Stag::AUTOLOAD("Data::Stag", "-format", undef, "-file", "obo_text.xml", "-handler", Data::Stag::BaseHandler=HASH(0x15d7bf8)) called at /cm/shared/apps/Perl_conda/bin/stag-storenode.pl line 140

Could someone explain what I'm missing? What do these errors mean?

Thank you very much.
Best
A.

@laceysanderson
Contributor

laceysanderson commented Sep 25, 2020

Hi @akmg6,

I'm not personally familiar with the Chado command-line tools, as I use the built-in Tripal Importers. However, based on your error messages, I would guess that the chado.cvterm table does not include the terms used in the third column (the feature type) of your GFF3.
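
One way to check this guess is to list the feature types that actually occur in the GFF3 file and compare them with the names stored in cvterm. The following is only a minimal sketch: the function name `gff3_feature_types` and the example lines are made up for illustration, and it assumes a standard tab-separated, nine-column GFF3 file.

```python
from collections import Counter
from typing import Iterable


def gff3_feature_types(lines: Iterable[str]) -> Counter:
    """Count the values found in column 3 (feature type) of GFF3 lines."""
    counts: Counter = Counter()
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("##FASTA"):
            break  # embedded sequence section: no feature lines after this
        if not line or line.startswith("#"):
            continue  # skip blank lines, directives and comments
        columns = line.split("\t")
        if len(columns) < 9:
            continue  # not a well-formed nine-column feature line
        counts[columns[2]] += 1
    return counts


# Hypothetical example lines, not taken from the real file.
example = [
    "##gff-version 3",
    "chr1\t.\tregion\t1\t1000\t.\t.\t.\tID=chr1",
    "chr1\t.\tgene\t10\t200\t.\t+\t.\tID=gene1",
    "chr1\t.\tmRNA\t10\t200\t.\t+\t.\tID=mrna1;Parent=gene1",
]
types = gff3_feature_types(example)
for feature_type in sorted(types):
    print(feature_type, types[feature_type])
```

To run it on a real file, pass an open file handle, for example `gff3_feature_types(open("sequences.genbank.gff"))`. Each type it reports (here "region", "gene" and "mRNA") would then need a row with the same name in the cvterm table before the loader can resolve it.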

Have you heard of Tripal? It's a great solution for creating a website associated with your chado database (see http://tripal.info/ and https://tripal.readthedocs.io). Regarding the Tripal Importers, there is a great tutorial for loading fasta and gff3 here: https://tripal.readthedocs.io/en/latest/user_guide/example_genomics/genomes_genes.html.

@dreyes17

Hello @akmg6 ,

You can check the cvterm constraints here: https://laceysanderson.github.io/chado-docs/tables/cvterm.html. That error means that the cvterm you are trying to import from the Chado XML already exists in the database. What I do to solve this problem is delete the existing cvterms from the database and then run stag-storenode.pl again. If your database is empty, this won't corrupt it, since the cvterm being inserted is the same as the one it replaces. The only thing you have to take care of is updating all the references to the cvterm you are deleting so that they point to the newly inserted one.
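
Before deleting anything, it may help to see exactly which term collided. The sketch below pulls the term name and cv_id out of the DBD::Pg error line shown in the original post. The function name `conflicting_cvterm` and the `LOOKUP_SQL` string are hypothetical helpers, the column order is copied from the INSERT statement in that error, and the `%s` placeholders assume a psycopg2-style driver.

```python
import re
from typing import Optional, Tuple

# Column order copied from the INSERT statement in the error message.
COLUMNS = ("is_relationshiptype", "name", "cv_id", "dbxref_id", "definition")

# Assumed lookup query (psycopg2-style placeholders); not run by this sketch.
LOOKUP_SQL = (
    "SELECT cvterm_id, name, cv_id, is_obsolete "
    "FROM cvterm WHERE name = %s AND cv_id = %s"
)


def conflicting_cvterm(error_line: str) -> Optional[Tuple[str, int]]:
    """Return (name, cv_id) of the rejected cvterm insert, or None."""
    if 'unique constraint "cvterm_c1"' not in error_line:
        return None
    # Only look at the part after "ParamValues:", e.g. 1='1', 2='precedes', ...
    _, found, tail = error_line.partition("ParamValues:")
    if not found:
        return None
    params = dict(re.findall(r"(\d+)='([^']*)'", tail))
    values = {col: params.get(str(i)) for i, col in enumerate(COLUMNS, start=1)}
    if values["name"] is None or not values["cv_id"]:
        return None
    return values["name"], int(values["cv_id"])


error = (
    'DBD::Pg::st execute failed: ERROR:  duplicate key value violates unique '
    'constraint "cvterm_c1" [for Statement "INSERT INTO cvterm '
    '(is_relationshiptype, name, cv_id, dbxref_id, definition) VALUES '
    "(?, ?, ?, ?, ?)\" with ParamValues: 1='1', 2='precedes', 3='13', "
    "4='2199', 5='']"
)
result = conflicting_cvterm(error)
print(result)
```

For the error in this thread, the result identifies the term "precedes" in the cv with cv_id 13. Running `LOOKUP_SQL` with those two values should show the row that is already present, which is the one to inspect before deciding whether to delete it and re-run stag-storenode.pl.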

@scottcain
Member

@akmg6 Did either of these suggestions help you?
