Issue with generated models #76

Closed
TBerg42 opened this issue Apr 29, 2021 · 5 comments

TBerg42 commented Apr 29, 2021

Hello, each time I try to reconstruct a draft model or run gap-filling, I get the following warning at the end:


Warning message:
In is.na(mod_notes) :
is.na() applied to non-(list or vector) of type 'NULL'


I have tried redoing the examples included for the tool such as for myb71.fna.gz as well as the Eubacterium rectale ATCC 33656 in the "cross feeding" tutorial. For both of them I get the warning above. When I then try to open the generated .xml files with COBRA I get the following errors:


Output argument "qualifiers" (and maybe others) not assigned during call to "getDataBases".

Error in parseCVTerms (line 27)
[databases,identifiers,relations] = cellfun(@getDataBases, {CVTerms.resources},
{CVTerms.qualifier},'UniformOutput',0);

Error in readSBML (line 121)
[databases,identifiers,qualifiers] = cellfun(@parseCVTerms, cvterms,'UniformOutput',0);

Error in readCbModel (line 211)
model = readSBML(fileName,defaultBound);


There did not seem to be any issues with the installation. When I ran the test there were no reported problems (see below).
Would you have any idea what could be causing the issue? Is it something to do with SBML?


Test:
gapseq version: 1.1 5f5a3e9
linux-gnu
#60~18.04.1-Ubuntu SMP Fri Nov 6 17:25:16 UTC 2020

#######################
#Checking dependencies#
#######################
awk: not an option: --version

sed (GNU sed) 4.4
grep (GNU grep) 3.1
This is perl 5, version 26, subversion 1 (v5.26.1) built for x86_64-linux-gnu-thread-multi
tblastn: 2.6.0+
exonerate from exonerate version 2.4.0
bedtools v2.26.0
Synopsis:
barrnap 0.8 - rapid ribosomal RNA prediction
Author:
Torsten Seemann [email protected]
Usage:
barrnap [options] <chromosomes.fasta>
Options:
--help This help
--version Print version and exit
--citation Print citation for referencing barrnap
--kingdom [X] Kingdom: bac euk mito arc (default 'bac')
--quiet No screen output (default OFF)
--threads [N] Number of threads/cores/CPUs to use (default '8')
--lencutoff [n.n] Proportional length threshold to label as partial (default '0.8')
--reject [n.n] Proportional length threshold to reject prediction (default '0.5')
--evalue [n.n] Similarity e-value cut-off (default '1e-06')
--incseq Include FASTA input sequences in GFF3 output (default OFF)

R version 3.4.4 (2018-03-15) -- "Someone to Lean On"
R scripting front-end version 3.4.4 (2018-03-15)
git version 2.17.1
GNU parallel 20161222

Missing dependencies: 0

#####################
#Checking R packages#
#####################
data.table 1.14.0
stringr 1.4.0
sybil 2.1.5
getopt 1.20.3
doParallel 1.0.16
foreach 1.5.1
R.utils 2.10.1
stringi 1.5.3
glpkAPI 1.3.2
BiocManager 1.30.12
Biostrings 2.46.0
jsonlite 1.7.2
CHNOSZ 1.4.1

Missing R packages: 0

##############################
#Checking basic functionality#
##############################
Optimization test: OK
Blast test: OK

Passed tests: 2/2

@Waschina
Collaborator

Hi @bergthorT,

it looks like an issue related to the installed versions of the sybilSBML R package and/or libsbml. Could you check which versions you have installed? Also, your R version (3.4.4) might be a little old; at least, we have never tested gapseq with R versions older than 3.6.3, and we would recommend updating to 4.0.0 or later. In addition, awk --version does not seem to work, which suggests there may be a separate issue with the installed awk.

Alternatively, you could try installing gapseq in a conda environment, as indicated in the documentation.
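The version advice above can be checked mechanically against the pasted test output. The following is a minimal Python sketch, not part of gapseq: the function names are hypothetical, and the only facts taken from this thread are the "R version X.Y.Z" line format and the 3.6.3 minimum mentioned above.

```python
import re

# Oldest R version the maintainers report having tested (taken from the comment above).
MIN_TESTED_R = (3, 6, 3)


def r_version_from_test_output(text):
    """Return the R version as a tuple of ints, or None if no 'R version' line is found."""
    # Anchor at the line start so 'R scripting front-end version ...' is not matched.
    match = re.search(r"^R version (\d+)\.(\d+)\.(\d+)", text, flags=re.MULTILINE)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


def is_tested_r_version(text, minimum=MIN_TESTED_R):
    """True if the reported R version is at least the oldest tested version."""
    version = r_version_from_test_output(text)
    return version is not None and version >= minimum
```

Passing the test output from the first comment, which reports R 3.4.4, to is_tested_r_version would flag that installation as older than anything the maintainers have tested.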

Best, Silvio

Author

TBerg42 commented May 2, 2021

Dear @Waschina,

Thank you for your prompt reply. I tried using the conda environment instead, as suggested. I now have a more up-to-date R version, and this also seems to have removed the awk --version error. I did not notice any issues when running the test this time:


gapseq version: 1.1 b568a4e
linux-gnu
#60~18.04.1-Ubuntu SMP Fri Nov 6 17:25:16 UTC 2020

#######################
#Checking dependencies#
#######################
GNU Awk 5.1.0, API: 3.0
sed (GNU sed) 4.8
grep (GNU grep) 3.4
This is perl 5, version 32, subversion 0 (v5.32.0) built for x86_64-linux-thread-multi
tblastn: 2.5.0+
exonerate from exonerate version 2.4.0
bedtools v2.30.0
barrnap 0.9 - rapid ribosomal RNA prediction
R version 4.0.3 (2020-10-10) -- "Bunny-Wunnies Freak Out"
R scripting front-end version 4.0.3 (2020-10-10)
git version 2.30.2
GNU parallel 20210422

Missing dependencies: 0

#####################
#Checking R packages#
#####################
data.table 1.14.0
stringr 1.4.0
sybil 2.1.5
getopt 1.20.3
doParallel 1.0.16
foreach 1.5.1
R.utils 2.10.1
stringi 1.5.3
glpkAPI 1.3.2
BiocManager 1.30.12
Biostrings 2.58.0
jsonlite 1.7.2
CHNOSZ 1.4.1

Missing R packages: 0

##############################
#Checking basic functionality#
##############################
Optimization test: OK
Blast test: OK

Passed tests: 2/2


I also no longer get the warning messages I got previously when I run gapseq. However, when I try to use the generated .xml models with COBRA, I still get the following error:


Output argument "qualifiers" (and maybe others) not assigned during call to "getDataBases".

Error in parseCVTerms (line 27)
[databases,identifiers,relations] = cellfun(@getDataBases, {CVTerms.resources},
{CVTerms.qualifier},'UniformOutput',0);

Error in readSBML (line 121)
[databases,identifiers,qualifiers] = cellfun(@parseCVTerms, cvterms,'UniformOutput',0);

Error in readCbModel (line 211)
model = readSBML(fileName,defaultBound);


I even tried it on the ecoli and "Toy" models that were included as examples, and both produce the same error when using COBRA in Matlab. This error does not occur when I use .xml models from the literature. Am I wrong in assuming that the generated models should be in valid SBML format and ready to use for FBA? Or is this perhaps an issue with the COBRA function?

Collaborator

Waschina commented May 2, 2021

Hi!
Good to see that some of the errors and warnings are resolved with the new installation.

It looks like there is some 'flavour' in gapseq's SBML files that the COBRA Toolbox in MATLAB does not like, presumably in some of the metabolite, reaction, or gene annotations... Unfortunately, I don't use MATLAB and have no easy way to check what the issue might be.

To be sure, I loaded the XML files from gapseq's toy directory in cobrapy, which worked without problems.

Maybe you could ask the cobratoolbox community what the problem might be? If the way gapseq annotates the models turns out to cause the issue in cobratoolbox, we can try to adjust our model output format.
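Since the suspicion above falls on the annotations, one way to narrow it down without MATLAB is to list the annotation resource URIs in a generated file. The following is a minimal sketch using only the Python standard library. This thread does not establish which annotation triggers the error, so the identifiers.org pattern below is an assumption chosen for illustration, and the function name is hypothetical.

```python
import re
import xml.etree.ElementTree as ET
from collections import Counter

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Assumed shape of a well-formed resource (illustrative only):
# http(s)://identifiers.org/<database>/<id> or http(s)://identifiers.org/<prefix>:<id>
IDENTIFIERS_ORG = re.compile(r"^https?://identifiers\.org/[^/\s]+[/:]\S+$")


def scan_annotation_resources(sbml_path):
    """Count annotation resources per qualifier and collect those that do not
    match the assumed identifiers.org pattern.

    Returns (counts, suspicious), where counts is a Counter keyed by qualifier
    name (e.g. 'is') and suspicious is a list of (qualifier, resource) pairs.
    """
    tree = ET.parse(sbml_path)
    counts = Counter()
    suspicious = []
    for element in tree.iter():
        # A qualifier element (e.g. bqbiol:is) holds an rdf:Bag of rdf:li entries.
        bag = element.find(f"{{{RDF_NS}}}Bag")
        if bag is None:
            continue
        qualifier = element.tag.split("}")[-1]
        for item in bag.findall(f"{{{RDF_NS}}}li"):
            resource = item.get(f"{{{RDF_NS}}}resource", "")
            counts[qualifier] += 1
            if not IDENTIFIERS_ORG.match(resource):
                suspicious.append((qualifier, resource))
    return counts, suspicious
```

Any entries returned in the suspicious list would be reasonable candidates to include when asking the cobratoolbox community, although an empty list would not rule out an annotation problem.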

Best
Silvio

Author

TBerg42 commented May 3, 2021

Hi,
This issue was flagged, and a pull request to address it was opened (see here).
Implementing those changes makes the generated models work with the Matlab COBRA toolbox.
Thank you again so much for your quick help.

TBerg42 closed this as completed May 3, 2021

Collaborator

Waschina commented May 3, 2021

Great! I'm happy to see that the models can now also be used in MATLAB :)
Thank you for the feedback!
