IndexError: index 0 is out of bounds for axis 0 with size 0 #53

Open
sheikki opened this issue Feb 3, 2022 · 2 comments

Comments

@sheikki

sheikki commented Feb 3, 2022

How to reproduce:

wget https://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/007/463/835/GCA_007463835.1_PDT000542330.1/GCA_007463835.1_PDT000542330.1_genomic.fna.gz
gunzip GCA_007463835.1_PDT000542330.1_genomic.fna.gz
sistr -m --qc -f tab -t 2 -i GCA_007463835.1_PDT000542330.1_genomic.fna GCA_007463835.1_PDT000542330.1 -o GCA_007463835.1_PDT000542330.1
Traceback (most recent call last):
  File "/usr/local/bin/sistr", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python2.7/dist-packages/sistr/sistr_cmd.py", line 410, in main
    outputs = [x.get() for x in res]
  File "/usr/lib/python2.7/multiprocessing/pool.py", line 572, in get
    raise self._value
IndexError: index 0 is out of bounds for axis 0 with size 0

As far as I can tell, there is nothing unusual about this putative Salmonella genome assembly.
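
The traceback above ends in multiprocessing's pool.py because the result's get() call re-raises, in the parent process, whatever exception the worker hit. The sketch below illustrates that behaviour and one way to record a failure per input instead of aborting the whole run. It is only an illustration under stated assumptions: predict() and run_all() are hypothetical names, not SISTR functions, the code targets Python 3 rather than the Python 2.7 shown in the traceback, and a thread pool stands in for the process pool to keep the example self-contained.

```python
from multiprocessing.pool import ThreadPool


def predict(genome_name):
    # Hypothetical stand-in for a per-genome prediction that fails on some inputs.
    if genome_name == "bad":
        raise IndexError("index 0 is out of bounds for axis 0 with size 0")
    return genome_name + ": ok"


def run_all(genome_names):
    """Run predict() on every input and collect (name, result, error) tuples."""
    outputs = []
    with ThreadPool(2) as pool:
        pending = [(name, pool.apply_async(predict, (name,))) for name in genome_names]
        for name, res in pending:
            try:
                outputs.append((name, res.get(), None))
            except IndexError as exc:
                # get() re-raises the exception that occurred inside the worker.
                outputs.append((name, None, str(exc)))
    return outputs


if __name__ == "__main__":
    for row in run_all(["good", "bad"]):
        print(row)
```

Catching the exception around each get() call keeps the remaining genomes in a batch from being lost when a single input fails.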

@kbessonov1984
Collaborator

kbessonov1984 commented Feb 3, 2022

Hello,
I was able to reproduce your error for GCA_007463835.1 (Salmonella enterica subsp. salamae). It is caused by the antigenic formula II 1,9,12,46,27:z29:1,5, which is not WHO-listed, so the cgMLST prediction cannot be mapped to a serovar. Because there is no compatible WHO serovar to map this antigenic formula to, the results dataframe is empty. The crash occurs in the antigen_predictor.lookup_serovar_antigens(serovar_table(),cgmlst_serovar) function:

Traceback (most recent call last):
  File "/sistr/sistr_cmd.py", line 275, in sistr_predict
    overall_serovar_call(prediction, serovar_predictor)
  File "/sistr/src/serovar_prediction/__init__.py", line 542, in overall_serovar_call
    cgmlst_serovar_antigens = antigen_predictor.lookup_serovar_antigens(serovar_table(),cgmlst_serovar)
  File "/sistr/src/serovar_prediction/__init__.py", line 390, in lookup_serovar_antigens
    spp = df_prediction['subspecies'].values.item(0)
IndexError: index 0 is out of bounds for axis 0 with size 0

Thank you for reporting this legitimate issue. We will need to address this edge case in the code in the next release.
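
For readers following along, here is a minimal sketch of why .values.item(0) fails on an empty selection and how a guard could avoid the crash. It is not the SISTR implementation or the eventual fix: the two-row table, the "Serovar" column name, and both function names are assumptions made for illustration. Only the 'subspecies' column and the .values.item(0) call come from the traceback.

```python
import pandas as pd

# Hypothetical stand-in for the serovar table; the real table has more rows and columns.
serovar_table = pd.DataFrame(
    {
        "Serovar": ["Enteritidis", "Typhimurium"],
        "subspecies": ["enterica", "enterica"],
    },
    dtype=object,
)


def lookup_subspecies_unguarded(df, serovar):
    """Mirror the failing pattern: take item 0 without checking for rows."""
    df_prediction = df[df["Serovar"] == serovar]
    # Raises IndexError when no row matched, because the array is empty.
    return df_prediction["subspecies"].values.item(0)


def lookup_subspecies(df, serovar):
    """Return the subspecies for `serovar`, or None when no row matches."""
    df_prediction = df[df["Serovar"] == serovar]
    if df_prediction.empty:
        return None
    return df_prediction["subspecies"].values.item(0)


if __name__ == "__main__":
    print(lookup_subspecies(serovar_table, "Enteritidis"))
    print(lookup_subspecies(serovar_table, "II 1,9,12,46,27:z29:1,5"))
```

Returning None is just one option. The caller then has to decide how to report a serovar that has no WHO-listed match.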

@sheikki
Author

sheikki commented Feb 3, 2022

Hi,

Here are two more assemblies which fail similarly:

ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/017/826/655/GCA_017826655.1_PDT001000035.1_genomic.fna.gz
ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/017/845/815/GCA_017845815.1_PDT001001406.1_genomic.fna.gz
