Error when using more than 2 bam files with mode 1B #94

Open
cameronyoungpark opened this issue May 30, 2023 · 5 comments

@cameronyoungpark

Hello,
I am trying to use cellsnp-lite to prepare genotype files for demultiplexing with vireo. My multiplexed sample contains 8 genotypes, and I am trying to use 3-4 samples with known genotypes to aid the demultiplexing. The method works well with only 2 known samples, but when I increase to 3 I get many errors.

Step 1:
cellsnp-lite --genotype -R ref_file -s multiplexed_singlecell_file -b barcodes.tsv -O output_folder -p 22 --minMAF 0.1 --minCOUNT 100 --gzip

Step 2: This works well with 2 BAM files but does not seem to work with 3: I get 20 extra files for each of the 6 expected output files.
cellsnp-lite -s BAM1, BAM2, BAM3 -I donor1, donor2, donor3 -O germline_folder -R ref_file -p 20 --cellTAG None --UMItag None --gzip --genotype

Step 3: When step 2 has been run with 3 BAM files, this step fails with an index out of range error that I do not get when step 2 uses only 2 BAM files.
vireo -c output_folder/cellSNP.cells.vcf.gz -d germline_folder/cellSNP.cells.vcf.gz -o vireo_output -p 12 -N 8

Is it possible to run step 2 with multiple BAM files? If so, I would love some input on what I am doing incorrectly!
Thanks!

@hxj5
Collaborator

hxj5 commented May 31, 2023

Hi, could you share the log file, especially the "index out of range error" part you mentioned? That should be helpful.

@cameronyoungpark
Author

Hello, I don't have a log output file because vireo did not run successfully (not sure if you mean a different log file), but here is an example of the terminal output with the error:
(base) cyp2111_columbia_edu@vireo:~$ vireo -c /home/cyp2111_columbia_edu/KMA3_1.cellsnp/cellSNP.cells.vcf.gz -d /home/cyp2111_columbia_edu/germlineKMA3/cellSNP.cells.vcf.gz -o /home/cyp2111_columbia_edu/KMA3_1.vireo/ -p 12 -N 8
[vireo] Loading cell VCF file ...
[vireo] Loading donor VCF file ...
[vireo] 5500 out 6893 variants matched to donor VCF
[vireo] Demultiplex 27532 cells to 8 donors with 5500 variants.
[vireo] lower bound ranges [-63696.5, -61707.1, -58860.4]
[vireo] allelic rate mean and concentrations:
[[0.013 0.448 0.952]]
[[86129.4 74625.5 37268.1]]
[vireo] donor size before removing doublets:
donor0 donor1 donor2 donor3 donor4 donor5 donor6 donor7
3403 3369 3456 3466 3444 3461 3437 3494
Traceback (most recent call last):
  File "/home/cyp2111_columbia_edu/anaconda3/bin/vireo", line 8, in <module>
    sys.exit(main())
  File "/home/cyp2111_columbia_edu/anaconda3/lib/python3.8/site-packages/vireoSNP/vireo.py", line 217, in main
    write_donor_id(out_dir, donor_names, cell_dat['samples'], n_vars, res_vireo)
  File "/home/cyp2111_columbia_edu/anaconda3/lib/python3.8/site-packages/vireoSNP/utils/io_utils.py", line 99, in write_donor_id
    donor_singlet = np.array(donor_names, "U100")[np.argmax(ID_prob, axis=1)]
IndexError: index 7 is out of bounds for axis 0 with size 7
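
For anyone reading the traceback: the failing line indexes the array of donor names with the per-cell argmax of `ID_prob`. The error would therefore be consistent with `ID_prob` having 8 columns (matching `-N 8`) while `donor_names` holds only 7 entries. The sketch below reproduces that shape mismatch with made-up data. The helper `best_donor`, the array contents, and the 7-versus-8 explanation are assumptions for illustration, not a confirmed diagnosis of vireo.

```python
import numpy as np


def best_donor(donor_names, ID_prob):
    """Same indexing expression as the failing line in the traceback."""
    return np.array(donor_names, "U100")[np.argmax(ID_prob, axis=1)]


# Hypothetical data: 4 cells, 8 probability columns (as with -N 8).
ID_prob = np.zeros((4, 8))
ID_prob[0, 2] = 1.0  # cell 0 is assigned to column 2
ID_prob[1, 7] = 1.0  # cell 1 is assigned to the 8th column (index 7)

# Assumption: only 7 donor names are available, one fewer than the columns.
seven_names = [f"donor{i}" for i in range(7)]

try:
    best_donor(seven_names, ID_prob)
    error_message = None
except IndexError as exc:
    error_message = str(exc)

print(error_message)

# With 8 names the same expression succeeds.
eight_names = [f"donor{i}" for i in range(8)]
print(best_donor(eight_names, ID_prob))
```

If this is what happens, the open question is why the list of donor names ends up one short once a third BAM file is added in step 2.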

@hxj5
Collaborator

hxj5 commented Jun 1, 2023

Hi, you mentioned there were 20 extra files for each of the 6 expected output files in step 2, which indicates that either cellsnp-lite had not finished yet or some errors occurred. It would help a lot if you could run step 2 again and then share the log file of step 2.
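
A quick way to check what step 2 actually wrote is to count the donor columns in its output VCF. This is a minimal sketch, assuming a standard VCF layout in which sample names follow the nine fixed columns of the `#CHROM` header line. `vcf_sample_names` is a hypothetical helper, not part of cellsnp-lite or vireo, and the demo writes a tiny made-up file rather than reading real output.

```python
import gzip
import os
import tempfile


def vcf_sample_names(path):
    """Return the sample (donor) column names from a VCF header line."""
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rt") as handle:
        for line in handle:
            if line.startswith("#CHROM"):
                # Columns 0-8 are CHROM..FORMAT; the rest are sample names.
                return line.rstrip("\n").split("\t")[9:]
            if not line.startswith("#"):
                break  # reached data records without seeing a header line
    return []


# Demo on a made-up header with three donor columns.
fixed = ["#CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO", "FORMAT"]
demo_path = os.path.join(tempfile.mkdtemp(), "demo.vcf.gz")
with gzip.open(demo_path, "wt") as handle:
    handle.write("##fileformat=VCFv4.2\n")
    handle.write("\t".join(fixed + ["donor1", "donor2", "donor3"]) + "\n")

print(vcf_sample_names(demo_path))
```

Pointing the helper at `germline_folder/cellSNP.cells.vcf.gz` would presumably list one name per BAM file passed to step 2, provided that run completed.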

@cameronyoungpark
Author

cameronyoungpark commented Jun 1, 2023 via email

@hxj5
Collaborator

hxj5 commented Jun 2, 2023

Hi, you can use Linux I/O redirection to get the log file (i.e., run cellsnp-lite [options...] &>log_file), or you can directly share the terminal output of step 2.
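
If shell redirection is inconvenient (for example, when the command is launched from a script), the same capture can be sketched in Python using only the standard library. `run_with_log` is a hypothetical helper, and the demo command is a stand-in that writes to both streams, not a real cellsnp-lite invocation.

```python
import os
import subprocess
import sys
import tempfile


def run_with_log(command, log_path):
    """Run a command, sending stdout and stderr to one file (like `&>log_path`)."""
    with open(log_path, "w") as log:
        completed = subprocess.run(command, stdout=log, stderr=subprocess.STDOUT)
    return completed.returncode


# Stand-in command that prints one line to stdout and one to stderr.
demo_command = [
    sys.executable,
    "-c",
    "import sys; print('to stdout'); print('to stderr', file=sys.stderr)",
]
log_file = os.path.join(tempfile.mkdtemp(), "step2.log")
exit_code = run_with_log(demo_command, log_file)

with open(log_file) as handle:
    log_text = handle.read()
print(exit_code)
print(log_text)
```

For the real run, `demo_command` would be replaced by the step 2 command given as a list of arguments.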
