SNPlocs.Hsapiens.dbSNP156.GRCh37 #167

Open
AndyYang0924 opened this issue Sep 6, 2023 · 4 comments
Labels
enhancement (New feature or request), help wanted (Extra attention is needed)

Comments

@AndyYang0924

Hi, how can I access SNPlocs.Hsapiens.dbSNP156.GRCh37 or SNPlocs.Hsapiens.dbSNP156.GRCh38? Thank you!

@Al-Murphy
Collaborator

Hey Wenjun,

I don't believe Bioconductor versions of dbSNP 156 have been created yet. @hpages may have more information, but I know it took time to create dbSNP 155, so I'm not sure of the timeline for this. Sorry Hervé, do you have any thoughts on this?

Thanks,
Alan.

@hpages

hpages commented Sep 7, 2023

Hi Alan, Wenjun,

It might take a while before I get to produce the SNPlocs.Hsapiens.dbSNP156.* packages. The approach I'm currently using to generate the SNPlocs packages has reached its limits and doesn't scale well with the ever-increasing size of dbSNP, so it needs to be revisited, e.g. by splitting the whole thing into smaller packages, by moving the data to AnnotationHub, or both.

In the meantime, if you really need SNPlocs.Hsapiens.dbSNP156.GRCh37 now, you can try to forge it yourself using the scripts provided in the SNPlocsForge package. The package lacks documentation, sorry. The scripts for dbSNP156 are in inst/scripts/dbSNP156/. You first need to manually create the shell of the SNPlocs.Hsapiens.dbSNP156.GRCh37 package (use the SNPlocs.Hsapiens.dbSNP155.GRCh37 package as a template). Then run the following scripts in this order: download_json.sh, extract_snvs_from_RefSNP_json_files.sh, select_GRCh37_snvs.sh, build_GRCh37_OnDiskLongTable.sh.
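For anyone attempting this, the script order described above could be wrapped in a small driver so that a failure in one step stops the run instead of feeding bad input to the next. This is only a sketch: the function name `run_forge_pipeline` is made up for illustration, and it assumes (the thread does not confirm this) that each script is invoked with `bash`, takes no arguments, and is run from inside the scripts directory. Check each script before relying on it.

```bash
#!/usr/bin/env bash
# Sketch only: runs the four dbSNP156 forging scripts named in this thread,
# in the stated order, stopping at the first failure.
# ASSUMPTIONS (not confirmed by the thread): the scripts take no arguments
# and expect to be run with bash from inside their own directory.

run_forge_pipeline() {
    local scripts_dir="$1"   # e.g. <SNPlocsForge checkout>/inst/scripts/dbSNP156
    local steps=(
        download_json.sh
        extract_snvs_from_RefSNP_json_files.sh
        select_GRCh37_snvs.sh
        build_GRCh37_OnDiskLongTable.sh
    )
    local step

    # Check that every script exists before starting a very long run.
    for step in "${steps[@]}"; do
        if [ ! -f "$scripts_dir/$step" ]; then
            echo "missing script: $scripts_dir/$step" >&2
            return 1
        fi
    done

    for step in "${steps[@]}"; do
        echo "[$(date '+%F %T')] starting $step"
        # Subshell so the cd does not leak into the caller's shell.
        ( cd "$scripts_dir" && bash "./$step" ) || {
            echo "step failed: $step" >&2
            return 1
        }
    done
    echo "all steps finished"
}
```

After sourcing the file, it would be called as `run_forge_pipeline path/to/SNPlocsForge/inst/scripts/dbSNP156`, where the path is a placeholder for a local checkout. Given the run times mentioned in this thread, running it inside `tmux` or `screen` with output redirected to a log file is probably sensible.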

Note that you'll need a powerful Linux machine to run these scripts (I used a machine with 80 logical CPUs and 384 GB of RAM to forge the SNPlocs.Hsapiens.dbSNP155.* packages, and the scripts took about 1 week for each package). You'll also need a lot of disk space (roughly 300 to 400 GB).
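Before committing to a week-long run, it may help to check what the machine actually has. The sketch below reports logical CPUs, total RAM, and free disk space, and compares them with thresholds that you pass in. The function name `check_resources` is hypothetical, and the figures quoted above describe the machine that was used for dbSNP155, not documented minimum requirements, so any thresholds are your own judgment call. It assumes Linux (`nproc`, `/proc/meminfo`) and a POSIX `df`.

```bash
#!/usr/bin/env bash
# Sketch only: compare this machine's resources against caller-supplied
# thresholds. Usage: check_resources DIR MIN_CPUS MIN_RAM_GIB MIN_DISK_GIB
# ASSUMPTION: Linux, with nproc and /proc/meminfo available.

check_resources() {
    local dir="$1" min_cpus="$2" min_ram_gib="$3" min_disk_gib="$4"
    local cpus ram_gib disk_gib status=0

    cpus=$(nproc)
    # MemTotal is reported in kiB; convert to whole GiB.
    ram_gib=$(awk '/^MemTotal:/ {print int($2 / 1024 / 1024)}' /proc/meminfo)
    # df -Pk prints one line per filesystem; column 4 is available kiB.
    disk_gib=$(df -Pk "$dir" | awk 'NR==2 {print int($4 / 1024 / 1024)}')

    echo "logical CPUs: $cpus, RAM: ${ram_gib} GiB, free disk under $dir: ${disk_gib} GiB"

    if [ "$cpus" -lt "$min_cpus" ]; then
        echo "fewer than $min_cpus logical CPUs" >&2; status=1
    fi
    if [ "$ram_gib" -lt "$min_ram_gib" ]; then
        echo "less than $min_ram_gib GiB of RAM" >&2; status=1
    fi
    if [ "$disk_gib" -lt "$min_disk_gib" ]; then
        echo "less than $min_disk_gib GiB free under $dir" >&2; status=1
    fi
    return "$status"
}
```

For example, `check_resources /path/to/workdir 16 64 400` would flag a machine with fewer than 16 logical CPUs, under 64 GiB of RAM, or under 400 GiB free in the working directory. Those numbers are placeholders, not recommendations from the thread.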

Let me know if you decide to give it a try and I'll do my best to help.

Best,
H.

@Al-Murphy
Collaborator

Hey Hervé,

Thanks very much for the explanation. This is not something I have the time or resources to do right now, but I do believe it's important to find a more manageable way to produce these packages for subsequent releases. I'll get in touch with any suggestions on how to do this in the future.

Cheers,
Alan.

@Al-Murphy added the "enhancement" and "help wanted" labels on Sep 21, 2023
@Al-Murphy
Collaborator

Let's leave this open for now, since it has not been addressed in any meaningful way.
