Replace genome_ref file with Bioconductor data #53

Open
bschilder opened this issue Nov 11, 2021 · 2 comments
Labels
enhancement New feature or request

Comments

@bschilder
Collaborator

Downloading the genome_ref file is one of the slower steps.

input_url <- "https://ctg.cncr.nl/software/MAGMA/ref_data/g1000_eur.zip"

Would be best if we could switch to using a Bioconductor package that has the same info, and then save that to disk so MAGMA can access it.

e.g.
https://bioconductor.org/packages/release/data/annotation/html/BSgenome.Hsapiens.1000genomes.hs37d5.html

@bschilder
Collaborator Author

Currently, we only use the European subset of 1KG. We should definitely expand to the other sub-populations MAGMA makes available as well:

https://ctg.cncr.nl/software/MAGMA/ref_data
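
A minimal sketch of how the download could be cached and parameterised by population, so the slow step only runs once per reference panel. Everything here other than `input_url` and the base URL is hypothetical: the function names `genome_ref_url()` and `get_genome_ref_cached()`, the cache folder name, and the population codes are illustrative. It also assumes the other panels follow the same `g1000_<population>.zip` naming as the European file, which should be checked against the MAGMA reference data page.

```r
# Hypothetical helper: build the download URL for one 1KG sub-population.
# Assumption: every panel is named g1000_<population>.zip, like g1000_eur.zip.
genome_ref_url <- function(population = c("eur", "afr", "eas", "sas", "amr"),
                           base_url = "https://ctg.cncr.nl/software/MAGMA/ref_data") {
    population <- match.arg(population)  # errors on unknown codes
    paste0(base_url, "/g1000_", population, ".zip")
}

# Hypothetical helper: download and unzip the panel only if it is not cached.
get_genome_ref_cached <- function(population = "eur",
                                  cache_dir = tools::R_user_dir("genome_ref_demo",
                                                                which = "cache")) {
    dir.create(cache_dir, recursive = TRUE, showWarnings = FALSE)
    input_url <- genome_ref_url(population)
    zip_path <- file.path(cache_dir, basename(input_url))

    # The archives are large, so temporarily raise R's default 60 s timeout.
    old <- options(timeout = max(600, getOption("timeout")))
    on.exit(options(old), add = TRUE)

    if (!file.exists(zip_path)) {
        utils::download.file(input_url, destfile = zip_path, mode = "wb")
    }

    # Unzip next to the archive; skip if a previous call already did so.
    out_dir <- file.path(cache_dir, sub("\\.zip$", "", basename(zip_path)))
    if (!dir.exists(out_dir)) {
        utils::unzip(zip_path, exdir = out_dir)
    }
    out_dir  # directory that can be handed to MAGMA
}
```

Calling `get_genome_ref_cached("eur")` a second time would return the cached directory without touching the network, which addresses the speed concern without needing a Bioconductor replacement.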

@bschilder
Collaborator Author

After exploring BSgenome.Hsapiens.1000genomes.hs37d5, it seems this is just a single reference genome (not the individual-level genotype data from 1KG). It's also only phase 2 data.

After some additional searching, I can't seem to find any package that contains the data we're looking for. So for now, we're sticking with the MAGMA data.

@bschilder added the enhancement (New feature or request) label Nov 12, 2021