Merge pull request #29 from mancusolab/test_spectral
Enhancement
zeyunlu committed Sep 25, 2023
2 parents adb0313 + 302d0be commit 2d07a6f
Showing 11 changed files with 1,249 additions and 1,141 deletions.
6 changes: 6 additions & 0 deletions README.rst
Expand Up @@ -135,6 +135,8 @@ Version History
- Update the ``io.corr`` function so that it reports all correlation results whether or not the credible set is pruned.
* - 0.13
- Add the ``--keep`` option to let users specify a file containing the IDs of the subjects that SuShiE will analyze. Add the ``--ancestry_index`` option to let users specify a file containing the ancestry index for fine-mapping; with it, users can supply a single phenotype, genotype, and covariate file that contains all subjects across ancestries. Implement padding to improve inference time. Record the ELBO at each iteration and make it accessible in the ``infer.SuShiEResult`` object. The alphas table now outputs the average purity and KL divergence for each ``L``. Rename ``--kl_threshold`` to ``--divergence``. Add the ``--maf`` option to remove SNPs whose minor allele frequency falls below the threshold within each ancestry. Add the ``--max_select`` option to randomly select at most a given number of SNPs when computing purity, avoiding unnecessary memory use. Add a QC function to remove duplicated SNPs.
* - 0.14
- Remove KL-divergence pruning. Improve the command-line appearance and the contents of the output files. Fix small bugs in the multivariate KL computation.

.. _Support:
.. |Support| replace:: **Support**
Expand All @@ -159,6 +161,10 @@ Feel free to use other software developed by `Mancuso Lab <https://www.mancusola

* `twas_sim <https://github.com/mancusolab/twas_sim>`_: a Python software to simulate `TWAS <https://www.nature.com/articles/ng.3506>`_ statistics.

* `FactorGo <https://github.com/mancusolab/factorgo>`_: a scalable variational factor analysis model that learns pleiotropic factors from GWAS summary statistics.

* `HAMSTA <https://github.com/tszfungc/hamsta>`_: a Python software to estimate heritability explained by local ancestry data from admixture mapping summary statistics.

---------------------

.. _pyscaffold-notes:
Expand Down
5 changes: 0 additions & 5 deletions docs/conf.py
Expand Up @@ -114,11 +114,6 @@
(r"class:sushie.infer.*", "Infer Classes"),
(r".*:sushie.io.*", "IO Public-members"),
(r"class:sushie.io.*", "IO Classes"),
# (r"method:.*\.__(str|repr)__", "String representation"),
# ("method:.*", "Methods"),
# ("classmethod:.*", "Class methods"),
# (r"method:.*\.__(init|new)__", "Constructors"),
# (r"method:.*\.[A-Z][a-z]*", "Constructors"),
]

python_apigen_default_order = [
Expand Down
34 changes: 21 additions & 13 deletions docs/files.rst
Expand Up @@ -76,10 +76,14 @@ For ``*.meta.cs.tsv``, it will row-bind the output for single-ancestry SuShiE di
- Float
- 0.95
- The cumulative posterior probability of SNPs to be causal in the descending order. This decides which SNPs are included in the credible sets.
* - pip
* - pip_all
- Float
- 0.95
- The posterior inclusion probability (:math:`\text{PIP}_j` in :ref:`Model`). For ``*.meta.cs.tsv`` and ``*.mega.cs.tsv``, it will have ``meta_pip`` and ``mega_pip``, respectively.
- The posterior inclusion probability (:math:`\text{PIP}_j` in :ref:`Model`) calculated across all :math:`L` credible sets. ``*.meta.cs.tsv`` has an additional column called ``meta_pip_all``.
* - pip_cs
- Float
- 0.95
- The posterior inclusion probability (:math:`\text{PIP}_j` in :ref:`Model`) calculated across only the credible sets that are kept after pruning based on purity. ``*.meta.cs.tsv`` has an additional column called ``meta_pip_cs``.
* - trait
- String
- GeneABC
Expand Down Expand Up @@ -147,14 +151,10 @@ For ``*.meta.alphas.tsv``, it will row-bind the output for single-ancestry SuShi
- float
- 0.634
- The sample-size-weighted average purity across ancestries. It is compared with ``--purity`` to determine the value in ``in_cs_l1``. Depending on ``--L``, it can have extra columns.
* - kl_l1
- float
- 5.3
- The KL-divergence between the posterior probability of SNPs to be causal in the first credible set (:math:`\alpha_{l,j}` in :ref:`Model`) and uniform distribution. It will be ``-jnp.inf`` if ``--no_kl`` is specified. Depending on ``--L``, it can have extra columns.
* - pass_pruning_l1
* - kept_l1
- Integer
- 0, 1
- The indicator whether the credible set passes the pruning threshold. The criteria contain purity and divergence. Specifying ``--no_kl`` removes divergence criterion. Depending on ``--L``, it can have extra columns.
- An indicator of whether the credible set is kept after pruning based on the purity threshold. Depending on ``--L``, it can have extra columns.
* - trait
- String
- GeneABC
Expand All @@ -163,6 +163,10 @@ For ``*.meta.alphas.tsv``, it will row-bind the output for single-ancestry SuShi
- Integer
- 500
- The number of total SNPs in the inference.
* - purity_threshold
- float
- 0.5
- The purity threshold used to prune the credible sets.
* - ancestry
- String
- sushie, mega, ancestry_1
Expand Down Expand Up @@ -215,14 +219,18 @@ If ``--meta`` and ``--mega`` are specified (see definitions in :ref:`meta`), it
- Float
- 1.3
- The ancestry-specific SNP prediction weights inferred by SuShiE. For ``*.meta.weights.tsv``, it will have ``ancestry1_single_weight`` (It will have extra columns depending on the number of ancestries). If ``--mega``, it will have ``mega_weight`` for all ancestries.
* - sushie_pip
* - sushie_pip_all
- Float
- 0.95
- The posterior inclusion probability (:math:`\text{PIP}_j` in :ref:`Model`) for all the SNPs. (``*.cs.tsv`` only contains the PIPs of SNPs that are only in the credible sets). For ``*.meta.weights.tsv``, it will have ``ancestry1_single_pip``, ``meta_pip`` (It will have extra columns depending on the number of ancestries). For ``*.mega.weights.tsv``, it will have ``mega_pip``.
* - sushie_in_cs
- The posterior inclusion probability (:math:`\text{PIP}_j` in :ref:`Model`) for all SNPs, calculated across all :math:`L` credible sets (``*.cs.tsv`` contains the PIPs only for SNPs that are in the credible sets). For ``*.meta.weights.tsv``, it will have ``ancestry1_single_pip`` and ``meta_pip_all`` (with extra columns depending on the number of ancestries). For ``*.mega.weights.tsv``, it will have ``mega_pip_all``.
* - sushie_pip_cs
- Float
- 0.95
- The posterior inclusion probability (:math:`\text{PIP}_j` in :ref:`Model`) for all SNPs, calculated across only the credible sets that are kept after pruning based on purity (``*.cs.tsv`` contains the PIPs only for SNPs that are in the credible sets). For ``*.meta.weights.tsv``, it will have ``ancestry1_single_pip`` and ``meta_pip_cs`` (with extra columns depending on the number of ancestries). For ``*.mega.weights.tsv``, it will have ``mega_pip_cs``.
* - sushie_cs_index
- Integer
- 0, 1
- The indicator whether the SNP is in the credible set (0 means no and 1 means yes). For ``*.meta.weights.tsv``, it will have ``ancestry1_in_cs``(It will have extra columns depending on the number of ancestries). For ``*.mega.weights.tsv``, it will have ``mega_in_cs``.
- 0, 1, ..., :math:`L`
- The index of the credible set that the SNP falls into. 0 means that no credible set contains this SNP. For ``*.meta.weights.tsv``, it will have ``ancestry1_cs_index`` (with extra columns depending on the number of ancestries). For ``*.mega.weights.tsv``, it will have ``mega_cs_index``.
* - n_snps
- Integer
- 500
Expand Down
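
The difference between the new ``pip_all`` and ``pip_cs`` columns can be illustrated with a minimal sketch. It assumes the usual SuSiE-style definition :math:`\text{PIP}_j = 1 - \prod_l (1 - \alpha_{l,j})`, and the helper ``pip_from_alphas``, the toy ``alphas`` matrix, and the ``kept`` mask are hypothetical names made up for illustration; this is not SuShiE's own code.

```python
import numpy as np


def pip_from_alphas(alphas, kept=None):
    """Return per-SNP PIPs from an (L, p) matrix of alphas.

    Hypothetical helper: `kept` is an optional boolean mask of length L
    marking the credible sets that survive purity pruning.
    """
    alphas = np.asarray(alphas, dtype=float)
    if kept is not None:
        # Restrict the product to the credible sets that were kept.
        alphas = alphas[np.asarray(kept, dtype=bool)]
    # PIP_j = 1 - prod_l (1 - alpha_{l,j}); an empty product gives PIP 0.
    return 1.0 - np.prod(1.0 - alphas, axis=0)


# Toy example with L = 2 effects and p = 3 SNPs (made-up numbers).
alphas = np.array([
    [0.90, 0.05, 0.05],  # concentrated effect, assumed to pass pruning
    [0.40, 0.30, 0.30],  # diffuse effect, assumed to be pruned
])
kept = [True, False]

pip_all = pip_from_alphas(alphas)       # analogous to the pip_all column
pip_cs = pip_from_alphas(alphas, kept)  # analogous to the pip_cs column
print(pip_all, pip_cs)
```

Under these assumptions, ``pip_cs`` can never exceed ``pip_all`` for any SNP, because dropping a credible set removes a factor that is at most 1 from the product.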