
DEVELOPMENT.md


Deployment

The KDEpy project uses GitHub Actions to build wheels for Linux, macOS and Windows. Wheels are built using the cibuildwheel Python package. Once development is finished, the commands below create and push a tagged commit; from there CI builds the wheels and distributes them to PyPI automatically.

$ <run tests and linting>
$ git commit -m "Release of v0.X.Y"
$ git tag v0.X.Y
$ git push origin v0.X.Y

Milestones

The list below roughly shows what needs to be done.

  • univariate BaseKDE (todo: check if more common code can be moved)

  • univariate NaiveKDE

  • univariate TreeKDE

  • univariate FFTKDE (implement linbin even faster in cython)

  • univariate DiffusionKDE

  • Refactor kernel funcs - add solver for effective bandwidth

  • Implement univariate, fixed bandwidth KDEs naively

  • Implement weighted, fixed bandwidth, univariate KDEs naively

  • Implement variable bandwidth KDEs naively

  • Implement TreeKDE, test against other implementations

  • Implement Scott and Silverman rules for bandwidth selection

  • Make sure that speed and functionality match those of statsmodels, scikit-learn and scipy

  • Implement methods taking care of boundaries

  • Make sure TreeKDE works without finite support too
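To make a few of the milestones above concrete, here is a minimal sketch of a naive, weighted, fixed-bandwidth univariate KDE with a Gaussian kernel, together with Silverman's rule of thumb for the bandwidth. It is not KDEpy's implementation: the function names `silverman_bandwidth` and `naive_kde` are hypothetical and chosen only for illustration, and the only dependency assumed is NumPy.

```python
import numpy as np


def silverman_bandwidth(data):
    """Silverman's rule of thumb for a Gaussian kernel (hypothetical helper)."""
    data = np.asarray(data, dtype=float)
    n = data.size
    sigma = np.std(data, ddof=1)
    # Interquartile range, scaled so that it matches sigma for normal data
    q75, q25 = np.percentile(data, [75, 25])
    iqr = (q75 - q25) / 1.34
    # Use the smaller spread estimate; fall back to sigma if the IQR is zero
    spread = min(sigma, iqr) if iqr > 0 else sigma
    return 0.9 * spread * n ** (-1 / 5)


def naive_kde(data, grid, bw, weights=None):
    """Evaluate a Gaussian KDE on `grid` in O(len(grid) * len(data)) time."""
    data = np.asarray(data, dtype=float)
    grid = np.asarray(grid, dtype=float)
    if weights is None:
        weights = np.full(data.size, 1.0 / data.size)
    else:
        weights = np.asarray(weights, dtype=float)
        weights = weights / weights.sum()  # normalize so the density integrates to 1
    # Scaled distances between every grid point and every data point
    z = (grid[:, None] - data[None, :]) / bw
    kernel = np.exp(-0.5 * z**2) / np.sqrt(2 * np.pi)
    # Weighted sum of kernels, divided by the bandwidth
    return kernel @ weights / bw


rng = np.random.default_rng(0)
data = rng.normal(size=500)
grid = np.linspace(-8, 8, 2001)
bw = silverman_bandwidth(data)
density = naive_kde(data, grid, bw)
```

A sketch like this is useful mainly as a slow but readable reference: faster variants (tree-based or FFT-based) can be tested against its output on the same grid.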

Misc

General guidelines for this project

I hope to follow these guidelines for this project:

  • Import as few external dependencies as possible, ideally only NumPy.
  • Use test driven development, have tests and docs for every method.
  • Cite literature and implement recent methods.
  • Unless it's a bottleneck computation, readability trumps speed.
  • Employ object orientation, but resist the temptation to implement many methods - stick to the basics.
  • Follow PEP8.

Theory

Existing implementations

Implementations in Python

Implementations in conda packages

  • sklearn/neighbors/kde.py
  • scipy/stats/kde.py
  • statsmodels/nonparametric/*
  • seaborn/distributions.py

Other Python implementations

Implementations in other languages

References

Books

  • Silverman, B. W. Density Estimation for Statistics and Data Analysis. Boca Raton: Chapman and Hall, 1986. -- Page 99 for reference to kd-tree
  • Wand, M. P., and M. C. Jones. Kernel Smoothing. London; New York: Chapman and Hall/CRC, 1995. -- Page 182 for computation using linbin and fft

Wikipedia and other articles

Papers

Misc

Computation

  • An Algorithm for Finding Best Matches in Logarithmic Expected Time, Friedman et al., DOI 10.1145/355744.355745

https://www.ics.uci.edu/~ihler/code/kde.html

http://www-stat.wharton.upenn.edu/~lzhao/papers/MyPublication/Fast_jcgs.2010.pdf

https://indico.cern.ch/event/397113/contributions/1837849/attachments/1213965/1771772/main.pdf

http://www.cs.ubc.ca/~nando/papers/empirical.pdf

http://iopscience.iop.org/article/10.1088/1742-6596/762/1/012042/pdf

https://arxiv.org/pdf/1206.5278.pdf

https://www.researchgate.net/publication/228773329_Insights_on_fast_kernel_density_estimation_algorithms