Add example where KMedoids is better than existing scikit-learn clustering algorithms #22

Open
znd4 opened this issue Jul 28, 2019 · 6 comments

Comments

@znd4
Contributor

znd4 commented Jul 28, 2019

From @rth in #12:

A few more comments @zdog234, otherwise (after a light review) LGTM.

We adopted black for code style recently. Please run black sklearn_extra/ examples/ to fix the linter CI.

I would rather we merged this and opened follow up issues than keep this PR open until everything is perfect.

Maybe @jeremiedbb who worked on KMeans lately would also have some comments.

Later it would be nice to add an example on some dataset where KMedoids is better than existing scikit-learn clustering algorithms, as discussed in scikit-learn/scikit-learn#11099 (comment)

@kno10
Contributor

kno10 commented Jan 26, 2020

The current code implements an inferior algorithm, so for now I'd suggest comparing the results of non-Python implementations (R, ELKI, pip install kmedoids) if you want to study result quality.
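The comment above does not say which algorithm is meant. As a rough sketch only, and assuming the contrast is between a k-means-style alternating update and a PAM-style swap search, the difference in result quality can be illustrated on toy data. Everything here (function names, the 1-D data, the deliberately poor initialisation, the first-improvement simplification of the swap step) is hypothetical and is not taken from scikit-learn-extra or any of the packages mentioned.

```python
import itertools

def total_cost(points, medoids):
    """Sum over all points of the distance to the nearest medoid (1-D data)."""
    return sum(min(abs(p - points[m]) for m in medoids) for p in points)

def alternate(points, medoids):
    """k-means-style loop: assign points, then re-pick each cluster's medoid."""
    medoids = list(medoids)
    while True:
        # Assignment step: each point joins its nearest medoid.
        clusters = {m: [] for m in medoids}
        for i, p in enumerate(points):
            nearest = min(medoids, key=lambda m: abs(p - points[m]))
            clusters[nearest].append(i)
        # Update step: within each cluster, pick the member with the
        # smallest total distance to the other members.
        new = [
            min(members, key=lambda c: sum(abs(points[c] - points[j]) for j in members))
            for members in clusters.values()
        ]
        # Stop as soon as the cost no longer strictly decreases.
        if total_cost(points, new) >= total_cost(points, medoids):
            return sorted(medoids)
        medoids = new

def pam_swap(points, medoids):
    """Simplified swap search: accept any (medoid, non-medoid) swap that lowers the cost."""
    medoids = list(medoids)
    improved = True
    while improved:
        improved = False
        for pos, cand in itertools.product(range(len(medoids)), range(len(points))):
            if cand in medoids:
                continue
            trial = medoids[:pos] + [cand] + medoids[pos + 1:]
            if total_cost(points, trial) < total_cost(points, medoids):
                medoids, improved = trial, True
    return sorted(medoids)

# Hypothetical data: three well-separated 1-D blobs.
points = [0, 1, 2, 10, 11, 12, 30, 31, 32]
init = [0, 1, 2]  # deliberately poor start: all medoids in the first blob

alt = alternate(points, init)
swp = pam_swap(points, init)
print("alternate:", alt, "cost", total_cost(points, alt))
print("swap     :", swp, "cost", total_cost(points, swp))
```

From this starting point the alternating loop stalls with two medoids still inside the first blob, because an update can only move a medoid within its own cluster. The swap search may move a medoid anywhere and ends with one medoid per blob. A better initialisation would narrow the gap, so this shows the failure mode, not a benchmark.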

@TimotheeMathieu
Contributor

TimotheeMathieu commented Jun 26, 2020

k-medoids can be better than k-means when robustness matters. For example, see this figure, where k-medoids gives a really good result while k-means puts the outliers in a class of their own. The data consist of 3 Gaussian blobs and an "outlier" group situated far away from these blobs, and I don't know many clustering algorithms that would exhibit this kind of robustness. In fact, k-medoids is a little more stable on this example than the algorithm I wrote specifically to be robust (the second figure). This example could be added to the doc I think.
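One reason for this kind of robustness can be sketched without any clustering library: a k-means centre is the mean of its cluster, which a distant point can drag arbitrarily far, whereas a k-medoids centre must be one of the data points and minimises the sum of unsquared distances. The snippet below is a minimal illustration on a hypothetical toy cluster, not the blobs-plus-outliers example from the figure, and the helper names are made up for this sketch.

```python
import math

def mean_center(points):
    """Coordinate-wise mean: the kind of centre k-means uses for a cluster."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))

def medoid(points):
    """Data point with the smallest total distance to all points in the cluster:
    the kind of centre k-medoids uses."""
    return min(points, key=lambda c: sum(math.dist(c, p) for p in points))

# Hypothetical toy cluster: five points near (1, 1) plus one far-away outlier.
cluster = [
    (1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.2), (0.9, 0.8),
    (50.0, 50.0),
]

mean = mean_center(cluster)
med = medoid(cluster)
print("mean  :", mean)
print("medoid:", med)
```

Here the single outlier pulls the mean to roughly (9.17, 9.17), far from every inlier, while the medoid remains the point (1.0, 1.0) in the middle of the dense group.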

@rth
Contributor

rth commented Jun 26, 2020

This example could be added to the doc I think.

That would be great! Do you already have the code for that example, @TimotheeMathieu?

@TimotheeMathieu
Contributor

TimotheeMathieu commented Jun 26, 2020

Yes, in fact it is an example I came up with for PR #42; you can find it here. I just added k-medoids with default parameters and got the result displayed. Maybe it would be interesting to change the doc page I made to include k-medoids, because k-medoids is in fact robust. I will try to make a PR for this if that's all right with you.

@rth
Contributor

rth commented Jun 26, 2020

That would be great, thank you!

@rth
Contributor

rth commented Jun 26, 2020

In general, if you see other things to improve in this repo, don't hesitate to submit PRs; we are actively looking for maintainers :)
