
Process gets killed over 50M points #18

Open
mrp3anut opened this issue Jan 4, 2024 · 4 comments

Comments

@mrp3anut

mrp3anut commented Jan 4, 2024

I am facing an issue where, if I try to cluster more than 50M points, my kernel/Python process dies. Weirdly, it doesn't seem to be a memory issue: I monitored memory usage throughout and it never exceeded 50%. Because the process is killed, I also don't get an error stack I can provide. Any ideas?

@lmcinnes
Contributor

lmcinnes commented Jan 6, 2024

That is weird, and I'm not sure what the best way to debug that is -- potentially it is dying inside of the compiled numba code and that's why it just fails out. In an ideal world you might be able to try running it with numba off (set the environment variable NUMBA_DISABLE_JIT to 1), but potentially it will be so slow at that point that you won't get to your crash. It might be worth a try anyway, just in case.
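A minimal sketch of the suggested debugging step. The key detail is that NUMBA_DISABLE_JIT is only read when numba is first imported, so it has to be set before the clustering library is loaded (the library import is left as a comment since this thread doesn't pin down the exact package name):

```python
import os

# NUMBA_DISABLE_JIT must be set before numba (or anything that imports
# numba) is loaded; setting it in an already-running session where numba
# has been imported has no effect.
os.environ["NUMBA_DISABLE_JIT"] = "1"

# Now import the clustering library and re-run the failing fit, e.g.:
# from fast_hdbscan import HDBSCAN  # assumed import path; adjust to your setup
```

With the JIT disabled, a crash inside formerly-compiled code should surface as an ordinary Python traceback, at the cost of a large slowdown.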

@mrp3anut
Author

mrp3anut commented Jan 6, 2024

So I ran the code as a script and got exit code 139. That is a segmentation fault, which, as you mentioned, is probably related to the numba part. Maybe due to some fixed variable types and limits?
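For reference, the shell reports a signal death as 128 plus the signal number, so exit code 139 can be decoded with the standard library (on POSIX systems, where SIGSEGV is signal 11):

```python
import signal

# Shell exit codes above 128 mean the process was killed by a signal:
# exit code = 128 + signal number, so 139 - 128 = 11.
sig = signal.Signals(139 - 128)
print(sig.name)  # SIGSEGV, i.e. a segmentation fault
```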

@mrp3anut
Author

mrp3anut commented Jan 8, 2024

So I managed to track down where the error occurs: it is the eom_recursion function in cluster_trees.py. My guess is it is hitting a recursion limit. Switching to cluster_selection_method='leaf' seems to work, and increasing the recursion depth might also solve the issue for eom.
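A sketch of the two workarounds mentioned above. The HDBSCAN import path and parameter names follow the standard HDBSCAN-style API and are assumptions, not something confirmed in this thread:

```python
import sys

# Workaround 1: avoid the recursive excess-of-mass traversal entirely by
# selecting leaf clusters instead (assumed standard HDBSCAN-style API):
# from fast_hdbscan import HDBSCAN  # assumed import path
# clusterer = HDBSCAN(cluster_selection_method="leaf").fit(data)

# Workaround 2: raise Python's recursion limit before fitting with the
# default 'eom' method.  Note this only helps if the recursion happens in
# pure Python; a recursion limit hit inside numba-compiled code would
# still segfault.
sys.setrecursionlimit(100_000)
```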

@lmcinnes
Contributor

lmcinnes commented Jan 8, 2024

Good catch -- yes, it is certainly possible that with that many points you are hitting a recursion limit; it might be internal to numba however, which might make it harder to remedy. If the python recursion limit setting isn't enough I would suggest trying to reach out to the numba team (they are usually pretty responsive on gitter) and see if they have ideas.

You can also try increasing the min_cluster_size which would simplify the tree a little.
