Prevent removing only common subsumers in resnik similarity #239

Open
johnbradley opened this issue Sep 24, 2021 · 2 comments
Labels
bug Something isn't working

Comments

@johnbradley
Contributor

The resnik_similarity() function removes terms with a frequency of 0. This filtering can remove the only common subsumer shared by two terms, which would produce invalid results. We should prevent this from happening.
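To make the failure mode concrete, here is a minimal Python sketch. It is not the package's code: the term names, the `subsumers` and `frequency` tables, and the `common_subsumers` helper are all hypothetical, and the only assumption is that filtering happens by dropping every term whose frequency is 0.

```python
# Hypothetical toy data: the subsumers (ancestor terms) of two query terms.
subsumers = {
    "term_a": {"root_x", "a_only"},
    "term_b": {"root_x", "b_only"},
}

# Hypothetical corpus frequencies. root_x has frequency 0, so it gets filtered out.
frequency = {"root_x": 0.0, "a_only": 0.25, "b_only": 0.5}


def common_subsumers(a, b, keep):
    """Return the subsumers shared by a and b, restricted to the terms in keep."""
    return subsumers[a] & subsumers[b] & keep


all_terms = set(frequency)
nonzero_terms = {t for t, f in frequency.items() if f > 0}

# Before filtering, the two terms share exactly one subsumer.
before = common_subsumers("term_a", "term_b", all_terms)

# After filtering, nothing is shared, so there is no information content
# to take a maximum over and no meaningful Resnik score for this pair.
after = common_subsumers("term_a", "term_b", nonzero_terms)

print(before, after)
```

In this toy case the pair goes from one shared subsumer to none, which is the situation this issue asks to guard against.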

See #235 (comment) for more details.

@johnbradley
Contributor Author

johnbradley commented Sep 24, 2021

@hlapp suggested the following:
After removing terms with 0 frequency, check the Jaccard similarity. If any pair of terms has a Jaccard similarity of 0, raise an error instead of returning a result. It would also be good to show a warning whenever rows are removed, since users might not expect this.
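One way to read this suggestion is sketched below in Python. This is an illustration under assumptions, not the package's implementation: `drop_zero_frequency`, `jaccard`, and the dict-of-sets input are hypothetical names and shapes, and the check is interpreted as applying to every pair of terms after the removal step.

```python
import warnings
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two sets; defined as 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def drop_zero_frequency(subsumers, frequency):
    """Remove zero-frequency subsumers, then verify that every pair still overlaps.

    subsumers: hypothetical mapping of term -> set of its subsumers.
    frequency: hypothetical mapping of subsumer -> corpus frequency.
    """
    zero = {t for t, f in frequency.items() if f == 0}
    used = set().union(*subsumers.values())
    removed = sorted(zero & used)

    # Warn whenever anything is removed, since users may not expect it.
    if removed:
        warnings.warn(f"Removing {len(removed)} subsumer(s) with 0 frequency: {removed}")

    filtered = {term: subs - zero for term, subs in subsumers.items()}

    # The check runs after removal: a Jaccard similarity of 0 means the pair
    # no longer shares any subsumer.
    for a, b in combinations(sorted(filtered), 2):
        if jaccard(filtered[a], filtered[b]) == 0:
            raise ValueError(f"No common subsumer left for {a!r} and {b!r} after removal")

    return filtered
```

Raising an error here trades convenience for safety: the caller gets no scores at all rather than scores computed without a common subsumer.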

@hlapp
Member

hlapp commented Sep 24, 2021

After removing them, not before.
