How to dynamically add/delete documents #20

Answered by xhluca
luoyangen asked this question in Q&A
At the moment, it is not possible to add or remove a document. A document cannot be added because the BM25 scores are computed from the term frequencies and the average document length, both of which are derived from the corpus at indexing time. Modifying the corpus after the scores have been computed (during indexing) therefore means that all the scores need to be recomputed, which is the same as re-indexing a new corpus.
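To illustrate why the scores go stale, here is a minimal pure-Python sketch (not the bm25s implementation). It assumes a Lucene-style IDF and the common defaults `k1=1.5`, `b=0.75`; the helper names `corpus_stats` and `bm25_score` are made up for this example. It shows that adding one document changes both the average document length and the IDF of a term, so the score of an untouched document changes too.

```python
import math
from collections import Counter


def corpus_stats(corpus_tokens):
    """Return (avgdl, idf) for a tokenized corpus (hypothetical helper)."""
    n_docs = len(corpus_tokens)
    avgdl = sum(len(doc) for doc in corpus_tokens) / n_docs
    # Document frequency: number of documents containing each term.
    df = Counter(term for doc in corpus_tokens for term in set(doc))
    # Assumption: Lucene-style IDF; other BM25 variants use other formulas.
    idf = {t: math.log(1 + (n_docs - f + 0.5) / (f + 0.5)) for t, f in df.items()}
    return avgdl, idf


def bm25_score(query, doc, avgdl, idf, k1=1.5, b=0.75):
    """Score one document against a query with the given corpus statistics."""
    tf = Counter(doc)
    score = 0.0
    for term in query:
        if term not in tf:
            continue
        norm = tf[term] + k1 * (1 - b + b * len(doc) / avgdl)
        score += idf[term] * tf[term] * (k1 + 1) / norm
    return score


corpus = [["cat", "sat", "mat"], ["dog", "barked"]]
avgdl_before, idf_before = corpus_stats(corpus)
score_before = bm25_score(["cat"], corpus[0], avgdl_before, idf_before)

# Add a single document; the first document itself is unchanged.
corpus_after = corpus + [["cat", "chased", "the", "dog", "around", "the", "yard"]]
avgdl_after, idf_after = corpus_stats(corpus_after)
score_after = bm25_score(["cat"], corpus_after[0], avgdl_after, idf_after)

print(avgdl_before, avgdl_after)   # average length shifts
print(score_before, score_after)   # same document, different score
```

Because every stored score depends on these corpus-wide statistics, an index that precomputes scores has to redo that work whenever the corpus changes.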

A similar discussion can be found here: #5

Note that traditional BM25 implementations, like rank-bm25, support this, so I recommend checking it out if you specifically need dynamic add/remove. Otherwise, bm25s should be fairly fast at re-indexing small documents (<5…
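If re-indexing is fast enough for your corpus, the workaround can be wrapped in a small helper. The sketch below is only an illustration: `RebuildOnChangeCorpus` and `toy_builder` are hypothetical names, and `build_index` is whatever function you supply to turn a list of documents into an index. It keeps the raw documents and rebuilds the whole index on every add or remove.

```python
from typing import Callable, Generic, Iterable, List, Optional, TypeVar

IndexT = TypeVar("IndexT")


class RebuildOnChangeCorpus(Generic[IndexT]):
    """Hypothetical wrapper: keep the documents, rebuild the index on change."""

    def __init__(
        self,
        build_index: Callable[[List[str]], IndexT],
        documents: Iterable[str] = (),
    ) -> None:
        self._build_index = build_index
        self.documents: List[str] = list(documents)
        self.index: Optional[IndexT] = None
        self._rebuild()

    def _rebuild(self) -> None:
        # Full re-index; no index is kept for an empty corpus.
        self.index = self._build_index(self.documents) if self.documents else None

    def add(self, document: str) -> None:
        self.documents.append(document)
        self._rebuild()

    def remove(self, document: str) -> None:
        self.documents.remove(document)  # raises ValueError if not present
        self._rebuild()


def toy_builder(docs: List[str]) -> dict:
    """Stand-in for a real indexer; only records corpus-level statistics."""
    return {
        "n_docs": len(docs),
        "avgdl": sum(len(d.split()) for d in docs) / len(docs),
    }


store = RebuildOnChangeCorpus(toy_builder, ["a cat sat", "a dog barked loudly"])
store.add("birds sing")
store.remove("a cat sat")
print(store.index)
```

With bm25s, `build_index` would presumably tokenize the documents and index them into a fresh retriever. The cost is a full rebuild per change, so batching several additions before rebuilding may be worthwhile.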

Answer selected by luoyangen

This discussion was converted from issue #19 on July 08, 2024 03:14.