Fix computation of Word2Vec loss & add loss value to logging string #2135
Open
alreadytaikeune wants to merge 20 commits into piskvorky:develop from alreadytaikeune:develop
Commits on Jul 19, 2018
-
Fixing the computation of the Word2Vec loss.
This commit rewrites the computation of the loss for both CBOW and SG Word2Vec. The loss that is computed and reported is the running average NCE loss within the epoch: for each new epoch the counters are reset to 0 and a new average is computed. This was not the case before. The loss was accumulated over the whole training run, which is not very informative, and the implementation was also incorrect (see below).

The computation of the word2vec loss was flawed in several ways:
- race condition on the running_training_loss parameter (updated concurrently in a GIL-free portion of the code)
- incorrect dividing factor for the average in the case of SG. The averaging factor in the case of SG should not be the effective words but the effective samples (a new variable I introduce), because the loss is incremented as many times as there are positive examples sampled for an effective word.

Additionally, I add logging of the current loss value in the progress logger when compute_loss is set to True, and I add a parameter to the word2vec_standalone script to trigger the reporting of the loss.
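The bookkeeping described in this commit message can be illustrated with a minimal sketch. It is not the code from this PR: `EpochLossTracker` and its methods are hypothetical names, and the per-job numbers are made up. It only shows the three ideas from the description: reset the counters at each epoch, let each worker return its own tally so that nothing is updated concurrently, and divide by effective samples rather than effective words.

```python
from dataclasses import dataclass


@dataclass
class EpochLossTracker:
    """Running average loss within one epoch (hypothetical helper, not gensim API)."""
    total_loss: float = 0.0
    effective_samples: int = 0

    def reset(self) -> None:
        # Called at the start of every epoch so the average never spans epochs.
        self.total_loss = 0.0
        self.effective_samples = 0

    def add_job_result(self, job_loss: float, job_samples: int) -> None:
        # Each worker returns its own (loss, samples) tally; only this single
        # aggregation point mutates the totals, so workers never race on them.
        self.total_loss += job_loss
        self.effective_samples += job_samples

    @property
    def average(self) -> float:
        # Divide by samples, not words: in skip-gram one effective word
        # adds one loss term per positive example sampled for it.
        if self.effective_samples == 0:
            return 0.0
        return self.total_loss / self.effective_samples


tracker = EpochLossTracker()
for epoch in range(2):
    tracker.reset()
    # Made-up per-job tallies: (summed loss, number of positive samples).
    for job_loss, job_samples in [(12.0, 4), (6.0, 2)]:
        tracker.add_job_result(job_loss, job_samples)
    print(f"epoch {epoch}: average loss {tracker.average:.3f}")
```

Because the tracker is reset per epoch, both epochs in this toy run report the same average, whereas a counter accumulated over the whole training run would keep growing.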
SHA e96798c
-
SHA f447df0
-
SHA a6548c4
-
SHA a2fd340
-
SHA 1bdd4a5
-
SHA 7b457d6
Commits on Aug 6, 2018
-
SHA 18735e2
Commits on Aug 14, 2018
-
SHA 0bcae41
Commits on Oct 1, 2018
-
Merging work done in PR piskvorky#2127
In word2vec_inner.pyx, functions now use the new config object while still returning the number of samples. In base_any2vec, logging includes the new loss values (the addition of this branch).
SHA eb4b14d
-
SHA 00e7b7d
Commits on Jan 9, 2019
-
SHA 995b5f8
Commits on Jan 18, 2019
-
Fixing broken interface with doc2vec
akhlif committed Jan 18, 2019 (SHA 6b46f64)
-
akhlif committed Jan 18, 2019 (SHA f6a5cc5)
-
SHA 3a453a9
-
SHA a8e4a66
-
akhlif committed Jan 18, 2019 (SHA 854c8fd)
Commits on Jan 22, 2019
-
SHA 5e21a85
Commits on Feb 20, 2019
-
Fixing docstrings and code redundancy
akhlif committed Feb 20, 2019 (SHA 0f4d572)
-
Merge branch 'develop' of github.com:alreadytaikeune/gensim into develop
akhlif committed Feb 20, 2019 (SHA 3eec299)
Commits on Feb 21, 2019
-
SHA aaf9ed9