
Fix computation of Word2Vec loss & add loss value to logging string #2135

Open · wants to merge 20 commits into base: develop

Conversation

alreadytaikeune

The current computation of word2vec losses has flaws. I address them in this PR.

This PR rewrites the computation of the loss for both CBOW and SG Word2Vec. The loss that is computed and reported is the running average NCE loss within the epoch. This means that for each new epoch, the counters are reset to 0 and a new average is computed. This was not the case before: the loss was accumulated over the whole training run (but the total_examples used to average it was only incremented during one epoch), which is not very informative, besides being incorrect in the implementation. Moreover, reporting the average loss over an epoch seems closer to the spirit of what the previous code was trying to achieve.

The computation of the word2vec loss was flawed in many ways:

  • race condition on the running_training_loss parameter (updated concurrently in a
    GIL-free portion of the code)
  • As mentioned above, there was an inconsistency between accumulating the loss over the whole run and resetting the averaging factor at each epoch.
  • Even with the points above fixed, the following remains: the dividing factor for the average in the SG case is wrong. The averaging factor for SG should not be the effective words, but the effective samples (a new variable I introduce), because the loss is incremented once for each positive example that is sampled for an effective word (a sketch of the intended bookkeeping follows this list).
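
Below is a minimal sketch (not gensim's actual code; all names are hypothetical) of the bookkeeping this PR aims for: the counters are reset at each epoch, and the average is taken over effective samples rather than effective words.

class LossTracker:
    def __init__(self):
        self.running_training_loss = 0.0
        self.effective_samples = 0

    def start_epoch(self):
        # reset per-epoch counters so the average only covers the current epoch
        self.running_training_loss = 0.0
        self.effective_samples = 0

    def add_batch(self, batch_loss, batch_samples):
        # batch_samples counts positive examples (one per sampled context word in SG),
        # not effective words, so the average uses the right denominator
        self.running_training_loss += batch_loss
        self.effective_samples += batch_samples

    def average_loss(self):
        return self.running_training_loss / max(self.effective_samples, 1)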

Additionally, I add logging of the current value of the loss in the progress logger when compute_loss is set to True, and I add a parameter to the word2vec_standalone script to trigger the reporting of the loss.

As a hint that the implementation is now correct, one can look at the first reported values of the loss, when the word embeddings are still relatively uniformly distributed. In this situation, the expected value of the NCE loss (for Skip-Gram) should be -(N+1)\log(\sigma(0)), where N is the number of negative samples. That is 5.545 for N=7 and 14.556 for N=20, which corresponds to the loss values reported by the current implementation.
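
As a quick numeric check of that formula (a standalone sketch, not part of the PR): since sigmoid(0) = 0.5, the expected initial loss per positive sample is (N+1) * log(2).

import numpy as np

def expected_initial_sg_loss(negative):
    # -(N+1) * log(sigmoid(0)) == (N+1) * log(2), assuming near-uniform embeddings
    return -(negative + 1) * np.log(0.5)

print(round(expected_initial_sg_loss(7), 3))   # 5.545
print(round(expected_initial_sg_loss(20), 3))  # 14.556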

This commit re-writes the computation of the loss for both CBOW and SG
Word2Vec. The loss that is computed and reported is the running average NCE
loss within the epoch. This means that for each new epoch, the counters are
reset to 0, and the new average is computed. This was not the case before,
and the loss was incremented during the whole training, which is not
very informative, besides being incorrect in the implementation (see below).

The computation of the word2vec loss was flawed in many ways:
- race condition on the running_training_loss parameter (updated concurrently in a
  GIL-free portion of the code)
- incorrect dividing factor for the average in the case of SG. The averaging
  factor in the case of SG should not be the effective words, but the effective
  samples (a new variable I introduce), because the loss is incremented as many
  times as there are positive examples that are sampled for an effective word.

Additionally, I add the logging of the current value of the loss in the progress
logger, when compute_loss is set to True, and I add a parameter to the
word2vec_standalone script to trigger the reporting of the loss.
@piskvorky
Owner

piskvorky commented Jul 19, 2018

Thanks @alreadytaikeune. The training loss is a recent addition and may contain errors. What is your use-case for it? How did you find out about this issue?

Re. your PR: what are the performance implications of your changes? Can you post a before/after benchmark?

@alreadytaikeune
Author

Thanks for your reply. I started digging into it because I needed to compare the evolution and values of the loss between two word2vec implementations. Therefore I needed to understand exactly how the loss was computed in gensim, which led me to uncover some issues and propose these changes. Are the issues I pointed out clear to you? I don't know if I expressed them well enough.

As to the benchmark, I haven't really done any, since the changes are very minor and should not impact performance, but I'll do one if needed. Do you have some standard tools for this purpose that I can use, or do I need to write my own?

Finally, it seems the documentation has a hard time building. From what I understand, it is the lines

-loss 
                If present, the loss will be computed and printed during training

in the word2vec_standalone.py that are the cause. The problem is that there isn't any type after loss, because it is only a flag. I am not sure what the sphinx syntax is for this kind of stuff. Any idea?

@alreadytaikeune
Author

@piskvorky Everything seems to be fixed now.

@piskvorky
Owner

piskvorky commented Jul 22, 2018

Thanks! Can you run the benchmark too? Simply train a model on the text9 corpus with/without this PR, using 4 parallel workers and computing / not computing the loss. Then post the four logs (old+loss, old-loss, new+loss, new-loss) to gist and add the links here in a comment.

There'll be some minor variance in the timings, even with the same parameters; that's expected. But we're interested in a sanity check here (<<10% difference).

@alreadytaikeune
Author

alreadytaikeune commented Jul 23, 2018

@piskvorky

I have run the tests you asked for in the following environment:

FROM python:2
RUN mkdir -p /home/gensim_user
WORKDIR /home/gensim_user
RUN apt-get update && apt-get install -y wget build-essential 
RUN pip install numpy scipy six smart_open

Using the script below:

#! /bin/bash

mkdir -p data
cur_dir=$(pwd)

if [ ! -f data/text9 ]; then
  if [ ! -f data/enwik9.zip ]; then
    wget -c http://mattmahoney.net/dc/enwik9.zip -P data
  fi
  if [ ! -f data/enwik9 ]; then
    unzip data/enwik9.zip -d data
  fi
  perl wikifil.pl data/enwik9 > data/text9
fi

GENSIM=/home/gensim_user/gensim
ARGS="-train data/text9 -output /tmp/test -window 5 -negative 5 -threads 4 -min_count 5 -iter 5 -cbow 0"
SCRIPT=/usr/local/lib/python2.7/site-packages/gensim-3.5.0-py2.7-linux-x86_64.egg/gensim/scripts/word2vec_standalone.py
if [ -d $GENSIM ]; then
  rm -r $GENSIM
fi
cd /home/gensim_user/ && git clone https://github.com/RaRe-Technologies/gensim && cd gensim &&\
git checkout 96444a7d0357fc836641ec32c65eb2fbffbee68d && python setup.py install
cd $cur_dir
echo "RUNNING WORD2VEC BASE NO LOSS"
python $SCRIPT $ARGS 2>&1 | tee logs_base_word2vec_no_loss
pip uninstall -y gensim
cd /home/gensim_user/ && git clone https://github.com/alreadytaikeune/gensim && cd gensim && git checkout develop && python setup.py install
cd $cur_dir
echo "RUNNING WORD2VEC NEW NO LOSS"
python $SCRIPT $ARGS 2>&1 |tee logs_new_word2vec_no_loss
echo "RUNNING WORD2VEC NEW+LOSS"
python $SCRIPT $ARGS -loss 2>&1 |tee logs_new_word2vec_loss

The results of the run can be found here:
(Base no loss): https://gist.github.com/alreadytaikeune/554f21e95aa73ed414482d07b2e6314b
(New no loss): https://gist.github.com/alreadytaikeune/a184e88c059e038e2023170bb29a1eb7
(New loss): https://gist.github.com/alreadytaikeune/62ac3ca2da0d5d860404f0993acc81ac

I've only run them once, because it takes a bit of time. If we need more solid proof, we should probably settle on a smaller corpus and average over more runs. Anyway, it seems that computing the loss adds around 3% to the total runtime, while the runtime without loss computation is unaffected (ignoring small variations between runs), as expected.

I haven't run the benchmark on the base code with the loss computation, because if we agree that the previous computation is flawed, I don't really see the relevance. Maybe we should discuss this point more, to double-check my reasoning and implementation.

@piskvorky
Owner

Thanks a lot @alreadytaikeune ! That sounds good to me.

Let's wait for @menshikh-iv 's final verdict & merge, once he gets back from holiday.

Contributor

@menshikh-iv menshikh-iv left a comment

Wow, nice catch @alreadytaikeune 👍 please make the needed fixes and I think we can merge it :)

@@ -124,7 +124,17 @@ def _clear_post_train(self):
raise NotImplementedError()

def _do_train_job(self, data_iterable, job_parameters, thread_private_mem):
"""Train a single batch. Return 2-tuple `(effective word count, total word count)`."""
"""Train a single batch. Return 3-tuple
Contributor

Please follow numpy style for docstrings; more links are available here


for callback in self.callbacks:
callback.on_batch_end(self)

progress_queue.put((len(data_iterable), tally, raw_tally)) # report back progress
# report back progress
progress_queue.put(
Contributor

no need to break the line (we use a 120-character limit for gensim), here and everywhere

@@ -260,6 +281,7 @@ def _log_train_end(self, raw_word_count, trained_word_count, total_elapsed, job_

def _log_epoch_progress(self, progress_queue, job_queue, cur_epoch=0, total_examples=None, total_words=None,
report_delay=1.0):

Contributor

please revert

Contributor

still here

@@ -491,8 +491,8 @@ def train_batch_sg(model, sentences, alpha, _work, compute_loss):
cdef int negative = model.negative
cdef int sample = (model.vocabulary.sample != 0)

cdef int _compute_loss = (1 if compute_loss else 0)
cdef REAL_t _running_training_loss = model.running_training_loss
cdef int _compute_loss = (1 if compute_loss is True else 0)
Contributor

Why? The old variant works correctly (same for cbow)

model.running_training_loss = _running_training_loss
return effective_words
model.running_training_loss += _running_training_loss
return effective_words, effective_words
Contributor

It is worth writing a comment explaining why the same value is returned twice

Author

I added the reason for that in the new docstring.

tally += train_batch_cbow(self, sentences, alpha, work, neu1, self.compute_loss)
return tally, self._raw_word_count(sentences)
(tally, effective_samples) = train_batch_cbow(self, sentences, alpha, work, neu1, self.compute_loss)
return tally, self._raw_word_count(sentences), effective_samples
Contributor

Need to update the docstrings everywhere when you change the return type

Collaborator

@alreadytaikeune Still not done, please check

@@ -966,6 +989,9 @@ def train(self, sentences=None, input_streams=None, total_examples=None, total_w
total_words=total_words, epochs=epochs, start_alpha=start_alpha, end_alpha=end_alpha, word_count=word_count,
queue_factor=queue_factor, report_delay=report_delay, compute_loss=compute_loss, callbacks=callbacks)

def get_latest_training_loss(self):
return 0
Contributor

should raise NotImplementedError (the loss feature works only for w2v now, not for d2v/fasttext/etc), or maybe return -1?

Author

I changed it to raise an exception. I think -1 would be confusing, since it is not the value that will be displayed; the value that is displayed is the running training loss divided by the number of samples. That is why I chose to return 0 rather than -1: -1 would make it look like the loss is decreasing as the number of processed words increases.

else:
# Model doesn't implement the samples tallying. We assume
# that the number of samples is the effective words tally. This
# gives coherent outputs with previous implementations
Contributor

please add a TODO here - the if/else should be removed once compute_loss is implemented for all models

cur_epoch + 1, 100.0 * example_count / total_examples, trained_word_count / elapsed,
utils.qsize(job_queue), utils.qsize(progress_queue)
)
if self.compute_loss:
Contributor

can you fully refactor this function please (make it clearer & shorter, not if { if { } else { } } else { if { } else { } } )?

Hint (a sketch follows below):

  • build the logging message template first
  • collect the needed parameters into a list/tuple
  • pass them with *my_parameters
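
For illustration, one possible shape of that refactor (a rough sketch only; the method name and the loss argument are hypothetical, and it assumes the module-level logger and gensim's utils.qsize helper are available, as in base_any2vec.py):

def _log_epoch_progress_line(self, cur_epoch, example_count, total_examples,
                             trained_word_count, elapsed, job_queue, progress_queue, loss=None):
    # build the template and its parameters once, instead of nested if/else branches
    msg = "EPOCH %i - PROGRESS: at %.2f%% examples, %.0f words/s, in_qsize %i, out_qsize %i"
    params = [
        cur_epoch + 1,
        100.0 * example_count / total_examples,
        trained_word_count / elapsed,
        utils.qsize(job_queue),
        utils.qsize(progress_queue),
    ]
    if loss is not None:  # only when compute_loss=True
        msg += ", loss %.3f"
        params.append(loss)
    logger.info(msg, *params)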

@alreadytaikeune
Author

I have made the changes you requested; however, I am a bit surprised by the outcomes of the tests. They all run smoothly on my machine and most of them pass in Travis, except the py35-win one, for a reason that doesn't seem related at all to my changes.

@menshikh-iv menshikh-iv changed the title Fixing the computation of Word2Vec losses. Fix computation of Word2Vec loss & add loss value to logging string Aug 14, 2018
Contributor

@menshikh-iv menshikh-iv left a comment

Good work @alreadytaikeune 👍
I think we need to merge #2127 first (and afterwards the current PR, with an additional fix for the new "mode" of training).

BTW, I checked py35-win - it's not your fault, it's a failure of a non-deterministic test; I re-ran the build.

@@ -260,6 +281,7 @@ def _log_train_end(self, raw_word_count, trained_word_count, total_elapsed, job_

def _log_epoch_progress(self, progress_queue, job_queue, cur_epoch=0, total_examples=None, total_words=None,
report_delay=1.0):

Contributor

still here

)
div = total_words

msg = "EPOCH %i - PROGRESS: at %.2f%% examples, %.0f words/s, in_qsize %i, out_qsize %i"
Contributor

This can be PROGRESS: at %.2f%% words (not only examples)

Author

You are right. Good catch. I'll fix it.

if self.sg:
tally += train_batch_sg(self, sentences, alpha, work, self.compute_loss)
(tally, effective_samples) = train_batch_sg(self, sentences, alpha, work, self.compute_loss)
Contributor

() not needed here (and same below)

@menshikh-iv
Contributor

@alreadytaikeune thanks, looks good!

Next steps:

@alreadytaikeune I'll ping you when we're done with #2127.

@menshikh-iv
Contributor

menshikh-iv commented Sep 24, 2018

Hi @alreadytaikeune, we finally merged & released #2127, please do the steps mentioned in #2135 (comment) and I'll merge your PR too 🌟

In word2vec_inner.pyx, functions now use the new config object while still returning the number of samples.

In base_any2vec, logging includes the new loss values (the addition of this branch).
@alreadytaikeune
Author

Hi, I've completed the steps you mentioned, but when running the tests it seems to me that one is hanging, in the doc2vec test set. I don't really have a clue why; it wasn't the case before the merge.

@menshikh-iv
Contributor

menshikh-iv commented Oct 2, 2018

@alreadytaikeune I see no commits after my comment & a merge conflict in the PR; please make the needed changes (or maybe you forgot to push your changes?).
About any test hangs - please reproduce them here first.

@alreadytaikeune
Author

alreadytaikeune commented Oct 18, 2018

Hey, sorry for the delay, I was on holiday. Yes, I hadn't pushed my changes; I wanted to test them locally first. But here they are.

@alreadytaikeune
Author

Ah, it seems the CI server experiences the same stalling issues as I do. I am not sure when I will have time to investigate that, though...

@menshikh-iv
Contributor

menshikh-iv commented Jan 9, 2019

@alreadytaikeune CI issues fixed, hang reproduced in Travis, do you have time to fix & finalize?

@mpenkov
Collaborator

mpenkov commented Feb 21, 2019

Don't worry about the conflicts. I'll take care of them during the merge: they are in autogenerated code, so it's easy to resolve.

Regarding tests: I think the actual loss output may be difficult to test (and not worth it) because it is emitted via logs. What is worth unit testing is the new behavior you added to the training functions (counting and returning the effective number of samples). Can you add tests to cover that new behavior?

For example:

  • word2vec.py:train_batch_sg
  • word2vec.py:train_batch_cbow
  • word2vec.py:score_sentence_sg
  • word2vec.py:_do_train_job
  • ... and their counterparts in the Cython code.

You can use the same testing logic and data for Python/Cython (they should be 100% compatible).
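
For instance, a rough sketch of such a test might look like the following (hypothetical test name; it assumes this PR's _do_train_job returns a 3-tuple (tally, raw_word_count, effective_samples) and uses the internal _get_thread_working_mem helper, which may change between gensim versions):

import unittest
from gensim.models.word2vec import Word2Vec

SENTENCES = [
    ["human", "interface", "computer"],
    ["survey", "user", "computer", "system", "response", "time"],
] * 10

class TestEffectiveSamples(unittest.TestCase):
    def test_sg_samples_at_least_words(self):
        model = Word2Vec(SENTENCES, min_count=1, sg=1, negative=5, compute_loss=True)
        work = model._get_thread_working_mem()  # (work, neu1) thread-private buffers
        tally, raw_words, samples = model._do_train_job(SENTENCES, model.alpha, work)
        # in SG, each effective word yields one or more positive (centre, context) samples
        self.assertGreaterEqual(samples, tally)
        self.assertGreater(model.get_latest_training_loss(), 0.0)

if __name__ == "__main__":
    unittest.main()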

@mpenkov
Collaborator

mpenkov commented Apr 28, 2019

@alreadytaikeune What is the status of this PR? Are you able to finish it?

@mpenkov mpenkov added the stale Waiting for author to complete contribution, no recent effort label Apr 28, 2019
@tridelt

tridelt commented Aug 19, 2019

@alreadytaikeune What do you think of the current status of this PR?
Best regards from Austria

@alreadytaikeune
Author

alreadytaikeune commented Aug 19, 2019

Hello @mpenkov and @tridelt. Sorry for not being more involved in this PR. I've had plenty on my plate in my job, and in my personal life as well, and didn't feel like spending free/family time working on tests. Although I'm not comfortable with leaving this as is either. I'll do my best in the coming days to address your comments. Sorry for not handling this earlier. Best.

@alreadytaikeune
Author

Hello, @mpenkov. I found a bit of time to throw myself back into the PR. I thought I would rebase my develop branch on the current one and work on the tests from there, but it was not a good idea... anyway, I've restored my branch as it was before.

Just one thing: when rebasing, it seemed to me that all the non-cython implementations of word2vec were removed. Is that correct? Regarding your last comment about checks, I imagine I should now only care about writing tests for the cython functions.

Sorry about the mess...

@gojomo
Collaborator

gojomo commented Nov 12, 2019

@alreadytaikeune Yes, the non-cython variants of many of the algorithms have been eliminated to reduce duplication/maintenance effort – so work going forward need only concern itself with the cython versions!

@alreadytaikeune
Author

Alright, thank you @gojomo. I also see that word2vec.py became a script as well as a module, and it does the same thing as scripts/word2vec_standalone.py, the file I had originally modified. Which one should I focus on?

@gojomo
Collaborator

gojomo commented Nov 13, 2019

AFAIK, word2vec.py has always implemented a main & been runnable as a script. Unsure of the reason for the (partial) overlap with scripts/word2vec_standalone.py – & neither version has been updated recently to keep up with newer Word2Vec initialization options (like ns_exponent or corpus_file). I suspect gensim usage is far more via imported code (vs command-line invocation), and new functions only get added to the main paths when someone specifically needs them. @piskvorky or @mpenkov would have to comment on whether both, or just one or the other, should be updated going forward.

@piskvorky
Owner

piskvorky commented Nov 13, 2019

@gojomo is right. I'm not aware of any changes to the CLI version of word2vec in Gensim. IIRC, the command-line version was created to match the CLI of the original C tool by Google, including the CLI parameter names and defaults. I don't think it's been updated since.

I didn't remember what word2vec_standalone.py was. I tracked that script down to PR #593, which doesn't talk about its motivation. But a standalone script makes more sense to me, and is also cleaner (for example avoids pickling issues when run as __main__).

So my preference would be to get rid of __main__ in word2vec.py, and keep word2vec_standalone.py. We definitely don't need both.

@alreadytaikeune
Author

Thank you for the clarification. So, going forward, I'll just write tests for the cython-only implementations (which is a bit confusing, since I'm still working on a version with both implementations, which is why I wanted to rebase in the first place), and leave it to you to clarify what should be kept and what should be dropped between word2vec.py and word2vec_standalone.py.

@gojomo gojomo mentioned this pull request Dec 27, 2019
@mpenkov mpenkov added this to In progress in gensim-4.0 via automation Jun 10, 2020
@geopapa11

I was wondering about the status of this PR. Will it be fixed in gensim 4.0.0? Does the erroneous loss computation happen only in Word2Vec, or in other classes too (e.g., FastText)? Thanks in advance

@piskvorky
Owner

Yes, ideally we want to clean this up for 4.0 too. @alreadytaikeune will you be able to finish this PR?

Anyone else helping is welcome.

@geopapa11

Thank you very much @piskvorky! Does the wrong loss computation affect other classes too (e.g., FastText)?

@piskvorky
Owner

I don't remember, but I think the loss tallying was idiosyncratic – different for each class, missing from some models.

@geopapa11

Thanks @piskvorky, that's very helpful. On an unrelated topic, do you plan on (or see value in) supporting different initialization schemes for the embeddings? Right now you only support uniform initialization, I think within [-0.5*embedding_size, 0.5*embedding_size]. I have created other types of initializations as well ('glorot_uniform', 'lecun_uniform', 'he_uniform', 'glorot_normal', 'lecun_normal', 'he_normal'). If you find it might add something to the tooling, I can give some ideas in a different PR.

@piskvorky
Owner

Possibly; @Witiko did some work on weight initialization. That's off-topic for this ticket though.

@gojomo
Collaborator

gojomo commented Dec 14, 2020

FYI #2617 is a sort of 'umbrella issue' mentioning/referencing all the tangled loss issues. Ideally we'd want our measure/reporting to be similar to the output in Facebook's FastText logging. (Right now Gensim's FastText hasn't ever tracked loss, and whatever Gensim's Word2Vec is doing doesn't report similar tallies to FB FastText-in-plain-Word2Vec mode - so there's some discrepancy to be tracked down & rationalized.)

Regarding alternate weight initializations, it shouldn't be hard to externally re-initialize (or include an option to not initialize, for when a user is planning to do this themselves) between vocabulary discovery and training. We could add some standard options if there's a strong case for how they improve things. (Do they noticeably hasten convergence?)
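
As an illustration of that external re-initialization (a rough sketch using the gensim 3.x API; the Glorot-style limit here is just an example, not something gensim provides):

import numpy as np
from gensim.models import Word2Vec

sentences = [["human", "interface", "computer"], ["survey", "user", "computer", "system"]]

model = Word2Vec(size=100, min_count=1)   # no corpus passed yet, so no training happens
model.build_vocab(sentences)              # vocabulary discovery only

# overwrite the default uniform init with a custom scheme before training
fan_in, fan_out = len(model.wv.vocab), model.wv.vector_size
limit = np.sqrt(6.0 / (fan_in + fan_out))
model.wv.vectors = np.random.uniform(
    -limit, limit, model.wv.vectors.shape).astype(np.float32)

model.train(sentences, total_examples=model.corpus_count, epochs=model.epochs)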

@geopapa11

FYI #2617 is a sort of 'umbrella issue' mentioned/referencing all the tangled loss-issues. Ideally we'd want our measure/reporting to be similar to the output in Facebook's FastText logging. (Right now Gensim's FastText hasn't ever tracked loss, and whatever Gensim's Word2Vec is doing doesn't report similar tallies to FB FastText-in-plain-Word2Vec mode - so there's some discrepancy to be tracked-down & rationalized.)

Regarding this, how are you updating the embedding weights on every iteration if you don't compute the loss? Do you use some kind of ready-made formula for the gradient that's applied directly to update the embedding weights?

Regarding alternate weight initializations, it shouldn't be hard to externally re-initialize (or include an option to not initialize for when a user is planning to do this themselves) between vocabulary-discovery and training. We could add some standard options if there's a strong case for how they improve things. (Do they noticeably hasten convergence?)

Not sure if they improve or change convergence in a meaningful way. I just mentioned it because embeddings as defined in other libraries (e.g., TensorFlow) offer different initialization schemes:
https://www.tensorflow.org/api_docs/python/tf/keras/layers/Embedding (see embeddings_initializer argument)
https://www.tensorflow.org/api_docs/python/tf/keras/initializers

Also, your initialization, which is more like [-0.5*fan_out, 0.5*fan_out], does not really correspond to one of the better-known ones (He, Glorot, LeCun), but maybe that's ok! :-)

@Witiko
Contributor

Witiko commented Dec 17, 2020

Possibly; @Witiko did some work on weight initialization. That's out of topic for this ticket though.

@piskvorky @geopapa11 I evaluated different initializations of the positional model of Mikolov et al. (2018) and Grave et al. (2018), since no reference implementation existed and it's unclear how the weights should be initialized; see pages 59 through 62 in the RASLAN 2020 proceedings.

@mpenkov mpenkov removed this from In progress in gensim-4.0 Feb 25, 2021
@nicocheh

I was wondering about the status of this PR. Is there any news? @mpenkov

@mpenkov
Collaborator

mpenkov commented Aug 25, 2022

Not that I'm aware of. I think we're waiting for the original contributor to finish the PR: #2135 (comment)
