
Is there any pre-trained model? #59

Open
GZJackCai opened this issue Oct 5, 2016 · 32 comments

Comments

@GZJackCai

My computer is not very powerful. Could someone give a link where we can download a pre-trained model?

@maeda

maeda commented Oct 30, 2016

Hi, you can use Amazon EC2 GPU machines to train your models.
The model file is usually large, 200MB or more. For example, I trained a model using params similar to those @macournoyer showed, and my model file came to nearly 1GB.

Anyway, you can start playing on Amazon EC2 instances for now, since you reckon your machine isn't sufficient.

@TTN-

TTN- commented Mar 14, 2017

Hi friends,
I'm a bit in the same boat here. I have a decent laptop with a high-end CPU, but Intel integrated graphics.
Using the parameters th train.lua --dataset 50000 --hiddenSize 1000, it will take 31 days to process, which is far too long. I could train on the Amazon service, which starts at 29 USD/month, or build a computer (hundreds of $$).

Any uploads of trained data would be wonderful! Be it torrent, Dropbox, or Google Drive, anything works for me.

@kenkit

kenkit commented Mar 22, 2017

Well, I have just opened a similar issue.
EDIT: And if anyone uploads one, please share it as CPU-loadable too.
You can convert it with model = model:float()
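In full, something like this should do it (an untested sketch; the data/model.t7 path and the _cpu suffix are just my assumptions based on the default layout):

  -- deserializing a CUDA-trained model needs these loaded first
  -- (you may also need require 'neuralconvo' if the model references its classes)
  require 'cutorch'
  require 'cunn'

  model = torch.load("data/model.t7")    -- load the GPU-trained model
  model = model:float()                  -- convert its tensors to CPU floats
  torch.save("data/model_cpu.t7", model) -- save a CPU-loadable copy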

@TTN-

TTN- commented Mar 25, 2017

Getting OpenCL working is an absolute pain on a mobile hybrid-graphics system. I've given up on it and will later build a desktop with an Nvidia card that has proper support, unlike my current hardware.

At the moment I'm just training on my CPU with the parameters: th train.lua --dataset 30000 --hiddenSize 1000 --maxEpoch 10
It's going to take close to a week to complete; it's almost halfway. I'll share the data to Dropbox or Google Drive when it's done. No idea if it's going to be any good, but I'm hopeful.

Current terminal output: http://pastebin.com/RfhqKHNd
Looks promising!

@kenkit

kenkit commented Mar 26, 2017

Nice, but won't that consume a lot of power? I think you should've tried some of the AWS instances; they would have taken much less time, I think.
EDIT: On the full training set.

@TTN-

TTN- commented Mar 26, 2017

That's what I thought too, until I looked at the costs. I signed up with AWS, but it's going to cost me 30 USD for a month's dev subscription, and then $0.65/hr for just one GPU's worth of power. Running that for 3 days is $76.80 all up ($30 + 72 h × $0.65), and a week would cost about $139.
Figured it'd be better to drop 250 USD or so on a card instead, especially if I plan to do this more often. Power-wise, my laptop's CPU is 45 W max TDP; one week of running it under full load would only cost $1.23. :-)

@kenkit

kenkit commented Mar 27, 2017

Hehe, I never looked at it from that perspective. I thought your PC would consume more power than it would cost on AWS.

@TTN-

TTN- commented Mar 27, 2017

It's processing epoch 9 at the moment. I'm playing around chatting with epoch 8 and it's looking rather promising. That thing has absolutely watched too many cop movies, that much is clear, haha.
examples.t7 https://www.dropbox.com/s/zhlha2sh335zypy/examples.t7?dl=0
model.t7 https://www.dropbox.com/s/7nqpt7ogq8cigwm/model_epoch_8.t7.zip?dl=0
vocab.t7 https://www.dropbox.com/s/hzrsrxlbu0w5qz6/vocab.t7?dl=0
I'll be back in about 8-10 hours and I'll paste the other links when everything has finished uploading.
epoch 8 stats:

  Errors: min= 1.2443210083606	
          max= 5.8878899451946	
       median= 2.1652770519947	
         mean= 2.1942398971716	
          std= 0.33685866438056	
          ppl= 8.9731779260862	

EDIT: all uploaded. Links as above. Remember to rename model_epoch_8.t7 to model.t7 (I could rename it, but the connection is crap and I don't really want to reupload).
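Side note on the stats, in case anyone wonders how ppl relates to the errors: it looks like it's just the exponential of the mean per-token error, which you can check in Lua:

  print(math.exp(2.1942398971716))  -- prints ~8.9732, matching the ppl line above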

@kenkit

kenkit commented Mar 28, 2017

Nice, keep up the good work.
But let's still try to get the full training set done.

@kenkit

kenkit commented Mar 28, 2017

What specs does your PC have, btw? It looks like a nice machine.

@TTN-

TTN- commented Mar 28, 2017

Thanks :-)
It's a dv6 from late 2011. A couple of years ago I got hold of the highest-performing CPU of that generation made for mobile systems, an i7-2860QM. I got it second-hand for like $50 from a buddy who stopped using it in his gaming laptop (lucky me!). It still holds its own even against some of the newer processors, which goes to show how much CPU speed development has slowed in recent years.

I'm planning on building a desktop PC with a GTX 1060 6GB (the 1070 looks tempting, but it's a bit pricey). Once I've got that, I'll see if I can crunch a full set of data and upload it. This stuff is fascinating.

@kenkit

kenkit commented Mar 28, 2017

I'm also eyeing a better desktop. I'm currently on an AMD x4 b93 at 2.8GHz, which is the best machine I've used so far for my projects.
I hope I can use it to fund a better machine.

@TTN-

TTN- commented Mar 28, 2017

You should drop a graphics card into what you have and it should work great. Training really should be done on GPUs rather than CPUs. Good reading: http://timdettmers.com/2017/03/19/which-gpu-for-deep-learning/

@kenkit

kenkit commented Mar 28, 2017

I was disappointed to discover that I'm on a proprietary power supply, which is limited to 300 watts. I saw a GTX 970 (not really sure), but this is the only one that might work. I couldn't find a store with this GPU, so I'll have to buy one online, but I still think it's better to build a new machine instead.
I have an HP Compaq 6005 Pro SFF PC, btw.
EDIT: I meant a GTX 730 GDDR5 with a minimum power requirement of 300 W.

@TTN-

TTN- commented Mar 31, 2017

https://www.dropbox.com/s/9xc9fisil1xee69/model.t7.zip?dl=0
That's epoch 10 completed there. ^

I can continue to post good model results if people are interested. I bought a GTX 1060 6GB a few days ago, and more parts are arriving in the mail. I'll be able to grind more data much, much quicker :)

@bienbinod

@TTN- Thank you for sharing the model. I downloaded it and started making conversation, but it really does not make for good conversation. Following is a snippet. Could you please share a snippet of a good conversation that you had?

you> hi
neuralconvo> It's not the cut that's bothering me.

you> how are you?
neuralconvo> I'm not sure, but you don't want to see the dress.

you> Which dress are you talking about?
neuralconvo> I'm not.

you> Are you robot?
neuralconvo> I'm not.

you> Are you human?
neuralconvo> I'm not.

you> Who are you?
neuralconvo> I'm mimi i'm dying.

you> Don't die
neuralconvo> I'm sorry about your plane.

you> What are you talking about dude?
neuralconvo> I'm not.
you>

@TTN-

TTN- commented Mar 31, 2017

@bienbinod I didn't have very good conversations with it, sorry. It's mostly been the same as what you show there. It's been trained on too small a dataset, I think. Once I get that new gaming PC running (at the moment I'm limited to CPU training on my laptop, which takes forever), I'll train it on the full dataset and see how that goes. I'll share that too.

@bienbinod

@TTN- Thank you for your reply. Same here, I am also relying on a laptop, and it is going to take ages to train the model on the full dataset. Let's keep posting the models, whoever builds first.

@TTN-

TTN- commented Apr 1, 2017

I'll also have a play around with this: https://github.com/mtanana/torchneuralconvo
It has some additional features.

@kenkit

kenkit commented Apr 1, 2017

Let me download this. Thanks, guys, for sharing and making it available. Maybe we should make a repo with trained models (.t7), detailing the number of epochs trained, CPU info, and other details.

@kenkit

kenkit commented Apr 1, 2017

By the way @TTN-, share CPU-loadable versions when you can.

@TTN-

TTN- commented Apr 1, 2017

@kenkit Are GPU-trained models CPU-loadable?

I'll continue to share whatever I make progress on.

This PC build is going to be at least a week, maybe two, away. I'm waiting for the Ryzen 5 CPU launch on April 11 to finish the build.

@TTN-

TTN- commented Apr 1, 2017

Not a bad idea to have a repo. Probably best to share .torrents or even magnet links, or put it on The Pirate Bay. The GitHub limit is 1GB and my Dropbox only has so much space. Google Drive will hold 15GB.

@kenkit

kenkit commented Apr 2, 2017

@TTN- they are loadable; you just need to load the model, then convert it to CPU-loadable with
model = model:float()
then save, and you are done.

@TTN-

TTN- commented Apr 18, 2017

Cheers, thanks @kenkit for the tip. I only program in C and Python; th is a bit different, lots to learn.

I'm still building that PC. I got a new Ryzen 5 CPU on the 11th of April at launch; I'm just waiting for it to arrive in the snail mail (it got delayed for some reason). It should be here tomorrow. More trained datasets will be posted once I get things up and running. I haven't forgotten about this :-)

@TTN-

TTN- commented May 9, 2017

@kenkit could you post a file for me to run to convert to float? I'm no good with Lua, sorry. I've tried a couple of things, but the resulting file is 4 bytes in size. I'm pretty sure the model is float as-is.

Uploading newly trained data now. Stats:

Epoch stats:	
  Errors: min= 1.8776668432881	
          max= 8.6376652998083	
       median= 4.2078251736138	
         mean= 4.2796026050325	
          std= 0.80612978381653	
          ppl= 72.211737723467	

Full terminal output paste (for stats n stuff): https://pastebin.com/c6pJcCCP
File upload: https://www.dropbox.com/sh/v3smqi6ee8iycjt/AAD-Hx4fqHJK6qumXIvgElLga?dl=0

hardware:

  • AMD Ryzen 5 1600 CPU @ 3.2GHz
  • RAM running at 2400MHz, though it is rated for 3200; that would mean OC'ing the mobo, which isn't quite stable.
  • GPU is a GTX 1060 6GB

Took me a while to get it trained up to this point; my Ryzen system was unstable for a while and crashed a bunch of times, but that's fixed now. Interestingly, the program is heavily CPU-constrained: it maxes out a single core (of the 12) and is limited by that while the GPU sits mostly idle, even though I was training with --cuda. Video memory usage sat at around 2.5GB most of the time (I have the full movie dataset loaded with no limits on vocabulary).

The perplexity (ppl) was decreasing fast up to this point; from here on, I think more training will just result in overfitting.

@kenkit

kenkit commented May 12, 2017

Just load the model normally, then convert the loaded model to float and save it as you would any other model.
I found this here:
https://groups.google.com/forum/#!topic/torch7/ugBCwaoXw_s
and
https://groups.google.com/forum/#!msg/torch7/i8sJYlgQPeA/au-WVMSmbvkJ

If you don't manage to convert it, ping me and I'll build complete working code which you can use.
EDIT: My PC's power supply fried; I'm currently on a laptop which is too slow. Anyway, let me come up with something right away.

@kenkit

kenkit commented May 12, 2017

Try this; just put it where we have train.lua.

filename: gpu_to_cpu.lua

require 'neuralconvo'
require 'xlua'
require 'optim'
require 'cutorch'  -- needed to deserialize CUDA tensors
require 'cunn'

-- load the GPU-trained model and convert its tensors to CPU floats
model = torch.load("data/model.t7")
model = model:float()

-- save a CPU-loadable copy
torch.save("data/cpu_model.t7", model)
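Run it from the repo root the same way as the other scripts:

  th gpu_to_cpu.lua

Note it writes the converted copy to data/cpu_model.t7 rather than overwriting data/model.t7.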

@TTN-

TTN- commented May 14, 2017

Sweet, thanks @kenkit.

I did that and tested the result, but it throws an error when testing with th eval.lua:

user@machine:~/Projects/macournoyer-neuralconvo$ th eval.lua 
Loading vocabulary from data/vocab.t7 ...	
-- Loading model	

Type a sentence and hit enter to submit.	
CTRL+C then enter to quit.
	
you> hi
/home/user/Scripts-libs/torch/install/bin/luajit: eval.lua:57: attempt to index global 'model' (a nil value)
stack traceback:
	eval.lua:57: in function 'say'
	eval.lua:71: in main chunk
	[C]: in function 'dofile'
	...libs/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
	[C]: at 0x00405d50

@kenkit

kenkit commented May 19, 2017

You might want to check whether the files actually exist.
After training completes, the generated files usually have this structure:

ls
neuralconvo/data (master)
cornell_movie_dialogs/
examples.t7
model.t7
vocab.t7
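Also note that the script above saves to data/cpu_model.t7, while eval.lua presumably loads data/model.t7, so the converted file may need renaming. A quick existence check from Lua (a minimal sketch; the paths assume the default data/ layout):

  -- report which of the expected data files are present
  for _, f in ipairs({"data/examples.t7", "data/model.t7", "data/vocab.t7"}) do
    local fh = io.open(f, "r")
    if fh then fh:close(); print(f .. " exists") else print(f .. " MISSING") end
  end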

You are currently the only hope we have of getting some working files. I've managed to get a laptop that should put me back to programming, though it's not fast enough.

@kenkit

kenkit commented May 19, 2017

I trained ages ago and ended up with a 35MB model file:

drwxr-xr-x 1 Cosmo 197609    0 Jun  6  2016 cornell_movie_dialogs/
-rw-r--r-- 1 Cosmo 197609  19M Jun  4  2016 examples.t7
-rw-r--r-- 1 Cosmo 197609  35M Jun  6  2016 model.t7
-rw-r--r-- 1 Cosmo 197609 1.4M Jun  4  2016 vocab.t7

Also, you should know that after changing from CPU to GPU or vice versa, you must first delete the generated files, as they will not be usable.

@kenkit

kenkit commented May 19, 2017

Did you try my code?
If not, you should paste it into a new Lua file and run it from the shell.
