python - Convert word2vec bin file to text
From the word2vec site I can download GoogleNews-vectors-negative300.bin.gz. The .bin file (about 3.4 GB) is in a binary format that is not useful to me. Tomas Mikolov assures us that "it should be straightforward to convert the binary format to text format (though it will take more disk space). Check the code in the distance tool, it's rather trivial to read the binary file." Unfortunately, I don't know enough C to understand http://word2vec.googlecode.com/svn/trunk/distance.c.
Supposedly gensim can do this also, but the tutorials I've found all seem to be about converting from text, not the other way.
Can someone suggest modifications to the C code, or instructions for getting gensim to emit text?
I use this code to load the binary model and then save the model to a text file:
    from gensim.models.keyedvectors import KeyedVectors

    # Load the binary vectors, then write them back out in text format
    model = KeyedVectors.load_word2vec_format('path/to/GoogleNews-vectors-negative300.bin', binary=True)
    model.save_word2vec_format('path/to/GoogleNews-vectors-negative300.txt', binary=False)
Note:
The above code is for the new version of gensim. For the previous version, I used this code:
    from gensim.models import word2vec

    # Older gensim: the loader lived on the Word2Vec class
    model = word2vec.Word2Vec.load_word2vec_format('path/to/GoogleNews-vectors-negative300.bin', binary=True)
    model.save_word2vec_format('path/to/GoogleNews-vectors-negative300.txt', binary=False)
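If you would rather not depend on gensim, here is a rough sketch of what the read loop in the distance tool amounts to, written in plain Python. It assumes the usual word2vec binary layout: an ASCII header line with the vocabulary size and vector size, then for each word the word's bytes terminated by a space, followed by that many 32-bit floats. It also assumes the floats are little-endian (the C tool writes them in the byte order of the machine that produced the file). The function name `convert_word2vec_bin_to_text` is my own, not part of any library.

```python
import struct


def convert_word2vec_bin_to_text(bin_path, txt_path):
    """Stream a word2vec binary file into the word2vec text format.

    Assumes little-endian float32 values and UTF-8 encoded words.
    Returns (vocab_size, vector_size) as read from the header.
    """
    with open(bin_path, "rb") as src, open(txt_path, "w", encoding="utf-8") as dst:
        # Header line: "<vocab_size> <vector_size>\n"
        vocab_size, dim = map(int, src.readline().split())
        dst.write(f"{vocab_size} {dim}\n")

        record = struct.Struct(f"<{dim}f")  # one vector of `dim` float32 values
        for _ in range(vocab_size):
            # Read the word one byte at a time up to the separating space.
            # A newline left over from the previous record is skipped.
            chars = bytearray()
            while True:
                ch = src.read(1)
                if not ch:
                    raise EOFError("file ended while reading a word")
                if ch == b" ":
                    break
                if ch != b"\n":
                    chars.extend(ch)
            word = chars.decode("utf-8", errors="replace")

            # Read exactly one vector's worth of bytes and unpack it.
            data = src.read(record.size)
            if len(data) != record.size:
                raise EOFError("file ended while reading a vector")
            values = record.unpack(data)

            dst.write(word + " " + " ".join(f"{v:.6f}" for v in values) + "\n")
    return vocab_size, dim
```

Calling `convert_word2vec_bin_to_text('path/to/GoogleNews-vectors-negative300.bin', 'path/to/GoogleNews-vectors-negative300.txt')` should write one word per line without holding the whole model in memory. Reading a byte at a time is slow for a file this size, so treat it as an illustration of the format rather than a replacement for the gensim calls above.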