fendouai_com

Natural language processing tools: Chinese word2vec open-source projects, tutorials, and datasets

fendouai_com · Oct 1, 2017 · 3091 views

    Chinese word2vec

    Open-source projects

    Chinese word vectors

    This project uses the word2vec and GloVe tools to train Chinese word vectors on data from a Wikipedia dump.

    https://github.com/candlewill/Chinsese_word_vectors

    wordvectors

    Pre-trained word vectors of 30+ languages

    https://github.com/Kyubyong/wordvectors

    chinese-word2vec

    word2vec, GloVe, and Swivel binary files trained on a Chinese corpus

    https://github.com/to-shimo/chinese-word2vec
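
If you want to peek inside a word2vec binary file without installing anything, here is a minimal sketch of a reader. It assumes the layout written by the original word2vec C tool: an ASCII header line `vocab_size dim`, then for each word the UTF-8 bytes, a space, and `dim` little-endian float32 values. I have not checked that the files in the repository above follow this layout, and GloVe or Swivel outputs may differ. The function name `read_word2vec_binary` and the demo data are made up for illustration.

```python
import io
import struct


def read_word2vec_binary(stream):
    """Read vectors from a binary stream in the word2vec C tool layout (assumed)."""
    header = stream.readline().decode("utf-8")
    vocab_size, dim = (int(field) for field in header.split())
    vectors = {}
    for _ in range(vocab_size):
        word_bytes = bytearray()
        while True:
            ch = stream.read(1)
            if not ch:
                raise EOFError("file ended in the middle of a word")
            if ch == b" ":
                break  # a space terminates the word
            if ch != b"\n":  # skip the newline some writers emit after each vector
                word_bytes.extend(ch)
        raw = stream.read(4 * dim)
        if len(raw) != 4 * dim:
            raise EOFError("file ended in the middle of a vector")
        vectors[word_bytes.decode("utf-8")] = struct.unpack("<%df" % dim, raw)
    return vectors


# Tiny in-memory file with invented values, just to exercise the reader.
demo = io.BytesIO(
    b"2 3\n"
    + "北京".encode("utf-8") + b" " + struct.pack("<3f", 1.0, 2.0, 0.5) + b"\n"
    + "上海".encode("utf-8") + b" " + struct.pack("<3f", -1.0, 0.25, 4.0) + b"\n"
)
vectors = read_word2vec_binary(demo)
print(vectors)
```

For a real file you would pass `open(path, "rb")` instead of the `BytesIO` object. For anything beyond inspection, a library loader such as gensim's is probably the better choice.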

    Tutorials

    Exploring word similarity in a Wikipedia corpus

    http://www.52nlp.cn/tag/gensim
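
Word similarity between embeddings is usually measured with cosine similarity. The sketch below shows the idea in plain Python. The 3-dimensional vectors are invented for illustration (real word2vec vectors typically have a hundred or more dimensions), and the helper names `cosine_similarity` and `most_similar` are my own, not taken from the linked tutorial.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_u * norm_v)


def most_similar(word, vectors, topn=3):
    """Rank every other word in `vectors` by cosine similarity to `word`."""
    query = vectors[word]
    scored = [
        (other, cosine_similarity(query, vec))
        for other, vec in vectors.items()
        if other != word
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:topn]


# Toy vectors, invented for illustration only.
toy_vectors = {
    "北京": [0.9, 0.1, 0.0],
    "上海": [0.8, 0.2, 0.1],
    "苹果": [0.0, 0.9, 0.4],
    "香蕉": [0.1, 0.8, 0.5],
}

print(most_similar("北京", toy_vectors, topn=2))
```

With these toy values the two city words rank closest to each other, which is the kind of result you would hope to see from vectors trained on a real corpus.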

    Clustering keywords with word2vec

    http://blog.csdn.net/zhaoxinfan/article/details/11069485
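
One common way to cluster keywords once you have their vectors is k-means. Below is a minimal plain-Python sketch, not necessarily the method used in the linked post. The vectors are invented, and the initial centroids are picked by hand so the result is deterministic. In practice you would likely use a library implementation with random or k-means++ initialization.

```python
def squared_distance(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def mean_vector(rows):
    """Component-wise mean of a non-empty list of vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def kmeans(vectors, initial_centroids, iterations=10):
    """Plain k-means over a {word: vector} dict; returns {word: cluster_index}."""
    centroids = [list(c) for c in initial_centroids]
    assignment = {}
    for _ in range(iterations):
        # Assignment step: each word goes to its nearest centroid.
        assignment = {
            word: min(
                range(len(centroids)),
                key=lambda i: squared_distance(vec, centroids[i]),
            )
            for word, vec in vectors.items()
        }
        # Update step: move each centroid to the mean of its members.
        for i in range(len(centroids)):
            members = [vectors[w] for w, c in assignment.items() if c == i]
            if members:  # leave an empty cluster's centroid where it is
                centroids[i] = mean_vector(members)
    return assignment


# Toy keyword vectors, invented for illustration only.
keyword_vectors = {
    "北京": [0.9, 0.1, 0.0],
    "上海": [0.8, 0.2, 0.1],
    "苹果": [0.0, 0.9, 0.4],
    "香蕉": [0.1, 0.8, 0.5],
}

clusters = kmeans(
    keyword_vectors,
    initial_centroids=[keyword_vectors["北京"], keyword_vectors["苹果"]],
)
print(clusters)
```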

    Training Word2Vec Model on English Wikipedia by Gensim

    http://textminingonline.com/training-word2vec-model-on-english-wikipedia-by-gensim

    Datasets

    Wikipedia (zhwiki dump)

    https://dumps.wikimedia.org/zhwiki/latest/zhwiki-latest-pages-articles.xml.bz2
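
The dump is a bz2-compressed MediaWiki XML export, so it can be streamed page by page without unpacking it to disk. Here is a rough sketch using only the standard library. It assumes the usual `<page>` / `<title>` / `<text>` element layout, and the helper names and the tiny sample document are made up for illustration. The text you get back is raw wikitext, so markup stripping and Chinese word segmentation are still needed before training.

```python
import bz2
import io
import xml.etree.ElementTree as ET


def local_name(tag):
    """Drop the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def iter_pages(stream):
    """Yield (title, wikitext) pairs from a binary stream of MediaWiki export XML."""
    title, text = None, None
    for _event, elem in ET.iterparse(stream, events=("end",)):
        name = local_name(elem.tag)
        if name == "title":
            title = elem.text
        elif name == "text":
            text = elem.text or ""
        elif name == "page":
            yield title, text
            title, text = None, None
            elem.clear()  # release the page's children to limit memory use


# Tiny stand-in for the real dump, compressed in memory for the demo.
sample_xml = (
    '<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.10/">'
    "<page><title>数学</title><revision><text>数学是研究数量的学科。</text></revision></page>"
    "<page><title>物理学</title><revision><text>物理学研究物质。</text></revision></page>"
    "</mediawiki>"
).encode("utf-8")

with bz2.open(io.BytesIO(bz2.compress(sample_xml)), "rb") as stream:
    pages = list(iter_pages(stream))

print(pages)
```

For the real file, replace the `BytesIO` object with the path to the downloaded `.xml.bz2` and loop over `iter_pages(stream)` lazily instead of building a list, since the full dump is large.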

    Sogou Labs

    http://www.sogou.com/labs/resource/list_news.php

    More machine learning resources and tutorials: http://www.tensorflownews.com/
