Spark Big Data Chinese Word Segmentation and Counting: Java Project Source Code

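Uploader: yangdanbo1975 | Uploaded: 2019-12-21 22:08:28 | File size: 379KB | File type: 7z
Drawing on resources found online, this project uses the IKAnalyzer word-segmentation component to segment, count, and rank Chinese words in classical texts such as Tang poems and Song lyrics. It implements the same task three ways: in plain Java, in the MapReduce model, and on the Spark framework. With it you can easily find out which words appear most often in Tang poems and Song lyrics.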
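
The count-and-rank step described above can be sketched in plain Java as follows. This is only an illustration, not the project's actual code: the class name WordCountSketch and its methods are hypothetical, and the tokenize method is a deliberately naive stand-in (it emits every pair of adjacent Han characters) used so the example runs without the IKAnalyzer dependency. In the real project the tokens would come from IKAnalyzer instead.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class WordCountSketch {

    // Hypothetical stand-in for a real segmenter such as IKAnalyzer:
    // emits every pair of adjacent Han characters as one "word".
    public static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (int i = 0; i + 1 < text.length(); i++) {
            char a = text.charAt(i);
            char b = text.charAt(i + 1);
            if (isHan(a) && isHan(b)) {
                tokens.add("" + a + b);
            }
        }
        return tokens;
    }

    // Punctuation and whitespace are not in the HAN script, so they act as separators.
    private static boolean isHan(char c) {
        return Character.UnicodeScript.of(c) == Character.UnicodeScript.HAN;
    }

    // Counts tokens, skipping anything listed as a stop word.
    public static Map<String, Integer> countWords(String text, Set<String> stopWords) {
        Map<String, Integer> counts = new HashMap<>();
        for (String token : tokenize(text)) {
            if (!stopWords.contains(token)) {
                counts.merge(token, 1, Integer::sum);
            }
        }
        return counts;
    }

    // Returns the n most frequent words, highest count first; ties are broken by the word itself.
    public static List<Map.Entry<String, Integer>> topN(Map<String, Integer> counts, int n) {
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        entries.sort((x, y) -> {
            int byCount = Integer.compare(y.getValue(), x.getValue());
            return byCount != 0 ? byCount : x.getKey().compareTo(y.getKey());
        });
        return entries.subList(0, Math.min(n, entries.size()));
    }

    public static void main(String[] args) {
        String text = "床前明月光，疑是地上霜。举头望明月，低头思故乡。";
        Map<String, Integer> counts = countWords(text, new HashSet<>());
        for (Map.Entry<String, Integer> e : topN(counts, 3)) {
            System.out.println(e.getKey() + "\t" + e.getValue());
        }
    }
}
```

The MapReduce and Spark variants follow the same shape: the tokenizing step becomes the map phase, the per-word summation becomes the reduce phase, and the ranking is a final sort by count.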

Resource details

Spark大数据中文分词统计Java工程源码 (48 files, 379KB)
  JavaWordCount/
    src/
      com/
        magicstudio/
          spark/
            SortableMap.java  3.32KB
            text/
              老子道德经.txt  21.59KB
              论语.txt  74.01KB
              宋词三百首.txt  111.45KB
              孟子.txt  130.63KB
              唐诗三百首.txt  84.94KB
              庄子南华经.txt  240.08KB
            WordCounter.java  15.64KB
            stopword.dic  161B
            SparkWordCount.java  9.78KB
            HadoopWordCount.java  9.32KB
            FileUtil.java  3.58KB
            IKAnalyzer.cfg.xml  414B
      screen shoot.png  131.17KB
    bin/
      com/
        magicstudio/
          spark/
            text/
              老子道德经.txt  21.59KB
              论语.txt  74.01KB
              宋词三百首.txt  111.45KB
              孟子.txt  130.63KB
              唐诗三百首.txt  84.94KB
              庄子南华经.txt  240.08KB
            SparkWordCount$1.class  1.33KB
            SparkWordCount.class  5.90KB
            SortableMap.class  3.88KB
            SparkWordCount$2.class  1.39KB
            SortableMap$MapValueComparator.class  1.45KB
            stopword.dic  161B
            HadoopWordCount$TokenizerMapper.class  3.16KB
            WordCounter$1.class  1.30KB
            HadoopWordCount$IntWritableDecreasingComparator.class  1010B
            SparkWordCount$3.class  1.30KB
            SortableMap$MapKeyComparator.class  1.08KB
            WordCounter.class  14.79KB
            SparkWordCount$5.class  1.60KB
            HadoopWordCount$IntSumReducer.class  2.67KB
            WordCounter$3.class  949B
            SparkWordCount$6.class  1.80KB
            IKAnalyzer.cfg.xml  414B
            WordCounter$2.class  1.22KB
            HadoopWordCount.class  5.42KB
            FileUtil.class  2.18KB
            SparkWordCount$4.class  1.60KB
      screen shoot.png  131.17KB
    .classpath  641B
    .settings/
      org.eclipse.core.runtime.prefs  52B
      org.eclipse.core.resources.prefs  174B
      org.eclipse.jdt.ui.prefs  5.50KB
      org.eclipse.jdt.core.prefs  670B
    .project  389B

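Comments

  • sun89 :
    This Spark resource is reasonably useful. Thanks.
    2021-03-08
  • daizhiming :
    Has some value as a reference.
    2020-05-23
  • yalinmmsj :
    Thanks for sharing.
    2019-11-18
  • 玄明Hanko :
    Why are there so few examples online implemented in pure Java? Are Scala examples more common?
    2018-05-08
  • 渣渣打字员 :
    Thanks to the author.
    2018-05-03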
