This project intends to add more incremental algorithm support to Spark MLlib, including Naive Bayes, collaborative filtering, SVM, frequent pattern mining, etc.

milesyao/Incremental-Algorithms-for-Spark-MLlib

Incremental-Algorithms-for-Spark-MLlib

Compatibility

This framework targets Spark 1.3.0 and Scala 2.10.4, and requires JDK 7 or later.
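
For illustration only, these versions could be pinned in an SBT build roughly as follows. The template ships its own SBT settings, which take precedence; the project name, project version, and the ScalaTest version below are assumptions, not values taken from this repository.

```scala
// build.sbt (a minimal sketch, not the template's actual build file)
name := "incremental-mllib-sketch"   // hypothetical project name
version := "0.1.0-SNAPSHOT"          // hypothetical project version
scalaVersion := "2.10.4"

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core"      % "1.3.0",
  "org.apache.spark" %% "spark-streaming" % "1.3.0",
  "org.apache.spark" %% "spark-mllib"     % "1.3.0",
  // ScalaTest version is an assumption; any release built for Scala 2.10 should do.
  "org.scalatest"    %% "scalatest"       % "2.2.4" % "test"
)
```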

Development Requirements

Our first task is Streaming Naive Bayes. We are expected to complete the following:

  1. Write the streaming Naive Bayes algorithm, referring to the original Naive Bayes implementation and the existing streaming algorithm implementations.

  2. Write a test suite for streaming Naive Bayes that demonstrates the correctness and functionality of step 1. Please refer to ScalaTest for more information.
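
One possible way to think about step 1 is sketched below. Multinomial Naive Bayes only needs per-label example counts and per-label feature sums, and both can be accumulated batch by batch. The sketch is plain Scala without Spark so that the update rule is easy to follow. `NBStats` and all of its methods are hypothetical names, not part of MLlib or of this template. The smoothing formulas are written to mirror batch Naive Bayes with additive smoothing, which is an assumption about the intended behavior.

```scala
// Hypothetical helper, not an MLlib class: running sufficient statistics
// for multinomial Naive Bayes.
//   counts:      number of examples seen per label
//   featureSums: element-wise sum of feature vectors per label
case class NBStats(counts: Map[Double, Long], featureSums: Map[Double, Array[Double]]) {

  /** Fold one mini-batch of (label, features) pairs into the running statistics. */
  def update(batch: Seq[(Double, Array[Double])]): NBStats =
    batch.foldLeft(this) { case (stats, (label, features)) =>
      val oldSum = stats.featureSums.getOrElse(label, Array.fill(features.length)(0.0))
      val newSum = oldSum.zip(features).map { case (a, b) => a + b }
      NBStats(
        stats.counts.updated(label, stats.counts.getOrElse(label, 0L) + 1L),
        stats.featureSums.updated(label, newSum))
    }

  /** log P(label), with additive smoothing parameter lambda. */
  def logPrior(label: Double, lambda: Double = 1.0): Double = {
    val total = counts.values.sum.toDouble
    math.log(counts(label) + lambda) - math.log(total + lambda * counts.size)
  }

  /** log P(feature j | label), with additive smoothing parameter lambda. */
  def logLikelihood(label: Double, j: Int, lambda: Double = 1.0): Double = {
    val sums = featureSums(label)
    math.log(sums(j) + lambda) - math.log(sums.sum + lambda * sums.length)
  }

  /** Return the label with the highest posterior score for a feature vector. */
  def predict(features: Array[Double], lambda: Double = 1.0): Double =
    counts.keys.maxBy { label =>
      logPrior(label, lambda) +
        features.indices.map(j => features(j) * logLikelihood(label, j, lambda)).sum
    }
}

object NBStats {
  /** Starting point before any batch has been seen. */
  val empty: NBStats = NBStats(Map.empty, Map.empty)
}
```

A natural property for the test suite in step 2 is that updating with two batches in sequence yields the same statistics as a single update with all the data. In a Spark version, the same merge would presumably run inside `DStream.foreachRDD`, much as the existing streaming algorithms in MLlib update their models in `trainOn`.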

The Coding Template

This repository is a coding template that combines SBT settings with the typical framework for extending Spark MLlib. The function bodies are left undefined; developers are responsible for filling them in according to their own understanding of streaming algorithms.

Please refer to Spark MLlib 1.3.0 source code (Streaming LR, Streaming K-means, especially) to implement Streaming Naive Bayes.

Import project

In IntelliJ IDEA 14 (version 13 is probably similar), click File > Import Project. Select the root of this framework and click Next. Choose "Import project from external model", then choose SBT, and click Next. Check "Use auto-import" and set the Project SDK to 1.7.0 or above. Click Finish and wait for the import to complete.

Good Good Work, Day Day Up!
