Pentaho Data Mining (Weka)
Welcome to the community home for Pentaho Data Mining Community Edition (CE), also known as Weka. Pentaho Data Mining is a comprehensive set of tools for machine learning and data mining. Its broad suite of classification, regression, association rule, and clustering algorithms can help you understand your business better and improve future performance through predictive analytics.
Community Edition is self-supported open source software. An Enterprise Edition (EE) of Pentaho Data Mining, which includes technical support and managed upgrades, is also available. For more information about EE, or for screenshots and datasheets, visit Pentaho Data Mining EE on Pentaho's corporate site.
Recent News and Releases
- 02/22/13 Weka 3.7.9 maintenance release available.
Stable
New Features since 3.4
In Development
New Features in 3.7.8
In core Weka:
* EM and SimpleKMeans now allow for parallel processing on multi-CPU/multi-core machines
* CSVLoader re-written to be more memory efficient and support incremental loading
* Error plots for classifiers can optionally have point sizes set proportional to the prediction margin
* Pluggable evaluation metrics for classifiers/regressors
* Weighted resampling using Walker's alias method
* FlowByExpression - KnowledgeFlow component to split incoming instances (or instance stream) according to the evaluation of a logical expression
* ReplaceMissingWithUserConstant filter
* PartitionMembership filter - adds partition membership attributes as computed by a classifier that implements PartitionGenerator
* Stream throughput metrics in the KnowledgeFlow when running incrementally
* TextSaver KnowledgeFlow component
* Search facility in the KnowledgeFlow design palette
* Keyboard shortcuts in the KnowledgeFlow for toolbar buttons
* Offline mode for the package manager
* Improved out-of-memory detection and new low-memory detection for GUIs
In packages:
* New isolationForest package - isolation forests for outlier detection
* New multilayerPerceptrons package
* New extraTrees package
* Performance improvements for optics_dbScan
* New lazyAssociativeClassifier package contributed by Gessé Dafé
* New EvolutionarySearch package contributed by Sebastian Luna Valero
Contribute to the Project
You can participate by contributing new code, reporting bugs, testing new releases, answering questions, and more. Email us the proposed contribution and any other relevant details. Welcome to the team.