Hadoop is one of the best-known general-purpose computing platforms for processing big data. MapReduce, the Hadoop project's main processing engine, provides a framework for distributed computing. Its name comes from combining the 'map' and 'reduce' concepts of functional programming. This programming model hides the complexities of coordinating distributed compute nodes, so the developer can focus on implementing the Map and Reduce functions.
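The map/reduce idea can be illustrated without a cluster at all. The sketch below is a minimal, single-machine word count in plain Python, not Hadoop's actual Java API: the map phase emits (key, value) pairs, a sort stands in for the shuffle, and the reduce phase aggregates each key's values.

```python
from itertools import groupby
from operator import itemgetter

def map_phase(lines):
    """Map: emit a (word, 1) pair for every word in every input line."""
    for line in lines:
        for word in line.split():
            yield (word.lower(), 1)

def reduce_phase(pairs):
    """Reduce: group the pairs by key (the 'shuffle') and sum each group."""
    for word, group in groupby(sorted(pairs, key=itemgetter(0)),
                               key=itemgetter(0)):
        yield (word, sum(count for _, count in group))

counts = dict(reduce_phase(map_phase(["hadoop map reduce", "map reduce"])))
```

Because the two phases only see key/value pairs, the framework is free to run many map tasks and reduce tasks in parallel on different nodes, which is exactly the complexity MapReduce hides from the developer.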
A decision tree is one of the most efficient tools for classification and decision making. A decision tree consists of a root node and decision nodes, and the training data can be divided based on a measurable attribute. The entire dataset can thus be split into a number of partitions according to the chosen attributes. If all the samples within a partition belong to a single class, the algorithm terminates for that branch. Otherwise, the division process continues until no mixed-class partitions remain. Samples that do not fit into any of the attribute partitions are labeled as an unknown category.
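The partitioning rule described above can be sketched in a few lines. This is a simplified illustration, not a production learner: the attribute names and the `"unknown"` label are hypothetical, and the tree simply splits on attributes in the given order, stopping whenever a partition is pure.

```python
def build_tree(samples, attributes):
    """samples: list of (feature_dict, label) pairs.
    Recursively partition until each partition holds a single class."""
    labels = {label for _, label in samples}
    if len(labels) == 1:        # pure partition: stop and emit a leaf
        return labels.pop()
    if not attributes:          # attributes exhausted, classes still mixed
        return "unknown"
    attr, rest = attributes[0], attributes[1:]
    branches = {}
    for value in {f[attr] for f, _ in samples}:
        subset = [(f, l) for f, l in samples if f[attr] == value]
        branches[value] = build_tree(subset, rest)
    return (attr, branches)

def classify(tree, features):
    """Walk from the root; values with no matching partition are 'unknown'."""
    while isinstance(tree, tuple):
        attr, branches = tree
        if features.get(attr) not in branches:
            return "unknown"
        tree = branches[features[attr]]
    return tree

data = [({"outlook": "sunny"}, "play"),
        ({"outlook": "rainy"}, "stay")]
tree = build_tree(data, ["outlook"])
```

Real algorithms such as ID3 or C4.5 refine this by choosing the splitting attribute with a quality measure like information gain rather than taking attributes in a fixed order.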