From: Data classification algorithm for data-intensive computing environments
Require: NodeQueue NQ, TreeModel TM, Training record (x, y) ∈ D
 1. For each Tcurr ∈ NQ do
 2.   If JudgeLeaf(Tcurr) is false then
 3.     bestSplit = FindBestSplit(Tcurr)
 4.     Tcurr→splitAtt = bestSplit→splitAtt
 5.     If bestSplit→splitAtt is categorical then
 6.       Tcurr→leftAttSet = bestSplit→leftAttSet
 7.       Tcurr→rightAttSet = bestSplit→rightAttSet
 8.     Else
 9.       Tcurr→splitValue = bestSplit→splitValue
10.     partitionTrainingSet(Tcurr→D, leftD, rightD)
11.     remove(Tcurr→splitAtt)
12.     Create new nodes Tleft, Tright
13.     Initiate(Tleft, leftD, Att)
14.     Initiate(Tright, rightD, Att)
15.     Tcurr→left = Tleft
16.     Tcurr→right = Tright
17.     NQ.push_back(Tleft)
18.     NQ.push_back(Tright)
19.   Else
20.     Tcurr→isLeaf = true
21.     Tcurr→label = y   // y is the most common label
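The queue-driven construction above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the listing does not say how FindBestSplit scores a split or when JudgeLeaf returns true, so the sketch assumes weighted Gini impurity, one-value-versus-rest categorical subsets, and a leaf test of "pure, no attributes left, or fewer than min_size records". All Python names (Node, gini, goes_left, judge_leaf, find_best_split, build_tree, predict) are hypothetical. The sketch also turns a node into a leaf when no valid split exists, a case the listing leaves open.

```python
from collections import Counter, deque

class Node:
    def __init__(self, data, atts):
        self.data = data           # list of (x, y); x maps attribute name -> value
        self.atts = atts           # attributes still available for splitting
        self.is_leaf = False
        self.label = None
        self.split_att = None
        self.left_att_set = None   # categorical split: values routed to the left child
        self.split_value = None    # numeric split: x[att] <= split_value goes left
        self.left = self.right = None

def gini(records):
    # Gini impurity of the labels (assumed criterion; not given in the listing).
    n = len(records)
    counts = Counter(y for _, y in records)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())

def goes_left(x, att, left_set, value):
    return x[att] in left_set if left_set is not None else x[att] <= value

def judge_leaf(node, min_size=2):
    # Assumed stopping rule: pure node, no attributes left, or too few records.
    labels = {y for _, y in node.data}
    return len(labels) <= 1 or not node.atts or len(node.data) < min_size

def find_best_split(node):
    best = None  # (score, att, left_set, value); lower score is better
    for att in node.atts:
        values = sorted({x[att] for x, _ in node.data})
        if any(isinstance(v, str) for v in values):
            # Categorical: try each single value against the rest.
            candidates = [(frozenset([v]), None) for v in values]
        else:
            # Numeric: try midpoints between consecutive distinct values.
            candidates = [(None, (a + b) / 2) for a, b in zip(values, values[1:])]
        for left_set, value in candidates:
            left = [r for r in node.data if goes_left(r[0], att, left_set, value)]
            right = [r for r in node.data if not goes_left(r[0], att, left_set, value)]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(node.data)
            if best is None or score < best[0]:
                best = (score, att, left_set, value)
    return best

def build_tree(records, atts):
    root = Node(records, list(atts))
    nq = deque([root])             # NQ in the listing
    while nq:                      # pop instead of iterating: the queue grows as we go
        t = nq.popleft()
        best = None if judge_leaf(t) else find_best_split(t)
        if best is None:
            t.is_leaf = True
            t.label = Counter(y for _, y in t.data).most_common(1)[0][0]
            continue
        _, t.split_att, t.left_att_set, t.split_value = best
        left_d = [r for r in t.data
                  if goes_left(r[0], t.split_att, t.left_att_set, t.split_value)]
        right_d = [r for r in t.data
                   if not goes_left(r[0], t.split_att, t.left_att_set, t.split_value)]
        rest = [a for a in t.atts if a != t.split_att]   # remove(Tcurr→splitAtt)
        t.left, t.right = Node(left_d, rest), Node(right_d, rest)
        nq.extend([t.left, t.right])
    return root

def predict(node, x):
    while not node.is_leaf:
        left = goes_left(x, node.split_att, node.left_att_set, node.split_value)
        node = node.left if left else node.right
    return node.label

# Toy data, invented purely for illustration.
records = [
    ({"color": "red", "size": 1.0}, "A"),
    ({"color": "red", "size": 2.0}, "A"),
    ({"color": "blue", "size": 1.5}, "B"),
    ({"color": "blue", "size": 3.0}, "B"),
]
tree = build_tree(records, ["color", "size"])
print(predict(tree, {"color": "red", "size": 9.0}))
```

Because a deque cannot be mutated while it is being iterated, the sketch replaces the listing's "For each Tcurr ∈ NQ" with a loop that pops from the front until the queue is empty; nodes are still processed level by level, in the same order as the listing's push_back calls.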