Artificial Intelligence Research Group

Some Past and Current Projects
Under Development

12. Decision Tree Pathology
Investigator(s): Dr Weizhong Liu

ID3-type decision tree learners are among the most commonly used machine learning systems. Although these systems are well known and widely used, a number of questions about their implementation and performance remain open. This project seeks to address some of these questions. The selection of attributes for splitting the tree is one such area of investigation. Mingers claimed that random selection of attributes followed by pruning can induce a tree with the same level of classification accuracy as any of a variety of orthodox selection measures followed by pruning. He ran experiments on four data sets in which he tested a variety of measures against a purely random selection method. In each case, the resulting tree was pruned using Breiman's error complexity method. The results appeared to show no significant difference in classification performance, whichever method was used. Buntine and Niblett refuted this claim with more carefully constructed experiments and suggested reasons for the disparity between the two sets of results. We investigated the decrement in classification accuracy to be expected when random attribute selection is employed, and examined factors that might influence the magnitude of this decrement. We used precisely formulated synthetic data sets to test specific hypotheses.

Another problem investigated was that of estimating predictive accuracy when assessing and comparing classification techniques in the presence of noise. We used cross-validation as a thorough way of doing this. Since cross-validation is particularly computationally expensive, we used the technique of dynamic path generation to evaluate only paths to nodes in the decision tree, as opposed to the whole decision tree. The techniques investigated in this study have since been applied to the domain of oncology, in particular the early detection of gastric cancer.
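
The contrast between an orthodox attribute selection measure and purely random selection can be illustrated with a minimal sketch. This is not the investigators' code. It assumes information gain as one representative orthodox measure, and the function names and the tiny synthetic data set are hypothetical, chosen only for illustration. In the data set, the class is determined entirely by attribute 0, while attributes 1 and 2 are irrelevant.

```python
import math
import random
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(rows, labels, attr):
    """Reduction in entropy obtained by splitting on attribute index `attr`."""
    total = len(labels)
    partitions = {}
    for row, label in zip(rows, labels):
        partitions.setdefault(row[attr], []).append(label)
    remainder = sum(len(part) / total * entropy(part)
                    for part in partitions.values())
    return entropy(labels) - remainder


def select_by_gain(rows, labels, attrs):
    """Orthodox selection (assumed here): pick the highest-gain attribute."""
    return max(attrs, key=lambda a: information_gain(rows, labels, a))


def select_at_random(attrs, rng):
    """Random selection: ignore the data and pick any candidate attribute."""
    return rng.choice(list(attrs))


# Hypothetical synthetic data: all combinations of three binary attributes,
# with the class equal to attribute 0 (attributes 1 and 2 are irrelevant).
rows = [(a, b, c) for a in (0, 1) for b in (0, 1) for c in (0, 1)]
labels = [row[0] for row in rows]
attrs = [0, 1, 2]

gains = {a: information_gain(rows, labels, a) for a in attrs}
best = select_by_gain(rows, labels, attrs)
random_pick = select_at_random(attrs, random.Random(0))

print("gains:", gains)
print("selected by gain:", best)
print("selected at random:", random_pick)
```

On data like this, the gain-based rule always splits on the one relevant attribute, whereas random selection may split on an irrelevant one and must rely on further splits and pruning to recover. How much accuracy is lost in practice is the empirical question the project addresses.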

Liu, W.Z. & White, A.P. (1994). "The importance of attribute selection measures in decision tree induction." Machine Learning, 15, pp. 25-41.

Liu, W.Z., White, A.P., Hallissey, M.T. & Fielding, J.W.L. (in press). "Machine learning techniques in early screening for gastric and oesophageal cancer." Artificial Intelligence in Medicine.