Initializing Neural Networks using Decision Trees
The task of inductive learning from examples is to find an approximate definition for an unknown function f(x), given training examples of the form ⟨x_i, f(x_i)⟩. Existing algorithms to solve this problem follow either of two basic approaches. Decision tree methods, such as ID3 (Quinlan, 1986) and CART (Breiman, Friedman, Olshen, and Stone, 1984), construct trees whose leaves are labeled with the predicted classification. The connectionist methods, on the other hand, apply neural network learning algorithms such as the perceptron algorithm (Rosenblatt, 1958) and the error-backpropagation algorithm (Rumelhart, Hinton and Williams, 1986), which learn through incremental changes of weights in a network consisting of elementary units called neurons.
Comparative studies (Mooney, Shavlik, Towell, and Gove, 1989; Weiss and Kapouleas, 1989) have shown that whereas the decision tree algorithms run significantly faster during training, the connectionist methods almost always perform better at classifying novel examples in the presence of noisy data.
The studies have also demonstrated that when comparisons are confined strictly to the connectionist approach, the back-propagation algorithm outperforms the perceptron algorithm in classifying real world datasets, because real world datasets are often not linearly separable.
Back-propagation, however, suffers from a host of drawbacks. First, it is extremely slow to train. Second, and more important, variable parameters like learning rate, momentum, and network topology affect the accuracy of the resulting classifier. Setting these parameters to their respective optimal values is a painstaking and time-consuming process.
When it comes to solving real-world problems, restrictions on time and accuracy become of utmost importance: most applications require accurate real-time solutions to their respective problems. The methods mentioned above either fall short of the required accuracy in classifying novel examples, or train far too slowly to be functional in such environments.
This paper describes an algorithm which is much faster than back-propagation, and at the same time matches it in accuracy in classifying novel examples. The key idea is to construct a decision tree, convert it into an equivalent network, and then tune the network by training it over the same dataset for a short period of time.
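The conversion step can be illustrated with a minimal sketch (our own illustration, not the paper's code, and using a hypothetical two-feature tree): each internal test node of the tree becomes a soft threshold unit, each positive leaf becomes an AND unit over the tests along its path, and an output unit ORs the positive leaves. A large sigmoid gain K makes the network reproduce the tree exactly, while keeping the thresholds differentiable so the network can be tuned afterwards by back-propagation.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

K = 20.0  # gain: large K approximates hard thresholds yet stays differentiable

def tree_classify(x):
    """Hypothetical two-feature decision tree used for illustration."""
    if x[0] > 0.5:
        return 1
    return 1 if x[1] > 0.5 else 0

def net_classify(x):
    """Network built from the tree above; note the fixed depth of three layers."""
    # Layer 1: one soft threshold unit per internal tree node.
    h0 = sigmoid(K * (x[0] - 0.5))          # tests x0 > 0.5
    h1 = sigmoid(K * (x[1] - 0.5))          # tests x1 > 0.5
    # Layer 2: one AND unit per positive leaf, encoding its path conditions.
    leaf_a = sigmoid(K * (h0 - 0.5))        # path: x0 > 0.5
    leaf_b = sigmoid(K * (-h0 + h1 - 0.5))  # path: x0 <= 0.5 AND x1 > 0.5
    # Output layer: OR over the positive leaves.
    return 1 if sigmoid(K * (leaf_a + leaf_b - 0.5)) > 0.5 else 0

# Before any tuning, the network reproduces the tree's classifications;
# its weights then serve as the starting point for back-propagation.
samples = [(0.1, 0.1), (0.1, 0.9), (0.9, 0.1), (0.9, 0.9)]
assert all(net_classify(x) == tree_classify(x) for x in samples)
```

Because every tree, regardless of its depth, maps onto this three-layer scheme (tests, leaves, output), the resulting network has a fixed depth, in contrast to rule-based conversions whose depth follows the input ruleset.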
Many researchers have proposed converting decision trees into equivalent neural networks - see for example (Utgoff, 1989), (Sethi, 1990), (Brent, 1991), and (Cios and Liu, 1992). The difference in our approach lies in the fact that whereas all these methods generate networks that classify as accurately as the input decision trees, our method, aided by subsequent tuning, outperforms the decision tree in its accuracy in classifying new examples.
Towell and Shavlik (1993) advance a technique for converting symbolic rules into neural networks with subsequent training that resembles our approach. The conversion algorithm they employ is, however, distinct, and the resulting network has a different topology: whereas the depth of the network in their approach depends upon the input ruleset, our approach creates a network with a fixed depth.
This paper is structured as follows. Section 1.2 demonstrates how a neural network may be derived from a decision tree. In addition, it highlights certain other advantages of the proposed method. Section 1.3 clarifies the algorithm by converting an example decision tree into an equivalent neural network. In section 1.4 experimental results pertaining to three different domains are presented and the new approach is compared to back-propagation and ID3 in their context. Section 1.5 discusses future research directions based on some known shortcomings of this approach, and section 1.6 presents concluding remarks.