Analyzing Learning Algorithm with Adaptive Sampling




Many researchers have proposed experimental or theoretical methods for analyzing the behavior of learning algorithms. However, analyzing their exact behavior is difficult: a learning algorithm is hard to translate into a mathematical model, and the analysis requires a long computation time as the instance space grows.

One possible approach to this difficulty is to introduce random sampling. In this paper, we propose AdaRCA, a method for analyzing the behavior of learning algorithms using an adaptive random sampling approach. In AdaRCA, a training set is selected from an example space by random sampling, the training set is fed into the learning algorithm, and the accuracy of the hypothesis generated by the learning algorithm is calculated. These steps are iterated until the classification accuracy is close enough to its expectation.
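
The sample, learn, and evaluate loop described above can be sketched as follows. This is only an illustrative sketch, not the AdaRCA implementation: the example space (the integers 0 to N-1), the target concept, the toy threshold learner, and all names such as `learn_threshold` and `estimate_accuracy` are assumptions introduced here. For brevity, the stopping test is replaced by a fixed number of trials.

```python
import random

N = 1000          # assumed toy example space {0, ..., N-1}
TRUE_T = 500      # assumed target concept: x is positive iff x >= TRUE_T

def label(x):
    return x >= TRUE_T

def learn_threshold(train):
    """Hypothetical learner: place the threshold midway between the
    largest negative and the smallest positive training example."""
    lo = max((x for x in train if not label(x)), default=-1)
    hi = min((x for x in train if label(x)), default=N)
    t = (lo + hi + 1) // 2
    return lambda x: x >= t

def accuracy(h):
    """Exact accuracy of hypothesis h over the whole example space."""
    return sum(h(x) == label(x) for x in range(N)) / N

def estimate_accuracy(n_train, n_trials, seed=0):
    """Average accuracy over n_trials randomly sampled training sets."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_trials):
        train = rng.sample(range(N), n_train)   # random training set
        h = learn_threshold(train)              # run the learner
        total += accuracy(h)                    # score its hypothesis
    return total / n_trials
```

Calling `estimate_accuracy(5, 200)` and `estimate_accuracy(50, 200)` gives the average accuracy of this toy learner for two training-set sizes; how many trials are actually needed is the question the adaptive part of the method addresses.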

In random sampling, it is difficult to determine an appropriate sample size. To overcome this difficulty, we introduce two techniques into AdaRCA. The first is adaptive sampling, which obtains examples in an on-line fashion and determines the sample size from the training sets obtained so far. The second is a way of estimating the maximum and minimum values of the classification accuracy. As a result, we can flexibly analyze the behavior of learning algorithms.
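
To illustrate how a sample size can be chosen on-line rather than fixed in advance, the sketch below stops drawing as soon as an approximate confidence half-width for the mean accuracy falls below a tolerance. The stopping rule is an assumption made for illustration (a heuristic normal approximation), not the criterion used by AdaRCA. The function `adaptive_estimate`, its parameters, and the callable `draw_accuracy` are hypothetical. The reported minimum and maximum are simply the extremes observed so far, not the estimator mentioned above.

```python
import math
import random

def adaptive_estimate(draw_accuracy, eps=0.01, z=1.96,
                      min_trials=30, max_trials=100_000):
    """Draw accuracies one at a time until z * s / sqrt(n) <= eps
    (heuristic rule) or until max_trials draws have been made."""
    n, mean, m2 = 0, 0.0, 0.0
    lo, hi = math.inf, -math.inf
    while n < max_trials:
        a = draw_accuracy()          # one sample-learn-evaluate trial
        n += 1
        delta = a - mean             # Welford's running mean/variance
        mean += delta / n
        m2 += delta * (a - mean)
        lo, hi = min(lo, a), max(hi, a)
        if n >= min_trials:
            half_width = z * math.sqrt(m2 / (n - 1) / n)
            if half_width <= eps:
                break
    return {"mean": mean, "trials": n, "min": lo, "max": hi}
```

With this rule the number of trials depends on the observed variance: a learner whose accuracy barely changes between training sets stops near `min_trials`, while a learner with widely varying accuracy needs many more draws.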