Exploiting Monotonicity Constraints in Active Learning
MetadataShow full item record
If one wants to build a classi?er, but the available data is not classi?ed, active learning can be a good solution. If the data in question has the monotonicity property, we can classify much more data using fewer queries. The data is said to be monotone if the class label is known to be increasing or decreasing in the attributes. The goal of active learning with monotonicity constraints is to use queried and monotonically inferred data to build a classi?er. A central problem in active learning is ?nding the best query points. It has been suggested to use the concept of lower sets to search in the partially ordered sets (posets) that represent the dataset , , . We show that this concept can also be used to select good query points in active learning with monotonicity constraints. Because counting lower sets becomes very costly quickly, it's hard to test this hypothesis on general posets. We provide a dynamic programming algorithm that counts all lower sets in linear time for the special case of a matrix order. We then show that this query strategy is at most a logarithmic factor worse than the optimal query strategy at classifying the whole dataset. We contrast this strategy with a simple but good known greedy heuristic . Testing shows that the lower-set-count strategy outperforms the greedy heuristic in classifying the whole dataset, but is worse when an a priori set maximum number of queries is made. Tests are performed on arti?cial and real datasets and the heuristics are judged based on both the quantity of inferred class labels and their quality, by means of building a classi?er. These tests lead us to conclude that the greedy heuristic is very good and that the lower-set-count strategy is not a promising one for general posets. The latter conclusion is reached by reasoning that approximating lower set counts, which would be needed, will nullify the marginal edge over the greedy strategy it might have at best. The most challenging part of creating arti?cial data is generating random monotone classi?cation functions. We show how to do this without generating all of them using the Propp-Wilson algorithm. Furthermore, concerning datasets with more than two class labels, we show that when building a classi?er after a restricted number of queries, the classi?er can be improved substantially by using partially classi?ed data entries in addition to the fully classi?ed data.