WEKA-LR: A Label Ranking Extension for WEKA
The problem setting of label ranking, which has
recently been introduced in machine learning research, is a specific
type of preference learning and can be seen as an
extension of conventional multi-class classification. In comparison to
the latter, there are notable differences regarding the type of
training data and the type of models (predictions) produced.
While a classifier is a mapping from instances to
class labels, assigning to each instance x one label y among a finite
set of candidates Y, a label ranker is a mapping from
instances to rankings (total orders) over Y. Thus, given any instance x
as an input, a label ranker produces a prediction in the form of a
ranking of the complete set of labels Y as an output. Typically, the
ranking thus produced is interpreted as a preference relation.
As an example, consider the problem of predicting the preferences of
people (e.g., characterized in terms of a feature vector which
constitutes the input of the label ranker) regarding the set of music
genres {classic, jazz, popular, traditional}. The prediction
popular > jazz > classic > traditional suggests that
the person for whom the prediction is made likes popular music most,
which he or she prefers to jazz, which is preferred to classic, which
is in turn preferred to traditional music. Please note, however, that
this is only an example, and that the order relation > does not
necessarily need to be interpreted in terms of a preference semantics.
Instead, the prediction of a ranking may also be of interest in other
cases. For example, if instances are proteins and labels are small
molecules, then y > y’ could mean that y binds more strongly
to a protein x than y’ does.
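To make the difference between a classifier and a label ranker concrete, the following minimal Python sketch derives both kinds of prediction from the same per-label scores. The scores, the function names, and the idea of ranking by scores are assumptions made purely for illustration; they are not taken from WEKA-LR.

```python
from typing import Dict, List

# Label set Y from the music example above.
LABELS = ["classic", "jazz", "popular", "traditional"]


def classify(scores: Dict[str, float]) -> str:
    """Classifier-style prediction: a single label from Y."""
    return max(LABELS, key=lambda y: scores[y])


def rank_labels(scores: Dict[str, float]) -> List[str]:
    """Label-ranker-style prediction: a total order over all of Y, best first."""
    return sorted(LABELS, key=lambda y: scores[y], reverse=True)


# Hypothetical scores for one person x (invented for this example).
scores_for_x = {"classic": 0.2, "jazz": 0.6, "popular": 0.9, "traditional": 0.1}

top_label = classify(scores_for_x)
ranking = rank_labels(scores_for_x)
print(top_label)            # popular
print(" > ".join(ranking))  # popular > jazz > classic > traditional
```

The classifier returns only the top label, whereas the label ranker returns an ordering of every label in Y.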
Just as a classification algorithm induces a classifier, a label
ranking algorithm learns a label ranker from a set of training data.
Here, the training data essentially consists of examples of preference
information. In the simplest case, this information is given in the
form of pairwise comparisons, i.e., in the form of an instance x
together with a comparison y > y’ suggesting that, for x as an input,
y should be ranked ahead of y’.
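As a rough illustration of such training data, the sketch below stores each example as an instance x paired with one comparison y > y’, and checks how many comparisons a given ranking agrees with. The feature values, type aliases, and helper functions are hypothetical and chosen only for this example; this is not the .xarff format or the WEKA-LR API.

```python
from itertools import combinations
from typing import List, Sequence, Tuple

Instance = Tuple[float, ...]   # feature vector x
Comparison = Tuple[str, str]   # (y, y') meaning y > y'

# Hypothetical training data: each entry pairs an instance with one comparison.
training_data: List[Tuple[Instance, Comparison]] = [
    ((25.0, 1.0), ("popular", "jazz")),
    ((25.0, 1.0), ("jazz", "traditional")),
    ((61.0, 0.0), ("classic", "popular")),
]


def pairwise_comparisons(ranking: Sequence[str]) -> List[Comparison]:
    """All comparisons y > y' implied by a ranking listed best first."""
    return list(combinations(ranking, 2))


def agrees(ranking: Sequence[str], comparison: Comparison) -> bool:
    """True if the ranking places y ahead of y'."""
    y, y_prime = comparison
    return ranking.index(y) < ranking.index(y_prime)


ranking = ["popular", "jazz", "classic", "traditional"]
implied = pairwise_comparisons(ranking)
n_agree = sum(agrees(ranking, c) for _, c in training_data)
print(len(implied), n_agree)  # 6 2
```

A complete ranking of four labels implies six pairwise comparisons, so a full ranking can always be broken down into training information of this simplest form.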
For a more detailed introduction to label ranking, see the references
in the list below.
We have developed an extension of the Java machine learning
framework WEKA which is able to handle preference data and
includes label ranking algorithms. This extension, called
WEKA-LR, can be downloaded here. The
essentials of WEKA-LR are described in a short accompanying document.
Finally, here are some
sample data sets for label ranking, stored in our new data format
.xarff, which is an extension of the conventional .arff format of
WEKA.
References:
J. Fürnkranz and E. Hüllermeier.
Preference Learning.
Künstliche Intelligenz, 1/05, pp. 60-61, 2005.
[ a very concise introduction formalizing the settings of label and
object ranking,
PDF ]
J. Fürnkranz and E. Hüllermeier.
Preference Learning.
Springer-Verlag, Berlin, 2010.
[ our edited book on preference learning,
PDF of the introductory chapter ]
E. Hüllermeier, J. Fürnkranz, W. Cheng, and K. Brinker.
Label Ranking by Learning Pairwise Preferences.
Artificial Intelligence 172:1897-1917, 2008.
[ Draft-PDF ]
W. Cheng, J. Hühn, and E. Hüllermeier.
Decision Tree and Instance-Based Learning for Label Ranking.
Proc. ICML-09, International Conference on Machine Learning.
Montreal, Canada, June 2009.
[ PDF ]
W. Cheng, K. Dembczynski, and E. Hüllermeier.
Label Ranking based on the Plackett-Luce Model.
Proc. ICML-2010, International Conference on Machine Learning.
Haifa, Israel, June 2010.
[ PDF ]

