This experiment compares several supervised learning classifiers and their variants on text classification, using the Reuters-21578 dataset. The aim is to find the best performance of each classifier by tuning its parameters so that the classification error is minimized.
The original Reuters-21578 corpus contains 135 categories, and the categories overlap, i.e., a document may belong to several categories. We therefore use the ModApte version of Reuters, which contains 12902 documents in 90 categories and is divided into training and test sets. For this experiment, we are given the following 10 categories:
alum, barley, coffee, dmk, fuel, livestock, palm-oil, retail, soybean, veg-oil
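
Because a document may carry several categories, restricting the corpus to these ten is a multi-label filtering step. The following is a minimal sketch of that step, not the loading code used in the experiment. It assumes each document is available as a (doc_id, text, labels) tuple; that layout, the helper names select_documents and to_indicator_rows, and the toy documents are all hypothetical and chosen only for illustration.

```python
# The ten categories named above.
TARGET_CATEGORIES = [
    "alum", "barley", "coffee", "dmk", "fuel",
    "livestock", "palm-oil", "retail", "soybean", "veg-oil",
]


def select_documents(docs, categories=TARGET_CATEGORIES):
    """Keep only documents that carry at least one target category.

    `docs` is an iterable of (doc_id, text, labels) tuples. This layout is
    an assumption for illustration, not the corpus's native format.
    Labels outside `categories` are dropped from the kept documents.
    """
    selected = []
    for doc_id, text, labels in docs:
        label_set = set(labels)
        # Preserve the order of `categories` so columns line up later.
        kept = [c for c in categories if c in label_set]
        if kept:
            selected.append((doc_id, text, kept))
    return selected


def to_indicator_rows(selected, categories=TARGET_CATEGORIES):
    """Build one 0/1 row per document, with one column per category."""
    return [
        [1 if c in kept else 0 for c in categories]
        for _, _, kept in selected
    ]


# Toy documents: invented placeholders, not real Reuters text.
toy_docs = [
    ("training/1", "palm oil exports rose", ["palm-oil", "veg-oil"]),
    ("training/2", "bank raises interest rates", ["interest"]),
    ("test/3", "coffee quota talks resume", ["coffee"]),
]

selected = select_documents(toy_docs)
rows = to_indicator_rows(selected)
```

Here the second toy document is discarded because none of its labels is among the ten, while the first keeps both of its labels. The 0/1 rows are one common way to hand overlapping categories to a classifier, for example by training one binary classifier per column.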