ARFF tools

arff-tools is a software toolkit for manipulating data files in the Attribute-Relation File Format (ARFF), which is used extensively in the Mutrics project and in the machine learning research community.

The tools:
  • arff-cat: concatenate many files with the same header
  • arff-checkperf: check performance of a classifier (compare values in two columns)
  • arff-combine: combine several files by choosing columns in them
  • arff-grep: filter files by column values
  • arff-rewrite: rewrite values in given column using an external dictionary
  • arff-sample: randomly sample a given percentage of rows
  • arff-select: select given columns and output as either TXT or ARFF file

Additionally, the toolkit contains the following Python library:
  • ArffReader.py: a reader of ARFF files with many features
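
To make the operations listed above more concrete, here is a minimal Python sketch of reading a simple ARFF text, filtering rows by a column value (the idea behind arff-grep) and sampling a percentage of rows (the idea behind arff-sample). It is not the code of arff-tools and does not use ArffReader.py: the function names parse_arff, grep_rows and sample_rows, the seed parameter and the sample data are all hypothetical, and the parser assumes dense, unquoted, comma-separated data only.

```python
import random

# Made-up sample data, for illustration only.
SAMPLE = """\
% comment lines start with a percent sign
@relation flows
@attribute proto {TCP,UDP}
@attribute bytes numeric
@attribute class {web,dns}
@data
TCP,1200,web
UDP,80,dns
TCP,640,web
UDP,95,dns
"""


def parse_arff(text):
    """Return (relation, attribute_names, rows) for a simple dense ARFF text.

    Assumption: no quoted values, no sparse rows, no embedded commas.
    """
    relation, names, rows = None, [], []
    in_data = False
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("%"):
            continue  # skip blank lines and comments
        if in_data:
            rows.append([value.strip() for value in line.split(",")])
            continue
        keyword = line.split(None, 1)[0].lower()
        if keyword == "@relation":
            relation = line.split(None, 1)[1]
        elif keyword == "@attribute":
            # "@attribute <name> <type>": keep only the name
            names.append(line.split(None, 2)[1])
        elif keyword == "@data":
            in_data = True
    return relation, names, rows


def grep_rows(names, rows, column, value):
    """Keep rows whose given column equals value (hypothetical helper)."""
    idx = names.index(column)
    return [row for row in rows if row[idx] == value]


def sample_rows(rows, percent, seed=0):
    """Randomly pick about percent% of the rows (hypothetical helper)."""
    rng = random.Random(seed)
    count = round(len(rows) * percent / 100)
    return rng.sample(rows, count)


if __name__ == "__main__":
    relation, names, rows = parse_arff(SAMPLE)
    print(relation, names)
    print(grep_rows(names, rows, "proto", "TCP"))
    print(sample_rows(rows, 50))
```

The header is kept separate from the data rows because most of the tools above either preserve the header unchanged (concatenating, filtering, sampling) or rewrite it together with the selected columns.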

Source code