DNS-Class algorithm

About

DNS-Class is an algorithm for classifying network traffic flows using DNS information. It relies on the information carried in the DNS query and response packets associated with these flows.

For more information on the algorithm, see the following paper:

Foremski P., Callegari C., Pagano M., "DNS-Class: Immediate classification of IP flows using DNS"
International Journal of Network Management, John Wiley & Sons, 2014 (see Publications)

Download source code

Here, we publish dnsclass, an open-source Python implementation of the DNS-Class algorithm. It uses the libshorttext library and consists of several classification steps. See the source code documentation for more details.

dnsclass takes ARFF files generated by Flowcalc as input. Apart from the standard traffic features provided by Flowcalc, you will need the flow domain names (the dns module) and the ground-truth protocol (e.g. the lpi module).
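To make the input format more concrete, the sketch below reads a small ARFF document and pairs each flow's domain name with its ground-truth protocol. It is only an illustration and is not part of dnsclass: the attribute names fc_id, dns_name and lpi_proto and the sample rows are hypothetical, so check the header of your own Flowcalc output for the real column names. The parser handles only simple, dense ARFF files.

```python
import csv
import io


def read_arff(text):
    """Parse a simple dense ARFF document into (attribute_names, rows)."""
    names, rows, in_data = [], [], False
    for line in io.StringIO(text):
        line = line.strip()
        if not line or line.startswith("%"):
            continue  # skip blank lines and ARFF comments
        lower = line.lower()
        if in_data:
            # csv copes with single-quoted values that contain commas
            values = next(csv.reader([line], quotechar="'", skipinitialspace=True))
            rows.append(dict(zip(names, values)))
        elif lower.startswith("@attribute"):
            # "@attribute <name> <type>": keep only the name
            names.append(line.split(None, 2)[1].strip("'\""))
        elif lower.startswith("@data"):
            in_data = True
    return names, rows


# Hypothetical sample; real Flowcalc files carry many more attributes.
SAMPLE = """\
@relation flows
@attribute fc_id numeric
@attribute dns_name string
@attribute lpi_proto string
@data
1,'www.example.com','HTTP'
2,'mail.example.org','SMTP'
3,?,'BitTorrent'
"""

names, rows = read_arff(SAMPLE)

# Keep only flows that have a known domain name ("?" marks a missing value).
pairs = [(r["dns_name"], r["lpi_proto"]) for r in rows if r["dns_name"] != "?"]

for domain, proto in pairs:
    print(domain, proto)
```

Flows without a domain name (the "?" value in ARFF) are skipped here, since a DNS-based classifier has nothing to work with for them.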

Below you will find the implementation of the two DNS-Class stages described in the paper (in Section 2):

Please note that the DNS-Class algorithm is also implemented as one of the modules of the MUTRICS classifier, which implements the Waterfall cascade classification architecture.