
Today, owing to the high number of attacks and anomalous events in network traffic, network anomaly detection has become an important research area. To avoid further harm, it is necessary to detect all behaviors that do not comply with a well-defined notion of normal behavior. The two most widespread network anomalies related to network security are port scans and net scans, activities performed by a malicious host to find and examine potential victims.

In this work, a novel approach for detecting port and net scans using Big Data Analytics frameworks is presented. The approach works at flow level and has been conceived to detect such anomalous events on high-speed networks in a short time. In accordance with this approach, an algorithm has been created that detects IP addresses generating port and net scanning activities and is suited for execution on the Apache Spark framework.

The paper first describes the approach and the proposed algorithm and then presents an experimental analysis of its performance, including a comparison with the MAWILab gold standard. The execution time of the algorithm has also been experimentally evaluated by running Apache Spark on a private cloud. Results show that the algorithm is highly accurate in terms of Precision and Recall for port and net scan detection. Our approach also detects anomalies that the gold standard does not.

Moreover, the execution time of the algorithm on Apache Spark is very short, even on large traffic traces.
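
To make the idea of flow-level scan detection concrete, the following is a minimal sketch in plain Python. It is not the algorithm evaluated in the paper, whose details are not given in this section. It assumes a simplified flow record (`Flow`), a hypothetical function name (`detect_scanners`), and illustrative thresholds: a source is flagged as a port scanner when it contacts many distinct ports on one destination, and as a net scanner when it contacts many distinct hosts on the same port.

```python
from collections import defaultdict
from typing import Iterable, NamedTuple, Set, Tuple


class Flow(NamedTuple):
    """Simplified flow record (assumption: real flow records carry more fields)."""
    src_ip: str
    dst_ip: str
    dst_port: int


def detect_scanners(
    flows: Iterable[Flow],
    port_threshold: int = 100,   # hypothetical threshold, not taken from the paper
    host_threshold: int = 100,   # hypothetical threshold, not taken from the paper
) -> Tuple[Set[str], Set[str]]:
    """Return (port_scanners, net_scanners) as sets of source IP addresses."""
    ports_per_pair = defaultdict(set)  # (src, dst) -> distinct destination ports
    hosts_per_port = defaultdict(set)  # (src, port) -> distinct destination hosts

    for flow in flows:
        ports_per_pair[(flow.src_ip, flow.dst_ip)].add(flow.dst_port)
        hosts_per_port[(flow.src_ip, flow.dst_port)].add(flow.dst_ip)

    # Port scan: one source probing many ports on a single destination.
    port_scanners = {
        src for (src, _dst), ports in ports_per_pair.items()
        if len(ports) >= port_threshold
    }
    # Net scan: one source probing the same port on many destinations.
    net_scanners = {
        src for (src, _port), hosts in hosts_per_port.items()
        if len(hosts) >= host_threshold
    }
    return port_scanners, net_scanners
```

Both groupings are keyed aggregations over flow records, which is the kind of computation a framework such as Apache Spark can distribute by grouping on the key and counting distinct values. The sketch keeps everything in memory only for readability.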
