POST ANALYSIS OF SNORT INTRUSION FILES USING DATA MINING TECHNIQUES: DECISION TREE AND BAYESIAN NETWORK

JEGEDE T.J, ASANBE M.O

Abstract


Network security is a crucial information technology activity today. Intrusion Detection Systems (IDS) are among the fastest-growing technologies in the computer security domain. These systems are designed to identify or prevent hostile intrusion into a network. Most conventional intrusion detection systems have limitations in the way they log their alerts. The limitation that Snort exhibits is known as the infidelity issue: the Snort IDS does not infer the behavior of the network traffic generated, which can result in misinterpretations. Therefore, in this project, data mining techniques were applied to the logged alerts in order to extract hidden knowledge about the traffic pattern. This research investigates the network domain of data mining, using the network alerts generated by the Snort intrusion detection system in order to mine the alerts for re-classification. The data comprised nine hundred and sixty (960) alert records. A classification task is used to evaluate the alerts, making use of the Bayesian Network and Decision Tree methods. The outputs of the two classification methods are compared to determine which gives the better classification results. At the modeling stage, the open source software WEKA 3.6.13 was used. The data set was divided into two sets, training and testing: sixty-eight percent (68%) was used for training and thirty-two percent (32%) for testing. In the experiment, Decision Tree outperformed Bayesian Network in most aspects, and Snort combined with data mining proved more reliable and efficient than Snort alone. Decision Tree performed better than Bayesian Network in terms of the number of correctly classified instances and also in terms of Root Mean Squared Error, Root Relative Squared Error, Mean Absolute Error, and Relative Absolute Error. Bayesian Network outperformed Decision Tree in the time taken to build the model but performed poorly at classification. The times taken to build the naïve Bayes and Decision Tree classifiers were 0.12 and 0.32 seconds, respectively.
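For readers unfamiliar with the quantities named in the abstract, the following is a minimal illustrative sketch, not the authors' procedure. It shows the arithmetic of a 68%/32% split of 960 records and the textbook definitions of the four error measures. The function names, the rounding rule, the use of the mean of the actual values as the baseline predictor, and the sample values are all assumptions made for illustration; WEKA's own computation over class probability estimates may differ in detail.

```python
import math

def split_sizes(n_records, train_fraction):
    """Return (train, test) sizes, rounding the training share (assumed rule)."""
    n_train = round(n_records * train_fraction)
    return n_train, n_records - n_train

def error_metrics(actual, predicted):
    """Textbook MAE, RMSE, RAE and RRSE for paired numeric values.

    The relative measures compare the model's error with that of a
    baseline that always predicts the mean of the actual values
    (an assumption made for this sketch).
    """
    n = len(actual)
    mean_actual = sum(actual) / n
    abs_err = sum(abs(p - a) for a, p in zip(actual, predicted))
    sq_err = sum((p - a) ** 2 for a, p in zip(actual, predicted))
    base_abs = sum(abs(a - mean_actual) for a in actual)
    base_sq = sum((a - mean_actual) ** 2 for a in actual)
    return {
        "MAE": abs_err / n,
        "RMSE": math.sqrt(sq_err / n),
        "RAE_percent": 100 * abs_err / base_abs,
        "RRSE_percent": 100 * math.sqrt(sq_err / base_sq),
    }

# 68% of the 960 alert records for training, the remainder for testing.
train_size, test_size = split_sizes(960, 0.68)
print(train_size, test_size)

# Hypothetical values only: 1 = alert class present, 0 = absent,
# with predicted values standing in for a classifier's probability estimates.
metrics = error_metrics([1, 0, 1, 0], [0.9, 0.2, 0.6, 0.1])
print(metrics)
```

Lower values are better for all four measures. A relative error below 100% means the model's error is smaller than that of the mean-predicting baseline.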



