A machine learning approach to detecting malware on a system.
Malware is malicious software intentionally developed to infiltrate or damage a computer system without the owner's consent. It includes, among others, viruses, worms, and Trojan horses.
Malware typically consists of code developed by cyberattackers and designed to cause extensive damage to data and systems or to gain unauthorized access to a network. Once installed on a system, it can cause a wide range of problems, from stealing personal information to destroying critical data. The impact on individuals and organizations can be devastating. For individuals, malware can result in identity theft, financial loss, and loss of privacy. For organizations, it can cause significant financial and reputational damage, as well as loss of sensitive data.
Malware detection is the process of detecting the presence of malware on a host system, or of determining whether a specific program is malicious or benign.
Here I have trained several machine learning algorithms to check which one performs best at classifying whether a specific program is malicious or not.
The Logistic Regression model gave an accuracy of 98%, which is really good. This is my first-cut solution for detecting malware.
The Gradient Boosting model performed better than Logistic Regression and gave an accuracy of 98.88%.
The MLP Classifier model gave an accuracy of 99%, which is better than Logistic Regression and Gradient Boosting.
The Decision Tree model gave an accuracy of 99.04%, which is slightly better than the MLP Classifier but not a drastic increase in performance.
The Random Forest model performed the best! It gave an accuracy of 99.54%. It is my final solution.
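The comparison of the five models can be sketched roughly as follows with scikit-learn. This is a minimal illustration, not the project's actual training code: the data is a synthetic stand-in generated with `make_classification`, the hyperparameters are library defaults apart from `max_iter` and `random_state`, and the 80/20 split and use of `StandardScaler` are assumptions. Accuracies on this synthetic data will not match the figures reported above.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the real malware feature table (assumption).
X, y = make_classification(
    n_samples=1000, n_features=20, n_informative=8, random_state=42
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Fit the scaler on the training split only, then apply it to both splits.
scaler = StandardScaler().fit(X_train)
X_train_s = scaler.transform(X_train)
X_test_s = scaler.transform(X_test)

models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Gradient Boosting": GradientBoostingClassifier(random_state=42),
    "MLP Classifier": MLPClassifier(max_iter=500, random_state=42),
    "Decision Tree": DecisionTreeClassifier(random_state=42),
    "Random Forest": RandomForestClassifier(random_state=42),
}

# Train each model and record its accuracy on the held-out test split.
results = {}
for name, model in models.items():
    model.fit(X_train_s, y_train)
    results[name] = accuracy_score(y_test, model.predict(X_test_s))

# Print models from highest to lowest test accuracy.
for name, acc in sorted(results.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {acc:.4f}")
```

Since the scores reported above are all within about 1.5 percentage points of each other, comparing them with cross-validation (for example `cross_val_score`) rather than a single split would give a more reliable ranking.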
The best model and the scaler are saved in the asset folder. Do give it a try.
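Trying out a saved model and scaler might look like the sketch below. It assumes the objects were serialized with `joblib`, which is a common choice for scikit-learn models but is not confirmed by this README. The file names `model.joblib` and `scaler.joblib` are hypothetical placeholders, and the example trains a small model on synthetic data so that it runs on its own. With the real files, only the two `joblib.load` calls and the final prediction step would be needed.

```python
import tempfile
from pathlib import Path

import joblib
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data and a small model, only so the sketch is runnable.
X, y = make_classification(n_samples=300, n_features=10, random_state=0)
scaler = StandardScaler().fit(X)
model = RandomForestClassifier(n_estimators=50, random_state=0)
model.fit(scaler.transform(X), y)

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical file names; the actual names in the asset folder may differ.
    model_path = Path(tmp) / "model.joblib"
    scaler_path = Path(tmp) / "scaler.joblib"

    # Save both objects: the scaler must be reused at prediction time.
    joblib.dump(model, model_path)
    joblib.dump(scaler, scaler_path)

    # Load them back, as a user of the saved assets would.
    loaded_model = joblib.load(model_path)
    loaded_scaler = joblib.load(scaler_path)

# Scale new samples with the loaded scaler before predicting.
sample = X[:5]
predictions = loaded_model.predict(loaded_scaler.transform(sample))
print(predictions)
```

The scaler is saved alongside the model because new inputs need the same transformation that was applied to the training data. Only load serialized model files from sources you trust, since unpickling can execute arbitrary code.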
This project is licensed under the MIT License.