rishabh11336/Phishing-Domain-Detection

Phishing Domain Detection Project

Project Overview

This project focuses on developing a machine learning solution for detecting phishing domains. Phishing, a prevalent form of cyber fraud, involves attackers impersonating reputable entities to obtain sensitive information. The primary goal is to predict whether domains are real or malicious, thereby enhancing cybersecurity measures.
The challenge lies in differentiating between legitimate and malicious domains. The work covered data exploration, cleaning, feature engineering, model building, and testing.

Dataset

Approach

The project employs various machine learning algorithms tailored to the problem at hand. Feature engineering includes URL-based, domain-based, page-based, and content-based features. Random Forest was later chosen as the model.
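As an illustration of the URL-based features mentioned above, here is a minimal sketch using only the Python standard library. The function name extract_url_features and the specific features are assumptions chosen for the example, not the project's actual feature set. Dictionaries like the one returned here could be converted into rows of a feature matrix for a Random Forest classifier.

```python
import re
from urllib.parse import urlparse

# Hypothetical helper: the feature names below are illustrative only,
# not the features used by the project.
def extract_url_features(url: str) -> dict:
    parsed = urlparse(url)
    host = parsed.hostname or ""
    # A host made of four dot-separated numeric groups is treated as an IP.
    host_is_ip = bool(re.fullmatch(r"(\d{1,3}\.){3}\d{1,3}", host))
    return {
        "url_length": len(url),
        "num_dots": url.count("."),
        "num_hyphens": url.count("-"),
        "num_digits": sum(ch.isdigit() for ch in url),
        "has_at_symbol": int("@" in url),
        "uses_https": int(parsed.scheme == "https"),
        "host_is_ip": int(host_is_ip),
        # Labels before the registered name, e.g. "www" in www.example.com.
        "num_subdomains": 0 if host_is_ip else max(host.count(".") - 1, 0),
    }
```

Calling extract_url_features("https://www.example.com") returns a flat dictionary of integers, which keeps the features easy to log and to feed into a tabular model.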

Technologies Used

Database

  • A Cassandra database is used for this project. It saves every transaction as history and stores the records of safe and phishing URLs.

Cloud Platform

  • Azure is the cloud platform used for hosting the solution.

Logging

  • The Python logging library is used to log every action performed by the code. The logs are saved in the GitHub repository under Logging.
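A minimal sketch of how file-based logging with the standard logging module might be set up. The helper name get_logger, the logger name, the log file name, and the message format are all assumptions for illustration; the project's actual configuration may differ.

```python
import logging

# Hypothetical helper: names and format are assumptions, not the project's setup.
def get_logger(name: str = "phishing_app", log_file: str = "app.log") -> logging.Logger:
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    # Attach the file handler only once so repeated calls do not duplicate lines.
    if not logger.handlers:
        handler = logging.FileHandler(log_file)
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s")
        )
        logger.addHandler(handler)
    return logger
```

A typical call would be get_logger().info("Prediction requested for url=%s", url), which appends one timestamped line to the log file.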

Wireframe

Solution Design

Architecture Diagram

Azure, Python, Flask, sklearn, Cassandra

To Build This Project

git clone https://github.com/rishabh11336/iNeuron-Internship-Phishing-Domain-Detection.git

Create a virtual environment using Python 3.9:

python -m venv <name of virtual environment>

or

conda create -n <name of virtual environment> python=3.9

To learn more about virtual environments, refer to:

https://medium.com/@asusrishabh/requirements-txt-in-python-947b0b43bbe6

Use requirement.txt in Phishing Domain Detection App to install compatible packages, then start the app:

pip install -r requirement.txt
python app.py

API Endpoint

POST request

localhost:5000/predict

or

urlphishingdetection.azurewebsites.net/predict
Send the URL as JSON:
{
  "url" : "https://www.example.com"
}
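The POST request above could be built from Python roughly as follows, using only the standard library. This is a sketch: the helper names build_predict_request and predict are hypothetical, and it assumes the endpoint accepts a JSON body with a Content-Type of application/json. The response format is not documented here, so the body is returned as raw text.

```python
import json
import urllib.request

# Hypothetical helper: builds the POST request without sending it.
def build_predict_request(url_to_check: str,
                          base: str = "http://localhost:5000") -> urllib.request.Request:
    body = json.dumps({"url": url_to_check}).encode("utf-8")
    return urllib.request.Request(
        f"{base}/predict",
        data=body,
        headers={"Content-Type": "application/json"},  # assumed to be accepted
        method="POST",
    )

# Hypothetical helper: sends the request and returns the raw response text.
def predict(url_to_check: str, base: str = "http://localhost:5000") -> str:
    req = build_predict_request(url_to_check, base)
    with urllib.request.urlopen(req, timeout=10) as resp:
        return resp.read().decode("utf-8")
```

With the app running locally, predict("https://www.example.com") sends the request. Passing base="https://urlphishingdetection.azurewebsites.net" would target the hosted endpoint instead, assuming it is served over HTTPS.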

GET request

localhost:5000/predict

or

urlphishingdetection.azurewebsites.net/fetch

For more details, refer to the GitHub repository for the project.
If you are building this project locally, you will need your own Cassandra secret, token, and client ID, along with a keyspace and table.
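One way to keep these credentials out of the source code is to read them from environment variables at startup. The sketch below is an assumption about how that could look: the variable names and the helper load_cassandra_settings are hypothetical and are not taken from the project.

```python
import os

# Hypothetical variable names: adjust to whatever your local setup uses.
REQUIRED_VARS = (
    "CASSANDRA_CLIENT_ID",
    "CASSANDRA_SECRET",
    "CASSANDRA_TOKEN",
    "CASSANDRA_KEYSPACE",
    "CASSANDRA_TABLE",
)

def load_cassandra_settings(env=None) -> dict:
    # Allow a plain dict to be passed in, which makes the helper easy to test.
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        # Fail early with a clear message instead of at the first query.
        raise RuntimeError("Missing Cassandra settings: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_VARS}
```

Calling load_cassandra_settings() before connecting surfaces any missing value immediately, rather than as a connection error later on.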