Add files via upload
RajKhanke authored Jul 20, 2024
1 parent a7a4f5b commit da8ceea
Showing 16 changed files with 15,459 additions and 0 deletions.
13,404 changes: 13,404 additions & 0 deletions stack_overflow_security_questions_analysis/IoT-Security-Dataset.csv

Large diffs are not rendered by default.

48 changes: 48 additions & 0 deletions stack_overflow_security_questions_analysis/app.py
@@ -0,0 +1,48 @@
import streamlit as st
import pandas as pd
import joblib
import re
from sklearn.feature_extraction.text import TfidfVectorizer

# Load the dataset
df = pd.read_csv('IoT-Security-Dataset.csv')

# Load the saved Random Forest model
rf_model_loaded = joblib.load('random_forest_model.pkl')

# Load and fit the TF-IDF vectorizer on the dataset
tfidf_vectorizer = TfidfVectorizer(max_features=5000)
tfidf_vectorizer.fit(df['Cleaned Sentence'])

# Function to preprocess the input text
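def preprocess_text(text):
    text = text.lower()                       # case-fold
    text = re.sub(r'\W', ' ', text)           # non-word characters -> space
    text = re.sub(r'\d', ' ', text)           # digits -> space
    text = re.sub(r'\s+[a-z]\s+', ' ', text)  # drop isolated single letters
    text = re.sub(r'\s+', ' ', text).strip()  # collapse whitespace
    return text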

# Function to predict if a question is security-related
def predict_security(question, model, vectorizer):
    clean_question = preprocess_text(question)
    question_tfidf = vectorizer.transform([clean_question])
    prediction = model.predict(question_tfidf)
    return prediction[0]

# Streamlit app
st.title("Security text Predictor")

st.write("Enter your question below to determine if it is related to security.")

user_question = st.text_area("Your Question")

if st.button("Predict"):
    if user_question.strip() != "":
        prediction = predict_security(user_question, rf_model_loaded, tfidf_vectorizer)
        if prediction == 0:
            st.success("This question is security-related.")
        else:
            st.info("This question is not security-related.")
    else:
        st.error("Please enter a question.")
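
To make the effect of the cleaning step concrete, here is a minimal standalone sketch that restates `preprocess_text` from `app.py` and runs it on a sample question. The sample string is a hypothetical input, not taken from the dataset.

```python
import re


def preprocess_text(text: str) -> str:
    """Same cleaning steps as app.py, restated so the example runs on its own."""
    text = text.lower()                        # case-fold
    text = re.sub(r"\W", " ", text)            # punctuation and other non-word characters -> space
    text = re.sub(r"\d", " ", text)            # digits -> space
    text = re.sub(r"\s+[a-z]\s+", " ", text)   # drop single letters surrounded by whitespace
    return re.sub(r"\s+", " ", text).strip()   # collapse runs of whitespace


sample = "How do I secure MQTT on port 8883?"  # hypothetical input
cleaned = preprocess_text(sample)
print(cleaned)  # how do secure mqtt on port
```

Digits and isolated single letters are discarded, so port numbers, version numbers, and the pronoun "I" never reach the TF-IDF vectorizer.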
42 changes: 42 additions & 0 deletions stack_overflow_security_questions_analysis/readme.md
@@ -0,0 +1,42 @@
# Stack Overflow IoT Security Question Analysis and Predictor

## Models Used

- Logistic Regression
- Random Forest
- SVM
- GBM

## Libraries Used

1. **joblib**: For saving and loading the trained model.
2. **plotly**: For interactive (zoomable) and 3D visualizations.
3. **Matplotlib**: For plotting and visualizing the analysis results.
4. **Pandas**: For loading and manipulating the dataset.
5. **NumPy**: For efficient numerical operations.
6. **Streamlit**: For building the web app GUI.
7. **scikit-learn**: For TF-IDF feature extraction (`TfidfVectorizer`, as imported in `app.py`).

## Download the Model from Google Drive

https://drive.google.com/file/d/12h_fU5WI3KQvXH_qG7RoIKnteceb2fLw/view?usp=sharing
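
`app.py` loads `random_forest_model.pkl` and `IoT-Security-Dataset.csv` by relative path, so both files need to be in the directory the app is started from. The following is a minimal sketch of a pre-flight check. The helper name `missing_files` is hypothetical, and it assumes the downloaded file is saved as `random_forest_model.pkl`.

```python
from pathlib import Path

# File names as referenced in app.py; the name of the downloaded file is an assumption.
REQUIRED_FILES = ("random_forest_model.pkl", "IoT-Security-Dataset.csv")


def missing_files(base_dir=".", required=REQUIRED_FILES):
    """Return the required file names that are not present in base_dir."""
    base = Path(base_dir)
    return [name for name in required if not (base / name).is_file()]


if __name__ == "__main__":
    missing = missing_files()
    if missing:
        print("Missing files:", ", ".join(missing))
    else:
        print("All required files found.")
```

Running this from the project directory before starting the app reports which files still need to be downloaded or moved.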

## How to Use

1. **Clone the Repository**:
```sh
git clone url_to_this_repository
```

2. **Install Dependencies**:
```sh
pip install -r requirements.txt
```

3. **Run the App**:
```sh
streamlit run app.py
```

4. **View Results**: The app predicts whether a text or question from Stack Overflow is security-related.

