The Tagging System is a web application that automatically assigns tags to Stack Overflow questions based on their content.
- Improved discoverability of questions: When questions are tagged correctly, they are more likely to be found by users who are looking for help with similar problems
- Increased relevance of answers: When answers are tagged with the same tags as the questions they are answering, they are more likely to be relevant and helpful
- Reduced workload for moderators: With an autonomous tagging system in place, moderators can focus on other tasks, such as reviewing questions and answers
- Roopa Dharshini - Team Lead, ML Developer
- Harish Raghavendar - Cloud Developer
- Ujjwal Anand - Back-end Developer
- Hritesh Sinha - Front-end Developer
- Introduced Copy-to-Clipboard button: a small button that lets users copy the predicted tags to the clipboard. This makes it easy to take tags from the 'Tagging System' and include them when asking a question on the Stack Overflow website, so that the user can find a solution more easily
- Ensure you have Python 3.x installed on your system
- Clone this repository
- Navigate to the project directory
- Install the necessary modules
- Run the app from the command line
- Launch the app: find it on port 8000
python --version
git clone https://github.com/Rupa-Rd/SBSPS-Challenge-9913-Autonomous-Tagging-Of-Stack-Overflow-Questions.git
cd auto_tagging
pip install -r requirements.txt
python manage.py runserver
- Landing Page / Home Page
- Question & Title Input Page
- Tags Prediction Page
- Tags Copied to Clipboard
- Question & Title Input Page
- Tags Prediction Page
- The user provides the title and body of the question on the landing page of the "Tagging System App"
- The title and the body of the question are transformed into vectors before being fed to the trained model
- The app accesses the already-trained model in `model.py` as a pickle file in order to make predictions
- The predicted tags come back as a set of numeric arrays, which then undergo an inverse transformation to convert them into `string` values
- The predicted tags are then displayed on the web page, where the user can copy and use them
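
The steps above can be sketched end to end with scikit-learn. This is a minimal illustration under stated assumptions, not the project's actual `model.py`: the training data is made up, the helper `predict_tags` and the bundle layout are hypothetical, and `LogisticRegression` inside `OneVsRestClassifier` is just one of the models named in this README.

```python
import pickle

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier
from sklearn.preprocessing import MultiLabelBinarizer

# Tiny made-up training set (illustrative only, not the project's dataset).
questions = [
    "how to merge two pandas dataframes in python",
    "python list comprehension with condition",
    "django model foreign key migration error",
    "django template render python dictionary",
    "javascript array map filter reduce",
    "javascript fetch api promise not resolving",
]
tags = [
    ["python", "pandas"],
    ["python"],
    ["python", "django"],
    ["python", "django"],
    ["javascript"],
    ["javascript"],
]

# Text -> TF-IDF vectors, tag lists -> binary indicator matrix.
vectorizer = TfidfVectorizer()
binarizer = MultiLabelBinarizer()
X = vectorizer.fit_transform(questions)
Y = binarizer.fit_transform(tags)

# One binary classifier per tag.
model = OneVsRestClassifier(LogisticRegression())
model.fit(X, Y)

# Round trip through pickle, standing in for the saved model file.
blob = pickle.dumps({"vectorizer": vectorizer, "binarizer": binarizer, "model": model})
bundle = pickle.loads(blob)


def predict_tags(title, body, bundle):
    """Hypothetical helper: vectorize the input, predict, and map back to tag strings."""
    text = f"{title} {body}"
    features = bundle["vectorizer"].transform([text])
    indicator = bundle["model"].predict(features)  # numeric 0/1 array, one column per tag
    return list(bundle["binarizer"].inverse_transform(indicator)[0])


print(predict_tags("django migration error", "foreign key in my model", bundle))
```

With a training set this small the printed tags are not meaningful; the point is the order of operations: vectorize, predict a numeric array, then inverse-transform it into tag strings.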
- Model Preprocessing
- IBM Watson (NLP)
- Pandas, Numpy
- Scikit-learn
- Model Training
- Support Vector Machine Model
- Logistic Regression Model
- OneVsRest Classifier Model
- TF-IDF Vectorization
- Front-End
- HTML
- CSS
- JavaScript
- Back-End
- Django
- Python
- IBM Watson Studio: We used IBM Watson to pre-process the Stack Overflow dataset given to us. We used the 'NLP module' in IBM Watson Studio to perform the following steps:
- Remove punctuation (,.?/:""''*&)
- Remove stop words (a, an, not, ...)
- Replace contracted words (don't → do not, can't → cannot, ...)
- Drop the unwanted columns (Score, Date, Time)
- Check for and replace null values
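
As a rough illustration of the text-cleaning steps listed above, here is a plain-Python sketch. It does not use the IBM Watson NLP module; the function `clean_text` and the two small lookup tables are hypothetical stand-ins, and the column-dropping and null-handling steps are left out.

```python
import re
import string

# Hypothetical, deliberately tiny lookup tables; real lists would be much longer.
CONTRACTIONS = {"don't": "do not", "can't": "cannot", "won't": "will not"}
STOP_WORDS = {"a", "an", "the", "not", "is", "to", "in"}

# Character class matching any ASCII punctuation mark.
PUNCTUATION = re.compile(f"[{re.escape(string.punctuation)}]")


def clean_text(text):
    """Lowercase, expand contractions, strip punctuation, drop stop words."""
    text = text.lower()
    # Expand contractions first, while the apostrophes are still present.
    for short, full in CONTRACTIONS.items():
        text = text.replace(short, full)
    # Replace each punctuation mark with a space so words stay separated.
    text = PUNCTUATION.sub(" ", text)
    # Keep only the words that are not stop words.
    words = [word for word in text.split() if word not in STOP_WORDS]
    return " ".join(words)


print(clean_text("I can't open a file in Python!"))
```

Contractions are expanded before punctuation is removed; doing it the other way round would turn "can't" into "can t" and the replacement would never match.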
- Preparing to deploy the app in the cloud: Deploying the app in the cloud will make it available to everyone, regardless of their location
- Improving the predictive performance of the model: The current model has a 60% prediction rate; with more data, it should be able to predict a larger number of tags with better accuracy
- Click here to watch