https://github.com/melanie-t/twitter-language-detection
This project applies Naive Bayes classification to a natural language processing task: detecting the language of tweets, chosen from a pre-specified list, using variations of n-gram models. The supported languages are:
- Basque (eu)
- Catalan (ca)
- Galician (gl)
- Spanish (es)
- English (en)
- Portuguese (pt)
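
To make the approach concrete, here is a minimal sketch of Naive Bayes language detection over character n-grams. It is not the code in this repository: the names `char_ngrams` and `NgramNaiveBayes`, the bigram default, the add-delta smoothing value, and the toy training sentences are all assumptions chosen for illustration.

```python
import math
from collections import Counter, defaultdict


def char_ngrams(text, n):
    """Return the overlapping character n-grams of a lowercased string."""
    text = text.lower()
    return [text[i:i + n] for i in range(len(text) - n + 1)]


class NgramNaiveBayes:
    """Toy multinomial Naive Bayes over character n-grams (hypothetical, illustrative only)."""

    def __init__(self, n=2, delta=0.5):
        self.n = n                          # n-gram size (assumed default: bigrams)
        self.delta = delta                  # add-delta smoothing value (assumed)
        self.counts = defaultdict(Counter)  # language -> n-gram counts
        self.doc_counts = Counter()         # language -> number of training tweets
        self.vocab = set()                  # every n-gram seen during training

    def fit(self, samples):
        """Train on an iterable of (language_code, tweet_text) pairs."""
        for lang, text in samples:
            self.doc_counts[lang] += 1
            grams = char_ngrams(text, self.n)
            self.counts[lang].update(grams)
            self.vocab.update(grams)

    def score(self, lang, text):
        """Return log10 P(lang) plus the sum of log10 P(ngram | lang)."""
        total_docs = sum(self.doc_counts.values())
        log_prob = math.log10(self.doc_counts[lang] / total_docs)
        denom = sum(self.counts[lang].values()) + self.delta * len(self.vocab)
        for gram in char_ngrams(text, self.n):
            log_prob += math.log10((self.counts[lang][gram] + self.delta) / denom)
        return log_prob

    def predict(self, text):
        """Return the language code with the highest score."""
        return max(self.doc_counts, key=lambda lang: self.score(lang, text))


# Toy training data (made up for this example, not taken from the project).
model = NgramNaiveBayes(n=2)
model.fit([
    ("en", "the weather is nice today"),
    ("es", "el tiempo es muy bueno hoy"),
])
print(model.predict("the weather"))
```

Summing log-probabilities instead of multiplying raw probabilities avoids numeric underflow on longer tweets, and the smoothing term keeps an unseen n-gram from driving a language's score to negative infinity.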
- Python version 3.7+
- Required Python packages: numpy
- Clone the Git repository, or download the project as a ZIP file and extract the folder
- Open the folder (twitter-language-detection) as a Python project with your choice of IDE
- Ensure that your Python interpreter is set to Python 3.7 or later
- Set the working directory to twitter-language-detection/src
- Run Main.py
- Enter the absolute path to the test file
- The trace and evaluation files will be saved in src/output