Select a pre-trained model or fine-tune a sentiment analysis model. #2329
I think we can go with DistilBERT, which is maintained by Hugging Face and has many advantages over the models generally used for sentiment analysis.

DistilBERT: The DistilBERT model was proposed in the blog post "Smaller, faster, cheaper, lighter: Introducing DistilBERT, a distilled version of BERT" and the paper "DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter". DistilBERT is a small, fast, cheap and light Transformer model trained by distilling BERT base. It has 40% fewer parameters than google-bert/bert-base-uncased and runs 60% faster, while preserving over 95% of BERT's performance as measured on the GLUE language understanding benchmark.

BERT: Google developed BERT as a bidirectional transformer model that examines words in text by considering both left-to-right and right-to-left context. It helps computer systems understand text, as opposed to generating text, which is what GPT models are made to do. BERT excels at NLU tasks, including sentiment analysis, and is well suited to applications like Google Search and customer-feedback analysis.

How is BERT different from GPT? GPT models differ from BERT in both their objectives and their use cases. GPT models are forms of generative AI that produce original text and other content. They are also well suited to summarizing long or hard-to-interpret text.
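For reference, here is a minimal sketch of using DistilBERT for sentiment analysis out of the box with the `transformers` library (the checkpoint below is the standard SST-2 fine-tuned DistilBERT on the Hugging Face Hub; the example sentence is made up):

```python
# Off-the-shelf DistilBERT sentiment inference via the transformers pipeline.
from transformers import pipeline

classifier = pipeline(
    "sentiment-analysis",
    model="distilbert-base-uncased-finetuned-sst-2-english",
)

# Hypothetical bug-report text, just to show the output shape.
print(classifier("The login page crashes every time I submit the form."))
# -> [{'label': 'NEGATIVE', 'score': 0.99...}]
```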
I have two issues with this. First, it requires a powerful PC or a premium server to use and train the model, as free tiers can't handle the load. Recommended specifications for better performance:
The second issue is: why use something complex when a simple machine learning approach could suffice? The functionality is very basic, and with just logistic regression I achieved 92% accuracy. By stacking multiple machine learning algorithms, the accuracy could easily reach 98% or even higher. Please share your thoughts on this.
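For context, a rough sketch of the kind of baseline I mean (assuming a labeled CSV with `text` and `label` columns; the file name and column names are illustrative, not the actual dataset):

```python
# TF-IDF + logistic regression sentiment baseline (scikit-learn).
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

df = pd.read_csv("sentiment.csv")  # hypothetical dataset path
X_train, X_test, y_train, y_test = train_test_split(
    df["text"], df["label"], test_size=0.2, random_state=42
)

model = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2), min_df=2),  # unigrams + bigrams
    LogisticRegression(max_iter=1000),
)
model.fit(X_train, y_train)
print("accuracy:", accuracy_score(y_test, model.predict(X_test)))
```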
This is something we should definitely look into. As for the training part, Google Colab does provide the resources to train the model. I trained a model on this and was able to achieve around 90% accuracy, though the accuracy depends entirely on the dataset selected. DistilBERT is able to understand the context of a sentence, so if a new word or arrangement of words is encountered it can still make sense of it, which is not possible with a regression model. Also, once trained, you don't need to retrain the model on every request. I think we can see if we can get a hosting platform for a minimal cost, since we would have to host the model somewhere anyway, but I guess we can try hosting it on the same server where the backend is currently hosted. Would definitely like the mentors' opinion on this though.
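A minimal fine-tuning sketch of what I ran on Colab, using the `datasets`/`transformers` stack. IMDb is used here only as a stand-in dataset since our own dataset is still undecided; the subset sizes and hyperparameters are illustrative:

```python
# Fine-tune DistilBERT for binary sentiment classification.
from datasets import load_dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

dataset = load_dataset("imdb")  # stand-in for our eventual dataset
tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True,
                     padding="max_length", max_length=256)

tokenized = dataset.map(tokenize, batched=True)
model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased", num_labels=2
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="out", num_train_epochs=1,
                           per_device_train_batch_size=16),
    train_dataset=tokenized["train"].shuffle(seed=42).select(range(2000)),
    eval_dataset=tokenized["test"].select(range(500)),
)
trainer.train()

# Save once; at serving time just load the saved weights, no retraining.
model.save_pretrained("sentiment-distilbert")
tokenizer.save_pretrained("sentiment-distilbert")
```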
True, running an LLM just for simpler tasks will consume a lot of resources. Try TensorFlow or pre-built traditional models on GitHub.
As for the update: I increased the dataset to almost 5k examples, and I think focusing on that is really important. Beyond that, I tried stacking ensemble learning (one of the three common ways to combine multiple models) for this, and if we use the right base models I think we can do a lot better and wouldn't need to buy servers or anything. Training the model only needs to be done once here too; we can use the joblib library to save the model and reuse it, as in the sketch below.
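Here is a rough sketch of the stacking idea (the base models, toy data, and file names are illustrative; swap in the real labeled bug reports):

```python
# Stacking ensemble over TF-IDF features, saved/reloaded with joblib.
import joblib
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

stack = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),
    StackingClassifier(
        estimators=[
            ("nb", MultinomialNB()),
            ("svm", LinearSVC()),
            ("rf", RandomForestClassifier(n_estimators=200)),
        ],
        final_estimator=LogisticRegression(max_iter=1000),
        cv=2,  # tiny toy data below; raise for a real dataset
    ),
)

# Toy examples just to make the sketch runnable end to end.
texts = [
    "great app, works perfectly", "love the new dashboard",
    "fast and reliable", "clean interface, nice work",
    "crashes on login every time", "terrible, lost all my data",
    "the report form is broken", "slow and full of bugs",
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

stack.fit(texts, labels)
joblib.dump(stack, "sentiment_stack.joblib")    # train once, save to disk
model = joblib.load("sentiment_stack.joblib")   # reload instead of retraining
print(model.predict(["app keeps freezing"]))
```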
I was trying to create a dataset out of the current issues on the server, but there are some problems with this:
So the problem with the dataset still remains the same. @DonnieBLT, what should we do regarding this?
If you want to use that approach, then the best idea would be web scraping, but not directly the way you're doing it.
How about we add the labeling option, as I said, and once we have a dataset of about 2k we can use it. We would also store future bugs and add them to the dataset, and once we find the dataset and our model good enough we can pull in earlier bugs too, which would give us a big dataset. Not sure if this is the best approach.
I would have been in favor of this, but currently the bugs being reported are mostly by anonymous users, and mostly about the BLT app itself. I'm also not sure of the real traffic on the application currently. We also can't perform web scraping, because some sites have anti-scraping protections that can break the scraper. We can't rely only on the bugs reported on BLT for the dataset; we need other sources too. I tried looking for such a dataset on Kaggle too, but with no success.
There are no suitable datasets online, and if that's the case then we will have to use GPT for it, at least for now. One thing you can look into is Jira; if you can find a dataset from it, that would be more than enough. (Research what Jira is first.)
I have worked with Jira before but didn't know that it provided a dataset too. Thanks for the info, but I could only find this.
I don't think they share any dataset officially. It was just an idea; if we could get our hands on any dataset, it would be helpful.
I think we can close this issue since we are going with GPT to generate the fields and other models. |