abhi1geeks/SkimLit

 
 

SkimLit

An NLP model that classifies each sentence of an abstract by the role it plays (e.g. objective, methods, results), so that researchers can skim the literature and dive deeper when necessary.

Try the demo: Web App

Dataset Used

PubMed 200k RCT dataset

Some miscellaneous information:

  • PubMed 20k is a subset of PubMed 200k, i.e., any abstract present in PubMed 20k is also present in PubMed 200k.

  • PubMed_200k_RCT is the same as PubMed_200k_RCT_numbers_replaced_with_at_sign, except that in the latter all numbers have been replaced by @. The same holds for PubMed_20k_RCT vs. PubMed_20k_RCT_numbers_replaced_with_at_sign.

  • Count Plot
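The dataset files can be read into per-sentence records before any modelling. The sketch below assumes the layout used by the public PubMed RCT text files: an abstract starts with a `###<id>` line, each following line is `LABEL<TAB>sentence`, and a blank line ends the abstract. The function name `parse_rct_abstracts`, the record keys, and the sample text are hypothetical and are not taken from the notebooks.

```python
from typing import Dict, List


def parse_rct_abstracts(raw_text: str) -> List[Dict[str, object]]:
    """Turn raw PubMed RCT text into one record per sentence.

    Assumed layout: '###<id>' opens an abstract, each sentence line is
    'LABEL<TAB>text', and a blank line closes the abstract.
    """
    samples: List[Dict[str, object]] = []
    pending: List[str] = []  # sentence lines of the abstract being read

    def flush() -> None:
        # Emit one record per sentence of the finished abstract.
        total = len(pending)
        for idx, entry in enumerate(pending):
            label, _, text = entry.partition("\t")
            samples.append(
                {
                    "target": label,
                    "text": text,
                    "line_number": idx,    # zero-based position in the abstract
                    "total_lines": total,  # number of sentences in the abstract
                }
            )
        pending.clear()

    for line in raw_text.splitlines():
        line = line.strip()
        if line.startswith("###"):
            flush()  # a new ID line also closes any unfinished abstract
        elif not line:
            flush()
        else:
            pending.append(line)
    flush()  # handle a file that does not end with a blank line
    return samples


# Made-up sample in the "numbers replaced with @" style.
sample = "\n".join(
    [
        "###00000000",
        "OBJECTIVE\tTo test whether the treatment lowers pain over @ weeks .",
        "METHODS\tA total of @ patients were randomized .",
        "RESULTS\tPain scores fell by @ % in the treatment group .",
        "",
    ]
)

for record in parse_rct_abstracts(sample):
    print(record["line_number"], record["target"], record["text"])
```

Keeping `line_number` and `total_lines` on every record is a design choice that makes the sentence position available later as a model feature.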

Models Tried

All the notebooks are available here

  • Naive Bayes model -> 72% accuracy
  • Conv1D model -> 78% accuracy
  • Model using pretrained token embeddings (Universal Sentence Encoder) -> 75% accuracy
  • Conv1D model using character-level embeddings -> 73% accuracy
  • Model with both token and character-level embeddings -> 76% accuracy
  • Model with token, character and position-level embeddings ( https://arxiv.org/pdf/1612.05251.pdf ) -> 81% accuracy
  • Model described in this paper with BERT embeddings -> 88% accuracy

Final Results

Results of all Models

Best Performing Model

Final Outputs

Packages Used

  • TensorFlow
  • tensorflow_text
  • tensorflow_hub
  • scikit-learn (sklearn)
  • Matplotlib
  • NumPy
  • pandas
  • spaCy

Contact Me

GitHub  LinkedIn  Instagram
