chocosando/PyShortTextCategorization

 
 

Short Text Categorization in Python

This repository is a collection of algorithms for multi-class classification of short texts using Python. Modules are backward compatible unless otherwise specified. Suggestions are welcome.

To install it, run pip in a console:

>>> pip install -U shorttext

or, if you want the latest code that has not yet been released on PyPI, type

>>> pip install -U git+https://github.com/stephenhky/PyShortTextCategorization@master

Developers are advised to make sure that Keras >=2 is installed. Users are advised to install a backend, TensorFlow (preferred) or Theano, in advance.
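The Keras version requirement above can be checked before installing. The following is only a minimal sketch, not part of shorttext: the helper names (`parse_version`, `meets_minimum`, `check_package`) are hypothetical, the distribution name `keras` is an assumption, and it relies on `importlib.metadata`, which requires Python 3.8 or later.

```python
from importlib import metadata  # standard library, Python 3.8+


def parse_version(text):
    """Turn '2.1.3' into (2, 1, 3); stops at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Return True if the installed version string is at least `minimum`."""
    return parse_version(installed) >= parse_version(minimum)


def check_package(name, minimum):
    """Return a status line for a distribution name (assumed) and a minimum version."""
    try:
        installed = metadata.version(name)
    except metadata.PackageNotFoundError:
        return f"{name}: not installed"
    status = "ok" if meets_minimum(installed, minimum) else f"older than {minimum}"
    return f"{name} {installed}: {status}"


if __name__ == "__main__":
    # "keras" is the assumed distribution name; adjust if yours differs.
    print(check_package("keras", "2"))
```

The comparison only looks at the leading numeric components, so pre-release suffixes such as `rc1` are ignored; a stricter check would need a full version parser.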

Before using the package, check that the spaCy language model has been installed or updated by running:

>>> python -m spacy download en

See the tutorial for how to use the package.

Issues

To report an issue, go to the Issues tab of the GitHub page and start a thread. Developers are welcome to submit pull requests to fix any errors.

Useful Links

News

  • 02/27/2018: shorttext 0.6.0 released.
  • 01/19/2018: shorttext 0.5.11 released.
  • 01/15/2018: shorttext 0.5.10 released.
  • 12/14/2017: shorttext 0.5.9 released.
  • 11/08/2017: shorttext 0.5.8 released.
  • 10/27/2017: shorttext 0.5.7 released.
  • 10/17/2017: shorttext 0.5.6 released.
  • 09/28/2017: shorttext 0.5.5 released.
  • 09/08/2017: shorttext 0.5.4 released.
  • 09/02/2017: end of GSoC project. (Report)
  • 08/22/2017: shorttext 0.5.1 released.
  • 07/28/2017: shorttext 0.4.1 released.
  • 07/26/2017: shorttext 0.4.0 released.
  • 06/16/2017: shorttext 0.3.8 released.
  • 06/12/2017: shorttext 0.3.7 released.
  • 06/02/2017: shorttext 0.3.6 released.
  • 05/30/2017: GSoC project (Chinmaya Pancholi, with gensim)
  • 05/16/2017: shorttext 0.3.5 released.
  • 04/27/2017: shorttext 0.3.4 released.
  • 04/19/2017: shorttext 0.3.3 released.
  • 03/28/2017: shorttext 0.3.2 released.
  • 03/14/2017: shorttext 0.3.1 released.
  • 02/23/2017: shorttext 0.2.1 released.
  • 12/21/2016: shorttext 0.2.0 released.
  • 11/25/2016: shorttext 0.1.2 released.
  • 11/21/2016: shorttext 0.1.1 released.

Possible Future Updates

  • Spelling corrections and fuzzy logic;
  • Gradually replacing Keras with direct TensorFlow or the Keras package within TensorFlow;
  • Jupyter notebooks as tutorials;
  • Python 3 compatibility;
  • More neural networks;
  • More available corpora.

About

Various Algorithms for Short Text Categorization

Languages

  • Python 64.4%
  • C 33.5%
  • Makefile 2.1%