rake4j

A Java rewrite of Python RAKE.

An implementation of the Rapid Automatic Keyword Extraction (RAKE) algorithm as described in: Rose, S., Engel, D., Cramer, N., & Cowley, W. (2010). Automatic Keyword Extraction from Individual Documents. In M. W. Berry & J. Kogan (Eds.), Text Mining: Applications and Theory. John Wiley & Sons.

Run

Sample

Normal run

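        // text holds the document content to analyze
        Document doc = new Document(text);
        RakeAnalyzer rake = new RakeAnalyzer();
        rake.loadDocument(doc);
        // extract keywords without offset information
        rake.runWithoutOffset();
        // print the extracted terms as a list
        System.out.println(doc.termListToString());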

Run with offset information and stemming

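        // text holds the document content to analyze
        Document doc = new Document(text);
        RakeAnalyzer rake = new RakeAnalyzer();
        rake.loadDocument(doc);
        // extract keywords with offset information and stemming
        rake.run();
        // print the extracted terms as a map
        System.out.println(doc.termMapToString());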

Features

Recognizes keywords with the stop-word-based RAKE algorithm.

Joins adjoining keywords to recognize phrases with interior stop words, such as "axis of evil".
KStemming algorithm ported from Lucene, which stems "university students" to "university student".
Builds an index of keywords with term frequency (tf) and document frequency (df).
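
To make the stop-word-based recognition concrete, here is a minimal sketch of the core RAKE scoring step from the paper: split the text into candidate phrases at punctuation and stop words, score each word as degree/frequency, and sum the word scores per phrase. It is not rake4j's actual code. The class name `RakeSketch`, the method names, and the tiny stop-word list are hypothetical, chosen only for illustration, and the sketch leaves out adjoining keywords, stemming, and offsets.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration class; not part of rake4j.
class RakeSketch {

    /** Splits text into candidate phrases at punctuation and stop words. */
    public static List<List<String>> candidatePhrases(String text, Set<String> stopWords) {
        List<List<String>> phrases = new ArrayList<>();
        // Simplification: anything that is not an ASCII letter, digit, or
        // whitespace is treated as a phrase delimiter.
        for (String fragment : text.toLowerCase().split("[^a-z0-9\\s]+")) {
            List<String> current = new ArrayList<>();
            for (String word : fragment.trim().split("\\s+")) {
                if (word.isEmpty() || stopWords.contains(word)) {
                    // A stop word (or an empty token) closes the current phrase.
                    if (!current.isEmpty()) {
                        phrases.add(current);
                        current = new ArrayList<>();
                    }
                } else {
                    current.add(word);
                }
            }
            if (!current.isEmpty()) {
                phrases.add(current);
            }
        }
        return phrases;
    }

    /** Scores each candidate phrase as the sum of deg(w) / freq(w) over its words. */
    public static Map<String, Double> scorePhrases(String text, Set<String> stopWords) {
        List<List<String>> phrases = candidatePhrases(text, stopWords);
        Map<String, Integer> freq = new HashMap<>();
        Map<String, Integer> degree = new HashMap<>();
        for (List<String> phrase : phrases) {
            for (String word : phrase) {
                freq.merge(word, 1, Integer::sum);
                // Each occurrence co-occurs with every word of its phrase, itself included.
                degree.merge(word, phrase.size(), Integer::sum);
            }
        }
        // Keep phrases in order of first appearance.
        Map<String, Double> scores = new LinkedHashMap<>();
        for (List<String> phrase : phrases) {
            double score = 0.0;
            for (String word : phrase) {
                score += (double) degree.get(word) / freq.get(word);
            }
            scores.put(String.join(" ", phrase), score);
        }
        return scores;
    }

    public static void main(String[] args) {
        // Toy stop-word list, assumed for this example only.
        Set<String> stopWords = new HashSet<>(Arrays.asList("is", "the", "of"));
        String text = "Keyword extraction is the task of finding keywords. "
                + "Keyword extraction is useful.";
        scorePhrases(text, stopWords)
                .forEach((phrase, score) -> System.out.println(phrase + " -> " + score));
    }
}
```

With the toy stop-word list above, the two-word phrases "keyword extraction" and "finding keywords" each score 4.0, while the single words "task" and "useful" score 1.0. This shows how RAKE favors longer phrases whose words co-occur.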

Dependencies

The pom.xml requires one additional custom Maven module as a dependency:

        <dependency>
            <groupId>io.deepreader.java.commons</groupId>
            <artifactId>commons-util</artifactId>
            <version>1.0-SNAPSHOT</version>
        </dependency>

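You can get the module manually by cloning its repository:

        git clone https://github.com/idf/commons-util

Because the dependency is a SNAPSHOT version, it presumably has to be built and installed into your local Maven repository (for example with mvn install) before rake4j will compile.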

References

Python RAKE
Python RAKE (forked)
Java RAKE
