
A package to extract important keywords from a document using TF-IDF technique


agtabesh/keyword-extractor


@agtabesh/keyword-extractor


TF-IDF is a statistical measure used to evaluate how important a word is to a document in a collection or corpus. This package uses TF-IDF to extract important keywords from a document.
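To make the measure concrete, here is a minimal sketch of one common TF-IDF variant: term frequency is a token's share of the document, and inverse document frequency is the natural log of the corpus size over the number of documents containing the term. This is not taken from the package's source, which may tokenize and weight terms differently; `tokenize`, `termFrequency`, and `tfidf` are hypothetical helper names used only for illustration.

```javascript
// Illustrative TF-IDF only; NOT the package's actual implementation.
// tokenize, termFrequency, and tfidf are hypothetical helper names.

// Lowercase the text and split it into alphanumeric tokens.
function tokenize(text) {
  return text.toLowerCase().match(/[a-z0-9]+/g) || [];
}

// Relative frequency of each term within a single document.
function termFrequency(tokens) {
  const counts = new Map();
  for (const token of tokens) {
    counts.set(token, (counts.get(token) || 0) + 1);
  }
  const tf = new Map();
  for (const [term, count] of counts) {
    tf.set(term, count / tokens.length);
  }
  return tf;
}

// Score every term of corpus[docIndex] and return them sorted by score, highest first.
function tfidf(corpus, docIndex) {
  const tokenized = corpus.map(tokenize);
  const tf = termFrequency(tokenized[docIndex]);
  const scores = [];
  for (const [term, frequency] of tf) {
    // Number of documents that contain the term at least once.
    const docCount = tokenized.filter((tokens) => tokens.includes(term)).length;
    // Plain logarithmic IDF: 0 when the term appears in every document.
    const idf = Math.log(corpus.length / docCount);
    scores.push({ term, score: frequency * idf });
  }
  return scores.sort((a, b) => b.score - a.score);
}

const corpus = [
  "the cat sat on the mat",
  "the dog sat on the log",
  "the cat chased the dog",
];
console.log(tfidf(corpus, 0).slice(0, 3));
```

In this toy corpus, "mat" ranks first for the first document because no other document contains it, while "the" scores zero because it appears in all three.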

Install

$ npm install @agtabesh/keyword-extractor --save

Usage

const { EnglishTokenizer, KeywordExtractor } = require("@agtabesh/keyword-extractor")
const documents = [
  'Austria will press ahead with a proposed tax on internet giants after plans for an European Union-wide levy fell through this week, Finance Minister Hartwig Loeger said on Friday.',
  'Facebook Inc said on Friday it would use artificial intelligence to combat the spread of intimate photos shared without peoples permission, sometimes called "revenge porn," on its social networks.',
  'Tesla Inc unveiled its Model Y electric sports utility vehicle on Thursday evening in California, promising a much-awaited crossover that will face competition from European car makers rolling out their own electric rivals.'
]

// Create the extractor and attach an English tokenizer
const tokenizer = new EnglishTokenizer()
const keywordExtractor = new KeywordExtractor()
keywordExtractor.setTokenizer(tokenizer)

// Add every document to the extractor, keyed by its array index
documents.forEach((text, i) => {
  keywordExtractor.addDocument(i, text)
})

// Pick one document at random and extract up to 10 keywords, sorted by score
const randomDocument = documents[Math.floor(Math.random() * documents.length)]
const keywords = keywordExtractor.extractKeywords(randomDocument, {
  sortByScore: true,
  limit: 10
})

console.log(keywords)

All contributions are welcome.
