Feature Extraction / Additional Data #2

Open
wgmueller1 opened this issue Dec 30, 2016 · 4 comments

Comments

@wgmueller1
Collaborator

We need to identify features of interest and extract them from each article. This may necessitate bringing in additional data. For example, Impact Factor may be useful. I have access to Web of Science and can download the impact factors for each year, but I first need to identify the publication date of each article so that each one can be matched to the right year.
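To make the matching step concrete, here is a rough sketch of looking up an impact factor by journal and publication year. Everything in it is an assumption for illustration: the table layout, the `normalize_journal` and `impact_factor_for` helpers, and the journal names and values, which are placeholders rather than real impact factors. The actual Web of Science export will likely need its own parsing.

```python
from typing import Optional

# Hypothetical impact-factor table keyed by (journal name, year).
# The values are placeholders, NOT real impact factors.
IMPACT_FACTORS = {
    ("Example Journal A", 2014): 1.0,
    ("Example Journal A", 2015): 2.0,
    ("Example Journal B", 2015): 3.0,
}


def normalize_journal(name: str) -> str:
    """Collapse case and whitespace so minor spelling differences still match."""
    return " ".join(name.lower().split())


# Build the lookup once, using normalized journal names as keys.
_LOOKUP = {(normalize_journal(j), y): v for (j, y), v in IMPACT_FACTORS.items()}


def impact_factor_for(journal: str, pub_year: int) -> Optional[float]:
    """Return the impact factor for the journal in the publication year, or None."""
    return _LOOKUP.get((normalize_journal(journal), pub_year))


print(impact_factor_for("example  journal a", 2015))  # 2.0
print(impact_factor_for("Example Journal B", 2014))   # None
```

Returning `None` for a missing (journal, year) pair keeps unmatched articles visible instead of silently assigning them a default value.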

What other information / features are people interested in?

@souravsingh
Collaborator

We could add the subject of the research, i.e. the domain in which the research was conducted.

@wgmueller1
Collaborator Author

@souravsingh do you have any ideas for how to get this? We would probably need some additional work here. For the PubMed dataset, we have the journal name, but we don't have categories or keywords (that I know of).

@souravsingh
Collaborator

I think PubMed records contain a tag called MeSH Major Topic; we could use that.
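As a minimal sketch of what that could look like, assuming our records are available as PubMed XML: MeSH headings appear as `MeshHeading` elements, and the `MajorTopicYN="Y"` attribute on a `DescriptorName` or `QualifierName` marks a major topic. The `major_topics` helper and the sample record (with made-up topic names) are hypothetical and only there to show the idea.

```python
import xml.etree.ElementTree as ET

# Made-up record that mimics the PubMed XML layout; topic names are placeholders.
SAMPLE = """
<PubmedArticle>
  <MedlineCitation>
    <PMID>0</PMID>
    <MeshHeadingList>
      <MeshHeading>
        <DescriptorName MajorTopicYN="Y">Topic One</DescriptorName>
      </MeshHeading>
      <MeshHeading>
        <DescriptorName MajorTopicYN="N">Topic Two</DescriptorName>
        <QualifierName MajorTopicYN="Y">aspect</QualifierName>
      </MeshHeading>
      <MeshHeading>
        <DescriptorName MajorTopicYN="N">Topic Three</DescriptorName>
      </MeshHeading>
    </MeshHeadingList>
  </MedlineCitation>
</PubmedArticle>
"""


def major_topics(article):
    """Return descriptor names flagged as major on the descriptor or a qualifier."""
    topics = []
    for heading in article.iter("MeshHeading"):
        descriptor = heading.find("DescriptorName")
        if descriptor is None:
            continue
        # The major-topic flag can sit on the descriptor or on any qualifier.
        flags = [descriptor.get("MajorTopicYN")]
        flags += [q.get("MajorTopicYN") for q in heading.findall("QualifierName")]
        if "Y" in flags:
            topics.append(descriptor.text)
    return topics


article = ET.fromstring(SAMPLE.strip())
print(major_topics(article))  # ['Topic One', 'Topic Two']
```

Whether a heading that is major only through its qualifier should count is a judgment call; this sketch includes it.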

@wgmueller1
Collaborator Author

I'll work on a feature extractor and include the following (add to the list if you have other ideas).

  1. Journal Id
  2. Journal Impact Factor
  3. Major Topic
  4. Author Id
  5. Author Institutions
  6. Abstract
  7. Full text
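
A rough sketch of how the extractor could be organized around that list, assuming PubMed XML input. The `ArticleFeatures` container and `extract_features` function are hypothetical names, and the sample record is made up. Two assumptions to flag: the author's name is used as a stand-in for an author Id, and the impact factor and full text are left empty because they would have to come from sources outside the citation record.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ArticleFeatures:
    journal_id: Optional[str] = None
    journal_impact_factor: Optional[float] = None  # needs an external table
    major_topics: List[str] = field(default_factory=list)
    author_ids: List[str] = field(default_factory=list)  # names as stand-in ids
    author_institutions: List[str] = field(default_factory=list)
    abstract: Optional[str] = None
    full_text: Optional[str] = None  # not part of the citation record


def extract_features(article: ET.Element) -> ArticleFeatures:
    """Fill in the fields that can be read directly from one article record."""
    features = ArticleFeatures()
    features.journal_id = article.findtext(".//MedlineJournalInfo/NlmUniqueID")
    for descriptor in article.iter("DescriptorName"):
        if descriptor.get("MajorTopicYN") == "Y":
            features.major_topics.append(descriptor.text)
    for author in article.iter("Author"):
        last = author.findtext("LastName")
        fore = author.findtext("ForeName")
        if last:
            features.author_ids.append(f"{last}, {fore}" if fore else last)
        for aff in author.findall("AffiliationInfo/Affiliation"):
            # Keep each institution once, in order of first appearance.
            if aff.text and aff.text not in features.author_institutions:
                features.author_institutions.append(aff.text)
    parts = ["".join(node.itertext()).strip() for node in article.iter("AbstractText")]
    features.abstract = " ".join(p for p in parts if p) or None
    return features


# Made-up record for illustration only.
SAMPLE = """
<PubmedArticle>
  <MedlineCitation>
    <Article>
      <Abstract><AbstractText>Placeholder abstract.</AbstractText></Abstract>
      <AuthorList>
        <Author>
          <LastName>Doe</LastName><ForeName>Jane</ForeName>
          <AffiliationInfo><Affiliation>Example University</Affiliation></AffiliationInfo>
        </Author>
        <Author>
          <LastName>Roe</LastName><ForeName>Sam</ForeName>
          <AffiliationInfo><Affiliation>Example University</Affiliation></AffiliationInfo>
        </Author>
      </AuthorList>
    </Article>
    <MedlineJournalInfo><NlmUniqueID>0000000</NlmUniqueID></MedlineJournalInfo>
    <MeshHeadingList>
      <MeshHeading><DescriptorName MajorTopicYN="Y">Topic One</DescriptorName></MeshHeading>
    </MeshHeadingList>
  </MedlineCitation>
</PubmedArticle>
"""

features = extract_features(ET.fromstring(SAMPLE.strip()))
print(features.author_ids)           # ['Doe, Jane', 'Roe, Sam']
print(features.author_institutions)  # ['Example University']
```

Keeping all seven fields in one record, with `None` for anything not yet available, should make it easy to add the impact factor and full text later without changing the callers.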


2 participants