SchNovel

Description

The repository contains the code for the paper "Evaluating and Enhancing Large Language Models for Novelty Assessment in Scholarly Publications." Each folder contains the basic file structure and code required to replicate the experiments and the SchNovel benchmark proposed in the paper.

Installation

Using Pip

Clone the project repository:

git clone https://github.com/ethannlin/schnovel
cd schnovel

[Optional] Create and activate a conda environment:

conda create -n myenv python=3.12
conda activate myenv

Install the required Python packages:
```
pip install -r requirements.txt
```

Using Conda

Clone the project repository:

git clone https://github.com/ethannlin/schnovel
cd schnovel

Create a conda environment and install dependencies:
```
conda env create -f environment.yaml -n myenv
```
Activate the conda environment:
```
conda activate myenv
```

Setting up

Download the labeled datasets

git clone https://huggingface.co/datasets/ethannlin/SchNovel

Setup .env file

Update the .env file to include your OpenAI API keys:

OPENAI_API_KEY = ""
OPENAI_ORG_ID = ""
OPENAI_PROJECT_ID = ""

Replace the empty strings ("") with your actual API key, organization ID, and project ID.

Setup RAG-Novelty

Navigate to rag-novelty folder.
```
cd rag-novelty
```
Update scripts/generate.py with the filepaths to the vector db data, the desired directory path, and database name.
- Running this script will create a vector database for the desired category.
```
python scripts/generate.py
```

How to run

Navigate to the project folder:
```
cd [$folder_name]
```
Update generate_batch.py, average_results.ipynb, and que_batch.ipynb with the desired category and file paths.
```
# example: replace with filepath to [CATEGORY]'s json dataset
FILEPATH = ""
```
Run generate_batch.py
- This will generate and store the batch files into the desired directory.
```
python generate_batch.py
```
Open que_batch.ipynb
- Follow instructions in the notebook to queue the batches either manually or automatically.
Retrieve results from OpenAI Batch API and run average_results.ipynb.

Name		Name	Last commit message	Last commit date
Latest commit History 2 Commits
cot		cot
llm-discussion		llm-discussion
other-llms		other-llms
rag-novelty		rag-novelty
self-consistency		self-consistency
self-reflection		self-reflection
two-shot		two-shot
zero-shot		zero-shot
.env		.env
.gitignore		.gitignore
LICENCE.md		LICENCE.md
README.md		README.md
environment.yaml		environment.yaml
requirements.txt		requirements.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

SchNovel

Description

Installation

Using Pip

Using Conda

Setting up

Download the labeled datasets

Setup .env file

Setup RAG-Novelty

How to run

About

Releases

Packages

Languages

License

ethannlin/SchNovel

Folders and files

Latest commit

History

Repository files navigation

SchNovel

Description

Installation

Using Pip

Using Conda

Setting up

Download the labeled datasets

Setup .env file

Setup RAG-Novelty

How to run

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages