OpenAI version upgrade (latest version) #56

Open
wants to merge 6 commits into base: main

Sh1gechan

Updated the openai version in pyproject.toml to the latest version.

@Sh1gechan
Author

Sh1gechan commented Jun 24, 2024

Overview
To make repeated evaluation runs easier, this PR adds a --num_answers_per_question option to the gen_model_answer.py and gen_judgment.py scripts. It also updates llm_judge/common.py to work with the latest OpenAI package and adds usage instructions to the README.
Usage example

# Generate model answers
$ python llm_judge/gen_model_answer.py --num_answers_per_question 5
# Judge the answers
$ python llm_judge/gen_judgment.py --num_answers_per_question 5
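
For illustration, an option like this could be registered with argparse roughly as follows. This is only a sketch: the `build_parser` helper, the `int` type, the `None` default, and the help text are assumptions, not the PR's actual code.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser showing how the new option might be declared."""
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--num_answers_per_question",
        type=int,
        default=None,  # assumption: None means "do not limit"
        help="Number of answers to generate or judge per question.",
    )
    return parser


# Parse the same flag that the usage example passes on the command line.
args = build_parser().parse_args(["--num_answers_per_question", "5"])
print(args.num_answers_per_question)
```

Leaving the default falsy keeps the existing behavior when the flag is omitted, which matches the `if num_answers_per_question:` guard in the diff.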

@hkiyomaru hkiyomaru self-requested a review June 25, 2024 04:01
Member

Why is configs/ being deleted?

Author

I had committed something that was not needed, so I removed it.

@@ -9,17 +9,20 @@
from typing import Optional, Union

import openai
from openai import AzureOpenAI
Member

Please make this implementation work with the OpenAI API as well, not only the Azure API.

Author

I currently only have access to the Azure API, so I cannot verify it against the OpenAI API. Is that all right?

Member

In that case, we will implement and test this part on our side.

Author

Understood.
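
One possible way to support both backends, sketched under assumptions: the helper names `select_backend` and `build_client` are hypothetical, and choosing Azure whenever `AZURE_OPENAI_ENDPOINT` is set is an illustrative policy, not what the maintainers implemented. `openai.OpenAI` and `openai.AzureOpenAI` are the client classes in openai 1.x.

```python
import os
from typing import Mapping, Optional


def select_backend(env: Mapping[str, str]) -> str:
    """Return "azure" when an Azure endpoint is configured, otherwise "openai"."""
    return "azure" if env.get("AZURE_OPENAI_ENDPOINT") else "openai"


def build_client(env: Optional[Mapping[str, str]] = None):
    """Create a client for whichever backend the environment describes."""
    env = os.environ if env is None else env
    # Imported lazily so select_backend works without the openai package.
    import openai

    if select_backend(env) == "azure":
        return openai.AzureOpenAI(
            azure_endpoint=env["AZURE_OPENAI_ENDPOINT"],
            api_key=env["AZURE_OPENAI_API_KEY"],
            api_version=env["OPENAI_API_VERSION"],
        )
    return openai.OpenAI(api_key=env["OPENAI_API_KEY"])
```

Both client classes expose the same `chat.completions.create` interface, so the calling code would not need to branch on the backend.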

@@ -26,7 +26,6 @@
"generic": 0.1,
}


Member

We use a linter and a formatter for quality control. Please run the following commands.

$ pre-commit install  # from now on, the linter and formatter run automatically on every commit
$ pre-commit run -a  # applies the linter and formatter to all files in the repository

@@ -132,7 +140,7 @@ def make_match_groups_pairwise(
parser.add_argument(
"--judge-model",
type=str,
-default="gpt-4",
+default="gpt-4-0613",
Member

Please keep the default value of judge-model as gpt-4. This PR is for supporting multiple evaluation runs, so please do not include unrelated changes.

Author

Sorry, I pushed this with the settings from my own environment left in. I have fixed it.

@@ -63,6 +65,8 @@ def make_match_groups_single(
ref_answer=ref_answer,
)
)
if num_answers_per_question:
matches = matches[:num_answers_per_question]
Member

This implementation is incorrect. Please select num_answers_per_question answers for each question.

@@ -111,6 +117,8 @@ def make_match_groups_pairwise(
ref_answer=ref_answer,
)
)
if num_answers_per_question:
matches = matches[:num_answers_per_question]
Member

This implementation is incorrect. Please select num_answers_per_question answers for each question.
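
A minimal sketch of the per-question selection the reviewer asks for, in contrast to `matches[:num_answers_per_question]`, which truncates the whole list. The helper name `limit_answers_per_question` and the use of plain dicts with a `question_id` key are assumptions for illustration; the repository's match objects may be shaped differently.

```python
from collections import defaultdict
from typing import Dict, List, Optional


def limit_answers_per_question(
    matches: List[Dict], num_answers_per_question: Optional[int]
) -> List[Dict]:
    """Keep at most num_answers_per_question matches for each question."""
    if not num_answers_per_question:
        return list(matches)  # no limit requested
    kept: List[Dict] = []
    seen: Dict[object, int] = defaultdict(int)  # matches kept so far per question
    for match in matches:
        qid = match["question_id"]
        if seen[qid] < num_answers_per_question:
            kept.append(match)
            seen[qid] += 1
    return kept


# Hypothetical data: three answers for question 1, two for question 2.
matches = [
    {"question_id": 1, "answer": "a"},
    {"question_id": 1, "answer": "b"},
    {"question_id": 2, "answer": "c"},
    {"question_id": 1, "answer": "d"},
    {"question_id": 2, "answer": "e"},
]
limited = limit_answers_per_question(matches, 2)
print([m["answer"] for m in limited])
```

With this data, a global slice `matches[:2]` would keep only answers for question 1, whereas the grouped version keeps two answers for every question.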

@Sh1gechan Sh1gechan force-pushed the add/num_answers_per_question branch from 38c2715 to 01e6043 Compare July 31, 2024 18:53
2 participants