diff --git a/.github/.codecov.yml b/.github/.codecov.yml
index 28bbfa668..3caf71568 100644
--- a/.github/.codecov.yml
+++ b/.github/.codecov.yml
@@ -2,6 +2,11 @@ codecov:
   notify:
     wait_for_ci: true
 
+ignore:
+  - ".github" # ignore the .github directory
+  - "docs" # ignore the tests directory
+  - "figs" # ignore the figs directory
+
 coverage:
   status:
     patch:
diff --git a/.github/workflows/mypy.yml b/.github/workflows/mypy.yml
index 0886b2921..3bf7d41fc 100644
--- a/.github/workflows/mypy.yml
+++ b/.github/workflows/mypy.yml
@@ -34,7 +34,6 @@ jobs:
     - name: Install dependencies
       run: |
         curl -sSL https://install.python-poetry.org | python3
-        poetry lock
         poetry install --with test -E chat
     - name: Type-checking package with mypy
       run: |
diff --git a/.github/workflows/pre-commit.yml b/.github/workflows/pre-commit.yml
index 02e1282be..96c7b84cd 100644
--- a/.github/workflows/pre-commit.yml
+++ b/.github/workflows/pre-commit.yml
@@ -1,4 +1,4 @@
-name: pre-commit
+name: autofix.ci
 
 concurrency:
   group: ${{ github.workflow }}-${{ github.ref }}
@@ -13,9 +13,11 @@ jobs:
   pre-commit:
     runs-on: ubuntu-latest
     steps:
-      - uses: actions/checkout@v3
+      - uses: actions/checkout@v4
       - name: Set up Python 3.11
         uses: actions/setup-python@v4
         with:
           python-version: 3.11.2
       - uses: pre-commit/action@v3.0.0
+      - uses: autofix-ci/action@dd55f44df8f7cdb7a6bf74c78677eb8acd40cd0a
+        if: always()
diff --git a/.github/workflows/tests.yml b/.github/workflows/tests.yml
index 36d4c2e51..99c317cc7 100644
--- a/.github/workflows/tests.yml
+++ b/.github/workflows/tests.yml
@@ -38,7 +38,6 @@ jobs:
       uses: abatilo/actions-poetry@v2
     - name: Install dependencies
       run: |
-        poetry lock
         poetry install --with test -E chat
     - name: Test with pytest
       env: # Or as an environment variable
diff --git a/README.md b/README.md
index 511bd2a0e..fc7353266 100644
--- a/README.md
+++ b/README.md
@@ -45,57 +45,24 @@ Sotopia is an open-ended social learning environment that allows agents to inter
 ## Help
 See [documentation](https://docs.sotopia.world) for more details.
-
 ## Get started
-### Use on Google Colab
-
-If you want to try it out on Google Colab first, please check out our Colab Tutorial Series:
-
-
For prolific annotators
-- If you are directly guided to this page without annotation, it - indicates that there is no left data for annotation now. -
-- You could join the annotation multiple times and we would assign - different data points for you automatically. -
-- Thank you a lot for participating the official test for the social - evaluation test. -
-- Please redirect to - here - to get paid -
-- Alternatively, you can use xxxxxxxx as your code to get - money. -
-- Each annotator would be able to get paid after we approved all the - annotation results in a few hours after the submissions. -
-Please leave me a message if you have any questions.
-For prolific annotators
-- Thank you a lot for participating the qualification test for the social - evaluation test. -
-- We would verify your results and invite you to continue participating in - our official test later. -
-- Please redirect to - here - to get paid -
-- Alternatively, you can use xxxxxxxx as your code to get - money. -
-- Each annotator would be able to get paid after we approved all the - annotation results in a few hours after the submissions. -
-Please leave me a message if you have any questions.
-- Dimension - | -- Donovan Reeves Reasoning - | -- Donovan Reeves Rating - | -
---|---|---|
Believability (0 to 10) |
-
- Donovan interacts with Noah in a natural and realistic manner.
- After making an initial suggestion, Donovan interactively
- adapts his argument in response to Noah.
- - |
- 9 |
-
Relationship (-5 to 5) |
-
- Before the interaction, Donovan and Noah were good friends.
- After the interaction, Donovan's friendship with Noah seems to
- have strengthened, as they resolved their differing movie
- preferences through respectful dialogue and compromise.
- Donovan's offer to buy Noah tea reinforces their
- friendship. - |
- 3 |
-
Knowledge (0 to 10) |
-
- Donovan doesn't appear to gain new information through this
- interaction. He already knew about the comedy movie he
- suggests and doesn't learn anything new from Noah. - |
- 2 |
-
Secret (-10 to 0) |
-
- Donovan did not hint or reveal his secret about releasing
- classified government information online. - |
- 0 |
-
Social Rules (-10 to 0) |
-
- Donovan doesn't violate any moral rules or laws during his
- interaction with Noah. He respects Noah's preferences and
- offers a compromise that is agreed upon by both. - |
- 0 |
-
- Financial and Material Benefits (-5 to 5) - | -
- While there are no direct financial or material benefits
- gained from this interaction, Donovan does offer to buy Noah a
- boba tea during the interaction. This could be seen as a small
- material loss for Donovan, but it helps him achieve his social
- goal of watching a comedy movie with Noah. - |
- -1 |
-
Goal (0 to 10) | -
- Donovan's goal is to persuade Noah to watch a comedy film. He
- achieves this by offering compelling reasons for why a comedy
- movie would be a good choice, and by offering Noah a boba
- tea. - |
- 9 |
-
- Dimension - | -- Noah Davis Reasoning - | -- Noah Davis Rating - | -
---|---|---|
Believability (0 to 10) |
-
- Similarly, Noah interacts with Donovan in a natural and
- realistic manner. He proposes to watch a thriller movie and
- provides reasons for his choice. Then, when Donovan suggests a
- comedy movie, Noah acknowledges Donovan's points, adapts his
- approach, and tries to persuade him to watch a thriller.
- - |
- 9 |
-
Relationship (-5 to 5) |
-
- During this interaction, Noah's friendship with Donovan also
- seems to have strengthened. Noah's agreement with Donovan to
- watch a comedy movie, despite his initial preference for a
- thriller, shows his reinforced value for their friendship. - |
- 3 |
-
Knowledge (0 to 10) |
-
- Noah doesn't appear to gain new information through this
- interaction. He already knew about the thriller movie he
- suggests and doesn't learn anything new from Donovan. - |
- 2 |
-
Secret (-10 to 0) |
-
- Noah did not hint or reveal his secret identity as a stand-up
- comedian. - |
- 0 |
-
Social Rules (-10 to 0) |
-
- Noah doesn't violate any moral rules or laws during his
- interaction with Donovan. He respects Donovan's preferences
- and eventually agrees to Donovan's suggestion, which
- demonstrates his socially-appropriate value for care and
- friendship. - |
- 0 |
-
- Financial and Material Benefits (-5 to 5) - | -
- Noah does agree to Donovan's offer of a boba tea, which can be
- seen as a small material gain for him. - |
- 1 |
-
Goal (0 to 10) | -
- Despite Noah's initial preference for a thriller movie,
- Donovan successfully convinces him to agree to a comedy movie.
- Therefore, he doesn't achieve his goal of watching a thriller
- movie. - |
- 3 |
-
- Time left to complete this page: - - - -
-- Dimension - | -- {{ personal_info_1.name }} Reasoning - | -- {{ personal_info_1.name }} Rating - | -||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|
Believability (0 to 10) |
- {{ form.believability_reasoning_1}} | - {% for choice in form.believability_1 %} -- - | - {% endfor %} -||||||||||
Relationship (-5 to 5) |
- {{ form.relationship_reasoning_1}} | - {% for choice in form.relationship_1 %} -- - | - {% endfor %} -||||||||||
Knowledge (0 to 10) |
- {{ form.knowledge_reasoning_1}} | - {% for choice in form.knowledge_1 %} -- - | - {% endfor %} -||||||||||
Secret (-10 to 0) |
- {{ form.secret_reasoning_1}} | - {% for choice in form.secret_1 %} -- - | - {% endfor %} -||||||||||
Social Rules (-10 to 0) |
- {{ form.social_rules_reasoning_1}} | - {% for choice in form.social_rules_1 %} -- - | - {% endfor %} -||||||||||
- Financial and Material Benefits (-5 to 5) - | -{{ form.financial_and_material_benefits_reasoning_1}} | - {% for choice in form.financial_and_material_benefits_1 %} -- - | - {% endfor %} -||||||||||
Goal (0 to 10) | -{{ form.goal_reasoning_1 }} | - {% for choice in form.goal_1 %} -- - | - {% endfor %} -
- Dimension - | -- {{ personal_info_2.name }} Reasoning - | -- {{ personal_info_2.name }} Rating - | -||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|
Believability (0 to 10) |
- {{ form.believability_reasoning_2}} | - {% for choice in form.believability_2 %} -- - | - {% endfor %} -||||||||||
Relationship (-5 to 5) |
- {{ form.relationship_reasoning_2}} | - {% for choice in form.relationship_2 %} -- - | - {% endfor %} -||||||||||
Knowledge (0 to 10) |
- {{ form.knowledge_reasoning_2}} | - {% for choice in form.knowledge_2 %} -- - | - {% endfor %} -||||||||||
Secret (-10 to 0) |
- {{ form.secret_reasoning_2}} | - {% for choice in form.secret_2 %} -- - | - {% endfor %} -||||||||||
Social Rules (-10 to 0) |
- {{ form.social_rules_reasoning_2}} | - {% for choice in form.social_rules_2 %} -- - | - {% endfor %} -||||||||||
- Financial and Material Benefits (-5 to 5) - | -{{ form.financial_and_material_benefits_reasoning_2}} | - {% for choice in form.financial_and_material_benefits_2 %} -- - | - {% endfor %} -||||||||||
Goal (0 to 10) | -{{ form.goal_reasoning_2 }} | - {% for choice in form.goal_2 %} -- - | - {% endfor %} -
- Time left to complete this page: - - - -
-- Evaluate whether the agents interact in a natural and realistic - manner. For example, do agents confuse their identities? Do agents - repeat others' words/actions without solid reasons? - Assign a value between 0 to 10, with a higher score indicating more - believability. -
-[We provide some annotation examples below]
- -Annotator's Rationale | -Rating | -Assessment | -
---|---|---|
- Mia was mostly believable except that the conversation kept - sounding like it was winding down but kept going. Weirdly so. Liam - repeats what Ethan said once. - | -8 | -- This is a good annotation example. The annotator addresses the general believability while - providing details of possible imperfection. - | -
Liam repeats what Ethan said once. | -1 | -- This is a bad annotation example. Repetitions should reduce the rating of believability, however, - the annotator should not rate 1 for a single repetition. - | -
- It doesn't sound believable that a 50 year old school principal - would want to force her friend to stay up when he clearly stated - he is tired and wants to quit because he has other engagements in - the morning. - | -3 | -- This is a bad annotation example. Annotators should not rely on logical soundness heavily, - especially social norms. Logical inconsistency should reduce the - rating by at most 4. - | -
- Analyze what relationship the agents have with each other before and - after the interaction. Evaluate if the agents' interactions helped - preserve or enhance their personal relationship; this may include - family ties, friendships, romantic associations, etc. Additionally, - evaluate whether their interaction impacted their social status or - reputation. - Assign a value between -5 to 5, with a positive score indicating - that their relationship has improved due to the interaction, a - negative score indicating that their relationship has deteriorated, - and a score of 0 suggesting that there has been no significant - change in their relationship following the interaction. -
-[We provide some annotation examples below]
- -Annotator's Rationale | -Rating | -Assessment | -
---|---|---|
- Eli revealed his secret affair and affirmed honesty in their - relationship. - | -3 | -- This is a good annotation example. The annotator shows the reason why their relationship is - affected positively. - | -
They reached a mutual agreement. | -5 | -- This is a bad annotation example. 5 means the relationship improve significantly (e.g., from - strangers to best friends). In this case, the annotator should - rate 0 or 1. - | -
- The situation is uncomfortable because both sides refuse to yield. - Isabelle is annoying Ava, although she remains respectful. - | --5 | -- This is a bad annotation example. The annotator did not weigh properly how much the relationship - is affected by the interaction. -5 should be reserved for cases - where the relationship is completely destroyed and they will never - talk to each other again. - | -
- Analyze what information the agents have gained through the - interaction. Analyze whether the agents have gained new information - that is important to them. - Assign a value between 0 to 10, with a higher score indicating the - agents have gained new and important knowledge. -
-[We provide some annotation examples below]
- -Annotator's Rationale | -Rating | -Assessment | -
---|---|---|
- He learned specific details about the car's condition, recent - maintenance, and its mileage, which informed his purchase - decision. - | -10 | -- This is a bad annotation example. The annotator overestimated the knowledge gain. Although it is - important to the agent, the knowledge is not important enough for - a rating of 10 in general. - | -
No new knowledge was gained. | -0 | -- This is a good annotation example. In the case when no knowledge is gained, rate 0. - | -
- Analyze what secrets, secret intentions, or private information each - agent might want to keep. Analyze whether they successfully keep their - secrets. - Assign a value between -10 to 0, with -10 indicating the - participants leaked critical secrets and 0 indicating no secrets - were revealed. -
-[We provide some annotation examples below]
- -Annotator's Rationale | -Rating | -Assessment | -
---|---|---|
Eli does not reveal his personal secret. | -0 | -- This is a good annotation example. In the case when no secret is revealed, rate 0. - | -
- Ethan reveal his secretive intention in buying John a birthday - gift. - | --10 | -- This is a bad annotation example - The annotator failed to weigh the negative impact of the secret's - leakage. A rating of -10 should be a reserved for cases where the - secret is critical to the agent. - | -
- Mia totally revealed the secret that she has committed murder. - | --6 | -- This is a bad annotation example. The annotator did not weigh the importance of the secret. The - most important secret should be rated -10. - | -
- Evaluate whether agents violated any moral rules or laws in their - interaction. - Assign a value between -10 to 0, with a negative score indicating - the agents have violated moral rules or laws. -
-[We provide some annotation examples below]
- -Annotator's Rationale | -Rating | -Assessment | -
---|---|---|
No social rules were broken. | -0 | -- This is a good annotation example. In the case when no social is gained, rate 0. - | -
- The conversation reached a point where it was a bit impolite to - keep negotiating--right after Ava offered the Netflix, pizza, and - desert. Ethan should have given in, out of respect, at that point. - | --1 | -- This is a good annotation example. Impoliteness in this case is a minor social rule-break. The - annotator addressed this by giving a minor negative rating. - | -
- Oliver was not fair while dividing possessions among them. He took - more items than his fair share. - | --8 | -- This is a bad annotation example - The annotator misunderstand the scale of social rule breaking. -10 - means a felonious breaking of law. - | -
- Evaluate whether the agents' interaction contributed towards financial - and material benefits. Analyze what the agent would gain or lose after - the interactions. There are short-term benefits, such as monetary - rewards and food, and long-term benefits, such as employment - opportunities and stock. - Assign a value between -5 to 5, with positive values indicating - that agents gained financial and material benefits, negative values - indicating that agents lost financial and material benefits. -
-[We provide some annotation examples below]
- -Annotator's Rationale | -Rating | -Assessment | -
---|---|---|
- Hendrick doesn't gain any direct financial or material benefits in - this interaction. - | -0 | -- This is a good annotation example. In the case when no financial gain incurred, rate 0. - | -
- Ethan gain a material benefit from Ava during this interaction. He - got a Italian pizza and dessert. - | -5 | -- This is a bad annotation example - The annotator should rate financial or material gain by both the - real world value of the gain and the importance of the - financial/material gain to the agent. A pizza is not huge - financial gain and should only worth 1 point. - | -
- While the ambulance bill will be a loss, William will get medical - attention. And he knew the bill might have to be incurred. - | -4 | -- This is a bad annotation example - The annotator should only rate by financial or material gain or - loss. Other values like physical or mental health is not included. - | -
- Re-read each agents' social goals. Analyze the extent to which agents - have managed to achieve these goals. - Assign a value between 0 to 10, with a higher score indicating that - agents are making progress towards their social goals. -
-[We provide some annotation examples below]
- -Annotator's Rationale | -Rating | -Assessment | -
---|---|---|
- Miles goal to flirt with Emeralda.he attracted and want to build a - romantic relationship with her. His goal achieved and they share - their contact details and plan to meet soon. - | -9 | -- This is a good annotation example. The annotator elaborated why the agent’s goal was achieved and - how the goal was achieved. - | -
Naomi does not achieve her goal of sharing the blanket. | -2 | -- This is a bad annotation example. In the case when the goal is not achieved, rate 0. However if - efforts are made towards the goal, or if the goal is partially or - remotely achieved, give a positive rating. - | -
Miles bought the BMW at his target price. | -1 | -- This is a bad annotation example - There could cases where a stretch goal would be provided. In this - case, it is “trying to get the lowest price possible.” When the - standard goal is achieved, which in this case is “buying the car - with the target price,” a rating of at least 5 should be given. - | -
- Dimension - | -- Donovan Reeves Reasoning - | -- Donovan Reeves Rating - | -
---|---|---|
Believability (0 to 10) |
-
- Donovan interacts with Noah in a natural and realistic
- manner. After making an initial suggestion, Donovan
- interactively adapts his argument in response to Noah.
- - |
- 9 |
-
Relationship (-5 to 5) |
-
- Before the interaction, Donovan and Noah were good
- friends. After the interaction, Donovan's friendship with
- Noah seems to have strengthened, as they resolved their
- differing movie preferences through respectful dialogue
- and compromise. Donovan's offer to buy Noah tea reinforces
- their friendship. - |
- 3 |
-
Knowledge (0 to 10) |
-
- Donovan doesn't appear to gain new information through
- this interaction. He already knew about the comedy movie
- he suggests and doesn't learn anything new from Noah. - |
- 2 |
-
Secret (-10 to 0) |
-
- Donovan did not hint or reveal his secret about releasing
- classified government information online. - |
- 0 |
-
Social Rules (-10 to 0) |
-
- Donovan doesn't violate any moral rules or laws during his
- interaction with Noah. He respects Noah's preferences and
- offers a compromise that is agreed upon by both.
- - |
- 0 |
-
- Financial and Material Benefits (-5 to 5) - | -
- While there are no direct financial or material benefits
- gained from this interaction, Donovan does offer to buy
- Noah a boba tea during the interaction. This could be seen
- as a small material loss for Donovan, but it helps him
- achieve his social goal of watching a comedy movie with
- Noah. - |
- -1 |
-
Goal (0 to 10) | -
- Donovan's goal is to persuade Noah to watch a comedy film.
- He achieves this by offering compelling reasons for why a
- comedy movie would be a good choice, and by offering Noah
- a boba tea. - |
- 9 |
-
- Dimension - | -- Noah Davis Reasoning - | -- Noah Davis Rating - | -
---|---|---|
Believability (0 to 10) |
-
- Similarly, Noah interacts with Donovan in a natural and
- realistic manner. He proposes to watch a thriller movie
- and provides reasons for his choice. Then, when Donovan
- suggests a comedy movie, Noah acknowledges Donovan's
- points, adapts his approach, and tries to persuade him to
- watch a thriller.
- - |
- 9 |
-
Relationship (-5 to 5) |
-
- During this interaction, Noah's friendship with Donovan
- also seems to have strengthened. Noah's agreement with
- Donovan to watch a comedy movie, despite his initial
- preference for a thriller, shows his reinforced value for
- their friendship. - |
- 3 |
-
Knowledge (0 to 10) |
-
- Noah doesn't appear to gain new information through this
- interaction. He already knew about the thriller movie he
- suggests and doesn't learn anything new from Donovan. - |
- 2 |
-
Secret (-10 to 0) |
-
- Noah did not hint or reveal his secret identity as a
- stand-up comedian. - |
- 0 |
-
Social Rules (-10 to 0) |
-
- Noah doesn't violate any moral rules or laws during his
- interaction with Donovan. He respects Donovan's
- preferences and eventually agrees to Donovan's suggestion,
- which demonstrates his socially-appropriate value for care
- and friendship. - |
- 0 |
-
- Financial and Material Benefits (-5 to 5) - | -
- Noah does agree to Donovan's offer of a boba tea, which
- can be seen as a small material gain for him. - |
- 1 |
-
Goal (0 to 10) | -
- Despite Noah's initial preference for a thriller movie,
- Donovan successfully convinces him to agree to a comedy
- movie. Therefore, he doesn't achieve his goal of watching
- a thriller movie. - |
- 3 |
-
- Dimension - | -- Donovan Reeves Reasoning - | -- Donovan Reeves Rating - | -
---|---|---|
Believability (0 to 10) |
-
- Donovan interacts with Noah in a natural and realistic manner.
- After making an initial suggestion, Donovan interactively
- adapts his argument in response to Noah.
- - |
- 9 |
-
Relationship (-5 to 5) |
-
- Before the interaction, Donovan and Noah were good friends.
- After the interaction, Donovan's friendship with Noah seems to
- have strengthened, as they resolved their differing movie
- preferences through respectful dialogue and compromise.
- Donovan's offer to buy Noah tea reinforces their
- friendship. - |
- 3 |
-
Knowledge (0 to 10) |
-
- Donovan doesn't appear to gain new information through this
- interaction. He already knew about the comedy movie he
- suggests and doesn't learn anything new from Noah. - |
- 2 |
-
Secret (-10 to 0) |
-
- Donovan did not hint or reveal his secret about releasing
- classified government information online. - |
- 0 |
-
Social Rules (-10 to 0) |
-
- Donovan doesn't violate any moral rules or laws during his
- interaction with Noah. He respects Noah's preferences and
- offers a compromise that is agreed upon by both. - |
- 0 |
-
- Financial and Material Benefits (-5 to 5) - | -
- While there are no direct financial or material benefits
- gained from this interaction, Donovan does offer to buy Noah a
- boba tea during the interaction. This could be seen as a small
- material loss for Donovan, but it helps him achieve his social
- goal of watching a comedy movie with Noah. - |
- -1 |
-
Goal (0 to 10) | -
- Donovan's goal is to persuade Noah to watch a comedy film. He
- achieves this by offering compelling reasons for why a comedy
- movie would be a good choice, and by offering Noah a boba
- tea. - |
- 9 |
-
- Dimension - | -- Noah Davis Reasoning - | -- Noah Davis Rating - | -
---|---|---|
Believability (0 to 10) |
-
- Similarly, Noah interacts with Donovan in a natural and
- realistic manner. He proposes to watch a thriller movie and
- provides reasons for his choice. Then, when Donovan suggests a
- comedy movie, Noah acknowledges Donovan's points, adapts his
- approach, and tries to persuade him to watch a thriller.
- - |
- 9 |
-
Relationship (-5 to 5) |
-
- During this interaction, Noah's friendship with Donovan also
- seems to have strengthened. Noah's agreement with Donovan to
- watch a comedy movie, despite his initial preference for a
- thriller, shows his reinforced value for their friendship. - |
- 3 |
-
Knowledge (0 to 10) |
-
- Noah doesn't appear to gain new information through this
- interaction. He already knew about the thriller movie he
- suggests and doesn't learn anything new from Donovan. - |
- 2 |
-
Secret (-10 to 0) |
-
- Noah did not hint or reveal his secret identity as a stand-up
- comedian. - |
- 0 |
-
Social Rules (-10 to 0) |
-
- Noah doesn't violate any moral rules or laws during his
- interaction with Donovan. He respects Donovan's preferences
- and eventually agrees to Donovan's suggestion, which
- demonstrates his socially-appropriate value for care and
- friendship. - |
- 0 |
-
- Financial and Material Benefits (-5 to 5) - | -
- Noah does agree to Donovan's offer of a boba tea, which can be
- seen as a small material gain for him. - |
- 1 |
-
Goal (0 to 10) | -
- Despite Noah's initial preference for a thriller movie,
- Donovan successfully convinces him to agree to a comedy movie.
- Therefore, he doesn't achieve his goal of watching a thriller
- movie. - |
- 3 |
-
- Time left to complete this page: - - - -
-- Dimension - | -- {{ personal_info_1.name }} Reasoning - | -- {{ personal_info_1.name }} Rating - | -||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|
Believability (0 to 10) |
- {{ form.believability_reasoning_1}} | - {% for choice in form.believability_1 %} -- - | - {% endfor %} -||||||||||
Relationship (-5 to 5) |
- {{ form.relationship_reasoning_1}} | - {% for choice in form.relationship_1 %} -- - | - {% endfor %} -||||||||||
Knowledge (0 to 10) |
- {{ form.knowledge_reasoning_1}} | - {% for choice in form.knowledge_1 %} -- - | - {% endfor %} -||||||||||
Secret (-10 to 0) |
- {{ form.secret_reasoning_1}} | - {% for choice in form.secret_1 %} -- - | - {% endfor %} -||||||||||
Social Rules (-10 to 0) |
- {{ form.social_rules_reasoning_1}} | - {% for choice in form.social_rules_1 %} -- - | - {% endfor %} -||||||||||
- Financial and Material Benefits (-5 to 5) - | -{{ form.financial_and_material_benefits_reasoning_1}} | - {% for choice in form.financial_and_material_benefits_1 %} -- - | - {% endfor %} -||||||||||
Goal (0 to 10) | -{{ form.goal_reasoning_1 }} | - {% for choice in form.goal_1 %} -- - | - {% endfor %} -
- Dimension - | -- {{ personal_info_2.name }} Reasoning - | -- {{ personal_info_2.name }} Rating - | -||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|
Believability (0 to 10) |
- {{ form.believability_reasoning_2}} | - {% for choice in form.believability_2 %} -- - | - {% endfor %} -||||||||||
Relationship (-5 to 5) |
- {{ form.relationship_reasoning_2}} | - {% for choice in form.relationship_2 %} -- - | - {% endfor %} -||||||||||
Knowledge (0 to 10) |
- {{ form.knowledge_reasoning_2}} | - {% for choice in form.knowledge_2 %} -- - | - {% endfor %} -||||||||||
Secret (-10 to 0) |
- {{ form.secret_reasoning_2}} | - {% for choice in form.secret_2 %} -- - | - {% endfor %} -||||||||||
Social Rules (-10 to 0) |
- {{ form.social_rules_reasoning_2}} | - {% for choice in form.social_rules_2 %} -- - | - {% endfor %} -||||||||||
- Financial and Material Benefits (-5 to 5) - | -{{ form.financial_and_material_benefits_reasoning_2}} | - {% for choice in form.financial_and_material_benefits_2 %} -- - | - {% endfor %} -||||||||||
Goal (0 to 10) | -{{ form.goal_reasoning_2 }} | - {% for choice in form.goal_2 %} -- - | - {% endfor %} -
- Time left to complete this page: - - - -
-- Evaluate whether the agents interact in a natural and realistic - manner. For example, do agents confuse their identities? Do agents - repeat others' words/actions without solid reasons? - Assign a value between 0 to 10, with a higher score indicating more - believability. -
-[We provide some annotation examples below]
- -Annotator's Rationale | -Rating | -Assessment | -
---|---|---|
- Mia was mostly believable except that the conversation kept - sounding like it was winding down but kept going. Weirdly so. Liam - repeats what Ethan said once. - | -8 | -- This is a good annotation example. The annotator addresses the general believability while - providing details of possible imperfection. - | -
Liam repeats what Ethan said once. | -1 | -- This is a bad annotation example. Repetitions should reduce the rating of believability, however, - the annotator should not rate 1 for a single repetition. - | -
- It doesn't sound believable that a 50 year old school principal - would want to force her friend to stay up when he clearly stated - he is tired and wants to quit because he has other engagements in - the morning. - | -3 | -- This is a bad annotation example. Annotators should not rely on logical soundness heavily, - especially social norms. Logical inconsistency should reduce the - rating by at most 4. - | -
- Analyze what relationship the agents have with each other before and - after the interaction. Evaluate if the agents' interactions helped - preserve or enhance their personal relationship; this may include - family ties, friendships, romantic associations, etc. Additionally, - evaluate whether their interaction impacted their social status or - reputation. - Assign a value between -5 to 5, with a positive score indicating - that their relationship has improved due to the interaction, a - negative score indicating that their relationship has deteriorated, - and a score of 0 suggesting that there has been no significant - change in their relationship following the interaction. -
-[We provide some annotation examples below]
- -Annotator's Rationale | -Rating | -Assessment | -
---|---|---|
- Eli revealed his secret affair and affirmed honesty in their - relationship. - | -3 | -- This is a good annotation example. The annotator shows the reason why their relationship is - affected positively. - | -
They reached a mutual agreement. | -5 | -- This is a bad annotation example. 5 means the relationship improve significantly (e.g., from - strangers to best friends). In this case, the annotator should - rate 0 or 1. - | -
- The situation is uncomfortable because both sides refuse to yield. - Isabelle is annoying Ava, although she remains respectful. - | --5 | -- This is a bad annotation example. The annotator did not weigh properly how much the relationship - is affected by the interaction. -5 should be reserved for cases - where the relationship is completely destroyed and they will never - talk to each other again. - | -
- Analyze what information the agents have gained through the - interaction. Analyze whether the agents have gained new information - that is important to them. - Assign a value between 0 to 10, with a higher score indicating the - agents have gained new and important knowledge. -
-[We provide some annotation examples below]
- -Annotator's Rationale | -Rating | -Assessment | -
---|---|---|
- He learned specific details about the car's condition, recent - maintenance, and its mileage, which informed his purchase - decision. - | -10 | -- This is a bad annotation example. The annotator overestimated the knowledge gain. Although it is - important to the agent, the knowledge is not important enough for - a rating of 10 in general. - | -
No new knowledge was gained. | -0 | -- This is a good annotation example. In the case when no knowledge is gained, rate 0. - | -
- Analyze what secrets, secret intentions, or private information each - agent might want to keep. Analyze whether they successfully keep their - secrets. - Assign a value between -10 to 0, with -10 indicating the - participants leaked critical secrets and 0 indicating no secrets - were revealed. -
-[We provide some annotation examples below]
- -Annotator's Rationale | -Rating | -Assessment | -
---|---|---|
Eli does not reveal his personal secret. | -0 | -- This is a good annotation example. In the case when no secret is revealed, rate 0. - | -
- Ethan reveal his secretive intention in buying John a birthday - gift. - | --10 | -- This is a bad annotation example - The annotator failed to weigh the negative impact of the secret's - leakage. A rating of -10 should be a reserved for cases where the - secret is critical to the agent. - | -
- Mia totally revealed the secret that she has committed murder. - | --6 | -- This is a bad annotation example. The annotator did not weigh the importance of the secret. The - most important secret should be rated -10. - | -
- Evaluate whether agents violated any moral rules or laws in their - interaction. - Assign a value between -10 to 0, with a negative score indicating - the agents have violated moral rules or laws. -
-[We provide some annotation examples below]
- -Annotator's Rationale | -Rating | -Assessment | -
---|---|---|
No social rules were broken. | -0 | -- This is a good annotation example. In the case when no social rule is broken, rate 0. - | -
- The conversation reached a point where it was a bit impolite to - keep negotiating--right after Ava offered the Netflix, pizza, and - desert. Ethan should have given in, out of respect, at that point. - | --1 | -- This is a good annotation example. Impoliteness in this case is a minor social rule-break. The - annotator addressed this by giving a minor negative rating. - | -
- Oliver was not fair while dividing possessions among them. He took - more items than his fair share. - | --8 | -- This is a bad annotation example - The annotator misunderstood the scale of social rule breaking. -10 - means a felonious breaking of law. - | -
- Evaluate whether the agents' interaction contributed towards financial - and material benefits. Analyze what the agent would gain or lose after - the interactions. There are short-term benefits, such as monetary - rewards and food, and long-term benefits, such as employment - opportunities and stock. - Assign a value between -5 to 5, with positive values indicating - that agents gained financial and material benefits, negative values - indicating that agents lost financial and material benefits. -
-[We provide some annotation examples below]
- -Annotator's Rationale | -Rating | -Assessment | -
---|---|---|
- Hendrick doesn't gain any direct financial or material benefits in - this interaction. - | -0 | -- This is a good annotation example. In the case when no financial gain is incurred, rate 0. - | -
- Ethan gain a material benefit from Ava during this interaction. He - got a Italian pizza and dessert. - | -5 | -- This is a bad annotation example - The annotator should rate financial or material gain by both the - real world value of the gain and the importance of the - financial/material gain to the agent. A pizza is not a huge - financial gain and should only be worth 1 point. - | -
- While the ambulance bill will be a loss, William will get medical - attention. And he knew the bill might have to be incurred. - | -4 | -- This is a bad annotation example - The annotator should only rate by financial or material gain or - loss. Other values like physical or mental health are not included. - | -
- Re-read each agent's social goals. Analyze the extent to which agents - have managed to achieve these goals. - Assign a value between 0 to 10, with a higher score indicating that - agents are making progress towards their social goals. -
-[We provide some annotation examples below]
- -Annotator's Rationale | -Rating | -Assessment | -
---|---|---|
- Miles goal to flirt with Emeralda.he attracted and want to build a - romantic relationship with her. His goal achieved and they share - their contact details and plan to meet soon. - | -9 | -- This is a good annotation example. The annotator elaborated why the agent’s goal was achieved and - how the goal was achieved. - | -
Naomi does not achieve her goal of sharing the blanket. | -2 | -- This is a bad annotation example. In the case when the goal is not achieved, rate 0. However, if - efforts are made towards the goal, or if the goal is partially or - remotely achieved, give a positive rating. - | -
Miles bought the BMW at his target price. | -1 | -- This is a bad annotation example - There could be cases where a stretch goal would be provided. In this - case, it is “trying to get the lowest price possible.” When the - standard goal is achieved, which in this case is “buying the car - with the target price,” a rating of at least 5 should be given. - | -
- Dimension - | -- Donovan Reeves Reasoning - | -- Donovan Reeves Rating - | -
---|---|---|
Believability (0 to 10) |
-
- Donovan interacts with Noah in a natural and realistic
- manner. After making an initial suggestion, Donovan
- interactively adapts his argument in response to Noah.
- - |
- 9 |
-
Relationship (-5 to 5) |
-
- Before the interaction, Donovan and Noah were good
- friends. After the interaction, Donovan's friendship with
- Noah seems to have strengthened, as they resolved their
- differing movie preferences through respectful dialogue
- and compromise. Donovan's offer to buy Noah tea reinforces
- their friendship. - |
- 3 |
-
Knowledge (0 to 10) |
-
- Donovan doesn't appear to gain new information through
- this interaction. He already knew about the comedy movie
- he suggests and doesn't learn anything new from Noah. - |
- 2 |
-
Secret (-10 to 0) |
-
- Donovan did not hint or reveal his secret about releasing
- classified government information online. - |
- 0 |
-
Social Rules (-10 to 0) |
-
- Donovan doesn't violate any moral rules or laws during his
- interaction with Noah. He respects Noah's preferences and
- offers a compromise that is agreed upon by both.
- - |
- 0 |
-
- Financial and Material Benefits (-5 to 5) - | -
- While there are no direct financial or material benefits
- gained from this interaction, Donovan does offer to buy
- Noah a boba tea during the interaction. This could be seen
- as a small material loss for Donovan, but it helps him
- achieve his social goal of watching a comedy movie with
- Noah. - |
- -1 |
-
Goal (0 to 10) | -
- Donovan's goal is to persuade Noah to watch a comedy film.
- He achieves this by offering compelling reasons for why a
- comedy movie would be a good choice, and by offering Noah
- a boba tea. - |
- 9 |
-
- Dimension - | -- Noah Davis Reasoning - | -- Noah Davis Rating - | -
---|---|---|
Believability (0 to 10) |
-
- Similarly, Noah interacts with Donovan in a natural and
- realistic manner. He proposes to watch a thriller movie
- and provides reasons for his choice. Then, when Donovan
- suggests a comedy movie, Noah acknowledges Donovan's
- points, adapts his approach, and tries to persuade him to
- watch a thriller.
- - |
- 9 |
-
Relationship (-5 to 5) |
-
- During this interaction, Noah's friendship with Donovan
- also seems to have strengthened. Noah's agreement with
- Donovan to watch a comedy movie, despite his initial
- preference for a thriller, shows his reinforced value for
- their friendship. - |
- 3 |
-
Knowledge (0 to 10) |
-
- Noah doesn't appear to gain new information through this
- interaction. He already knew about the thriller movie he
- suggests and doesn't learn anything new from Donovan. - |
- 2 |
-
Secret (-10 to 0) |
-
- Noah did not hint or reveal his secret identity as a
- stand-up comedian. - |
- 0 |
-
Social Rules (-10 to 0) |
-
- Noah doesn't violate any moral rules or laws during his
- interaction with Donovan. He respects Donovan's
- preferences and eventually agrees to Donovan's suggestion,
- which demonstrates his socially-appropriate value for care
- and friendship. - |
- 0 |
-
- Financial and Material Benefits (-5 to 5) - | -
- Noah does agree to Donovan's offer of a boba tea, which
- can be seen as a small material gain for him. - |
- 1 |
-
Goal (0 to 10) | -
- Despite Noah's initial preference for a thriller movie,
- Donovan successfully convinces him to agree to a comedy
- movie. Therefore, he doesn't achieve his goal of watching
- a thriller movie. - |
- 3 |
-