single_turn_debate

This is the public repository for data related to the single-turn debate project. It contains the arguments and text snippets selected by writers, as well as the judgements provided by workers on MTurk.

Authors

Alicia Parrish, Harsh Trivedi, Ethan Perez, Angelica Chen, Nikita Nangia, Jason Phang, Samuel R. Bowman

File structure

  • argument_judging_data
    • Description: human judgments associated with the arguments & text snippets
    • Contents: 2 sub-folders, each with a CSV file with the following structure:
      • passage_id - Unique identifier for the passage, matches the passage id for the data from the QuALITY dataset
      • question_id - Unique identifier for the question, matches the question_id in the argument writing data but does not match the question ids from the QuALITY dataset. question_id values that start with ct_ indicate catch trials
      • mode - The experimental condition the item was shown in, either p (passage-only), ps (passage+snippet), or psa (passage+snippet+argument)
      • chosen_answer_text - The text of the answer choice chosen by that worker
      • choice_number - Value of 1 or 2 indicating whether the text chosen corresponds to ans1 or ans2
      • ans1_text & ans2_text - The text displayed for ans1 and ans2
      • ans1_snippets & ans2_snippets - The text snippets displayed for ans1 and ans2
      • ans1_arg & ans2_arg - TRUE/FALSE value indicating if ans1 or ans2 was correct
      • ans1 & ans2 - The text of the answer option for ans1 and ans2
      • corr - 1/2 value indicating if ans1 or ans2 is the correct answer
      • question_text - The text of the question
      • anonid - Unique identifier for each MTurk worker
      • hit_start_timestamp & timer_start_timestamp & hit_end_timestamp - Timestamps for the task indicating when the worker started the HIT, when the worker started the timer within the HIT (revealing the passage and any additional information), and when the worker submitted the HIT
      • round - Which round of data collection the example was shown during
      • part - (only for the pilot data) Which half of the round the example was shown during
      • lime_limit - (only for the pilot data) The max time limit allowed to the worker for that example. This value was always 90s in the main task, but varied between 60s, 90s, and 120s during the pilot
  • argument_writing_data
    • Description: text arguments and selected snippets
    • Contents: 1 .jsonl file with the following structure:
      • hit_id & assignment_id - Unique identifiers for the writing assignment
      • worker_id - Unique identifier for each writer
      • submit_timestamp - Time the writing task was submitted
      • output_data - A dictionary with the following contents:
        • passage_id - Unique identifier for the passage, matches the passage id for the data from the QuALITY dataset (available at [https://github.com/nyu-mll/quality])
        • question_id - Unique identifier for the question, does not match the question ids from the QuALITY dataset
        • question_text - The text of the question
        • argue_for - The answer option the writer was assigned to argue for
        • argue_against - The answer option the writer was assigned to argue against
        • argue_for_id & argue_against_id - 0 or 1, indicating the original order of the answer options
        • selected_snippets - A list of 1-3 snippets of text selected by the writer to support their argument
        • argue_for_correct - TRUE/FALSE value indicating whether the writer was assigned to write for the correct answer in this question
        • argument - The argument written in support of the assigned answer option
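
As a rough illustration of how the two file types described above might be read, here is a minimal Python sketch using only the standard library. It is not part of the project's own tooling. The helper names (`accuracy_by_mode`, `read_writing_records`) and the in-memory sample rows are hypothetical and made up for the example; they are not real data from this repository. The sketch assumes that `choice_number` and `corr` are stored as 1/2 values that can be compared directly, and that each line of the .jsonl file is one JSON object with an `output_data` dictionary, as described in the field lists.

```python
import csv
import io
import json
from collections import defaultdict


def accuracy_by_mode(rows):
    """Fraction of judgements per mode where choice_number equals corr.

    Rows whose question_id starts with "ct_" (catch trials) are skipped.
    Assumes choice_number and corr are 1/2 values, possibly stored as text.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for row in rows:
        if row["question_id"].startswith("ct_"):
            continue
        total[row["mode"]] += 1
        if int(float(row["choice_number"])) == int(float(row["corr"])):
            correct[row["mode"]] += 1
    return {mode: correct[mode] / total[mode] for mode in total}


def read_writing_records(lines):
    """Flatten each JSONL entry into a small dict of selected fields."""
    records = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        entry = json.loads(line)
        out = entry["output_data"]
        records.append({
            "worker_id": entry["worker_id"],
            "question_id": out["question_id"],
            "argue_for_correct": out["argue_for_correct"],
            "n_snippets": len(out["selected_snippets"]),
        })
    return records


# Hypothetical sample data, for illustration only (not real rows).
SAMPLE_CSV = (
    "question_id,mode,choice_number,corr\n"
    "q1,p,1,1\n"
    "q2,p,2,1\n"
    "q3,psa,2,2\n"
    "ct_1,psa,1,2\n"
)
SAMPLE_JSONL = json.dumps({
    "hit_id": "h1",
    "assignment_id": "a1",
    "worker_id": "w1",
    "submit_timestamp": "example",
    "output_data": {
        "question_id": "q1",
        "argue_for_correct": True,
        "selected_snippets": ["snippet one", "snippet two"],
    },
})

sample_rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
sample_accuracy = accuracy_by_mode(sample_rows)
sample_records = read_writing_records(SAMPLE_JSONL.splitlines())
print(sample_accuracy)
print(sample_records)
```

To run this against the actual files, replace the `io.StringIO` sample with an opened CSV file from one of the `argument_judging_data` sub-folders, and pass the opened .jsonl file from `argument_writing_data` to `read_writing_records`. The exact encoding of values such as `argue_for_correct` (boolean vs. the strings TRUE/FALSE) should be checked against the data before relying on it.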
