From ee33690e81c9e091921a734e7c1b3e55d6a4e794 Mon Sep 17 00:00:00 2001
From: Jonathan Chang
Date: Tue, 16 Apr 2024 13:34:12 -0400
Subject: [PATCH] CGA dataset naming update in docs

---
 README.md                | 6 +++---
 docs/source/awry.rst     | 4 ++--
 docs/source/awry_cmv.rst | 4 ++--
 docs/source/datasets.rst | 4 ++--
 4 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/README.md b/README.md
index d6559f76..811e7756 100644
--- a/README.md
+++ b/README.md
@@ -57,10 +57,10 @@ Available as an interactive notebook: [full version (fine-tuning + inference)](h
 
 ConvoKit ships with several datasets ready for use "out-of-the-box". These datasets can be downloaded using the `convokit.download()` [helper function](https://github.com/CornellNLP/ConvoKit/blob/master/convokit/util.py). Alternatively you can access them directly [here](http://zissou.infosci.cornell.edu/convokit/datasets/).
 
-### [Conversations Gone Awry Dataset](https://convokit.cornell.edu/documentation/awry.html)
+### [Conversations Gone Awry Datasets](https://convokit.cornell.edu/documentation/awry.html)
 
-Two related corpora of conversations that derail into antisocial behavior. One corpus consists of Wikipedia talk page conversations that derail into personal attacks as labeled by crowdworkers (4,188 conversations containing 30.021 comments). The other consists of discussion threads on the subreddit ChangeMyView (CMV) that derail into rule-violating behavior as determined by the presence of a moderator intervention (6,842 conversations containing 42,964 comments).
-Name for download: `conversations-gone-awry-corpus` (Wikipedia version) or `conversations-gone-awry-cmv-corpus` (Reddit CMV version)
+Two related corpora of conversations that derail into antisocial behavior. One corpus (CGA-WIKI) consists of Wikipedia talk page conversations that derail into personal attacks as labeled by crowdworkers (4,188 conversations containing 30.021 comments). The other (CGA-CMV) consists of discussion threads on the subreddit ChangeMyView (CMV) that derail into rule-violating behavior as determined by the presence of a moderator intervention (6,842 conversations containing 42,964 comments).
+Name for download: `conversations-gone-awry-corpus` (for CGA-WIKI) or `conversations-gone-awry-cmv-corpus` (for CGA-CMV)
 
 ### [Cornell Movie-Dialogs Corpus](https://convokit.cornell.edu/documentation/movie.html)
 
diff --git a/docs/source/awry.rst b/docs/source/awry.rst
index e758ae24..161361a2 100644
--- a/docs/source/awry.rst
+++ b/docs/source/awry.rst
@@ -1,5 +1,5 @@
-Conversations Gone Awry Dataset
-===============================
+Conversations Gone Awry Dataset - Wikipedia version (CGA-WIKI)
+==============================================================
 
 A collection of conversations from Wikipedia talk pages that derail into personal attacks (4,188 conversations, 30,021 comments).
 
diff --git a/docs/source/awry_cmv.rst b/docs/source/awry_cmv.rst
index 3a0258bb..2d5388a1 100644
--- a/docs/source/awry_cmv.rst
+++ b/docs/source/awry_cmv.rst
@@ -1,5 +1,5 @@
-Conversations Gone Awry Dataset [Reddit CMV version]
-====================================================
+Conversations Gone Awry Dataset - Reddit CMV version (CGA-CMV)
+==============================================================
 
 A collection of conversations from the ChangeMyView (CMV) subreddit that derail into personal attacks (6,842 conversations, 42,964 comments).
 
diff --git a/docs/source/datasets.rst b/docs/source/datasets.rst
index 623d4d3e..faecb59f 100644
--- a/docs/source/datasets.rst
+++ b/docs/source/datasets.rst
@@ -2,8 +2,8 @@ Datasets
 ========
 
 .. toctree::
-   Conversations Gone Awry Dataset (Wikipedia version)
-   Conversations Gone Awry Dataset (Reddit CMV version)
+   Conversations Gone Awry Dataset - Wikipedia version (CGA-WIKI)
+   Conversations Gone Awry Dataset - Reddit CMV version (CGA-CMV)
    Cornell Movie-Dialogs Corpus
    CANDOR Corpus
    Parliament Question Time Corpus