CGA dataset naming update in docs
jpwchang committed Apr 16, 2024
1 parent 2a0368d commit ee33690
Showing 4 changed files with 9 additions and 9 deletions.
6 changes: 3 additions & 3 deletions README.md
@@ -57,10 +57,10 @@ Available as an interactive notebook: [full version (fine-tuning + inference)](h
ConvoKit ships with several datasets ready for use "out-of-the-box".
These datasets can be downloaded using the `convokit.download()` [helper function](https://github.com/CornellNLP/ConvoKit/blob/master/convokit/util.py). Alternatively you can access them directly [here](http://zissou.infosci.cornell.edu/convokit/datasets/).

-### [Conversations Gone Awry Dataset](https://convokit.cornell.edu/documentation/awry.html)
+### [Conversations Gone Awry Datasets](https://convokit.cornell.edu/documentation/awry.html)

-Two related corpora of conversations that derail into antisocial behavior. One corpus consists of Wikipedia talk page conversations that derail into personal attacks as labeled by crowdworkers (4,188 conversations containing 30.021 comments). The other consists of discussion threads on the subreddit ChangeMyView (CMV) that derail into rule-violating behavior as determined by the presence of a moderator intervention (6,842 conversations containing 42,964 comments).
-Name for download: `conversations-gone-awry-corpus` (Wikipedia version) or `conversations-gone-awry-cmv-corpus` (Reddit CMV version)
+Two related corpora of conversations that derail into antisocial behavior. One corpus (CGA-WIKI) consists of Wikipedia talk page conversations that derail into personal attacks as labeled by crowdworkers (4,188 conversations containing 30.021 comments). The other (CGA-CMV) consists of discussion threads on the subreddit ChangeMyView (CMV) that derail into rule-violating behavior as determined by the presence of a moderator intervention (6,842 conversations containing 42,964 comments).
+Name for download: `conversations-gone-awry-corpus` (for CGA-WIKI) or `conversations-gone-awry-cmv-corpus` (for CGA-CMV)
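
Editor's note, not part of the commit: the short labels introduced here map onto the download names as sketched below. This is a minimal illustration, assuming ConvoKit is installed and that `Corpus(filename=download(...))` is the loading pattern in use. `CGA_DOWNLOAD_NAMES`, `cga_download_name`, and `load_cga` are hypothetical names made up for this example and are not part of ConvoKit.

```python
# Hypothetical lookup table: short label from the docs -> name passed to convokit.download()
CGA_DOWNLOAD_NAMES = {
    "CGA-WIKI": "conversations-gone-awry-corpus",
    "CGA-CMV": "conversations-gone-awry-cmv-corpus",
}


def cga_download_name(label: str) -> str:
    """Return the download name for a CGA label, ignoring case and surrounding spaces."""
    key = label.strip().upper()
    if key not in CGA_DOWNLOAD_NAMES:
        known = ", ".join(sorted(CGA_DOWNLOAD_NAMES))
        raise ValueError(f"unknown CGA label {label!r}; expected one of: {known}")
    return CGA_DOWNLOAD_NAMES[key]


def load_cga(label: str):
    """Download (if needed) and load the corpus for the given label.

    ConvoKit is imported lazily so the lookup above works without it installed.
    """
    from convokit import Corpus, download

    return Corpus(filename=download(cga_download_name(label)))
```

Calling `load_cga("CGA-CMV")` would then fetch the Reddit CMV corpus; the lookup itself needs no network access.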

### [Cornell Movie-Dialogs Corpus](https://convokit.cornell.edu/documentation/movie.html)

4 changes: 2 additions & 2 deletions docs/source/awry.rst
@@ -1,5 +1,5 @@
-Conversations Gone Awry Dataset
-===============================
+Conversations Gone Awry Dataset - Wikipedia version (CGA-WIKI)
+==============================================================

A collection of conversations from Wikipedia talk pages that derail into personal attacks (4,188 conversations, 30,021 comments).

4 changes: 2 additions & 2 deletions docs/source/awry_cmv.rst
@@ -1,5 +1,5 @@
-Conversations Gone Awry Dataset [Reddit CMV version]
-====================================================
+Conversations Gone Awry Dataset - Reddit CMV version (CGA-CMV)
+==============================================================

A collection of conversations from the ChangeMyView (CMV) subreddit that derail into personal attacks (6,842 conversations, 42,964 comments).

4 changes: 2 additions & 2 deletions docs/source/datasets.rst
@@ -2,8 +2,8 @@ Datasets
========

.. toctree::
-Conversations Gone Awry Dataset (Wikipedia version) <awry.rst>
-Conversations Gone Awry Dataset (Reddit CMV version) <awry_cmv.rst>
+Conversations Gone Awry Dataset - Wikipedia version (CGA-WIKI) <awry.rst>
+Conversations Gone Awry Dataset - Reddit CMV version (CGA-CMV) <awry_cmv.rst>
Cornell Movie-Dialogs Corpus <movie.rst>
CANDOR Corpus <candor.rst>
Parliament Question Time Corpus <parliament.rst>