Implementation Details on the COPA Task #1317

slowwavesleep · 2021-05-14T07:56:19Z

Hi,

I'm trying to reproduce the BERT baseline from the SuperGLUE paper on the COPA task using just the transformers library. As I understand it, the baseline from the paper is implemented using jiant, so I'd like to check whether I've got the details of your implementation right.

So each example in the task consists of:
- idx: self-explanatory
- premise: self-explanatory as well
- question: one of two possibilities, cause or effect, in the SuperGLUE version; it's converted back to the original "What was the CAUSE of this?" or "What happened as a RESULT?"
- choice1: the first reply to choose from
- choice2: the second reply to choose from
- label: 0 for the first choice, 1 for the second one
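
To make the fields concrete, here is a minimal sketch of one example and the question conversion as I understand it. The field names and the two question strings come from the list above; the sentences in the example, the `QUESTION_TEXT` mapping, and the `expand_question` helper are made up for illustration and are not taken from jiant.

```python
# Question strings quoted from the field description above.
# The mapping name itself is hypothetical.
QUESTION_TEXT = {
    "cause": "What was the CAUSE of this?",
    "effect": "What happened as a RESULT?",
}

# A made-up example with the six fields listed above.
example = {
    "idx": 0,
    "premise": "The ground was wet.",
    "question": "cause",
    "choice1": "It rained overnight.",
    "choice2": "The sun was shining.",
    "label": 0,  # 0 means choice1 is correct, 1 means choice2
}


def expand_question(example):
    """Map the short SuperGLUE question value to the full question text."""
    return QUESTION_TEXT[example["question"]]


print(expand_question(example))
```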

The baseline implementation concatenates the premise and the extended question (separated by a single space) into one string, so no SEP or </s> token is meant to appear between them. This is the first "sentence" for sequence classification. The second "sentence" is one of the two choices. Effectively, this doubles the size of the dataset. The model is trained on these examples independently and is expected to predict a single scalar value for each one. At inference, the two examples built from the same premise are compared, and the one with the higher value is chosen as the answer.
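
The procedure described above could be sketched roughly as follows. This is only my reading of it, not the jiant code: `build_pairs` and `pick_answer` are hypothetical helper names, the example sentences are made up, and the scores are placeholder numbers standing in for whatever scalar the model would output.

```python
# Question strings as given in the task description; the mapping name is hypothetical.
QUESTION_TEXT = {
    "cause": "What was the CAUSE of this?",
    "effect": "What happened as a RESULT?",
}


def build_pairs(example):
    """Turn one COPA example into two (first_sentence, second_sentence) pairs.

    The premise and the extended question are joined by a single space,
    with no separator token between them. Each choice becomes the second
    sentence of its own pair.
    """
    first = example["premise"] + " " + QUESTION_TEXT[example["question"]]
    return [(first, example["choice1"]), (first, example["choice2"])]


def pick_answer(scores):
    """Return the index (0 or 1) of the pair with the higher score."""
    return max(range(len(scores)), key=lambda i: scores[i])


# Made-up example for illustration.
example = {
    "premise": "The ground was wet.",
    "question": "cause",
    "choice1": "It rained overnight.",
    "choice2": "The sun was shining.",
    "label": 0,
}

pairs = build_pairs(example)
placeholder_scores = [0.9, 0.2]  # stand-ins for the model's two scalar outputs
prediction = pick_answer(placeholder_scores)
print(pairs[0])
print(prediction == example["label"])
```

With the transformers library, each pair could then be passed to a tokenizer as its two text arguments, so that the tokenizer itself places the separator between the first and second "sentence".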

It's not entirely clear to me what values the labels are supposed to take when computing the loss. Just 0s and 1s?

Also, I'd like to confirm that my description of the training process is indeed accurate.
