Issues: bigscience-workshop/evaluation

Issues list

Add MKQA to Full Benchmark
#73 opened Feb 7, 2022 by shayne-longpre
Start overleaf for benchmark tech report [labels: documentation (Improvements or additions to documentation)]
#54 opened Aug 16, 2021 by epavlick
translate validation prompts into all training languages [labels: multilingual, simple_benchmark (all issues related the simple_benchmark script)]
#50 opened Aug 12, 2021 by epavlick
benchmark mt5 on tydiqa prompting setup [labels: simple_benchmark]
#49 opened Aug 12, 2021 by epavlick
Create Targeted Minimal Pair "Stress-Tests" for Sensitivity to Social Groups [labels: social_impact (Benchmark Tasks for Bias and Social Impact)]
#38 opened Aug 10, 2021 by epavlick
Add CrowS-Pairs to Full Benchmark [labels: In-Progress, social_impact]
#37 opened Aug 10, 2021 by epavlick
Add Jigsaw Toxicity Classification to Full Benchmark [labels: social_impact]
#36 opened Aug 10, 2021 by epavlick
Add WinoMT to Full Benchmark [labels: social_impact]
#35 opened Aug 10, 2021 by epavlick
Add HANS to Full Benchmark [labels: few_shot (Benchmark Tasks for Few-Shot Generalization)]
#34 opened Aug 10, 2021 by epavlick
Add MNLI to Full Benchmark [labels: few_shot]
#33 opened Aug 10, 2021 by epavlick
Add ANLI to Full Benchmark [labels: few_shot]
#32 opened Aug 10, 2021 by epavlick
Add HuffPo Text Classification to Full Benchmark [labels: few_shot]
#31 opened Aug 10, 2021 by epavlick
Add TyDiQA for non-training languages to Full Benchmark [labels: few_shot, multilingual]
#30 opened Aug 10, 2021 by epavlick
Add BioASQ to Full Benchmark [labels: few_shot]
#29 opened Aug 10, 2021 by epavlick
Add QASPER to Full Benchmark [labels: few_shot]
#28 opened Aug 10, 2021 by epavlick
Add Edge Probing Suite to Full Benchmark [labels: linguistic_structure (Benchmark Tasks for CoreNLP/linguistic structure prediction)]
#27 opened Aug 10, 2021 by epavlick
Add LAMA to Full Benchmark [labels: In-Progress, linguistic_structure]
#26 opened Aug 10, 2021 by epavlick
Add LinCE Testbed to Full Benchmark [labels: In-Progress, linguistic_structure, multilingual]
#25 opened Aug 10, 2021 by epavlick
Add POS Tagging with UD to Full Benchmark [labels: linguistic_structure, multilingual]
#24 opened Aug 10, 2021 by epavlick
Add QA-SRL to Full Benchmark [labels: linguistic_structure]
#23 opened Aug 10, 2021 by epavlick