Submitted on Feb 21 2021
By: Christopher Clark, Kenton Lee, Ming-Wei Chang, Tom Kwiatkowski, Michael Collins, Kristina Toutanova
Resource Type:
License: Creative Commons Share-Alike 3.0
not code
Data Format:


BoolQ is a question answering dataset of 15,942 yes/no questions. The questions are naturally occurring: they are generated in unprompted and unconstrained settings. Each example is a triplet of (question, passage, answer), with the title of the page as optional additional context. The text-pair classification setup is similar to existing natural language inference (NLI) tasks. By sampling questions from the distribution of information-seeking queries (rather than prompting annotators for text pairs), we observe significantly more challenging examples compared to existing NLI datasets.
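The triplet structure described above can be sketched in Python as follows. This is only an illustration, not the dataset's official schema or loader: the class name `BoolQExample`, its field names, the helper `to_text_pair`, the choice to prepend the title to the passage, and the sample record are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class BoolQExample:
    """One (question, passage, answer) triplet; field names are hypothetical."""
    question: str
    passage: str
    answer: bool                 # True = "yes", False = "no"
    title: Optional[str] = None  # page title, optional additional context


def to_text_pair(example: BoolQExample) -> Tuple[str, str, int]:
    """Cast an example as a text-pair classification instance.

    Returns (question, context, label), where label is 1 for yes and 0 for no.
    Prepending the title to the passage is an assumption of this sketch.
    """
    context = example.passage
    if example.title:
        context = f"{example.title}. {context}"
    return example.question, context, int(example.answer)


# Made-up record for illustration only; it is not taken from the dataset.
sample = BoolQExample(
    question="is the example passage about widgets",
    passage="This passage describes widgets.",
    answer=True,
    title="Widgets",
)

question, context, label = to_text_pair(sample)
print(question, "|", context, "|", label)
```

Keeping the title optional mirrors the description above: a classifier can be fed the question and passage alone, or the passage with the title added as extra context.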