A Healthcare Dataset for Complex Reasoning
HEAD-QA is a multi-choice HEAlthcare Dataset. The questions come from exams to access a specialized position in the Spanish healthcare system, and are challenging even for highly specialized humans. They are designed by the Ministerio de Sanidad, Consumo y Bienestar Social, which also provides direct access to the exams of the last 5 years (in Spanish).
Date of the last update of the reused documents: January 14, 2019.
HEAD-QA aims to make these questions accessible to the Natural Language Processing community. We hope it is a useful resource towards achieving better QA systems. The dataset contains questions about the following topics: biology, medicine, nursing, pharmacology, psychology and chemistry.
HEAD-QA can be imported from Hugging Face Datasets. Thank you very much to Maria Grandury for adding it. Alternatively, you can download the files yourself.
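For the Hugging Face route, a minimal sketch could look like the following. The dataset identifier `head_qa`, the configuration name `en`, and the record fields (`category`, `qtext`, `answers`, `aid`, `atext`) are assumptions not stated on this page, so check the dataset card on the Hub before relying on them. The helper names `load_head_qa` and `format_question` are hypothetical.

```python
def load_head_qa(language="en", dataset_id="head_qa"):
    """Load HEAD-QA through the Hugging Face `datasets` library.

    Both `dataset_id` and `language` are assumptions; the identifier
    on the Hub may differ from the default used here.
    """
    # Imported lazily so the rest of this sketch works without the package.
    from datasets import load_dataset

    return load_dataset(dataset_id, language)


def format_question(example):
    """Render one record as plain text.

    Assumes each record has a `category`, a question text `qtext`, and a
    list of `answers`, each with an id `aid` and a text `atext`.
    """
    lines = [f"Question ({example['category']}): {example['qtext']}"]
    for answer in example["answers"]:
        lines.append(f"  {answer['aid']}. {answer['atext']}")
    return "\n".join(lines)


# Example usage (needs network access and `pip install datasets`):
# dataset = load_head_qa("en")
# print(format_question(dataset["test"][0]))
```

Keeping the loader separate from the formatting helper means the helper can also be applied to records read from the downloaded files, provided they follow the same field layout.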
Question (medicine): A 13-year-old girl is operated on due to Hirschsprung illness at 3 months of age. Which of the following tumors is more likely to be present?
Question (pharmacology): The antibiotic treatment of choice for Meningitis caused by Haemophilus influenzae serogroup b is:
Question (psychology): According to research derived from the Eysenck model, there is evidence that extraverts, in comparison with introverts:
Model | Avg. accuracy | Avg. POINTS |
---|---|---|
Liu et al. (2020) | 44.4 | 172.3 |
IR Baseline - Vilares and Gómez-Rodríguez (2019) | 34.6 | 71.2 |

Model | Avg. accuracy | Avg. POINTS |
---|---|---|
Liu et al. (2020) | 46.7 | 199.8 |
IR Baseline - Vilares and Gómez-Rodríguez (2019) | 37.2 | 111.8 |

Model | Biology | Medicine | Nursing | Pharmacology | Psychology | Chemistry |
---|---|---|---|---|---|---|
Liu et al. (2020) | 45.5 | 42.4 | 42.3 | 48.0 | 44.3 | 44.3 |
IR Baseline - Vilares and Gómez-Rodríguez (2019) | 37.9 | 30.3 | 32.6 | 38.7 | 34.7 | 33.7 |

Model | Biology | Medicine | Nursing | Pharmacology | Psychology | Chemistry |
---|---|---|---|---|---|---|
Liu et al. (2020) | 189.4 | 158.8 | 158.8 | 209.6 | 160.6 | 173.0 |
IR Baseline - Vilares and Gómez-Rodríguez (2019) | 116.8 | 48.6 | 67.8 | 125.0 | 87.6 | 79.6 |

Model | Biology | Medicine | Nursing | Pharmacology | Psychology | Chemistry |
---|---|---|---|---|---|---|
Liu et al. (2020) | 47.1 | 45.6 | 46.7 | 48.8 | 46.7 | 45.5 |
IR Baseline - Vilares and Gómez-Rodríguez (2019) | 39.8 | 33.3 | 36.4 | 42.2 | 35.7 | 36.0 |

Model | Biology | Medicine | Nursing | Pharmacology | Psychology | Chemistry |
---|---|---|---|---|---|---|
Liu et al. (2020) | 200.0 | 198.4 | 184.6 | 217.0 | 197.2 | 186.8 |
IR Baseline - Vilares and Gómez-Rodríguez (2019) | 135.0 | 76.5 | 104.5 | 157.5 | 96.5 | 101.0 |
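
The two metrics reported in the tables above can be sketched as follows. The POINTS rule used here (+3 for a right answer, -1 for a wrong one, 0 for an unanswered question) is an assumption based on the scoring of the source exams and is not stated on this page, so the weights are left as parameters. The function names are hypothetical.

```python
def accuracy(predictions, gold):
    """Percentage of questions answered correctly.

    An unanswered question (prediction `None`) counts as incorrect.
    """
    if len(predictions) != len(gold):
        raise ValueError("predictions and gold must have the same length")
    if not gold:
        raise ValueError("cannot score an empty exam")
    correct = sum(1 for p, g in zip(predictions, gold) if p == g)
    return 100.0 * correct / len(gold)


def points(predictions, gold, right=3, wrong=-1):
    """Exam-style score for one exam.

    Assumed rule: `right` points per correct answer, `wrong` points per
    incorrect answer, and nothing for unanswered questions (`None`).
    """
    if len(predictions) != len(gold):
        raise ValueError("predictions and gold must have the same length")
    score = 0
    for p, g in zip(predictions, gold):
        if p is None:
            continue  # skipped question: no reward, no penalty
        score += right if p == g else wrong
    return score


# Four questions: two right, one wrong, one left unanswered.
predicted = [1, 2, 3, None]
reference = [1, 2, 4, 4]
print(accuracy(predicted, reference))  # 50.0
print(points(predicted, reference))    # 5
```

Under this rule a system that guesses uniformly among four options scores 0 POINTS in expectation, which is why POINTS separates systems more sharply than accuracy does.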
The Ministerio de Sanidad, Consumo y Bienestar Social allows the redistribution of the exams and their content under certain conditions: