BioASQ Participants Area
Task 9b: Test Results of Phase B
The test results are presented in separate tables for each type of annotation. Each system is listed under its "System Description". The evaluation measures used in Task B are presented here.
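As a rough guide to reading the Exact Answers columns, the sketch below computes the yes/no and factoid measures from their usual definitions. It is an illustration, not the official BioASQ evaluation code: the function names, the input layout, and the choice to count an undefined F1 as 0.0 are all assumptions. The 13 "yes" / 14 "no" split in the example is also an assumption, chosen because it reproduces the values shown for systems that report no F1 No in test batch 1.

```python
from typing import Optional, Sequence


def f1(tp: int, fp: int, fn: int) -> float:
    """F1 for one class; taken as 0.0 when undefined (assumption)."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def yes_no_scores(gold: Sequence[str], pred: Sequence[str]) -> dict:
    """Accuracy, per-class F1 and macro F1 for yes/no answers."""
    pairs = list(zip(gold, pred))
    scores = {}
    for label in ("yes", "no"):
        tp = sum(g == label and p == label for g, p in pairs)
        fp = sum(g != label and p == label for g, p in pairs)
        fn = sum(g == label and p != label for g, p in pairs)
        scores[f"f1_{label}"] = f1(tp, fp, fn)
    # Macro F1 is the unweighted mean of the two per-class F1 values.
    scores["macro_f1"] = (scores["f1_yes"] + scores["f1_no"]) / 2
    scores["accuracy"] = sum(g == p for g, p in pairs) / len(pairs)
    return scores


def factoid_scores(first_correct_ranks: Sequence[Optional[int]]) -> dict:
    """Strict accuracy, lenient accuracy and MRR for factoid answers.

    Each entry is the 1-based rank of the first correct answer in the
    returned list for one question, or None if no returned answer is correct.
    """
    n = len(first_correct_ranks)
    hits = [r for r in first_correct_ranks if r is not None]
    return {
        "strict_acc": sum(r == 1 for r in hits) / n,  # correct at rank 1
        "lenient_acc": len(hits) / n,                 # correct anywhere in the list
        "mrr": sum(1 / r for r in hits) / n,          # mean reciprocal rank
    }


# Hypothetical batch: 13 "yes" and 14 "no" questions, system always says "yes".
gold = ["yes"] * 13 + ["no"] * 14
pred = ["yes"] * 27
always_yes = yes_no_scores(gold, pred)
# accuracy = 13/27 ≈ 0.4815, F1 yes = 0.65, F1 no = 0.0, macro F1 = 0.325

# Four hypothetical factoid questions: correct at rank 1, 2, never, 5.
factoid = factoid_scores([1, 2, None, 5])
# strict = 0.25, lenient = 0.75, MRR = (1 + 1/2 + 1/5) / 4 = 0.425
```

Under these assumptions, a system that always answers "yes" can reach a moderate accuracy while its Macro F1 is half of its F1 Yes, which is why Macro F1 is the more informative yes/no column.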
Warning: For ideal answers, good ROUGE results do not always imply good manual scores.
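One way to check this warning against the numbers is a rank correlation between a ROUGE column and a manual column. The sketch below does so for eight rows copied from the Ideal Answers table of test batch 1, pairing R-2 (Rec) with Readability. The rows are a hand-picked subset for illustration only, and using Spearman's coefficient is an assumption of this example, not something the BioASQ organisers prescribe.

```python
from statistics import mean


def average_ranks(values):
    """1-based ranks in ascending order; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman's rho, computed as the Pearson correlation of the ranks."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# (system, R-2 (Rec), Readability) taken from test batch 1, Ideal Answers.
rows = [
    ("MQ-1", 0.4813, 4.10),
    ("simple truncation", 0.5565, 3.87),
    ("kmeans", 0.5999, 3.79),
    ("abstractive", 0.1971, 3.67),
    ("bio-answerfinder", 0.4359, 4.27),
    ("bio-answerfinder-2", 0.3684, 4.39),
    ("AUEB-System1", 0.2002, 3.88),
    ("KU-DMIS-1", 0.2311, 4.42),
]
rho = spearman([r[1] for r in rows], [r[2] for r in rows])
print(f"Spearman rho between R-2 (Rec) and Readability: {rho:.4f}")
```

For this subset the coefficient is close to zero (slightly negative), so the ordering by ROUGE recall says very little about the ordering by readability. A different selection of rows or columns may give a different picture.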
Test batch 1
Exact Answers
| System | Yes/No: Macro F1 | Yes/No: F1 Yes | Yes/No: F1 No | Yes/No: Accuracy | Factoid: Strict Acc. | Factoid: Lenient Acc. | Factoid: MRR | List: Mean Prec. | List: Recall | List: F-Measure |
|---|---|---|---|---|---|---|---|---|---|---|
| MQ-1 | 0.3250 | 0.6500 | - | 0.4815 | - | - | - | - | - | - |
| MQ-2 | 0.3250 | 0.6500 | - | 0.4815 | - | - | - | - | - | - |
| MQ-3 | 0.3250 | 0.6500 | - | 0.4815 | - | - | - | - | - | - |
| simple truncation | 0.3250 | 0.6500 | - | 0.4815 | - | - | - | - | - | - |
| kmeans | 0.3250 | 0.6500 | - | 0.4815 | - | - | - | - | - | - |
| similarity measures | 0.5903 | 0.5600 | 0.6207 | 0.5926 | 0.0345 | 0.0345 | 0.0345 | - | - | - |
| LASIGE_ULISBOA | 0.7699 | 0.8125 | 0.7273 | 0.7778 | 0.2414 | 0.4828 | 0.3506 | 0.4357 | 0.6044 | 0.4860 |
| extractive | 0.3250 | 0.6500 | - | 0.4815 | - | - | - | - | - | - |
| abstractive | 0.3250 | 0.6500 | - | 0.4815 | - | - | - | - | - | - |
| bio-answerfinder | 0.5278 | 0.7027 | 0.3529 | 0.5926 | 0.1724 | 0.3448 | 0.2529 | 0.5508 | 0.4119 | 0.4289 |
| LASIGE_ULISBOA_2 | 0.6824 | 0.7647 | 0.6000 | 0.7037 | 0.2069 | 0.4828 | 0.3075 | 0.4667 | 0.5393 | 0.4828 |
| LASIGE_ULISBOA_3 | 0.7699 | 0.8125 | 0.7273 | 0.7778 | 0.2069 | 0.4138 | 0.2902 | 0.4443 | 0.6139 | 0.4823 |
| AUEB-System1 | 0.3250 | 0.6500 | - | 0.4815 | 0.1379 | 0.3103 | 0.2126 | 0.2365 | 0.2897 | 0.2380 |
| AUEB-System2 | 0.3250 | 0.6500 | - | 0.4815 | 0.2414 | 0.3103 | 0.2655 | 0.3114 | 0.3310 | 0.2796 |
| AUEB-System4 | 0.3250 | 0.6500 | - | 0.4815 | 0.3103 | 0.4138 | 0.3448 | 0.2060 | 0.2913 | 0.2198 |
| MQ-4 | 0.3250 | 0.6500 | - | 0.4815 | - | - | - | - | - | - |
| UoT_multitask_learn | 0.6291 | 0.6429 | 0.6154 | 0.6296 | 0.2414 | 0.3103 | 0.2759 | 0.4929 | 0.3468 | 0.3719 |
| UoT_baseline | 0.5398 | 0.6250 | 0.4545 | 0.5556 | 0.2759 | 0.4138 | 0.3345 | 0.4397 | 0.3631 | 0.3661 |
| bio-answerfinder-2 | 0.5278 | 0.7027 | 0.3529 | 0.5926 | 0.1724 | 0.3448 | 0.2529 | 0.5508 | 0.4119 | 0.4289 |
| UoT_allquestions | 0.5833 | 0.7222 | 0.4444 | 0.6296 | 0.1724 | 0.3448 | 0.2443 | 0.3821 | 0.2623 | 0.2659 |
| Best yesno | 0.6346 | 0.7429 | 0.5263 | 0.6667 | 0.2069 | 0.4828 | 0.3132 | 0.4921 | 0.3266 | 0.3580 |
| Best factoid | 0.5000 | 0.6667 | 0.3333 | 0.5556 | 0.2414 | 0.4138 | 0.3132 | 0.5083 | 0.3365 | 0.3588 |
| The First System Run | 0.3250 | 0.6500 | - | 0.4815 | - | - | - | - | - | - |
| Macquarie CRJ Run 2 | 0.3250 | 0.6500 | - | 0.4815 | - | - | - | - | - | - |
| Macquarie CRJ Run 3 | 0.3250 | 0.6500 | - | 0.4815 | - | - | - | - | - | - |
| ALBERT | 0.3250 | 0.6500 | - | 0.4815 | 0.3448 | 0.5862 | 0.4379 | - | - | - |
| AUEB-System3 | 0.3250 | 0.6500 | - | 0.4815 | 0.2069 | 0.3448 | 0.2701 | 0.2756 | 0.3897 | 0.2790 |
| Ir_sys1 | 0.7699 | 0.8125 | 0.7273 | 0.7778 | 0.3793 | 0.4828 | 0.4224 | 0.5730 | 0.3317 | 0.3628 |
| Ir_sys2 | 0.8138 | 0.8276 | 0.8000 | 0.8148 | 0.3448 | 0.5172 | 0.4149 | 0.4881 | 0.2619 | 0.2800 |
| Ir_sys3 | 0.7699 | 0.8125 | 0.7273 | 0.7778 | 0.3103 | 0.4828 | 0.3707 | 0.5026 | 0.2524 | 0.2768 |
| Ir_sys4 | 0.8138 | 0.8276 | 0.8000 | 0.8148 | 0.2759 | 0.5517 | 0.3713 | 0.4848 | 0.2841 | 0.3021 |
| lalala | 0.7699 | 0.8125 | 0.7273 | 0.7778 | 0.4138 | 0.5172 | 0.4552 | 0.5730 | 0.3317 | 0.3628 |
| Ensemble | 0.3250 | 0.6500 | - | 0.4815 | 0.4138 | 0.5862 | 0.4632 | - | - | - |
| Another ALBERT | 0.3250 | 0.6500 | - | 0.4815 | 0.4138 | 0.5517 | 0.4621 | - | - | - |
| Final BERT | 0.3250 | 0.6500 | - | 0.4815 | 0.3103 | 0.4138 | 0.3563 | - | - | - |
| MRes | 0.3250 | 0.6500 | - | 0.4815 | 0.3448 | 0.4828 | 0.4034 | - | - | - |
| KU-DMIS-1 | 0.8107 | 0.8387 | 0.7826 | 0.8148 | 0.2759 | 0.5172 | 0.3718 | 0.4905 | 0.6667 | 0.5339 |
| KU-DMIS-2 | 0.7699 | 0.8125 | 0.7273 | 0.7778 | 0.2759 | 0.5172 | 0.3718 | 0.4315 | 0.5976 | 0.4690 |
| KU-DMIS-3 | 0.7273 | 0.7879 | 0.6667 | 0.7407 | 0.2414 | 0.5862 | 0.3879 | 0.4483 | 0.5984 | 0.4729 |
| KU-DMIS-4 | 0.6494 | 0.7273 | 0.5714 | 0.6667 | 0.2759 | 0.5172 | 0.3718 | 0.5240 | 0.4690 | 0.4521 |
| KU-DMIS-5 | 0.9258 | 0.9286 | 0.9231 | 0.9259 | 0.2759 | 0.5862 | 0.3856 | 0.3921 | 0.5099 | 0.4143 |
| BioASQ_Baseline | 0.4000 | 0.1333 | 0.6667 | 0.5185 | 0.1379 | 0.2759 | 0.1782 | 0.2568 | 0.4095 | 0.2571 |
Ideal Answers
| System | Automatic: R-2 (Rec) | Automatic: R-2 (F1) | Automatic: R-SU4 (Rec) | Automatic: R-SU4 (F1) | Manual: Readability | Manual: Recall | Manual: Precision | Manual: Repetition |
|---|---|---|---|---|---|---|---|---|
| MQ-1 | 0.4813 | 0.2919 | 0.4779 | 0.2806 | 4.10 | 4.44 | 3.98 | 4.06 |
| MQ-2 | 0.4934 | 0.3030 | 0.4913 | 0.2926 | 4.12 | 4.46 | 4.05 | 4.14 |
| MQ-3 | 0.4702 | 0.2890 | 0.4690 | 0.2803 | 4.13 | 4.52 | 4.07 | 4.15 |
| simple truncation | 0.5565 | 0.2796 | 0.5558 | 0.2676 | 3.87 | 4.64 | 4.01 | 3.81 |
| kmeans | 0.5999 | 0.2725 | 0.5951 | 0.2583 | 3.79 | 4.63 | 3.91 | 3.71 |
| similarity measures | 0.5999 | 0.2725 | 0.5951 | 0.2583 | 3.79 | 4.63 | 3.91 | 3.71 |
| LASIGE_ULISBOA | - | - | - | - | 0.33 | 0.33 | 0.33 | 0.33 |
| extractive | 0.3499 | 0.2615 | 0.3536 | 0.2533 | 4.17 | 4.00 | 4.02 | 4.43 |
| abstractive | 0.1971 | 0.1510 | 0.2042 | 0.1539 | 3.67 | 2.38 | 2.76 | 4.14 |
| bio-answerfinder | 0.4359 | 0.3182 | 0.4296 | 0.3085 | 4.27 | 4.36 | 4.20 | 4.56 |
| LASIGE_ULISBOA_2 | - | - | - | - | 0.33 | 0.33 | 0.33 | 0.33 |
| LASIGE_ULISBOA_3 | - | - | - | - | 0.33 | 0.33 | 0.33 | 0.33 |
| AUEB-System1 | 0.2002 | 0.1186 | 0.2259 | 0.1241 | 3.88 | 3.43 | 3.27 | 4.54 |
| AUEB-System2 | 0.1882 | 0.1075 | 0.2163 | 0.1148 | 3.94 | 3.33 | 3.24 | 4.49 |
| AUEB-System4 | 0.4584 | 0.2840 | 0.4571 | 0.2739 | 3.94 | 4.41 | 3.91 | 4.29 |
| MQ-4 | 0.5005 | 0.3082 | 0.4963 | 0.2966 | 4.02 | 4.45 | 4.03 | 4.04 |
| UoT_multitask_learn | - | - | - | - | 0.33 | 0.33 | 0.33 | 0.33 |
| UoT_baseline | - | - | - | - | 0.33 | 0.33 | 0.33 | 0.33 |
| bio-answerfinder-2 | 0.3684 | 0.3290 | 0.3647 | 0.3211 | 4.39 | 4.12 | 4.28 | 4.79 |
| UoT_allquestions | - | - | - | - | 0.33 | 0.33 | 0.33 | 0.33 |
| Best yesno | - | - | - | - | 0.33 | 0.33 | 0.33 | 0.33 |
| Best factoid | - | - | - | - | 0.33 | 0.33 | 0.33 | 0.33 |
| The First System Run | 0.3791 | 0.2746 | 0.3838 | 0.2693 | 4.15 | 4.17 | 4.03 | 4.29 |
| Macquarie CRJ Run 2 | 0.4787 | 0.2804 | 0.4785 | 0.2700 | 3.95 | 4.45 | 3.94 | 4.03 |
| Macquarie CRJ Run 3 | 0.5184 | 0.2796 | 0.5186 | 0.2683 | 3.87 | 4.46 | 3.86 | 3.95 |
| ALBERT | - | - | - | - | 0.36 | 0.33 | 0.36 | 0.33 |
| AUEB-System3 | 0.1986 | 0.1170 | 0.2233 | 0.1222 | 3.89 | 3.49 | 3.41 | 4.46 |
| Ir_sys1 | 0.2823 | 0.2063 | 0.2873 | 0.2064 | 4.12 | 3.50 | 3.48 | 4.42 |
| Ir_sys2 | 0.2871 | 0.1938 | 0.2929 | 0.1939 | 4.07 | 3.47 | 3.48 | 4.33 |
| Ir_sys3 | 0.2874 | 0.1767 | 0.2946 | 0.1767 | 3.99 | 3.36 | 3.37 | 4.30 |
| Ir_sys4 | 0.2868 | 0.1833 | 0.2932 | 0.1840 | 3.99 | 3.39 | 3.42 | 4.24 |
| lalala | 0.2802 | 0.1694 | 0.2888 | 0.1704 | 4.17 | 3.35 | 3.33 | 4.30 |
| Ensemble | - | - | - | - | 0.36 | 0.33 | 0.36 | 0.33 |
| Another ALBERT | - | - | - | - | 0.36 | 0.33 | 0.36 | 0.33 |
| Final BERT | - | - | - | - | 0.36 | 0.33 | 0.36 | 0.33 |
| MRes | - | - | - | - | 0.36 | 0.33 | 0.36 | 0.33 |
| KU-DMIS-1 | 0.2311 | 0.2060 | 0.2362 | 0.2066 | 4.42 | 3.52 | 3.67 | 4.67 |
| KU-DMIS-2 | 0.2157 | 0.1752 | 0.2234 | 0.1764 | 4.42 | 3.53 | 3.63 | 4.64 |
| KU-DMIS-3 | 0.2150 | 0.1771 | 0.2196 | 0.1759 | 4.45 | 3.57 | 3.59 | 4.67 |
| KU-DMIS-4 | 0.1908 | 0.1661 | 0.1997 | 0.1702 | 4.41 | 3.58 | 3.65 | 4.71 |
| KU-DMIS-5 | 0.2139 | 0.1802 | 0.2244 | 0.1843 | 4.42 | 3.54 | 3.61 | 4.70 |
| BioASQ_Baseline | - | - | - | - | - | - | - | - |
Test batch 2
Exact Answers
| System | Yes/No: Macro F1 | Yes/No: F1 Yes | Yes/No: F1 No | Yes/No: Accuracy | Factoid: Strict Acc. | Factoid: Lenient Acc. | Factoid: MRR | List: Mean Prec. | List: Recall | List: F-Measure |
|---|---|---|---|---|---|---|---|---|---|---|
| MQ-1 | 0.4054 | 0.8108 | - | 0.6818 | - | - | - | - | - | - |
| MQ-2 | 0.4054 | 0.8108 | - | 0.6818 | - | - | - | - | - | - |
| MQ-3 | 0.4054 | 0.8108 | - | 0.6818 | - | - | - | - | - | - |
| MQ-4 | 0.4054 | 0.8108 | - | 0.6818 | - | - | - | - | - | - |
| MQ-5 | 0.4054 | 0.8108 | - | 0.6818 | - | - | - | - | - | - |
| Olive-DocFetchV2 | 0.4054 | 0.8108 | - | 0.6818 | 0.2353 | 0.3529 | 0.2941 | 0.0500 | 0.0125 | 0.0200 |
| Olive-DocFetchV1 | 0.4054 | 0.8108 | - | 0.6818 | 0.1176 | 0.2059 | 0.1618 | 0.0500 | 0.0125 | 0.0200 |
| MollyHaywardSmall | 0.6508 | 0.8571 | 0.4444 | 0.7727 | 0.2941 | 0.4412 | 0.3402 | 0.2589 | 0.4774 | 0.3195 |
| bio-answerfinder | 0.8952 | 0.9333 | 0.8571 | 0.9091 | 0.5000 | 0.5588 | 0.5294 | 0.4839 | 0.4860 | 0.4662 |
| AUEB-System1 | 0.4054 | 0.8108 | - | 0.6818 | 0.4118 | 0.5000 | 0.4510 | 0.1548 | 0.2423 | 0.1630 |
| AUEB-System2 | 0.4054 | 0.8108 | - | 0.6818 | 0.4118 | 0.5294 | 0.4706 | 0.3978 | 0.3519 | 0.3127 |
| AUEB-System3 | 0.4054 | 0.8108 | - | 0.6818 | 0.3824 | 0.5294 | 0.4559 | 0.2385 | 0.3590 | 0.2461 |
| AUEB-System4 | 0.4054 | 0.8108 | - | 0.6818 | 0.3529 | 0.5294 | 0.4186 | 0.0826 | 0.2021 | 0.1072 |
| bio-answerfinder-2 | 0.8952 | 0.9333 | 0.8571 | 0.9091 | 0.5000 | 0.5588 | 0.5294 | 0.4839 | 0.4860 | 0.4662 |
| The First System Run | 0.4054 | 0.8108 | - | 0.6818 | - | - | - | - | - | - |
| Macquarie CRJ Run 2 | 0.4054 | 0.8108 | - | 0.6818 | - | - | - | - | - | - |
| Macquarie CRJ Run 3 | 0.4054 | 0.8108 | - | 0.6818 | - | - | - | - | - | - |
| LASIGE_ULISBOA | 0.9454 | 0.9677 | 0.9231 | 0.9545 | 0.5000 | 0.7647 | 0.6127 | 0.4701 | 0.5829 | 0.4818 |
| MollyHaywardBase | 0.5758 | 0.7879 | 0.3636 | 0.6818 | 0.2647 | 0.5294 | 0.3676 | 0.2658 | 0.2676 | 0.2509 |
| LASIGE_ULISBOA_2 | 0.7708 | 0.8750 | 0.6667 | 0.8182 | 0.4706 | 0.7353 | 0.5608 | 0.3900 | 0.3571 | 0.3516 |
| LASIGE_ULISBOA_3 | 0.3889 | 0.7778 | - | 0.6364 | 0.5000 | 0.7647 | 0.6127 | 0.3752 | 0.5786 | 0.4361 |
| UDEL-LAB1 | 0.4054 | 0.8108 | - | 0.6818 | 0.5000 | 0.7353 | 0.6078 | 0.0500 | 0.0125 | 0.0200 |
| UDEL-LAB2 | 0.4054 | 0.8108 | - | 0.6818 | 0.3824 | 0.7059 | 0.5172 | 0.0500 | 0.0125 | 0.0200 |
| KU-DMIS-1 | 0.8854 | 0.9375 | 0.8333 | 0.9091 | 0.4706 | 0.6765 | 0.5588 | 0.4211 | 0.5843 | 0.4697 |
| KU-DMIS-2 | 0.8854 | 0.9375 | 0.8333 | 0.9091 | 0.4706 | 0.6765 | 0.5564 | 0.4004 | 0.5996 | 0.4590 |
| KU-DMIS-3 | 0.8362 | 0.9032 | 0.7692 | 0.8636 | 0.4706 | 0.6765 | 0.5588 | 0.4013 | 0.6471 | 0.4637 |
| KU-DMIS-4 | 0.8854 | 0.9375 | 0.8333 | 0.9091 | 0.4706 | 0.6765 | 0.5564 | 0.5189 | 0.5342 | 0.4644 |
| KU-DMIS-5 | 0.8854 | 0.9375 | 0.8333 | 0.9091 | 0.4706 | 0.6765 | 0.5588 | 0.4209 | 0.5437 | 0.4471 |
| ALBERT | 0.4054 | 0.8108 | - | 0.6818 | 0.4118 | 0.6765 | 0.5181 | - | - | - |
| Ir_sys1 | 0.6970 | 0.8485 | 0.5455 | 0.7727 | 0.4118 | 0.7353 | 0.5500 | 0.6117 | 0.4582 | 0.4590 |
| Ir_sys2 | 0.6970 | 0.8485 | 0.5455 | 0.7727 | 0.4118 | 0.7647 | 0.5647 | 0.5980 | 0.4140 | 0.4350 |
| Ir_sys3 | 0.6970 | 0.8485 | 0.5455 | 0.7727 | 0.4118 | 0.7941 | 0.5588 | 0.6829 | 0.4544 | 0.4892 |
| lalala | 0.6970 | 0.8485 | 0.5455 | 0.7727 | 0.4706 | 0.7941 | 0.5926 | 0.6117 | 0.4582 | 0.4590 |
| Ensemble | 0.4054 | 0.8108 | - | 0.6818 | 0.4412 | 0.6765 | 0.5451 | - | - | - |
| pa-1 | 0.8182 | 0.9091 | 0.7273 | 0.8636 | 0.4412 | 0.6765 | 0.5328 | 0.2267 | 0.3864 | 0.2708 |
| Another ALBERT | 0.4054 | 0.8108 | - | 0.6818 | 0.4706 | 0.7059 | 0.5760 | - | - | - |
| Best factoid | 0.7412 | 0.8824 | 0.6000 | 0.8182 | 0.2941 | 0.6765 | 0.4397 | 0.4444 | 0.3844 | 0.3626 |
| pa-2 | 0.6508 | 0.8571 | 0.4444 | 0.7727 | 0.2353 | 0.5294 | 0.3456 | 0.1100 | 0.1692 | 0.1287 |
| UoT_allquestions | 0.7905 | 0.8667 | 0.7143 | 0.8182 | 0.1765 | 0.3235 | 0.2324 | 0.3457 | 0.3848 | 0.3361 |
| UoT_multitask_learn | 0.8854 | 0.9375 | 0.8333 | 0.9091 | 0.2941 | 0.5588 | 0.3931 | 0.4083 | 0.3752 | 0.3395 |
| UoT_baseline | 0.7412 | 0.8824 | 0.6000 | 0.8182 | 0.2353 | 0.5000 | 0.3417 | 0.5413 | 0.4123 | 0.4215 |
| Best yesno | 0.7412 | 0.8824 | 0.6000 | 0.8182 | 0.3235 | 0.5000 | 0.3907 | 0.4849 | 0.4069 | 0.3840 |
| Final BERT | 0.4054 | 0.8108 | - | 0.6818 | 0.4412 | 0.7353 | 0.5574 | - | - | - |
| MRes | 0.4054 | 0.8108 | - | 0.6818 | 0.4412 | 0.7353 | 0.5485 | - | - | - |
| Ir_sys4 | 0.7412 | 0.8824 | 0.6000 | 0.8182 | 0.4118 | 0.7941 | 0.5583 | 0.6417 | 0.5115 | 0.5047 |
| BioASQ_Baseline | 0.3979 | 0.3158 | 0.4800 | 0.4091 | 0.0882 | 0.3824 | 0.1882 | 0.2031 | 0.3710 | 0.2382 |
Ideal Answers
| System | Automatic: R-2 (Rec) | Automatic: R-2 (F1) | Automatic: R-SU4 (Rec) | Automatic: R-SU4 (F1) | Manual: Readability | Manual: Recall | Manual: Precision | Manual: Repetition |
|---|---|---|---|---|---|---|---|---|
| MQ-1 | 0.5178 | 0.3295 | 0.5246 | 0.3217 | 4.09 | 4.54 | 4.15 | 4.25 |
| MQ-2 | 0.5315 | 0.3336 | 0.5366 | 0.3256 | 4.01 | 4.49 | 4.06 | 4.22 |
| MQ-3 | 0.5355 | 0.3387 | 0.5392 | 0.3294 | 3.97 | 4.55 | 4.08 | 4.15 |
| MQ-4 | 0.5404 | 0.3375 | 0.5450 | 0.3291 | 4.03 | 4.60 | 4.11 | 4.21 |
| MQ-5 | 0.5363 | 0.3313 | 0.5389 | 0.3213 | 4.04 | 4.61 | 4.10 | 4.20 |
| Olive-DocFetchV2 | - | - | - | - | - | - | - | - |
| Olive-DocFetchV1 | - | - | - | - | - | - | - | - |
| MollyHaywardSmall | - | - | - | - | - | - | - | - |
| bio-answerfinder | 0.4704 | 0.3551 | 0.4717 | 0.3483 | 4.20 | 4.57 | 4.33 | 4.54 |
| AUEB-System1 | 0.1876 | 0.1191 | 0.2092 | 0.1259 | 3.77 | 3.67 | 3.45 | 4.34 |
| AUEB-System2 | 0.1758 | 0.1115 | 0.2063 | 0.1217 | 3.88 | 3.62 | 3.51 | 4.50 |
| AUEB-System3 | 0.2010 | 0.1302 | 0.2234 | 0.1379 | 3.86 | 3.54 | 3.45 | 4.41 |
| AUEB-System4 | 0.4761 | 0.3102 | 0.4829 | 0.3035 | 4.04 | 4.40 | 3.98 | 4.37 |
| bio-answerfinder-2 | 0.3874 | 0.3568 | 0.3904 | 0.3530 | 4.41 | 4.43 | 4.41 | 4.78 |
| The First System Run | 0.4103 | 0.3008 | 0.4210 | 0.2987 | 4.15 | 4.33 | 4.12 | 4.35 |
| Macquarie CRJ Run 2 | 0.5219 | 0.3132 | 0.5339 | 0.3081 | 3.97 | 4.55 | 3.99 | 4.06 |
| Macquarie CRJ Run 3 | 0.5373 | 0.3059 | 0.5488 | 0.2999 | 3.94 | 4.57 | 4.00 | 4.03 |
| LASIGE_ULISBOA | - | - | - | - | 0.33 | 0.33 | 0.33 | 0.33 |
| MollyHaywardBase | - | - | - | - | - | - | - | - |
| LASIGE_ULISBOA_2 | - | - | - | - | 0.33 | 0.33 | 0.33 | 0.33 |
| LASIGE_ULISBOA_3 | - | - | - | - | 0.33 | 0.33 | 0.33 | 0.33 |
| UDEL-LAB1 | - | - | - | - | - | - | - | - |
| UDEL-LAB2 | - | - | - | - | - | - | - | - |
| KU-DMIS-1 | 0.2794 | 0.2614 | 0.2803 | 0.2577 | 4.42 | 4.09 | 4.02 | 4.77 |
| KU-DMIS-2 | 0.2621 | 0.2460 | 0.2667 | 0.2457 | 4.52 | 4.05 | 3.95 | 4.79 |
| KU-DMIS-3 | 0.2361 | 0.2206 | 0.2425 | 0.2201 | 4.44 | 3.99 | 3.82 | 4.78 |
| KU-DMIS-4 | 0.2432 | 0.2266 | 0.2446 | 0.2240 | 4.54 | 3.74 | 3.77 | 4.83 |
| KU-DMIS-5 | 0.2376 | 0.2212 | 0.2385 | 0.2168 | 4.58 | 3.95 | 3.78 | 4.84 |
| ALBERT | - | - | - | - | 0.37 | 0.38 | 0.37 | 0.37 |
| Ir_sys1 | 0.2719 | 0.2360 | 0.2709 | 0.2310 | 4.12 | 3.37 | 3.47 | 4.51 |
| Ir_sys2 | 0.2775 | 0.2260 | 0.2763 | 0.2214 | 4.14 | 3.41 | 3.41 | 4.47 |
| Ir_sys3 | 0.2744 | 0.2225 | 0.2743 | 0.2191 | 4.09 | 3.46 | 3.48 | 4.48 |
| lalala | 0.2717 | 0.2443 | 0.2680 | 0.2368 | 4.15 | 3.40 | 3.50 | 4.51 |
| Ensemble | - | - | - | - | 0.37 | 0.38 | 0.37 | 0.37 |
| pa-1 | 0.2218 | 0.2223 | 0.2209 | 0.2201 | 4.33 | 3.35 | 3.51 | 4.74 |
| Another ALBERT | - | - | - | - | 0.37 | 0.38 | 0.37 | 0.37 |
| Best factoid | - | - | - | - | 0.33 | 0.33 | 0.33 | 0.33 |
| pa-2 | 0.1268 | 0.1314 | 0.1321 | 0.1362 | 4.25 | 2.76 | 3.29 | 4.74 |
| UoT_allquestions | - | - | - | - | 0.33 | 0.33 | 0.33 | 0.33 |
| UoT_multitask_learn | - | - | - | - | 0.33 | 0.33 | 0.33 | 0.33 |
| UoT_baseline | - | - | - | - | 0.33 | 0.33 | 0.33 | 0.33 |
| Best yesno | - | - | - | - | 0.33 | 0.33 | 0.33 | 0.33 |
| Final BERT | - | - | - | - | 0.37 | 0.38 | 0.37 | 0.37 |
| MRes | - | - | - | - | 0.37 | 0.38 | 0.37 | 0.37 |
| Ir_sys4 | 0.2566 | 0.2199 | 0.2576 | 0.2170 | 4.18 | 3.64 | 3.58 | 4.54 |
| BioASQ_Baseline | - | - | - | - | - | - | - | - |
Test batch 3
Exact Answers
| System | Yes/No: Macro F1 | Yes/No: F1 Yes | Yes/No: F1 No | Yes/No: Accuracy | Factoid: Strict Acc. | Factoid: Lenient Acc. | Factoid: MRR | List: Mean Prec. | List: Recall | List: F-Measure |
|---|---|---|---|---|---|---|---|---|---|---|
| MQ-1 | 0.4146 | 0.8293 | - | 0.7083 | - | - | - | - | - | - |
| MQ-2 | 0.4146 | 0.8293 | - | 0.7083 | - | - | - | - | - | - |
| MQ-3 | 0.4146 | 0.8293 | - | 0.7083 | - | - | - | - | - | - |
| MQ-4 | 0.4146 | 0.8293 | - | 0.7083 | - | - | - | - | - | - |
| MQ-5 | 0.4146 | 0.8293 | - | 0.7083 | - | - | - | - | - | - |
| marga_LASIGE_1 | 0.7576 | 0.8485 | 0.6667 | 0.7917 | 0.4444 | 0.6111 | 0.5056 | 0.6228 | 0.4111 | 0.4602 |
| LASIGE_ULISBOA | 0.7576 | 0.8485 | 0.6667 | 0.7917 | 0.4444 | 0.6111 | 0.5056 | 0.6228 | 0.4111 | 0.4602 |
| LASIGE_ULISBOA_2 | 0.6211 | 0.8421 | 0.4000 | 0.7500 | 0.4444 | 0.5833 | 0.4931 | 0.5386 | 0.5181 | 0.4901 |
| LASIGE_ULISBOA_3 | 0.7363 | 0.8571 | 0.6154 | 0.7917 | 0.4722 | 0.6111 | 0.5157 | 0.5526 | 0.5841 | 0.5220 |
| bio-answerfinder | 0.8889 | 0.9444 | 0.8333 | 0.9167 | 0.5833 | 0.6389 | 0.6111 | 0.5041 | 0.4171 | 0.4155 |
| bio-answerfinder-2 | 0.8889 | 0.9444 | 0.8333 | 0.9167 | 0.5833 | 0.6389 | 0.6111 | 0.5041 | 0.4171 | 0.4155 |
| Olive-Answer-V1 | 0.4146 | 0.8293 | - | 0.7083 | 0.2500 | 0.3611 | 0.3056 | - | - | - |
| Olive-Answer-V2 | 0.4146 | 0.8293 | - | 0.7083 | 0.2222 | 0.3333 | 0.2778 | - | - | - |
| MollyHaywardBase | 0.6667 | 0.8333 | 0.5000 | 0.7500 | 0.2222 | 0.3611 | 0.2681 | 0.3541 | 0.3391 | 0.3030 |
| MollyHaywardSmall | 0.4693 | 0.7568 | 0.1818 | 0.6250 | 0.2500 | 0.3889 | 0.3148 | 0.2387 | 0.3885 | 0.2667 |
| simple truncation | 0.4146 | 0.8293 | - | 0.7083 | - | - | - | - | - | - |
| pa-1 | 0.4000 | 0.8000 | - | 0.6667 | 0.5556 | 0.6667 | 0.5912 | 0.1895 | 0.3070 | 0.2194 |
| pa-2 | 0.4000 | 0.8000 | - | 0.6667 | 0.2500 | 0.5000 | 0.3398 | 0.1474 | 0.2544 | 0.1775 |
| The First System Run | 0.4146 | 0.8293 | - | 0.7083 | - | - | - | - | - | - |
| Macquarie CRJ Run 2 | 0.4146 | 0.8293 | - | 0.7083 | - | - | - | - | - | - |
| Macquarie CRJ Run 3 | 0.4146 | 0.8293 | - | 0.7083 | - | - | - | - | - | - |
| UDEL-LAB2 | 0.4146 | 0.8293 | - | 0.7083 | 0.5000 | 0.7222 | 0.5949 | - | - | - |
| UDEL-LAB3 | 0.4146 | 0.8293 | - | 0.7083 | 0.5556 | 0.7222 | 0.6319 | - | - | - |
| UDEL-LAB1 | 0.4146 | 0.8293 | - | 0.7083 | 0.5556 | 0.6944 | 0.6111 | - | - | - |
| fine-tuned biobert | 0.7474 | 0.8947 | 0.6000 | 0.8333 | 0.4444 | 0.6111 | 0.5194 | - | - | - |
| Base system of ZY | 0.8889 | 0.9444 | 0.8333 | 0.9167 | 0.4444 | 0.6111 | 0.5162 | 0.6075 | 0.4205 | 0.4337 |
| AUEB-System1 | 0.4146 | 0.8293 | - | 0.7083 | 0.3333 | 0.3889 | 0.3565 | 0.2089 | 0.2243 | 0.1717 |
| AUEB-System2 | 0.4146 | 0.8293 | - | 0.7083 | 0.3611 | 0.4167 | 0.3889 | 0.4287 | 0.3548 | 0.3366 |
| AUEB-System3 | 0.4146 | 0.8293 | - | 0.7083 | 0.3889 | 0.4722 | 0.4236 | 0.3043 | 0.3832 | 0.2843 |
| AUEB-System4 | 0.4146 | 0.8293 | - | 0.7083 | 0.2778 | 0.4444 | 0.3495 | 0.0804 | 0.1725 | 0.1041 |
| KU-DMIS-2 | 0.7474 | 0.8947 | 0.6000 | 0.8333 | 0.3611 | 0.5278 | 0.4259 | 0.4811 | 0.6567 | 0.5233 |
| KU-DMIS-3 | 0.8889 | 0.9444 | 0.8333 | 0.9167 | 0.3611 | 0.5278 | 0.4282 | 0.4420 | 0.6311 | 0.4891 |
| KU-DMIS-4 | 0.8889 | 0.9444 | 0.8333 | 0.9167 | 0.3611 | 0.5278 | 0.4259 | 0.4408 | 0.7221 | 0.4785 |
| KU-DMIS-5 | 0.8231 | 0.9189 | 0.7273 | 0.8750 | 0.3889 | 0.5000 | 0.4306 | 0.4942 | 0.6118 | 0.5059 |
| KU-DMIS-1 | 0.8231 | 0.9189 | 0.7273 | 0.8750 | 0.3611 | 0.5278 | 0.4245 | 0.4995 | 0.6645 | 0.5466 |
| kmeans | 0.4146 | 0.8293 | - | 0.7083 | - | - | - | - | - | - |
| similarity measures | 0.4146 | 0.8293 | - | 0.7083 | - | - | - | - | - | - |
| extractive | 0.4146 | 0.8293 | - | 0.7083 | - | - | - | - | - | - |
| abstractive | 0.4146 | 0.8293 | - | 0.7083 | - | - | - | - | - | - |
| UoT_allquestions | 0.7052 | 0.8649 | 0.5455 | 0.7917 | 0.2500 | 0.4722 | 0.3435 | 0.4798 | 0.3462 | 0.3606 |
| UoT_multitask_learn | 0.7052 | 0.8649 | 0.5455 | 0.7917 | 0.3611 | 0.5278 | 0.4259 | 0.5509 | 0.4292 | 0.4423 |
| Best factoid | 0.5214 | 0.8205 | 0.2222 | 0.7083 | 0.3889 | 0.5000 | 0.4375 | 0.5931 | 0.4283 | 0.4282 |
| Best yesno | 0.6581 | 0.8718 | 0.4444 | 0.7917 | 0.4444 | 0.5278 | 0.4769 | 0.5335 | 0.3724 | 0.3922 |
| UoT_baseline | 0.6581 | 0.8718 | 0.4444 | 0.7917 | 0.4444 | 0.4722 | 0.4583 | 0.5147 | 0.4224 | 0.4094 |
| Ir_sys1 | 0.8889 | 0.9444 | 0.8333 | 0.9167 | 0.4722 | 0.5556 | 0.5139 | 0.6862 | 0.5079 | 0.5228 |
| Ir_sys2 | 0.7983 | 0.8824 | 0.7143 | 0.8333 | 0.5833 | 0.6667 | 0.6167 | 0.6485 | 0.4779 | 0.4944 |
| Ir_sys3 | 0.7983 | 0.8824 | 0.7143 | 0.8333 | 0.4167 | 0.6111 | 0.4894 | 0.6095 | 0.4679 | 0.4673 |
| Ir_sys4 | 0.8889 | 0.9444 | 0.8333 | 0.9167 | 0.5278 | 0.6389 | 0.5648 | 0.6095 | 0.4679 | 0.4673 |
| lalala | 0.9473 | 0.9714 | 0.9231 | 0.9583 | 0.4722 | 0.6389 | 0.5310 | 0.5947 | 0.6056 | 0.5721 |
| BioASQ_Baseline | 0.5833 | 0.5833 | 0.5833 | 0.5833 | 0.0278 | 0.3333 | 0.1491 | 0.1650 | 0.4225 | 0.1996 |
Ideal Answers
| System | Automatic: R-2 (Rec) | Automatic: R-2 (F1) | Automatic: R-SU4 (Rec) | Automatic: R-SU4 (F1) | Manual: Readability | Manual: Recall | Manual: Precision | Manual: Repetition |
|---|---|---|---|---|---|---|---|---|
| MQ-1 | 0.4573 | 0.2770 | 0.4697 | 0.2727 | 3.85 | 4.43 | 4.01 | 4.15 |
| MQ-2 | 0.4791 | 0.2957 | 0.4911 | 0.2896 | 3.89 | 4.52 | 4.08 | 4.16 |
| MQ-3 | 0.4805 | 0.2968 | 0.4902 | 0.2903 | 3.89 | 4.41 | 4.00 | 4.12 |
| MQ-4 | 0.4618 | 0.2849 | 0.4721 | 0.2795 | 3.90 | 4.53 | 4.09 | 4.12 |
| MQ-5 | 0.4833 | 0.2940 | 0.4969 | 0.2900 | 4.02 | 4.53 | 4.08 | 4.15 |
| marga_LASIGE_1 | - | - | - | - | 0.34 | 0.39 | 0.34 | 0.34 |
| LASIGE_ULISBOA | - | - | - | - | 0.34 | 0.39 | 0.34 | 0.34 |
| LASIGE_ULISBOA_2 | - | - | - | - | 0.34 | 0.39 | 0.34 | 0.34 |
| LASIGE_ULISBOA_3 | - | - | - | - | 0.34 | 0.39 | 0.34 | 0.34 |
| bio-answerfinder | 0.3694 | 0.2829 | 0.3825 | 0.2851 | 4.14 | 4.35 | 4.13 | 4.61 |
| bio-answerfinder-2 | 0.3138 | 0.2794 | 0.3258 | 0.2844 | 4.26 | 4.33 | 4.31 | 4.84 |
| Olive-Answer-V1 | - | - | - | - | - | - | - | - |
| Olive-Answer-V2 | - | - | - | - | - | - | - | - |
| MollyHaywardBase | - | - | - | - | - | - | - | - |
| MollyHaywardSmall | - | - | - | - | - | - | - | - |
| simple truncation | 0.5775 | 0.2667 | 0.5841 | 0.2573 | 3.78 | 4.71 | 3.91 | 3.89 |
| pa-1 | 0.2535 | 0.2489 | 0.2588 | 0.2478 | 4.22 | 3.61 | 3.88 | 4.72 |
| pa-2 | 0.0904 | 0.0926 | 0.1071 | 0.1042 | 4.08 | 2.71 | 3.05 | 4.49 |
| The First System Run | 0.3818 | 0.2635 | 0.3973 | 0.2624 | 4.06 | 4.34 | 4.18 | 4.41 |
| Macquarie CRJ Run 2 | 0.4960 | 0.2763 | 0.5086 | 0.2697 | 3.79 | 4.49 | 3.97 | 4.07 |
| Macquarie CRJ Run 3 | 0.5327 | 0.2790 | 0.5464 | 0.2714 | 3.76 | 4.55 | 3.84 | 4.11 |
| UDEL-LAB2 | - | - | - | - | - | - | - | - |
| UDEL-LAB3 | - | - | - | - | - | - | - | - |
| UDEL-LAB1 | - | - | - | - | - | - | - | - |
| fine-tuned biobert | - | - | - | - | 0.34 | 0.34 | 0.34 | 0.34 |
| Base system of ZY | - | - | - | - | 0.34 | 0.34 | 0.34 | 0.34 |
| AUEB-System1 | 0.1685 | 0.1037 | 0.1945 | 0.1112 | 3.75 | 3.60 | 3.39 | 4.27 |
| AUEB-System2 | 0.1753 | 0.0997 | 0.2026 | 0.1099 | 3.63 | 3.39 | 3.35 | 4.27 |
| AUEB-System3 | 0.1678 | 0.1009 | 0.1917 | 0.1076 | 3.64 | 3.67 | 3.46 | 4.20 |
| AUEB-System4 | 0.4405 | 0.2785 | 0.4534 | 0.2741 | 3.97 | 4.42 | 3.90 | 4.35 |
| KU-DMIS-2 | 0.2490 | 0.2227 | 0.2569 | 0.2245 | 4.48 | 3.85 | 3.96 | 4.75 |
| KU-DMIS-3 | 0.2161 | 0.2104 | 0.2279 | 0.2139 | 4.39 | 3.68 | 3.74 | 4.69 |
| KU-DMIS-4 | 0.1992 | 0.1874 | 0.2110 | 0.1930 | 4.38 | 3.60 | 3.70 | 4.65 |
| KU-DMIS-5 | 0.2085 | 0.1955 | 0.2180 | 0.1974 | 4.32 | 3.63 | 3.70 | 4.63 |
| KU-DMIS-1 | 0.2174 | 0.2059 | 0.2284 | 0.2106 | 4.40 | 3.85 | 3.98 | 4.66 |
| kmeans | 0.3900 | 0.2667 | 0.3998 | 0.2624 | 4.16 | 4.40 | 4.01 | 4.52 |
| similarity measures | 0.3536 | 0.2640 | 0.3592 | 0.2594 | 4.11 | 4.28 | 3.99 | 4.54 |
| extractive | 0.3900 | 0.2667 | 0.3998 | 0.2624 | 4.16 | 4.40 | 4.01 | 4.52 |
| abstractive | 0.3536 | 0.2640 | 0.3592 | 0.2594 | 4.11 | 4.28 | 3.99 | 4.54 |
| UoT_allquestions | - | - | - | - | 0.33 | 0.33 | 0.34 | 0.34 |
| UoT_multitask_learn | - | - | - | - | 0.33 | 0.33 | 0.34 | 0.34 |
| Best factoid | - | - | - | - | 0.33 | 0.33 | 0.34 | 0.34 |
| Best yesno | - | - | - | - | 0.33 | 0.33 | 0.34 | 0.34 |
| UoT_baseline | - | - | - | - | 0.33 | 0.33 | 0.34 | 0.34 |
| Ir_sys1 | 0.2348 | 0.2033 | 0.2387 | 0.1986 | 3.89 | 3.24 | 3.28 | 4.14 |
| Ir_sys2 | 0.2387 | 0.1815 | 0.2450 | 0.1780 | 3.92 | 3.15 | 3.13 | 4.09 |
| Ir_sys3 | 0.2486 | 0.1816 | 0.2535 | 0.1765 | 3.74 | 3.29 | 3.16 | 4.02 |
| Ir_sys4 | 0.2348 | 0.2033 | 0.2387 | 0.1986 | 3.89 | 3.24 | 3.28 | 4.14 |
| lalala | 0.1992 | 0.1720 | 0.2130 | 0.1756 | 4.34 | 3.85 | 3.85 | 4.64 |
| BioASQ_Baseline | - | - | - | - | - | - | - | - |
Test batch 4
Exact Answers
| System | Yes/No: Macro F1 | Yes/No: F1 Yes | Yes/No: F1 No | Yes/No: Accuracy | Factoid: Strict Acc. | Factoid: Lenient Acc. | Factoid: MRR | List: Mean Prec. | List: Recall | List: F-Measure |
|---|---|---|---|---|---|---|---|---|---|---|
| LASIGE_ULISBOA | 0.7807 | 0.8947 | 0.6667 | 0.8400 | 0.4643 | 0.7500 | 0.5577 | 0.6009 | 0.6535 | 0.6135 |
| LASIGE_ULISBOA_2 | 0.4048 | 0.8095 | - | 0.6800 | 0.4643 | 0.7500 | 0.5577 | 0.6754 | 0.6118 | 0.6291 |
| LASIGE_ULISBOA_3 | 0.7401 | 0.8649 | 0.6154 | 0.8000 | 0.5000 | 0.7143 | 0.5863 | 0.6351 | 0.6513 | 0.6276 |
| MollyHaywardSmall | 0.7807 | 0.8947 | 0.6667 | 0.8400 | 0.4286 | 0.5714 | 0.4792 | 0.4491 | 0.7127 | 0.5258 |
| MollyHaywardBase | 0.6362 | 0.8108 | 0.4615 | 0.7200 | 0.3929 | 0.5000 | 0.4405 | 0.4491 | 0.6649 | 0.5097 |
| bio-answerfinder | 0.7500 | 0.9000 | 0.6000 | 0.8400 | 0.5000 | 0.6071 | 0.5387 | 0.5668 | 0.5632 | 0.5218 |
| bio-answerfinder-2 | 0.7500 | 0.9000 | 0.6000 | 0.8400 | 0.5000 | 0.6071 | 0.5387 | 0.5668 | 0.5632 | 0.5218 |
| Olive-Answer-V1 | 0.4186 | 0.8372 | - | 0.7200 | 0.3214 | 0.3214 | 0.3214 | - | - | - |
| Olive-Answer-V2 | 0.4186 | 0.8372 | - | 0.7200 | 0.2857 | 0.2857 | 0.2857 | - | - | - |
| The First System Run | 0.4186 | 0.8372 | - | 0.7200 | - | - | - | - | - | - |
| Macquarie CRJ Run 2 | 0.4186 | 0.8372 | - | 0.7200 | - | - | - | - | - | - |
| Macquarie CRJ Run 3 | 0.4186 | 0.8372 | - | 0.7200 | - | - | - | - | - | - |
| MQ-1 | 0.4186 | 0.8372 | - | 0.7200 | - | - | - | - | - | - |
| MQ-2 | 0.4186 | 0.8372 | - | 0.7200 | - | - | - | - | - | - |
| MQ-3 | 0.4186 | 0.8372 | - | 0.7200 | - | - | - | - | - | - |
| MQ-4 | 0.4186 | 0.8372 | - | 0.7200 | - | - | - | - | - | - |
| MQ-5 | 0.4186 | 0.8372 | - | 0.7200 | - | - | - | - | - | - |
| KU-DMIS-1 | 0.9480 | 0.9730 | 0.9231 | 0.9600 | 0.5000 | 0.6071 | 0.5310 | 0.6805 | 0.8531 | 0.7405 |
| KU-DMIS-2 | 0.8904 | 0.9474 | 0.8333 | 0.9200 | 0.5000 | 0.6786 | 0.5589 | 0.5831 | 0.7794 | 0.6262 |
| KU-DMIS-3 | 0.8904 | 0.9474 | 0.8333 | 0.9200 | 0.5000 | 0.6429 | 0.5429 | 0.6342 | 0.8123 | 0.6799 |
| KU-DMIS-5 | 0.9008 | 0.9444 | 0.8571 | 0.9200 | 0.5000 | 0.7143 | 0.5726 | 0.6421 | 0.7706 | 0.6667 |
| AUEB-System1 | 0.4186 | 0.8372 | - | 0.7200 | 0.3571 | 0.4643 | 0.4018 | 0.2922 | 0.3830 | 0.2944 |
| AUEB-System2 | 0.4186 | 0.8372 | - | 0.7200 | 0.4286 | 0.4643 | 0.4375 | 0.4647 | 0.4506 | 0.3812 |
| UDEL-LAB1 | 0.4186 | 0.8372 | - | 0.7200 | 0.5357 | 0.7500 | 0.6071 | - | - | - |
| AUEB-System3 | 0.4186 | 0.8372 | - | 0.7200 | 0.3929 | 0.4643 | 0.4137 | 0.3342 | 0.4813 | 0.3327 |
| UDEL-LAB2 | 0.4186 | 0.8372 | - | 0.7200 | 0.5357 | 0.8214 | 0.6470 | - | - | - |
| AUEB-System4 | 0.4186 | 0.8372 | - | 0.7200 | 0.5357 | 0.6786 | 0.6012 | 0.1532 | 0.2529 | 0.1779 |
| AUEB-System5 | 0.4186 | 0.8372 | - | 0.7200 | 0.4643 | 0.5357 | 0.5000 | 0.1891 | 0.3187 | 0.2266 |
| UDEL-LAB3 | 0.4186 | 0.8372 | - | 0.7200 | 0.4643 | 0.6429 | 0.5321 | - | - | - |
| UDEL-LAB4 | 0.4186 | 0.8372 | - | 0.7200 | 0.5000 | 0.8214 | 0.6440 | - | - | - |
| NCU-IISR/AS-GIS-1 | 0.8441 | 0.9189 | 0.7692 | 0.8800 | 0.3571 | 0.6071 | 0.4232 | 0.5526 | 0.4386 | 0.4471 |
| NCU-IISR/AS-GIS-2 | 0.8441 | 0.9189 | 0.7692 | 0.8800 | 0.3571 | 0.6071 | 0.4232 | 0.5526 | 0.4386 | 0.4471 |
| NCU-IISR/AS-GIS-3 | 0.8441 | 0.9189 | 0.7692 | 0.8800 | 0.3571 | 0.6071 | 0.4232 | 0.5526 | 0.4386 | 0.4471 |
| simple truncation | 0.4186 | 0.8372 | - | 0.7200 | 0.0357 | 0.0357 | 0.0357 | - | - | - |
| kmeans | 0.4186 | 0.8372 | - | 0.7200 | 0.0357 | 0.0357 | 0.0357 | - | - | - |
| extractive | 0.4186 | 0.8372 | - | 0.7200 | 0.0357 | 0.0357 | 0.0357 | - | - | - |
| MRes | 0.4186 | 0.8372 | - | 0.7200 | 0.5357 | 0.7143 | 0.5893 | - | - | - |
| Final BERT | 0.4186 | 0.8372 | - | 0.7200 | 0.4643 | 0.6786 | 0.5399 | - | - | - |
| Another ALBERT | 0.4186 | 0.8372 | - | 0.7200 | 0.3929 | 0.5714 | 0.4732 | - | - | - |
| ALBERT | 0.4186 | 0.8372 | - | 0.7200 | 0.2500 | 0.5357 | 0.3810 | - | - | - |
| Ensemble | 0.4186 | 0.8372 | - | 0.7200 | 0.3214 | 0.5714 | 0.4345 | - | - | - |
| Ir_sys1 | 0.9480 | 0.9730 | 0.9231 | 0.9600 | 0.6429 | 0.7857 | 0.6929 | 0.6079 | 0.8004 | 0.6502 |
| Ir_sys2 | 0.8252 | 0.9231 | 0.7273 | 0.8800 | 0.6071 | 0.7500 | 0.6464 | 0.6177 | 0.6943 | 0.5948 |
| Ir_sys3 | 0.5660 | 0.6875 | 0.4444 | 0.6000 | 0.5000 | 0.6429 | 0.5458 | 0.7167 | 0.5583 | 0.5634 |
| Ir_sys4 | 0.5660 | 0.6875 | 0.4444 | 0.6000 | 0.5357 | 0.6786 | 0.5875 | 0.6939 | 0.5171 | 0.5352 |
| lalala | 0.8252 | 0.9231 | 0.7273 | 0.8800 | 0.4643 | 0.7500 | 0.5905 | 0.6226 | 0.4645 | 0.4747 |
| KU-DMIS-4 | 0.8904 | 0.9474 | 0.8333 | 0.9200 | 0.4286 | 0.6786 | 0.5101 | 0.5731 | 0.7478 | 0.6071 |
| UoT_allquestions | 0.6711 | 0.8421 | 0.5000 | 0.7600 | 0.2500 | 0.3929 | 0.3006 | 0.4412 | 0.3307 | 0.3327 |
| UoT_baseline | 0.6612 | 0.8780 | 0.4444 | 0.8000 | 0.4286 | 0.6071 | 0.5012 | 0.6614 | 0.4469 | 0.4804 |
| Best factoid | 0.7500 | 0.9000 | 0.6000 | 0.8400 | 0.5000 | 0.5357 | 0.5089 | 0.7155 | 0.4776 | 0.5121 |
| Best yesno | 0.5536 | 0.8571 | 0.2500 | 0.7600 | 0.3571 | 0.5000 | 0.4226 | 0.6702 | 0.4711 | 0.4932 |
| UoT_multitask_learn | 0.6711 | 0.8421 | 0.5000 | 0.7600 | 0.3571 | 0.4643 | 0.3988 | 0.6734 | 0.4601 | 0.4949 |
| BioASQ_Baseline | 0.3506 | 0.2727 | 0.4286 | 0.3600 | 0.1429 | 0.3571 | 0.2077 | 0.1767 | 0.3056 | 0.1843 |
Ideal Answers
| System | Automatic: R-2 (Rec) | Automatic: R-2 (F1) | Automatic: R-SU4 (Rec) | Automatic: R-SU4 (F1) | Manual: Readability | Manual: Recall | Manual: Precision | Manual: Repetition |
|---|---|---|---|---|---|---|---|---|
| LASIGE_ULISBOA | - | - | - | - | 0.37 | 0.39 | 0.36 | 0.36 |
| LASIGE_ULISBOA_2 | - | - | - | - | 0.37 | 0.39 | 0.36 | 0.36 |
| LASIGE_ULISBOA_3 | - | - | - | - | 0.37 | 0.39 | 0.36 | 0.36 |
| MollyHaywardSmall | - | - | - | - | - | - | - | - |
| MollyHaywardBase | - | - | - | - | - | - | - | - |
| bio-answerfinder | 0.4702 | 0.3401 | 0.4721 | 0.3366 | 4.14 | 4.39 | 4.19 | 4.38 |
| bio-answerfinder-2 | 0.3550 | 0.3431 | 0.3558 | 0.3406 | 4.39 | 4.18 | 4.30 | 4.78 |
| Olive-Answer-V1 | - | - | - | - | - | - | - | - |
| Olive-Answer-V2 | - | - | - | - | - | - | - | - |
| The First System Run | 0.3943 | 0.3131 | 0.3991 | 0.3098 | 4.03 | 4.24 | 3.95 | 4.33 |
| Macquarie CRJ Run 2 | 0.5008 | 0.3194 | 0.5066 | 0.3124 | 3.84 | 4.48 | 3.87 | 4.05 |
| Macquarie CRJ Run 3 | 0.5391 | 0.3175 | 0.5452 | 0.3099 | 3.84 | 4.53 | 3.84 | 4.09 |
| MQ-1 | 0.5382 | 0.3399 | 0.5390 | 0.3312 | 3.93 | 4.39 | 3.94 | 4.10 |
| MQ-2 | 0.5419 | 0.3434 | 0.5434 | 0.3348 | 3.92 | 4.43 | 3.96 | 4.15 |
| MQ-3 | 0.5297 | 0.3456 | 0.5292 | 0.3368 | 4.02 | 4.40 | 3.95 | 4.14 |
| MQ-4 | 0.5353 | 0.3376 | 0.5376 | 0.3296 | 3.97 | 4.35 | 3.95 | 4.16 |
| MQ-5 | 0.5138 | 0.3377 | 0.5172 | 0.3315 | 4.10 | 4.47 | 4.06 | 4.17 |
| KU-DMIS-1 | 0.2686 | 0.2549 | 0.2714 | 0.2514 | 4.41 | 3.99 | 3.92 | 4.75 |
| KU-DMIS-2 | 0.2541 | 0.2345 | 0.2565 | 0.2297 | 4.55 | 3.95 | 3.83 | 4.72 |
| KU-DMIS-3 | 0.2369 | 0.2127 | 0.2361 | 0.2103 | 4.39 | 3.54 | 3.67 | 4.74 |
| KU-DMIS-5 | 0.2381 | 0.2252 | 0.2375 | 0.2243 | 4.35 | 3.62 | 3.67 | 4.64 |
| AUEB-System1 | 0.1097 | 0.0718 | 0.1376 | 0.0837 | 3.96 | 2.64 | 2.54 | 4.40 |
| AUEB-System2 | 0.2413 | 0.1442 | 0.2527 | 0.1447 | 4.11 | 3.72 | 3.52 | 4.52 |
| UDEL-LAB1 | - | - | - | - | - | - | - | - |
| AUEB-System3 | 0.2410 | 0.1417 | 0.2522 | 0.1416 | 4.25 | 3.78 | 3.51 | 4.55 |
| UDEL-LAB2 | - | - | - | - | - | - | - | - |
| AUEB-System4 | 0.4313 | 0.2769 | 0.4342 | 0.2696 | 4.19 | 4.44 | 3.90 | 4.45 |
| AUEB-System5 | 0.4313 | 0.2769 | 0.4342 | 0.2696 | 4.19 | 4.44 | 3.90 | 4.45 |
| UDEL-LAB3 | - | - | - | - | - | - | - | - |
| UDEL-LAB4 | - | - | - | - | - | - | - | - |
| NCU-IISR/AS-GIS-1 | 0.3313 | 0.3267 | 0.3371 | 0.3297 | 4.33 | 3.87 | 4.17 | 4.71 |
| NCU-IISR/AS-GIS-2 | 0.4094 | 0.4012 | 0.4057 | 0.3941 | 4.53 | 4.18 | 4.38 | 4.89 |
| NCU-IISR/AS-GIS-3 | 0.2940 | 0.2815 | 0.2946 | 0.2784 | 4.14 | 3.77 | 3.90 | 4.81 |
| simple truncation | 0.6157 | 0.3333 | 0.6204 | 0.3224 | 3.87 | 4.71 | 3.94 | 3.91 |
| kmeans | 0.6504 | 0.3213 | 0.6533 | 0.3097 | 3.78 | 4.71 | 3.86 | 3.86 |
| extractive | 0.3867 | 0.3043 | 0.3890 | 0.2976 | 4.24 | 4.02 | 3.99 | 4.39 |
| MRes | - | - | - | - | 0.36 | 0.37 | 0.37 | 0.38 |
| Final BERT | - | - | - | - | 0.36 | 0.37 | 0.37 | 0.38 |
| Another ALBERT | - | - | - | - | 0.36 | 0.37 | 0.37 | 0.38 |
| ALBERT | - | - | - | - | 0.36 | 0.37 | 0.37 | 0.38 |
| Ensemble | - | - | - | - | 0.36 | 0.37 | 0.37 | 0.38 |
| Ir_sys1 | 0.2421 | 0.2001 | 0.2623 | 0.2041 | 4.23 | 4.01 | 3.87 | 4.69 |
| Ir_sys2 | 0.2339 | 0.1899 | 0.2456 | 0.1950 | 4.06 | 3.29 | 3.32 | 4.53 |
| Ir_sys3 | 0.5250 | 0.3346 | 0.5275 | 0.3268 | 3.98 | 4.48 | 4.01 | 4.15 |
| Ir_sys4 | 0.5271 | 0.3353 | 0.5288 | 0.3272 | 3.98 | 4.45 | 3.93 | 4.15 |
| lalala | 0.5426 | 0.3392 | 0.5431 | 0.3300 | 3.96 | 4.35 | 3.88 | 4.09 |
| KU-DMIS-4 | 0.2363 | 0.2211 | 0.2431 | 0.2223 | 4.35 | 3.61 | 3.76 | 4.71 |
| UoT_allquestions | - | - | - | - | 0.33 | 0.33 | 0.33 | 0.33 |
| UoT_baseline | - | - | - | - | 0.33 | 0.33 | 0.33 | 0.33 |
| Best factoid | - | - | - | - | 0.33 | 0.33 | 0.33 | 0.33 |
| Best yesno | - | - | - | - | 0.33 | 0.33 | 0.33 | 0.33 |
| UoT_multitask_learn | - | - | - | - | 0.33 | 0.33 | 0.33 | 0.33 |
| BioASQ_Baseline | - | - | - | - | - | - | - | - |
Test batch 5
Exact Answers
| | Yes/No | | | | Factoid | | | List | | |
|---|---|---|---|---|---|---|---|---|---|---|
| System | Macro F1 | F1 Yes | F1 No | Accuracy | Strict Acc. | Lenient Acc. | MRR | Mean Prec. | Recall | F-Measure |
| MQ-2 | 0.4063 | 0.8125 | - | 0.6842 | - | - | - | - | - | - |
| MQ-3 | 0.4063 | 0.8125 | - | 0.6842 | - | - | - | - | - | - |
| MQ-4 | 0.4063 | 0.8125 | - | 0.6842 | - | - | - | - | - | - |
| MQ-5 | 0.4063 | 0.8125 | - | 0.6842 | - | - | - | - | - | - |
| LASIGE_ULISBOA | 0.5929 | 0.7857 | 0.4000 | 0.6842 | 0.3889 | 0.6389 | 0.4884 | 0.3426 | 0.3029 | 0.3048 |
| LASIGE_ULISBOA_2 | 0.5522 | 0.7407 | 0.3636 | 0.6316 | 0.4167 | 0.6111 | 0.4801 | 0.2611 | 0.2932 | 0.2645 |
| LASIGE_ULISBOA_3 | 0.8081 | 0.8889 | 0.7273 | 0.8421 | 0.3333 | 0.6389 | 0.4685 | 0.3611 | 0.2365 | 0.2724 |
| Olive-Answer-V1 | 0.4063 | 0.8125 | - | 0.6842 | 0.1389 | 0.2500 | 0.1944 | - | - | - |
| Olive-Answer-V2 | 0.4063 | 0.8125 | - | 0.6842 | 0.1111 | 0.2222 | 0.1667 | - | - | - |
| finetuning1 | 0.3214 | 0.1429 | 0.5000 | 0.3684 | 0.5000 | 0.6667 | 0.5671 | 0.3444 | 0.3013 | 0.2757 |
| bio-answerfinder | 0.4571 | 0.7143 | 0.2000 | 0.5789 | 0.4722 | 0.5278 | 0.4954 | 0.4550 | 0.3677 | 0.3625 |
| bio-answerfinder-2 | 0.4571 | 0.7143 | 0.2000 | 0.5789 | 0.4722 | 0.5278 | 0.4954 | 0.4550 | 0.3677 | 0.3625 |
| AUEB-System1 | 0.4063 | 0.8125 | - | 0.6842 | 0.3333 | 0.4722 | 0.3889 | 0.1102 | 0.2083 | 0.1396 |
| AUEB-System2 | 0.4063 | 0.8125 | - | 0.6842 | 0.3889 | 0.5278 | 0.4583 | 0.1337 | 0.2546 | 0.1663 |
| AUEB-System3 | 0.4063 | 0.8125 | - | 0.6842 | 0.3611 | 0.5833 | 0.4560 | 0.1099 | 0.2546 | 0.1496 |
| AUEB-System4 | 0.4063 | 0.8125 | - | 0.6842 | 0.3889 | 0.5278 | 0.4421 | 0.1389 | 0.1528 | 0.1405 |
| AUEB-System5 | 0.4063 | 0.8125 | - | 0.6842 | 0.4167 | 0.5278 | 0.4676 | 0.1491 | 0.2685 | 0.1778 |
| Ali_test1 | 0.6360 | 0.8276 | 0.4444 | 0.7368 | 0.4167 | 0.5278 | 0.4639 | 0.1913 | 0.3342 | 0.2269 |
| OliveAnsScore1 | 0.4904 | 0.7586 | 0.2222 | 0.6316 | 0.4167 | 0.5556 | 0.4815 | 0.1838 | 0.3666 | 0.2354 |
| OliveAnsScore2 | 0.6360 | 0.8276 | 0.4444 | 0.7368 | 0.4167 | 0.5278 | 0.4639 | 0.1913 | 0.3342 | 0.2269 |
| OliveSentSim1 | 0.4571 | 0.7143 | 0.2000 | 0.5789 | 0.4167 | 0.5833 | 0.4787 | 0.1783 | 0.3503 | 0.2246 |
| The First System Run | 0.4063 | 0.8125 | - | 0.6842 | - | - | - | - | - | - |
| Macquarie CRJ Run 2 | 0.4063 | 0.8125 | - | 0.6842 | - | - | - | - | - | - |
| Macquarie CRJ Run 3 | 0.4063 | 0.8125 | - | 0.6842 | - | - | - | - | - | - |
| MDS_UNCC | 0.7246 | 0.7826 | 0.6667 | 0.7368 | 0.4167 | 0.6944 | 0.5204 | 0.3444 | 0.3013 | 0.2757 |
| MQ-1 | 0.4063 | 0.8125 | - | 0.6842 | - | - | - | - | - | - |
| KU-DMIS-1 | 0.7077 | 0.8000 | 0.6154 | 0.7368 | 0.3889 | 0.5833 | 0.4630 | 0.3962 | 0.5919 | 0.4600 |
| KU-DMIS-2 | 0.7564 | 0.8462 | 0.6667 | 0.7895 | 0.4167 | 0.6111 | 0.4907 | 0.3354 | 0.4638 | 0.3661 |
| KU-DMIS-3 | 0.7564 | 0.8462 | 0.6667 | 0.7895 | 0.4444 | 0.5556 | 0.4884 | 0.3526 | 0.5692 | 0.4026 |
| KU-DMIS-4 | 0.7077 | 0.8000 | 0.6154 | 0.7368 | 0.3889 | 0.5556 | 0.4514 | 0.4052 | 0.6138 | 0.4637 |
| Ir_sys1 | 0.6801 | 0.8148 | 0.5455 | 0.7368 | 0.5000 | 0.6389 | 0.5532 | 0.4234 | 0.5501 | 0.4295 |
| Ir_sys2 | 0.6360 | 0.8276 | 0.4444 | 0.7368 | 0.5556 | 0.6111 | 0.5787 | 0.4790 | 0.5748 | 0.4773 |
| Ir_sys3 | 0.5522 | 0.7407 | 0.3636 | 0.6316 | 0.4722 | 0.6111 | 0.5116 | 0.4114 | 0.3611 | 0.3518 |
| Ir_sys4 | 0.5522 | 0.7407 | 0.3636 | 0.6316 | 0.4722 | 0.5556 | 0.5000 | 0.4545 | 0.3657 | 0.3685 |
| lalala | 0.6801 | 0.8148 | 0.5455 | 0.7368 | 0.5000 | 0.6389 | 0.5472 | 0.4726 | 0.4390 | 0.4237 |
| bart-base | 0.4063 | 0.8125 | - | 0.6842 | - | - | - | - | - | - |
| mt5-bioasq | 0.4063 | 0.8125 | - | 0.6842 | - | - | - | - | - | - |
| bart-large | 0.4063 | 0.8125 | - | 0.6842 | - | - | - | - | - | - |
| UoT_allquestions | 0.7286 | 0.8571 | 0.6000 | 0.7895 | 0.2222 | 0.3611 | 0.2602 | 0.3386 | 0.3056 | 0.3023 |
| UoT_baseline | 0.5522 | 0.7407 | 0.3636 | 0.6316 | 0.3056 | 0.4444 | 0.3611 | 0.3975 | 0.2951 | 0.3157 |
| Best factoid | 0.5522 | 0.7407 | 0.3636 | 0.6316 | 0.4167 | 0.5000 | 0.4491 | 0.4147 | 0.3248 | 0.3308 |
| Best yesno | 0.4063 | 0.8125 | - | 0.6842 | 0.4167 | 0.5000 | 0.4491 | 0.4147 | 0.3248 | 0.3308 |
| UoT_multitask_learn | 0.4063 | 0.8125 | - | 0.6842 | 0.4167 | 0.5000 | 0.4491 | 0.4147 | 0.3248 | 0.3308 |
| UDEL-LAB1 | 0.6801 | 0.8148 | 0.5455 | 0.7368 | 0.4722 | 0.7222 | 0.5727 | 0.5032 | 0.5710 | 0.5070 |
| UDEL-LAB3 | 0.6801 | 0.8148 | 0.5455 | 0.7368 | 0.5000 | 0.6667 | 0.5694 | 0.5289 | 0.6204 | 0.5306 |
| ALBERT | 0.5522 | 0.7407 | 0.3636 | 0.6316 | 0.3611 | 0.6111 | 0.4606 | - | - | - |
| Another ALBERT | 0.5522 | 0.7407 | 0.3636 | 0.6316 | 0.3611 | 0.6389 | 0.4778 | - | - | - |
| Final BERT | 0.5522 | 0.7407 | 0.3636 | 0.6316 | 0.4444 | 0.6389 | 0.5171 | - | - | - |
| MRes | 0.5522 | 0.7407 | 0.3636 | 0.6316 | 0.4444 | 0.5833 | 0.5056 | - | - | - |
| Ensemble | 0.5522 | 0.7407 | 0.3636 | 0.6316 | 0.3889 | 0.6389 | 0.4870 | - | - | - |
| NCU-IISR/AS-GIS-1 | 0.7077 | 0.8000 | 0.6154 | 0.7368 | 0.4444 | 0.6667 | 0.5287 | 0.4861 | 0.3279 | 0.3599 |
| NCU-IISR/AS-GIS-2 | 0.7077 | 0.8000 | 0.6154 | 0.7368 | 0.4444 | 0.6667 | 0.5287 | 0.4861 | 0.3279 | 0.3599 |
| NCU-IISR/AS-GIS-3 | 0.7077 | 0.8000 | 0.6154 | 0.7368 | 0.4444 | 0.6667 | 0.5287 | 0.4861 | 0.3279 | 0.3599 |
| KU-DMIS-5 | 0.7564 | 0.8462 | 0.6667 | 0.7895 | 0.3889 | 0.5833 | 0.4630 | 0.3826 | 0.4661 | 0.3996 |
| UDEL-LAB4 | 0.5929 | 0.7857 | 0.4000 | 0.6842 | 0.5000 | 0.6944 | 0.5833 | 0.3883 | 0.4556 | 0.3936 |
| UDEL-LAB2 | 0.5929 | 0.7857 | 0.4000 | 0.6842 | 0.5278 | 0.7222 | 0.6019 | 0.3813 | 0.4694 | 0.4031 |
| BioASQ_Baseline | 0.2841 | 0.1333 | 0.4348 | 0.3158 | 0.0278 | 0.2778 | 0.1273 | 0.2209 | 0.3723 | 0.2324 |
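
The Yes/No columns above appear consistent with Macro F1 being the unweighted mean of F1 Yes and F1 No, with a "-" entry counted as 0. This is an observation drawn from the table rows, not the official definition of the measure. The sketch below checks a few rows copied from the table; the helper names `macro_f1` and `matches_reported` are hypothetical and exist only for this illustration.

```python
def macro_f1(f1_yes: str, f1_no: str) -> float:
    """Unweighted mean of the two per-class F1 cells.

    Assumption: a "-" cell (no score reported for that class) counts as 0.0.
    """
    def to_float(cell: str) -> float:
        cell = cell.strip()
        return 0.0 if cell == "-" else float(cell)

    return (to_float(f1_yes) + to_float(f1_no)) / 2


def matches_reported(computed: float, reported: float, tol: float = 1e-4) -> bool:
    """Compare with a tolerance, since the table shows values rounded to 4 decimals."""
    return abs(computed - reported) < tol


# (F1 Yes, F1 No, reported Macro F1) copied from the Test batch 5 table above.
rows = {
    "MQ-2": ("0.8125", "-", 0.4063),
    "LASIGE_ULISBOA": ("0.7857", "0.4000", 0.5929),
    "LASIGE_ULISBOA_3": ("0.8889", "0.7273", 0.8081),
    "KU-DMIS-2": ("0.8462", "0.6667", 0.7564),
    "BioASQ_Baseline": ("0.1333", "0.4348", 0.2841),
}

for name, (f1_yes, f1_no, reported) in rows.items():
    computed = macro_f1(f1_yes, f1_no)
    status = "consistent" if matches_reported(computed, reported) else "differs"
    print(f"{name}: computed={computed:.5f} reported={reported:.4f} ({status})")
```

Under this reading, a system that answers "yes" to every question gets no F1 No score, so its Macro F1 is half of its F1 Yes. This would explain why many rows share the values 0.4063 and 0.8125.
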
Ideal Answers
| | Automatic scores (Rouge - R) | | | | Manual scores | | | |
|---|---|---|---|---|---|---|---|---|
| System | R-2 (Rec) | R-2 (F1) | R-SU4 (Rec) | R-SU4 (F1) | Readability | Recall | Precision | Repetition |
| MQ-2 | 0.5806 | 0.3574 | 0.5841 | 0.3479 | 4.02 | 4.68 | 4.05 | 4.12 |
| MQ-3 | 0.5767 | 0.3633 | 0.5809 | 0.3545 | 4.03 | 4.65 | 4.04 | 4.11 |
| MQ-4 | 0.5571 | 0.3471 | 0.5627 | 0.3395 | 3.96 | 4.66 | 4.06 | 4.09 |
| MQ-5 | 0.5515 | 0.3411 | 0.5552 | 0.3333 | 3.99 | 4.60 | 3.98 | 4.08 |
| LASIGE_ULISBOA | - | - | - | - | 0.33 | 0.33 | 0.33 | 0.33 |
| LASIGE_ULISBOA_2 | - | - | - | - | 0.33 | 0.33 | 0.33 | 0.33 |
| LASIGE_ULISBOA_3 | - | - | - | - | 0.33 | 0.33 | 0.33 | 0.33 |
| Olive-Answer-V1 | - | - | - | - | - | - | - | - |
| Olive-Answer-V2 | - | - | - | - | - | - | - | - |
| finetuning1 | - | - | - | - | - | - | - | - |
| bio-answerfinder | 0.4373 | 0.3223 | 0.4381 | 0.3173 | 4.14 | 4.35 | 4.15 | 4.45 |
| bio-answerfinder-2 | 0.3388 | 0.3253 | 0.3411 | 0.3217 | 4.37 | 4.18 | 4.28 | 4.79 |
| AUEB-System1 | 0.2174 | 0.1373 | 0.2491 | 0.1488 | 4.16 | 3.60 | 3.34 | 4.43 |
| AUEB-System2 | 0.2174 | 0.1373 | 0.2491 | 0.1488 | 4.16 | 3.60 | 3.34 | 4.43 |
| AUEB-System3 | 0.2174 | 0.1373 | 0.2491 | 0.1488 | 4.16 | 3.60 | 3.34 | 4.43 |
| AUEB-System4 | 0.2174 | 0.1373 | 0.2491 | 0.1488 | 4.16 | 3.60 | 3.34 | 4.43 |
| AUEB-System5 | 0.4667 | 0.3190 | 0.4736 | 0.3121 | 4.19 | 4.43 | 3.90 | 4.48 |
| Ali_test1 | - | - | - | - | - | - | - | - |
| OliveAnsScore1 | - | - | - | - | - | - | - | - |
| OliveAnsScore2 | - | - | - | - | - | - | - | - |
| OliveSentSim1 | - | - | - | - | - | - | - | - |
| The First System Run | 0.4035 | 0.3049 | 0.4238 | 0.3060 | 4.11 | 4.54 | 4.01 | 4.42 |
| Macquarie CRJ Run 2 | 0.5213 | 0.3018 | 0.5322 | 0.2964 | 3.93 | 4.71 | 3.88 | 4.01 |
| Macquarie CRJ Run 3 | 0.5406 | 0.2946 | 0.5509 | 0.2890 | 3.88 | 4.74 | 3.87 | 4.02 |
| MDS_UNCC | - | - | - | - | - | - | - | - |
| MQ-1 | 0.5621 | 0.3449 | 0.5643 | 0.3365 | 3.98 | 4.63 | 4.04 | 4.11 |
| KU-DMIS-1 | 0.2500 | 0.2254 | 0.2648 | 0.2281 | 4.47 | 3.81 | 3.80 | 4.72 |
| KU-DMIS-2 | 0.2686 | 0.2504 | 0.2776 | 0.2484 | 4.49 | 3.94 | 3.87 | 4.69 |
| KU-DMIS-3 | 0.2309 | 0.2211 | 0.2440 | 0.2248 | 4.54 | 3.57 | 3.62 | 4.75 |
| KU-DMIS-4 | 0.2763 | 0.2558 | 0.2864 | 0.2563 | 4.52 | 3.71 | 3.78 | 4.76 |
| Ir_sys1 | 0.3071 | 0.2641 | 0.3144 | 0.2641 | 4.07 | 3.55 | 3.56 | 4.36 |
| Ir_sys2 | 0.2383 | 0.1961 | 0.2566 | 0.2029 | 4.26 | 4.04 | 3.82 | 4.54 |
| Ir_sys3 | - | - | - | - | 0.37 | 0.39 | 0.39 | 0.39 |
| Ir_sys4 | - | - | - | - | 0.37 | 0.39 | 0.39 | 0.39 |
| lalala | - | - | - | - | 0.33 | 0.34 | 0.34 | 0.34 |
| bart-base | 0.2609 | 0.2764 | 0.2560 | 0.2673 | 4.13 | 3.61 | 4.01 | 4.81 |
| mt5-bioasq | 0.2967 | 0.2014 | 0.2971 | 0.1949 | 3.49 | 3.62 | 3.58 | 3.94 |
| bart-large | 0.2574 | 0.2722 | 0.2554 | 0.2639 | 4.11 | 3.72 | 4.14 | 4.81 |
| UoT_allquestions | - | - | - | - | 0.36 | 0.34 | 0.36 | 0.38 |
| UoT_baseline | - | - | - | - | 0.36 | 0.34 | 0.36 | 0.38 |
| Best factoid | - | - | - | - | 0.36 | 0.34 | 0.36 | 0.38 |
| Best yesno | - | - | - | - | 0.40 | 0.43 | 0.38 | 0.42 |
| UoT_multitask_learn | - | - | - | - | 0.40 | 0.43 | 0.38 | 0.42 |
| UDEL-LAB1 | - | - | - | - | - | - | - | - |
| UDEL-LAB3 | - | - | - | - | - | - | - | - |
| ALBERT | - | - | - | - | 0.36 | 0.37 | 0.35 | 0.37 |
| Another ALBERT | - | - | - | - | 0.36 | 0.37 | 0.35 | 0.37 |
| Final BERT | - | - | - | - | 0.36 | 0.37 | 0.35 | 0.37 |
| MRes | - | - | - | - | 0.36 | 0.37 | 0.35 | 0.37 |
| Ensemble | - | - | - | - | 0.36 | 0.37 | 0.35 | 0.37 |
| NCU-IISR/AS-GIS-1 | 0.2900 | 0.2757 | 0.3037 | 0.2781 | 4.27 | 4.04 | 4.18 | 4.81 |
| NCU-IISR/AS-GIS-2 | 0.3809 | 0.3701 | 0.3846 | 0.3663 | 4.53 | 4.38 | 4.41 | 4.88 |
| NCU-IISR/AS-GIS-3 | 0.2654 | 0.2485 | 0.2785 | 0.2502 | 4.21 | 3.86 | 3.77 | 4.65 |
| KU-DMIS-5 | 0.2213 | 0.2074 | 0.2385 | 0.2138 | 4.47 | 3.47 | 3.53 | 4.75 |
| UDEL-LAB4 | - | - | - | - | - | - | - | - |
| UDEL-LAB2 | - | - | - | - | - | - | - | - |
| BioASQ_Baseline | - | - | - | - | - | - | - | - |