BioASQ Participants Area
Task 12b: Test Results of Phase A+
The test results are presented in separate tables for each type of annotation. Each system is listed under the name given in its "System Description". The evaluation measures used in Phase A+ are presented here.
Warning: For ideal answers, good ROUGE results do not always imply good manual scores.
Test batch 1
Exact Answers
| System | Yes/No: Accuracy | Yes/No: F1 Yes | Yes/No: F1 No | Yes/No: Macro F1 | Factoid: Strict Acc. | Factoid: Lenient Acc. | Factoid: MRR | List: Mean Prec. | List: Recall | List: F-Measure |
|---|---|---|---|---|---|---|---|---|---|---|
| mibi_rag_snippet | 0.8000 | 0.8387 | 0.7368 | 0.7878 | 0.0476 | 0.0476 | 0.0476 | 0.2730 | 0.2008 | 0.2175 |
| mibi_rag_abstract | 0.7600 | 0.8125 | 0.6667 | 0.7396 | 0.0476 | 0.0476 | 0.0476 | 0.3127 | 0.2571 | 0.2698 |
| UR-IW-5 | 0.8400 | 0.8571 | 0.8182 | 0.8377 | 0.0952 | 0.1429 | 0.1111 | 0.4258 | 0.3670 | 0.3758 |
| Fleming-1 | 0.8400 | 0.8750 | 0.7778 | 0.8264 | - | - | - | 0.2388 | 0.1848 | 0.2022 |
| GTBioASQsys2 | 0.7600 | 0.7857 | 0.7273 | 0.7565 | 0.1905 | 0.1905 | 0.1905 | 0.2722 | 0.2068 | 0.2229 |
| Gatech competition | 0.8000 | 0.8000 | 0.8000 | 0.8000 | 0.1429 | 0.1429 | 0.1429 | 0.4770 | 0.3477 | 0.3769 |
| GTBioASQsys3 | 0.8000 | 0.8148 | 0.7826 | 0.7987 | 0.2381 | 0.2381 | 0.2381 | 0.2579 | 0.1905 | 0.2006 |
| UR-IW-4 | 0.8800 | 0.8889 | 0.8696 | 0.8792 | 0.0476 | 0.1429 | 0.0873 | 0.4239 | 0.3761 | 0.3698 |
| UR-IW-2 | 0.8800 | 0.8889 | 0.8696 | 0.8792 | 0.0952 | 0.0952 | 0.0952 | 0.5548 | 0.4515 | 0.4706 |
| bioinfo-0 | 0.6000 | 0.7500 | - | 0.3750 | - | - | - | - | - | - |
| UR-IW-3 | 0.9600 | 0.9677 | 0.9474 | 0.9576 | 0.1429 | 0.1429 | 0.1429 | 0.4135 | 0.4282 | 0.4011 |
| UR-IW-1 | 0.8400 | 0.8667 | 0.8000 | 0.8333 | 0.2381 | 0.3333 | 0.2857 | 0.3420 | 0.3829 | 0.3474 |
| bioinfo-1 | 0.6000 | 0.7500 | - | 0.3750 | - | - | - | - | - | - |
| bioinfo-2 | 0.6000 | 0.7500 | - | 0.3750 | - | - | - | - | - | - |
| bioinfo-3 | 0.6000 | 0.7500 | - | 0.3750 | - | - | - | - | - | - |
| bioinfo-4 | 0.6000 | 0.7500 | - | 0.3750 | - | - | - | - | - | - |
| dmiip2024 | 0.8000 | 0.8571 | 0.6667 | 0.7619 | 0.1905 | 0.2381 | 0.2063 | 0.5056 | 0.4021 | 0.4312 |
| dmiip2024_1 | 0.8000 | 0.8571 | 0.6667 | 0.7619 | 0.2381 | 0.5238 | 0.3492 | 0.3413 | 0.3095 | 0.2899 |
| dmiip2024_3 | 0.8000 | 0.8571 | 0.6667 | 0.7619 | 0.2381 | 0.5238 | 0.3611 | 0.3460 | 0.3381 | 0.3123 |
| dmiip2024_2 | 0.8000 | 0.8571 | 0.6667 | 0.7619 | 0.2857 | 0.3810 | 0.3048 | 0.3508 | 0.3092 | 0.2883 |
| dmiip2024_4 | 0.8000 | 0.8571 | 0.6667 | 0.7619 | 0.0952 | 0.2857 | 0.1762 | 0.2746 | 0.2762 | 0.2462 |
| simple truncation | 0.8400 | 0.8824 | 0.7500 | 0.8162 | 0.1429 | 0.1905 | 0.1667 | 0.1996 | 0.2011 | 0.1908 |
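
The Macro F1 values in the Yes/No columns above appear consistent with the unweighted mean of F1 Yes and F1 No, with a missing score ("-") counted as 0. The sketch below only illustrates that relationship using rows copied from the table; the function name `macro_f1` is hypothetical, and the official evaluation code may differ in detail.

```python
def macro_f1(f1_yes, f1_no):
    """Unweighted mean of the two per-class F1 scores.

    Assumption: a missing score (shown as '-' in the table, None here)
    is counted as 0, which matches the rows that lack an F1 No value.
    """
    scores = [0.0 if s is None else s for s in (f1_yes, f1_no)]
    return sum(scores) / len(scores)


# Rows copied from the Test batch 1 Exact Answers table.
print(round(macro_f1(0.8000, 0.8000), 4))  # Gatech competition: table lists 0.8000
print(round(macro_f1(0.7500, None), 4))    # bioinfo-0: table lists 0.3750
```

Because the table values are themselves rounded to four decimals, recomputed means can differ from the listed Macro F1 in the last digit.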
Ideal Answers
| System | ROUGE-2 (Rec) | ROUGE-2 (F1) | ROUGE-SU4 (Rec) | ROUGE-SU4 (F1) | Manual: Readability | Manual: Recall | Manual: Precision | Manual: Repetition |
|---|---|---|---|---|---|---|---|---|
| mibi_rag_snippet | 0.3097 | 0.2242 | 0.3167 | 0.2212 | 4.40 | 4.44 | 4.08 | 4.47 |
| mibi_rag_abstract | 0.3246 | 0.2451 | 0.3340 | 0.2428 | 4.39 | 4.55 | 4.20 | 4.49 |
| UR-IW-5 | 0.2778 | 0.1639 | 0.3005 | 0.1670 | - | - | - | - |
| Fleming-1 | 0.2662 | 0.1136 | 0.2954 | 0.1196 | 4.16 | 4.56 | 3.79 | 4.19 |
| GTBioASQsys2 | 0.1974 | 0.1552 | 0.1998 | 0.1565 | 4.14 | 4.12 | 3.85 | 4.39 |
| Gatech competition | 0.1817 | 0.1625 | 0.1906 | 0.1660 | 4.16 | 3.94 | 3.65 | 4.47 |
| GTBioASQsys3 | 0.1816 | 0.1504 | 0.1910 | 0.1535 | 4.16 | 4.01 | 3.87 | 4.40 |
| UR-IW-4 | 0.2963 | 0.1775 | 0.3133 | 0.1778 | - | - | - | - |
| UR-IW-2 | 0.2469 | 0.2749 | 0.2419 | 0.2694 | - | - | - | - |
| bioinfo-0 | 0.3518 | 0.1165 | 0.3802 | 0.1208 | 4.08 | 4.61 | 3.72 | 4.06 |
| UR-IW-3 | 0.2418 | 0.2495 | 0.2454 | 0.2483 | - | - | - | - |
| UR-IW-1 | 0.2831 | 0.1844 | 0.2930 | 0.1823 | 4.33 | 4.40 | 3.99 | 4.35 |
| bioinfo-1 | 0.3489 | 0.1215 | 0.3753 | 0.1252 | 3.96 | 4.60 | 3.67 | 4.04 |
| bioinfo-2 | 0.3354 | 0.1216 | 0.3592 | 0.1257 | 3.95 | 4.58 | 3.71 | 3.96 |
| bioinfo-3 | 0.3520 | 0.1242 | 0.3799 | 0.1273 | 4.04 | 4.65 | 3.71 | 4.08 |
| bioinfo-4 | 0.3585 | 0.1244 | 0.3840 | 0.1282 | 4.05 | 4.64 | 3.69 | 3.99 |
| dmiip2024 | 0.2101 | 0.2190 | 0.2136 | 0.2161 | 4.53 | 4.42 | 4.39 | 4.60 |
| dmiip2024_1 | 0.2101 | 0.2190 | 0.2136 | 0.2161 | 4.53 | 4.42 | 4.39 | 4.60 |
| dmiip2024_3 | 0.2101 | 0.2190 | 0.2136 | 0.2161 | 4.53 | 4.42 | 4.39 | 4.60 |
| dmiip2024_2 | 0.2101 | 0.2190 | 0.2136 | 0.2161 | 4.53 | 4.42 | 4.39 | 4.60 |
| dmiip2024_4 | 0.2101 | 0.2190 | 0.2136 | 0.2161 | 4.53 | 4.42 | 4.39 | 4.60 |
| simple truncation | 0.1120 | 0.0997 | 0.1184 | 0.0993 | 3.59 | 3.55 | 3.60 | 4.06 |
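
The warning at the top of this page can be checked against the table above: ranking systems by ROUGE-2 recall (the R-2 (Rec) column) does not reproduce the ranking by manual readability. The sketch below is only an illustration on four rows copied from the Test batch 1 Ideal Answers table; the choice of rows and the variable names are arbitrary assumptions, and it is not a statistical analysis.

```python
# (system, ROUGE-2 recall, manual readability), copied from the
# Test batch 1 Ideal Answers table above.
rows = [
    ("bioinfo-4", 0.3585, 4.05),
    ("mibi_rag_abstract", 0.3246, 4.39),
    ("dmiip2024", 0.2101, 4.53),
    ("simple truncation", 0.1120, 3.59),
]

# Rank the same systems by each score, best first.
by_rouge = [name for name, rouge, _ in sorted(rows, key=lambda r: r[1], reverse=True)]
by_manual = [name for name, _, manual in sorted(rows, key=lambda r: r[2], reverse=True)]

print("By ROUGE-2 recall:", by_rouge)
print("By readability:   ", by_manual)
```

On these four rows, bioinfo-4 ranks first by ROUGE-2 recall but third by readability, while dmiip2024 ranks third by ROUGE-2 recall and first by readability.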
Test batch 2
Exact Answers
| System | Yes/No: Accuracy | Yes/No: F1 Yes | Yes/No: F1 No | Yes/No: Macro F1 | Factoid: Strict Acc. | Factoid: Lenient Acc. | Factoid: MRR | List: Mean Prec. | List: Recall | List: F-Measure |
|---|---|---|---|---|---|---|---|---|---|---|
| mibi_rag_snippet | 0.7692 | 0.8235 | 0.6667 | 0.7451 | 0.1579 | 0.1579 | 0.1579 | 0.3861 | 0.2844 | 0.3135 |
| mibi_rag_abstract | 0.6923 | 0.7647 | 0.5556 | 0.6601 | 0.1053 | 0.1053 | 0.1053 | 0.3213 | 0.1997 | 0.2243 |
| Gatech competition | 0.7308 | 0.7742 | 0.6667 | 0.7204 | 0.2105 | 0.2105 | 0.2105 | 0.2438 | 0.1916 | 0.1900 |
| GTBioASQsys2 | 0.6923 | 0.7333 | 0.6364 | 0.6848 | 0.2105 | 0.2105 | 0.2105 | 0.1765 | 0.1203 | 0.1288 |
| GTBioASQsys3 | 0.8077 | 0.8485 | 0.7368 | 0.7927 | 0.2632 | 0.2632 | 0.2632 | 0.1894 | 0.1559 | 0.1596 |
| UR-IW-5 | 0.8077 | 0.8387 | 0.7619 | 0.8003 | 0.3158 | 0.3684 | 0.3333 | 0.2376 | 0.2054 | 0.2004 |
| UR-IW-4 | 0.7692 | 0.8000 | 0.7273 | 0.7636 | 0.1579 | 0.3158 | 0.2123 | 0.3093 | 0.2592 | 0.2530 |
| UR-IW-3 | 0.8077 | 0.8387 | 0.7619 | 0.8003 | 0.3158 | 0.3684 | 0.3333 | 0.3422 | 0.2706 | 0.2870 |
| UR-IW-2 | 0.7692 | 0.8000 | 0.7273 | 0.7636 | 0.2632 | 0.3684 | 0.3026 | 0.2775 | 0.2967 | 0.2722 |
| simple truncation | 0.8462 | 0.8889 | 0.7500 | 0.8194 | 0.1579 | 0.2105 | 0.1711 | 0.0773 | 0.0638 | 0.0649 |
| kmeans | 0.7692 | 0.8235 | 0.6667 | 0.7451 | 0.1579 | 0.2105 | 0.1711 | 0.0930 | 0.0935 | 0.0894 |
| similarity measures | 0.8462 | 0.8889 | 0.7500 | 0.8194 | 0.1579 | 0.2105 | 0.1711 | 0.0773 | 0.0638 | 0.0649 |
| UR-IW-1 | 0.6923 | 0.7500 | 0.6000 | 0.6750 | 0.3158 | 0.4211 | 0.3465 | 0.2959 | 0.2374 | 0.2414 |
| Fleming-3 | 0.7308 | 0.8000 | 0.5882 | 0.6941 | 0.2632 | 0.3684 | 0.3070 | 0.2612 | 0.1476 | 0.1776 |
| dmiip2024 | 0.8846 | 0.9091 | 0.8421 | 0.8756 | 0.3684 | 0.5789 | 0.4649 | 0.4944 | 0.3662 | 0.3789 |
| dmiip2024_1 | 0.7308 | 0.7879 | 0.6316 | 0.7097 | 0.3684 | 0.4737 | 0.4211 | 0.5257 | 0.3532 | 0.3722 |
| dmiip2024_2 | 0.8077 | 0.8649 | 0.6667 | 0.7658 | 0.2105 | 0.4211 | 0.3026 | 0.3937 | 0.3238 | 0.3137 |
| dmiip2024_4 | 0.3077 | - | 0.4706 | 0.2353 | 0.2632 | 0.5263 | 0.3860 | 0.3241 | 0.2898 | 0.2694 |
| dmiip2024_3 | 0.8077 | 0.8649 | 0.6667 | 0.7658 | 0.4211 | 0.5263 | 0.4737 | 0.4296 | 0.2910 | 0.3158 |
| bioinfo-0 | 0.6923 | 0.8182 | - | 0.4091 | - | - | - | - | - | - |
| bioinfo-1 | 0.6923 | 0.8182 | - | 0.4091 | - | - | - | - | - | - |
| bioinfo-2 | 0.6923 | 0.8182 | - | 0.4091 | - | - | - | - | - | - |
| bioinfo-3 | 0.6923 | 0.8182 | - | 0.4091 | - | - | - | - | - | - |
| bioinfo-4 | 0.6923 | 0.8182 | - | 0.4091 | - | - | - | - | - | - |
| CPS | 0.6923 | 0.8000 | 0.3333 | 0.5667 | - | - | - | - | - | - |
| CPS2 | 0.6923 | 0.7895 | 0.4286 | 0.6090 | - | - | - | - | - | - |
Ideal Answers
| System | ROUGE-2 (Rec) | ROUGE-2 (F1) | ROUGE-SU4 (Rec) | ROUGE-SU4 (F1) | Manual: Readability | Manual: Recall | Manual: Precision | Manual: Repetition |
|---|---|---|---|---|---|---|---|---|
| mibi_rag_snippet | 0.2616 | 0.1992 | 0.2653 | 0.1958 | 4.42 | 4.60 | 4.14 | 4.49 |
| mibi_rag_abstract | 0.2676 | 0.2036 | 0.2724 | 0.1989 | 4.34 | 4.55 | 4.16 | 4.39 |
| Gatech competition | 0.1463 | 0.1365 | 0.1479 | 0.1366 | 4.16 | 3.98 | 3.82 | 4.40 |
| GTBioASQsys2 | 0.1392 | 0.1266 | 0.1505 | 0.1320 | 4.16 | 4.00 | 3.96 | 4.45 |
| GTBioASQsys3 | 0.1288 | 0.1262 | 0.1385 | 0.1302 | 4.16 | 4.08 | 4.01 | 4.42 |
| UR-IW-5 | 0.2597 | 0.1696 | 0.2657 | 0.1653 | 4.39 | 4.44 | 3.94 | 4.42 |
| UR-IW-4 | 0.2307 | 0.1584 | 0.2478 | 0.1614 | 4.20 | 4.38 | 3.86 | 4.29 |
| UR-IW-3 | 0.1930 | 0.2114 | 0.1991 | 0.2105 | 4.38 | 4.20 | 4.12 | 4.49 |
| UR-IW-2 | 0.1934 | 0.2259 | 0.1878 | 0.2185 | 4.27 | 4.12 | 4.02 | 4.41 |
| simple truncation | 0.0692 | 0.0322 | 0.0700 | 0.0316 | 3.29 | 3.35 | 3.07 | 3.44 |
| kmeans | 0.0876 | 0.0539 | 0.0918 | 0.0545 | 3.84 | 3.87 | 3.71 | 4.14 |
| similarity measures | 0.0692 | 0.0322 | 0.0700 | 0.0316 | 3.29 | 3.35 | 3.07 | 3.44 |
| UR-IW-1 | 0.2138 | 0.1597 | 0.2275 | 0.1593 | 4.00 | 4.20 | 3.71 | 4.26 |
| Fleming-3 | 0.2356 | 0.1171 | 0.2520 | 0.1215 | 4.29 | 4.58 | 3.91 | 4.31 |
| dmiip2024 | 0.1954 | 0.2227 | 0.1897 | 0.2142 | 4.24 | 4.14 | 4.14 | 4.32 |
| dmiip2024_1 | 0.1971 | 0.2240 | 0.1948 | 0.2184 | 4.32 | 4.13 | 4.04 | 4.41 |
| dmiip2024_2 | 0.2262 | 0.2503 | 0.2191 | 0.2374 | 4.45 | 4.46 | 4.36 | 4.66 |
| dmiip2024_4 | 0.1992 | 0.2227 | 0.1883 | 0.2085 | 4.19 | 4.08 | 4.06 | 4.47 |
| dmiip2024_3 | 0.1455 | 0.1592 | 0.1533 | 0.1637 | 4.51 | 4.46 | 4.34 | 4.67 |
| bioinfo-0 | 0.2934 | 0.1446 | 0.3022 | 0.1437 | 4.26 | 4.60 | 3.89 | 4.24 |
| bioinfo-1 | 0.2740 | 0.1223 | 0.2892 | 0.1243 | 4.20 | 4.54 | 3.85 | 4.27 |
| bioinfo-2 | 0.3201 | 0.1551 | 0.3341 | 0.1575 | 4.25 | 4.65 | 3.93 | 4.31 |
| bioinfo-3 | 0.2246 | 0.2309 | 0.2239 | 0.2267 | 4.29 | 4.28 | 4.06 | 4.42 |
| bioinfo-4 | 0.1646 | 0.1826 | 0.1718 | 0.1865 | 4.15 | 4.00 | 3.92 | 4.21 |
| CPS | 0.2155 | 0.2071 | 0.2185 | 0.2029 | 4.21 | 4.05 | 3.84 | 4.33 |
| CPS2 | 0.1978 | 0.1649 | 0.2011 | 0.1623 | 4.13 | 3.69 | 3.49 | 4.40 |
Test batch 3
Exact Answers
| System | Yes/No: Accuracy | Yes/No: F1 Yes | Yes/No: F1 No | Yes/No: Macro F1 | Factoid: Strict Acc. | Factoid: Lenient Acc. | Factoid: MRR | List: Mean Prec. | List: Recall | List: F-Measure |
|---|---|---|---|---|---|---|---|---|---|---|
| bioinfo-0 | 0.5833 | 0.7368 | - | 0.3684 | - | - | - | - | - | - |
| bioinfo-1 | 0.5833 | 0.7368 | - | 0.3684 | - | - | - | - | - | - |
| bioinfo-2 | 0.5833 | 0.7368 | - | 0.3684 | - | - | - | - | - | - |
| bioinfo-3 | 0.5833 | 0.7368 | - | 0.3684 | - | - | - | - | - | - |
| bioinfo-4 | 0.5833 | 0.7368 | - | 0.3684 | - | - | - | - | - | - |
| GTBioASQsys2 | 0.7500 | 0.7692 | 0.7273 | 0.7483 | 0.0769 | 0.0769 | 0.0769 | 0.2417 | 0.2287 | 0.2167 |
| GTBioASQsys4 | 0.8750 | 0.8889 | 0.8571 | 0.8730 | 0.1154 | 0.1154 | 0.1154 | 0.1281 | 0.1531 | 0.1284 |
| Gatech competition | 0.8750 | 0.8966 | 0.8421 | 0.8693 | 0.2692 | 0.2692 | 0.2692 | 0.2177 | 0.1962 | 0.1862 |
| GTBioASQsys3 | 0.8333 | 0.8462 | 0.8182 | 0.8322 | 0.3077 | 0.3077 | 0.3077 | 0.2206 | 0.1887 | 0.1832 |
| Fleming-3 | 0.8333 | 0.8571 | 0.8000 | 0.8286 | 0.1154 | 0.2692 | 0.1635 | 0.1804 | 0.1625 | 0.1533 |
| mibi_rag_abstract | 0.4583 | 0.3158 | 0.5517 | 0.4338 | 0.2692 | 0.2692 | 0.2692 | 0.2953 | 0.2358 | 0.2468 |
| mibi_rag_snippet | 0.9583 | 0.9655 | 0.9474 | 0.9564 | 0.1923 | 0.1923 | 0.1923 | 0.2825 | 0.2259 | 0.2351 |
| mibi_rag_3 | 0.9583 | 0.9655 | 0.9474 | 0.9564 | 0.1923 | 0.1923 | 0.1923 | 0.2825 | 0.2259 | 0.2351 |
| mibi_rag_4 | 0.5417 | 0.4762 | 0.5926 | 0.5344 | 0.3077 | 0.3077 | 0.3077 | 0.3813 | 0.2884 | 0.3116 |
| mibi_rag_5 | 0.9167 | 0.9286 | 0.9000 | 0.9143 | 0.1923 | 0.1923 | 0.1923 | 0.2754 | 0.2226 | 0.2306 |
| CPS | 0.6250 | 0.7273 | 0.4000 | 0.5636 | 0.1538 | 0.1538 | 0.1538 | 0.1404 | 0.0684 | 0.0810 |
| CPS2 | 0.5833 | 0.7059 | 0.2857 | 0.4958 | - | - | - | - | - | - |
| CPS3 | 0.7500 | 0.8125 | 0.6250 | 0.7188 | - | - | - | - | - | - |
| UR-IW-1 | 0.8333 | 0.8667 | 0.7778 | 0.8222 | 0.2692 | 0.4231 | 0.3237 | 0.3189 | 0.3798 | 0.3020 |
| UR-IW-3 | 0.7917 | 0.8000 | 0.7826 | 0.7913 | 0.1923 | 0.2308 | 0.2115 | 0.1916 | 0.2545 | 0.2031 |
| UR-IW-4 | 0.8750 | 0.8889 | 0.8571 | 0.8730 | 0.2308 | 0.3077 | 0.2596 | 0.2502 | 0.2432 | 0.2189 |
| UR-IW-5 | 0.9167 | 0.9286 | 0.9000 | 0.9143 | 0.1923 | 0.2308 | 0.2115 | 0.2980 | 0.2650 | 0.2561 |
| UR-IW-2 | 0.8333 | 0.8462 | 0.8182 | 0.8322 | 0.2308 | 0.3077 | 0.2628 | 0.2869 | 0.3157 | 0.2688 |
| dmiip2024_1 | 0.9583 | 0.9630 | 0.9524 | 0.9577 | 0.4231 | 0.4615 | 0.4423 | 0.4320 | 0.3747 | 0.3608 |
| dmiip2024 | 0.9583 | 0.9630 | 0.9524 | 0.9577 | 0.3462 | 0.4615 | 0.4038 | 0.4101 | 0.3419 | 0.3369 |
| dmiip2024_2 | 0.8750 | 0.8966 | 0.8421 | 0.8693 | 0.3462 | 0.5385 | 0.4199 | 0.2859 | 0.3494 | 0.2500 |
| dmiip2024_3 | 0.8750 | 0.8966 | 0.8421 | 0.8693 | 0.3077 | 0.3846 | 0.3397 | 0.4508 | 0.4098 | 0.3915 |
| dmiip2024_4 | 0.4167 | - | 0.5882 | 0.2941 | 0.3462 | 0.4231 | 0.3846 | 0.2406 | 0.3433 | 0.2264 |
Ideal Answers
| System | ROUGE-2 (Rec) | ROUGE-2 (F1) | ROUGE-SU4 (Rec) | ROUGE-SU4 (F1) | Manual: Readability | Manual: Recall | Manual: Precision | Manual: Repetition |
|---|---|---|---|---|---|---|---|---|
| bioinfo-0 | 0.3517 | 0.1571 | 0.3569 | 0.1550 | 4.28 | 4.69 | 3.94 | 4.33 |
| bioinfo-1 | 0.3364 | 0.1560 | 0.3425 | 0.1532 | 4.33 | 4.58 | 3.91 | 4.28 |
| bioinfo-2 | 0.3868 | 0.1923 | 0.3984 | 0.1919 | 4.25 | 4.85 | 3.95 | 4.24 |
| bioinfo-3 | 0.2404 | 0.1685 | 0.2474 | 0.1683 | 4.39 | 4.55 | 4.06 | 4.51 |
| bioinfo-4 | 0.2410 | 0.1618 | 0.2472 | 0.1620 | 4.49 | 4.54 | 3.98 | 4.56 |
| GTBioASQsys2 | 0.1980 | 0.1653 | 0.1910 | 0.1594 | 4.48 | 3.99 | 4.02 | 4.66 |
| GTBioASQsys4 | 0.1881 | 0.1688 | 0.1817 | 0.1628 | 4.25 | 3.88 | 3.80 | 4.52 |
| Gatech competition | 0.1952 | 0.1799 | 0.1892 | 0.1731 | 4.27 | 4.09 | 4.06 | 4.55 |
| GTBioASQsys3 | 0.1835 | 0.1836 | 0.1802 | 0.1792 | 4.24 | 3.92 | 4.07 | 4.51 |
| Fleming-3 | 0.2879 | 0.1092 | 0.3080 | 0.1153 | 4.14 | 4.36 | 3.62 | 4.11 |
| mibi_rag_abstract | 0.2367 | 0.2545 | 0.2219 | 0.2403 | 4.55 | 3.84 | 4.25 | 4.73 |
| mibi_rag_snippet | 0.2490 | 0.2352 | 0.2392 | 0.2237 | 4.35 | 3.91 | 3.94 | 4.54 |
| mibi_rag_3 | 0.2407 | 0.2312 | 0.2313 | 0.2196 | 4.34 | 3.91 | 3.94 | 4.54 |
| mibi_rag_4 | 0.2548 | 0.2464 | 0.2465 | 0.2342 | 4.52 | 3.99 | 4.20 | 4.73 |
| mibi_rag_5 | 0.2399 | 0.2359 | 0.2288 | 0.2224 | 4.45 | 3.92 | 4.02 | 4.60 |
| CPS | 0.2431 | 0.2326 | 0.2374 | 0.2242 | 4.26 | 3.91 | 3.88 | 4.36 |
| CPS2 | 0.2562 | 0.2346 | 0.2504 | 0.2260 | 4.21 | 3.92 | 3.88 | 4.38 |
| CPS3 | 0.2333 | 0.2373 | 0.2260 | 0.2272 | 4.31 | 3.78 | 4.00 | 4.52 |
| UR-IW-1 | 0.2706 | 0.1861 | 0.2683 | 0.1779 | 4.46 | 4.60 | 3.92 | 4.41 |
| UR-IW-3 | 0.3162 | 0.1771 | 0.3189 | 0.1741 | 4.32 | 4.47 | 3.94 | 4.33 |
| UR-IW-4 | 0.3252 | 0.2004 | 0.3269 | 0.1968 | 4.49 | 4.61 | 4.11 | 4.48 |
| UR-IW-5 | 0.3281 | 0.1643 | 0.3381 | 0.1653 | 4.25 | 4.28 | 3.72 | 4.24 |
| UR-IW-2 | 0.3407 | 0.1972 | 0.3510 | 0.1965 | 4.41 | 4.53 | 3.99 | 4.40 |
| dmiip2024_1 | 0.2533 | 0.2717 | 0.2483 | 0.2635 | 4.49 | 4.31 | 4.44 | 4.64 |
| dmiip2024 | 0.2475 | 0.2678 | 0.2374 | 0.2547 | 4.52 | 4.26 | 4.28 | 4.65 |
| dmiip2024_2 | 0.2694 | 0.2862 | 0.2520 | 0.2688 | 4.61 | 4.26 | 4.33 | 4.69 |
| dmiip2024_3 | 0.1959 | 0.2194 | 0.1929 | 0.2146 | 4.55 | 4.27 | 4.35 | 4.67 |
| dmiip2024_4 | 0.2375 | 0.2562 | 0.2264 | 0.2434 | 4.45 | 3.93 | 4.22 | 4.65 |
Test batch 4
Exact Answers
| System | Yes/No: Accuracy | Yes/No: F1 Yes | Yes/No: F1 No | Yes/No: Macro F1 | Factoid: Strict Acc. | Factoid: Lenient Acc. | Factoid: MRR | List: Mean Prec. | List: Recall | List: F-Measure |
|---|---|---|---|---|---|---|---|---|---|---|
| UR-IW-1 | 0.8519 | 0.8947 | 0.7500 | 0.8224 | 0.5263 | 0.5789 | 0.5526 | 0.2170 | 0.2910 | 0.2318 |
| UR-IW-3 | 0.7037 | 0.7647 | 0.6000 | 0.6824 | 0.3684 | 0.4211 | 0.3816 | 0.1610 | 0.1452 | 0.1441 |
| UR-IW-4 | 0.7778 | 0.8333 | 0.6667 | 0.7500 | 0.3684 | 0.4211 | 0.3860 | 0.1924 | 0.1882 | 0.1719 |
| UR-IW-5 | 0.7407 | 0.8000 | 0.6316 | 0.7158 | 0.4737 | 0.4737 | 0.4737 | 0.2073 | 0.1775 | 0.1702 |
| Fleming-1 | 0.8148 | 0.8571 | 0.7368 | 0.7970 | 0.2105 | 0.2632 | 0.2211 | 0.1881 | 0.1643 | 0.1668 |
| mibi_rag_snippet | 0.4444 | 0.5161 | 0.3478 | 0.4320 | 0.1579 | 0.1579 | 0.1579 | 0.2343 | 0.1280 | 0.1574 |
| mibi_rag_abstract | 0.3704 | 0.1905 | 0.4848 | 0.3377 | 0.3158 | 0.3158 | 0.3158 | 0.3636 | 0.2246 | 0.2634 |
| mibi_rag_3 | 0.4444 | 0.5161 | 0.3478 | 0.4320 | 0.1579 | 0.1579 | 0.1579 | 0.2343 | 0.1280 | 0.1574 |
| mibi_rag_4 | 0.3704 | 0.1905 | 0.4848 | 0.3377 | 0.2105 | 0.2105 | 0.2105 | 0.3711 | 0.2295 | 0.2669 |
| mibi_rag_5 | 0.4815 | 0.5333 | 0.4167 | 0.4750 | 0.1579 | 0.1579 | 0.1579 | 0.2343 | 0.1280 | 0.1574 |
| bioinfo-0 | 0.7037 | 0.8261 | - | 0.4130 | - | - | - | - | - | - |
| bioinfo-1 | 0.7037 | 0.8261 | - | 0.4130 | - | - | - | - | - | - |
| bioinfo-2 | 0.7037 | 0.8261 | - | 0.4130 | - | - | - | - | - | - |
| bioinfo-3 | 0.7037 | 0.8261 | - | 0.4130 | - | - | - | - | - | - |
| bioinfo-4 | 0.7037 | 0.8261 | - | 0.4130 | - | - | - | - | - | - |
| GTBioASQsys2 | 0.6296 | 0.6875 | 0.5455 | 0.6165 | 0.2105 | 0.2105 | 0.2105 | 0.1126 | 0.1060 | 0.1071 |
| GTBioASQsys3 | 0.7037 | 0.7647 | 0.6000 | 0.6824 | 0.0526 | 0.0526 | 0.0526 | 0.1362 | 0.1285 | 0.1248 |
| GTBioASQsys4 | 0.6667 | 0.7097 | 0.6087 | 0.6592 | 0.1579 | 0.1579 | 0.1579 | 0.1872 | 0.1715 | 0.1664 |
| UR-IW-2 | 0.8519 | 0.8947 | 0.7500 | 0.8224 | 0.4737 | 0.5263 | 0.5000 | 0.1887 | 0.2207 | 0.1767 |
| CPS | 0.7778 | 0.8571 | 0.5000 | 0.6786 | 0.1053 | 0.1053 | 0.1053 | 0.1795 | 0.1052 | 0.1239 |
| CPS2 | 0.8148 | 0.8837 | 0.5455 | 0.7146 | - | - | - | - | - | - |
| CPS3 | 0.8148 | 0.8837 | 0.5455 | 0.7146 | - | - | - | - | - | - |
| dmiip2024_3 | 0.8148 | 0.8649 | 0.7059 | 0.7854 | 0.4211 | 0.4211 | 0.4211 | 0.3246 | 0.2974 | 0.3080 |
| Fleming-2 | 0.7778 | 0.8421 | 0.6250 | 0.7336 | 0.2105 | 0.2632 | 0.2211 | 0.1881 | 0.1643 | 0.1668 |
| dmiip2024 | 0.8148 | 0.8571 | 0.7368 | 0.7970 | 0.3684 | 0.4737 | 0.4123 | 0.3586 | 0.3614 | 0.3505 |
| dmiip2024_1 | 0.8889 | 0.9231 | 0.8000 | 0.8615 | 0.4737 | 0.5263 | 0.5000 | 0.4338 | 0.3903 | 0.4022 |
| dmiip2024_2 | 0.8889 | 0.9189 | 0.8235 | 0.8712 | 0.4211 | 0.4211 | 0.4211 | 0.2886 | 0.4087 | 0.3142 |
| dmiip2024_4 | 0.2963 | - | 0.4571 | 0.2286 | 0.2632 | 0.3684 | 0.3070 | 0.2274 | 0.3027 | 0.2424 |
| extractive | 0.8148 | 0.8649 | 0.7059 | 0.7854 | - | - | - | - | - | - |
Ideal Answers
| System | ROUGE-2 (Rec) | ROUGE-2 (F1) | ROUGE-SU4 (Rec) | ROUGE-SU4 (F1) | Manual: Readability | Manual: Recall | Manual: Precision | Manual: Repetition |
|---|---|---|---|---|---|---|---|---|
| UR-IW-1 | 0.2465 | 0.1866 | 0.2511 | 0.1803 | 4.42 | 4.27 | 3.73 | 4.46 |
| UR-IW-3 | 0.2810 | 0.1647 | 0.2915 | 0.1649 | 4.09 | 3.92 | 3.39 | 4.21 |
| UR-IW-4 | 0.2712 | 0.1677 | 0.2828 | 0.1700 | 4.29 | 4.08 | 3.72 | 4.39 |
| UR-IW-5 | 0.3027 | 0.1683 | 0.3046 | 0.1655 | 4.15 | 3.88 | 3.49 | 4.16 |
| Fleming-1 | 0.2771 | 0.0967 | 0.2959 | 0.1014 | 4.31 | 4.02 | 3.49 | 4.29 |
| mibi_rag_snippet | 0.2520 | 0.2449 | 0.2509 | 0.2388 | 4.55 | 3.58 | 3.85 | 4.65 |
| mibi_rag_abstract | 0.2269 | 0.2419 | 0.2264 | 0.2370 | 4.64 | 3.40 | 3.85 | 4.78 |
| mibi_rag_3 | 0.2523 | 0.2447 | 0.2509 | 0.2386 | 4.55 | 3.58 | 3.84 | 4.65 |
| mibi_rag_4 | 0.2363 | 0.2355 | 0.2376 | 0.2312 | 4.67 | 3.46 | 3.79 | 4.81 |
| mibi_rag_5 | 0.2312 | 0.2318 | 0.2341 | 0.2284 | 4.53 | 3.47 | 3.80 | 4.68 |
| bioinfo-0 | 0.3597 | 0.1515 | 0.3710 | 0.1515 | 4.47 | 4.55 | 3.87 | 4.48 |
| bioinfo-1 | 0.3381 | 0.1484 | 0.3462 | 0.1492 | 4.45 | 4.42 | 3.75 | 4.42 |
| bioinfo-2 | 0.3807 | 0.1710 | 0.3838 | 0.1692 | 4.52 | 4.47 | 3.87 | 4.41 |
| bioinfo-3 | 0.2300 | 0.1572 | 0.2378 | 0.1587 | 4.61 | 4.28 | 4.09 | 4.65 |
| bioinfo-4 | 0.2239 | 0.1521 | 0.2337 | 0.1530 | 4.59 | 4.21 | 3.93 | 4.67 |
| GTBioASQsys2 | 0.1628 | 0.1443 | 0.1636 | 0.1425 | 4.34 | 3.52 | 3.61 | 4.69 |
| GTBioASQsys3 | 0.1443 | 0.1268 | 0.1479 | 0.1258 | 4.45 | 3.58 | 3.73 | 4.71 |
| GTBioASQsys4 | 0.1552 | 0.1332 | 0.1594 | 0.1331 | 4.47 | 3.59 | 3.79 | 4.72 |
| UR-IW-2 | 0.2599 | 0.2114 | 0.2600 | 0.2064 | 4.45 | 4.12 | 3.74 | 4.54 |
| CPS | 0.2643 | 0.1910 | 0.2733 | 0.1904 | 4.48 | 3.86 | 3.59 | 4.44 |
| CPS2 | 0.2733 | 0.1919 | 0.2825 | 0.1930 | 4.36 | 3.85 | 3.60 | 4.45 |
| CPS3 | 0.2391 | 0.2265 | 0.2351 | 0.2163 | 4.52 | 3.58 | 3.74 | 4.56 |
| dmiip2024_3 | 0.2488 | 0.2547 | 0.2374 | 0.2412 | 4.49 | 3.92 | 4.12 | 4.64 |
| Fleming-2 | 0.2771 | 0.0967 | 0.2959 | 0.1014 | 4.31 | 4.02 | 3.49 | 4.29 |
| dmiip2024 | 0.2494 | 0.2415 | 0.2513 | 0.2382 | 4.60 | 4.06 | 4.13 | 4.65 |
| dmiip2024_1 | 0.2723 | 0.2759 | 0.2721 | 0.2694 | 4.53 | 4.16 | 4.26 | 4.71 |
| dmiip2024_2 | 0.2723 | 0.2759 | 0.2721 | 0.2694 | 4.53 | 4.16 | 4.26 | 4.71 |
| dmiip2024_4 | 0.2381 | 0.2495 | 0.2328 | 0.2408 | 4.56 | 3.58 | 3.91 | 4.71 |
| extractive | - | - | - | - | - | - | - | - |