BioASQ Participants Area
Task 12b: Test Results of Phase A+
The test results are presented in separate tables for each type of annotation. Each system is listed under its "System Description".
The evaluation measures that are used in Task A+ are presented here.
Warning: For ideal answers, good ROUGE results do not always imply good manual scores.
Test batch 1
Exact Answers
| System | Yes/No: Accuracy | Yes/No: F1 Yes | Yes/No: F1 No | Yes/No: Macro F1 | Factoid: Strict Acc. | Factoid: Lenient Acc. | Factoid: MRR | List: Mean Prec. | List: Recall | List: F-Measure |
|---|---|---|---|---|---|---|---|---|---|---|
| mibi_rag_snippet | 0.8000 | 0.8387 | 0.7368 | 0.7878 | 0.0476 | 0.0476 | 0.0476 | 0.2730 | 0.2008 | 0.2175 |
| mibi_rag_abstract | 0.7600 | 0.8125 | 0.6667 | 0.7396 | 0.0476 | 0.0476 | 0.0476 | 0.3127 | 0.2571 | 0.2698 |
| UR-IW-5 | 0.8400 | 0.8571 | 0.8182 | 0.8377 | 0.0952 | 0.1429 | 0.1111 | 0.4258 | 0.3670 | 0.3758 |
| Fleming-1 | 0.8400 | 0.8750 | 0.7778 | 0.8264 | - | - | - | 0.2388 | 0.1848 | 0.2022 |
| GTBioASQsys2 | 0.7600 | 0.7857 | 0.7273 | 0.7565 | 0.1905 | 0.1905 | 0.1905 | 0.2722 | 0.2068 | 0.2229 |
| Gatech competition | 0.8000 | 0.8000 | 0.8000 | 0.8000 | 0.1429 | 0.1429 | 0.1429 | 0.4770 | 0.3477 | 0.3769 |
| GTBioASQsys3 | 0.8000 | 0.8148 | 0.7826 | 0.7987 | 0.2381 | 0.2381 | 0.2381 | 0.2579 | 0.1905 | 0.2006 |
| UR-IW-4 | 0.8800 | 0.8889 | 0.8696 | 0.8792 | 0.0476 | 0.1429 | 0.0873 | 0.4239 | 0.3761 | 0.3698 |
| UR-IW-2 | 0.8800 | 0.8889 | 0.8696 | 0.8792 | 0.0952 | 0.0952 | 0.0952 | 0.5548 | 0.4515 | 0.4706 |
| bioinfo-0 | 0.6000 | 0.7500 | - | 0.3750 | - | - | - | - | - | - |
| UR-IW-3 | 0.9600 | 0.9677 | 0.9474 | 0.9576 | 0.1429 | 0.1429 | 0.1429 | 0.4135 | 0.4282 | 0.4011 |
| UR-IW-1 | 0.8400 | 0.8667 | 0.8000 | 0.8333 | 0.2381 | 0.3333 | 0.2857 | 0.3420 | 0.3829 | 0.3474 |
| bioinfo-1 | 0.6000 | 0.7500 | - | 0.3750 | - | - | - | - | - | - |
| bioinfo-2 | 0.6000 | 0.7500 | - | 0.3750 | - | - | - | - | - | - |
| bioinfo-3 | 0.6000 | 0.7500 | - | 0.3750 | - | - | - | - | - | - |
| bioinfo-4 | 0.6000 | 0.7500 | - | 0.3750 | - | - | - | - | - | - |
| dmiip2024 | 0.8000 | 0.8571 | 0.6667 | 0.7619 | 0.1905 | 0.2381 | 0.2063 | 0.5056 | 0.4021 | 0.4312 |
| dmiip2024_1 | 0.8000 | 0.8571 | 0.6667 | 0.7619 | 0.2381 | 0.5238 | 0.3492 | 0.3413 | 0.3095 | 0.2899 |
| dmiip2024_3 | 0.8000 | 0.8571 | 0.6667 | 0.7619 | 0.2381 | 0.5238 | 0.3611 | 0.3460 | 0.3381 | 0.3123 |
| dmiip2024_2 | 0.8000 | 0.8571 | 0.6667 | 0.7619 | 0.2857 | 0.3810 | 0.3048 | 0.3508 | 0.3092 | 0.2883 |
| dmiip2024_4 | 0.8000 | 0.8571 | 0.6667 | 0.7619 | 0.0952 | 0.2857 | 0.1762 | 0.2746 | 0.2762 | 0.2462 |
| simple truncation | 0.8400 | 0.8824 | 0.7500 | 0.8162 | 0.1429 | 0.1905 | 0.1667 | 0.1996 | 0.2011 | 0.1908 |
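As a reading aid for the Yes/No columns, the reported Macro F1 values in the table above are consistent with the unweighted mean of F1 Yes and F1 No, with a missing score ("-") counted as zero. The sketch below only checks that reading against three rows copied from the table; the function name `macro_f1` and the zero-for-missing rule are assumptions inferred from the numbers, not the official evaluation code.

```python
from typing import Optional


def macro_f1(f1_yes: Optional[float], f1_no: Optional[float]) -> float:
    """Unweighted mean of the two per-class F1 scores.

    Assumption: a score shown as "-" in the table is passed as None
    and counted as 0.0.
    """
    return ((f1_yes or 0.0) + (f1_no or 0.0)) / 2


# (F1 Yes, F1 No, reported Macro F1) copied from the batch 1 table
rows = {
    "UR-IW-3": (0.9677, 0.9474, 0.9576),
    "Gatech competition": (0.8000, 0.8000, 0.8000),
    "bioinfo-0": (0.7500, None, 0.3750),
}

for name, (f1_yes, f1_no, reported) in rows.items():
    computed = macro_f1(f1_yes, f1_no)
    # The inputs are rounded to four decimals, so compare with a tolerance.
    matches = abs(computed - reported) < 1e-4
    print(f"{name}: computed={computed:.5f} reported={reported:.4f} match={matches}")
```

Under this reading, a system that never predicts one class (for example the bioinfo runs, which have no F1 No) can reach at most 0.5 Macro F1, which explains why its Macro F1 is exactly half of its F1 Yes.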
Ideal Answers
| System | Automatic (ROUGE): R-2 (Rec) | Automatic (ROUGE): R-2 (F1) | Automatic (ROUGE): R-SU4 (Rec) | Automatic (ROUGE): R-SU4 (F1) | Manual: Readability | Manual: Recall | Manual: Precision | Manual: Repetition |
|---|---|---|---|---|---|---|---|---|
| mibi_rag_snippet | 0.3097 | 0.2242 | 0.3167 | 0.2212 | 4.40 | 4.44 | 4.08 | 4.47 |
| mibi_rag_abstract | 0.3246 | 0.2451 | 0.3340 | 0.2428 | 4.39 | 4.55 | 4.20 | 4.49 |
| UR-IW-5 | 0.2778 | 0.1639 | 0.3005 | 0.1670 | - | - | - | - |
| Fleming-1 | 0.2662 | 0.1136 | 0.2954 | 0.1196 | 4.16 | 4.56 | 3.79 | 4.19 |
| GTBioASQsys2 | 0.1974 | 0.1552 | 0.1998 | 0.1565 | 4.14 | 4.12 | 3.85 | 4.39 |
| Gatech competition | 0.1817 | 0.1625 | 0.1906 | 0.1660 | 4.16 | 3.94 | 3.65 | 4.47 |
| GTBioASQsys3 | 0.1816 | 0.1504 | 0.1910 | 0.1535 | 4.16 | 4.01 | 3.87 | 4.40 |
| UR-IW-4 | 0.2963 | 0.1775 | 0.3133 | 0.1778 | - | - | - | - |
| UR-IW-2 | 0.2469 | 0.2749 | 0.2419 | 0.2694 | - | - | - | - |
| bioinfo-0 | 0.3518 | 0.1165 | 0.3802 | 0.1208 | 4.08 | 4.61 | 3.72 | 4.06 |
| UR-IW-3 | 0.2418 | 0.2495 | 0.2454 | 0.2483 | - | - | - | - |
| UR-IW-1 | 0.2831 | 0.1844 | 0.2930 | 0.1823 | 4.33 | 4.40 | 3.99 | 4.35 |
| bioinfo-1 | 0.3489 | 0.1215 | 0.3753 | 0.1252 | 3.96 | 4.60 | 3.67 | 4.04 |
| bioinfo-2 | 0.3354 | 0.1216 | 0.3592 | 0.1257 | 3.95 | 4.58 | 3.71 | 3.96 |
| bioinfo-3 | 0.3520 | 0.1242 | 0.3799 | 0.1273 | 4.04 | 4.65 | 3.71 | 4.08 |
| bioinfo-4 | 0.3585 | 0.1244 | 0.3840 | 0.1282 | 4.05 | 4.64 | 3.69 | 3.99 |
| dmiip2024 | 0.2101 | 0.2190 | 0.2136 | 0.2161 | 4.53 | 4.42 | 4.39 | 4.60 |
| dmiip2024_1 | 0.2101 | 0.2190 | 0.2136 | 0.2161 | 4.53 | 4.42 | 4.39 | 4.60 |
| dmiip2024_3 | 0.2101 | 0.2190 | 0.2136 | 0.2161 | 4.53 | 4.42 | 4.39 | 4.60 |
| dmiip2024_2 | 0.2101 | 0.2190 | 0.2136 | 0.2161 | 4.53 | 4.42 | 4.39 | 4.60 |
| dmiip2024_4 | 0.2101 | 0.2190 | 0.2136 | 0.2161 | 4.53 | 4.42 | 4.39 | 4.60 |
| simple truncation | 0.1120 | 0.0997 | 0.1184 | 0.0993 | 3.59 | 3.55 | 3.60 | 4.06 |
Test batch 2
Exact Answers
| System | Yes/No: Accuracy | Yes/No: F1 Yes | Yes/No: F1 No | Yes/No: Macro F1 | Factoid: Strict Acc. | Factoid: Lenient Acc. | Factoid: MRR | List: Mean Prec. | List: Recall | List: F-Measure |
|---|---|---|---|---|---|---|---|---|---|---|
| mibi_rag_snippet | 0.7692 | 0.8235 | 0.6667 | 0.7451 | 0.1579 | 0.1579 | 0.1579 | 0.3861 | 0.2844 | 0.3135 |
| mibi_rag_abstract | 0.6923 | 0.7647 | 0.5556 | 0.6601 | 0.1053 | 0.1053 | 0.1053 | 0.3213 | 0.1997 | 0.2243 |
| Gatech competition | 0.7308 | 0.7742 | 0.6667 | 0.7204 | 0.2105 | 0.2105 | 0.2105 | 0.2438 | 0.1916 | 0.1900 |
| GTBioASQsys2 | 0.6923 | 0.7333 | 0.6364 | 0.6848 | 0.2105 | 0.2105 | 0.2105 | 0.1765 | 0.1203 | 0.1288 |
| GTBioASQsys3 | 0.8077 | 0.8485 | 0.7368 | 0.7927 | 0.2632 | 0.2632 | 0.2632 | 0.1894 | 0.1559 | 0.1596 |
| UR-IW-5 | 0.8077 | 0.8387 | 0.7619 | 0.8003 | 0.3158 | 0.3684 | 0.3333 | 0.2376 | 0.2054 | 0.2004 |
| UR-IW-4 | 0.7692 | 0.8000 | 0.7273 | 0.7636 | 0.1579 | 0.3158 | 0.2123 | 0.3093 | 0.2592 | 0.2530 |
| UR-IW-3 | 0.8077 | 0.8387 | 0.7619 | 0.8003 | 0.3158 | 0.3684 | 0.3333 | 0.3422 | 0.2706 | 0.2870 |
| UR-IW-2 | 0.7692 | 0.8000 | 0.7273 | 0.7636 | 0.2632 | 0.3684 | 0.3026 | 0.2775 | 0.2967 | 0.2722 |
| simple truncation | 0.8462 | 0.8889 | 0.7500 | 0.8194 | 0.1579 | 0.2105 | 0.1711 | 0.0773 | 0.0638 | 0.0649 |
| kmeans | 0.7692 | 0.8235 | 0.6667 | 0.7451 | 0.1579 | 0.2105 | 0.1711 | 0.0930 | 0.0935 | 0.0894 |
| similarity measures | 0.8462 | 0.8889 | 0.7500 | 0.8194 | 0.1579 | 0.2105 | 0.1711 | 0.0773 | 0.0638 | 0.0649 |
| UR-IW-1 | 0.6923 | 0.7500 | 0.6000 | 0.6750 | 0.3158 | 0.4211 | 0.3465 | 0.2959 | 0.2374 | 0.2414 |
| Fleming-3 | 0.7308 | 0.8000 | 0.5882 | 0.6941 | 0.2632 | 0.3684 | 0.3070 | 0.2612 | 0.1476 | 0.1776 |
| dmiip2024 | 0.8846 | 0.9091 | 0.8421 | 0.8756 | 0.3684 | 0.5789 | 0.4649 | 0.4944 | 0.3662 | 0.3789 |
| dmiip2024_1 | 0.7308 | 0.7879 | 0.6316 | 0.7097 | 0.3684 | 0.4737 | 0.4211 | 0.5257 | 0.3532 | 0.3722 |
| dmiip2024_2 | 0.8077 | 0.8649 | 0.6667 | 0.7658 | 0.2105 | 0.4211 | 0.3026 | 0.3937 | 0.3238 | 0.3137 |
| dmiip2024_4 | 0.3077 | - | 0.4706 | 0.2353 | 0.2632 | 0.5263 | 0.3860 | 0.3241 | 0.2898 | 0.2694 |
| dmiip2024_3 | 0.8077 | 0.8649 | 0.6667 | 0.7658 | 0.4211 | 0.5263 | 0.4737 | 0.4296 | 0.2910 | 0.3158 |
| bioinfo-0 | 0.6923 | 0.8182 | - | 0.4091 | - | - | - | - | - | - |
| bioinfo-1 | 0.6923 | 0.8182 | - | 0.4091 | - | - | - | - | - | - |
| bioinfo-2 | 0.6923 | 0.8182 | - | 0.4091 | - | - | - | - | - | - |
| bioinfo-3 | 0.6923 | 0.8182 | - | 0.4091 | - | - | - | - | - | - |
| bioinfo-4 | 0.6923 | 0.8182 | - | 0.4091 | - | - | - | - | - | - |
| CPS | 0.6923 | 0.8000 | 0.3333 | 0.5667 | - | - | - | - | - | - |
| CPS2 | 0.6923 | 0.7895 | 0.4286 | 0.6090 | - | - | - | - | - | - |
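To help read the three Factoid columns together, the sketch below computes strict accuracy (correct answer at rank 1), lenient accuracy (correct answer anywhere in the returned list) and MRR (mean of 1/rank, with 0 for a miss) from per-question ranks. These are the usual definitions of the measures; the function name `factoid_scores`, the question count of 19 and the rank pattern are all hypothetical. The per-question ranks are not given on this page, and the pattern was chosen only because it reproduces the UR-IW-4 row above.

```python
from typing import Optional, Sequence, Tuple


def factoid_scores(ranks: Sequence[Optional[int]]) -> Tuple[float, float, float]:
    """Return (strict accuracy, lenient accuracy, MRR).

    ranks[i] is the 1-based position of the first correct answer in the
    list returned for question i, or None if no returned answer is correct.
    """
    n = len(ranks)
    strict = sum(1 for r in ranks if r == 1) / n
    lenient = sum(1 for r in ranks if r is not None) / n
    mrr = sum(1.0 / r for r in ranks if r is not None) / n
    return strict, lenient, mrr


# Hypothetical pattern over 19 questions: three hits at rank 1,
# one each at ranks 2, 3 and 5, and thirteen misses.
ranks = [1, 1, 1, 2, 3, 5] + [None] * 13
strict, lenient, mrr = factoid_scores(ranks)
print(f"{strict:.4f} {lenient:.4f} {mrr:.4f}")  # prints "0.1579 0.3158 0.2123"
```

When all three Factoid columns are equal for a system (for example mibi_rag_snippet), every correct answer sits at rank 1, which is what one would see if the system returns a single candidate per question.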
Ideal Answers
| System | Automatic (ROUGE): R-2 (Rec) | Automatic (ROUGE): R-2 (F1) | Automatic (ROUGE): R-SU4 (Rec) | Automatic (ROUGE): R-SU4 (F1) | Manual: Readability | Manual: Recall | Manual: Precision | Manual: Repetition |
|---|---|---|---|---|---|---|---|---|
| mibi_rag_snippet | 0.2616 | 0.1992 | 0.2653 | 0.1958 | 4.42 | 4.60 | 4.14 | 4.49 |
| mibi_rag_abstract | 0.2676 | 0.2036 | 0.2724 | 0.1989 | 4.34 | 4.55 | 4.16 | 4.39 |
| Gatech competition | 0.1463 | 0.1365 | 0.1479 | 0.1366 | 4.16 | 3.98 | 3.82 | 4.40 |
| GTBioASQsys2 | 0.1392 | 0.1266 | 0.1505 | 0.1320 | 4.16 | 4.00 | 3.96 | 4.45 |
| GTBioASQsys3 | 0.1288 | 0.1262 | 0.1385 | 0.1302 | 4.16 | 4.08 | 4.01 | 4.42 |
| UR-IW-5 | 0.2597 | 0.1696 | 0.2657 | 0.1653 | 4.39 | 4.44 | 3.94 | 4.42 |
| UR-IW-4 | 0.2307 | 0.1584 | 0.2478 | 0.1614 | 4.20 | 4.38 | 3.86 | 4.29 |
| UR-IW-3 | 0.1930 | 0.2114 | 0.1991 | 0.2105 | 4.38 | 4.20 | 4.12 | 4.49 |
| UR-IW-2 | 0.1934 | 0.2259 | 0.1878 | 0.2185 | 4.27 | 4.12 | 4.02 | 4.41 |
| simple truncation | 0.0692 | 0.0322 | 0.0700 | 0.0316 | 3.29 | 3.35 | 3.07 | 3.44 |
| kmeans | 0.0876 | 0.0539 | 0.0918 | 0.0545 | 3.84 | 3.87 | 3.71 | 4.14 |
| similarity measures | 0.0692 | 0.0322 | 0.0700 | 0.0316 | 3.29 | 3.35 | 3.07 | 3.44 |
| UR-IW-1 | 0.2138 | 0.1597 | 0.2275 | 0.1593 | 4.00 | 4.20 | 3.71 | 4.26 |
| Fleming-3 | 0.2356 | 0.1171 | 0.2520 | 0.1215 | 4.29 | 4.58 | 3.91 | 4.31 |
| dmiip2024 | 0.1954 | 0.2227 | 0.1897 | 0.2142 | 4.24 | 4.14 | 4.14 | 4.32 |
| dmiip2024_1 | 0.1971 | 0.2240 | 0.1948 | 0.2184 | 4.32 | 4.13 | 4.04 | 4.41 |
| dmiip2024_2 | 0.2262 | 0.2503 | 0.2191 | 0.2374 | 4.45 | 4.46 | 4.36 | 4.66 |
| dmiip2024_4 | 0.1992 | 0.2227 | 0.1883 | 0.2085 | 4.19 | 4.08 | 4.06 | 4.47 |
| dmiip2024_3 | 0.1455 | 0.1592 | 0.1533 | 0.1637 | 4.51 | 4.46 | 4.34 | 4.67 |
| bioinfo-0 | 0.2934 | 0.1446 | 0.3022 | 0.1437 | 4.26 | 4.60 | 3.89 | 4.24 |
| bioinfo-1 | 0.2740 | 0.1223 | 0.2892 | 0.1243 | 4.20 | 4.54 | 3.85 | 4.27 |
| bioinfo-2 | 0.3201 | 0.1551 | 0.3341 | 0.1575 | 4.25 | 4.65 | 3.93 | 4.31 |
| bioinfo-3 | 0.2246 | 0.2309 | 0.2239 | 0.2267 | 4.29 | 4.28 | 4.06 | 4.42 |
| bioinfo-4 | 0.1646 | 0.1826 | 0.1718 | 0.1865 | 4.15 | 4.00 | 3.92 | 4.21 |
| CPS | 0.2155 | 0.2071 | 0.2185 | 0.2029 | 4.21 | 4.05 | 3.84 | 4.33 |
| CPS2 | 0.1978 | 0.1649 | 0.2011 | 0.1623 | 4.13 | 3.69 | 3.49 | 4.40 |
Test batch 3
Exact Answers
| System | Yes/No: Accuracy | Yes/No: F1 Yes | Yes/No: F1 No | Yes/No: Macro F1 | Factoid: Strict Acc. | Factoid: Lenient Acc. | Factoid: MRR | List: Mean Prec. | List: Recall | List: F-Measure |
|---|---|---|---|---|---|---|---|---|---|---|
| bioinfo-0 | 0.5833 | 0.7368 | - | 0.3684 | - | - | - | - | - | - |
| bioinfo-1 | 0.5833 | 0.7368 | - | 0.3684 | - | - | - | - | - | - |
| bioinfo-2 | 0.5833 | 0.7368 | - | 0.3684 | - | - | - | - | - | - |
| bioinfo-3 | 0.5833 | 0.7368 | - | 0.3684 | - | - | - | - | - | - |
| bioinfo-4 | 0.5833 | 0.7368 | - | 0.3684 | - | - | - | - | - | - |
| GTBioASQsys2 | 0.7500 | 0.7692 | 0.7273 | 0.7483 | 0.0769 | 0.0769 | 0.0769 | 0.2417 | 0.2287 | 0.2167 |
| GTBioASQsys4 | 0.8750 | 0.8889 | 0.8571 | 0.8730 | 0.1154 | 0.1154 | 0.1154 | 0.1281 | 0.1531 | 0.1284 |
| Gatech competition | 0.8750 | 0.8966 | 0.8421 | 0.8693 | 0.2692 | 0.2692 | 0.2692 | 0.2177 | 0.1962 | 0.1862 |
| GTBioASQsys3 | 0.8333 | 0.8462 | 0.8182 | 0.8322 | 0.3077 | 0.3077 | 0.3077 | 0.2206 | 0.1887 | 0.1832 |
| Fleming-3 | 0.8333 | 0.8571 | 0.8000 | 0.8286 | 0.1154 | 0.2692 | 0.1635 | 0.1804 | 0.1625 | 0.1533 |
| mibi_rag_abstract | 0.4583 | 0.3158 | 0.5517 | 0.4338 | 0.2692 | 0.2692 | 0.2692 | 0.2953 | 0.2358 | 0.2468 |
| mibi_rag_snippet | 0.9583 | 0.9655 | 0.9474 | 0.9564 | 0.1923 | 0.1923 | 0.1923 | 0.2825 | 0.2259 | 0.2351 |
| mibi_rag_3 | 0.9583 | 0.9655 | 0.9474 | 0.9564 | 0.1923 | 0.1923 | 0.1923 | 0.2825 | 0.2259 | 0.2351 |
| mibi_rag_4 | 0.5417 | 0.4762 | 0.5926 | 0.5344 | 0.3077 | 0.3077 | 0.3077 | 0.3813 | 0.2884 | 0.3116 |
| mibi_rag_5 | 0.9167 | 0.9286 | 0.9000 | 0.9143 | 0.1923 | 0.1923 | 0.1923 | 0.2754 | 0.2226 | 0.2306 |
| CPS | 0.6250 | 0.7273 | 0.4000 | 0.5636 | 0.1538 | 0.1538 | 0.1538 | 0.1404 | 0.0684 | 0.0810 |
| CPS2 | 0.5833 | 0.7059 | 0.2857 | 0.4958 | - | - | - | - | - | - |
| CPS3 | 0.7500 | 0.8125 | 0.6250 | 0.7188 | - | - | - | - | - | - |
| UR-IW-1 | 0.8333 | 0.8667 | 0.7778 | 0.8222 | 0.2692 | 0.4231 | 0.3237 | 0.3189 | 0.3798 | 0.3020 |
| UR-IW-3 | 0.7917 | 0.8000 | 0.7826 | 0.7913 | 0.1923 | 0.2308 | 0.2115 | 0.1916 | 0.2545 | 0.2031 |
| UR-IW-4 | 0.8750 | 0.8889 | 0.8571 | 0.8730 | 0.2308 | 0.3077 | 0.2596 | 0.2502 | 0.2432 | 0.2189 |
| UR-IW-5 | 0.9167 | 0.9286 | 0.9000 | 0.9143 | 0.1923 | 0.2308 | 0.2115 | 0.2980 | 0.2650 | 0.2561 |
| UR-IW-2 | 0.8333 | 0.8462 | 0.8182 | 0.8322 | 0.2308 | 0.3077 | 0.2628 | 0.2869 | 0.3157 | 0.2688 |
| dmiip2024_1 | 0.9583 | 0.9630 | 0.9524 | 0.9577 | 0.4231 | 0.4615 | 0.4423 | 0.4320 | 0.3747 | 0.3608 |
| dmiip2024 | 0.9583 | 0.9630 | 0.9524 | 0.9577 | 0.3462 | 0.4615 | 0.4038 | 0.4101 | 0.3419 | 0.3369 |
| dmiip2024_2 | 0.8750 | 0.8966 | 0.8421 | 0.8693 | 0.3462 | 0.5385 | 0.4199 | 0.2859 | 0.3494 | 0.2500 |
| dmiip2024_3 | 0.8750 | 0.8966 | 0.8421 | 0.8693 | 0.3077 | 0.3846 | 0.3397 | 0.4508 | 0.4098 | 0.3915 |
| dmiip2024_4 | 0.4167 | - | 0.5882 | 0.2941 | 0.3462 | 0.4231 | 0.3846 | 0.2406 | 0.3433 | 0.2264 |
Ideal Answers
| System | Automatic (ROUGE): R-2 (Rec) | Automatic (ROUGE): R-2 (F1) | Automatic (ROUGE): R-SU4 (Rec) | Automatic (ROUGE): R-SU4 (F1) | Manual: Readability | Manual: Recall | Manual: Precision | Manual: Repetition |
|---|---|---|---|---|---|---|---|---|
| bioinfo-0 | 0.3517 | 0.1571 | 0.3569 | 0.1550 | 4.28 | 4.69 | 3.94 | 4.33 |
| bioinfo-1 | 0.3364 | 0.1560 | 0.3425 | 0.1532 | 4.33 | 4.58 | 3.91 | 4.28 |
| bioinfo-2 | 0.3868 | 0.1923 | 0.3984 | 0.1919 | 4.25 | 4.85 | 3.95 | 4.24 |
| bioinfo-3 | 0.2404 | 0.1685 | 0.2474 | 0.1683 | 4.39 | 4.55 | 4.06 | 4.51 |
| bioinfo-4 | 0.2410 | 0.1618 | 0.2472 | 0.1620 | 4.49 | 4.54 | 3.98 | 4.56 |
| GTBioASQsys2 | 0.1980 | 0.1653 | 0.1910 | 0.1594 | 4.48 | 3.99 | 4.02 | 4.66 |
| GTBioASQsys4 | 0.1881 | 0.1688 | 0.1817 | 0.1628 | 4.25 | 3.88 | 3.80 | 4.52 |
| Gatech competition | 0.1952 | 0.1799 | 0.1892 | 0.1731 | 4.27 | 4.09 | 4.06 | 4.55 |
| GTBioASQsys3 | 0.1835 | 0.1836 | 0.1802 | 0.1792 | 4.24 | 3.92 | 4.07 | 4.51 |
| Fleming-3 | 0.2879 | 0.1092 | 0.3080 | 0.1153 | 4.14 | 4.36 | 3.62 | 4.11 |
| mibi_rag_abstract | 0.2367 | 0.2545 | 0.2219 | 0.2403 | 4.55 | 3.84 | 4.25 | 4.73 |
| mibi_rag_snippet | 0.2490 | 0.2352 | 0.2392 | 0.2237 | 4.35 | 3.91 | 3.94 | 4.54 |
| mibi_rag_3 | 0.2407 | 0.2312 | 0.2313 | 0.2196 | 4.34 | 3.91 | 3.94 | 4.54 |
| mibi_rag_4 | 0.2548 | 0.2464 | 0.2465 | 0.2342 | 4.52 | 3.99 | 4.20 | 4.73 |
| mibi_rag_5 | 0.2399 | 0.2359 | 0.2288 | 0.2224 | 4.45 | 3.92 | 4.02 | 4.60 |
| CPS | 0.2431 | 0.2326 | 0.2374 | 0.2242 | 4.26 | 3.91 | 3.88 | 4.36 |
| CPS2 | 0.2562 | 0.2346 | 0.2504 | 0.2260 | 4.21 | 3.92 | 3.88 | 4.38 |
| CPS3 | 0.2333 | 0.2373 | 0.2260 | 0.2272 | 4.31 | 3.78 | 4.00 | 4.52 |
| UR-IW-1 | 0.2706 | 0.1861 | 0.2683 | 0.1779 | 4.46 | 4.60 | 3.92 | 4.41 |
| UR-IW-3 | 0.3162 | 0.1771 | 0.3189 | 0.1741 | 4.32 | 4.47 | 3.94 | 4.33 |
| UR-IW-4 | 0.3252 | 0.2004 | 0.3269 | 0.1968 | 4.49 | 4.61 | 4.11 | 4.48 |
| UR-IW-5 | 0.3281 | 0.1643 | 0.3381 | 0.1653 | 4.25 | 4.28 | 3.72 | 4.24 |
| UR-IW-2 | 0.3407 | 0.1972 | 0.3510 | 0.1965 | 4.41 | 4.53 | 3.99 | 4.40 |
| dmiip2024_1 | 0.2533 | 0.2717 | 0.2483 | 0.2635 | 4.49 | 4.31 | 4.44 | 4.64 |
| dmiip2024 | 0.2475 | 0.2678 | 0.2374 | 0.2547 | 4.52 | 4.26 | 4.28 | 4.65 |
| dmiip2024_2 | 0.2694 | 0.2862 | 0.2520 | 0.2688 | 4.61 | 4.26 | 4.33 | 4.69 |
| dmiip2024_3 | 0.1959 | 0.2194 | 0.1929 | 0.2146 | 4.55 | 4.27 | 4.35 | 4.67 |
| dmiip2024_4 | 0.2375 | 0.2562 | 0.2264 | 0.2434 | 4.45 | 3.93 | 4.22 | 4.65 |
Test batch 4
Exact Answers
| System | Yes/No: Accuracy | Yes/No: F1 Yes | Yes/No: F1 No | Yes/No: Macro F1 | Factoid: Strict Acc. | Factoid: Lenient Acc. | Factoid: MRR | List: Mean Prec. | List: Recall | List: F-Measure |
|---|---|---|---|---|---|---|---|---|---|---|
| UR-IW-1 | 0.8519 | 0.8947 | 0.7500 | 0.8224 | 0.5263 | 0.5789 | 0.5526 | 0.2170 | 0.2910 | 0.2318 |
| UR-IW-3 | 0.7037 | 0.7647 | 0.6000 | 0.6824 | 0.3684 | 0.4211 | 0.3816 | 0.1610 | 0.1452 | 0.1441 |
| UR-IW-4 | 0.7778 | 0.8333 | 0.6667 | 0.7500 | 0.3684 | 0.4211 | 0.3860 | 0.1924 | 0.1882 | 0.1719 |
| UR-IW-5 | 0.7407 | 0.8000 | 0.6316 | 0.7158 | 0.4737 | 0.4737 | 0.4737 | 0.2073 | 0.1775 | 0.1702 |
| Fleming-1 | 0.8148 | 0.8571 | 0.7368 | 0.7970 | 0.2105 | 0.2632 | 0.2211 | 0.1881 | 0.1643 | 0.1668 |
| mibi_rag_snippet | 0.4444 | 0.5161 | 0.3478 | 0.4320 | 0.1579 | 0.1579 | 0.1579 | 0.2343 | 0.1280 | 0.1574 |
| mibi_rag_abstract | 0.3704 | 0.1905 | 0.4848 | 0.3377 | 0.3158 | 0.3158 | 0.3158 | 0.3636 | 0.2246 | 0.2634 |
| mibi_rag_3 | 0.4444 | 0.5161 | 0.3478 | 0.4320 | 0.1579 | 0.1579 | 0.1579 | 0.2343 | 0.1280 | 0.1574 |
| mibi_rag_4 | 0.3704 | 0.1905 | 0.4848 | 0.3377 | 0.2105 | 0.2105 | 0.2105 | 0.3711 | 0.2295 | 0.2669 |
| mibi_rag_5 | 0.4815 | 0.5333 | 0.4167 | 0.4750 | 0.1579 | 0.1579 | 0.1579 | 0.2343 | 0.1280 | 0.1574 |
| bioinfo-0 | 0.7037 | 0.8261 | - | 0.4130 | - | - | - | - | - | - |
| bioinfo-1 | 0.7037 | 0.8261 | - | 0.4130 | - | - | - | - | - | - |
| bioinfo-2 | 0.7037 | 0.8261 | - | 0.4130 | - | - | - | - | - | - |
| bioinfo-3 | 0.7037 | 0.8261 | - | 0.4130 | - | - | - | - | - | - |
| bioinfo-4 | 0.7037 | 0.8261 | - | 0.4130 | - | - | - | - | - | - |
| GTBioASQsys2 | 0.6296 | 0.6875 | 0.5455 | 0.6165 | 0.2105 | 0.2105 | 0.2105 | 0.1126 | 0.1060 | 0.1071 |
| GTBioASQsys3 | 0.7037 | 0.7647 | 0.6000 | 0.6824 | 0.0526 | 0.0526 | 0.0526 | 0.1362 | 0.1285 | 0.1248 |
| GTBioASQsys4 | 0.6667 | 0.7097 | 0.6087 | 0.6592 | 0.1579 | 0.1579 | 0.1579 | 0.1872 | 0.1715 | 0.1664 |
| UR-IW-2 | 0.8519 | 0.8947 | 0.7500 | 0.8224 | 0.4737 | 0.5263 | 0.5000 | 0.1887 | 0.2207 | 0.1767 |
| CPS | 0.7778 | 0.8571 | 0.5000 | 0.6786 | 0.1053 | 0.1053 | 0.1053 | 0.1795 | 0.1052 | 0.1239 |
| CPS2 | 0.8148 | 0.8837 | 0.5455 | 0.7146 | - | - | - | - | - | - |
| CPS3 | 0.8148 | 0.8837 | 0.5455 | 0.7146 | - | - | - | - | - | - |
| dmiip2024_3 | 0.8148 | 0.8649 | 0.7059 | 0.7854 | 0.4211 | 0.4211 | 0.4211 | 0.3246 | 0.2974 | 0.3080 |
| Fleming-2 | 0.7778 | 0.8421 | 0.6250 | 0.7336 | 0.2105 | 0.2632 | 0.2211 | 0.1881 | 0.1643 | 0.1668 |
| dmiip2024 | 0.8148 | 0.8571 | 0.7368 | 0.7970 | 0.3684 | 0.4737 | 0.4123 | 0.3586 | 0.3614 | 0.3505 |
| dmiip2024_1 | 0.8889 | 0.9231 | 0.8000 | 0.8615 | 0.4737 | 0.5263 | 0.5000 | 0.4338 | 0.3903 | 0.4022 |
| dmiip2024_2 | 0.8889 | 0.9189 | 0.8235 | 0.8712 | 0.4211 | 0.4211 | 0.4211 | 0.2886 | 0.4087 | 0.3142 |
| dmiip2024_4 | 0.2963 | - | 0.4571 | 0.2286 | 0.2632 | 0.3684 | 0.3070 | 0.2274 | 0.3027 | 0.2424 |
| extractive | 0.8148 | 0.8649 | 0.7059 | 0.7854 | - | - | - | - | - | - |
Ideal Answers
| System | Automatic (ROUGE): R-2 (Rec) | Automatic (ROUGE): R-2 (F1) | Automatic (ROUGE): R-SU4 (Rec) | Automatic (ROUGE): R-SU4 (F1) | Manual: Readability | Manual: Recall | Manual: Precision | Manual: Repetition |
|---|---|---|---|---|---|---|---|---|
| UR-IW-1 | 0.2465 | 0.1866 | 0.2511 | 0.1803 | 4.42 | 4.27 | 3.73 | 4.46 |
| UR-IW-3 | 0.2810 | 0.1647 | 0.2915 | 0.1649 | 4.09 | 3.92 | 3.39 | 4.21 |
| UR-IW-4 | 0.2712 | 0.1677 | 0.2828 | 0.1700 | 4.29 | 4.08 | 3.72 | 4.39 |
| UR-IW-5 | 0.3027 | 0.1683 | 0.3046 | 0.1655 | 4.15 | 3.88 | 3.49 | 4.16 |
| Fleming-1 | 0.2771 | 0.0967 | 0.2959 | 0.1014 | 4.31 | 4.02 | 3.49 | 4.29 |
| mibi_rag_snippet | 0.2520 | 0.2449 | 0.2509 | 0.2388 | 4.55 | 3.58 | 3.85 | 4.65 |
| mibi_rag_abstract | 0.2269 | 0.2419 | 0.2264 | 0.2370 | 4.64 | 3.40 | 3.85 | 4.78 |
| mibi_rag_3 | 0.2523 | 0.2447 | 0.2509 | 0.2386 | 4.55 | 3.58 | 3.84 | 4.65 |
| mibi_rag_4 | 0.2363 | 0.2355 | 0.2376 | 0.2312 | 4.67 | 3.46 | 3.79 | 4.81 |
| mibi_rag_5 | 0.2312 | 0.2318 | 0.2341 | 0.2284 | 4.53 | 3.47 | 3.80 | 4.68 |
| bioinfo-0 | 0.3597 | 0.1515 | 0.3710 | 0.1515 | 4.47 | 4.55 | 3.87 | 4.48 |
| bioinfo-1 | 0.3381 | 0.1484 | 0.3462 | 0.1492 | 4.45 | 4.42 | 3.75 | 4.42 |
| bioinfo-2 | 0.3807 | 0.1710 | 0.3838 | 0.1692 | 4.52 | 4.47 | 3.87 | 4.41 |
| bioinfo-3 | 0.2300 | 0.1572 | 0.2378 | 0.1587 | 4.61 | 4.28 | 4.09 | 4.65 |
| bioinfo-4 | 0.2239 | 0.1521 | 0.2337 | 0.1530 | 4.59 | 4.21 | 3.93 | 4.67 |
| GTBioASQsys2 | 0.1628 | 0.1443 | 0.1636 | 0.1425 | 4.34 | 3.52 | 3.61 | 4.69 |
| GTBioASQsys3 | 0.1443 | 0.1268 | 0.1479 | 0.1258 | 4.45 | 3.58 | 3.73 | 4.71 |
| GTBioASQsys4 | 0.1552 | 0.1332 | 0.1594 | 0.1331 | 4.47 | 3.59 | 3.79 | 4.72 |
| UR-IW-2 | 0.2599 | 0.2114 | 0.2600 | 0.2064 | 4.45 | 4.12 | 3.74 | 4.54 |
| CPS | 0.2643 | 0.1910 | 0.2733 | 0.1904 | 4.48 | 3.86 | 3.59 | 4.44 |
| CPS2 | 0.2733 | 0.1919 | 0.2825 | 0.1930 | 4.36 | 3.85 | 3.60 | 4.45 |
| CPS3 | 0.2391 | 0.2265 | 0.2351 | 0.2163 | 4.52 | 3.58 | 3.74 | 4.56 |
| dmiip2024_3 | 0.2488 | 0.2547 | 0.2374 | 0.2412 | 4.49 | 3.92 | 4.12 | 4.64 |
| Fleming-2 | 0.2771 | 0.0967 | 0.2959 | 0.1014 | 4.31 | 4.02 | 3.49 | 4.29 |
| dmiip2024 | 0.2494 | 0.2415 | 0.2513 | 0.2382 | 4.60 | 4.06 | 4.13 | 4.65 |
| dmiip2024_1 | 0.2723 | 0.2759 | 0.2721 | 0.2694 | 4.53 | 4.16 | 4.26 | 4.71 |
| dmiip2024_2 | 0.2723 | 0.2759 | 0.2721 | 0.2694 | 4.53 | 4.16 | 4.26 | 4.71 |
| dmiip2024_4 | 0.2381 | 0.2495 | 0.2328 | 0.2408 | 4.56 | 3.58 | 3.91 | 4.71 |
| extractive | - | - | - | - | - | - | - | - |