BioASQ Participants Area
Task Synergy - version 2025: Test Results
Test round 1
Documents
System | Mean precision | Recall | F-Measure | MAP | GMAP |
Fleming-2 | 0.2238 | 0.3375 | 0.2121 | 0.2828 | 0.0076 |
Fleming-1 | 0.2426 | 0.3939 | 0.2403 | 0.2172 | 0.0142 |
dmiip2024 | 0.3275 | 0.4936 | 0.2899 | 0.4060 | 0.0436 |
dmiip2024_1 | 0.3397 | 0.4944 | 0.2906 | 0.4051 | 0.0497 |
dmiip2024_2 | 0.3217 | 0.4921 | 0.2848 | 0.3960 | 0.0423 |
dmiip2024_3 | 0.2410 | 0.4860 | 0.2743 | 0.3969 | 0.0645 |
dmiip2024_4 | 0.3051 | 0.4687 | 0.2703 | 0.3844 | 0.0309 |
SCIRE1 Results | 0.3227 | 0.2072 | 0.2249 | 0.1928 | 0.0011 |
Snippets
System | Mean precision | Recall | F-Measure | MAP | GMAP |
Fleming-2 | 0.1509 | 0.1899 | 0.1388 | 0.2068 | 0.0072 |
Fleming-1 | 0.1864 | 0.1290 | 0.1297 | 0.1164 | 0.0014 |
dmiip2024 | 0.2688 | 0.5011 | 0.3003 | 0.5653 | 0.0625 |
dmiip2024_1 | 0.2803 | 0.5049 | 0.3087 | 0.5880 | 0.0879 |
dmiip2024_2 | 0.2552 | 0.4682 | 0.2855 | 0.5544 | 0.0590 |
dmiip2024_3 | 0.2163 | 0.4364 | 0.2489 | 0.5043 | 0.0437 |
dmiip2024_4 | 0.2471 | 0.4710 | 0.2746 | 0.5636 | 0.0522 |
SCIRE1 Results | 0.1444 | 0.1095 | 0.1090 | 0.0749 | 0.0003 |
Exact Answers
| Yes/No | | | | Factoid | | | List | | |
System | Accuracy | F1 Yes | F1 No | Macro F1 | Strict Acc. | Lenient Acc. | MRR | Mean Prec. | Recall | F-Measure |
Fleming-2 | 0.4000 | 0.5714 | - | 0.2857 | - | - | - | 0.1429 | 0.0130 | 0.0238 |
Fleming-1 | 0.4000 | 0.5714 | - | 0.2857 | - | - | - | 0.1429 | 0.0130 | 0.0238 |
dmiip2024 | 1.0000 | 1.0000 | 1.0000 | 1.0000 | 0.6667 | 0.6667 | 0.6667 | 0.0238 | 0.0130 | 0.0168 |
dmiip2024_1 | 0.4000 | 0.5714 | - | 0.2857 | - | - | - | - | - | - |
dmiip2024_2 | 0.8000 | 0.8000 | 0.8000 | 0.8000 | 0.6667 | 0.6667 | 0.6667 | 0.0476 | 0.0476 | 0.0476 |
dmiip2024_3 | 1.0000 | 1.0000 | 1.0000 | 1.0000 | 0.6667 | 0.6667 | 0.6667 | 0.1667 | 0.0606 | 0.0882 |
dmiip2024_4 | 1.0000 | 1.0000 | 1.0000 | 1.0000 | 0.6667 | 0.6667 | 0.6667 | 0.0714 | 0.0476 | 0.0571 |
SCIRE1 Results | 0.8000 | 0.8000 | 0.8000 | 0.8000 | 0.6667 | 0.6667 | 0.6667 | - | - | - |
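As a reading aid for the Yes/No columns: Macro F1 is conventionally the unweighted mean of the per-class F1 scores, and the reported values above are consistent with that. The sketch below is illustrative only and is not the BioASQ evaluation code; the helper name `macro_f1` is hypothetical, and treating a missing score ("-") as 0 is an assumption that happens to reproduce the Fleming-2 row (0.5714 / 2 = 0.2857).

```python
def macro_f1(f1_yes, f1_no):
    """Unweighted mean of the two per-class F1 scores.

    Assumption: a missing score (shown as '-' in the table, None here)
    is counted as 0.0.
    """
    scores = [0.0 if s is None else s for s in (f1_yes, f1_no)]
    return sum(scores) / len(scores)


# (F1 Yes, F1 No, reported Macro F1) copied from the round 1 table above.
rows = {
    "Fleming-2": (0.5714, None, 0.2857),
    "dmiip2024_2": (0.8000, 0.8000, 0.8000),
}

for system, (f1_yes, f1_no, reported) in rows.items():
    computed = macro_f1(f1_yes, f1_no)
    print(f"{system}: computed {computed:.4f}, reported {reported:.4f}")
```

The same check can be applied to rows in later rounds, for example F1 Yes 0.9091 and F1 No 0.8889 average to the reported 0.8990.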
Ideal Answers
| Automatic scores (Rouge - R) | | | | Manual scores | | | |
System | R-2 (Rec) | R-2 (F1) | R-SU4 (Rec) | R-SU4 (F1) | Readability | Recall | Precision | Repetition |
Fleming-2 | - | - | - | - | - | - | - | - |
Fleming-1 | - | - | - | - | - | - | - | - |
dmiip2024 | 0.1574 | 0.1982 | 0.1604 | 0.2038 | 4.47 | 4.53 | 4.47 | 4.47 |
dmiip2024_1 | 0.1476 | 0.1880 | 0.1524 | 0.1953 | 4.42 | 4.42 | 4.47 | 4.58 |
dmiip2024_2 | 0.1336 | 0.1719 | 0.1354 | 0.1760 | 4.37 | 4.37 | 4.16 | 4.58 |
dmiip2024_3 | 0.1553 | 0.1831 | 0.1583 | 0.1890 | 4.47 | 4.32 | 4.00 | 4.53 |
dmiip2024_4 | 0.1855 | 0.2211 | 0.1905 | 0.2282 | 4.37 | 4.53 | 4.16 | 4.58 |
SCIRE1 Results | 0.1423 | 0.1756 | 0.1470 | 0.1810 | 4.16 | 4.32 | 4.16 | 4.53 |
Test round 2
Documents
System | Mean precision | Recall | F-Measure | MAP | GMAP |
dmiip2024 | 0.1860 | 0.5784 | 0.2549 | 0.3949 | 0.0426 |
dmiip2024_1 | 0.2000 | 0.6372 | 0.2760 | 0.4125 | 0.0679 |
dmiip2024_2 | 0.2000 | 0.6372 | 0.2760 | 0.4125 | 0.0679 |
dmiip2024_3 | 0.1837 | 0.5549 | 0.2519 | 0.3322 | 0.0371 |
dmiip2024_4 | 0.1930 | 0.5923 | 0.2671 | 0.3393 | 0.0474 |
SCIRE2 Results | 0.0116 | 0.0116 | 0.0116 | 0.0058 | 0.0000 |
Fleming-2 | 0.0736 | 0.0491 | 0.0505 | 0.0317 | 0.0000 |
Q&A based on RAG | 0.0422 | 0.0615 | 0.0446 | 0.0361 | 0.0001 |
Fleming-1 | 0.0736 | 0.0491 | 0.0505 | 0.0317 | 0.0000 |
Q&A based on RAG2 | 0.0490 | 0.0629 | 0.0512 | 0.0328 | 0.0001 |
Snippets
System | Mean precision | Recall | F-Measure | MAP | GMAP |
dmiip2024 | 0.1132 | 0.2028 | 0.1202 | 0.2261 | 0.0016 |
dmiip2024_1 | 0.1248 | 0.2805 | 0.1414 | 0.2478 | 0.0027 |
dmiip2024_2 | 0.1248 | 0.2805 | 0.1414 | 0.2478 | 0.0027 |
dmiip2024_3 | 0.1153 | 0.1923 | 0.1163 | 0.1947 | 0.0009 |
dmiip2024_4 | 0.1187 | 0.2395 | 0.1282 | 0.1971 | 0.0015 |
SCIRE2 Results | 0.3422 | 0.3072 | 0.2949 | 0.2740 | 0.0026 |
Fleming-2 | 0.1211 | 0.0846 | 0.0757 | 0.1190 | 0.0004 |
Q&A based on RAG | 0.0265 | 0.0225 | 0.0217 | 0.0203 | 0.0000 |
Fleming-1 | 0.1211 | 0.0846 | 0.0757 | 0.1190 | 0.0004 |
Q&A based on RAG2 | 0.0290 | 0.0207 | 0.0222 | 0.0180 | 0.0000 |
Exact Answers
| Yes/No | | | | Factoid | | | List | | |
System | Accuracy | F1 Yes | F1 No | Macro F1 | Strict Acc. | Lenient Acc. | MRR | Mean Prec. | Recall | F-Measure |
dmiip2024 | 1.0000 | 1.0000 | 1.0000 | 1.0000 | 0.4286 | 0.4286 | 0.4286 | 0.1000 | 0.1000 | 0.1000 |
dmiip2024_1 | 1.0000 | 1.0000 | 1.0000 | 1.0000 | 0.4286 | 0.4286 | 0.4286 | 0.2000 | 0.4000 | 0.2467 |
dmiip2024_2 | 1.0000 | 1.0000 | 1.0000 | 1.0000 | 0.2857 | 0.2857 | 0.2857 | 0.1700 | 0.3000 | 0.2000 |
dmiip2024_3 | 0.8571 | 0.8000 | 0.8889 | 0.8444 | 0.2857 | 0.2857 | 0.2857 | 0.1000 | 0.1000 | 0.1000 |
dmiip2024_4 | 0.8571 | 0.8000 | 0.8889 | 0.8444 | 0.2857 | 0.2857 | 0.2857 | 0.1833 | 0.2500 | 0.2000 |
SCIRE2 Results | 0.8571 | 0.8571 | 0.8571 | 0.8571 | 0.2857 | 0.2857 | 0.2857 | 0.1500 | 0.1500 | 0.1500 |
Fleming-2 | 0.5714 | 0.6667 | 0.4000 | 0.5333 | 0.1429 | 0.4286 | 0.2857 | 0.2167 | 0.2667 | 0.2333 |
Q&A based on RAG | 0.5714 | 0.5714 | 0.5714 | 0.5714 | 0.1429 | 0.1429 | 0.1429 | 0.0500 | 0.0500 | 0.0500 |
Fleming-1 | 0.8571 | 0.8571 | 0.8571 | 0.8571 | 0.2857 | 0.2857 | 0.2857 | 0.2111 | 0.2833 | 0.2100 |
Q&A based on RAG2 | 0.5714 | 0.6667 | 0.4000 | 0.5333 | 0.1429 | 0.1429 | 0.1429 | 0.1000 | 0.1083 | 0.1036 |
Ideal Answers
| Automatic scores (Rouge - R) | | | | Manual scores | | | |
System | R-2 (Rec) | R-2 (F1) | R-SU4 (Rec) | R-SU4 (F1) | Readability | Recall | Precision | Repetition |
dmiip2024 | 0.1305 | 0.1546 | 0.1375 | 0.1630 | - | - | - | - |
dmiip2024_1 | 0.1630 | 0.1866 | 0.1668 | 0.1922 | 4.06 | 3.79 | 3.79 | 4.39 |
dmiip2024_2 | 0.1312 | 0.1567 | 0.1374 | 0.1648 | - | - | - | - |
dmiip2024_3 | 0.1281 | 0.1539 | 0.1324 | 0.1597 | 4.12 | 3.52 | 3.58 | 4.39 |
dmiip2024_4 | 0.1500 | 0.1750 | 0.1536 | 0.1802 | 4.03 | 3.55 | 3.73 | 4.45 |
SCIRE2 Results | 0.1674 | 0.1767 | 0.1770 | 0.1858 | 2.09 | 2.24 | 2.03 | 2.09 |
Fleming-2 | 0.2120 | 0.1512 | 0.2341 | 0.1668 | 3.52 | 4.06 | 3.27 | 3.73 |
Q&A based on RAG | 0.1536 | 0.1193 | 0.1754 | 0.1354 | 3.55 | 3.45 | 3.18 | 3.76 |
Fleming-1 | 0.1752 | 0.1622 | 0.1896 | 0.1749 | 4.06 | 4.24 | 3.88 | 4.00 |
Q&A based on RAG2 | 0.1497 | 0.1141 | 0.1733 | 0.1325 | 3.39 | 3.18 | 3.03 | 3.67 |
Test round 3
Documents
System | Mean precision | Recall | F-Measure | MAP | GMAP |
dmiip2024 | 0.1364 | 0.5078 | 0.1954 | 0.3017 | 0.0179 |
dmiip2024_1 | 0.1455 | 0.5523 | 0.2095 | 0.3636 | 0.0272 |
dmiip2024_3 | 0.1303 | 0.4438 | 0.1863 | 0.2657 | 0.0098 |
dmiip2024_4 | 0.1333 | 0.4595 | 0.1916 | 0.2626 | 0.0104 |
dmiip2024_2 | 0.1394 | 0.4676 | 0.1985 | 0.2821 | 0.0110 |
Fleming-1 | - | - | - | - | - |
SCIRE3 Results | 0.1638 | 0.2670 | 0.1498 | 0.1156 | 0.0009 |
Fleming-2 | - | - | - | - | - |
Q&A ClusteredRAG | - | - | - | - | - |
Q&A - ClusteredRAG | - | - | - | - | - |
Q&A based on RAG | - | - | - | - | - |
Q&A based on RAG2 | - | - | - | - | - |
Fleming-3 | - | - | - | - | - |
Snippets
System | Mean precision | Recall | F-Measure | MAP | GMAP |
dmiip2024 | 0.0826 | 0.2159 | 0.0986 | 0.1835 | 0.0009 |
dmiip2024_1 | 0.0922 | 0.2372 | 0.1099 | 0.2504 | 0.0013 |
dmiip2024_3 | 0.0781 | 0.1650 | 0.0893 | 0.1515 | 0.0005 |
dmiip2024_4 | 0.0685 | 0.1456 | 0.0793 | 0.1336 | 0.0004 |
dmiip2024_2 | 0.0809 | 0.1649 | 0.0911 | 0.1564 | 0.0006 |
Fleming-1 | - | - | - | - | - |
SCIRE3 Results | 0.1564 | 0.2072 | 0.1453 | 0.2576 | 0.0010 |
Fleming-2 | - | - | - | - | - |
Q&A ClusteredRAG | - | - | - | - | - |
Q&A - ClusteredRAG | - | - | - | - | - |
Q&A based on RAG | - | - | - | - | - |
Q&A based on RAG2 | - | - | - | - | - |
Fleming-3 | - | - | - | - | - |
Exact Answers
| Yes/No | | | | Factoid | | | List | | |
System | Accuracy | F1 Yes | F1 No | Macro F1 | Strict Acc. | Lenient Acc. | MRR | Mean Prec. | Recall | F-Measure |
dmiip2024 | 1.0000 | 1.0000 | 1.0000 | 1.0000 | 0.5000 | 0.5000 | 0.5000 | 0.1547 | 0.4115 | 0.1993 |
dmiip2024_1 | 0.9000 | 0.9091 | 0.8889 | 0.8990 | 0.5000 | 0.5000 | 0.5000 | 0.1875 | 0.4583 | 0.2304 |
dmiip2024_3 | 0.9000 | 0.9091 | 0.8889 | 0.8990 | 0.2000 | 0.3000 | 0.2500 | 0.1980 | 0.5265 | 0.2423 |
dmiip2024_4 | 0.9000 | 0.9091 | 0.8889 | 0.8990 | 0.4000 | 0.6000 | 0.5000 | 0.1980 | 0.5890 | 0.2495 |
dmiip2024_2 | 0.9000 | 0.9091 | 0.8889 | 0.8990 | 0.2000 | 0.3000 | 0.2500 | 0.2155 | 0.5473 | 0.2593 |
Fleming-1 | 0.8000 | 0.8333 | 0.7500 | 0.7917 | 0.2000 | 0.3000 | 0.2500 | 0.2177 | 0.5104 | 0.2634 |
SCIRE3 Results | 0.9000 | 0.9091 | 0.8889 | 0.8990 | 0.3000 | 0.4000 | 0.3250 | 0.1751 | 0.4328 | 0.2080 |
Fleming-2 | 0.8000 | 0.8333 | 0.7500 | 0.7917 | 0.2000 | 0.3000 | 0.2500 | 0.2177 | 0.5104 | 0.2634 |
Q&A ClusteredRAG | 0.7000 | 0.5714 | 0.7692 | 0.6703 | 0.1000 | 0.1000 | 0.1000 | 0.0146 | 0.0682 | 0.0227 |
Q&A - ClusteredRAG | 0.6000 | 0.3333 | 0.7143 | 0.5238 | - | - | - | 0.0246 | 0.0781 | 0.0313 |
Q&A based on RAG | 0.7000 | 0.5714 | 0.7692 | 0.6703 | 0.1000 | 0.1000 | 0.1000 | 0.1198 | 0.1151 | 0.0751 |
Q&A based on RAG2 | 0.7000 | 0.5714 | 0.7692 | 0.6703 | 0.1000 | 0.1000 | 0.1000 | 0.0547 | 0.1406 | 0.0712 |
Fleming-3 | 0.9000 | 0.9091 | 0.8889 | 0.8990 | 0.2000 | 0.3000 | 0.2500 | 0.2177 | 0.5104 | 0.2634 |
Ideal Answers
| Automatic scores (Rouge - R) | | | | Manual scores | | | |
System | R-2 (Rec) | R-2 (F1) | R-SU4 (Rec) | R-SU4 (F1) | Readability | Recall | Precision | Repetition |
dmiip2024 | 0.1670 | 0.1968 | 0.1700 | 0.2011 | 4.49 | 4.37 | 4.27 | 4.47 |
dmiip2024_1 | 0.2438 | 0.2668 | 0.2485 | 0.2723 | 4.57 | 4.61 | 4.43 | 4.57 |
dmiip2024_3 | 0.2019 | 0.2198 | 0.2053 | 0.2241 | 4.47 | 4.47 | 4.31 | 4.45 |
dmiip2024_4 | 0.2019 | 0.2251 | 0.2066 | 0.2309 | 4.51 | 4.53 | 4.31 | 4.47 |
dmiip2024_2 | 0.1995 | 0.2173 | 0.2056 | 0.2253 | 4.57 | 4.59 | 4.31 | 4.55 |
Fleming-1 | 0.3064 | 0.2423 | 0.3178 | 0.2499 | 4.31 | 4.73 | 4.06 | 4.47 |
SCIRE3 Results | 0.2076 | 0.1724 | 0.2192 | 0.1822 | 3.76 | 3.94 | 3.47 | 3.80 |
Fleming-2 | 0.3064 | 0.2423 | 0.3178 | 0.2499 | 4.31 | 4.73 | 4.06 | 4.47 |
Q&A ClusteredRAG | 0.0875 | 0.0665 | 0.1062 | 0.0806 | 3.61 | 3.00 | 2.84 | 3.84 |
Q&A - ClusteredRAG | 0.1366 | 0.0942 | 0.1602 | 0.1094 | 3.43 | 2.92 | 2.69 | 3.67 |
Q&A based on RAG | 0.1196 | 0.0828 | 0.1414 | 0.0985 | 3.78 | 3.14 | 2.92 | 3.92 |
Q&A based on RAG2 | 0.1278 | 0.0961 | 0.1499 | 0.1109 | 3.55 | 2.94 | 2.73 | 3.84 |
Fleming-3 | 0.3064 | 0.2423 | 0.3178 | 0.2499 | 4.31 | 4.73 | 4.06 | 4.47 |
Test round 4
Documents
System | Mean precision | Recall | F-Measure | MAP | GMAP |
SCIRE4 Results | 0.0192 | 0.0077 | 0.0110 | 0.0064 | 0.0000 |
SCIRE4 Results (GPT) | 0.0632 | 0.0751 | 0.0554 | 0.0344 | 0.0000 |
dmiip2024 | 0.1115 | 0.4287 | 0.1549 | 0.3008 | 0.0043 |
dmiip2024_1 | 0.1192 | 0.4720 | 0.1662 | 0.3134 | 0.0067 |
dmiip2024_2 | 0.1462 | 0.7319 | 0.2184 | 0.4716 | 0.0855 |
dmiip2024_4 | 0.1462 | 0.7326 | 0.2187 | 0.4733 | 0.0855 |
dmiip2024_3 | 0.1346 | 0.7376 | 0.2066 | 0.4446 | 0.1145 |
Q&A based on RAG | 0.0353 | 0.0824 | 0.0423 | 0.0440 | 0.0000 |
sinai_uja_RAG | 0.0417 | 0.1209 | 0.0533 | 0.0824 | 0.0001 |
Q&A based on RAG2 | 0.0417 | 0.1209 | 0.0533 | 0.0824 | 0.0001 |
Fleming-4 | - | - | - | - | - |
Fleming-1 | - | - | - | - | - |
Fleming-2 | - | - | - | - | - |
Fleming-3 | - | - | - | - | - |
retrieval+reranking | 0.1160 | 0.1885 | 0.1383 | 0.1474 | 0.0002 |
Snippets
System | Mean precision | Recall | F-Measure | MAP | GMAP |
SCIRE4 Results | 0.1490 | 0.1758 | 0.1284 | 0.2819 | 0.0017 |
SCIRE4 Results (GPT) | 0.1623 | 0.1890 | 0.1467 | 0.2831 | 0.0024 |
dmiip2024 | 0.0658 | 0.0518 | 0.0489 | 0.0858 | 0.0001 |
dmiip2024_1 | 0.0623 | 0.0597 | 0.0506 | 0.1012 | 0.0002 |
dmiip2024_2 | 0.0749 | 0.1038 | 0.0682 | 0.1299 | 0.0004 |
dmiip2024_4 | 0.0756 | 0.1038 | 0.0684 | 0.1250 | 0.0004 |
dmiip2024_3 | 0.0751 | 0.0990 | 0.0672 | 0.1138 | 0.0004 |
Q&A based on RAG | 0.2458 | 0.3780 | 0.2518 | 0.3039 | 0.0143 |
sinai_uja_RAG | 0.2451 | 0.3737 | 0.2505 | 0.3016 | 0.0142 |
Q&A based on RAG2 | 0.2451 | 0.3737 | 0.2505 | 0.3016 | 0.0142 |
Fleming-4 | - | - | - | - | - |
Fleming-1 | - | - | - | - | - |
Fleming-2 | - | - | - | - | - |
Fleming-3 | - | - | - | - | - |
retrieval+reranking | 0.1135 | 0.0722 | 0.0829 | 0.0978 | 0.0003 |
Exact Answers
| Yes/No | | | | Factoid | | | List | | |
System | Accuracy | F1 Yes | F1 No | Macro F1 | Strict Acc. | Lenient Acc. | MRR | Mean Prec. | Recall | F-Measure |
SCIRE4 Results | 0.9000 | 0.9091 | 0.8889 | 0.8990 | 0.2000 | 0.3000 | 0.2333 | 0.1727 | 0.3807 | 0.1988 |
SCIRE4 Results (GPT) | 1.0000 | 1.0000 | 1.0000 | 1.0000 | 0.3000 | 0.3000 | 0.3000 | 0.1564 | 0.4119 | 0.1906 |
dmiip2024 | 0.9000 | 0.9091 | 0.8889 | 0.8990 | 0.2000 | 0.4000 | 0.3000 | 0.1995 | 0.4640 | 0.2421 |
dmiip2024_1 | 0.9000 | 0.9091 | 0.8889 | 0.8990 | 0.2000 | 0.3000 | 0.2500 | 0.1571 | 0.4375 | 0.1957 |
dmiip2024_2 | 0.9000 | 0.9091 | 0.8889 | 0.8990 | 0.4000 | 0.5000 | 0.4500 | 0.1942 | 0.4962 | 0.2369 |
dmiip2024_4 | 0.9000 | 0.9091 | 0.8889 | 0.8990 | 0.2000 | 0.2000 | 0.2000 | 0.3125 | 0.2448 | 0.2625 |
dmiip2024_3 | 0.9000 | 0.9091 | 0.8889 | 0.8990 | 0.2000 | 0.3000 | 0.2500 | 0.1512 | 0.4119 | 0.1855 |
Q&A based on RAG | 0.9000 | 0.9091 | 0.8889 | 0.8990 | 0.3000 | 0.3000 | 0.3000 | 0.2156 | 0.5104 | 0.2624 |
sinai_uja_RAG | 0.9000 | 0.9091 | 0.8889 | 0.8990 | 0.4000 | 0.4000 | 0.4000 | 0.2331 | 0.4688 | 0.2667 |
Q&A based on RAG2 | 0.8000 | 0.8333 | 0.7500 | 0.7917 | 0.3000 | 0.3000 | 0.3000 | 0.2200 | 0.4479 | 0.2472 |
Fleming-4 | 0.8000 | 0.8333 | 0.7500 | 0.7917 | 0.3000 | 0.5000 | 0.4000 | 0.3250 | 0.4536 | 0.3536 |
Fleming-1 | 0.8000 | 0.8333 | 0.7500 | 0.7917 | 0.3000 | 0.5000 | 0.4000 | 0.3250 | 0.4536 | 0.3536 |
Fleming-2 | 0.8000 | 0.8333 | 0.7500 | 0.7917 | 0.3000 | 0.5000 | 0.4000 | 0.3250 | 0.4536 | 0.3536 |
Fleming-3 | 0.8000 | 0.8333 | 0.7500 | 0.7917 | 0.3000 | 0.5000 | 0.4000 | 0.3250 | 0.4536 | 0.3536 |
retrieval+reranking | 0.9000 | 0.9091 | 0.8889 | 0.8990 | 0.2000 | 0.2000 | 0.2000 | 0.0859 | 0.3125 | 0.1053 |
Ideal Answers
| Automatic scores (Rouge - R) | | | | Manual scores | | | |
System | R-2 (Rec) | R-2 (F1) | R-SU4 (Rec) | R-SU4 (F1) | Readability | Recall | Precision | Repetition |
SCIRE4 Results | 0.2032 | 0.1558 | 0.2160 | 0.1649 | 3.22 | 3.56 | 2.93 | 3.33 |
SCIRE4 Results (GPT) | 0.2089 | 0.1494 | 0.2237 | 0.1594 | 3.93 | 4.38 | 3.42 | 3.98 |
dmiip2024 | 0.2206 | 0.2406 | 0.2228 | 0.2437 | 4.25 | 4.36 | 4.02 | 4.27 |
dmiip2024_1 | 0.2485 | 0.2709 | 0.2555 | 0.2792 | 4.47 | 4.53 | 4.24 | 4.49 |
dmiip2024_2 | 0.2572 | 0.2811 | 0.2583 | 0.2828 | 4.40 | 4.44 | 4.09 | 4.38 |
dmiip2024_4 | 0.1943 | 0.2128 | 0.2010 | 0.2198 | 4.25 | 4.24 | 3.91 | 4.25 |
dmiip2024_3 | 0.1807 | 0.1971 | 0.1833 | 0.2002 | 4.25 | 4.31 | 3.91 | 4.25 |
Q&A based on RAG | 0.2093 | 0.2047 | 0.2169 | 0.2126 | 4.15 | 4.22 | 3.91 | 4.16 |
sinai_uja_RAG | 0.2005 | 0.2005 | 0.2008 | 0.2023 | 4.24 | 4.27 | 3.96 | 4.27 |
Q&A based on RAG2 | 0.2008 | 0.1954 | 0.2035 | 0.1988 | 4.22 | 4.22 | 3.85 | 4.16 |
Fleming-4 | 0.2654 | 0.2091 | 0.2780 | 0.2174 | 3.98 | 4.36 | 3.56 | 4.09 |
Fleming-1 | 0.2638 | 0.2067 | 0.2760 | 0.2143 | 4.05 | 4.51 | 3.69 | 4.09 |
Fleming-2 | 0.3038 | 0.1984 | 0.3090 | 0.2005 | 3.91 | 4.51 | 3.47 | 3.96 |
Fleming-3 | 0.2638 | 0.2067 | 0.2760 | 0.2143 | 4.05 | 4.51 | 3.69 | 4.09 |
retrieval+reranking | 0.1902 | 0.1367 | 0.2058 | 0.1464 | 3.71 | 4.07 | 3.29 | 3.73 |