BioASQ Participants Area
Task 12b: Test Results of Phase B
The test results are presented in separate tables for each type of annotation. Each system is listed under its "System Description". The evaluation measures used in Task B are presented here.
Warning: For ideal answers, good ROUGE results do not always imply good manual scores.
Test batch 1
Exact Answers
| | Yes/No | | | | Factoid | | | List | | |
|---|---|---|---|---|---|---|---|---|---|---|
| System | Accuracy | F1 Yes | F1 No | Macro F1 | Strict Acc. | Lenient Acc. | MRR | Mean Prec. | Recall | F-Measure |
| mibi_rag_abstract | 0.8400 | 0.8667 | 0.8000 | 0.8333 | 0.0476 | 0.0476 | 0.0476 | 0.5048 | 0.3804 | 0.4147 |
| mibi_rag_snippet | 0.9200 | 0.9333 | 0.9000 | 0.9167 | - | - | - | 0.5286 | 0.4107 | 0.4441 |
| UR-IW-1 | 1.0000 | 1.0000 | 1.0000 | 1.0000 | 0.2857 | 0.3810 | 0.3254 | 0.4840 | 0.4173 | 0.4266 |
| UR-IW-4 | 0.9600 | 0.9655 | 0.9524 | 0.9589 | 0.2381 | 0.2381 | 0.2381 | 0.4563 | 0.3478 | 0.3778 |
| UR-IW-2 | 0.8800 | 0.8889 | 0.8696 | 0.8792 | 0.2381 | 0.2857 | 0.2619 | 0.5255 | 0.4510 | 0.4764 |
| UR-IW-5 | 1.0000 | 1.0000 | 1.0000 | 1.0000 | 0.2381 | 0.2857 | 0.2540 | 0.6054 | 0.5158 | 0.5404 |
| UR-IW-3 | 0.9600 | 0.9655 | 0.9524 | 0.9589 | 0.2381 | 0.2381 | 0.2381 | 0.6010 | 0.5158 | 0.5337 |
| Gatech competition | 0.9200 | 0.9286 | 0.9091 | 0.9188 | 0.2381 | 0.2381 | 0.2381 | 0.4939 | 0.3433 | 0.3587 |
| Mistral-7B finetune | 0.9600 | 0.9655 | 0.9524 | 0.9589 | 0.2381 | 0.2381 | 0.2381 | 0.5786 | 0.5006 | 0.5265 |
| Synthia with first | 0.9200 | 0.9286 | 0.9091 | 0.9188 | 0.2857 | 0.2857 | 0.2857 | 0.4627 | 0.4107 | 0.4020 |
| LLM4SciLit | 0.4000 | - | 0.5714 | 0.2857 | - | - | - | - | - | - |
| RMC_append_snippets | 0.9600 | 0.9655 | 0.9524 | 0.9589 | 0.1905 | 0.1905 | 0.1905 | 0.4770 | 0.4400 | 0.4365 |
| bioinfo-0 | 0.6000 | 0.7500 | - | 0.3750 | - | - | - | - | - | - |
| bioinfo-1 | 0.6000 | 0.7500 | - | 0.3750 | - | - | - | - | - | - |
| bioinfo-2 | 0.6000 | 0.7500 | - | 0.3750 | - | - | - | - | - | - |
| bioinfo-3 | 0.0400 | 0.0769 | - | 0.0385 | 0.1905 | 0.1905 | 0.1905 | 0.3214 | 0.2797 | 0.2950 |
| bioinfo-4 | - | - | - | - | 0.2857 | 0.2857 | 0.2857 | 0.4169 | 0.3464 | 0.3698 |
| Fleming-1 | 0.8400 | 0.8750 | 0.7778 | 0.8264 | 0.0476 | 0.0952 | 0.0714 | 0.5196 | 0.4190 | 0.4420 |
| dmiip2024_2 | 1.0000 | 1.0000 | 1.0000 | 1.0000 | 0.2381 | 0.2381 | 0.2381 | 0.3943 | 0.5380 | 0.4249 |
| dmiip2024_3 | 0.9600 | 0.9677 | 0.9474 | 0.9576 | 0.3333 | 0.3810 | 0.3571 | 0.5942 | 0.4725 | 0.5068 |
| dmiip2024_4 | 0.4000 | - | 0.5714 | 0.2857 | 0.3333 | 0.5238 | 0.4206 | 0.4481 | 0.4682 | 0.4386 |
| dmiip2024_1 | 0.9200 | 0.9286 | 0.9091 | 0.9188 | 0.3333 | 0.4286 | 0.3810 | 0.6647 | 0.5011 | 0.5453 |
| dmiip2024 | 0.9200 | 0.9286 | 0.9091 | 0.9188 | 0.3333 | 0.4286 | 0.3810 | 0.6603 | 0.4967 | 0.5407 |
| IISR 5th submit | 0.9600 | 0.9655 | 0.9524 | 0.9589 | 0.2857 | 0.2857 | 0.2857 | 0.5466 | 0.4701 | 0.4915 |
| RAG for medicine | 0.7200 | 0.7200 | 0.7200 | 0.7200 | 0.1905 | 0.3333 | 0.2397 | 0.5524 | 0.5209 | 0.5054 |
| IBE-LM ver1 | 0.6000 | 0.7500 | - | 0.3750 | 0.3333 | 0.4762 | 0.3825 | - | - | - |
| IBE-LM ver3 | 0.6000 | 0.7500 | - | 0.3750 | 0.4286 | 0.4762 | 0.4444 | - | - | - |
| IBE-LM ver 5 | 0.6000 | 0.7500 | - | 0.3750 | 0.4286 | 0.4762 | 0.4444 | - | - | - |
| IBE-LM ver2 | 0.6000 | 0.7500 | - | 0.3750 | 0.3333 | 0.4762 | 0.3849 | 0.1143 | 0.1706 | 0.1280 |
| IBE-LM ver4 | 0.6000 | 0.7500 | - | 0.3750 | 0.3333 | 0.4762 | 0.3849 | 0.1143 | 0.1706 | 0.1280 |
| IISR 2nd submit | 0.9600 | 0.9655 | 0.9524 | 0.9589 | 0.2857 | 0.2857 | 0.2857 | 0.5813 | 0.4591 | 0.4931 |
| IISR 3rd submit | 0.9200 | 0.9333 | 0.9000 | 0.9167 | 0.2381 | 0.2381 | 0.2381 | 0.5461 | 0.4721 | 0.4960 |
| IISR 4th submit | 1.0000 | 1.0000 | 1.0000 | 1.0000 | 0.1905 | 0.1905 | 0.1905 | 0.5449 | 0.4682 | 0.4934 |
| IISR first submit | 0.9200 | 0.9286 | 0.9091 | 0.9188 | 0.2381 | 0.2857 | 0.2540 | 0.6317 | 0.4685 | 0.5161 |
| CPS | 0.7200 | 0.7742 | 0.6316 | 0.7029 | 0.2857 | 0.2857 | 0.2857 | 0.3532 | 0.2286 | 0.2579 |
| lasige-ku | 0.6800 | 0.7895 | 0.3333 | 0.5614 | - | - | - | - | - | - |
| extractive | 0.8400 | 0.8824 | 0.7500 | 0.8162 | 0.1429 | 0.1905 | 0.1667 | 0.1996 | 0.2011 | 0.1908 |
| AUEB-System1 | 0.8000 | 0.8571 | 0.6667 | 0.7619 | 0.3333 | 0.4286 | 0.3690 | 0.4286 | 0.2988 | 0.3211 |
| BioASQ_Baseline | 0.4400 | 0.3000 | 0.5333 | 0.4167 | 0.0476 | 0.1905 | 0.0968 | 0.2366 | 0.2599 | 0.2100 |
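In the Yes/No columns above, the Macro F1 values are consistent with the plain average of F1 Yes and F1 No, with a missing score ("-") counted as 0. This is an observation from the reported numbers, not a statement of the official evaluation code. A minimal sketch under that assumption (the function name `macro_f1` is hypothetical):

```python
def macro_f1(f1_yes, f1_no):
    """Average the two per-class F1 scores.

    Assumption: a missing score, shown as "-" in the tables, counts as 0.
    """
    scores = [0.0 if s in (None, "-") else float(s) for s in (f1_yes, f1_no)]
    return sum(scores) / len(scores)


# Values taken from the batch 1 table (inputs are already rounded to 4 digits,
# so the result can differ from the reported value in the last digit).
print(macro_f1(0.8667, 0.8000))  # compare with 0.8333 reported for mibi_rag_abstract
print(macro_f1("-", 0.5714))     # compare with 0.2857 reported for LLM4SciLit
```

Under this reading, a system that answers only "yes" (or only "no") cannot reach a Macro F1 above 0.5, which would explain why several rows with an accuracy of 0.6000 show a Macro F1 of 0.3750.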
Ideal Answers
| | Automatic scores (Rouge - R) | | | | Manual scores | | | |
|---|---|---|---|---|---|---|---|---|
| System | R-2 (Rec) | R-2 (F1) | R-SU4 (Rec) | R-SU4 (F1) | Readability | Recall | Precision | Repetition |
| mibi_rag_abstract | 0.3732 | 0.2670 | 0.3756 | 0.2594 | 4.44 | 4.79 | 4.27 | 4.42 |
| mibi_rag_snippet | 0.4135 | 0.3016 | 0.4068 | 0.2882 | 4.54 | 4.75 | 4.41 | 4.49 |
| UR-IW-1 | 0.3564 | 0.2745 | 0.3508 | 0.2623 | 4.41 | 4.60 | 4.26 | 4.36 |
| UR-IW-4 | 0.4141 | 0.3122 | 0.4068 | 0.2960 | - | - | - | - |
| UR-IW-2 | 0.3327 | 0.3651 | 0.3220 | 0.3535 | - | - | - | - |
| UR-IW-5 | 0.3998 | 0.2539 | 0.4022 | 0.2458 | - | - | - | - |
| UR-IW-3 | 0.3085 | 0.3244 | 0.3029 | 0.3167 | - | - | - | - |
| Gatech competition | 0.2624 | 0.2479 | 0.2580 | 0.2399 | 4.35 | 4.44 | 4.20 | 4.62 |
| Mistral-7B finetune | 0.3102 | 0.3236 | 0.3052 | 0.3149 | 4.35 | 4.35 | 4.22 | 4.47 |
| Synthia with first | 0.3065 | 0.2778 | 0.3022 | 0.2683 | 4.49 | 4.48 | 4.33 | 4.66 |
| LLM4SciLit | 0.0571 | 0.0776 | 0.0502 | 0.0695 | 3.07 | 3.08 | 3.41 | 3.91 |
| RMC_append_snippets | 0.3978 | 0.3360 | 0.3901 | 0.3215 | 4.51 | 4.66 | 4.42 | 4.59 |
| bioinfo-0 | 0.4060 | 0.1418 | 0.4170 | 0.1384 | 4.14 | 4.69 | 3.75 | 4.14 |
| bioinfo-1 | 0.4038 | 0.1309 | 0.4188 | 0.1292 | 3.99 | 4.65 | 3.71 | 3.94 |
| bioinfo-2 | 0.4636 | 0.1614 | 0.4761 | 0.1566 | 4.00 | 4.76 | 3.72 | 4.01 |
| bioinfo-3 | 0.2991 | 0.1401 | 0.3104 | 0.1400 | 4.19 | 4.42 | 3.78 | 4.28 |
| bioinfo-4 | 0.4208 | 0.1507 | 0.4336 | 0.1480 | 4.08 | 4.64 | 3.74 | 3.99 |
| Fleming-1 | 0.3241 | 0.1686 | 0.3458 | 0.1681 | 4.28 | 4.68 | 3.98 | 4.34 |
| dmiip2024_2 | 0.3139 | 0.3123 | 0.3134 | 0.3004 | 4.38 | 4.60 | 4.38 | 4.53 |
| dmiip2024_3 | 0.2426 | 0.2628 | 0.2393 | 0.2531 | 4.48 | 4.55 | 4.40 | 4.65 |
| dmiip2024_4 | 0.2834 | 0.3004 | 0.2700 | 0.2849 | 4.18 | 4.21 | 4.18 | 4.38 |
| dmiip2024_1 | 0.2909 | 0.3237 | 0.2854 | 0.3134 | 4.39 | 4.41 | 4.38 | 4.46 |
| dmiip2024 | 0.2879 | 0.3150 | 0.2787 | 0.3000 | 4.36 | 4.35 | 4.25 | 4.36 |
| IISR 5th submit | 0.4235 | 0.2063 | 0.4238 | 0.1990 | 4.26 | 4.71 | 3.92 | 4.34 |
| RAG for medicine | 0.3676 | 0.2008 | 0.3722 | 0.1955 | 4.27 | 4.75 | 4.06 | 4.29 |
| IBE-LM ver1 | - | - | - | - | 0.89 | 0.75 | 0.98 | 1.16 |
| IBE-LM ver3 | - | - | - | - | 0.89 | 0.75 | 0.98 | 1.16 |
| IBE-LM ver 5 | - | - | - | - | 0.89 | 0.75 | 0.98 | 1.16 |
| IBE-LM ver2 | - | - | - | - | 0.89 | 0.75 | 0.98 | 1.16 |
| IBE-LM ver4 | - | - | - | - | 0.89 | 0.75 | 0.98 | 1.16 |
| IISR 2nd submit | 0.3274 | 0.2889 | 0.3238 | 0.2781 | 4.56 | 4.68 | 4.56 | 4.59 |
| IISR 3rd submit | 0.4361 | 0.2311 | 0.4350 | 0.2207 | 4.34 | 4.71 | 4.00 | 4.36 |
| IISR 4th submit | 0.4259 | 0.1991 | 0.4210 | 0.1898 | 4.27 | 4.75 | 3.93 | 4.25 |
| IISR first submit | 0.3315 | 0.3111 | 0.3256 | 0.2997 | 4.48 | 4.56 | 4.42 | 4.54 |
| CPS | 0.3234 | 0.2860 | 0.3173 | 0.2773 | 4.38 | 4.08 | 3.98 | 4.53 |
| lasige-ku | 0.0532 | 0.0469 | 0.0788 | 0.0640 | 3.00 | 2.58 | 2.51 | 3.98 |
| extractive | 0.2447 | 0.2549 | 0.2447 | 0.2548 | 4.13 | 3.87 | 3.91 | 4.31 |
| AUEB-System1 | - | - | - | - | - | - | - | - |
| BioASQ_Baseline | - | - | - | - | - | - | - | - |
Test batch 2
Exact Answers
| | Yes/No | | | | Factoid | | | List | | |
|---|---|---|---|---|---|---|---|---|---|---|
| System | Accuracy | F1 Yes | F1 No | Macro F1 | Strict Acc. | Lenient Acc. | MRR | Mean Prec. | Recall | F-Measure |
| dmiip2024 | 0.8077 | 0.8485 | 0.7368 | 0.7927 | 0.5263 | 0.7368 | 0.6140 | 0.4648 | 0.3615 | 0.3748 |
| dmiip2024_1 | 0.8077 | 0.8485 | 0.7368 | 0.7927 | 0.5789 | 0.7368 | 0.6491 | 0.4787 | 0.3655 | 0.3821 |
| dmiip2024_2 | 0.9231 | 0.9444 | 0.8750 | 0.9097 | 0.3684 | 0.5263 | 0.4342 | 0.5141 | 0.3979 | 0.4153 |
| dmiip2024_4 | 0.3077 | - | 0.4706 | 0.2353 | 0.3158 | 0.6316 | 0.4474 | 0.5000 | 0.4072 | 0.4097 |
| dmiip2024_3 | 0.9231 | 0.9444 | 0.8750 | 0.9097 | 0.5789 | 0.5789 | 0.5789 | 0.5784 | 0.4115 | 0.4533 |
| mibi_rag_snippet | 0.8462 | 0.8889 | 0.7500 | 0.8194 | 0.1579 | 0.1579 | 0.1579 | 0.5444 | 0.3629 | 0.4078 |
| mibi_rag_abstract | 0.8846 | 0.9143 | 0.8235 | 0.8689 | 0.1053 | 0.1053 | 0.1053 | 0.3685 | 0.2630 | 0.2682 |
| Synthia with first | 0.8462 | 0.8824 | 0.7778 | 0.8301 | 0.1053 | 0.1053 | 0.1053 | 0.4463 | 0.3008 | 0.3260 |
| RMC_append_snippets | 0.8846 | 0.9143 | 0.8235 | 0.8689 | 0.5263 | 0.5263 | 0.5263 | 0.4222 | 0.2923 | 0.3169 |
| bioinfo-0 | 0.6923 | 0.8182 | - | 0.4091 | - | - | - | - | - | - |
| bioinfo-1 | 0.6923 | 0.8182 | - | 0.4091 | - | - | - | - | - | - |
| bioinfo-2 | 0.6923 | 0.8182 | - | 0.4091 | - | - | - | - | - | - |
| bioinfo-3 | 0.6923 | 0.8182 | - | 0.4091 | - | - | - | - | - | - |
| bioinfo-4 | 0.6923 | 0.8182 | - | 0.4091 | - | - | - | - | - | - |
| UR-IW-1 | 0.8846 | 0.9143 | 0.8235 | 0.8689 | 0.6842 | 0.7895 | 0.7368 | 0.5061 | 0.4209 | 0.4235 |
| UR-IW-2 | 0.9231 | 0.9412 | 0.8889 | 0.9150 | 0.7368 | 0.7368 | 0.7368 | 0.5835 | 0.4585 | 0.4868 |
| UR-IW-3 | 0.8846 | 0.9091 | 0.8421 | 0.8756 | 0.5789 | 0.5789 | 0.5789 | 0.5650 | 0.4278 | 0.4604 |
| UR-IW-4 | 0.7692 | 0.8125 | 0.7000 | 0.7563 | 0.6316 | 0.6316 | 0.6316 | 0.5863 | 0.4585 | 0.4878 |
| UR-IW-5 | 0.8462 | 0.8824 | 0.7778 | 0.8301 | 0.4737 | 0.4737 | 0.4737 | 0.5009 | 0.4278 | 0.4199 |
| Gatech competition | 0.8846 | 0.9091 | 0.8421 | 0.8756 | 0.4211 | 0.4211 | 0.4211 | 0.4263 | 0.2515 | 0.2855 |
| GTBioASQsys2 | 0.8077 | 0.8485 | 0.7368 | 0.7927 | 0.3158 | 0.3158 | 0.3158 | 0.5247 | 0.4041 | 0.4293 |
| IBE-LM ver1 | 0.6923 | 0.8182 | - | 0.4091 | 0.2105 | 0.5263 | 0.3333 | 0.1889 | 0.2508 | 0.1879 |
| IBE-LM ver2 | 0.6923 | 0.8182 | - | 0.4091 | 0.3684 | 0.5789 | 0.4491 | 0.2111 | 0.2356 | 0.1851 |
| IBE-LM ver3 | 0.6923 | 0.8182 | - | 0.4091 | 0.3158 | 0.5263 | 0.3895 | 0.1556 | 0.1669 | 0.1293 |
| IBE-LM ver4 | 0.6923 | 0.8182 | - | 0.4091 | 0.3684 | 0.5789 | 0.4342 | 0.2222 | 0.2766 | 0.2105 |
| IBE-LM ver 5 | 0.6923 | 0.8182 | - | 0.4091 | 0.3158 | 0.5789 | 0.4254 | 0.2222 | 0.2420 | 0.1934 |
| LLM4SciLit | 0.3077 | - | 0.4706 | 0.2353 | - | - | - | - | - | - |
| Fleming-3 | 0.8846 | 0.9143 | 0.8235 | 0.8689 | 0.3684 | 0.5263 | 0.4342 | 0.5475 | 0.4057 | 0.4314 |
| IISR first submit | 0.9231 | 0.9444 | 0.8750 | 0.9097 | 0.3158 | 0.3158 | 0.3158 | 0.5166 | 0.3932 | 0.4181 |
| IISR 2nd submit | 0.9231 | 0.9444 | 0.8750 | 0.9097 | 0.4211 | 0.4211 | 0.4211 | 0.6166 | 0.3855 | 0.4308 |
| IISR 3rd submit | 0.8462 | 0.8889 | 0.7500 | 0.8194 | 0.3684 | 0.3684 | 0.3684 | 0.4981 | 0.3447 | 0.3759 |
| IISR 4th submit | 0.8846 | 0.9143 | 0.8235 | 0.8689 | 0.5263 | 0.5263 | 0.5263 | 0.5436 | 0.3621 | 0.4016 |
| IISR 5th submit | 0.8846 | 0.9189 | 0.8000 | 0.8595 | 0.4737 | 0.4737 | 0.4737 | 0.5595 | 0.3886 | 0.4170 |
| CPS | 0.7308 | 0.8108 | 0.5333 | 0.6721 | 0.2105 | 0.2105 | 0.2105 | 0.4225 | 0.2948 | 0.3151 |
| CPS2 | 0.6923 | 0.8000 | 0.3333 | 0.5667 | - | - | - | - | - | - |
| Mistral-7B finetune | 0.8462 | 0.8824 | 0.7778 | 0.8301 | 0.5263 | 0.5263 | 0.5263 | 0.5777 | 0.3754 | 0.4197 |
| simple truncation | 0.8462 | 0.8824 | 0.7778 | 0.8301 | 0.1579 | 0.2632 | 0.2105 | - | - | - |
| kmeans | 0.8462 | 0.8824 | 0.7778 | 0.8301 | 0.1579 | 0.2632 | 0.2105 | - | - | - |
| similarity measures | 0.8462 | 0.8824 | 0.7778 | 0.8301 | 0.1579 | 0.2632 | 0.2105 | - | - | - |
| extractive | 0.8462 | 0.8824 | 0.7778 | 0.8301 | 0.1579 | 0.2632 | 0.2105 | - | - | - |
| abstractive | 0.8462 | 0.8824 | 0.7778 | 0.8301 | 0.1579 | 0.4211 | 0.2649 | 0.3492 | 0.2367 | 0.2594 |
| lasige-ku | 0.6923 | 0.7895 | 0.4286 | 0.6090 | - | - | - | - | - | - |
| BioASQ_Baseline | 0.3846 | 0.2000 | 0.5000 | 0.3500 | 0.2105 | 0.3684 | 0.2737 | 0.2364 | 0.2586 | 0.1983 |
Ideal Answers
| | Automatic scores (Rouge - R) | | | | Manual scores | | | |
|---|---|---|---|---|---|---|---|---|
| System | R-2 (Rec) | R-2 (F1) | R-SU4 (Rec) | R-SU4 (F1) | Readability | Recall | Precision | Repetition |
| dmiip2024 | 0.2273 | 0.2608 | 0.2183 | 0.2458 | 4.47 | 4.53 | 4.44 | 4.67 |
| dmiip2024_1 | 0.2521 | 0.2831 | 0.2450 | 0.2740 | 4.58 | 4.47 | 4.41 | 4.61 |
| dmiip2024_2 | 0.2256 | 0.2591 | 0.2153 | 0.2430 | 4.56 | 4.60 | 4.48 | 4.67 |
| dmiip2024_4 | 0.2069 | 0.2402 | 0.1944 | 0.2238 | 4.42 | 4.25 | 4.14 | 4.59 |
| dmiip2024_3 | 0.1688 | 0.1908 | 0.1677 | 0.1897 | 4.61 | 4.62 | 4.47 | 4.62 |
| mibi_rag_snippet | 0.3344 | 0.2618 | 0.3378 | 0.2567 | 4.56 | 4.74 | 4.41 | 4.55 |
| mibi_rag_abstract | 0.3231 | 0.2372 | 0.3217 | 0.2292 | 4.46 | 4.61 | 4.33 | 4.53 |
| Synthia with first | 0.2678 | 0.2639 | 0.2638 | 0.2535 | 4.54 | 4.55 | 4.35 | 4.64 |
| RMC_append_snippets | 0.3029 | 0.2824 | 0.2988 | 0.2713 | 4.71 | 4.61 | 4.47 | 4.71 |
| bioinfo-0 | 0.3417 | 0.1555 | 0.3509 | 0.1544 | 4.24 | 4.68 | 4.01 | 4.28 |
| bioinfo-1 | 0.3863 | 0.1854 | 0.3862 | 0.1791 | 4.35 | 4.71 | 4.12 | 4.31 |
| bioinfo-2 | 0.1286 | 0.1508 | 0.1326 | 0.1522 | 3.85 | 3.67 | 3.67 | 4.11 |
| bioinfo-3 | 0.1880 | 0.2069 | 0.1920 | 0.2072 | 4.11 | 3.91 | 3.76 | 4.29 |
| bioinfo-4 | 0.3610 | 0.1667 | 0.3599 | 0.1617 | 4.26 | 4.68 | 4.00 | 4.26 |
| UR-IW-1 | 0.3005 | 0.2527 | 0.3043 | 0.2440 | 4.59 | 4.78 | 4.54 | 4.65 |
| UR-IW-2 | 0.2677 | 0.3049 | 0.2598 | 0.2944 | 4.54 | 4.48 | 4.39 | 4.59 |
| UR-IW-3 | 0.3083 | 0.3255 | 0.3076 | 0.3207 | 4.61 | 4.51 | 4.49 | 4.71 |
| UR-IW-4 | 0.3393 | 0.2693 | 0.3411 | 0.2651 | 4.39 | 4.59 | 4.24 | 4.45 |
| UR-IW-5 | 0.3278 | 0.2312 | 0.3316 | 0.2235 | 4.36 | 4.68 | 4.27 | 4.48 |
| Gatech competition | 0.1989 | 0.1925 | 0.1939 | 0.1816 | 4.20 | 4.31 | 4.21 | 4.36 |
| GTBioASQsys2 | 0.1655 | 0.1698 | 0.1680 | 0.1634 | 4.29 | 4.45 | 4.18 | 4.54 |
| IBE-LM ver1 | - | - | - | - | 1.06 | 0.93 | 1.01 | 1.28 |
| IBE-LM ver2 | - | - | - | - | 1.06 | 0.93 | 1.01 | 1.28 |
| IBE-LM ver3 | - | - | - | - | 1.06 | 0.93 | 1.01 | 1.28 |
| IBE-LM ver4 | - | - | - | - | 1.06 | 0.93 | 1.01 | 1.28 |
| IBE-LM ver 5 | - | - | - | - | 1.06 | 0.93 | 1.01 | 1.28 |
| LLM4SciLit | 0.0430 | 0.0634 | 0.0383 | 0.0562 | 3.06 | 3.25 | 3.53 | 3.62 |
| Fleming-3 | 0.3138 | 0.1659 | 0.3257 | 0.1640 | 4.25 | 4.73 | 4.09 | 4.27 |
| IISR first submit | 0.3165 | 0.2183 | 0.3213 | 0.2119 | 4.47 | 4.75 | 4.28 | 4.51 |
| IISR 2nd submit | 0.2424 | 0.2278 | 0.2367 | 0.2181 | 4.65 | 4.68 | 4.51 | 4.73 |
| IISR 3rd submit | 0.2617 | 0.2110 | 0.2581 | 0.2022 | 4.59 | 4.67 | 4.47 | 4.65 |
| IISR 4th submit | 0.2637 | 0.2442 | 0.2578 | 0.2343 | 4.47 | 4.60 | 4.41 | 4.66 |
| IISR 5th submit | 0.3182 | 0.2216 | 0.3183 | 0.2139 | 4.53 | 4.73 | 4.35 | 4.56 |
| CPS | 0.2521 | 0.2219 | 0.2459 | 0.2106 | 4.25 | 4.15 | 3.99 | 4.39 |
| CPS2 | 0.2574 | 0.2317 | 0.2509 | 0.2196 | 4.20 | 4.12 | 4.01 | 4.35 |
| Mistral-7B finetune | 0.2338 | 0.2500 | 0.2351 | 0.2480 | 4.27 | 4.38 | 4.20 | 4.54 |
| simple truncation | 0.1099 | 0.0896 | 0.1259 | 0.0969 | 3.05 | 3.01 | 2.72 | 3.32 |
| kmeans | 0.0759 | 0.0670 | 0.0761 | 0.0665 | 1.01 | 1.11 | 1.04 | 1.05 |
| similarity measures | 0.3238 | 0.1959 | 0.3234 | 0.1887 | 4.39 | 4.74 | 4.19 | 4.38 |
| extractive | 0.3143 | 0.1845 | 0.3178 | 0.1802 | 4.28 | 4.64 | 4.12 | 4.40 |
| abstractive | 0.1099 | 0.0896 | 0.1259 | 0.0969 | 3.05 | 3.01 | 2.72 | 3.32 |
| lasige-ku | 0.0659 | 0.0547 | 0.0919 | 0.0682 | 3.44 | 2.96 | 2.76 | 3.81 |
| BioASQ_Baseline | - | - | - | - | - | - | - | - |
Test batch 3
Exact Answers
| | Yes/No | | | | Factoid | | | List | | |
|---|---|---|---|---|---|---|---|---|---|---|
| System | Accuracy | F1 Yes | F1 No | Macro F1 | Strict Acc. | Lenient Acc. | MRR | Mean Prec. | Recall | F-Measure |
| mibi_rag_snippet | 0.9167 | 0.9286 | 0.9000 | 0.9143 | 0.2308 | 0.2308 | 0.2308 | 0.5629 | 0.4987 | 0.5140 |
| mibi_rag_abstract | 0.9583 | 0.9655 | 0.9474 | 0.9564 | 0.2308 | 0.2308 | 0.2308 | 0.5927 | 0.4674 | 0.5020 |
| AUEB-System1 | 0.6250 | 0.7429 | 0.3077 | 0.5253 | 0.2308 | 0.4615 | 0.3397 | 0.4336 | 0.2708 | 0.3007 |
| CPS | 0.7500 | 0.8000 | 0.6667 | 0.7333 | 0.3462 | 0.3462 | 0.3462 | 0.3399 | 0.2694 | 0.2924 |
| CPS2 | 0.6250 | 0.7273 | 0.4000 | 0.5636 | - | - | - | - | - | - |
| bioinfo-0 | 0.5833 | 0.7368 | - | 0.3684 | - | - | - | - | - | - |
| bioinfo-1 | 0.5833 | 0.7368 | - | 0.3684 | - | - | - | - | - | - |
| bioinfo-2 | 0.5833 | 0.7368 | - | 0.3684 | - | - | - | - | - | - |
| bioinfo-3 | 0.5833 | 0.7368 | - | 0.3684 | - | - | - | - | - | - |
| bioinfo-4 | 0.5833 | 0.7368 | - | 0.3684 | - | - | - | - | - | - |
| Synthia with first | 0.9167 | 0.9333 | 0.8889 | 0.9111 | 0.1538 | 0.1538 | 0.1538 | 0.3807 | 0.2780 | 0.2990 |
| RMC_append_snippets | 0.9167 | 0.9286 | 0.9000 | 0.9143 | 0.3462 | 0.3462 | 0.3462 | 0.4263 | 0.3626 | 0.3544 |
| Fleming-3 | 0.9167 | 0.9286 | 0.9000 | 0.9143 | 0.2308 | 0.2692 | 0.2404 | 0.5938 | 0.4836 | 0.5176 |
| UR-IW-1 | 0.9167 | 0.9286 | 0.9000 | 0.9143 | 0.3462 | 0.4231 | 0.3846 | 0.3656 | 0.5109 | 0.3804 |
| UR-IW-2 | 0.8750 | 0.8800 | 0.8696 | 0.8748 | 0.4231 | 0.4231 | 0.4231 | 0.5764 | 0.4978 | 0.5128 |
| UR-IW-3 | 0.9583 | 0.9630 | 0.9524 | 0.9577 | 0.3846 | 0.4615 | 0.4231 | 0.4947 | 0.4246 | 0.4439 |
| UR-IW-4 | 0.9583 | 0.9630 | 0.9524 | 0.9577 | 0.4231 | 0.4615 | 0.4423 | 0.5784 | 0.4868 | 0.5100 |
| Gatech competition | 0.8750 | 0.8800 | 0.8696 | 0.8748 | 0.3077 | 0.3077 | 0.3077 | 0.4069 | 0.2970 | 0.3209 |
| GTBioASQsys2 | 0.8750 | 0.8889 | 0.8571 | 0.8730 | 0.3462 | 0.3462 | 0.3462 | 0.5182 | 0.4165 | 0.4399 |
| RMC_llama3_IA | 0.9167 | 0.9231 | 0.9091 | 0.9161 | 0.2308 | 0.2308 | 0.2308 | 0.2995 | 0.4270 | 0.2913 |
| Fleming-1 | 0.9583 | 0.9630 | 0.9524 | 0.9577 | 0.2308 | 0.2692 | 0.2404 | 0.5412 | 0.4309 | 0.4649 |
| Fleming-2 | 0.9583 | 0.9655 | 0.9474 | 0.9564 | 0.2308 | 0.2692 | 0.2404 | 0.5412 | 0.4309 | 0.4649 |
| lasige-ku | 0.4167 | 0.1250 | 0.5625 | 0.3438 | 0.0385 | 0.0385 | 0.0385 | 0.0263 | 0.0105 | 0.0150 |
| IISR first submit | 0.9167 | 0.9333 | 0.8889 | 0.9111 | 0.3077 | 0.3077 | 0.3077 | 0.4316 | 0.3682 | 0.3779 |
| IISR 2nd submit | 0.9583 | 0.9655 | 0.9474 | 0.9564 | 0.4615 | 0.4615 | 0.4615 | 0.6313 | 0.5367 | 0.5611 |
| IISR 3rd submit | 0.9583 | 0.9655 | 0.9474 | 0.9564 | 0.3462 | 0.3462 | 0.3462 | 0.3492 | 0.2343 | 0.2655 |
| IISR 4th submit | 0.9167 | 0.9286 | 0.9000 | 0.9143 | 0.4615 | 0.4615 | 0.4615 | 0.6228 | 0.5284 | 0.5460 |
| IISR 5th submit | 0.8750 | 0.8966 | 0.8421 | 0.8693 | 0.3462 | 0.3462 | 0.3462 | 0.5019 | 0.4046 | 0.4265 |
| dmiip2024 | 0.9167 | 0.9231 | 0.9091 | 0.9161 | 0.4615 | 0.5769 | 0.5128 | 0.4430 | 0.3768 | 0.3781 |
| dmiip2024_1 | 0.8750 | 0.8800 | 0.8696 | 0.8748 | 0.4231 | 0.6154 | 0.4949 | 0.4570 | 0.3900 | 0.3944 |
| dmiip2024_2 | 0.9167 | 0.9286 | 0.9000 | 0.9143 | 0.1923 | 0.4615 | 0.3045 | 0.5377 | 0.4180 | 0.4270 |
| dmiip2024_3 | 0.8750 | 0.8966 | 0.8421 | 0.8693 | 0.3846 | 0.4615 | 0.4103 | 0.6717 | 0.5645 | 0.5949 |
| dmiip2024_4 | 0.4167 | - | 0.5882 | 0.2941 | 0.2692 | 0.5000 | 0.3782 | 0.4737 | 0.4186 | 0.3827 |
| IBE-LM ver1 | 0.5833 | 0.7368 | - | 0.3684 | 0.2308 | 0.4231 | 0.2897 | 0.0737 | 0.1139 | 0.0827 |
| IBE-LM ver2 | 0.5833 | 0.7368 | - | 0.3684 | 0.2692 | 0.3846 | 0.3077 | 0.1053 | 0.1329 | 0.1034 |
| IBE-LM ver3 | 0.5833 | 0.7368 | - | 0.3684 | 0.2692 | 0.5000 | 0.3654 | 0.1474 | 0.2092 | 0.1557 |
| IBE-LM ver4 | 0.5833 | 0.7368 | - | 0.3684 | 0.2692 | 0.5000 | 0.3538 | 0.1158 | 0.1493 | 0.1118 |
| IBE-LM ver 5 | 0.5833 | 0.7368 | - | 0.3684 | 0.2692 | 0.4231 | 0.3462 | 0.1368 | 0.1256 | 0.1126 |
| mibi_rag_5 | 0.9583 | 0.9655 | 0.9474 | 0.9564 | 0.1923 | 0.1923 | 0.1923 | 0.2754 | 0.2226 | 0.2306 |
| mibi_rag_4 | 1.0000 | 1.0000 | 1.0000 | 1.0000 | 0.5385 | 0.5385 | 0.5385 | 0.6501 | 0.5301 | 0.5580 |
| mibi_rag_3 | 0.9583 | 0.9655 | 0.9474 | 0.9564 | 0.1923 | 0.1923 | 0.1923 | 0.2860 | 0.2259 | 0.2356 |
| AUEB-System3 | 0.5833 | 0.7222 | 0.1667 | 0.4444 | 0.3846 | 0.5000 | 0.4359 | 0.4746 | 0.3516 | 0.3289 |
| extractive | 0.8750 | 0.9032 | 0.8235 | 0.8634 | 0.2692 | 0.3846 | 0.3109 | 0.1579 | 0.0239 | 0.0412 |
| abstractive | 0.9167 | 0.9286 | 0.9000 | 0.9143 | 0.1154 | 0.1154 | 0.1154 | 0.4211 | 0.0903 | 0.1422 |
| simple truncation | 0.9583 | 0.9630 | 0.9524 | 0.9577 | 0.3846 | 0.4231 | 0.4038 | 0.3684 | 0.0941 | 0.1447 |
| kmeans | 0.9583 | 0.9655 | 0.9474 | 0.9564 | 0.2308 | 0.2308 | 0.2308 | 0.2268 | 0.2014 | 0.2036 |
| similarity measures | 1.0000 | 1.0000 | 1.0000 | 1.0000 | 0.0385 | 0.0385 | 0.0385 | - | - | - |
| BioASQ_Baseline | 0.3750 | 0.2105 | 0.4828 | 0.3466 | 0.1154 | 0.1923 | 0.1474 | 0.2338 | 0.2728 | 0.2208 |
Ideal Answers
| | Automatic scores (Rouge - R) | | | | Manual scores | | | |
|---|---|---|---|---|---|---|---|---|
| System | R-2 (Rec) | R-2 (F1) | R-SU4 (Rec) | R-SU4 (F1) | Readability | Recall | Precision | Repetition |
| mibi_rag_snippet | 0.3964 | 0.3004 | 0.3870 | 0.2853 | 4.65 | 4.68 | 4.38 | 4.76 |
| mibi_rag_abstract | 0.3877 | 0.2703 | 0.3791 | 0.2591 | 4.58 | 4.67 | 4.31 | 4.72 |
| AUEB-System1 | - | - | - | - | - | - | - | - |
| CPS | 0.2951 | 0.2620 | 0.2797 | 0.2453 | 4.29 | 4.07 | 3.91 | 4.39 |
| CPS2 | 0.2852 | 0.2468 | 0.2755 | 0.2339 | 4.33 | 3.95 | 3.94 | 4.49 |
| bioinfo-0 | 0.4268 | 0.1949 | 0.4315 | 0.1897 | 4.33 | 4.69 | 4.06 | 4.34 |
| bioinfo-1 | 0.2713 | 0.1959 | 0.2825 | 0.1965 | 4.44 | 4.52 | 4.15 | 4.58 |
| bioinfo-2 | 0.4004 | 0.1805 | 0.4054 | 0.1776 | 4.31 | 4.69 | 3.95 | 4.25 |
| bioinfo-3 | 0.2614 | 0.1754 | 0.2711 | 0.1779 | 4.45 | 4.56 | 4.09 | 4.61 |
| bioinfo-4 | 0.2854 | 0.2109 | 0.2888 | 0.2061 | 4.54 | 4.55 | 4.24 | 4.64 |
| Synthia with first | 0.3058 | 0.3157 | 0.2916 | 0.2975 | 4.54 | 4.28 | 4.39 | 4.65 |
| RMC_append_snippets | 0.3500 | 0.3476 | 0.3302 | 0.3258 | 4.55 | 4.51 | 4.45 | 4.60 |
| Fleming-3 | 0.4329 | 0.2144 | 0.4246 | 0.2025 | 4.49 | 4.72 | 3.98 | 4.39 |
| UR-IW-1 | 0.3035 | 0.2060 | 0.2960 | 0.1921 | 4.54 | 4.74 | 4.08 | 4.53 |
| UR-IW-2 | 0.4226 | 0.3369 | 0.4156 | 0.3250 | 4.54 | 4.56 | 4.32 | 4.59 |
| UR-IW-3 | 0.3767 | 0.2487 | 0.3765 | 0.2409 | 4.54 | 4.61 | 4.25 | 4.47 |
| UR-IW-4 | 0.3931 | 0.3280 | 0.3753 | 0.3093 | 4.61 | 4.58 | 4.38 | 4.64 |
| Gatech competition | 0.2530 | 0.2602 | 0.2380 | 0.2433 | 4.38 | 4.22 | 4.31 | 4.67 |
| GTBioASQsys2 | 0.2406 | 0.2380 | 0.2393 | 0.2308 | 4.40 | 4.22 | 4.26 | 4.71 |
| RMC_llama3_IA | 0.3167 | 0.2933 | 0.2967 | 0.2695 | 3.59 | 4.22 | 3.61 | 4.25 |
| Fleming-1 | 0.4329 | 0.2144 | 0.4246 | 0.2025 | 4.49 | 4.72 | 3.98 | 4.39 |
| Fleming-2 | 0.4329 | 0.2144 | 0.4246 | 0.2025 | 4.49 | 4.72 | 3.98 | 4.39 |
| lasige-ku | 0.1269 | 0.0749 | 0.1484 | 0.0859 | 3.53 | 2.75 | 2.94 | 3.93 |
| IISR first submit | 0.3526 | 0.2470 | 0.3551 | 0.2414 | 4.64 | 4.71 | 4.35 | 4.64 |
| IISR 2nd submit | 0.2881 | 0.2835 | 0.2816 | 0.2731 | 4.67 | 4.39 | 4.48 | 4.78 |
| IISR 3rd submit | 0.2973 | 0.2489 | 0.2949 | 0.2410 | 4.64 | 4.46 | 4.39 | 4.65 |
| IISR 4th submit | 0.3326 | 0.3165 | 0.3215 | 0.3008 | 4.72 | 4.49 | 4.52 | 4.76 |
| IISR 5th submit | 0.3644 | 0.2533 | 0.3684 | 0.2485 | 4.60 | 4.68 | 4.35 | 4.68 |
| dmiip2024 | 0.3229 | 0.3415 | 0.3078 | 0.3256 | 4.54 | 4.42 | 4.40 | 4.68 |
| dmiip2024_1 | 0.3186 | 0.3396 | 0.3048 | 0.3239 | 4.62 | 4.42 | 4.47 | 4.71 |
| dmiip2024_2 | 0.3092 | 0.3209 | 0.2968 | 0.3049 | 4.52 | 4.44 | 4.40 | 4.59 |
| dmiip2024_3 | 0.2354 | 0.2640 | 0.2293 | 0.2564 | 4.67 | 4.46 | 4.56 | 4.76 |
| dmiip2024_4 | 0.2956 | 0.3159 | 0.2782 | 0.2955 | 4.58 | 4.11 | 4.38 | 4.72 |
| IBE-LM ver1 | - | - | - | - | 1.21 | 1.15 | 1.25 | 1.46 |
| IBE-LM ver2 | - | - | - | - | 1.21 | 1.15 | 1.25 | 1.46 |
| IBE-LM ver3 | - | - | - | - | 1.21 | 1.15 | 1.25 | 1.46 |
| IBE-LM ver4 | - | - | - | - | 1.21 | 1.15 | 1.25 | 1.46 |
| IBE-LM ver 5 | - | - | - | - | 1.21 | 1.15 | 1.25 | 1.46 |
| mibi_rag_5 | 0.2544 | 0.2426 | 0.2433 | 0.2280 | 4.44 | 3.95 | 3.96 | 4.60 |
| mibi_rag_4 | 0.3172 | 0.3104 | 0.2981 | 0.2896 | 4.58 | 4.46 | 4.39 | 4.66 |
| mibi_rag_3 | 0.2499 | 0.2399 | 0.2407 | 0.2274 | 4.42 | 3.95 | 3.98 | 4.60 |
| AUEB-System3 | - | - | - | - | - | - | - | - |
| extractive | 0.0555 | 0.0353 | 0.0569 | 0.0353 | 0.80 | 0.80 | 0.72 | 0.82 |
| abstractive | 0.0580 | 0.0409 | 0.0575 | 0.0394 | 0.82 | 0.86 | 0.81 | 0.86 |
| simple truncation | 0.0502 | 0.0359 | 0.0509 | 0.0349 | 0.81 | 0.89 | 0.75 | 0.80 |
| kmeans | 0.0410 | 0.0394 | 0.0421 | 0.0391 | 0.74 | 0.75 | 0.68 | 0.81 |
| similarity measures | 0.0515 | 0.0337 | 0.0524 | 0.0337 | 0.78 | 0.82 | 0.71 | 0.76 |
| BioASQ_Baseline | - | - | - | - | - | - | - | - |
Test batch 4
Exact Answers
| | Yes/No | | | | Factoid | | | List | | |
|---|---|---|---|---|---|---|---|---|---|---|
| System | Accuracy | F1 Yes | F1 No | Macro F1 | Strict Acc. | Lenient Acc. | MRR | Mean Prec. | Recall | F-Measure |
| dmiip2024 | 0.9259 | 0.9444 | 0.8889 | 0.9167 | 0.5263 | 0.6842 | 0.5921 | 0.7632 | 0.4899 | 0.5583 |
| dmiip2024_1 | 0.8889 | 0.9143 | 0.8421 | 0.8782 | 0.5263 | 0.6842 | 0.5921 | 0.7756 | 0.5317 | 0.5911 |
| dmiip2024_2 | 0.9259 | 0.9474 | 0.8750 | 0.9112 | 0.6316 | 0.6316 | 0.6316 | 0.6002 | 0.4705 | 0.5116 |
| dmiip2024_3 | 0.9630 | 0.9730 | 0.9412 | 0.9571 | 0.6316 | 0.6316 | 0.6316 | 0.6761 | 0.4739 | 0.5342 |
| dmiip2024_4 | 0.2963 | - | 0.4571 | 0.2286 | 0.4211 | 0.6316 | 0.4842 | 0.4709 | 0.3571 | 0.3798 |
| bioinfo-0 | 0.7037 | 0.8261 | - | 0.4130 | - | - | - | - | - | - |
| bioinfo-1 | 0.7037 | 0.8261 | - | 0.4130 | - | - | - | - | - | - |
| bioinfo-2 | 0.7037 | 0.8261 | - | 0.4130 | - | - | - | - | - | - |
| bioinfo-3 | 0.7037 | 0.8261 | - | 0.4130 | - | - | - | - | - | - |
| bioinfo-4 | 0.7037 | 0.8261 | - | 0.4130 | - | - | - | - | - | - |
| Synthia with first | 0.8519 | 0.8947 | 0.7500 | 0.8224 | 0.3684 | 0.3684 | 0.3684 | 0.4000 | 0.3116 | 0.3251 |
| RMC_append_snippets | 0.9259 | 0.9474 | 0.8750 | 0.9112 | 0.5789 | 0.5789 | 0.5789 | 0.4634 | 0.3836 | 0.3983 |
| RMC_llama3_A | 0.8519 | 0.9000 | 0.7143 | 0.8071 | 0.0526 | 0.0526 | 0.0526 | 0.2680 | 0.3088 | 0.2413 |
| Fleming-1 | 0.9259 | 0.9444 | 0.8889 | 0.9167 | 0.5789 | 0.6316 | 0.6053 | 0.6932 | 0.5280 | 0.5767 |
| Fleming-2 | 0.9630 | 0.9730 | 0.9412 | 0.9571 | 0.5789 | 0.6316 | 0.6053 | 0.6932 | 0.5280 | 0.5767 |
| RMC_llama3_IA | 0.8519 | 0.8947 | 0.7500 | 0.8224 | 0.2105 | 0.2105 | 0.2105 | 0.2297 | 0.3838 | 0.2511 |
| AUEB-System3 | 0.7778 | 0.8636 | 0.4000 | 0.6318 | 0.5263 | 0.7368 | 0.6228 | 0.3646 | 0.2127 | 0.2398 |
| AUEB-System4 | 0.8148 | 0.8718 | 0.6667 | 0.7692 | 0.4211 | 0.5789 | 0.4868 | 0.2970 | 0.1439 | 0.1688 |
| IBE-LM ver1 | 0.7037 | 0.8261 | - | 0.4130 | 0.2632 | 0.3158 | 0.2763 | 0.1636 | 0.1895 | 0.1652 |
| IBE-LM ver2 | 0.7037 | 0.8261 | - | 0.4130 | 0.2105 | 0.2632 | 0.2368 | 0.2000 | 0.2126 | 0.1918 |
| IBE-LM ver3 | 0.7037 | 0.8261 | - | 0.4130 | 0.2105 | 0.2632 | 0.2368 | 0.1636 | 0.2009 | 0.1714 |
| IBE-LM ver4 | 0.7037 | 0.8261 | - | 0.4130 | 0.2105 | 0.2632 | 0.2368 | 0.1909 | 0.2133 | 0.1889 |
| IBE-LM ver 5 | 0.7037 | 0.8261 | - | 0.4130 | 0.1579 | 0.3158 | 0.2237 | 0.1364 | 0.1470 | 0.1290 |
| abstractive | 0.8889 | 0.9143 | 0.8421 | 0.8782 | 0.5789 | 0.6316 | 0.5965 | 0.5153 | 0.4160 | 0.4464 |
| similarity measures | 0.8519 | 0.9000 | 0.7143 | 0.8071 | 0.4737 | 0.6316 | 0.5368 | 0.3485 | 0.4141 | 0.3682 |
| extractive | 0.7778 | 0.8421 | 0.6250 | 0.7336 | 0.5263 | 0.5789 | 0.5395 | 0.4313 | 0.4708 | 0.4217 |
| mibi_rag_snippet | 0.4444 | 0.5161 | 0.3478 | 0.4320 | 0.1579 | 0.1579 | 0.1579 | 0.2343 | 0.1280 | 0.1574 |
| mibi_rag_abstract | 0.8889 | 0.9189 | 0.8235 | 0.8712 | 0.5263 | 0.5263 | 0.5263 | 0.6696 | 0.4502 | 0.5141 |
| mibi_rag_3 | 0.4444 | 0.5161 | 0.3478 | 0.4320 | 0.1579 | 0.1579 | 0.1579 | 0.2343 | 0.1280 | 0.1574 |
| mibi_rag_4 | 0.8889 | 0.9189 | 0.8235 | 0.8712 | 0.6316 | 0.6316 | 0.6316 | 0.6985 | 0.4736 | 0.5409 |
| mibi_rag_5 | 0.4815 | 0.5333 | 0.4167 | 0.4750 | 0.1579 | 0.1579 | 0.1579 | 0.2343 | 0.1280 | 0.1574 |
| IISR first submit | 0.8889 | 0.9189 | 0.8235 | 0.8712 | 0.6316 | 0.6842 | 0.6491 | 0.6295 | 0.5103 | 0.5426 |
| IISR 2nd submit | 0.9630 | 0.9730 | 0.9412 | 0.9571 | 0.6316 | 0.6842 | 0.6491 | 0.6416 | 0.4685 | 0.5210 |
| IISR 3rd submit | 0.9630 | 0.9730 | 0.9412 | 0.9571 | 0.6316 | 0.6842 | 0.6491 | 0.5824 | 0.3992 | 0.4555 |
| IISR 4th submit | 0.9259 | 0.9474 | 0.8750 | 0.9112 | 0.6316 | 0.6842 | 0.6491 | 0.6893 | 0.5560 | 0.5918 |
| IISR 5th submit | 0.9259 | 0.9444 | 0.8889 | 0.9167 | 0.4737 | 0.6316 | 0.5175 | 0.6104 | 0.4409 | 0.4928 |
| CPS | 0.7778 | 0.8421 | 0.6250 | 0.7336 | 0.2632 | 0.2632 | 0.2632 | 0.4356 | 0.2088 | 0.2647 |
| CPS2 | 0.7778 | 0.8500 | 0.5714 | 0.7107 | - | - | - | - | - | - |
| Gatech competition | 0.5926 | 0.6667 | 0.4762 | 0.5714 | 0.1579 | 0.1579 | 0.1579 | 0.1976 | 0.1231 | 0.1369 |
| GTBioASQsys2 | 0.4444 | 0.4444 | 0.4444 | 0.4444 | 0.1579 | 0.1579 | 0.1579 | 0.2619 | 0.2045 | 0.2201 |
| kmeans | 0.9259 | 0.9474 | 0.8750 | 0.9112 | 0.5263 | 0.5263 | 0.5263 | - | - | - |
| UR-IW-1 | 0.9259 | 0.9444 | 0.8889 | 0.9167 | 0.5789 | 0.6842 | 0.6228 | 0.4203 | 0.5185 | 0.4365 |
| UR-IW-3 | 0.5926 | 0.5926 | 0.5926 | 0.5926 | 0.4211 | 0.4211 | 0.4211 | 0.5073 | 0.4480 | 0.4472 |
| UR-IW-4 | 0.8519 | 0.8889 | 0.7778 | 0.8333 | 0.5789 | 0.6316 | 0.5965 | 0.5001 | 0.4150 | 0.4290 |
| UR-IW-5 | 0.8148 | 0.8571 | 0.7368 | 0.7970 | 0.6316 | 0.6316 | 0.6316 | 0.5296 | 0.4496 | 0.4660 |
| UR-IW-2 | 0.8889 | 0.9231 | 0.8000 | 0.8615 | 0.6842 | 0.7368 | 0.7105 | 0.6582 | 0.5951 | 0.5983 |
| Fleming-3 | 0.9259 | 0.9474 | 0.8750 | 0.9112 | 0.5789 | 0.6316 | 0.6053 | 0.6932 | 0.5280 | 0.5767 |
| lasige-ku | 0.4074 | 0.5294 | 0.2000 | 0.3647 | 0.1579 | 0.1579 | 0.1579 | - | - | - |
| BioASQ_Baseline | 0.3333 | 0.1818 | 0.4375 | 0.3097 | 0.0526 | 0.1579 | 0.1053 | 0.2459 | 0.2224 | 0.2179 |
Ideal Answers
| | Automatic scores (Rouge - R) | | | | Manual scores | | | |
|---|---|---|---|---|---|---|---|---|
| System | R-2 (Rec) | R-2 (F1) | R-SU4 (Rec) | R-SU4 (F1) | Readability | Recall | Precision | Repetition |
| dmiip2024 | 0.3089 | 0.3214 | 0.3032 | 0.3095 | 4.64 | 4.31 | 4.52 | 4.71 |
| dmiip2024_1 | 0.3276 | 0.3285 | 0.3218 | 0.3191 | 4.65 | 4.32 | 4.41 | 4.67 |
| dmiip2024_2 | 0.2752 | 0.2973 | 0.2657 | 0.2858 | 4.59 | 4.20 | 4.40 | 4.74 |
| dmiip2024_3 | 0.3018 | 0.3019 | 0.2969 | 0.2894 | 4.59 | 4.34 | 4.41 | 4.72 |
| dmiip2024_4 | 0.2971 | 0.2984 | 0.2856 | 0.2813 | 4.71 | 3.84 | 4.13 | 4.73 |
| bioinfo-0 | 0.4238 | 0.2060 | 0.4242 | 0.1984 | 4.40 | 4.61 | 3.99 | 4.39 |
| bioinfo-1 | 0.2662 | 0.1829 | 0.2711 | 0.1803 | 4.54 | 4.47 | 4.09 | 4.68 |
| bioinfo-2 | 0.4108 | 0.1915 | 0.4147 | 0.1866 | 4.46 | 4.53 | 3.85 | 4.45 |
| bioinfo-3 | 0.2423 | 0.1710 | 0.2410 | 0.1664 | 4.40 | 4.39 | 3.91 | 4.62 |
| bioinfo-4 | 0.2544 | 0.1839 | 0.2643 | 0.1829 | 4.55 | 4.45 | 4.07 | 4.73 |
| Synthia with first | 0.3194 | 0.3013 | 0.3168 | 0.2911 | 4.61 | 4.15 | 4.32 | 4.65 |
| RMC_append_snippets | 0.3594 | 0.3290 | 0.3516 | 0.3141 | 4.66 | 4.34 | 4.33 | 4.72 |
| RMC_llama3_A | 0.2579 | 0.2561 | 0.2529 | 0.2467 | 4.20 | 3.93 | 4.08 | 4.36 |
| Fleming-1 | 0.4364 | 0.2446 | 0.4384 | 0.2353 | 4.56 | 4.62 | 4.07 | 4.53 |
| Fleming-2 | 0.4364 | 0.2446 | 0.4384 | 0.2353 | 4.56 | 4.62 | 4.07 | 4.53 |
| RMC_llama3_IA | 0.3041 | 0.2717 | 0.3004 | 0.2597 | 3.74 | 4.34 | 3.42 | 4.25 |
| AUEB-System3 | - | - | - | - | - | - | - | - |
| AUEB-System4 | - | - | - | - | - | - | - | - |
| IBE-LM ver1 | - | - | - | - | 1.48 | 1.33 | 1.42 | 1.60 |
| IBE-LM ver2 | - | - | - | - | 1.48 | 1.33 | 1.42 | 1.60 |
| IBE-LM ver3 | - | - | - | - | 1.48 | 1.33 | 1.42 | 1.60 |
| IBE-LM ver4 | - | - | - | - | 1.48 | 1.33 | 1.42 | 1.60 |
| IBE-LM ver 5 | - | - | - | - | 1.48 | 1.33 | 1.42 | 1.60 |
| abstractive | 0.0818 | 0.0435 | 0.0824 | 0.0423 | 0.91 | 0.94 | 0.85 | 0.89 |
| similarity measures | 0.0741 | 0.0303 | 0.0760 | 0.0311 | 0.89 | 0.92 | 0.85 | 0.88 |
| extractive | 0.0786 | 0.0340 | 0.0794 | 0.0341 | 0.88 | 0.92 | 0.84 | 0.91 |
| mibi_rag_snippet | 0.2607 | 0.2482 | 0.2571 | 0.2408 | 4.58 | 3.58 | 3.88 | 4.66 |
| mibi_rag_abstract | 0.3596 | 0.3225 | 0.3565 | 0.3098 | 4.65 | 4.33 | 4.38 | 4.66 |
| mibi_rag_3 | 0.2603 | 0.2484 | 0.2570 | 0.2411 | 4.58 | 3.58 | 3.89 | 4.66 |
| mibi_rag_4 | 0.3375 | 0.3130 | 0.3279 | 0.2980 | 4.64 | 4.28 | 4.39 | 4.66 |
| mibi_rag_5 | 0.2701 | 0.2585 | 0.2681 | 0.2504 | 4.44 | 3.53 | 3.76 | 4.51 |
| IISR first submit | 0.3443 | 0.3331 | 0.3372 | 0.3184 | 4.73 | 4.44 | 4.47 | 4.73 |
| IISR 2nd submit | 0.3479 | 0.3341 | 0.3396 | 0.3190 | 4.71 | 4.42 | 4.49 | 4.79 |
| IISR 3rd submit | 0.3700 | 0.3001 | 0.3577 | 0.2837 | 4.65 | 4.46 | 4.35 | 4.68 |
| IISR 4th submit | 0.3480 | 0.3257 | 0.3363 | 0.3099 | 4.67 | 4.45 | 4.39 | 4.73 |
| IISR 5th submit | 0.3488 | 0.3385 | 0.3425 | 0.3244 | 4.72 | 4.46 | 4.52 | 4.76 |
| CPS | 0.3002 | 0.2646 | 0.3043 | 0.2608 | 4.39 | 3.88 | 3.82 | 4.53 |
| CPS2 | 0.2968 | 0.2430 | 0.2996 | 0.2388 | 4.39 | 3.73 | 3.66 | 4.54 |
| Gatech competition | 0.1400 | 0.1199 | 0.1526 | 0.1253 | 4.31 | 3.68 | 3.68 | 4.58 |
| GTBioASQsys2 | 0.1693 | 0.1444 | 0.1700 | 0.1417 | 4.36 | 3.58 | 3.64 | 4.61 |
| kmeans | 0.0376 | 0.0346 | 0.0407 | 0.0356 | 0.88 | 0.86 | 0.91 | 0.93 |
| UR-IW-1 | 0.3396 | 0.2745 | 0.3469 | 0.2668 | 4.76 | 4.75 | 4.39 | 4.72 |
| UR-IW-3 | 0.3171 | 0.2190 | 0.3209 | 0.2129 | 4.00 | 3.81 | 3.55 | 3.95 |
| UR-IW-4 | 0.3959 | 0.3263 | 0.3906 | 0.3096 | 4.67 | 4.52 | 4.35 | 4.66 |
| UR-IW-5 | 0.3842 | 0.2644 | 0.3861 | 0.2579 | 4.33 | 4.29 | 3.93 | 4.48 |
| UR-IW-2 | 0.3412 | 0.3067 | 0.3322 | 0.2921 | 4.66 | 4.60 | 4.46 | 4.72 |
| Fleming-3 | 0.4364 | 0.2446 | 0.4384 | 0.2353 | 4.56 | 4.62 | 4.07 | 4.53 |
| lasige-ku | 0.1123 | 0.0876 | 0.1346 | 0.1006 | 4.14 | 2.94 | 3.05 | 4.39 |
| BioASQ_Baseline | - | - | - | - | - | - | - | - |