BioASQ Participants Area
Task 7b: Test Results of Phase B
The test results are presented in separate tables for each type of annotation. Each system is listed under its "System Description".
The evaluation measures that are used in Task B are presented here.
Warning: For ideal answers, good ROUGE results do not always imply good manual scores.
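As a rough guide to the exact-answer columns for factoid and list questions, the measures can be sketched as follows. This is a minimal illustration, not the official BioASQ evaluation code: the function names are hypothetical, answers are compared by plain string membership (the official evaluation may normalise answers and handle synonyms differently), and factoid systems are assumed to return a ranked list of which at most the first five answers are considered.

```python
from typing import Dict, Sequence, Set


def factoid_scores(ranked: Sequence[Sequence[str]],
                   gold: Sequence[Set[str]]) -> Dict[str, float]:
    """Strict accuracy, lenient accuracy and MRR over factoid questions."""
    strict = lenient = 0
    reciprocal_rank_sum = 0.0
    for answers, accepted in zip(ranked, gold):
        # 1-based rank of the first accepted answer among the top five, if any.
        rank = next((i for i, a in enumerate(answers[:5], start=1)
                     if a in accepted), None)
        if rank == 1:
            strict += 1          # correct answer ranked first
        if rank is not None:
            lenient += 1         # correct answer anywhere in the list
            reciprocal_rank_sum += 1.0 / rank
    n = len(gold)
    return {"strict_acc": strict / n,
            "lenient_acc": lenient / n,
            "mrr": reciprocal_rank_sum / n}


def list_scores(predicted: Sequence[Sequence[str]],
                gold: Sequence[Set[str]]) -> Dict[str, float]:
    """Mean precision, mean recall and mean F-measure over list questions."""
    precisions, recalls, f_measures = [], [], []
    for answers, accepted in zip(predicted, gold):
        returned = set(answers)
        hits = len(returned & accepted)
        p = hits / len(returned) if returned else 0.0
        r = hits / len(accepted) if accepted else 0.0
        f = 2 * p * r / (p + r) if p + r > 0 else 0.0
        precisions.append(p)
        recalls.append(r)
        f_measures.append(f)
    n = len(gold)
    return {"mean_precision": sum(precisions) / n,
            "recall": sum(recalls) / n,
            "f_measure": sum(f_measures) / n}


if __name__ == "__main__":
    # Toy data, invented purely for illustration.
    print(factoid_scores([["a", "b"], ["x", "y", "z"], ["q"]],
                         [{"a"}, {"z"}, {"other"}]))
    print(list_scores([["a", "b"], ["c"]], [{"a", "c"}, {"c"}]))
```

Because the F-measure here is averaged per question, it need not equal the harmonic mean of the Mean Prec. and Recall columns, which is consistent with the figures reported in the tables.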
Test batch 1
Exact Answers
System | Yes/No: Accuracy | Yes/No: F1 Yes | Yes/No: F1 No | Yes/No: Macro F1 | Factoid: Strict Acc. | Factoid: Lenient Acc. | Factoid: MRR | List: Mean Prec. | List: Recall | List: F-Measure |
transfer-learning | 0.7931 | 0.8846 | - | 0.4423 | 0.1795 | 0.2051 | 0.1880 | - | - | - |
Lab Zhu ,Fdan Univer | 0.7931 | 0.8846 | - | 0.4423 | 0.1538 | 0.2308 | 0.1923 | 0.2347 | 0.5748 | 0.3165 |
auth-qa-1 | 0.7586 | 0.8571 | 0.2222 | 0.5397 | 0.2564 | 0.3077 | 0.2778 | 0.1833 | 0.4877 | 0.2594 |
MQ-3 | 0.7931 | 0.8846 | - | 0.4423 | - | - | - | - | - | - |
MQ-4 | 0.7931 | 0.8846 | - | 0.4423 | - | - | - | - | - | - |
MQ-1 | 0.7931 | 0.8846 | - | 0.4423 | - | - | - | - | - | - |
MQ-2 | 0.7931 | 0.8846 | - | 0.4423 | - | - | - | - | - | - |
MQ-5 | 0.7931 | 0.8846 | - | 0.4423 | - | - | - | - | - | - |
QA1 | 0.7931 | 0.8846 | - | 0.4423 | 0.1538 | 0.2308 | 0.1761 | - | - | - |
Lab Zhu,Fudan Univer | 0.7931 | 0.8846 | - | 0.4423 | 0.2308 | 0.3077 | 0.2692 | 0.2417 | 0.6026 | 0.3276 |
LabZhu,FDU | 0.7931 | 0.8846 | - | 0.4423 | 0.0256 | 0.1026 | 0.0577 | 0.1424 | 0.4855 | 0.2153 |
KU-DMIS-5 | 0.7931 | 0.8846 | - | 0.4423 | - | - | - | - | - | - |
LabZhu-FDU | 0.7931 | 0.8846 | - | 0.4423 | 0.0513 | 0.0769 | 0.0641 | 0.0417 | 0.1452 | 0.0620 |
LabZhu_FDU | 0.7931 | 0.8846 | - | 0.4423 | 0.0769 | 0.1282 | 0.0962 | 0.1215 | 0.4230 | 0.1840 |
KU-DMIS-1 | 0.8276 | 0.8980 | 0.4444 | 0.6712 | 0.4103 | 0.5385 | 0.4637 | 0.4792 | 0.2476 | 0.3051 |
BJUTNLPGroup | - | - | - | - | 0.3077 | 0.4103 | 0.3483 | 0.1500 | 0.2411 | 0.1785 |
BioASQ_Baseline | 0.4828 | 0.5455 | 0.4000 | 0.4727 | 0.1282 | 0.2051 | 0.1547 | 0.2266 | 0.3460 | 0.2444 |
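Many rows in the Test batch 1 table above share the same Yes/No figures: Accuracy 0.7931, F1 Yes 0.8846, no F1 No value, and Macro F1 0.4423, which is exactly half of F1 Yes. One plausible reading is that these systems answered "yes" to every question and that the missing F1 No is counted as 0 in the macro average. The sketch below checks that reading. The split of 23 "yes" and 6 "no" gold answers is an assumption inferred from the reported numbers, not something stated on this page, and the function name is hypothetical.

```python
from typing import Sequence, Tuple


def yes_no_scores(predicted: Sequence[str],
                  gold: Sequence[str]) -> Tuple[float, float, float, float]:
    """Return (accuracy, F1 for 'yes', F1 for 'no', macro F1)."""

    def f1(label: str) -> float:
        tp = sum(p == label and g == label for p, g in zip(predicted, gold))
        fp = sum(p == label and g != label for p, g in zip(predicted, gold))
        fn = sum(p != label and g == label for p, g in zip(predicted, gold))
        denom = 2 * tp + fp + fn
        # An undefined F1 (label never predicted and never in gold) is treated as 0.
        return 2 * tp / denom if denom else 0.0

    accuracy = sum(p == g for p, g in zip(predicted, gold)) / len(gold)
    f1_yes, f1_no = f1("yes"), f1("no")
    return accuracy, f1_yes, f1_no, (f1_yes + f1_no) / 2


# Assumed gold distribution (inferred, not given on the page): 23 yes, 6 no.
gold = ["yes"] * 23 + ["no"] * 6
always_yes = ["yes"] * len(gold)

acc, f1_yes, f1_no, macro = yes_no_scores(always_yes, gold)
print(round(acc, 4), round(f1_yes, 4), round(f1_no, 4), round(macro, 4))
# prints: 0.7931 0.8846 0.0 0.4423
```

Under these assumptions, a high Accuracy paired with a low Macro F1 indicates a majority-class strategy rather than genuine discrimination between "yes" and "no" questions.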
Ideal Answers
System | Automatic: Rouge-2 | Automatic: Rouge-SU4 | Manual: Readability | Manual: Recall | Manual: Precision | Manual: Repetition |
transfer-learning | - | - | - | - | - | - |
Lab Zhu ,Fdan Univer | 0.2964 | 0.2968 | 4.05 | 4.11 | 4.38 | 4.87 |
auth-qa-1 | - | - | - | - | - | - |
MQ-3 | 0.5008 | 0.4973 | 3.89 | 4.44 | 4.09 | 4.36 |
MQ-4 | 0.5467 | 0.5443 | 3.86 | 4.47 | 4.05 | 4.26 |
MQ-1 | 0.5008 | 0.4973 | 3.89 | 4.44 | 4.09 | 4.36 |
MQ-2 | 0.5469 | 0.5370 | 3.87 | 4.55 | 4.14 | 4.33 |
MQ-5 | 0.4806 | 0.4838 | 3.91 | 4.44 | 4.08 | 4.40 |
QA1 | - | - | - | - | - | - |
Lab Zhu,Fudan Univer | 0.2964 | 0.2968 | 4.05 | 4.11 | 4.38 | 4.87 |
LabZhu,FDU | 0.2964 | 0.2968 | 4.05 | 4.11 | 4.38 | 4.87 |
KU-DMIS-5 | - | - | - | - | - | - |
LabZhu-FDU | 0.2964 | 0.2968 | 4.05 | 4.11 | 4.38 | 4.87 |
LabZhu_FDU | 0.2964 | 0.2968 | 4.05 | 4.11 | 4.38 | 4.87 |
KU-DMIS-1 | - | - | - | - | - | - |
BJUTNLPGroup | - | - | - | - | - | - |
BioASQ_Baseline | - | - | - | - | - | - |
Test batch 2
Exact Answers
System | Yes/No: Accuracy | Yes/No: F1 Yes | Yes/No: F1 No | Yes/No: Macro F1 | Factoid: Strict Acc. | Factoid: Lenient Acc. | Factoid: MRR | List: Mean Prec. | List: Recall | List: F-Measure |
auth-qa-1 | 0.6333 | 0.5926 | 0.6667 | 0.6296 | 0.2400 | 0.4000 | 0.3067 | 0.1549 | 0.5143 | 0.2321 |
auth-qa-2 | 0.5000 | 0.6341 | 0.2105 | 0.4223 | 0.2400 | 0.4000 | 0.3067 | 0.1549 | 0.5143 | 0.2321 |
transfer-learning | 0.5667 | 0.7234 | - | 0.3617 | 0.2400 | 0.4400 | 0.3267 | - | - | - |
limsi-reader | 0.5667 | 0.7234 | - | 0.3617 | 0.1600 | 0.3200 | 0.2200 | - | - | - |
Lab Zhu ,Fdan Univer | 0.5667 | 0.7234 | - | 0.3617 | 0.0800 | 0.2000 | 0.1300 | 0.1232 | 0.4422 | 0.1865 |
limsi-reader-UMLS-r1 | 0.5667 | 0.7234 | - | 0.3617 | 0.1600 | 0.3200 | 0.2100 | - | - | - |
Lab Zhu,Fudan Univer | 0.5667 | 0.7234 | - | 0.3617 | 0.1600 | 0.2800 | 0.2100 | 0.1732 | 0.4814 | 0.2450 |
LabZhu,FDU | 0.5667 | 0.7234 | - | 0.3617 | 0.2000 | 0.3200 | 0.2500 | 0.1818 | 0.5108 | 0.2579 |
LabZhu_FDU | 0.5667 | 0.7234 | - | 0.3617 | 0.0400 | 0.1200 | 0.0800 | 0.1103 | 0.1431 | 0.0998 |
MQ-1 | 0.5667 | 0.7234 | - | 0.3617 | - | - | - | - | - | - |
MQ-2 | 0.5667 | 0.7234 | - | 0.3617 | - | - | - | - | - | - |
QA1 | 0.5667 | 0.7234 | - | 0.3617 | 0.3600 | 0.4800 | 0.4033 | 0.0471 | 0.2898 | 0.0786 |
MQ-3 | 0.5667 | 0.7234 | - | 0.3617 | - | - | - | - | - | - |
MQ-4 | 0.5667 | 0.7234 | - | 0.3617 | - | - | - | - | - | - |
MQ-5 | 0.5667 | 0.7234 | - | 0.3617 | - | - | - | - | - | - |
LabZhu-FDU | 0.5667 | 0.7234 | - | 0.3617 | 0.0400 | 0.1600 | 0.0900 | 0.1103 | 0.1431 | 0.0998 |
List only | 0.5667 | 0.7234 | - | 0.3617 | 0.2400 | 0.3200 | 0.2733 | - | - | - |
L2PS - DeepQA | 0.5667 | 0.7234 | - | 0.3617 | 0.1600 | 0.3200 | 0.2333 | - | - | - |
KU-DMIS-1 | 0.5667 | 0.6829 | 0.3158 | 0.4994 | 0.3200 | 0.6000 | 0.4367 | 0.5826 | 0.4839 | 0.4732 |
KU-DMIS-5 | 0.8333 | 0.8387 | 0.8276 | 0.8331 | 0.5200 | 0.6400 | 0.5667 | 0.5696 | 0.4368 | 0.4395 |
BioASQ_Baseline | 0.4667 | 0.2727 | 0.5789 | 0.4258 | 0.0800 | 0.2400 | 0.1367 | 0.1687 | 0.2954 | 0.1823 |
Ideal Answers
System | Automatic: Rouge-2 | Automatic: Rouge-SU4 | Manual: Readability | Manual: Recall | Manual: Precision | Manual: Repetition |
auth-qa-1 | - | - | - | - | - | - |
auth-qa-2 | - | - | - | - | - | - |
transfer-learning | - | - | - | - | - | - |
limsi-reader | - | - | - | - | - | - |
Lab Zhu ,Fdan Univer | 0.3720 | 0.3771 | 4.32 | 4.37 | 4.54 | 4.90 |
limsi-reader-UMLS-r1 | - | - | - | - | - | - |
Lab Zhu,Fudan Univer | 0.3720 | 0.3771 | 4.32 | 4.37 | 4.54 | 4.90 |
LabZhu,FDU | 0.3720 | 0.3771 | 4.32 | 4.37 | 4.54 | 4.90 |
LabZhu_FDU | 0.3720 | 0.3771 | 4.32 | 4.37 | 4.54 | 4.90 |
MQ-1 | 0.5102 | 0.5251 | 4.02 | 4.69 | 4.29 | 4.27 |
MQ-2 | 0.5120 | 0.5285 | 3.97 | 4.69 | 4.28 | 4.20 |
QA1 | - | - | - | - | - | - |
MQ-3 | 0.5102 | 0.5251 | 4.02 | 4.69 | 4.29 | 4.27 |
MQ-4 | 0.5279 | 0.5456 | 4.00 | 4.67 | 4.28 | 4.25 |
MQ-5 | 0.4795 | 0.4989 | 3.95 | 4.69 | 4.31 | 4.25 |
LabZhu-FDU | 0.3720 | 0.3771 | 4.32 | 4.37 | 4.54 | 4.90 |
List only | - | - | - | - | - | - |
L2PS - DeepQA | - | - | - | - | - | - |
KU-DMIS-1 | - | - | - | - | - | - |
KU-DMIS-5 | - | - | - | - | - | - |
BioASQ_Baseline | - | - | - | - | - | - |
Test batch 3
Exact Answers
System | Yes/No: Accuracy | Yes/No: F1 Yes | Yes/No: F1 No | Yes/No: Macro F1 | Factoid: Strict Acc. | Factoid: Lenient Acc. | Factoid: MRR | List: Mean Prec. | List: Recall | List: F-Measure |
auth-qa-1 | 0.5652 | 0.6875 | 0.2857 | 0.4866 | 0.2759 | 0.4483 | 0.3420 | 0.1720 | 0.5338 | 0.2513 |
auth-qa-2 | 0.6957 | 0.8108 | 0.2222 | 0.5165 | 0.2759 | 0.4483 | 0.3420 | 0.1720 | 0.5338 | 0.2513 |
google-gold-input | 0.7826 | 0.8780 | - | 0.4390 | 0.4138 | 0.6552 | 0.5023 | - | - | - |
google-pred-input | 0.7826 | 0.8780 | - | 0.4390 | 0.3448 | 0.5517 | 0.4322 | - | - | - |
Lab Zhu ,Fdan Univer | 0.7826 | 0.8780 | - | 0.4390 | 0.1379 | 0.3103 | 0.2023 | 0.1117 | 0.2689 | 0.1381 |
unipi-quokka-QA-1 | 0.8261 | 0.9000 | 0.3333 | 0.6167 | - | - | - | - | - | - |
QA1 | 0.7826 | 0.8780 | - | 0.4390 | 0.4483 | 0.5862 | 0.5115 | 0.0780 | 0.4711 | 0.1297 |
UNCC_QA_1 | 0.7826 | 0.8780 | - | 0.4390 | 0.4483 | 0.5862 | 0.5115 | 0.0780 | 0.4711 | 0.1297 |
unipi-quokka-QA-2 | 0.8696 | 0.9231 | 0.5714 | 0.7473 | - | - | - | - | - | - |
MQ-1 | 0.7826 | 0.8780 | - | 0.4390 | - | - | - | - | - | - |
MQ-2 | 0.7826 | 0.8780 | - | 0.4390 | - | - | - | - | - | - |
MQ-3 | 0.7826 | 0.8780 | - | 0.4390 | - | - | - | - | - | - |
MQ-4 | 0.7826 | 0.8780 | - | 0.4390 | - | - | - | - | - | - |
MQ-5 | 0.7826 | 0.8780 | - | 0.4390 | - | - | - | - | - | - |
UNCC_QA2 | 0.7826 | 0.8780 | - | 0.4390 | 0.4138 | 0.5862 | 0.4856 | 0.0680 | 0.3913 | 0.1121 |
UNCC_QA3 | 0.7826 | 0.8780 | - | 0.4390 | 0.4138 | 0.5862 | 0.4943 | 0.0780 | 0.4711 | 0.1297 |
limsi-reader | 0.7826 | 0.8780 | - | 0.4390 | 0.2414 | 0.4828 | 0.3213 | - | - | - |
limsi-reader-UMLS-r1 | 0.7826 | 0.8780 | - | 0.4390 | 0.2414 | 0.4828 | 0.3126 | - | - | - |
KU-DMIS-1 | 0.6087 | 0.7429 | 0.1818 | 0.4623 | 0.3793 | 0.6207 | 0.4724 | 0.4267 | 0.3058 | 0.3298 |
Lab Zhu,Fudan Univer | 0.7826 | 0.8780 | - | 0.4390 | 0.2759 | 0.4828 | 0.3621 | 0.1490 | 0.3489 | 0.1875 |
LabZhu,FDU | 0.7826 | 0.8780 | - | 0.4390 | 0.3103 | 0.5172 | 0.3966 | 0.1847 | 0.3822 | 0.2261 |
LabZhu_FDU | 0.7826 | 0.8780 | - | 0.4390 | 0.1379 | 0.2069 | 0.1667 | 0.0967 | 0.1649 | 0.1114 |
BJUTNLPGroup | 0.7826 | 0.8780 | - | 0.4390 | 0.2759 | 0.3793 | 0.3011 | 0.0960 | 0.1791 | 0.1201 |
BioASQ_Baseline | 0.1739 | - | 0.2963 | 0.1481 | 0.1034 | 0.1724 | 0.1322 | 0.1928 | 0.4080 | 0.2275 |
Ideal Answers
System | Automatic: Rouge-2 | Automatic: Rouge-SU4 | Manual: Readability | Manual: Recall | Manual: Precision | Manual: Repetition |
auth-qa-1 | - | - | - | - | - | - |
auth-qa-2 | - | - | - | - | - | - |
google-gold-input | - | - | - | - | - | - |
google-pred-input | - | - | - | - | - | - |
Lab Zhu ,Fdan Univer | 0.2793 | 0.2827 | 4.13 | 4.17 | 4.53 | 4.93 |
unipi-quokka-QA-1 | 0.0525 | 0.0566 | 0.85 | 0.73 | 0.83 | 1.06 |
QA1 | - | - | - | - | - | - |
UNCC_QA_1 | - | - | - | - | - | - |
unipi-quokka-QA-2 | 0.0525 | 0.0566 | 0.85 | 0.73 | 0.83 | 1.06 |
MQ-1 | 0.4790 | 0.4850 | 3.79 | 4.54 | 4.29 | 4.23 |
MQ-2 | 0.5013 | 0.5086 | 3.78 | 4.60 | 4.30 | 4.17 |
MQ-3 | 0.4790 | 0.4850 | 3.79 | 4.54 | 4.29 | 4.23 |
MQ-4 | 0.5309 | 0.5344 | 3.79 | 4.58 | 4.25 | 4.21 |
MQ-5 | 0.4667 | 0.4737 | 3.84 | 4.54 | 4.29 | 4.22 |
UNCC_QA2 | - | - | - | - | - | - |
UNCC_QA3 | - | - | - | - | - | - |
limsi-reader | - | - | - | - | - | - |
limsi-reader-UMLS-r1 | - | - | - | - | - | - |
KU-DMIS-1 | - | - | - | - | - | - |
Lab Zhu,Fudan Univer | 0.2793 | 0.2827 | 4.13 | 4.17 | 4.53 | 4.93 |
LabZhu,FDU | 0.2793 | 0.2827 | 4.13 | 4.17 | 4.53 | 4.93 |
LabZhu_FDU | 0.2793 | 0.2827 | 4.13 | 4.17 | 4.53 | 4.93 |
BJUTNLPGroup | 0.0942 | 0.0781 | 3.42 | 2.82 | 3.56 | 4.48 |
BioASQ_Baseline | - | - | - | - | - | - |
Test batch 4
Exact Answers
System | Yes/No: Accuracy | Yes/No: F1 Yes | Yes/No: F1 No | Yes/No: Macro F1 | Factoid: Strict Acc. | Factoid: Lenient Acc. | Factoid: MRR | List: Mean Prec. | List: Recall | List: F-Measure |
auth-qa-1 | 0.5217 | 0.6207 | 0.3529 | 0.4868 | 0.3235 | 0.4706 | 0.3725 | 0.1864 | 0.5302 | 0.2652 |
auth-qa-2 | 0.7391 | 0.8421 | 0.2500 | 0.5461 | 0.3235 | 0.4706 | 0.3725 | 0.1864 | 0.5302 | 0.2652 |
MQ-1 | 0.7391 | 0.8500 | - | 0.4250 | - | - | - | - | - | - |
MQ-2 | 0.7391 | 0.8500 | - | 0.4250 | - | - | - | - | - | - |
MQ-3 | 0.7391 | 0.8500 | - | 0.4250 | - | - | - | - | - | - |
MQ-4 | 0.7391 | 0.8500 | - | 0.4250 | - | - | - | - | - | - |
MQ-5 | 0.7391 | 0.8500 | - | 0.4250 | - | - | - | - | - | - |
auth-qa-3 | 0.6522 | 0.7647 | 0.3333 | 0.5490 | 0.3235 | 0.4706 | 0.3725 | 0.1864 | 0.5302 | 0.2652 |
limsi-reader | 0.7391 | 0.8500 | - | 0.4250 | 0.2941 | 0.5882 | 0.4054 | - | - | - |
google-gold-input-ab | 0.7391 | 0.8500 | - | 0.4250 | 0.4706 | 0.6471 | 0.5255 | 0.3236 | 0.5773 | 0.4001 |
google-pred-input | 0.7391 | 0.8500 | - | 0.4250 | 0.3529 | 0.5882 | 0.4338 | 0.1385 | 0.2966 | 0.1806 |
google-gold-input-nq | 0.7391 | 0.8500 | - | 0.4250 | 0.4706 | 0.5882 | 0.5132 | 0.3562 | 0.6000 | 0.4315 |
google-gold-input | 0.7391 | 0.8500 | - | 0.4250 | 0.4706 | 0.7059 | 0.5495 | 0.3246 | 0.5805 | 0.4003 |
limsi-reader-UMLS-r1 | 0.7391 | 0.8500 | - | 0.4250 | 0.2059 | 0.4412 | 0.2902 | - | - | - |
Lab Zhu ,Fdan Univer | 0.7391 | 0.8500 | - | 0.4250 | 0.2353 | 0.3235 | 0.2706 | 0.1098 | 0.3540 | 0.1623 |
UNCC_QA_1 | 0.6087 | 0.7097 | 0.4000 | 0.5548 | 0.4706 | 0.7353 | 0.5833 | 0.1087 | 0.6892 | 0.1843 |
BJUTNLPGroup | 0.7391 | 0.8500 | - | 0.4250 | 0.2353 | 0.6471 | 0.3745 | 0.2273 | 0.3781 | 0.2755 |
BJUTNLPGroup_v2 | 0.7391 | 0.8500 | - | 0.4250 | 0.2353 | 0.5000 | 0.3152 | 0.1000 | 0.1504 | 0.1169 |
FACTOIDS | 0.7391 | 0.8500 | - | 0.4250 | 0.5294 | 0.7353 | 0.6103 | 0.1119 | 0.6957 | 0.1890 |
UNCC_QA3 | 0.7391 | 0.8500 | - | 0.4250 | 0.5294 | 0.7353 | 0.6103 | 0.1087 | 0.6892 | 0.1843 |
L2PS - DeepQA | 0.7391 | 0.8500 | - | 0.4250 | 0.2353 | 0.4118 | 0.2760 | - | - | - |
List only | 0.7391 | 0.8500 | - | 0.4250 | 0.0294 | 0.3235 | 0.1368 | - | - | - |
Lab Zhu,Fudan Univer | 0.7391 | 0.8500 | - | 0.4250 | 0.3235 | 0.4412 | 0.3686 | 0.2130 | 0.4600 | 0.2683 |
LabZhu,FDU | 0.7391 | 0.8500 | - | 0.4250 | 0.4412 | 0.5588 | 0.4863 | 0.2753 | 0.4777 | 0.3192 |
limsi-reader-UMLS-r2 | 0.7391 | 0.8500 | - | 0.4250 | 0.2059 | 0.5000 | 0.3309 | - | - | - |
KU-DMIS-1 | 0.7391 | 0.8125 | 0.5714 | 0.6920 | 0.5882 | 0.8235 | 0.6912 | 0.4841 | 0.4937 | 0.4539 |
KU-DMIS-2 | 0.7391 | 0.8000 | 0.6250 | 0.7125 | 0.5882 | 0.8235 | 0.6863 | 0.3828 | 0.4369 | 0.3769 |
KU-DMIS-3 | 0.8261 | 0.8947 | 0.5000 | 0.6974 | 0.5588 | 0.7941 | 0.6593 | 0.5024 | 0.4141 | 0.4068 |
KU-DMIS-4 | 0.8696 | 0.9189 | 0.6667 | 0.7928 | 0.4706 | 0.7353 | 0.5696 | 0.5024 | 0.4141 | 0.4068 |
unipi-quokka-QA-1 | 0.7391 | 0.8421 | 0.2500 | 0.5461 | 0.2059 | 0.3824 | 0.2730 | 0.1442 | 0.6193 | 0.2163 |
unipi-quokka-QA-2 | 0.7391 | 0.8421 | 0.2500 | 0.5461 | 0.2059 | 0.3824 | 0.2730 | 0.1442 | 0.6193 | 0.2163 |
unipi-quokka-QA-3 | 0.8261 | 0.8889 | 0.6000 | 0.7444 | 0.2059 | 0.3824 | 0.2730 | 0.1442 | 0.6193 | 0.2163 |
unipi-quokka-QA-4 | 0.8696 | 0.9143 | 0.7273 | 0.8208 | 0.2059 | 0.3824 | 0.2730 | 0.1442 | 0.6193 | 0.2163 |
KU-DMIS-5 | 0.7391 | 0.8125 | 0.5714 | 0.6920 | 0.4706 | 0.7941 | 0.5990 | 0.3242 | 0.5813 | 0.3714 |
bioasq_experiments | 0.6087 | 0.6897 | 0.4706 | 0.5801 | 0.0588 | 0.0882 | 0.0735 | 0.0455 | 0.0303 | 0.0364 |
BioASQ_Baseline | 0.4348 | 0.4348 | 0.4348 | 0.4348 | 0.1765 | 0.4118 | 0.2534 | 0.1878 | 0.4225 | 0.2398 |
Ideal Answers
System | Automatic: Rouge-2 | Automatic: Rouge-SU4 | Manual: Readability | Manual: Recall | Manual: Precision | Manual: Repetition |
auth-qa-1 | - | - | - | - | - | - |
auth-qa-2 | - | - | - | - | - | - |
MQ-1 | 0.4867 | 0.4976 | 3.98 | 4.52 | 4.24 | 4.30 |
MQ-2 | 0.5050 | 0.5159 | 3.93 | 4.63 | 4.24 | 4.23 |
MQ-3 | 0.4867 | 0.4976 | 3.98 | 4.52 | 4.24 | 4.30 |
MQ-4 | 0.5537 | 0.5618 | 4.04 | 4.58 | 4.22 | 4.30 |
MQ-5 | 0.4239 | 0.4340 | 4.09 | 4.56 | 4.38 | 4.45 |
auth-qa-3 | - | - | - | - | - | - |
limsi-reader | - | - | - | - | - | - |
google-gold-input-ab | - | - | - | - | - | - |
google-pred-input | - | - | - | - | - | - |
google-gold-input-nq | - | - | - | - | - | - |
google-gold-input | - | - | - | - | - | - |
limsi-reader-UMLS-r1 | - | - | - | - | - | - |
Lab Zhu ,Fdan Univer | 0.3081 | 0.3241 | 4.24 | 4.34 | 4.47 | 4.92 |
UNCC_QA_1 | - | - | - | - | - | - |
BJUTNLPGroup | 0.0433 | 0.0323 | 3.12 | 2.68 | 3.33 | 4.18 |
BJUTNLPGroup_v2 | 0.0996 | 0.0880 | 3.40 | 3.00 | 3.54 | 4.51 |
FACTOIDS | - | - | - | - | - | - |
UNCC_QA3 | - | - | - | - | - | - |
L2PS - DeepQA | - | - | - | - | - | - |
List only | - | - | - | - | - | - |
Lab Zhu,Fudan Univer | 0.3081 | 0.3241 | 4.24 | 4.34 | 4.47 | 4.92 |
LabZhu,FDU | 0.3081 | 0.3241 | 4.24 | 4.34 | 4.47 | 4.92 |
limsi-reader-UMLS-r2 | - | - | - | - | - | - |
KU-DMIS-1 | - | - | - | - | - | - |
KU-DMIS-2 | - | - | - | - | - | - |
KU-DMIS-3 | - | - | - | - | - | - |
KU-DMIS-4 | - | - | - | - | - | - |
unipi-quokka-QA-1 | 0.3872 | 0.4012 | 3.98 | 4.07 | 4.04 | 4.55 |
unipi-quokka-QA-2 | 0.3872 | 0.4012 | 3.98 | 4.07 | 4.04 | 4.55 |
unipi-quokka-QA-3 | 0.3872 | 0.4012 | 3.98 | 4.07 | 4.04 | 4.55 |
unipi-quokka-QA-4 | 0.3872 | 0.4012 | 3.98 | 4.07 | 4.04 | 4.55 |
KU-DMIS-5 | - | - | - | - | - | - |
bioasq_experiments | 0.4094 | 0.4240 | 3.44 | 4.20 | 3.92 | 4.54 |
BioASQ_Baseline | - | - | - | - | - | - |
Test batch 5
Exact Answers
System | Yes/No: Accuracy | Yes/No: F1 Yes | Yes/No: F1 No | Yes/No: Macro F1 | Factoid: Strict Acc. | Factoid: Lenient Acc. | Factoid: MRR | List: Mean Prec. | List: Recall | List: F-Measure |
auth-qa-1 | 0.5714 | 0.5161 | 0.6154 | 0.5658 | 0.0857 | 0.2286 | 0.1500 | 0.1361 | 0.4667 | 0.2075 |
auth-qa-2 | 0.5429 | 0.6667 | 0.2727 | 0.4697 | 0.0857 | 0.2286 | 0.1500 | 0.1361 | 0.4667 | 0.2075 |
Neural MSE | - | - | - | - | - | - | - | - | - | - |
First K sentences | - | - | - | - | - | - | - | - | - | - |
auth-qa-3 | 0.6286 | 0.7347 | 0.3810 | 0.5578 | 0.0857 | 0.2286 | 0.1500 | 0.1361 | 0.4667 | 0.2075 |
MQ-1 | 0.5429 | 0.7037 | - | 0.3519 | - | - | - | - | - | - |
MQ-2 | 0.5429 | 0.7037 | - | 0.3519 | - | - | - | - | - | - |
MQ-5 | 0.5429 | 0.7037 | - | 0.3519 | - | - | - | - | - | - |
MQ-3 | 0.5429 | 0.7037 | - | 0.3519 | - | - | - | - | - | - |
auth-qa-4 | 0.6286 | 0.6486 | 0.6061 | 0.6274 | 0.0857 | 0.2286 | 0.1500 | 0.1361 | 0.4667 | 0.2075 |
MQ-4 | 0.5429 | 0.7037 | - | 0.3519 | - | - | - | - | - | - |
google-gold-input-ab | 0.7143 | 0.7727 | 0.6154 | 0.6941 | 0.2286 | 0.2857 | 0.2571 | 0.1774 | 0.4175 | 0.2415 |
google-pred-input | 0.6286 | 0.7347 | 0.3810 | 0.5578 | 0.1429 | 0.2857 | 0.2057 | 0.0863 | 0.2222 | 0.1222 |
google-gold-input-nq | 0.6286 | 0.7234 | 0.4348 | 0.5791 | 0.2857 | 0.3714 | 0.3057 | 0.2218 | 0.4452 | 0.2889 |
google-gold-input | 0.6571 | 0.7500 | 0.4545 | 0.6023 | 0.2857 | 0.3714 | 0.3167 | 0.2159 | 0.4452 | 0.2824 |
Lab Zhu ,Fdan Univer | 0.5429 | 0.7037 | - | 0.3519 | 0.0286 | 0.0571 | 0.0381 | 0.1736 | 0.2925 | 0.1932 |
System B process | - | - | - | - | - | - | - | - | - | - |
BJUTNLPGroup | 0.5429 | 0.7037 | - | 0.3519 | 0.2857 | 0.4000 | 0.3381 | 0.1667 | 0.2813 | 0.2020 |
UNCC_QA_1 | 0.4857 | 0.6538 | - | 0.3269 | 0.2857 | 0.4286 | 0.3305 | 0.2051 | 0.5127 | 0.2862 |
UNCC_QA3 | 0.4857 | 0.6538 | - | 0.3269 | 0.2286 | 0.3143 | 0.2643 | 0.2051 | 0.5127 | 0.2862 |
UNCC_QA2 | 0.5429 | 0.7037 | - | 0.3519 | 0.2857 | 0.4286 | 0.3305 | 0.2051 | 0.5127 | 0.2862 |
System B+C proc | - | - | - | - | - | - | - | - | - | - |
System D+E process | - | - | - | - | - | - | - | - | - | - |
System384 B process | - | - | - | - | - | - | - | - | - | - |
System384 D+E proces | - | - | - | - | - | - | - | - | - | - |
limsi-reader | 0.5429 | 0.7037 | - | 0.3519 | 0.2000 | 0.4000 | 0.2748 | - | - | - |
Neural MSE Attention | - | - | - | - | - | - | - | - | - | - |
L2PS - DeepQA | 0.5429 | 0.7037 | - | 0.3519 | 0.0571 | 0.2571 | 0.1257 | - | - | - |
List only | 0.5429 | 0.7037 | - | 0.3519 | 0.0857 | 0.2571 | 0.1352 | - | - | - |
limsi-reader-UMLS-r2 | 0.5429 | 0.7037 | - | 0.3519 | 0.2286 | 0.4000 | 0.2890 | - | - | - |
KU-DMIS-1 | 0.8000 | 0.8444 | 0.7200 | 0.7822 | 0.2571 | 0.4571 | 0.3224 | 0.5236 | 0.3714 | 0.4202 |
Lab Zhu,Fudan Univer | 0.5429 | 0.7037 | - | 0.3519 | 0.1143 | 0.1714 | 0.1381 | 0.2278 | 0.3091 | 0.2536 |
KU-DMIS-2 | 0.7429 | 0.8000 | 0.6400 | 0.7200 | 0.2571 | 0.4571 | 0.3271 | 0.5486 | 0.3992 | 0.4468 |
KU-DMIS-3 | 0.8286 | 0.8500 | 0.8000 | 0.8250 | 0.2857 | 0.4286 | 0.3452 | 0.5653 | 0.4131 | 0.4619 |
KU-DMIS-4 | 0.7429 | 0.7805 | 0.6897 | 0.7351 | 0.2286 | 0.4571 | 0.3238 | 0.5069 | 0.3575 | 0.4051 |
KU-DMIS-5 | 0.6571 | 0.7500 | 0.4545 | 0.6023 | 0.2857 | 0.5143 | 0.3638 | 0.5050 | 0.3714 | 0.4124 |
LabZhu,FDU | 0.5429 | 0.7037 | - | 0.3519 | 0.2000 | 0.2571 | 0.2238 | 0.2347 | 0.3369 | 0.2647 |
LabZhu_FDU | 0.5429 | 0.7037 | - | 0.3519 | 0.0000 | 0.0286 | 0.0095 | 0.1971 | 0.3417 | 0.2249 |
bioasq_experiments1 | 0.5143 | 0.6667 | 0.1053 | 0.3860 | 0.0571 | 0.1143 | 0.0714 | 0.1667 | 0.0694 | 0.0972 |
bioasq_experiments2 | 0.5143 | 0.6667 | 0.1053 | 0.3860 | 0.0000 | 0.0286 | 0.0057 | 0.1250 | 0.0556 | 0.0750 |
QA1 | 0.4857 | 0.6538 | - | 0.3269 | 0.2571 | 0.3714 | 0.2938 | 0.2051 | 0.5127 | 0.2862 |
limsi-reader-UMLS-r1 | 0.5429 | 0.7037 | - | 0.3519 | 0.0000 | 0.1429 | 0.0571 | - | - | - |
unipi-quokka-QA-1 | 0.5429 | 0.6800 | 0.2000 | 0.4400 | 0.0857 | 0.1714 | 0.1152 | 0.1713 | 0.5873 | 0.2537 |
unipi-quokka-QA-2 | 0.5429 | 0.6800 | 0.2000 | 0.4400 | 0.0857 | 0.1714 | 0.1152 | 0.1713 | 0.5873 | 0.2537 |
unipi-quokka-QA-3 | 0.6857 | 0.7556 | 0.5600 | 0.6578 | 0.0857 | 0.1714 | 0.1152 | 0.1713 | 0.5873 | 0.2537 |
unipi-quokka-QA-4 | 0.7143 | 0.7727 | 0.6154 | 0.6941 | 0.0857 | 0.1714 | 0.1152 | 0.1713 | 0.5873 | 0.2537 |
unipi-quokka-QA-5 | 0.8000 | 0.8293 | 0.7586 | 0.7939 | 0.0857 | 0.1714 | 0.1152 | 0.1713 | 0.5873 | 0.2537 |
BioASQ_Baseline | 0.4857 | 0.3571 | 0.5714 | 0.4643 | 0.0571 | 0.1429 | 0.0867 | 0.2127 | 0.3619 | 0.2573 |
Ideal Answers
System | Automatic: Rouge-2 | Automatic: Rouge-SU4 | Manual: Readability | Manual: Recall | Manual: Precision | Manual: Repetition |
auth-qa-1 | - | - | - | - | - | - |
auth-qa-2 | - | - | - | - | - | - |
Neural MSE | 0.3765 | 0.3735 | 4.36 | 4.80 | 4.30 | 4.42 |
First K sentences | 0.3553 | 0.3533 | 4.31 | 4.71 | 4.23 | 4.39 |
auth-qa-3 | - | - | - | - | - | - |
MQ-1 | 0.5112 | 0.5142 | 4.31 | 4.74 | 4.20 | 4.37 |
MQ-2 | 0.5103 | 0.5126 | 4.27 | 4.76 | 4.21 | 4.33 |
MQ-5 | 0.4617 | 0.4611 | 4.40 | 4.66 | 4.21 | 4.44 |
MQ-3 | 0.5112 | 0.5142 | 4.31 | 4.74 | 4.20 | 4.37 |
auth-qa-4 | - | - | - | - | - | - |
MQ-4 | 0.5198 | 0.5218 | 4.30 | 4.72 | 4.20 | 4.36 |
google-gold-input-ab | - | - | - | - | - | - |
google-pred-input | - | - | - | - | - | - |
google-gold-input-nq | - | - | - | - | - | - |
google-gold-input | - | - | - | - | - | - |
Lab Zhu ,Fdan Univer | 0.4093 | 0.4103 | 4.52 | 4.60 | 4.40 | 4.81 |
System B process | 0.1960 | 0.1930 | 3.99 | 3.41 | 3.58 | 4.68 |
BJUTNLPGroup | 0.0722 | 0.0575 | 3.16 | 2.81 | 3.06 | 3.88 |
UNCC_QA_1 | - | - | - | - | - | - |
UNCC_QA3 | - | - | - | - | - | - |
UNCC_QA2 | - | - | - | - | - | - |
System B+C proc | 0.2066 | 0.2096 | 4.22 | 3.16 | 3.14 | 4.66 |
System D+E process | 0.1956 | 0.1858 | 4.10 | 3.36 | 3.52 | 4.82 |
System384 B process | 0.2173 | 0.2098 | 4.13 | 3.62 | 3.77 | 4.76 |
System384 D+E proces | 0.2053 | 0.1964 | 4.13 | 3.64 | 3.73 | 4.87 |
limsi-reader | - | - | - | - | - | - |
Neural MSE Attention | 0.3765 | 0.3735 | 4.36 | 4.80 | 4.30 | 4.42 |
L2PS - DeepQA | - | - | - | - | - | - |
List only | - | - | - | - | - | - |
limsi-reader-UMLS-r2 | - | - | - | - | - | - |
KU-DMIS-1 | - | - | - | - | - | - |
Lab Zhu,Fudan Univer | 0.4093 | 0.4103 | 4.52 | 4.60 | 4.40 | 4.81 |
KU-DMIS-2 | - | - | - | - | - | - |
KU-DMIS-3 | - | - | - | - | - | - |
KU-DMIS-4 | - | - | - | - | - | - |
KU-DMIS-5 | - | - | - | - | - | - |
LabZhu,FDU | 0.4093 | 0.4103 | 4.52 | 4.60 | 4.40 | 4.81 |
LabZhu_FDU | 0.4093 | 0.4103 | 4.52 | 4.60 | 4.40 | 4.81 |
bioasq_experiments1 | 0.5056 | 0.5097 | 3.93 | 4.61 | 4.10 | 4.34 |
bioasq_experiments2 | 0.5056 | 0.5097 | 3.93 | 4.61 | 4.10 | 4.34 |
QA1 | - | - | - | - | - | - |
limsi-reader-UMLS-r1 | - | - | - | - | - | - |
unipi-quokka-QA-1 | 0.4242 | 0.4234 | 4.26 | 4.54 | 4.13 | 4.48 |
unipi-quokka-QA-2 | 0.4242 | 0.4234 | 4.26 | 4.54 | 4.13 | 4.48 |
unipi-quokka-QA-3 | 0.4242 | 0.4234 | 4.26 | 4.54 | 4.13 | 4.48 |
unipi-quokka-QA-4 | 0.4242 | 0.4234 | 4.26 | 4.54 | 4.13 | 4.48 |
unipi-quokka-QA-5 | 0.4242 | 0.4234 | 4.26 | 4.54 | 4.13 | 4.48 |
BioASQ_Baseline | - | - | - | - | - | - |