BioASQ Participants Area
Task 14b: Test Results of Phase B
The test results are presented in separate tables for each type of annotation. The "System Description" of each system is used.The evaluation measures that are used in Task B are presented here .
Warning: For ideal answers, good ROUGE results do not always imply good manual scores.
Test batch 1
Exact Answers
| Yes/No | Factoid | List | ||||||||
|---|---|---|---|---|---|---|---|---|---|---|
| System | Accuracy | F1 Yes | F1 No | Macro F1 | Strict Acc. | Lenient Acc. | MRR | Mean Prec. | Recall | F-Measure |
Ideal Answers
| Automatic scores (Rouge - R) | Manual scores | |||||||
|---|---|---|---|---|---|---|---|---|
| System | R-2 (Rec) | R-2 (F1) | R-SU4 (Rec) | R-SU4 (F1) | Readability | Recall | Precision | Repetition |
Test batch 2
Exact Answers
| Yes/No | Factoid | List | ||||||||
|---|---|---|---|---|---|---|---|---|---|---|
| System | Accuracy | F1 Yes | F1 No | Macro F1 | Strict Acc. | Lenient Acc. | MRR | Mean Prec. | Recall | F-Measure |
Ideal Answers
| Automatic scores (Rouge - R) | Manual scores | |||||||
|---|---|---|---|---|---|---|---|---|
| System | R-2 (Rec) | R-2 (F1) | R-SU4 (Rec) | R-SU4 (F1) | Readability | Recall | Precision | Repetition |
Test batch 3
Exact Answers
| Yes/No | Factoid | List | ||||||||
|---|---|---|---|---|---|---|---|---|---|---|
| System | Accuracy | F1 Yes | F1 No | Macro F1 | Strict Acc. | Lenient Acc. | MRR | Mean Prec. | Recall | F-Measure |
Ideal Answers
| Automatic scores (Rouge - R) | Manual scores | |||||||
|---|---|---|---|---|---|---|---|---|
| System | R-2 (Rec) | R-2 (F1) | R-SU4 (Rec) | R-SU4 (F1) | Readability | Recall | Precision | Repetition |
Test batch 4
Exact Answers
| Yes/No | Factoid | List | ||||||||
|---|---|---|---|---|---|---|---|---|---|---|
| System | Accuracy | F1 Yes | F1 No | Macro F1 | Strict Acc. | Lenient Acc. | MRR | Mean Prec. | Recall | F-Measure |
Ideal Answers
| Automatic scores (Rouge - R) | Manual scores | |||||||
|---|---|---|---|---|---|---|---|---|
| System | R-2 (Rec) | R-2 (F1) | R-SU4 (Rec) | R-SU4 (F1) | Readability | Recall | Precision | Repetition |