BioASQ - Task Synergy
Task Synergy 2025 will begin in January 2025!
What's new in Task Synergy 2025:
- The time interval between two successive rounds is set to two weeks.
- This year, the topic of the questions in Task Synergy 2025 will be open to any developing problem, considering documents from the current version of PubMed that will be designated for each round.
What's different in Task Synergy compared to Task b?
- Experts will not extract or compose answers on their own, but they will only select from the ones submitted by the participating systems.
- In this task, the participants will receive positive and negative feedback based on their responses from the previous rounds. Based on this feedback, they can provide new responses for persisting questions that have not been answered yet, as well as for new questions or new versions of the same questions introduced in later rounds.
- In this task, the relevant documents should be from designated versions of PubMed.
- The submission format is identical to that of task b; however, the responses for both relevant material (phase A in task b) and answers (phase B in task b) should be submitted in a single file. In addition, exact and ideal answers should only be submitted for questions marked as "ready to answer", from the second round onwards. No concepts and triples will be considered in the relevant material to retrieve.
Other notes on Task Synergy
- Please note that there will be no regular training data provided for this version of Task Synergy. Instead, expert feedback will be incrementally provided based on participant responses in each round. In addition, you can use the data from the previous version of Synergy and the training data for task b that are available here.
- In each round of this task you will be asked to return results based on a snapshot of PubMed. For a detailed list of all updates in subsequent versions, please consult the accompanying README file.
- We strongly encourage you to go through the guidelines and pay attention to the format of the answers you will submit.
- Instructions on how to download and/or search the designated resources of the Task: TBD
BioASQ Synergy Guidelines
In BioASQ Synergy, a number of questions will be posed by our group of experts, and answering them from the designated version of the PubMed dataset will constitute the first round of the task. A selection of your results will be provided as gold standard (reference) items before the second round of questions, and can be used as training data for the second or any subsequent round. These results will be provided without provenance of which system(s) submitted them, but will be annotated by the group of experts.
This process, of providing annotated results (feedback) along with the persisting and/or new questions, will be applied before each round.
Please note that either before the first round, or at any subsequent point, the PubMed dataset can be used by the systems for training purposes.
In each round, BioASQ Synergy will provide test questions, in English, along with gold standard (reference) items to the questions of the previous round (from round 2 onwards), if any. The test questions are being constructed by a team of biomedical experts from around Europe.
Unlike classic BioASQ challenges, in BioASQ Synergy there will be no phases, but there will be Rounds. In each round we will release test questions, and systems will respond with relevant articles and relevant snippets from the title or abstract of the relevant articles only. Please note that, in this version of the task, the full text of the articles, even if available, is not considered.
Questions may persist from round to round either intact (if the biomedical experts are unsatisfied with previous responses) or modified / versioned (if the biomedical experts are further informed by previous results). Two experts may pose the same question in a round, therefore some questions may be included in the testsets twice (with different ids). This is to capture the case where the two experts may have different feedback for the same response submitted by the participating systems for the same question.
If a question is designated as "ready to answer" then systems will respond with exact answers (e.g., named entities in the case of factoid questions) and ideal answers (paragraph-sized summaries), both in English.
There will be a total of four rounds. Systems may participate in any or, ideally, all rounds.
The rest of the guidelines are organized in sections. You can expand a section by clicking on it.
+ Types of questions
- Yes/no questions: These are questions that, strictly speaking, require "yes" or "no" answers, though of course in practice longer answers will often be desirable. For example, "Do CpG islands colocalise with transcription start sites?" is a yes/no question.
- Factoid questions: These are questions that, strictly speaking, require a particular entity name (e.g., of a disease, drug, or gene), a number, or a similar short expression as an answer, though again a longer answer may be desirable in practice. For example, "Which virus is best known as the cause of infectious mononucleosis?" is a factoid question.
- List questions: These are questions that, strictly speaking, require a list of entity names (e.g., a list of gene names), numbers, or similar short expressions as an answer; again, in practice additional information may be desirable. For example, "Which are the Raf kinase inhibitors?" is a list question.
- Summary questions: These are questions that do not belong in any of the previous categories and can only be answered by producing a short text summarizing the most prominent relevant information. For example, "What is the treatment of infectious mononucleosis?" is a summary question.
+ Required Answers in Task Synergy
- A list of at most 10 relevant articles (documents) d_i,1, d_i,2, d_i,3, ... from the designated version of the MEDLINE/PubMed dataset. The list should be ordered by decreasing confidence, i.e., d_i,1 should be the article that the system considers most relevant to the question q_i, d_i,2 should be the article that the system considers second most relevant, etc. A single article list will be returned per question and participating system. The returned article list will contain unique article identifiers.
- A list of at most 10 relevant text snippets s_i,1, s_i,2, s_i,3, ... from the title or abstract of the returned articles. Again, the list should be ordered by decreasing confidence. A single snippet list will be returned per question and participating system, and the list may contain any number of snippets (or none) from any of the returned articles d_i,1, d_i,2, d_i,3, ... Each snippet will be represented by the unique identifier of the article it comes from, the identifier of the section the snippet starts in, the offset of the first character of the snippet in that section, the identifier of the section the snippet ends in, and the offset of the last character of the snippet in that section. The snippets themselves will also have to be returned (as strings); see the sketch after this list.
- For each question designated as "ready to answer", each participating system may return an ideal answer, i.e., a paragraph-sized summary of relevant information. In the case of yes/no, factoid, and list questions, the systems may also return exact answers; for summary questions, no exact answers will be returned.
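To make the snippet representation concrete, here is a minimal Python sketch that builds one entry of the "snippets" list from a retrieved article. The article dictionary and the chosen snippet string are hypothetical, and the end-offset computation follows the wording above (offset of the last character of the snippet); adjust it if the organizers' convention differs.

import json

# Hypothetical retrieved article (identifier, title, abstract) and candidate snippet.
article = {
    "pmid": "34312178",
    "title": "Diagnostic testing for COVID-19.",
    "abstract": ("The most commonly used diagnostic tests during the COVID-19 pandemic "
                 "are polymerase chain reaction (PCR) tests. Antigen tests are also used."),
}
snippet_text = ("The most commonly used diagnostic tests during the COVID-19 pandemic "
                "are polymerase chain reaction (PCR) tests.")

section = "abstract"  # snippets may only come from the "title" or "abstract" section
start = article[section].find(snippet_text)   # offset of the first character in the section
end = start + len(snippet_text) - 1           # offset of the last character, per the guideline wording

snippet_entry = {
    "document": article["pmid"],
    "beginSection": section,
    "offsetInBeginSection": start,
    "endSection": section,
    "offsetInEndSection": end,
    "text": snippet_text,
}
print(json.dumps(snippet_entry, indent=2))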
Exact Answers
- For each yes/no question, the exact answer of each participating system will have to be either "yes" or "no". Although the value "" will be used in the feedback files for yes/no questions to indicate that they do not have a golden answer yet, the value "" is not accepted in the submission files.
- For each factoid question, each participating system will have to return a list* of up to 5 entity names (e.g., up to 5 names of drugs), numbers, or similar short expressions, ordered by decreasing confidence.
- For each list question, each participating system will have to return a single list* of entity names, numbers, or similar short expressions, jointly taken to constitute a single answer (e.g., the most common symptoms of a disease). The returned list will have to contain no more than 100 entries of no more than 100 characters each.
- No exact answers will be returned for summary questions.
Ideal Answers
For each question (yes/no, factoid, list, summary), each participating system may also return an ideal answer, i.e., a single paragraph-sized text ideally summarizing the most relevant information from articles and snippets. Each returned "ideal" answer is intended to approximate a short text that a biomedical expert would write to answer the corresponding question (e.g., including prominent supportive information), whereas the "exact" answers are only "yes"/"no" responses, entity names or similar short expressions, or lists of entity names and similar short expressions; there are no "exact" answers in the case of summary questions. The maximum allowed length of each "ideal" answer is 200 words.*
Important note: Relevant material already inspected in previous rounds should not be re-submitted in later rounds. That is, specific documents and snippets available in the feedback file should not be submitted again for the same persisting question (with the same question id). Such items will not be considered in the evaluation of later rounds. For answers, on the other hand, no such restriction holds. Participants are expected to submit their complete exact and ideal answers, regardless of whether they are included, in part or as a whole, in the feedback of previous rounds. This is because the golden answer for a question may change from round to round, as new relevant material becomes available. Being able to identify such changes is part of the challenge.
* Please consult the section "JSON format of the datasets" for specific details regarding the required submission format.
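A simple way to respect the restriction above is to drop from a new submission any document or snippet that already appears in the feedback file for the same question id, while leaving exact and ideal answers untouched. A minimal Python sketch, assuming hypothetical file names for the feedback and submission files:

import json

# Hypothetical file names for the previous feedback and the new submission.
with open("feedback_round2.json") as f:
    feedback = json.load(f)
with open("submission_round3.json") as f:
    submission = json.load(f)

# Material already inspected by the experts, keyed by question id.
seen_docs = {q["id"]: {d["id"] for d in q.get("documents", [])}
             for q in feedback["questions"]}
seen_snippets = {q["id"]: {(s["document"], s["offsetInBeginSection"], s["offsetInEndSection"])
                           for s in q.get("snippets", [])}
                 for q in feedback["questions"]}

for q in submission["questions"]:
    qid = q["id"]
    q["documents"] = [d for d in q.get("documents", [])
                      if d not in seen_docs.get(qid, set())]
    q["snippets"] = [s for s in q.get("snippets", [])
                     if (s["document"], s["offsetInBeginSection"], s["offsetInEndSection"])
                     not in seen_snippets.get(qid, set())]
    # Exact and ideal answers are left as they are: complete answers are re-submitted each round.

with open("submission_round3_filtered.json", "w") as f:
    json.dump(submission, f, indent=2)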
+ Test dataset and evaluation process
- Monday, Jan 13, 2025, 13:00 GMT: Questions of round 1 released, answers of round 1 due within 72 hours.
- Monday, Jan 27, 2025, 13:00 GMT: Questions of round 2 released, along with feedback for previous rounds, answers of round 2 due within 72 hours.
- Monday, Feb 10, 2025, 13:00 GMT: Questions of round 3 released, along with feedback for previous rounds, answers of round 3 due within 72 hours.
- Monday, Feb 24, 2025, 13:00 GMT: Questions of round 4 released, along with feedback for previous rounds, answers of round 4 due within 72 hours.
+ Designated resources for Synergy
The relevant snippets will have to be parts of the title or abstract of the relevant articles.
You can download the MEDLINE/PubMed database directly from NLM. In particular, the designated snapshot version for each round will consist of the Annual Baseline Repository for 2025, updated according to all the Daily Update Files released until the designated snapshot date of the round. In addition, a service will also be available for keyword-searching the designated version of the MEDLINE/PubMed database and will be properly updated between rounds. Instructions on how to access the service that the organizers provide to search the designated resources are available here.
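If you prefer to work with the downloaded baseline and update files directly, the following Python sketch shows one way to pull PMIDs, titles, and abstracts out of a MEDLINE/PubMed XML file so they can be indexed locally. The file name is an assumption; verify the exact layout of the files against the NLM documentation and the accompanying README.

import gzip
import xml.etree.ElementTree as ET

# Hypothetical local file from the 2025 baseline or the daily update files.
path = "pubmed25n0001.xml.gz"

with gzip.open(path, "rb") as f:
    root = ET.parse(f).getroot()

for citation in root.iter("MedlineCitation"):
    pmid = citation.findtext("PMID")
    article = citation.find("Article")
    if article is None:
        continue
    title = article.findtext("ArticleTitle") or ""
    # Abstracts may be split over several (possibly labelled) AbstractText elements.
    abstract = " ".join(
        "".join(t.itertext()) for t in article.findall("Abstract/AbstractText")
    )
    # Feed (pmid, title, abstract) to the retrieval engine of your choice here.
    print(pmid, title[:60])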
+ JSON format of the datasets
{ "questions": [ { "body": "Which diagnostic test is approved for coronavirus infection screening?", "type": "factoid", "id": "5e5b8170b761aafe09000010", "answerReady": true }, { "body": "Is the FIP virus thought to be a mutated strain for the Feline enteric Coronavirus?", "type": "yesno", "id": "5e3ebaa348dab47f2600000a", "answerReady": false }, { "body": "What animal is thought to be the host for the Coronavirus causing MERS?", "type": "factoid", "id": "5e2f4a8bfbd6abf43b00002a", "answerReady": true }, { "body": "Please list 2 human diseases caused by coronavirus.", "type": "list", "id": "5e2f43bafbd6abf43b000029", "answerReady": false } ] }
The JSON format for the submissions should include all the questions of the corresponding testset, and each question should have all the corresponding fields with adequate values, based on the type of the question. If no responses are available or acceptable (i.e., "answer_ready": false), the fields may have empty values (e.g., [] for arrays and "" for strings).
The "documents" list should conaind the
pmid
as unique identifier for each article.
A submission file for the above testset would look like this:
{ "questions": [ { "body": "Which diagnostic test is approved for coronavirus infection screening?", "id": "5e5b8170b761aafe09000010", "type": "factoid", "documents": [ "34312178", "36781712", "26783383", ........ ........ ], "snippets": [ { "document": "34312178", "offsetInBeginSection": 0, "offsetInEndSection": 131, "text": "The most commonly used diagnostic tests during the COVID-19 pandemic are polymerase chain reaction (PCR) tests.", "beginSection": "abstract", "endSection": "abstract" }, { "document": "36781712", "offsetInBeginSection": 86, "offsetInEndSection": 172, "text": "PCR screening is the adopted diagnostic testing method for COVID-19 detection.", "beginSection": "abstract", "endSection": "abstract" }, { "document": "26783383", "offsetInBeginSection": 752, "offsetInEndSection": 814, "text": "Some viruses including human coronavirus have been isolated from patients with Kawasaki disease.", "beginSection": "abstract", "endSection": "abstract" }, ........ ........ ], "answer_ready": true, "ideal_answer": "Polymerase chain reaction (PCR) tests are the most commonly used diagnostic tests during the COVID-19 pandemic.", "exact_answer": [ [ "respiratory syndrome coronavirus" ], [ "east respiratory syndrome" ], ........ ........ ] }, ........ ........ ] }
Finally, this is an example of the JSON format that will be used for feedback between rounds.
{ "questions": [ { "body": "Which diagnostic test is approved for coronavirus infection screening?", "type": "factoid", "id": "5e5b8170b761aafe09000010", "documents": [ { "id": "34312178", "golden": true }, { "id": "36781712", "golden": true }, { "id": "26783383", "golden": false }, ........ ........ ], "snippets": [ { "document": "34312178", "offsetInBeginSection": 0, "offsetInEndSection": 131, "text": "The most commonly used diagnostic tests during the COVID-19 pandemic are polymerase chain reaction (PCR) tests.", "beginSection": "abstract", "endSection": "abstract", "golden": true }, { "document": "36781712", "offsetInBeginSection": 86, "offsetInEndSection": 172, "text": "PCR screening is the adopted diagnostic testing method for COVID-19 detection.", "beginSection": "abstract", "endSection": "abstract", "golden": true }, { "document": "26783383", "offsetInBeginSection": 752, "offsetInEndSection": 814, "text": "Some viruses including human coronavirus have been isolated from patients with Kawasaki disease.", "beginSection": "abstract", "endSection": "abstract", "golden": false }, ........ ........ ], "answer_ready": true, "ideal_answer": [ "Polymerase chain reaction (PCR) tests are the most commonly used diagnostic tests during the COVID-19 pandemic.", "PCR screening is the adopted diagnostic testing method for COVID-19 detection.", ........ ........ ], "exact_answer": [ "Polymerase chain reaction", "PCR", ........ ........ ] } ] }
In the case of factoid questions, the "exact_answer" field is a list of lists. In the submission files, each inner list (up to 5 inner lists are allowed) should contain the name of the entity (or number, or other similar short expression) sought by the question. No multiple names (synonyms) should be submitted for any entity; therefore, each inner list should only contain one element. If an inner list contains more than one element, only the first element will be taken into account for evaluation.
In the case of list questions, the "exact_answer" field is also a list of lists. Each element of the outermost list is a list corresponding to one of the entities (or numbers, or other similar short expressions) sought by the question. In the submission files, no multiple names (synonyms) should be submitted for any entity; therefore, each inner list should only contain one element. If an inner list contains more than one element, only the first element will be taken into account for evaluation.
If any of the sought entities has multiple names (synonyms), the corresponding inner list should only contain *one* of them. In the following example, the exact golden answer to the list question (in the feedback file) contains three entities, and the second entity has two names, i.e., "influenza" and "grippe":
"exact_answer": [["pneumonia"], ["influenza", "grippe"], ["bronchitis"]]However, the submitted answer by the participants should be one of the following:
"exact_answer": [["pneumonia"], ["influenza"], ["bronchitis"]]
or "exact_answer": [["pneumonia"], ["grippe"], ["bronchitis"]]Since, golden answer contains both synonyms, both answers are equivalent for evaluation.
Also note that "ideal_answer" in feedback data is a list, so that all golden ideal answers are available for training, if many. However, each system is expected to submit only one ideal answer per question as a string, as described in JSON format above. When submitting results, participants will use the same JSON format. The "text" field in the elements of the snippet list is also required.
+ Systems
Attention: Trying to upload results without selecting a system will cause an error and the results will not be saved.