All deadlines are 11:59 pm UTC-12 (“Anywhere on Earth”).
New: This year, the workshop focuses on developing model evaluation and human evaluation strategies for multitask, multilingual, and multimodal scenarios, with special consideration for low-resource and highly distant languages. Other key topics include designing evaluation metrics, creating adequate evaluation data, and reporting results correctly.
Fair evaluations and comparisons are of fundamental importance to the NLP community for properly tracking progress, especially during the current deep learning revolution, with new state-of-the-art results reported at ever-shorter intervals. This concerns the creation of benchmark datasets that cover typical use cases and the blind spots of existing systems, the design of metrics for evaluating NLP systems along different dimensions, and the reporting of evaluation results in an unbiased manner.
While some workshops (e.g., the Metrics shared tasks at WMT, NeuralGen, HumEval, EvalNLGEval, GEM, and New Frontiers in Summarization) have tackled certain aspects of NLP evaluation, recent advances have produced general-purpose models that handle multiple tasks (e.g., language understanding, summarization, dialogue, question answering, and reasoning) across multiple languages and modalities. This progress has introduced new challenges, such as the need for robust evaluation methods, diverse evaluation datasets, and reliable result reporting, and there is a growing demand for evaluation strategies that address multitask, multilingual, and multimodal scenarios.

The first workshop in the series, Eval4NLP’20 (collocated with EMNLP’20), took a broad and unifying perspective on the subject. The second (Eval4NLP’21, collocated with EMNLP’21), third (Eval4NLP’22, collocated with AACL’22), and fourth (Eval4NLP’23, collocated with AACL’23) workshops extended this perspective. The fifth Eval4NLP workshop aims to promote model evaluation and human evaluation strategies for these recent, more complex settings.
Further topics of interest of the workshop include, but are not limited to, those described in the call for papers, which also lists reference papers.
Email: eval4nlp@gmail.com