A Japanese Corpus of Many Specialized Domains for Word Segmentation and Part-of-Speech Tagging. Shohei Higashiyama, Masao Ideuchi, Masao Utiyama, Yoshiaki Oida, Eiichiro Sumita
Assessing Resource-Performance Trade-off of Natural Language Models using Data Envelopment Analysis. Zachary Zhou, Alisha Zachariah, Devin Conathan, Jeffery Kline
From COMET to COMES – Can Summary Evaluation Benefit from Translation Evaluation? Mateusz Krubiński, Pavel Pecina
Better Smatch = Better Parser? AMR evaluation is not so simple anymore. Juri Opitz, Anette Frank
GLARE: Generative Left-to-right AdversaRial Examples. Ryan Andrew Chi, Nathan Kim, Patrick Liu, Zander Lack, Ethan A Chi
Random Text Perturbations Work, but not Always. Zhengxiang Wang
A Comparative Analysis of Stance Detection Approaches and Datasets. Parush Gera, Tempestt Neal
Why sentence similarity benchmark is not predictive of application-oriented task performance? Kaori Abe, Sho Yokoi, Tomoyuki Kajiwara, Kentaro Inui
Chat Translation Error Detection for Assisting Cross-lingual Communications. Yunmeng Li, Jun Suzuki, Makoto Morishita, Kaori Abe, Ryoko Tokuhisa, Ana Brassard, Kentaro Inui
Evaluating the role of non-lexical markers in GPT-2’s language modeling behavior. Roberta Rocca
Assessing Neural Referential Form Selectors on a Realistic Multilingual Dataset. Guanyi Chen, Fahime Same, Kees Van Deemter