MaterialsAtlas Benchmark
MatQnA: A Benchmark Dataset for Multi-modal LLMs in Materials Characterization
materials science, benchmark dataset, multi-modal LLMs, characterization, XPS, XRD, SEM, TEM
MatQnA is the first multi-modal benchmark dataset designed for evaluating large language models (LLMs) in materials characterization and analysis. It covers ten mainstream characterization methods, including XPS, XRD, SEM, and TEM, and contains both multiple-choice and subjective question-answer pairs, constructed using a hybrid approach that combines LLMs with a human in the loop.
Citation: Weng, Yonghao, Liqiang Gao, Linwu Zhu, and Jian Huang. "MatQnA: A Benchmark Dataset for Multi-modal Large Language Models in Materials Characterization and Analysis." arXiv preprint arXiv:2509.11335 (2025).