MaterialsAtlas Benchmark

MatQnA: A Benchmark Dataset for Multi-modal LLMs in Materials Characterization

Tags: materials science, benchmark dataset, multi-modal LLMs, characterization, XPS, XRD, SEM, TEM

MatQnA is the first multi-modal benchmark dataset designed for evaluating large language models (LLMs) in materials characterization and analysis. It covers ten mainstream characterization methods and includes both multiple-choice and subjective question-answer pairs, constructed using a hybrid LLM and human-in-the-loop approach.

Citation: Weng, Yonghao, Liqiang Gao, Linwu Zhu, and Jian Huang. "MatQnA: A Benchmark Dataset for Multi-modal Large Language Models in Materials Characterization and Analysis." arXiv preprint arXiv:2509.11335 (2025).

Type: Benchmark
Domain: Characterization & Analysis
License: Not specified
Contributors: Not specified