MaterialsAtlas Benchmark
MatSciBench: Materials Science Reasoning Benchmark
Keywords: materials science, benchmark, large language models, reasoning, quantitative analysis, symbolic analysis, multimodal analysis, AI evaluation
MatSciBench is a college-level benchmark for evaluating the reasoning capabilities of large language models in materials science. It comprises 1,340 problems spanning quantitative, symbolic, and multimodal question answering, and provides reference solutions.
Citation: Zhang, Junkai, Jingru Gan, Xiaoxuan Wang, Zian Jia, Changquan Gu, Jianpeng Chen, Yanqiao Zhu et al. "MatSciBench: Benchmarking the Reasoning Ability of Large Language Models in Materials Science." arXiv preprint arXiv:2510.12171 (2025).