MaterialsAtlas Synthesis Resource
LeMat-Synth: A Multi-Modal Toolbox for Curating Synthesis Procedure Databases
materials science, synthesis procedures, large language models, vision language models, data curation, open source, database, machine learning
This paper introduces LeMat-Synth, a multi-modal toolbox that uses large language models (LLMs) and vision language models (VLMs) to automatically extract synthesis procedures and performance data from scientific literature. It presents a dataset of 81,000 papers covering 35 synthesis methods and 16 material classes, structured by a materials science ontology. The toolbox includes an open-source software library that the community can extend, with the aim of transforming unstructured literature into machine-readable information for predictive modeling.
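To make "machine-readable" concrete, here is a minimal sketch of what one structured synthesis record could look like. It is not LeMat-Synth's actual schema or API: the class names, field names, and example values below are all hypothetical, chosen only to illustrate a record tagged with a material class and a synthesis method.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Optional


@dataclass
class SynthesisStep:
    """One step of a procedure (hypothetical structure, not the paper's)."""
    action: str                                      # e.g. "mix", "calcine"
    conditions: dict = field(default_factory=dict)   # free-form parameters


@dataclass
class SynthesisRecord:
    """A single extracted procedure (field names are assumptions)."""
    material: str
    material_class: str        # would be one of the ontology's material classes
    synthesis_method: str      # would be one of the ontology's synthesis methods
    steps: list = field(default_factory=list)
    source_doi: Optional[str] = None


def to_json(record: SynthesisRecord) -> str:
    """Serialize a record, including nested steps, to a JSON string."""
    return json.dumps(asdict(record), sort_keys=True)


# Illustrative values only; they are not taken from the dataset.
record = SynthesisRecord(
    material="TiO2",
    material_class="oxide",
    synthesis_method="sol-gel",
    steps=[
        SynthesisStep("mix", {"solvent": "ethanol"}),
        SynthesisStep("calcine", {"temperature_C": 450}),
    ],
)
print(to_json(record))
```

Records in a flat, typed form like this can be filtered by method or material class and passed directly to a predictive model, which is the kind of downstream use the summary describes.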