Description
Large language models (LLMs) are emerging as a new interface between researchers, scientific data, and computational tools. In materials science, they offer opportunities to simplify access to complex workflows, accelerate data-driven research, and support inverse materials design. However, the reliability and scientific utility of LLMs depend critically on the availability of standardized workflows and well-structured research data.
In this contribution, we present LangSim [1], an LLM-based interface for materials simulation workflows built on the pyiron workflow framework [2]. Rather than generating simulation code directly, LLM agents interact with validated scientific workflows, enabling robust execution of simulations and automated analysis of materials properties. Beyond forward simulations, these workflows can be combined with statistical and machine-learning models to identify candidate materials that satisfy target property requirements.
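The idea of letting an agent call validated workflows instead of writing simulation code can be sketched as a small tool registry. This is a minimal illustration under stated assumptions, not the LangSim or pyiron API: the names `WORKFLOWS`, `register`, `dispatch`, and `equilibrium_volume` are hypothetical, and the energy curve is a toy placeholder rather than a real simulation.

```python
from typing import Callable, Dict

# Hypothetical registry: the only operations an agent is allowed to trigger.
WORKFLOWS: Dict[str, Callable[..., float]] = {}


def register(name: str):
    """Decorator that exposes a vetted workflow under a fixed name."""
    def decorator(func: Callable[..., float]) -> Callable[..., float]:
        WORKFLOWS[name] = func
        return func
    return decorator


@register("equilibrium_volume")
def equilibrium_volume(v_min: float, v_max: float, steps: int = 101) -> float:
    """Toy stand-in for a simulation workflow (assumption, not a real model).

    Scans a placeholder energy-volume curve on a uniform grid and returns
    the grid volume with the lowest energy.
    """
    def energy(volume: float) -> float:
        # Placeholder parabola with an arbitrary minimum at 16.0.
        return (volume - 16.0) ** 2

    volumes = [v_min + i * (v_max - v_min) / (steps - 1) for i in range(steps)]
    return min(volumes, key=energy)


def dispatch(request: dict) -> float:
    """Run a structured request, as an LLM tool call might produce it.

    Requests naming anything outside the registry are rejected, so the agent
    can select and parameterize workflows but never execute arbitrary code.
    """
    name = request["workflow"]
    if name not in WORKFLOWS:
        raise KeyError(f"Unknown workflow: {name}")
    return WORKFLOWS[name](**request.get("arguments", {}))


result = dispatch(
    {"workflow": "equilibrium_volume", "arguments": {"v_min": 10.0, "v_max": 20.0}}
)
```

In this sketch the language model only produces the structured request; the numerical work stays inside code that was written and validated beforehand.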
A key enabler for this approach is the standardization of workflows through the Python Workflow Definition (PWD) [3], an interoperable workflow representation that supports workflow exchange between pyiron, jobflow, and AiiDA. By separating scientific intent from implementation details, PWD provides a structured interface between workflows, research data, and AI agents, improving reproducibility, interoperability, and reuse.
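Separating scientific intent from implementation can be illustrated by describing a workflow as plain data (nodes and edges) and leaving execution to whichever engine reads it. The sketch below is an assumption-laden toy: the JSON layout, the field names, and the `FUNCTIONS` and `run` helpers are hypothetical and should not be read as the actual PWD schema or as the pyiron, jobflow, or AiiDA APIs.

```python
import json

# Hypothetical workflow description: what to compute, not how to run it.
workflow_json = """
{
  "nodes": [
    {"id": 0, "type": "input", "name": "a", "value": 2},
    {"id": 1, "type": "input", "name": "b", "value": 3},
    {"id": 2, "type": "function", "value": "add"},
    {"id": 3, "type": "function", "value": "square"},
    {"id": 4, "type": "output", "name": "result"}
  ],
  "edges": [
    {"source": 0, "target": 2, "targetPort": "x"},
    {"source": 1, "target": 2, "targetPort": "y"},
    {"source": 2, "target": 3, "targetPort": "x"},
    {"source": 3, "target": 4, "targetPort": null}
  ]
}
"""

# One possible "engine": maps function names in the description to code.
FUNCTIONS = {
    "add": lambda x, y: x + y,
    "square": lambda x: x * x,
}


def run(workflow: dict) -> dict:
    """Evaluate the graph by recursively resolving each node's inputs."""
    nodes = {node["id"]: node for node in workflow["nodes"]}
    cache = {}

    def evaluate(node_id: int):
        if node_id in cache:
            return cache[node_id]
        node = nodes[node_id]
        incoming = [e for e in workflow["edges"] if e["target"] == node_id]
        if node["type"] == "input":
            value = node["value"]
        elif node["type"] == "function":
            kwargs = {e["targetPort"]: evaluate(e["source"]) for e in incoming}
            value = FUNCTIONS[node["value"]](**kwargs)
        else:  # output node: forward the value of its single source
            value = evaluate(incoming[0]["source"])
        cache[node_id] = value
        return value

    return {
        node["name"]: evaluate(node["id"])
        for node in workflow["nodes"]
        if node["type"] == "output"
    }


outputs = run(json.loads(workflow_json))
```

Because the description is plain JSON, a different engine could execute the same graph with its own scheduler, and an AI agent could inspect or assemble it without touching the underlying implementation.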
Our results highlight how standardized workflows can serve as a foundation for AI-assisted materials research. By linking research data to the workflows that generated it and exposing these workflows through natural-language interfaces, LLM agents can help researchers access, combine, and automate computational tools while maintaining transparency and reproducibility. This provides a pathway towards integrating AI agents with broader materials research infrastructures, autonomous laboratories, and digital twins.