Accelerating Material Inverse Design in High-Dimensional Continuous Spaces with Autonomous Machine Learning Ecosystems
thesis
posted on 2025-08-01, 00:00 authored by Suvo Banik
Modeling and designing materials at the nanoscale is essential for advancing new technologies in crucial areas such as energy, electronics, and catalysis. By gaining control over structure at the atomic level, we can now pursue inverse design, where the goal is to start from a desired property and determine the atomic structures of materials that can achieve it. However, this process is far from straightforward. The design space is vast, complex, and often difficult to explore. Traditional physics-based methods such as Density Functional Theory (DFT) and Molecular Dynamics (MD) continue to provide high accuracy in exploring the material design space at the nanoscale, but they are typically too slow or computationally expensive when applied to the enormous range of possible material compositions and configurations. In recent years, machine learning (ML) and artificial intelligence (AI) have emerged as promising tools. These techniques can create fast, predictive models and generate new design strategies, helping reduce computational cost while still delivering accurate and robust predictions. Despite these advances, building a fully autonomous system for materials discovery remains a major challenge. It requires the integration of multiple steps, including conceiving a material idea and searching for optimal candidates, gathering quality data for efficient learning, learning meaningful physical patterns, constructing reliable models, evaluating phase stability, and identifying viable synthesis routes, all within a coherent and intelligent workflow.
This thesis takes on these challenges by creating and applying machine learning frameworks across major computational materials research themes crucial for building an effective and sustainable materials ecosystem. First, it explores smart ways to sample large design spaces of materials using reinforcement learning. This is applied to both discrete problems, such as optimizing defects in 2D materials, and continuous problems, such as predicting crystal structures for the inverse design of superhard materials or tuning mechanical parts as a continuum-scale modeling problem. Tests on standard benchmark problems show that the proposed approach often performs better and faster, especially in complex non-convex search spaces with many local optima. Second, the thesis introduces learning methods for generating representations of atomic environments. These representations are essential for a machine learning or AI model to effectively learn the physics behind any physical phenomenon in materials. A graph-based neural network architecture is introduced that learns symmetry-aware, noise-insensitive, unique, and robust features for material characterization. These features help us better understand phase changes and link structure to properties in materials such as porous frameworks like zeolites, nanoparticles, and supercritical fluids. Third, the thesis delves into the development of foundational models and their testing, particularly through the use of machine learning-based interatomic potentials, including Gaussian Approximation Potentials (GAP) for elemental nanoclusters and more advanced equivariant models such as MACE for 2D transition metal dichalcogenides. These models can predict energy, forces, and structural behavior very accurately, offering faster and more scalable alternatives to traditional physics-based approaches.
These models can serve as accurate and computationally efficient predictors of materials properties across a wide thermodynamic window. Fourth and finally, the thesis applies these interatomic models in molecular simulations to study complex phase behaviors and dynamics at the nanoscale, such as metastable phase formation in amorphous ice, low-friction (superlubric) transitions, changes in grain boundaries in ice films, and metal–insulator transitions in certain oxides. Using symbolic regression and data-driven techniques, the thesis extracts simple and interpretable models that explain and predict these complex behaviors.
Building on these capabilities, the final section of this thesis envisions an autonomous materials design and discovery ecosystem. At its core is the integration of large language models (LLMs) as AI agents serving as cognitive orchestrators, able to parse literature, generate simulation workflows, reason across symbolic and numerical domains, and iteratively improve based on experimental or computational feedback. These agents go beyond automation to function as scientific collaborators, supporting hypothesis generation, synthesis planning, and cross-domain optimization. Proposed applications include explainable phase prediction in zeolites, optimization of high-entropy alloys, and reinforcement learning-based recipe generation for autonomous synthesis. By interfacing with simulations, databases, and experimental data, these agents can enable real-time, closed-loop materials discovery.
Language
en
Advisor
Subramanian KRS Sankaranarayanan
Department
Mechanical and Industrial Engineering
Degree Grantor
University of Illinois Chicago
Degree Level
Doctoral
Degree name
PhD, Doctor of Philosophy
Committee Member
Constantine M. Megaridis
Valeria Molinero
Pierre Darancet
Sushant Anand