Welcome to our website. Our group, based in the Department of Chemistry and Materials Science at Aalto University, Finland, is led by Dr. Miguel Caro as Principal Investigator. We carry out academic research on computer simulation of materials using atomistic models. Browse the website to find out more about our work.

Announcements and latest news
Looking for stable iron nanoparticles
Miguel Caro, June 26, 2023
This research is published in R. Jana and M.A. Caro, “Searching for iron nanoparticles with a general-purpose Gaussian approximation potential”, Phys. Rev. B 107, 245421 (2023) [also available from the arXiv]. Reprinted figures are copyright (c) 2023 of the American Physical Society. The video is copyright (c) 2023 of M.A. Caro.
In the field of catalysis it is common to use rare metals because of their superior catalytic properties. For example, platinum and Pt-like metals show the best performance for water splitting, but are too scarce and expensive to be used for many industrial-scale purposes. Instead, research is intensifying into alternative solutions based on widely available and cheap materials, especially metallic compounds. (For an overview of different materials specifically for water splitting, see, e.g., the review by Wang et al.)
Of all metallic elements in the Earth’s crust, only aluminium is more abundant than iron. Both metals can be used for structural purposes. However, iron is easier to mine and can be used to make a huge variety of steel alloys with widely varying specifications. For these reasons, iron ore constitutes almost 95% of all industrially mined metal globally. Because iron is such an abundant and readily available commodity, the prospect of replacing critical metals with it is very attractive. This includes developing new Fe-based materials for catalysis.
Some of the main aspects (besides cost and availability) to consider when assessing the prospects of a catalyst material are 1) activity (how much product we can make with a given amount of electrical power), 2) selectivity (whether we make a single product or a mixture of products) and 3) stability (how long the material and its properties last under operating conditions). For instance, a very active and selective material for the oxygen evolution reaction will not be useful in practice if it has a high tendency to corrode. In that regard, native iron surfaces are not particularly good catalysts. However, there are several ways to tackle this. One way to tune the properties of a material is via compositional engineering; i.e., by “alloying” two or more compounds we can produce a resulting compound with quantitatively or even qualitatively different properties compared to the precursors. Another way to tune these properties is by taking advantage of the structural diversity of a compound, because the catalytic activity of a material can be traced back to atomic-scale “active sites”, where the electrochemical reactions take place.
At ambient conditions, bulk (solid) iron has a body-centered cubic (bcc) structure, where every atom has 8 neighbors, each at the same distance, and all atomic sites look the same. Iron surfaces show more diversity of atomic sites, depending on the cleavage plane and reconstruction effects. In very thin (nanoscale) films, even the crystal structure can change from bcc to face-centered cubic (fcc). At surfaces, the exposed sites differ from those in the bulk, but are still relatively similar to one another (with a handful of characteristic atomic motifs available). However, when we move to nanoparticles (known as “nanoclusters” when they are very small), ranging from a few to a few hundred (or perhaps a few thousand) atoms, the situation is significantly more complex. For small nanoparticles, the morphology of the available exposed atomic sites depends very strongly on the size of the nanoparticle, and a single nanoparticle will itself display a relatively large variety of surface sites. Because the atomic environments of these sites are so different, so too can be their catalytic activity. Thus, active sites that are not available in the bulk can be present for the same material in its nanoscale form(s).
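As a quick illustration of the bcc coordination mentioned above, here is a minimal sketch (assuming the ASE package, which is only an illustrative tool choice and not part of the work described in this post) that builds bulk bcc iron and counts its first-shell neighbors:

```python
import numpy as np
from ase.build import bulk
from ase.neighborlist import neighbor_list

# Conventional bcc Fe cell at roughly the experimental lattice constant (~2.87 Angstrom)
fe = bulk("Fe", "bcc", a=2.87, cubic=True)

# The bcc first-neighbor distance is a*sqrt(3)/2 ~ 2.48 Angstrom, so a 2.6 Angstrom
# cutoff captures exactly the first coordination shell
i, j = neighbor_list("ij", fe, cutoff=2.6)
print(np.bincount(i))  # every atom has 8 first-shell neighbors
```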
To understand and explore the diversity of atomic environments in nanoscale iron, we (mostly Richard Jana with some help from me) developed a new “general-purpose” machine learning potential (MLP) for iron and used it to generate “stable” (i.e., low-energy) iron nanoparticles. Iron is a particularly hard system for MLPs, because of the existence of magnetic degrees of freedom (related to the effective net spins around the iron atoms), in addition to the nuclear degrees of freedom (the “positions” of the atoms). Usually, MLPs (as well as traditional atomistic force fields) are designed to only account explicitly for the latter. For this reason, existing interatomic potentials have been developed to accurately describe the potential energy landscape of “normal” ferromagnetic iron (bcc iron, the stable form at ambient conditions), but fail for other forms, which are relevant at extreme thermodynamic conditions (high pressure and temperature) or at the nanoscale (nanoparticles). While our methodology is still incapable of explicitly accounting for magnetic degrees of freedom, by carefully crafting a general training database we managed to get our iron MLP to implicitly learn the energetics of structurally diverse forms of iron, and in particular managed to achieve very accurate results for small nanoparticles, where the flexibility of a general-purpose MLP is most needed.
We built a catalogue of iron nanoparticles from 3 to 200 atoms, and found structures that were lower in energy than many of those previously available in the literature. Using data-clustering techniques, we could identify the most characteristic sites on the nanoparticle surfaces based on their morphological similarities. In the video below, you can see all the lowest-energy nanoparticles we found at each size (in the 3-200 atom range) with their surface atoms color-coded according to the 10 most characteristic motifs identified by our algorithm.
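For readers curious about what such a motif classification could look like in practice, below is a minimal, hypothetical sketch. It uses a SOAP descriptor (via the dscribe package) and k-means clustering (via scikit-learn); these are illustrative choices and the file name is made up, so this is not the exact descriptor or clustering workflow used in the paper:

```python
import numpy as np
from ase.io import read
from dscribe.descriptors import SOAP
from sklearn.cluster import KMeans

# Hypothetical file holding a catalogue of low-energy Fe nanoparticles (3-200 atoms)
particles = read("fe_nanoparticles.xyz", index=":")

# One descriptor vector per atom, encoding its local atomic environment
soap = SOAP(species=["Fe"], r_cut=5.0, n_max=8, l_max=6, periodic=False)
X = np.vstack([soap.create(p) for p in particles])

# Group the atomic environments into 10 characteristic motifs
labels = KMeans(n_clusters=10, random_state=0).fit_predict(X)
```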
The reactivity of each site, e.g., how strongly it can bind an adsorbate, such as a hydrogen atom or a CO molecule, depends very strongly on the surroundings, especially the number of neighbor atoms and how they are arranged. For instance, surface atoms that are almost “buried” inside the nanoparticle are more stable (less reactive) than those which stick out and have only a few neighbors. Sites that either bind adsorbates too strongly or not at all tend to have poor catalytic activity, whereas sites in between are the most promising ones because they can transiently bind a reaction intermediate and subsequently release it, allowing the reaction to take place. We have made initial progress in drawing the connection between motif classification and activity based on the MLP predictions, as seen in the featured figure at the top of this page. Navigating this wealth of surface sites and thoroughly screening their potential to catalyze specific chemical reactions (with more application-specific ML models or with DFT) is the next logical step.
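As a toy illustration of this screening logic (the adsorption energies, target value and window below are placeholders, not results from the paper), one could flag the motifs whose predicted adsorption energy falls in an intermediate “not too strong, not too weak” range:

```python
# Hypothetical per-motif hydrogen adsorption energies (eV), e.g. averaged MLP predictions
motif_ads_energy = {0: -0.85, 1: -0.40, 2: -0.05, 3: 0.30, 4: -0.15}

# Sabatier-style window: sites binding close to thermoneutral are the most promising
target, window = 0.0, 0.2
promising = {m: e for m, e in motif_ads_energy.items() if abs(e - target) <= window}
print(promising)  # motifs 2 and 4 in this made-up example
```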
We hope that this work will stimulate further research into the catalytic properties of iron-based nanocatalysts and bring us one step closer to the cheap and sustainable development of electrocatalysts for industrial production of fuels and chemicals. [...]
Read more...
Understanding the structure of amorphous carbon and silicon with machine learning atomistic modeling
Miguel Caro, March 6, 2023
When we think about semiconductors, the first ones to come to mind are silicon, germanium and the III-Vs (GaAs, InP, AlN, GaN, etc.), in their crystalline forms. In fact, the degree of crystallinity in these materials often dictates the quality of the devices that can be made with them. As an example, dislocation densities as low as 1000/cm² can negatively affect the performance of GaAs LEDs. For reference, within a perpendicular section of (111)-oriented bulk GaAs, that’s a bit less than one “bad” primitive unit cell in a trillion (10¹²). Typical overall defect densities (e.g., including point defects) are much higher, but still the number of “good” atoms is thousands of times larger than the number of “bad” atoms. With this in mind, one might be boggled to find out that amorphous semiconductors can have useful properties of their own, despite being the equivalent of a continuous network of crystallographic defects.
On the way to understanding the properties of an amorphous material, our first pit stop is understanding its atomic-scale structure. For this purpose, computational atomistic modeling tools are particularly useful, since many of the techniques commonly used to study the atomistic structure of crystals are not applicable to amorphous materials, precisely because they rely on the periodicity of the crystal lattice. Unfortunately, one of the most widely used computational approaches for materials modeling, density functional theory (DFT), is computationally too expensive to capture the sheer structural complexity of real amorphous materials, in particular the long-range structure, for which many thousands or even millions of atoms need to be considered.
The introduction and popularization in recent years of machine learning (ML) based atomistic modeling and, in particular, ML potentials (MLPs), has enabled for the first time realistic studies of amorphous semiconductors with accuracy close to that of DFT. As two of the most important such materials, amorphous silicon (a-Si) and amorphous carbon (a-C) have been the target of much of this early effort.
In a recent Topical Review in Semiconductor Science and Technology (provided Open Access via this link), I have tried to summarize our attempts to understand the structure of a-C and a-Si and highlight how MLPs allow us to peek at the structure of these materials and draw the connection between this structure and the emerging properties. The discussion is accompanied by a general description of atomistic modeling of a-C and a-Si and a brief introduction to MLPs, and it could be interesting to the materials scientist curious about the modeling of amorphous materials or the DFT practitioner curious about what MLPs can do that DFT can’t.
At this stage, the field is still evolving (fast!) and I expect(/hope) this review will become obsolete soon, as more accurate and CPU-efficient atomistic ML techniques become commonplace. In particular, I expect the description of a-Si and a-C to rapidly evolve from the study of pure samples to more realistic (and chemically complex) materials, containing unintentional defects as well as chemical functionalization. All in all, I am excited to witness in which direction the field will steer in the next 5-10 years. I promise to do my part and wait at least that long before writing the next review on the topic! [...]
Read more...
Automated X-ray photoelectron spectroscopy (XPS) prediction for carbon-based materials: combining DFT, GW and machine learning
Miguel Caro, July 13, 2022
The details of this work are now published (open access) in Chem. Mater. and our automated prediction tool is available from nanocarbon.fi/xps.
Many popular experimental methods for determining the structure of materials rely on the periodic repetition of atomic arrangements present in crystals. A common example is X-ray diffraction. For amorphous materials, the lack of periodicity renders these methods impractical. Core-level spectroscopies, on the other hand, can give information about the distribution of atomic motifs in a material without the requirement of periodicity. For carbon-based materials, X-ray photoelectron spectroscopy (XPS) is arguably the most popular of these techniques.
In XPS a core electron is excited via absorption of incident light and ejected out of the sample. Core electrons occupy deep levels close to the atomic nuclei, for instance 1s states in oxygen and carbon atoms. They do not participate in chemical bonding since they are strongly localized around the nuclei and lie far deeper in energy than valence electrons. Because core electrons lie so deep in the potential energy well, energetic X-ray light is required to eject them out of the sample. In XPS, the light source is monochromatic, which means that all the X-ray photons have around the same energy, $h \nu$. When a core electron in the material absorbs one of these photons with enough energy to leave the sample, we can measure its kinetic energy and work out its binding energy (BE) as $\text{BE} = h \nu - E_\text{kin}$.
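A quick worked example of this bookkeeping (the measured kinetic energy below is made up for illustration; the Al Kα photon energy is a standard laboratory source value):

```python
h_nu = 1486.6      # Al K-alpha photon energy in eV (a common XPS source)
E_kin = 1202.0     # hypothetical measured kinetic energy of the ejected electron, in eV

BE = h_nu - E_kin  # binding energy, neglecting the spectrometer work function
print(BE)          # 284.6 eV, in the range typical of C 1s electrons in carbon materials
```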
After collecting many of these individual measurements, a spectrum of BEs will appear, because each core electron has a BE that depends on its particular atomic environment. For instance, a core electron from a C atom bonded only to other C atoms has a lower BE than a C atom that is bonded also to one or more O atoms. And even the details of the bonding matter: a core electron from a “diamond-like” C atom (a C atom surrounded by four C neighbors) has a higher BE than that coming from a “graphite-like” C atom (which has only three neighbors). Therefore, the different features, or “peaks” in the spectrum can be traced back to the atomic environments from which core electrons are being excited, giving information about the atomic structure of the material. This is illustrated in the featured image of this post (the one at the top of this page).
What makes XPS so attractive for computational materials physicists and chemists like us is that it provides a direct link between simulation and experiment that can be exploited to 1) validate computer-generated model structures and 2) try to work out the detailed atomic structure of an experimental sample. 1) is more or less obvious; 2) is motivated by the fact that experimental analysis of XPS spectra is usually not straightforward because features that come from different atomic environments can overlap on the spectrum (i.e., they coincidentally occur at the same energy). In both cases, computational XPS prediction requires two things. First, a computer-generated atomic structure. Second, an electronic-structure method to compute core-electron binding energies.
While candidate structural models can be made with a variety of tools (a favorite of ours is molecular dynamics in combination with machine-learning force fields), a standing issue with computational XPS prediction is the accuracy of the core-electron BE calculation. Even BE calculations based on density-functional theory, the workhorse of modern ab initio materials modeling, lack satisfactory accuracy. A few years ago my colleague Dorothea Golze, with whom I used to share an office during our postdoc years at Aalto University, started to develop highly accurate techniques for core-electron BE determination based on a Green’s function approach, commonly referred to as the GW method. These GW calculations can yield core-electron BEs at unprecedented accuracy, albeit at great computational cost. In particular, applying this method to atomic systems with more than a hundred atoms is impractical due to CPU costs. This is where machine learning (ML) can come in handy.
Four years ago, around the time when Dorothea’s code was becoming “production ready”, I was just getting started in the world of ML potentials based on kernel regression, using the Gaussian approximation potential (GAP) method developed by my colleagues Gábor Csányi and Albert Bartók ten years before. I also had prior experience doing XPS calculations based on DFT for amorphous carbon (a-C) systems. Then the connection was clear. GAPs work by constructing the total (potential) energy of the system as an optimal collection of individual atomic energies. This approximation is not based on physics (local atomic energies are not a physical observable) but is necessary to keep GAPs computationally tractable. However, the core-electron BE is a local physical observable, and thus ideally suited to be learned using the same mathematical tools that make GAP work. In essence, the BE is expressed as a combination of mathematical functions which feed on the details of the atomic environment of the atom in question (i.e., how the nearby atoms are arranged).
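Conceptually, the learning problem then looks something like the sketch below. It uses a generic kernel ridge regressor from scikit-learn as a stand-in, and the file names are hypothetical; it illustrates the idea of regressing a local observable against atomic-environment descriptors, not the actual GAP-based machinery used in the paper:

```python
import numpy as np
from sklearn.kernel_ridge import KernelRidge

# Per-atom descriptor vectors (e.g., SOAP) and reference C 1s binding energies in eV
X_train = np.load("descriptors_train.npy")   # hypothetical precomputed data
y_train = np.load("be_train.npy")

# A dot-product (polynomial) kernel on descriptors mimics the SOAP-kernel idea
model = KernelRidge(kernel="poly", degree=2, alpha=1e-6)
model.fit(X_train, y_train)

# Predict binding energies for new, unseen atomic environments
be_new = model.predict(np.load("descriptors_new.npy"))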
Coincidentally, it was also around that time that our national supercomputing center, CSC, was deploying the first of their current generation of supercomputers, Puhti. They opened a call for Grand Challenge proposals for computational problems that require formidable CPU resources. It immediately occurred to me that Dorothea and I should apply for this to generate high-quality GW data to train an ML model of XPS for carbon-based materials. Fortunately, we got the grant, worth around 12.5M CPUh, and set out to make automated XPS a reality.
But even formidable CPU resources are not enough to satisfy the ever-hungry GW method, so we had to be clever about how to construct an ML model which required as few GW data points as possible. We did this in two complementary ways. On the one hand, we used data-clustering techniques that we had previously employed to classify atomic motifs in a-C to select the most important (or characteristic) atomic environments out of a large database of computer-generated structures containing C, H and O (“CHO” materials). On the other hand, we came up with an ML model architecture which combined DFT and GW data. This is handy because DFT data is comparatively cheap to generate (it’s not cheap in absolute terms!) and we can learn the difference between a GW and a DFT calculation with a lot less data than we need to learn the GW predictions directly. So we can construct a baseline ML model using abundant DFT data and refine this model with scarce and precious GW data. And it works!
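A minimal sketch of this baseline-plus-correction idea, again with hypothetical file names and a generic scikit-learn regressor rather than the actual model architecture used in the paper:

```python
import numpy as np
from sklearn.kernel_ridge import KernelRidge

# Abundant DFT data and scarce GW data over the same kind of descriptors (hypothetical files)
X_dft, y_dft = np.load("X_dft.npy"), np.load("be_dft.npy")
X_gw, y_gw = np.load("X_gw.npy"), np.load("be_gw.npy")

# Baseline model trained on the plentiful DFT binding energies
baseline = KernelRidge(kernel="rbf", alpha=1e-6).fit(X_dft, y_dft)

# The correction model only has to learn the smaller GW-minus-DFT difference
delta = y_gw - baseline.predict(X_gw)
correction = KernelRidge(kernel="rbf", alpha=1e-6).fit(X_gw, delta)

def predict_be(X):
    """Combined prediction: DFT-quality baseline refined by the GW correction."""
    return baseline.predict(X) + correction.predict(X)
```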
An overview of the database of CHO materials used to identify the optimal training points (triangles).
Four years and many millions of CPUh later, our freely available XPS Prediction Server is capable of producing an XPS spectrum within seconds for any CHO material, provided the user can supply a computer-generated atomic structure in any of the common formats. Even better, these predicted spectra are remarkably close to those obtained experimentally. This opens the door for more systematic and reliable validation of computer-generated models of materials and a better integration of experimental and computational materials characterization.
We hope that these tools will become useful to others, and we plan to extend them to other material classes and spectroscopies in the near future. [...]
Read more...
GAP Developers & Users Meeting
Miguel Caro, April 19, 2022
It is my pleasure to announce the first GAP Developers & Users Meeting, to take place on 2nd-5th August 2022 here at Aalto University, co-organized by our group.
The Gaussian approximation potential (GAP) is a theoretical/methodological framework for predicting the energy and forces in a system of interacting atoms using machine learning. GAP is also the name of the code implementing GAP training and prediction capabilities.
The GAP Developers & Users Meeting will feature three sessions:
GAP School: fundamentals of theory and code
Advances in Software and Methodology: what’s new?
Applications in Molecular and Materials Modeling: exciting science enabled by GAP
For further information, a list of speakers, and to learn how to register, please follow the link to our official website:
D&U Meeting 2022 [...]
Read more...
Academy of Finland EuroHPC funding awarded to the group
Miguel Caro, February 5, 2022
We have been awarded Academy of Finland funding within the framework of the EuroHPC research program, which aims to support the transition to (pre-)exascale high-performance computing (HPC) platforms and quantum computing, among other goals. Miguel Caro will lead the ExaFF (“Exascale-ready machine learning force fields”) consortium as Principal Investigator and Consortium Coordinator. This project is a collaboration with Andrea Sand’s group at Aalto University and CSC. The Academy of Finland has granted us 1,011,240 EUR for this project, out of which 435,108 EUR correspond to the Caro group.
The objective of the project is to improve the accuracy and speed of Gaussian approximation potentials through the development of the TurboGAP code and the underlying algorithms. One of the priorities is to adapt them to efficient GPU execution, with a focus on exploiting the computing power provided by the CSC-hosted LUMI pre-exascale HPC system. The faster code and new functionality will be used to model new materials for battery applications and semiconductors under heavy radiation environments (e.g., circuit components aboard satellites).
This is great news for our group, since it secures the resources necessary to continue the development of the TurboGAP code in the short term (the funding runs for 3 years, from 2022 until 2024).
See the funding decision on the Academy of Finland’s website. [...]
Read more...
Structure and properties of nanoporous carbons
Miguel Caro, January 4, 2022
Nanoporous carbons are an emerging class of materials with important applications in energy storage. In particular, the ability of graphitic carbons to intercalate ions is exploited in commercial Li-ion batteries, where the anode is (typically) made of graphite and Li ions become electrostatically bound to the carbon host as the battery is charged: electrons are stored in the graphitic matrix as part of an electron-ion pair. When the battery discharges, these electrons are injected into the external circuit (providing power) and the Li ion is released into the electrolyte, traveling to the cathode, usually made of a transition-metal oxide, where it also intercalates. This cycle repeats itself as the battery is charged and discharged, with the Li ion traveling back and forth between the anode and cathode.
Li is relatively scarce in the Earth’s crust and the mid-to-long-term supply of Li needed to cover the rapidly increasing demand for Li-ion batteries is in jeopardy. The Li-intercalation process could in principle also be applied to more Earth-abundant ions, like K and Na, thus reducing the cost of ion batteries and ensuring a future supply of raw materials. Both are required to scale up the use of ion batteries and to make them affordable for domestic and industrial applications. Unfortunately, Na and K do not intercalate in graphite as favorably as Li does, with Na-intercalated graphite deemed thermodynamically unstable, and in all cases the intercalation incurs strong dimensional changes between charge and discharge. These dimensional changes pose risks to the mechanical stability of the material and the device containing it, with the associated safety concerns.
Nanoporous carbons are an obvious alternative to graphite for ion intercalation because the pores, interstitial voids between disordered graphitic planes, can be made within a range of sizes, all larger than the usual interplanar spacing in graphite. Thus, nanoporous carbons can in principle accommodate larger ions, including Na and K, which motivates their study and structural characterization. However, as is often the case with amorphous and disordered materials, experimental characterization can be challenging, since the techniques commonly employed to characterize crystals cannot be used. In the case of nanoporous carbons, experimental characterization of pore sizes and shapes is very complicated.
Nanoporous carbons are made of interlinked graphitic planes, where sp3 motifs provide the material with three-dimensional rigidity not present in graphite. Reprinted from Wang et al. Chem. Mater. (2022).
With this background in mind, we decided to study the microscopic structure and mechanical properties of nanoporous carbon using state-of-the-art atomistic modeling techniques based on machine learning interatomic potentials. These techniques provide, for the first time, the required combination of accuracy and computational efficiency to study nanoporous carbons where the size of the simulation box does not constrain the size of the pores that can be studied. Our results are now published in Chemistry of Materials:
Y. Wang, Z. Fan, P. Qian, T. Ala-Nissila, M.A. Caro; Structure and Pore Size Distribution in Nanoporous Carbon. Chem. Mater. (2022). Link to journal’s website. Open Access PDF from the publisher.
We started out by training a Gaussian approximation potential (GAP) for carbon based on the database developed by Deringer and Csányi [Phys. Rev. B 95, 094203 (2017)]. This new potential [10.5281/zenodo.5243184] achieves better accuracy and speed than the earlier version and can accurately predict the defect formation energies in graphitic carbon.
Different accuracy tests on the new a-C GAP. Reprinted from Wang et al. Chem. Mater. (2022).
Getting the relative formation energies right is critical for obtaining the correct topology of the complicated network of carbon rings within curved graphitic sheets. In particular, the relative abundance of 5-rings and 7-rings will determine the curvature and thus pore morphology in the material.
With this new potential, we carried out large-scale simulations of graphitization with the TurboGAP code developed in our group, using a melt-graphitize-anneal protocol, akin to that by de Tomas et al. [Carbon 109, 681 (2016)], but now with larger systems (more than 130,000 atoms) and the accuracy provided by the new GAP.
Generation protocol for nanoporous carbon. Reprinted from Wang et al. Chem. Mater. (2022).
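To give a flavor of what a melt-graphitize-anneal protocol looks like in code, here is a schematic sketch using ASE with Langevin dynamics. The temperatures, run lengths, file name and the stand-in calculator are placeholders; the actual simulations were run with the TurboGAP code and a carbon GAP, with the protocol parameters given in the paper:

```python
from ase.io import read
from ase.md.langevin import Langevin
from ase import units
from ase.calculators.emt import EMT  # crude stand-in; a real run would use a carbon MLP

atoms = read("carbon_seed.xyz")      # hypothetical starting structure at the target density
atoms.calc = EMT()

def run_stage(temperature_K, n_steps):
    """Run Langevin dynamics at a fixed temperature for a given number of 1 fs steps."""
    dyn = Langevin(atoms, timestep=1.0 * units.fs,
                   temperature_K=temperature_K, friction=0.01)
    dyn.run(n_steps)

run_stage(5000, 10000)   # melt: randomize the carbon network
run_stage(3000, 20000)   # graphitize: hold at a temperature where graphitic order develops
run_stage(300, 10000)    # anneal/cool towards room temperature
```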
With these simulations we managed to generate realistic nanoporous carbon structures within a wide range of mass densities (0.5 to 1.7 g/cm3), and characterized in detail their short-, medium- and long-range order. For instance, these simulations reveal hexagonal motifs to be the dominant structural block in these materials (as expected) followed by 5-rings, then 7-rings and, in much smaller quantities, larger and smaller ring structures, with almost no density dependence for the most common motifs.
Ring-size distribution. Reprinted from Wang et al. Chem. Mater. (2022).
The pore sizes, the main target of this study, show clearly defined unimodal distributions determined by the overall mass density of the material. This means that the pore sizes and morphologies are relatively homogeneous for a given sample.
Pore-size distribution. Reprinted from Wang et al. Chem. Mater. (2022).
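For illustration, here is a crude way of probing pore sizes in such a structure: sample random points in the periodic box and record the distance to the nearest carbon atom. This is a simplified stand-in for the geometric pore-size-distribution analysis used in the paper, and the file name is hypothetical:

```python
import numpy as np
from ase.io import read

atoms = read("nanoporous_carbon.xyz")   # hypothetical structure, e.g. from the published library
cell = atoms.cell.lengths()             # assume an orthorhombic, fully periodic box
pos = atoms.get_positions()

rng = np.random.default_rng(0)
probes = rng.uniform(0.0, cell, size=(5000, 3))

# Distance from each probe point to its nearest atom, using the minimum-image convention
r_nearest = np.empty(len(probes))
for k, p in enumerate(probes):
    d = pos - p
    d -= np.round(d / cell) * cell
    r_nearest[k] = np.sqrt((d ** 2).sum(axis=1)).min()

# A histogram of these "free-sphere" radii gives a rough picture of the pore sizes
hist, edges = np.histogram(r_nearest, bins=30)
```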
Finally, a useful result of our study is a library of nanoporous carbon structures freely available to the community and amenable to future studies on the properties of this interesting and important class of carbon materials.
This study would not have been possible without the hard work and dedication of our PhD student Yanzhou Wang and the help of the other coauthors, as well as the support provided by the Academy of Finland and the CPU time and other computational resources provided by CSC and Aalto University’s Science IT project. [...]
Read more...