Our latest work (of a long list) on amorphous carbon simulation, titled “Machine learning driven simulated deposition of carbon films: From low-density to diamondlike amorphous carbon”, just appeared in Physical Review B (https://doi.org/10.1103/PhysRevB.102.174201; it is also available on arXiv if you don’t have an APS subscription).
This paper is the product of a lot of work, spanning three and a half years, on identifying the growth mechanisms (yes, mechanisms) and characterizing the structure of amorphous carbon (a-C) across different densities. I thought it would be fitting to give a summary of how our simulations have contributed to understanding a-C, and how machine learning (ML) potentials have played a pivotal role in reaching our current level of understanding, beyond what was possible before ML simulation made an appearance in the arena of molecular and materials modeling.
Amorphous carbon is a disordered metastable form of elemental carbon (although it can also be doped, intentionally or unintentionally, with other elements, most notably hydrogen). As a testament to the incredible flexibility of C to form chemical bonds (which is at the root of the sheer complexity of organic molecules and life itself), a-C is made up of a mixture of C atoms with different environments: sp (as in acetylene), sp2 (as in graphite) and sp3 (as in diamond), depending on how many neighbors each C atom has. This material is of high interest in research and industry because its mechanical and electronic properties can be tuned between those of graphene/graphite and diamond, by adjusting the sp2/sp3 ratio.
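To make the neighbor-counting idea concrete, here is a minimal sketch of how sp/sp2/sp3 fractions could be estimated from atomic coordinates. It is only an illustration, not the analysis used in our papers: the function names, the 1.9 Å bond cutoff and the orthorhombic periodic box are all assumptions made for this example.

```python
import numpy as np

# Illustrative mapping from neighbor count to hybridization label.
LABELS = {2: "sp", 3: "sp2", 4: "sp3"}


def coordination_numbers(positions, box, cutoff=1.9):
    """Count neighbors within `cutoff` (in Å) for each atom.

    Assumes an orthorhombic periodic box with edge lengths `box`.
    The 1.9 Å cutoff is an assumed value for this sketch.
    """
    positions = np.asarray(positions, dtype=float)
    box = np.asarray(box, dtype=float)
    delta = positions[:, None, :] - positions[None, :, :]
    delta -= box * np.round(delta / box)  # minimum-image convention
    dist = np.linalg.norm(delta, axis=-1)
    np.fill_diagonal(dist, np.inf)  # an atom is not its own neighbor
    return (dist < cutoff).sum(axis=1)


def hybridization_fractions(positions, box, cutoff=1.9):
    """Return the fraction of atoms per label; other counts go to 'other'."""
    coord = coordination_numbers(positions, box, cutoff)
    counts = {}
    for c in coord:
        label = LABELS.get(int(c), "other")
        counts[label] = counts.get(label, 0) + 1
    return {label: n / len(coord) for label, n in counts.items()}


# Sanity check on crystalline diamond (8-atom cubic cell, a = 3.567 Å):
# every atom has four neighbors, so everything is labeled sp3.
a = 3.567
fcc = np.array([[0, 0, 0], [0, 0.5, 0.5], [0.5, 0, 0.5], [0.5, 0.5, 0]])
diamond = np.vstack([fcc, fcc + 0.25]) * a
print(hybridization_fractions(diamond, [a, a, a]))
```

A simple distance cutoff like this lumps everything that is not 2-, 3- or 4-fold coordinated into "other", which is where the 5-fold coordinated environments mentioned later in this post would end up.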
The structure of a-C (both the atomic and electronic structures, actually) has been under debate since the 1970s and 1980s. In particular, scientists have been intrigued by how the high-density form, also referred to as tetrahedral a-C (ta-C), attains a diamondlike structure. This time frame corresponds to the early days of molecular modeling, and thus a-C has been a target for all sorts of computational studies ever since. As an anecdote, one of the first (if not the first, and also highly cited) papers on the atomic and electronic structure of a-C, based on tight-binding calculations, was coauthored by John Robertson (the a-C guru, whose review paper is a reference manual in the field, albeit a bit outdated now) and my PhD supervisor Eoin O’Reilly, while they were working in Cambridge on a-C back in the 1980s (I asked Eoin and he confirmed he was already working on a-C before I was born). What is funny about this fact is that I only started working on a-C once I left Eoin’s group and came to Aalto University in 2013.
At Aalto, the work in Tomi Laurila’s group focused (and still does) on making electrodes coated with a-C for detection of biomolecules. For this application, understanding the surface structure and chemistry of a-C is very important. Back in 2013, the simulation work was done in collaboration with Olga Lopez-Acevedo, and Rémi Zoubkoff was the postdoc doing the heavy lifting on a-C modeling when I arrived. Rémi was trying to use tight-binding (TB) molecular dynamics (MD) for melt-quench simulation of a-C, a method where a-C is generated in MD by rapidly cooling down (quenching) a liquid C sample. He had lots of trouble with his approach because of 5-fold coordinated complexes (5-c) predicted by TB (funny in retrospect, since we spent so much time dealing with characterization of 5-c environments in the new paper). Since DFT-based melt-quench simulations also struggled with the holy grail of ta-C modeling, namely predicting very high (> 80%) sp3 fractions for high-density samples, I started my postdoc by doing some DFT-based generation of ta-C structures using a different method, based on geometry relaxation followed by pressure correction. Those simulations gave pretty good results for the structure of ta-C, in comparison to experiment (see this paper and this paper), but still did not resolve the issue of how ta-C grows to be similar to diamond. Experimentally, ta-C is not grown by melt-quench, but by deposition (atoms get thrown at a substrate, using a cathodic arc or some other experimental apparatus). But deposition simulations, which were first carried out by Nigel Marks in 2005 with his carbon version of the EDIP potential, were completely out of reach for DFT because of the computational cost. And unfortunately Nigel’s simulations failed to reproduce the high sp3 fractions observed experimentally in ta-C films.
So we’re now at an impasse: DFT is too expensive to do deposition, but simulation of the deposition process would be the only way to elucidate the growth mechanism. So I gave up and moved away from the surface side, started looking at the electrolyte side of the electrochemistry problem instead (that’s when I got interested in the 2PT method and free energy calculations, see this paper and this paper), and forgot about the structure of a-C for a while. But then, when I was working on the computational and theoretical part of our carbon materials review, in 2016-2017, I came across a new (to me) method based on machine learning to model the interatomic interactions in carbon. There was an arXiv preprint (now also in Physical Review B) by Volker Deringer and Gábor Csányi on a so-called Gaussian approximation potential (GAP) for a-C. I was preparing a comparison between different simulation methods for the review (see below), and Volker’s paper was missing some detail I was interested in (I think it was bulk moduli).
So I sent an email to Volker and he replied with the information I was after, but he also told me that he would be coming to Aalto for a conference in early 2017, and suggested we discuss this a-C simulation business in person. Volker’s talk at the conference and a chat with him at the canteen afterwards were the first I heard about ML potentials, and their ability to accurately deal with interatomic interactions at a fraction of the computational cost of DFT. I got super excited about it, and proposed during our chat to do deposition simulations of ta-C with the new GAP. He would not say so, but I am pretty sure from his expression that Volker thought this was a completely crazy idea. Mind you, while a lot cheaper than DFT, GAP simulations are still quite expensive. Fortunately, in Finland we have excellent high performance computing (HPC) resources for research, provided by CSC.
So when Volker returned to Cambridge he sent me the files and showed me how to use GAP in combination with LAMMPS. I then started doing the deposition simulations at three different energies: 20, 60 and 100 eV. They progressed incredibly slowly on CSC’s former supercluster Taito (now replaced by Puhti). I then moved them to Sisu, CSC’s former supercomputer (now replaced by Mahti), and got better scaling. But still, these simulations progressed incredibly slowly, because deposition is intrinsically sequential (one deposition event followed by another). One needs to run the impact event with small time steps, since the incident atom is initially traveling so fast, and then the excess kinetic energy needs to be removed from the substrate by equilibration. And repeat. Many times. One example of this process is shown in the video below.
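To give a feeling for why the impact stage needs such small time steps, here is a back-of-the-envelope sketch. It converts the deposition energy of a carbon atom into a speed and derives a time step from a maximum allowed displacement per step. The 0.05 Å displacement limit, the phase durations and the function names are assumptions made for this illustration; they are not the settings of our actual LAMMPS runs.

```python
import math

EV_TO_J = 1.602176634e-19      # joules per electronvolt
AMU_TO_KG = 1.66053906660e-27  # kilograms per atomic mass unit
CARBON_MASS_AMU = 12.011


def impact_speed(energy_ev, mass_amu=CARBON_MASS_AMU):
    """Speed (Å/fs) of an atom with kinetic energy `energy_ev`, from E = m v^2 / 2."""
    v_si = math.sqrt(2.0 * energy_ev * EV_TO_J / (mass_amu * AMU_TO_KG))  # m/s
    return v_si * 1.0e-5  # 1 m/s = 1e-5 Å/fs


def impact_timestep(energy_ev, max_disp=0.05):
    """Time step (fs) such that the incident atom moves at most `max_disp` Å per step.

    The 0.05 Å default is an assumed, illustrative criterion.
    """
    return max_disp / impact_speed(energy_ev)


def deposition_plan(n_events, energy_ev, impact_fs=500.0,
                    equil_fs=500.0, equil_dt=1.0):
    """Yield (event, phase, time step in fs, number of steps) for a sequential run.

    All durations here are hypothetical placeholders.
    """
    dt = impact_timestep(energy_ev)
    for event in range(n_events):
        yield event, "impact", dt, math.ceil(impact_fs / dt)
        yield event, "equilibrate", equil_dt, math.ceil(equil_fs / equil_dt)


for energy in (20, 60, 100):
    print(f"{energy} eV: {impact_speed(energy):.3f} Å/fs, "
          f"time step {impact_timestep(energy):.3f} fs")
```

A 60 eV carbon atom travels at roughly 0.3 Å/fs, so a typical 1 fs MD time step would move it by a sizable fraction of a bond length in a single step. Since each event in `deposition_plan` can only start once the previous one has finished, the cost adds up linearly with the number of deposited atoms.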
I went to visit Volker and Gábor in Cambridge during the summer of 2017, while these calculations were still running (painfully slowly), to discuss ta-C modeling. I remember getting excited every day around that time, as the bulklike portion of the film kept forming and it looked like we were going to hit the previously unattainable 90% mark… It took 3 months of continuous runs on Sisu (and about 3M CPUh) to get these films to grow. You can watch them grow in the movie below. More videos of deposition at different energies and the resulting structures are available from Zenodo.
We submitted this paper to Physical Review Letters in late 2017 and it got accepted in 2018 with glowing reviews (two “publish as is” and one “minor corrections”). You can access the paper here (or on arXiv, if you don’t have an APS subscription). You can also check the synopsis written by APS on our paper. The most significant aspect of this paper is that it settled the question of how diamondlike a-C grows: it follows the “peening” mechanism, instead of the widely accepted “subplantation” mechanism, as illustrated in the graph below. The peening mechanism had already been proposed by Nigel Marks in 2005 (see above), but Nigel’s calculations, carried out with C-EDIP, lacked quantitative agreement with experiment. Nigel, who we invited to our ASCM2019 workshop in Helsinki on nanocarbon modeling, was indeed a happy man when told he had been right all along.
After this first deposition paper, which focused on explaining the growth of high-density ta-C, we started working on the surface chemistry of a-C, which led to four Chemistry of Materials papers (structure1, structure2, x-ray1, x-ray2; I have previously written a blog post on the x-ray spectroscopy papers on this website). But we also kept alive the flame of understanding the growth and structure of a-C throughout the full range of mass densities. Low-density nanocarbons are very interesting at the moment in the context of energy storage, since they are porous, and other compounds can be stored in those pores. Unfortunately, all the stuff going on with the surface chemistry of a-C and other developments in ML potentials, teaching, event organization, not to mention trying to secure research funds and make career advancements, meant that finalizing the work on deposition of a-C progressed more slowly than expected.
However, all’s well that ends well, and we have finally managed to publish our comprehensive simulations, which characterize the structure and growth mechanism of a-C from low (graphitic) to high (diamondlike) densities, as shown above and below. At low energies, a-C grows by “direct attachment”, whereas at high energies it grows by peening.
And besides the implications for carbon science in general, and a-C knowledge in particular, one of the most significant aspects of our work is that it showed that the new ML potentials can be used to solve outstanding problems in molecular and materials modeling, previously out of reach due to computational limitations.