Building a Better Mousetrap, From the Atoms Up

The Ramprasad Lab is employing machine learning to design new materials without having to pre-test each one. (Schematic by Chiho Kim, Ramprasad Lab/UConn Image)

For most of human history, the discovery of new materials has been a crapshoot. But now, UConn researchers have systematized the search with machine learning that can scan millions of theoretical compounds for qualities that would make better solar cells, fibers, and computer chips. The search for new materials may never be the same.

No one knows why an early metallurgist decided to smelt a hunk of tin into some copper, but the resulting bronze alloy was harder and more durable than any material previously known. Most materials experimentation over the ensuing 7,000 years has been similarly random, guided largely by philosophy and chemical intuition.

But in a world that contains at least 95 stable elements – the basic building blocks of matter – the number of possible combinations is enormous, and experimentation is an awfully inefficient way to find what you’re looking for.

Enter UConn materials scientist Ramamurthy ‘Rampi’ Ramprasad. Instead of randomly mixing chemicals to see what they do, Ramprasad designs them rationally, using machine learning to figure out which atomic configurations make a polymer a good electrical conductor or insulator.

A polymer is a large molecule made of many repeating building blocks. Polymers are very common in both living and man-made materials. Probably the most familiar example is plastics, and the wide variation in plastics – which can be hard, soft, stretchy, brittle, spongy, clear, opaque or translucent – gives an inkling of how diverse polymers in general can be.

Polymers can also have diverse electronic properties. For example, they can be very good insulators – preventing electrons, and thus electric current, from traveling through them – or good conductors, allowing electricity to pass through them freely. And what controls all these properties is mainly how the atoms in the polymer connect to each other. But until recently, no one had systematically related properties to atomic configurations.

So Ramprasad and his colleagues decided to do just that. First, they would analyze known polymers, using laborious but accurate quantum mechanics-based calculations to figure out which arrangements of atoms confer which properties, and quantify those atomic-level relationships via a string of numbers that fingerprint each polymer. Once they had those, they could have a computer search through any number of theoretical polymers to figure out which ones might have which properties. Then anyone looking for a polymer with a certain property could quickly scan the list and decide which theoretical polymers might be worth trying.

Many polymers are made of building blocks containing just a few atoms. They look like this:

Polyurea, a common plastic. In this diagram, N is nitrogen, H hydrogen, and O oxygen. R stands in for any number of chemicals that could slightly alter the polymer, but the repeating NH-O-NH-O is the basic structure. Most polymers look like that, made of carbon (C), H, N and O, with a few other elements thrown in occasionally. (By Yikrazuul – own work, public domain)

For their project, Ramprasad’s group looked at polymers made of just seven building blocks: CH2, C6H4, CO, O, NH, CS, and C4H2S. These are found in common plastics such as polyethylene, polyesters, and polyureas. An enormous variety of polymers could theoretically be constructed using just these building blocks; Ramprasad’s group decided at first to analyze just 283, each composed of a repeated four-block unit.
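
To give a feel for how quickly the possibilities multiply, the short Python sketch below counts four-block repeat units built from the seven blocks. It rests on assumptions the article does not state: that a repeat unit read from a different starting point or in the opposite direction describes the same polymer, and that every combination is allowed. It makes no attempt to check chemical feasibility, so it is not meant to reproduce the group's figure of 283. The names `canonical` and `count_repeat_units` are illustrative only.

```python
from itertools import product

BLOCKS = ["CH2", "C6H4", "CO", "O", "NH", "CS", "C4H2S"]

def canonical(unit):
    """Pick one representative among all rotations and reversals of a repeat unit.

    Assumption: the chain repeats endlessly, so where the unit starts and
    which direction it is read in do not matter.
    """
    n = len(unit)
    variants = []
    for seq in (unit, unit[::-1]):
        for shift in range(n):
            variants.append(seq[shift:] + seq[:shift])
    return min(variants)

def count_repeat_units(blocks, length):
    """Return (number of raw sequences, number of symmetry-distinct sequences)."""
    raw = list(product(blocks, repeat=length))
    unique = {canonical(unit) for unit in raw}
    return len(raw), len(unique)

raw_count, unique_count = count_repeat_units(BLOCKS, 4)
print(raw_count, unique_count)  # 2401 406
```

Under these assumptions, even a four-block unit gives a few hundred distinct candidates, and the count grows rapidly as the repeat unit gets longer.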

They started from basic quantum mechanics, and calculated the three-dimensional atomic and electronic structures of each of those 283 four-block polymers. This is not trivial: calculating the position of every electron and atom in a molecule with more than two atoms takes a powerful computer a significant chunk of time, which is why they did it for only 283 molecules.

Once they had the three-dimensional structures, they could calculate what they really wanted to know: each polymer’s properties. They calculated the band gap, which is the amount of energy it takes for an electron in the polymer to break free of its home atom and travel around the material, and the dielectric constant, which is a measure of the effect an electric field can have on the polymer. These properties translate to how much electric energy each polymer can store in itself. The researchers used long-established techniques for these calculations, but those techniques take a prohibitive amount of computing time, which is why it’s so hard to evaluate materials this way.

Ramprasad’s group then went one step further. They wanted a shorthand system that a computer could use to look at the building blocks of a polymer and how they connect to each other, and make educated guesses about its properties.

Computers deal with numbers, so first they had to define each polymer as a string of numbers, a sort of numerical fingerprint. Since there are seven possible building blocks, there are seven possible numbers, each indicating how many of each block type are contained in that polymer. But a simple number string like that doesn’t give enough information about the polymer’s structure, so they added a second string of numbers that tells how many pairs there are of each combination of building blocks, such as NH-O or C6H4-CS. That was still not quite enough information, so they added a third string that described how many triples, like NH-O-CH2, there were. They arranged these strings as a three-dimensional matrix, which is a convenient way to describe such strings of numbers in a computer.
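
A minimal Python sketch of such a fingerprint, under stated assumptions, might look like the following. It counts single blocks, neighboring pairs, and neighboring triples in a repeat unit, wrapping around the end because the unit repeats along the chain. The wrap-around, the treatment of pairs and triples as ordered, the use of raw rather than normalized counts, and the flat-list layout (instead of the three-dimensional matrix the group used) are all simplifications made for illustration. The function names are hypothetical.

```python
from collections import Counter
from itertools import product

BLOCKS = ["CH2", "C6H4", "CO", "O", "NH", "CS", "C4H2S"]

def ngram_counts(unit, n):
    """Count runs of n consecutive blocks, wrapping around the repeat unit."""
    size = len(unit)
    counts = Counter()
    for start in range(size):
        run = tuple(unit[(start + k) % size] for k in range(n))
        counts[run] += 1
    return counts

def fingerprint(unit):
    """Concatenate single, pair, and triple counts into one flat list.

    Assumption: a fixed ordering of all 7, 49, and 343 possible singles,
    pairs, and triples, so every polymer gets a list of the same length.
    """
    values = []
    for n in (1, 2, 3):
        counts = ngram_counts(unit, n)
        for key in product(BLOCKS, repeat=n):
            values.append(counts[key])
    return values

fp = fingerprint(["NH", "CO", "NH", "C6H4"])
print(len(fp))  # 399, i.e. 7 + 49 + 343
print(fp[:7])   # [0, 1, 1, 0, 2, 0, 0], single-block counts in BLOCKS order
```

Most entries are zero for any one polymer, since a four-block unit contains only four pairs and four triples.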

Then they let the computer go to work. Using the library of 283 polymers they had laboriously calculated using quantum mechanics, the machine compared each polymer’s numerical fingerprint to its band gap and dielectric constant, and gradually ‘learned’ which building block combinations were associated with which properties. It could even map those properties onto a two-dimensional matrix of the polymer building blocks.
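
The article does not say which learning algorithm the group used, so the sketch below substitutes a simple stand-in: linear regression with a small ridge penalty, fitted in plain Python. To stay short it uses only the single-block counts as the fingerprint. The target values are placeholders generated from made-up per-block weights (`toy_weights`), not real band gaps or dielectric constants, so the example shows the train-then-predict workflow and nothing about actual polymer behavior.

```python
from itertools import product

BLOCKS = ["CH2", "C6H4", "CO", "O", "NH", "CS", "C4H2S"]

def block_counts(unit):
    """Simplest fingerprint: how many of each block type the unit contains."""
    return [float(unit.count(b)) for b in BLOCKS]

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        known = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - known) / M[r][r]
    return x

def fit_ridge(X, y, alpha=1e-6):
    """Weights w solving (X^T X + alpha I) w = X^T y."""
    n = len(X[0])
    gram = [[sum(row[i] * row[j] for row in X) + (alpha if i == j else 0.0)
             for j in range(n)] for i in range(n)]
    rhs = [sum(row[i] * t for row, t in zip(X, y)) for i in range(n)]
    return solve(gram, rhs)

def predict(weights, unit):
    """Estimate the property from the fingerprint alone."""
    return sum(w * c for w, c in zip(weights, block_counts(unit)))

# Made-up per-block contributions, used ONLY to generate placeholder targets.
# A real study would use quantum-mechanically computed properties here.
toy_weights = [1.0, 0.5, 0.8, 1.2, 0.9, 0.3, 0.2]

units = list(product(BLOCKS, repeat=4))
train_units, test_units = units[::10], units[5::10]

X_train = [block_counts(u) for u in train_units]
y_train = [sum(w * c for w, c in zip(toy_weights, x)) for x in X_train]
learned = fit_ridge(X_train, y_train)

# Compare predictions on unseen units with the rule that generated the data.
max_error = max(abs(predict(learned, u) - predict(toy_weights, u))
                for u in test_units)
print(max_error < 1e-3)  # True for this noise-free toy data
```

Because the placeholder targets come from a linear rule with no noise, the fit is essentially exact; real property data would not behave so neatly. The learned weights are one number per block, which is loosely analogous to the per-pair effects shown in the heat maps.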

These two-dimensional matrices are heat maps; they use color to indicate whether a particular pair of building blocks has a positive or negative effect on a property, and how large that effect is. For example, the matrix on the lower right shows the pairing of CS-C4H2S has a strong, positive effect on a polymer’s total dielectric constant. (Mannodi-Kanakkithodi et al., ‘Machine Learning Strategy for Accelerated Design of Polymer Dielectrics,’ Scientific Reports, February 2016)

Once the machine learned which atomic building block combinations gave which properties, it no longer needed the quantum mechanics calculations of atomic structure. It could accurately evaluate the band gap and dielectric constant for any polymer made of any combination of those seven building blocks, using just the numerical fingerprint of its structure.

Many of the predictions of quantum mechanics and the machine learning scheme have been validated by Ramprasad’s UConn collaborators, chemistry professor Greg Sotzing and electrical engineering professor Yang Cao. Sotzing actually made several of the novel polymers, and Cao tested their properties; they came out just as Ramprasad’s computations had predicted.

“What’s most surprising is the level of accuracy with which we can make predictions of the dielectric constant and band gap of a material using machine learning. These properties are generally computed using quantum mechanical methods such as density functional theory, which are six to eight orders of magnitude slower,” says Ramprasad. The group published a paper on their polymer work in Scientific Reports on Feb. 15. Another paper, which uses machine learning in a different manner, namely to discover laws that govern dielectric breakdown of insulators, will be published in a forthcoming issue of Chemistry of Materials.

But even if you don’t have access to those academic journals, you can see the predicted properties of every polymer Ramprasad’s group has evaluated in their online data vault, Khazana, which also provides their machine learning apps to predict polymer properties on the fly. They are also uploading data and the machine learning tools from their Chemistry of Materials work, and from an additional recent article published in Scientific Reports on Jan. 19 on predicting the band gap of perovskites, inorganic compounds used in solar cells, lasers, and light-emitting diodes.

Ramprasad is unusually willing to share his results, but that’s because he’s a theoretical materials scientist; what he wants to know is why materials behave the way they do. What about a polymer makes its dielectric constant just so? Or what makes an insulator withstand enormous electric fields without breaking down? But he also wants this understanding to be put to work to design new useful materials rationally. So he makes the results of his calculations freely available in the hope that someone else might look through them, see one, and go, “Wow. I’m looking for a material with exactly those properties!” and then make it. If it works as predicted, they’re both happy.

His work is aligned with a larger U.S. White House initiative called the Materials Genome Initiative. Much of Ramprasad’s work described here was funded by grants from the Office of Naval Research, as well as from the U.S. Department of Energy.