Importance of feature selection in machine learning and adaptive design for materials

Research output: Chapter in Book/Report/Conference proceeding › Chapter › peer-review

29 Scopus citations

Abstract

In materials informatics, features (or descriptors) that capture trends in the structure, chemistry, and/or bonding for a given chemical composition are crucial. Here, we explore their role in the accelerated search for new materials using machine learning (ML) adaptive design. We focus on a specific class of materials referred to as apatites [A10(BO4)6X2], and our objective is to identify the apatite compound with the largest band gap (Eg) without performing density functional theory calculations over the entire composition space. We construct three datasets that use three sets of features of the A, B, and X ions (ionic radii, electronegativities, and the combination of both) and independently track which of these sets most rapidly finds the composition with the largest Eg. We find that the combined feature set performs best, followed by the ionic radii feature set. The reason for this ranking is the B-site ionic radius, which is the key Eg-governing feature and appears in both the ionic radii and combined feature sets. Our results show that a relatively poor ML model with large error that nevertheless contains the key features can be more efficient in accelerating the search than a low-error model that lacks such features.
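
The comparison described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the data are synthetic (not apatite band gaps), the surrogate is a simple k-nearest-neighbour average, and the selection rule is pure exploitation (evaluate the candidate with the highest predicted value). The abstract does not say which regressors or selectors the chapter uses, so this should not be read as the authors' method. The names `knn_predict` and `adaptive_search` are hypothetical.

```python
import random


def knn_predict(train, query, k=3):
    """Mean target of the k training points nearest to `query` (Euclidean)."""
    nearest = sorted(
        train,
        key=lambda point: sum((a - b) ** 2 for a, b in zip(point[0], query)),
    )[:k]
    return sum(y for _, y in nearest) / len(nearest)


def adaptive_search(features, targets, n_init=5, seed=0):
    """Count the evaluations needed before the best candidate is measured.

    `targets` stands in for the expensive calculation: a value is only
    "known" to the surrogate once its index has been added to `measured`.
    """
    rng = random.Random(seed)
    n = len(features)
    best = max(range(n), key=lambda i: targets[i])
    measured = set(rng.sample(range(n), n_init))  # random initial training set
    while best not in measured:
        train = [(features[i], targets[i]) for i in measured]
        unmeasured = [i for i in range(n) if i not in measured]
        # Pure exploitation: evaluate the candidate with the highest prediction.
        measured.add(max(unmeasured, key=lambda i: knn_predict(train, features[i])))
    return len(measured)


# Synthetic stand-in: the target depends on one "key" feature plus small noise;
# the second feature is irrelevant to the target.
rng = random.Random(1)
candidates = [(rng.random(), rng.random()) for _ in range(60)]
targets = [3.0 * key + 0.1 * rng.random() for key, _ in candidates]

feature_sets = {
    "key only": [(key,) for key, _ in candidates],
    "irrelevant only": [(other,) for _, other in candidates],
    "combined": list(candidates),
}
evaluations = {
    name: adaptive_search(feats, targets) for name, feats in feature_sets.items()
}
print(evaluations)  # evaluations used per feature set; fewer means a faster search
```

Tracking the number of evaluations rather than the surrogate's prediction error mirrors the abstract's distinction between model accuracy and search efficiency.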

Original language: English
Title of host publication: Springer Series in Materials Science
Publisher: Springer Verlag
Pages: 59-79
Number of pages: 21
DOIs
State: Published - 2018

Publication series

Name: Springer Series in Materials Science
Volume: 280
ISSN (Print): 0933-033X
