Automated pipeline for superalloy data by text mining

  • Weiren Wang
  • Xue Jiang
  • Shaohan Tian
  • Pei Liu
  • Depeng Dang
  • Yanjing Su
  • Turab Lookman
  • Jianxin Xie

Research output: Contribution to journal › Article › peer-review

72 Scopus citations

Abstract

Data provides a foundation for machine learning, which has accelerated data-driven materials design. The scientific literature contains a large amount of high-quality, reliable data, but automatically extracting it remains a challenge. We propose a natural language processing pipeline that captures both chemical composition and property data, enabling analysis and prediction of superalloys. Within 3 h, 2531 records with both composition and property are extracted from 14,425 articles, covering γ′ solvus temperature, density, solidus, and liquidus temperatures. A data-driven model for γ′ solvus temperature is built to predict unexplored Co-based superalloys with high γ′ solvus temperatures within a relative error of 0.81%. We test the predictions via synthesis and characterization of three alloys. A web-based toolkit is provided as an online open-source platform and is expected to serve as the basis for a general method to search for targeted materials using data extracted from the literature.

Original language: English
Article number: 9
Journal: npj Computational Materials
Volume: 8
Issue number: 1
State: Published - Dec 2022
Externally published: Yes
