CROSS-QUALITY FEW-SHOT TRANSFER FOR ALLOY YIELD STRENGTH PREDICTION: A NEW MATERIAL SCIENCE BENCHMARK AND AN INTEGRATED OPTIMIZATION FRAMEWORK

Abstract

Discovering high-entropy alloys (HEAs) with high yield strength is an important yet challenging task in material science. However, yield strength can only be measured accurately through very expensive and time-consuming real-world experiments, and hence cannot be acquired at scale. Learning-based methods could facilitate the discovery process, but the lack of a comprehensive dataset on HEA yield strength has created barriers. We present X-Yield, a large-scale material science benchmark with 240 experimentally measured ("high-quality") and over 100K simulated (imperfect or "low-quality") HEA yield strength annotations. Due to the scarcity of experimental annotations and the quality gap in the imperfectly simulated data, existing transfer learning methods do not generalize well on our dataset. We address this cross-quality few-shot transfer problem by leveraging model sparsification "twice": as a noise-robust feature learning regularizer at the pre-training stage, and as a data-efficient learning regularizer at the few-shot transfer stage. While the workflow already performs decently with ad-hoc sparsity patterns tuned independently for each stage, we take a step further by proposing a bi-level optimization framework, termed Bi-RPT, which jointly learns optimal masks and automatically allocates sparsity levels for both stages. The optimization problem is solved efficiently using gradient unrolling, which is seamlessly integrated with the training process. The effectiveness of Bi-RPT is validated through extensive experiments on our new and challenging X-Yield dataset, alongside other synthesized testbeds. Specifically, we achieve an 8.9% to 19.8% reduction in test mean squared error and a 0.98% to 1.53% improvement in test accuracy, using merely 5-10% of the experimental data. Code and sample data are in the supplement.
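To make the "sparsify twice" workflow concrete, the following is a minimal sketch of the ad-hoc baseline variant only, in which the two sparsity levels are set by hand. It is not Bi-RPT: the masks here come from simple magnitude pruning (a heuristic swapped in for illustration), not from the bi-level optimization with gradient unrolling. The linear model, the function names (`magnitude_mask`, `fit_masked`, `two_stage_sparse_transfer`), and the default sparsity levels `s_pre` and `s_ft` are all assumptions made for this example.

```python
import numpy as np


def magnitude_mask(weights, sparsity):
    """Boolean mask that prunes the `sparsity` fraction of smallest-magnitude weights."""
    k = int(round(sparsity * weights.size))  # number of weights to prune
    mask = np.ones(weights.shape, dtype=bool)
    if k > 0:
        prune_idx = np.argsort(np.abs(weights))[:k]
        mask[prune_idx] = False
    return mask


def fit_masked(X, y, w_init, mask, lr=0.05, steps=500):
    """Gradient descent on mean squared error; pruned weights are held at zero."""
    w = w_init * mask
    for _ in range(steps):
        grad = 2.0 * X.T @ (X @ w - y) / len(y)
        w = (w - lr * grad) * mask
    return w


def two_stage_sparse_transfer(X_low, y_low, X_high, y_high, s_pre=0.5, s_ft=0.75):
    """Hand-tuned two-stage baseline (assumed setup, not the learned Bi-RPT masks)."""
    d = X_low.shape[1]
    # Stage 1: pre-train on the large, noisy ("low-quality") data under a sparse mask.
    w_dense = fit_masked(X_low, y_low, np.zeros(d), np.ones(d, dtype=bool))
    mask_pre = magnitude_mask(w_dense, s_pre)
    w_pre = fit_masked(X_low, y_low, w_dense, mask_pre)
    # Stage 2: fine-tune on the few "high-quality" samples under a sparser mask.
    # Weights pruned in stage 1 are exactly zero, so they are pruned first here.
    mask_ft = magnitude_mask(w_pre, s_ft)
    w_ft = fit_masked(X_high, y_high, w_pre, mask_ft)
    return w_ft, mask_pre, mask_ft
```

Calling `two_stage_sparse_transfer` with a large simulated set and a handful of experimental samples returns the fine-tuned weights together with both masks. In this sketch the second mask is nested inside the first whenever `s_ft >= s_pre`; the proposed framework instead learns the masks and allocates the sparsity levels for both stages jointly.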

1. INTRODUCTION

Machine learning (ML) methods have recently demonstrated great promise in the important field of material science, and in this paper, we focus on ML-assisted high-entropy alloy (HEA) (Yeh et al., 2004) discovery and property prediction. HEAs possess promising properties that traditional alloys lack, such as extraordinary mechanical performance at high temperatures, making them well-suited options for various material applications. One particular property, the yield strength of HEAs, characterizes the maximum stress a material can endure before starting to deform plastically, and is a critical parameter for customized HEA design. However, accurately measuring the yield strength of a specific HEA requires expensive scientific experiments for each alloy. These experiments often involve hard-to-create experimental conditions, especially at high temperatures (mainly because of difficulties with oxidation control), as well as extremely long experimental durations. At high temperatures, these measurements are typically taken with the Gleeble system (Gle). The full process, from sample preparation to yield strength measurement, can take two to four weeks even for a team of domain experts, and includes melting the alloy, machining the sample, and preparing and mechanically testing it with the Gleeble. Therefore, it is challenging to acquire yield strength measurements from those "high-quality" experiments at scale.

