Improving Wafer Table Swap Classification in Lithography Systems Using Positive And Unlabeled Learning
Summary
When wear and tear leaves a wafer table (WT) unable to meet the operational criteria,
it must be swapped for a new table. These swaps are performed at the semiconductor fabs of ASML’s customers. However, when replacing a WT on dry XT machines, customers may
choose third-party suppliers or deviate from the standard recovery protocol, which significantly
limits ASML’s visibility into WT swap events. These blind spots lead to unreliable remaining useful lifetime (RUL) estimates, impacting WT development. The current rule-based model, which
relies on diagnostic tests that are part of the recovery protocol, has proven insufficient. A Kaplan-Meier
(KM) analysis reveals the model to be oversensitive early in the wafer table’s lifetime while
failing to capture many run-to-failure cases. As a result, the estimated remaining useful lifetimes
are inconsistent with real-world observations. This research aims to develop a machine learning
methodology that surpasses the rule-based model, enabling realistic KM estimations. The main
challenges of this work are the limited sample size of 64 labeled data points and discrepancies between model features and actual wafer table swap events. Through a literature review, we introduce
a semi-supervised Positive and Unlabeled (PU) learning scenario, where the set of 64 swap labels
forms the positive class, while all remaining samples are treated as unlabeled. This unlabeled
set contains both positive and negative samples for wafer table swaps. PU learning is found to be
highly dependent on several key assumptions such as class priors and label selection mechanisms,
which must be examined in the context of wafer table swaps. In a feature exploration analysis on
the ASML data, features were identified whose predictive value is independent of the recovery
protocol chosen by the customer. Incorporating wafer table push time and productivity data into
the diagnostic tests enabled a logistic regression model that clearly outperforms the rule-based approach. The Naive and Two-Model PU learning–adapted logistic regression classifiers were evaluated on both open-source and ASML datasets. Results on the open-source datasets show that both
classifiers require only a small fraction of the samples to be labeled. However, evaluating performance on the ASML data remains challenging due to the presence of unlabeled samples in the test
set.
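
To illustrate why unobserved swaps distort the KM analysis mentioned above, the following minimal sketch implements the standard Kaplan-Meier product-limit estimator in plain Python and compares two labelings of the same lifetimes. The function name `kaplan_meier`, the lifetimes, and the event flags are hypothetical and chosen for illustration only; they are not ASML data. The sketch also assumes, as a simplification, that a missed swap shows up as a censored observation at the same point in time.

```python
from collections import Counter


def kaplan_meier(durations, observed):
    """Return [(time, survival)] at every time with at least one observed event.

    durations: lifetime of each wafer table (arbitrary units)
    observed:  True if the swap was seen, False if the record is censored
    """
    events = Counter(t for t, seen in zip(durations, observed) if seen)
    exits = Counter(durations)  # every table leaves the risk set at its duration
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = events.get(t, 0)
        if d:
            # Product-limit step: S(t) = S(t-) * (1 - d_t / n_t)
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]
    return curve


# Hypothetical lifetimes of five wafer tables (illustrative values only).
durations = [2, 3, 3, 5, 8]

# Case A: every swap except the last table (still running) is observed.
all_seen = [True, True, True, True, False]

# Case B: one swap at time 3 is missed and therefore treated as censored.
one_missed = [True, False, True, True, False]

curve_all = kaplan_meier(durations, all_seen)
curve_missed = kaplan_meier(durations, one_missed)

for (t, s_all), (_, s_missed) in zip(curve_all, curve_missed):
    print(f"t={t}: all swaps seen S={s_all:.2f}, one swap missed S={s_missed:.2f}")
```

In this toy example, hiding a single swap raises the estimated survival probability at time 3 from roughly 0.4 to roughly 0.6, which is the kind of overly optimistic lifetime estimate that unobserved swap events can cause.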
