Splitting Wafer Lots in Semiconductor Manufacturing Part I: The Statistician's Toolbox

May 17, 2000

Jack E. Reece, Reece Associates, Ltd., Lake George, CO, USA; and George A. Milliken, Kansas State University, Manhattan, KS, USA

Process engineers and their managers, particularly those involved in semiconductor manufacturing, work in a culture that must abhor the status quo. Continuous process improvement is essential for such an enterprise to remain competitive and to prosper. Tweaks in processes to improve device performance or yield are a way of life. Testing those improvements, though, demands an experimental design that minimizes the risk to production lots while gathering enough data for valid inferences. Part I of this article discusses the relevant statistical tools, Part II points out common analytical errors, and Part III explains the correct approach.

Contents

•Statistical Comparisons
•Analysis of Variance
•Types of Risk, the "Truth Table"

To test an idea for correcting or improving a particular process step, an engineer usually splits one or more wafer lots among one or more alternatives at that step. The split compares the new approach to the old one within the same lot. Since die yield and final electrical functions are critical, these experiments usually involve potentially expensive production lots. Purely "short loop" processes using test wafers may not give all the information necessary to support a decision. Because of the value of an individual production lot, management will allocate very few such lots (rarely more than three) for any experiment.

Using production lots requires small sample sizes, so engineers must use statistical tests to draw inferences from a particular experiment. Inferences based on statistical tests always include the risk of making an incorrect decision. Overlooking a small increment (or decrement) in performance or die yield can impact tens of thousands, if not millions, of dollars in revenue, depending on production volumes. Designing these experiments to minimize the risks of wrong decisions is economically critical.

The examples in the discussions that follow are based on actual experimental practices, but the data and the descriptions of the process steps have been altered to protect proprietary information.

Statistical Comparisons

The Student's t Test Statistic
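In its simplest form, a statistical inference compares the difference between the means of two groups to an estimate of the variation in those groups, generating a test statistic (Equation 1):

$$\text{test statistic} = \frac{\text{difference between group means}}{\text{estimate of variation}}$$

Since such comparisons always involve small samples instead of the entire population of observations, the expression becomes that for generating a Student's t test statistic (Equation 2):

$$t_{\mathrm{obs}} = \frac{\bar{y}_1 - \bar{y}_2}{s_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$$

where $\bar{y}_1$ and $\bar{y}_2$ are the means of the two groups, $s_p$ is the pooled standard deviation from the two groups, $n_1$ and $n_2$ are the number of observations in each group, and $t_{\mathrm{obs}}$ is the Student's t test statistic.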

The analyst compares the absolute value of the observed test statistic generated from the data, $t_{\mathrm{obs}}$, to a "critical" value for that statistic, $t_{\alpha/2,\,n}$, obtained from the Student's t distribution. The size of the critical value depends on the sample size (represented by n) and on α, the risk the observer is willing to accept of concluding that the groups are different when they really are not. Usually the acceptable risk is no more than 0.05 to 0.10. If the observed test statistic is greater than the critical value, then the observer concludes that a difference exists between the means of the two groups and makes the appropriate decision. However, it is absolutely essential that the analyst use the correct estimate of variation between the groups in computing the observed Student's t statistic.

This discussion has considered only the α risk associated with a decision. A later section defines this risk and discusses the other type of risk (β) inherent in statistical tests.
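The pooled Student's t comparison described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' procedure: the measurements are hypothetical, the function name `pooled_t_statistic` is invented for the example, and the critical value is read from a standard t table for α = 0.05 (two-sided) with n1 + n2 − 2 = 10 degrees of freedom. The sketch also assumes the twelve values are independent observations, which is exactly the kind of assumption behind the warning about using the correct estimate of variation.

```python
import math
import statistics


def pooled_t_statistic(group1, group2):
    """Return (t_obs, degrees of freedom) for two independent groups."""
    n1, n2 = len(group1), len(group2)
    mean1, mean2 = statistics.mean(group1), statistics.mean(group2)
    var1, var2 = statistics.variance(group1), statistics.variance(group2)
    # Pooled variance: each group's variance weighted by its degrees of freedom.
    sp2 = ((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2)
    t_obs = (mean1 - mean2) / (math.sqrt(sp2) * math.sqrt(1 / n1 + 1 / n2))
    return t_obs, n1 + n2 - 2


# Hypothetical measurements (assumed for illustration, not from the article).
old_process = [10.1, 9.8, 10.3, 10.0, 9.9, 10.2]
new_process = [10.4, 10.6, 10.2, 10.5, 10.7, 10.3]

t_obs, df = pooled_t_statistic(old_process, new_process)

# Two-sided critical value for alpha = 0.05 and 10 degrees of freedom,
# taken from a standard Student's t table.
T_CRIT = 2.228

print(f"t_obs = {t_obs:.3f} with {df} degrees of freedom")
print("difference detected" if abs(t_obs) > T_CRIT else "no difference detected")
```

Hard-coding the table value keeps the sketch free of third-party dependencies; a statistics package would normally supply the critical value for other sample sizes or risk levels.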


Analysis of Variance

Comparisons using the Student's t statistic are limited to comparing an average of observations to a desired value or to comparing two averages to each other. Using this statistic for any comparison involving more than two sets of data is not valid. Analysis of variance techniques partition observations into groups according to the process treatments that produced the groups. The method, invented by Sir Ronald Fisher nearly 80 years ago, can accommodate any number of group comparisons. The technique compares the variation of the group (treatment) averages about the overall average of the observations to the inherent variation in the data and generates a test statistic called Fisher's F-ratio (Equation 3):

$$F_{\mathrm{obs}} = \frac{s^2_{\mathrm{groups}}}{s^2_{\mathrm{error}}}$$

where $s^2_{\mathrm{groups}}$ estimates the variation in the data between groups, $s^2_{\mathrm{error}}$ estimates the variation in the data within groups (the random noise or statistical error), and $F_{\mathrm{obs}}$ is the Fisher F-ratio.

As with the Student's t test statistic, the analyst compares the observed test statistic $F_{\mathrm{obs}}$ to a critical value for that statistic, $F_c$, given the number of observations and the level of α risk allowed. If the test statistic is greater than the critical value, then the observer concludes that a difference exists among the group means and makes the appropriate decision. As before, the validity of the conclusions from this test depends on the correct estimate of error for the calculation. Furthermore, this test acknowledges only α risk and does not consider β risk.
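A one-way analysis of variance of the kind described above can be sketched as follows. The three treatment groups are hypothetical, the function name `one_way_f_ratio` is invented for the example, and the within-group mean square is used as the error term on the assumption that every observation is independent; whether that is the correct error estimate for a given split-lot experiment is a separate question. The critical value is taken from a standard F table for α = 0.05 with 2 and 9 degrees of freedom.

```python
import statistics


def one_way_f_ratio(groups):
    """Return (F_obs, df_between, df_within) for a list of treatment groups."""
    all_values = [y for group in groups for y in group]
    grand_mean = statistics.mean(all_values)
    k, n_total = len(groups), len(all_values)
    # Variation of the group averages about the overall average.
    ss_between = sum(
        len(group) * (statistics.mean(group) - grand_mean) ** 2 for group in groups
    )
    # Variation of the observations about their own group average (noise).
    ss_within = sum(
        (y - statistics.mean(group)) ** 2 for group in groups for y in group
    )
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within, k - 1, n_total - k


# Hypothetical responses for three treatments (assumed for illustration).
treatments = [
    [10.0, 12.0, 11.0, 11.0],
    [13.0, 14.0, 12.0, 13.0],
    [11.0, 10.0, 12.0, 11.0],
]

f_obs, df_between, df_within = one_way_f_ratio(treatments)

# Critical value for alpha = 0.05 with (2, 9) degrees of freedom,
# taken from a standard F table.
F_CRIT = 4.26

print(f"F_obs = {f_obs:.2f} on ({df_between}, {df_within}) degrees of freedom")
print("difference detected" if f_obs > F_CRIT else "no difference detected")
```

A significant F-ratio says only that at least one group mean differs from another; it does not say which one.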


Types of Risk, the "Truth Table"

In the previous sections, the conclusions rely on some assumed α risk. This is the chance an observer is willing to take in concluding that a difference exists between group means, when it really does not. For the Student's t test the analysis involves two opposing hypotheses regarding the unknown populations that produced the data being analyzed. Table 1 summarizes these hypotheses.

In the analysis of variance case the null hypothesis states that the variation seen due to assigning the data to groups is indistinguishable from the inherent noise in the system, i.e., the means of the groups are not different. The alternative hypothesis states that the variation introduced by the group assignments is greater than the inherent noise, i.e., some difference exists among the means of the groups. If the analyst concludes that the alternative hypothesis is likely true, then he has determined that the average of at least one of the groups differs from that of another group, and that some treatment or group effect exists.

The "Truth Table" shown in Table 2 illustrates both types of risks one takes in making decisions from data.

An α error occurs when the means of the underlying populations are actually the same, but the analyst concludes that a difference between them exists. A β error occurs when the means of the underlying populations are actually different, but the analyst concludes that they are the same. Obviously, the desirable situation is that both of the risks are relatively small, usually 0.1 or less. That is, no one wishes to allow more than about a 10% chance of making the wrong decision, regardless of type. Another term used to describe a statistical test is the "power" of that test, defined as 1 − β. Therefore, if a test has a power of 0.9, the analyst would say that the test has a 90% chance of detecting a specified difference, given the conditions of the test. The conditions of the test include the number of observations to be collected, the size of the difference between groups that the experiment must detect, and the variation in the collected data.
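The two error rates defined above can be estimated by simulation. The sketch below is an illustration under stated assumptions rather than anything taken from the article: it draws normally distributed data with a known standard deviation, uses six independent observations per group, applies the pooled Student's t test at α = 0.05, and counts how often the test declares a difference. The function names, the seed, and the true difference of two standard deviations are all hypothetical choices.

```python
import math
import random
import statistics

# Two-sided critical value for alpha = 0.05 and 10 degrees of freedom
# (standard t table); it is only valid for n = 6 observations per group.
T_CRIT = 2.228


def rejects_null(group1, group2):
    """Pooled Student's t test: True if a difference is declared."""
    n1, n2 = len(group1), len(group2)
    sp2 = (
        (n1 - 1) * statistics.variance(group1)
        + (n2 - 1) * statistics.variance(group2)
    ) / (n1 + n2 - 2)
    t_obs = (statistics.mean(group1) - statistics.mean(group2)) / math.sqrt(
        sp2 * (1 / n1 + 1 / n2)
    )
    return abs(t_obs) > T_CRIT


def rejection_rate(true_difference, n=6, sigma=1.0, trials=2000, seed=1):
    """Fraction of simulated experiments in which a difference is declared."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        group1 = [rng.gauss(0.0, sigma) for _ in range(n)]
        group2 = [rng.gauss(true_difference, sigma) for _ in range(n)]
        rejections += rejects_null(group1, group2)
    return rejections / trials


# No real difference: every declared difference is an alpha error.
alpha_hat = rejection_rate(true_difference=0.0)
# Real difference of 2 sigma: every missed difference is a beta error.
power_hat = rejection_rate(true_difference=2.0)
beta_hat = 1 - power_hat

print(f"estimated alpha risk: {alpha_hat:.3f}")
print(f"estimated power:      {power_hat:.3f}")
print(f"estimated beta risk:  {beta_hat:.3f}")
```

Rerunning `rejection_rate` with a smaller `true_difference` or a larger `sigma` shows how quickly the power falls, and the β risk rises, when the sample size stays fixed.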

One of the authors (JR) prefers to use the following analogies to help keep the definitions of α- and β-risks meaningful: α-Risk is like seeing a ghost or apparition. The observer sees something that really does not exist. β-Risk is like stepping off a curb into the path of an oncoming garbage truck. The observer failed to see something potentially very important.

Consider the following scenario: For a particular process step an engineer is considering changing a process to an alternative one that is somewhat simpler and potentially less expensive to operate. Analysis of the data shows that the minor yield loss with the alternative process is not statistically significant at some level of α risk. That is, the engineer concludes that no difference exists between the two alternatives and recommends changing the process to the new, simpler one. However, depending on how he conducted the experiment and analyzed the data, considerable and potentially unknown β risk may exist: the test may not be able to detect an actual detrimental difference when one exists. The "truth" may not become known until long after the change has been made and yields have consistently been lower than before, resulting in an economic loss.

The responsible professional engineer neither wishes to report an overly optimistic potential for a process change (α-risk) nor be party to an experiment that might fail to detect a potential effect or problem (β-risk).

The number of observations (replications) required to decide whether or not a difference exists between two processes depends on

  • The acceptable levels of the α and β errors
  • The size of the difference between alternatives one wishes to detect (Δ)
  • The inherent actual variation in the measurement process.

Logically, an investigator would elect to keep both types of risk as small as possible and would choose to detect a small difference between processes. Unfortunately, given some level of variation in the process, these objectives can work together to increase the number of observations (replications) required – sometimes to an unreasonably large number if the investigator elects to use extremely restrictive values. Parts II and III of this article explain how alternative experimental designs can achieve quite different numbers of observations, and hence levels of risk, while risking the same number of production wafers.
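The trade-off among the risks, the detectable difference, and the variation can be made concrete with a common textbook approximation for comparing two means, n ≈ 2(z_{α/2} + z_β)²σ²/Δ² observations per group. This formula is not given in the article; it is brought in here as an assumption, it relies on a normal approximation with a known standard deviation σ, and it tends to understate the requirement slightly when samples are small. The function name `observations_per_group` is hypothetical.

```python
import math
from statistics import NormalDist


def observations_per_group(alpha, beta, sigma, delta):
    """Approximate observations per group for a two-sided, two-sample test.

    Normal-approximation formula (an assumption, not from the article):
    n = 2 * ((z_{alpha/2} + z_beta) * sigma / delta) ** 2, rounded up.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(1 - beta)
    return math.ceil(2 * ((z_alpha + z_beta) * sigma / delta) ** 2)


# Halving the detectable difference roughly quadruples the required sample.
for delta in (2.0, 1.0, 0.5):
    n = observations_per_group(alpha=0.05, beta=0.10, sigma=1.0, delta=delta)
    print(f"delta = {delta} sigma -> about {n} observations per group")
```

Because Δ enters the formula squared, asking the experiment to detect a difference half as large costs about four times as many observations at the same α and β.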

For more information
"Statistically Speaking" is intended to help readers use statistical methods to solve process problems. Readers can pose questions for future columns through a companion Discussion Forum.

Jack Reece is a member of Semiconductor Online's advisory board. He can be reached at:
PO Box 308
Lake George, CO 80827 USA
Voice: +1 719-748-8641
FAX: +1 719-748-8642
jreece@pcisys.net

George A. Milliken is a Professor of Statistics at Kansas State University, specializing in the analysis of "messy data." He has extensive consulting contracts in agricultural and biological sciences (pharmaceuticals) as well as in conventional manufacturing. He is co-author, with fellow Kansas State University professor Dallas Johnson, of a landmark text on "Analysis of Messy Data." He can be reached at:
Department of Statistics
Kansas State University
Manhattan, KS 66506 USA
Voice: +1 (785) 532-0514
milliken@ksu.edu
