0081// Estimating the Sample Size for an Engineering Evaluation
27th Feb 2017 | Guan Boon, Wong
Sampling parts at the production site for an engineering evaluation is a common daily activity for engineers. However, not every engineer knows how to determine an effective sample size for an experiment. Collecting data with a large sample size is costly, while a small sample size gives an unreliable statistical study.
Today, we will discuss how to select the minimum sample size for an experiment using the R language.
(BTW, if you still remember, I discussed a similar topic in an older post using manual calculation.)
Before we start with R, we need to know which factors determine the power of a test.
Factor 1:
Sample size for each group (e.g. the quantity of parts you pick from production for inspection and categorize under the same controlled attributes)
Factor 2:
True difference between the populations.
(e.g. the actual flatness difference between good parts and bad parts.)
Factor 3:
Standard deviation of the population.
(e.g. the actual flatness standard deviation of both good parts and bad parts.)
Factor 4:
Significance level
(aka the probability of falsely rejecting a true null hypothesis)
(e.g. Null hypothesis: there is no difference between good parts and bad parts. However, the result shows a difference between the two groups, which is not true.)
Factor 5:
Power of the test
(aka 1 - the probability of failing to reject a false null hypothesis)
(e.g. Null hypothesis: there is no difference between good parts and bad parts. However, the result shows no difference between the two groups, which is not true. Power is the probability of avoiding this mistake.)
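To get a feel for how these factors interact, here is a minimal sketch that changes one factor at a time and recomputes the power with power.t.test, the R function used later in this post. All numbers here (n = 10, a difference of 1, a standard deviation of 1) are made up for illustration and have nothing to do with the flatness example.

```r
# Illustrative baseline (assumed values, not from production data).
base <- power.t.test(n = 10, delta = 1, sd = 1, sig.level = 0.05)$power

# Change one factor at a time and recompute the power.
more_samples   <- power.t.test(n = 20, delta = 1, sd = 1, sig.level = 0.05)$power
bigger_delta   <- power.t.test(n = 10, delta = 2, sd = 1, sig.level = 0.05)$power
bigger_sd      <- power.t.test(n = 10, delta = 1, sd = 2, sig.level = 0.05)$power
stricter_alpha <- power.t.test(n = 10, delta = 1, sd = 1, sig.level = 0.01)$power

# Print all five results side by side for comparison.
print(c(base = base, more_samples = more_samples, bigger_delta = bigger_delta,
        bigger_sd = bigger_sd, stricter_alpha = stricter_alpha))
```

More samples or a bigger true difference should raise the power, while a bigger standard deviation or a stricter significance level should lower it.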
The true difference and the standard deviation of the entire population are not known in advance. They have to be estimated from historical data collected on the production floor.
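As a rough sketch of that estimation step, the snippet below computes the difference in means and a pooled standard deviation from two small vectors. The vectors good and bad are hypothetical readings invented for this illustration, and pooling assumes both groups have the same spread.

```r
# Hypothetical historical flatness readings (made-up numbers, illustration only).
good <- c(0.101, 0.103, 0.099, 0.102, 0.100, 0.098)
bad  <- c(0.126, 0.124, 0.128, 0.125, 0.123, 0.127)

# Estimated true difference: absolute difference between the two group means.
delta_hat <- abs(mean(bad) - mean(good))

# Pooled standard deviation, assuming both groups share the same spread.
n_good <- length(good)
n_bad  <- length(bad)
sd_pooled <- sqrt(((n_good - 1) * var(good) + (n_bad - 1) * var(bad)) /
                  (n_good + n_bad - 2))

print(c(delta_hat = delta_hat, sd_pooled = sd_pooled))
```

In practice you would replace the two vectors with your own measurement records; the more historical data you have, the more trustworthy these two estimates become.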
Next, R has a very simple function for this calculation, power.t.test. Its arguments are:
n = sample size (per group)
delta = true difference (delta mu)
sd = standard deviation (sigma)
sig.level = significance level (alpha)
power = power of the test
type = type of t-test ("two.sample", "one.sample" or "paired")
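Written out with named arguments, a call could look like the sketch below. The values are placeholders chosen only to show the shape of the call. Exactly one of n, delta, sd, sig.level and power has to be NULL, and that is the one R solves for.

```r
# power.t.test is in the stats package, which R loads by default.
result <- power.t.test(
  n         = 20,           # sample size per group (placeholder value)
  delta     = 1,            # true difference between the two means (placeholder)
  sd        = 1,            # standard deviation (placeholder)
  sig.level = 0.05,         # significance level (alpha)
  power     = NULL,         # left as NULL, so the power is what gets computed
  type      = "two.sample"  # two independent groups; the default
)

print(result)         # summary of the inputs and the solved value
print(result$power)   # the solved value on its own
```

Naming the arguments is optional, but it saves you from having to remember their order.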
Example:
Your engineering manager asks you to randomly pick some parts from the production floor and use them in an evaluation to study whether there is any difference in flatness between good parts and bad parts.
Therefore, you need to decide how many good parts and bad parts to prepare.
Based on previous data collection, the difference in mean flatness between the good-part and bad-part populations is 0.025. Assume the population standard deviation is the same for good parts and bad parts, 0.002.
You can choose the significance level and the power of the test based on your evaluation requirements.
In this case, the significance level is 0.05 (95% confidence level) and the power of the test is 0.95.
Therefore, put all this information into the syntax below and leave the sample size portion as NULL:
power.t.test(NULL,0.025,0.002,0.05,0.95)
Based on the calculation, n = 1.730524. Since the sample size must be a whole number, round up: you need at least 2 samples from good parts and 2 samples from bad parts.
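The same calculation can be sketched with named arguments, followed by the round-up step and a quick check of the power you actually get with the rounded sample size. The variable names res, n_per_group and achieved are just for this illustration.

```r
# Solve for the sample size per group using the values from the example.
res <- power.t.test(n = NULL, delta = 0.025, sd = 0.002,
                    sig.level = 0.05, power = 0.95)

# A fractional part cannot be sampled, so round up to the next whole number.
n_per_group <- ceiling(res$n)

# Check the power actually achieved with the rounded-up sample size.
achieved <- power.t.test(n = n_per_group, delta = 0.025, sd = 0.002,
                         sig.level = 0.05)$power

print(n_per_group)
print(achieved)
```

The required n is this small because the assumed difference (0.025) is more than twelve times the standard deviation (0.002). With a smaller difference or a noisier process, the same call will ask for many more parts.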
Other applications of this syntax
If you need to find any one unknown among sample size, difference between populations, population standard deviation, significance level and power of the test, and all the other factors are given, simply replace that portion with NULL and the function will do the calculation for you. =)
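For instance, here is a minimal sketch that reuses the flatness numbers with a sample size of 5 parts per group. The 5 is an arbitrary choice for illustration. It solves first for the power and then for the smallest detectable difference.

```r
# Unknown = power: how much power do 5 parts per group give?
p <- power.t.test(n = 5, delta = 0.025, sd = 0.002,
                  sig.level = 0.05, power = NULL)$power

# Unknown = delta: smallest difference detectable with 5 parts per group
# at a power of 0.95.
d <- power.t.test(n = 5, delta = NULL, sd = 0.002,
                  sig.level = 0.05, power = 0.95)$delta

print(p)
print(d)
```

Note that sd and sig.level have default values in R (1 and 0.05), so to solve for either of them you must write sd = NULL or sig.level = NULL explicitly.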