
I am using control limits to check whether a process is going out of control and to detect mean shifts over time.

I am aware of how to apply the control limits, but not really sure about the statistical background behind it.

Now that I have set the background, I'll come to the question: I have a process for which a lot of variables are measured, and every product that goes through the process is measured for these variables (since the testing is automated).

1. Since I'm measuring all the products that pass through the process, am I looking at the population or a sample?
2. If it is a population, should I sample or not? I know X-bar and R charts are based on sampling. But is that design meant to cope with the computational and operational complexity of measuring and calculating for each and every product, or is there a deeper statistical reason behind it?
3. Can I somehow use all the observations in an Xbar-R chart, without sampling? (If I choose a rational subgroup of one day, the subgroup sizes are variable, and that won't fit into the standard calculations.)
4. If it is a sample, are I-MR charts my only option? Sampling a sample doesn't make a lot of sense to me.


Edit: After a discussion with whuber, I have decided to rephrase my question.

Since I am measuring all the products that pass through my process, I have varying subgroup sizes. I know I-MR charts are an option, but what about Xbar-R charts?

Is it possible to have an Xbar-R chart even with varying subgroup sizes?

  • Is the sigma estimator the only constraint against using variable subgroup sizes in an Xbar-R chart?
  • If we find a way around the sigma estimator, use the actual SD of each subgroup, and then combine them to get an overall SD, would that be a statistical blunder?
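For concreteness, here is a small sketch (the measurement values are made up, and the pooled-SD estimator is just the simplest choice; SPC software typically uses bias-corrected estimators) of how I understand the usual 3-sigma X-bar limits to depend on each subgroup's size:

```python
import math

# Hypothetical daily subgroups of measurements (varying sizes).
subgroups = [
    [10.1, 9.9, 10.2, 10.0],
    [9.8, 10.3, 10.1],
    [10.0, 10.2, 9.9, 10.1, 10.0],
]

means = [sum(g) / len(g) for g in subgroups]
grand_mean = sum(sum(g) for g in subgroups) / sum(len(g) for g in subgroups)

# Pooled within-subgroup standard deviation (simple sigma estimate).
ss = sum(sum((x - m) ** 2 for x in g) for g, m in zip(subgroups, means))
dof = sum(len(g) - 1 for g in subgroups)
sigma = math.sqrt(ss / dof)

# The X-bar limits depend on each subgroup's size n_i,
# so varying sizes give varying (stair-step) control limits.
for g, m in zip(subgroups, means):
    n = len(g)
    ucl = grand_mean + 3 * sigma / math.sqrt(n)
    lcl = grand_mean - 3 * sigma / math.sqrt(n)
    print(f"n={n}: mean={m:.3f}, LCL={lcl:.3f}, UCL={ucl:.3f}")
```

The point the sketch makes is that larger subgroups get tighter limits, so a single fixed pair of limits cannot be exact for all subgroup sizes.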

Any help is hugely appreciated. Even if you guys can direct me to the relevant literature, I can try and read up on the same.

mdewey
Manu Joseph
  • I think your initial sentence might hold the key to answering all your questions: you are studying a process. Therefore there is no "population": all your data are samples of that process. – whuber Aug 18 '17 at 13:40
  • @whuber So, even if I am measuring the characteristics of all the products that pass through the process, it is still a sample because it is not all the possible values that can come out of that process. Am I right? And in that case, the concept of rational subgroups is essentially lost, isn't it? And the only option I have is to stick to I-MR charts? – Manu Joseph Aug 18 '17 at 14:41
  • It comes down to your stated purpose: to check whether a process is in control. I don't see how that would change the concept of a rational subgroup or limit your options at all. Your question basically comes down to "how do I apply standard statistical quality control procedures"--there doesn't seem to be anything out of the ordinary in your situation. – whuber Aug 18 '17 at 15:18
  • @whuber The only confusion I had comes down to two questions: 1. Is sampling necessary for SPC, or was it introduced just for convenience? If I have data from all the tests, should I lose data by sampling? 2. Without sampling, is it possible to have an X-bar chart? If I choose a rational subgroup of one day, but each day has a variable size, can I use an X-bar chart? – Manu Joseph Aug 18 '17 at 17:12
  • You are sampling! Although it's true you are testing every object that comes off the line, for the purposes of assessing the control of the process, you still just have a sample of it. If your objective were to quantify properties of the particular objects that you have produced, then you would have the full population: but then any conclusions based on treating them as a population would apply solely to those objects. – whuber Aug 18 '17 at 17:57
  • Ah! Thank you for that; it made sense. So as a conclusion, since I have varying subgroup sizes, I should stick with the I-MR chart. Or is it possible to have the Xbar-R chart even with varying subgroup sizes? Is the sigma estimator the only constraint against variable subgroup sizes in an Xbar-R chart? If we use the actual SD for each subgroup and then combine them to get an overall SD, would that be a statistical blunder? (Maybe this should be another question?) – Manu Joseph Aug 19 '17 at 02:33
  • Those are all good questions. Since (so far) nobody has answered or even voted on your post, please feel free to edit it as radically as you like. The editing will cause it to act like a new post, but readers will have the advantage of seeing this comment thread (if it remains relevant), which might give them some context and save you some effort. – whuber Aug 19 '17 at 14:09
  • You can use Xbar-R charts with varying subgroup sizes. As you have identified, there are two issues: estimating the (long-term, population) mean and SD, and defining control limits. These may or may not apply to different sets of subgroups. This is fairly standard stuff and there are different approaches available. I'm not at a 'proper' computer so can't help fully, but you could browse the Minitab help; e.g. https://support.minitab.com/en-us/minitab/18/help-and-how-to/quality-and-process-improvement/control-charts/how-to/variables-charts-for-subgroups/xbar-chart/methods-and-formulas/xbar-chart/ – user20637 Aug 20 '17 at 19:52
  • @user20637 Thank you for that link. But that method would mean I have a different control limit for each subgroup. I would much prefer to have a single control limit. I don't plan to recalculate the control limits for every subgroup: calculate the control limits when the process is stable, and measure future data points against them, only recalculating every month or so. In addition to that, I also plan to use pattern recognition to predict a failure ahead of time. – Manu Joseph Aug 21 '17 at 05:11
  • @ManuJoseph Sorry, that's not the way it works. Even for a stable process, control limits depend on subgroup size. For Xbar it's simple - the sampling distribution of the mean (https://en.wikipedia.org/wiki/Mean#Distribution_of_the_sample_mean). For R it's not simple - http://support.minitab.com/en-us/minitab/18/help-and-how-to/quality-and-process-improvement/control-charts/how-to/variables-charts-for-subgroups/r-chart/methods-and-formulas/r-chart/#control-limits. "Pattern recognition" sounds as if you're getting into deep water; you might Google 'multivariate process control'. – user20637 Aug 22 '17 at 08:17
  • Hmmm, understood. In that case, can I just take a fixed number of observations in each rational subgroup and run an Xbar-R chart? It wouldn't be like sampling a sample, right? And I did look into multivariate process control, but decided against it because the variables I have as observations do not have a causal relationship between them. That is why I turned to ANN-based pattern recognition, which will tell me the shifts and cyclical trends in the control charts and warn of the failure of a process beforehand. – Manu Joseph Aug 22 '17 at 14:56
  • Few words, so I'm blunt - sorry :-0 Ignoring valid data makes the system less effective at detecting 'out-of-control'. You can have constant control limits and instead scale the plotted points, but I guess you don't want that either, and it's not easy for R charts. There are established ways to detect shifts and cyclical trends before a value falls outside 3 sigma; maybe start at https://en.wikipedia.org/wiki/Control_chart#Rules_for_detecting_signals. You seem to be trying to "improve" control charts without fully understanding the current state of the art. You need more support than Cross Validated. – user20637 Aug 23 '17 at 08:01
  • @user20637 From a statistical point of view it may make sense to recalculate the control limits every time a new data point is available, but in practice I don't think that is wise. Once the control limits are calculated, they should only be recalculated if there is a definite improvement you have made to the process. If you keep recalculating with every sample you take, your control limits will just get wider as the process deteriorates; and therein lies my contention that you should have stable control limits, not ones that change with every sample. – Manu Joseph Aug 25 '17 at 11:20
  • @user20637 And also, I am aware of the rules you can apply to identify the shifts and trends. While effective, they are still rudimentary and were developed primarily for use on the shop floor. In this day and age, when we have computers to churn through loads of data, it is only wise to use them to get better pattern recognition in place. That is why I turned to ANN-based pattern recognition. – Manu Joseph Aug 25 '17 at 11:23

1 Answer


After some research into the topic, I have stumbled upon two papers which address this point:

1. Burr, Irving W. (1969). "Control Charts for Measurements with Varying Sample Sizes". ASQC.
2. Nelson, Lloyd S. (1989). "Standardization of Shewhart Control Charts". ASQC.

An excerpt from Nelson (1989):

When subgroup sizes differ there are three approaches usually recommended.
1. Draw the actual control limits for each subgroup separately.
2. Use the average of the subgroup sizes and calculate limits based on this average size, and calculate the exact limit whenever doubt exists.
3. Standardize the statistic to be plotted and plot the results on a chart with a centerline of zero and limits at ±3.

He goes on to explain why approaches 1 and 2 are not ideal:

The first alternative may yield not only a messy chart but also one to which runs tests cannot be applied—specifically the trend and zigzag tests

The second alternative can also have the kind of problem just described. Further, one must be ready to calculate the exact limit when the approximate one is called into question

The third alternative, standardization, yields a neat chart for which interpretation is not a problem. The centerline is always at zero (although it is desirable to indicate on the chart the value of the mean of the original data), and because the vertical scale is a “sigma scale,” the zones for carrying out tests for special causes are always at ±1, ±2, and ±3.

The formulae for implementing the standardized approach are in Table 1 below.

[Image: Table 1, standardization formulae from Nelson (1989)]
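As a minimal sketch of the standardized approach for the X-bar case (the subgroup summaries and sigma value below are made up; in practice sigma comes from a pooled within-subgroup estimator over a stable baseline period), the plotted statistic is z = (x̄ − x̿) / (σ/√n), compared against fixed limits of ±3:

```python
import math

# Hypothetical process parameters from a stable baseline period.
grand_mean = 10.0   # estimated process mean (x-double-bar)
sigma = 0.15        # estimated within-subgroup standard deviation

# Hypothetical subgroup summaries: (subgroup mean, subgroup size).
subgroup_stats = [(10.05, 4), (9.90, 3), (10.12, 5), (9.97, 6)]

# Standardized statistic: centerline 0, limits +/-3, regardless of n.
for xbar, n in subgroup_stats:
    z = (xbar - grand_mean) / (sigma / math.sqrt(n))
    flag = "OUT" if abs(z) > 3 else "in"
    print(f"n={n}: z={z:+.2f} ({flag} of control)")
```

Because every point is on the same "sigma scale", the zone tests for special causes can be applied at ±1, ±2, and ±3 even though the subgroup sizes differ.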

For more details, please refer to the papers mentioned above. I hope this will be useful for someone who stumbles across the same scenario as mine.

Manu Joseph
  • I was not suggesting recalculating global sigma each sample, just letting limits reflect sample size. Considering your comment about "this day and age" those references are old. Standardisation is what I meant when I said "You can have constant control limits and instead scale plotted points, but I guess you don't want that either, and it's not easy for R charts"; R charts are not included in your Table 1. – user20637 Aug 26 '17 at 17:30
  • The "this day and age" remark was about the Western Electric rules and similar pattern-detection rules, not about a machine learning algorithm. In my case there are hundreds of variables to monitor; you have to have a machine learning algorithm to identify the trends.

    And I really didn't realize you were talking about the methodology I posted when you said that; it would have saved me a lot of trouble if the comment had been a little clearer. Anyway, you're right about the R chart. Can you let me know the R chart and S chart equivalents? I'll edit the answer and include them.

    – Manu Joseph Aug 27 '17 at 04:26