CEP ONLINE: A WEB-ORIENTED EXPERT SYSTEM FOR STATISTICAL PROCESS CONTROL

Louzada, Francisco; Ferreira, Paulo; Ara, Anderson; Godoy, Caroline

doi:10.1590/0101-7438.2019.039.01.0177

ABSTRACT

In this paper, a new software system for Statistical Process Control (SPC) is proposed. The system, called CEP Online, was developed using the statistical computing resources of well-known free technologies, such as HTML, PHP, R and MySQL, on an online server running Linux Ubuntu. The main univariate and multivariate SPC tools are available for monitoring and evaluating manufacturing and non-manufacturing production processes over time. Some advantages of the new software are: (i) low operational cost, since it is cloud-based and needs only a computer connected to the Internet; (ii) ease of use, with strong user interaction; (iii) no required investment in any specific hardware or software; (iv) real-time generation of reports on process condition monitoring and process capability. Thus, CEP Online offers SPC practitioners fast, efficient and accurate SPC procedures, and becomes an important resource for those who have no access to non-free software, such as SAS, SPSS, Minitab and STATISTICA. To the best of our knowledge, CEP Online is unique with respect to its characteristics.

Keywords:
Control Charts; Capability Indices; Instantaneous Reports

1 INTRODUCTION

The increasing enterprise competitiveness and increasing consumer requirements, along with the globalization and world computerization, have caused significant changes in production of manufacturing and non-manufacturing environments worldwide. Many enterprises, particularly industries, have been faced with the need for improving their products. As a consequence, the quality control of their products has become extremely important to have customer satisfaction and to generate profit.

The standard methodology for quality improvement is Statistical Process Control (SPC), which, according to Montgomery [17] (MONTGOMERY DC. 2009. Introduction to Statistical Quality Control. John Wiley & Sons, 7 ed., 768 pp.), consists of a powerful set of tools used in achieving process stability and improving capability through the reduction of variability. SPC can be applied in any process involving a repetitive sequence of steps, i.e. it can be applied in both manufacturing and non-manufacturing processes.

Nowadays, there are several statistical software packages that can be used to generate SPC analyses (e.g., R, SAS, SPSS, Minitab, STATISTICA). However, with the exception of R, most of them can only be used by purchasing a license, which has a relatively high cost for certain sizes and types of enterprises. In this paper, we introduce a new software system for SPC, called CEP Online (CEP stands for “Controle Estatístico de Processo”, which is the Portuguese translation of Statistical Process Control). As similarly described in Louzada & Ara [13] (LOUZADA F & ARA A. 2018. MWStat: A modulated web-based statistical system. Pesquisa Operacional, 38(2): 291-306.), its development is based on the statistical computing resources of well-known free technologies, such as HTML, PHP, R and SQL, on an online server running Linux Ubuntu. It is aimed at giving mainly small and medium-sized enterprises access to the main univariate and multivariate SPC tools.

To the best of our knowledge, no online software available in the literature works and performs similarly to ours. This leads us to an innovative structure in applied statistics for SPC that will possibly become common in the near future.

It is worth noting that the R software itself (which is robust, rigorous, efficient and free, as pointed out by Cano et al. [5]: CANO EL, MOGUERZA JM & REDCHUK A. 2012. Six Sigma with R. Springer, New York) has statistical packages that allow one to apply many of the univariate and multivariate SPC methods described in this paper (see Section 2). Among these packages are SixSigma [5], qcc [25] (SCRUCCA L. 2004. qcc: An R package for quality control charting and statistical process control. R News, 4(1): 11-17, https://cran.r-project.org/doc/Rnews/), qualityTools [20] (ROTH T. 2016. qualityTools: Statistical methods for quality science. http://www.r-qualitytools.org, R package version 1.55), lattice [24] (SARKAR D. 2008. Lattice: Multivariate Data Visualization with R. Springer, New York), spc [11] (KNOTH S. 2018. spc: Statistical Process Control - Calculation of ARL and Other Control Chart Performance Measures. https://CRAN.R-project.org/package=spc, R package version 0.6.0), spcadjust [8] (GANDY A & KVALOY JT. 2013. Guaranteed Conditional Performance of Control Charts via Bootstrap Methods. Scandinavian Journal of Statistics, 40(4): 647-668), IQCC [2] (BARROS F, BARBOSA E, GONCALVES E & RECCHIA D. 2017. IQCC: Improved Quality Control Charts. http://CRAN.R-project.org/package=IQCC, R package version 0.7), MSQC [22] (SANTOS-FERNÁNDEZ E. 2013. Multivariate Statistical Quality Control Using R. Springer, New York) and MPCI [23] (SANTOS-FERNÁNDEZ E & SCAGLIARINI M. 2012. MPCI: An R Package for Computing Multivariate Process Capability Indices. Journal of Statistical Software, 47(7): 1-15), all available on CRAN (https://cran.r-project.org/). However, we stress that none of them has all the statistical tools (descriptive statistics and graphics, goodness-of-fit tests, control charts and capability indices, for both univariate and multivariate designs) usually required to perform a complete SPC analysis. Thus, it is common for an SPC analyst to combine functions from different R packages to produce the desired output, which often requires significant time and effort. Moreover, one needs programming skills to use the R software appropriately. Compared to R, CEP Online aims to offer a reasonable set of SPC tools that can be used in a faster, easier and friendlier way.

In order to exemplify some of the methodologies used, we apply (univariate) SPC tools to evaluate the performance of a production process of chocolate bars by a certain toy food company (the quality characteristic to be monitored is the chocolate bar’s weight), and present the different reports generated by the proposed CEP Online system, available at http://www.mwstat.com/novocep.

In general, the web-based approach proposed here has the following characteristics: it is focused on SPC implementation, it is built on the cloud, and it evaluates the in- or out-of-control condition of the process by constructing control charts, as well as its in- or out-of-specification condition by calculating capability indices. It is built from different free languages and software tools, with a structure that instantly connects to the R software to perform all necessary calculations.

It is important to note that updated versions of the CEP Online system can be produced (e.g., by other researchers/practitioners) to incorporate new statistical tools into the available modules (univariate and multivariate SPC modules), as well as into new modules (e.g., process design and improvement with designed experiments, and acceptance-sampling techniques). Some of our current developments (which are the subjects of our current research) include: (i) new variables and attributes control charts that are better alternatives to the traditional univariate Shewhart ones (see, e.g., [21]: SAGHIR A & LIN Z. 2015. Control Charts for Dispersed Count Data: An Overview. Quality and Reliability Engineering International, 31(5): 725-739); (ii) robust non-parametric versions of the multivariate $T^2$ control chart (see, e.g., [6]: CHEN N, ZI X & ZOU C. 2016. A Distribution-Free Multivariate Control Chart. Technometrics, 58(4): 448-459; and [18]: MOSTAJERAN A, IRANPANAH N & NOOROSSANA R. 2018. An Explanatory Study on the Non-Parametric Multivariate T2 Control Chart. Journal of Modern Applied Statistical Methods, 17(1): 2-27); and (iii) the application of copulas to multivariate control charts and capability indices (see, e.g., [31]: VERDIER G. 2013. Application of copulas to multivariate control charts. Journal of Statistical Planning and Inference, 143(12): 2151-2159; and [4]: BUSABABODHIN P & AMPHANTHONG P. 2016. Copula modelling for multivariate statistical process control: a review. Communications for Statistical Applications and Methods, 23(6): 497-515). Since these SPC tools are not available in any of the existing statistical software that performs SPC analysis, their insertion into CEP Online also brings modernity, originality and exclusivity to our proposed system.

The remainder of the paper is organized as follows. In Section 2, we present the SPC procedures available at CEP Online. In Section 3, the system architecture is described and explained. Section 4 shows the implementation and evaluation of the CEP Online system, including the generation of control charts and the calculation of capability indices for univariate processes only (chocolate bars example). Finally, Section 5 concludes the paper.

2 SYSTEM ELEMENTS AND METHODOLOGY

In this section, we present the SPC tools used to analyze the process performance over time. More specifically, a set of uni and multivariate SPC techniques were applied in order to generate control charts and compute capability indices.

2.1 Univariate statistical analysis

In this subsection, we briefly describe some of the most widely used SPC tools for analyzing single-variable processes, i.e. processes with a single measurable quality characteristic, such as length, diameter or volume (an unrealistic scenario in many real-world SPC applications).

2.1.1 Shewhart control charts

The Shewhart control charts are undoubtedly the most widely known and used statistical tools for process monitoring or supervision. They were developed in the 1920s by Walter A. Shewhart, a physicist, engineer and statistician of the Bell Telephone Laboratories.

As stated in Montgomery [17], Shewhart proposed a general theory/model for generating control charts, which can be simply explained as follows. Let ω be a sample statistic that measures some quality characteristic of interest, and suppose that μ_ω and σ_ω are, respectively, the mean and standard deviation of ω. Then, the control chart’s center line (CL), lower (LCL) and upper (UCL) control limits are given by

L C L = μ_{ω} - L σ_{ω}, C L = μ_{ω}, U C L = μ_{ω} + L σ_{ω},

(1)

where L is the “distance” between the CL and the control limits, given in terms of standard deviation units. In the case of the well-known Six Sigma policy (adopted by us in our study), we have that $L = 3$ (this means that, if we assume a normal probability distribution as a model for a quality characteristic, then it turns out that the probability of producing an item within the control limits is 0.9973, which corresponds to 2,700 parts per million defective).
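As a rough numerical check of model (1), the sketch below computes the limits for a hypothetical statistic and the normal-theory coverage of the $L = 3$ limits. The function names and the example values of μ_ω and σ_ω are illustrative assumptions and are not part of CEP Online.

```python
import math

def shewhart_limits(mu_w, sigma_w, L=3.0):
    """Return (LCL, CL, UCL) from the general Shewhart model (1)."""
    return mu_w - L * sigma_w, mu_w, mu_w + L * sigma_w

def normal_coverage(L=3.0):
    """P(|Z| <= L) for a standard normal Z, via the error function."""
    return math.erf(L / math.sqrt(2.0))

# Hypothetical statistic with mean 100 and standard deviation 2.
lcl, cl, ucl = shewhart_limits(mu_w=100.0, sigma_w=2.0)
coverage = normal_coverage(3.0)          # about 0.9973
ppm_outside = (1.0 - coverage) * 1e6     # about 2,700 per million

print(lcl, cl, ucl)
print(round(coverage, 4), round(ppm_outside))
```

Under the normality assumption, the computed coverage agrees with the 0.9973 figure quoted above.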

Still according to Montgomery [17], the Shewhart control charts may be classified into two general types: variables control charts, if the quality characteristic can be measured and expressed as a number on some continuous scale of measurement; or attributes control charts, if the quality characteristic is not measured on a continuous scale or even a quantitative scale, i.e. we may judge each unit of product as either conforming or nonconforming on the basis of whether or not it possesses certain attributes, or we may count the number of nonconformities (defects) appearing on a unit of product.

Among the variables control charts, we highlight the $\bar{x}$, S and R charts, which are the most widely used charts for controlling central tendency based on the sample mean ($\bar{x}$ chart), and process variability via the sample standard deviation (S chart) or sample range (R chart).

By considering Shewhart’s general model given by (1), we obtain the following control charts for variables (for details on the development of these charts, see Montgomery [17]):

$\bar{x}$ and S charts:

μ_{0} \pm 3 \frac{σ_{0}}{\sqrt{n}} and c_{4} σ_{0} \pm 3 σ_{0} \sqrt{1 - c_{4}^{2}},

respectively, where n is the sample size (which is the same for the m collected samples/data groups), $c_{4} = \frac{Γ (n / 2)}{Γ ((n - 1) / 2)} \sqrt{\frac{2}{n - 1}}$, with Γ(.) the gamma function, and μ₀ and σ₀ are the (known/specified) process mean and standard deviation, respectively. Otherwise (i.e., if there is no parameter specification), the control limits of the $\bar{x}$ and S charts can be rewritten as

\bar{\bar{x}} \pm 3 \frac{\bar{S}}{c_{4} \sqrt{n}} and \bar{S} \pm 3 \frac{\bar{S}}{c_{4}} \sqrt{1 - c_{4}^{2}},

respectively, where $\bar{\bar{x}} = \frac{\sum_{i = 1}^{m} {\bar{x}}_{i}}{m} = \frac{\sum_{i = 1}^{m} \sum_{j = 1}^{n} x_{i j}}{m n}$ (overall mean), with $x_{i j}$ being the quality characteristic value of the j-th item of the i-th sample, and $\bar{S} = \sqrt{\frac{\sum_{i = 1}^{m} S_{i}^{2}}{m}}$, with $S_{i}^{2} = \frac{\sum_{j = 1}^{n} {(x_{i j} - {\bar{x}}_{i})}^{2}}{n}$.
As suggested by their names, the statistics plotted on the $\bar{x}$ and S charts are the sample mean, ${\bar{x}}_{i}$, and the sample standard deviation, $S_{i} = \sqrt{S_{i}^{2}}$, respectively.
These charts are also relatively easy to apply in cases where the sample sizes are unequal/variable. For details, see Montgomery [17].
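The known-parameter limits of the $\bar{x}$ and S charts can be sketched as follows. The code uses the standard form of $c_4$, with Γ(n/2) in the numerator, and truncates a negative lower limit of the S chart at zero. The truncation, the function names and the numerical values are assumptions made for illustration only.

```python
import math

def c4(n):
    """Constant c4 such that E[S] = c4 * sigma for normal samples of size n."""
    log_ratio = math.lgamma(n / 2.0) - math.lgamma((n - 1) / 2.0)
    return math.sqrt(2.0 / (n - 1)) * math.exp(log_ratio)

def xbar_s_limits_known(mu0, sigma0, n):
    """(LCL, CL, UCL) of the x-bar and S charts with specified mu0 and sigma0."""
    half = 3.0 * sigma0 / math.sqrt(n)
    xbar = (mu0 - half, mu0, mu0 + half)
    c = c4(n)
    spread = 3.0 * sigma0 * math.sqrt(1.0 - c * c)
    # Assumption: a negative lower limit for a standard deviation is set to 0.
    s = (max(0.0, c * sigma0 - spread), c * sigma0, c * sigma0 + spread)
    return xbar, s

# Hypothetical process: mu0 = 100, sigma0 = 2, samples of size n = 5.
xbar_limits, s_limits = xbar_s_limits_known(100.0, 2.0, 5)
print(round(c4(5), 4))                      # 0.94
print([round(v, 4) for v in xbar_limits])
print([round(v, 4) for v in s_limits])
```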

$\bar{x}$ and R charts:

μ_{0} \pm 3 \frac{σ_{0}}{\sqrt{n}} and d_{2} σ_{0} \pm 3 d_{3} σ_{0},

respectively, where $E [R_{i}] = d_{2} σ_{0}$ and $\sqrt{Var [R_{i}]} = d_{3} σ_{0}$ (see, e.g., Table 2.3 of [14]: LOUZADA F, DINIZ C, FERREIRA P & FERREIRA E. 2013. Controle Estatístico de Processos: Uma abordagem prática para cursos de Engenharia e Administração. LTC, Rio de Janeiro, 282 pp., for values of $d_{2}$ and $d_{3}$ for some values of n), with $R_{i} = \max_{j} \{x_{i j}\} - \min_{j} \{x_{i j}\}$ being the range of sample i, for $i = 1, 2, ..., m$, and min and max the minimum and maximum functions, respectively. If the process mean and standard deviation are not known/specified, then the control limits of the $\bar{x}$ and R charts are revised as follows:

\bar{\bar{x}} \pm 3 \frac{\bar{R}}{d_{2} \sqrt{n}} and \bar{R} \pm 3 \frac{d_{3} \bar{R}}{d_{2}},

where $\bar{R} = \frac{1}{m} \sum_{i = 1}^{m} R_{i}$.
Again, as suggested by their names, the statistics plotted on the $\bar{x}$ and R charts are the sample mean, ${\bar{x}}_{i}$, and the sample range, $R_{i}$, respectively.
The R chart can also be used for samples of size $n = 1$ (individual units). In this case, we simply replace the sample range $R_{i}$ by the moving range of two successive observations, that is, $M R_{i} = |x_{i} - x_{i - 1}|$, for $i = 2, ..., m$.
Finally, Montgomery [17] recommends the use of the sample range, instead of the sample standard deviation, as a measure of the process dispersion when the collected samples are of equal (i.e. $n_{1} = \cdots = n_{m} = n$) and small (i.e. $n < 10$) sizes.
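A minimal sketch of the $\bar{x}$ and R limits without parameter specification, together with the moving range for individual observations, is given below. The $d_2$ and $d_3$ values are the commonly tabulated constants for normal data and n = 2 to 5. The data, the function names and the truncation of a negative lower R limit at zero are illustrative assumptions.

```python
import math

# Commonly tabulated constants for normal samples (assumed here, n = 2..5).
D2 = {2: 1.128, 3: 1.693, 4: 2.059, 5: 2.326}
D3 = {2: 0.853, 3: 0.888, 4: 0.880, 5: 0.864}

def xbar_r_limits(samples):
    """Estimated x-bar and R chart limits from m samples of equal size n."""
    m, n = len(samples), len(samples[0])
    xbars = [sum(s) / n for s in samples]
    ranges = [max(s) - min(s) for s in samples]
    xbarbar = sum(xbars) / m
    rbar = sum(ranges) / m
    half = 3.0 * rbar / (D2[n] * math.sqrt(n))
    spread = 3.0 * D3[n] * rbar / D2[n]
    return {
        "xbar": (xbarbar - half, xbarbar, xbarbar + half),
        # Assumption: a negative lower limit for a range is set to 0.
        "R": (max(0.0, rbar - spread), rbar, rbar + spread),
    }

def moving_ranges(x):
    """Moving ranges |x_i - x_{i-1}| for individual observations (n = 1)."""
    return [abs(b - a) for a, b in zip(x, x[1:])]

# Hypothetical data: m = 3 samples of size n = 3.
limits = xbar_r_limits([[10.0, 12.0, 11.0], [9.0, 11.0, 10.0], [11.0, 13.0, 12.0]])
mr = moving_ranges([5.0, 7.0, 6.0, 9.0])
print(limits)
print(mr)
```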

Among the attributes control charts, we highlight those for fraction nonconforming (p and np charts) and for nonconformities or defects (c and u charts), which are also based on the same general statistical principles as the Shewhart control charts. Then, from (1), the center line and control limits of the main control charts for attributes are as follows (again, for details on the development of these charts, see Montgomery [17]):

p chart:

p_{0} \pm 3 \sqrt{\frac{p_{0} (1 - p_{0})}{n}} (with parameter specification)

(2)

or

\bar{p} \pm 3 \sqrt{\frac{\bar{p} (1 - \bar{p})}{n}} (without parameter specification),

(3)

where $p_{0}$ is the known/specified fraction nonconforming in the production process, and $\bar{p} = \frac{1}{m} \sum_{i = 1}^{m} {\hat{p}}_{i} = \frac{1}{m n} \sum_{i = 1}^{m} D_{i}$, with $D_{i}$ being the number of nonconforming units in sample i. Note that ${\hat{p}}_{i}$ is the statistic plotted on the p chart.
In order to deal with variable sample size, we replace n by $n_{i}$ in (2) and (3), as well as in the calculation of ${\hat{p}}_{i}$ (thus, ${\hat{p}}_{i} = \frac{D_{i}}{n_{i}}$). Moreover, Montgomery [17] suggests the use of a standardized p control chart, where $C L = 0, L C L = - 3, U C L = + 3$ and the variable plotted is

z_{i} = \frac{{\hat{p}}_{i} - p_{0}}{\sqrt{\frac{p_{0} (1 - p_{0})}{n_{i}}}} (with parameter specification)

or

z_{i} = \frac{{\hat{p}}_{i} - \bar{p}}{\sqrt{\frac{\bar{p} (1 - \bar{p})}{n_{i}}}} (without parameter specification).

According to the author, the major advantage of the latter approach is that tests for runs and pattern-recognition methods can safely be applied to the standardized chart.
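The p chart with variable sample size and its standardized version can be sketched as below. When no $p_0$ is specified, the code estimates the center line as the total number of nonconforming units divided by the total number inspected, which is a common choice for unequal $n_i$ and an assumption here, as are the function name, the truncation of the limits to [0, 1] and the data.

```python
import math

def p_chart(defectives, sizes, p0=None):
    """Per-sample limits and standardized z_i for a p chart with sizes n_i."""
    # Assumption: with unequal n_i, p-bar = sum(D_i) / sum(n_i).
    center = p0 if p0 is not None else sum(defectives) / sum(sizes)
    rows = []
    for d, n in zip(defectives, sizes):
        p_hat = d / n
        se = math.sqrt(center * (1.0 - center) / n)
        z = (p_hat - center) / se
        rows.append({
            "p_hat": p_hat,
            "lcl": max(0.0, center - 3.0 * se),  # truncated at 0 (assumption)
            "ucl": min(1.0, center + 3.0 * se),  # truncated at 1 (assumption)
            "z": z,
            "signal": abs(z) > 3.0,
        })
    return center, rows

# Hypothetical inspection results: D_i nonconforming units out of n_i.
center, rows = p_chart(defectives=[4, 6, 20], sizes=[100, 120, 80])
print(center)
for r in rows:
    print(round(r["p_hat"], 3), round(r["z"], 3), r["signal"])
```

Plotting $z_i$ against the fixed limits ±3 gives the same in- or out-of-control decisions as comparing ${\hat{p}}_{i}$ with the untruncated per-sample limits.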

np chart:

n p_{0} \pm 3 \sqrt{n p_{0} (1 - p_{0})} (with parameter specification)

or

n \bar{p} \pm 3 \sqrt{n \bar{p} (1 - \bar{p})} (without parameter specification) .

The statistic plotted on this chart is $n {\hat{p}}_{i} = D_{i}$, i.e. the number of nonconforming units in sample i, for $i = 1, 2, ..., m$. This is the reason why the np chart is often called a number nonconforming control chart.
As pointed out by Montgomery [17], the np chart has the advantage of being easier to interpret, especially by non-statisticians, than the p chart. However, the p chart is recommended over the np chart for variable sample size, since the former is easier to interpret in this scenario (the center line of the p chart will not vary across samples).

c chart:

c_{0} \pm 3 \sqrt{c_{0}} (with parameter specification)

or

\bar{c} \pm 3 \sqrt{\bar{c}} (without parameter specification),

where $c_{0}$ is the expected number of nonconformities/defects in an inspection unit of product (in general, the inspection unit will be a single unit of product), and $\bar{c} = \frac{1}{m} \sum_{i = 1}^{m} {\hat{c}}_{i}$, with ${\hat{c}}_{i}$ being the number of nonconformities in sample i (this is also the statistic plotted on the c chart).
The c chart, as well as the u chart (described next), is particularly useful compared with the p and np charts when a unit may contain several nonconformities and still not be classified as nonconforming. For instance, a manufactured personal computer could have one or more very minor flaws (e.g., in the cabinet finish), but since these flaws do not seriously affect the unit’s functional operation, it could be classified as conforming [17].

u chart:

u_{0} \pm 3 \sqrt{\frac{u_{0}}{n}} (with parameter specification)

(4)

or

\bar{u} \pm 3 \sqrt{\frac{\bar{u}}{n}} (without parameter specification),

(5)

where $u_{0}$ represents the observed average number of nonconformities per unit in a preliminary set of data, and $\bar{u} = \frac{1}{m} \sum_{i = 1}^{m} {\hat{u}}_{i}$, with ${\hat{u}}_{i}$ being the average number of nonconformities per inspection unit in sample i (${\hat{u}}_{i}$ is also the variable plotted on the u chart).
We can handle the issue of variable sample size by replacing n by $n_{i}$ in (4) and (5), or, as suggested by Montgomery [17] (which is also the preferred option), by using a standardized u control chart, where $C L = 0, L C L = - 3, U C L = + 3$ and the statistic plotted is

z_{i} = \frac{{\hat{u}}_{i} - u_{0}}{\sqrt{\frac{u_{0}}{n_{i}}}} (with parameter specification)

or

z_{i} = \frac{{\hat{u}}_{i} - \bar{u}}{\sqrt{\frac{\bar{u}}{n_{i}}}} (without parameter specification).

The u chart is recommended over the c chart for variable sample size, since the latter would be very difficult to interpret for this scenario (both the center line and the control limits of the c chart will vary with the sample size).
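The c chart limits and the standardized u statistic can be illustrated with the short sketch below. For unequal $n_i$, the center line of the u chart is estimated here as total nonconformities divided by total inspection units, which is an assumption, as are the function names, the truncation of the lower c limit at zero and the data.

```python
import math

def c_chart_limits(counts, c0=None):
    """(LCL, CL, UCL) for the c chart; c-bar is used when c0 is not given."""
    c = c0 if c0 is not None else sum(counts) / len(counts)
    half = 3.0 * math.sqrt(c)
    return max(0.0, c - half), c, c + half  # LCL truncated at 0 (assumption)

def standardized_u(counts, sizes, u0=None):
    """Standardized z_i for the u chart with variable sample sizes n_i."""
    # Assumption: with unequal n_i, u-bar = sum(counts) / sum(sizes).
    center = u0 if u0 is not None else sum(counts) / sum(sizes)
    return [(c / n - center) / math.sqrt(center / n)
            for c, n in zip(counts, sizes)]

# Hypothetical nonconformity counts.
c_limits = c_chart_limits([4, 9, 5, 6])
z = standardized_u(counts=[10, 30], sizes=[5, 5])
print([round(v, 3) for v in c_limits])
print([round(v, 3) for v in z])
```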

2.1.2 CuSum and EWMA charts

The Shewhart control charts, described in Subsubsection 2.1.1, are useful to detect large shifts in the monitored process parameters. However, when small shifts are of great interest, we can use two effective alternative procedures: the Cumulative Sum (CuSum) and the Exponentially Weighted Moving Average (EWMA) control charts. Although they are not really new (both date from the 1950s), they are usually considered to be more advanced techniques than the Shewhart control charts. Next, we describe the CuSum and EWMA control charts for monitoring the process mean and when specifications for the process parameters are available/provided.

CuSum chart: consists of plotting the two statistics:

\begin{array}{l} C_{i}^{+} = \max \{0, {\bar{x}}_{i} - (μ_{0} + K) + C_{i - 1}^{+}\} \quad \text{and} \\ C_{i}^{-} = \max \{0, (μ_{0} - K) - {\bar{x}}_{i} + C_{i - 1}^{-}\}, \end{array}

for $i = 1, 2, ..., m$, where $K = 0.5 \frac{σ_{0}}{\sqrt{n}}$ and the starting values are $C_{0}^{+} = C_{0}^{-} = H / 2$ (50% fast initial response or headstart; see Montgomery [17]), where $H = 5 \frac{σ_{0}}{\sqrt{n}}$ is the decision interval (control limit). These values of K and H are the recommended ones to detect a shift in the process mean of 1σ.
The CuSum control chart, as described above, is also known as the tabular or algorithmic CuSum. It may also be constructed for individual observations (for details, see Montgomery [17]).
In order to improve the ability of the CuSum chart to detect large process shifts, Montgomery [17] suggests the use of a combined CuSum-Shewhart procedure for online control (with Shewhart limits placed at $μ_{0} \pm 3.5 \frac{σ_{0}}{\sqrt{n}}$).
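The tabular CuSum recursion can be sketched as follows, using the upper reference value $μ_0 + K$ for $C_i^+$ and the standard lower reference value $μ_0 - K$ for $C_i^-$, with the 50% headstart described above. The function name, argument names and data are hypothetical.

```python
import math

def tabular_cusum(xbars, mu0, sigma0, n, k=0.5, h=5.0, headstart=0.5):
    """Return rows (i, C_plus, C_minus, signal) of the tabular CuSum."""
    se = sigma0 / math.sqrt(n)
    K, H = k * se, h * se
    c_plus = c_minus = headstart * H   # C_0^+ = C_0^- = H / 2 by default
    rows = []
    for i, x in enumerate(xbars, start=1):
        c_plus = max(0.0, x - (mu0 + K) + c_plus)
        c_minus = max(0.0, (mu0 - K) - x + c_minus)
        rows.append((i, c_plus, c_minus, c_plus > H or c_minus > H))
    return rows

# Hypothetical individual observations (n = 1) drifting upwards from mu0 = 10.
rows = tabular_cusum([10.0, 11.0, 12.0, 13.0], mu0=10.0, sigma0=1.0, n=1)
first_signal = next((i for i, _, _, s in rows if s), None)
for row in rows:
    print(row)
print("first signal at sample", first_signal)
```

In this example the upper statistic first exceeds H = 5 at the fourth sample, while the lower statistic decays to zero.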

EWMA chart: the exponentially weighted moving average is defined as

W_{i} = λ {\bar{x}}_{i} + (1 - λ) W_{i - 1},

for $i = 1,2,..., m$ , where $0 < λ \leq 1$ is a constant (smoothing factor) and the starting value is the process target, i.e. $W_{0} = μ_{0}$ . Exact and asymptotic (steady-state values) control limits are given by

μ_{0} \pm γ \sqrt{\frac{λ [1 - {(1 - λ)}^{2 i}] σ_{0}^{2}}{(2 - λ) n}} and μ_{0} \pm γ \sqrt{\frac{λ σ_{0}^{2}}{(2 - λ) n}},

respectively, where γ is a positive constant.
Montgomery [17] strongly recommends the use of the exact control limits for small values of i, e.g. $i \leq 10$. The author also suggests the use of $0.05 \leq λ \leq 0.25$, which works well in practice, with $λ = 0.05$, $0.10$ and $0.20$ being popular choices, as well as $γ = 3$ (the usual 3σ limits).
The EWMA chart is also very insensitive (robust) to the normality assumption, which makes it an ideal control chart to use with individual observations [17].
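A minimal sketch of the EWMA recursion with the exact limits, plus the asymptotic half-width, is shown below. The function names and data are illustrative assumptions. The defaults λ = 0.2 and γ = 3 are among the choices mentioned above.

```python
import math

def ewma_chart(xbars, mu0, sigma0, n, lam=0.2, gamma=3.0):
    """Return rows (i, W_i, LCL_i, UCL_i, signal) using the exact limits."""
    w = mu0  # starting value W_0 = mu0
    rows = []
    for i, x in enumerate(xbars, start=1):
        w = lam * x + (1.0 - lam) * w
        var = lam * (1.0 - (1.0 - lam) ** (2 * i)) * sigma0 ** 2 / ((2.0 - lam) * n)
        half = gamma * math.sqrt(var)
        rows.append((i, w, mu0 - half, mu0 + half, abs(w - mu0) > half))
    return rows

def ewma_asymptotic_half_width(sigma0, n, lam=0.2, gamma=3.0):
    """Steady-state half-width of the EWMA control limits."""
    return gamma * math.sqrt(lam * sigma0 ** 2 / ((2.0 - lam) * n))

# Hypothetical individual observations with mu0 = 10 and sigma0 = 1.
rows = ewma_chart([11.0, 10.5, 12.0], mu0=10.0, sigma0=1.0, n=1)
asym = ewma_asymptotic_half_width(sigma0=1.0, n=1)
for row in rows:
    print(tuple(round(v, 4) if isinstance(v, float) else v for v in row))
print(round(asym, 4))
```

For i = 1 the exact half-width reduces to γλσ₀/√n, which is why the early limits are much narrower than the asymptotic ones.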

2.1.3 Process capability analysis

It is widely known that being in control is not enough for a production process, since even an in-control process may produce bad (useless) items. A process must also be capable of meeting customer and/or product requirements or specifications.

One simple and quantitative way to express process capability is through capability indices. Among them, we highlight the capability indices for on-center ($C_{p}$ and P) and off-center ($C_{pk}$ and $C_{pm}$) processes. Such indices are defined as follows.

C_p:

C_{p} = \frac{U S L - L S L}{6 σ_{0}} (with parameter specification)

or

{\hat{C}}_{p} = \frac{U S L - L S L}{6 S} (without parameter specification),

where USL and LSL are, respectively, the upper and lower specification limits for the measurable quality characteristic, and S is the sample standard deviation.
According to Montgomery [17], this kind of index (known as a process capability ratio, PCR) is extensively used in industry. It measures the spread of the specifications relative to the 6σ spread in the process, or, as we usually say, the potential capability of the process.
The $C_{p}$ index, as well as the P index (described next), implicitly assumes that the process is centered between the upper and lower specification limits, i.e. that the process mean (μ₀ or $\bar{x}$) coincides with the center (target value, $T = \frac{1}{2} (U S L + L S L)$) of the specification interval ([LSL, USL]) (on-center process).
Besides computing and interpreting the point estimate of $C_{p}$, we can also report a confidence interval for it. If the quality characteristic is normally distributed, then a 100α% confidence interval on $C_{p}$ (where α denotes the confidence level, e.g. α = 0.95) is obtained from

[{\hat{C}}_{p} \sqrt{\frac{χ_{n - 1; (1 - α) / 2}^{2}}{n - 1}}; {\hat{C}}_{p} \sqrt{\frac{χ_{n - 1; (1 + α) / 2}^{2}}{n - 1}}],

(6)

where $χ_{n - 1; (1 - α) / 2}^{2}$ and $χ_{n - 1; (1 + α) / 2}^{2}$ are, respectively, the $(1 - α) / 2$ and $(1 + α) / 2$ percentile points of the chi-square distribution with $n - 1$ degrees of freedom.

P:

P = (\frac{1}{C_{p}}) \times 100 (with parameter specification)

or

\hat{P} = (\frac{1}{{\hat{C}}_{p}}) \times 100 (without parameter specification).

The P index has a useful practical interpretation: it is the percentage of the specification band used up by the process [17].
A 100α% confidence interval on P is obtained from (6) as follows:

[{({\hat{C}}_{p} \sqrt{\frac{χ_{n - 1; (1 + α) / 2}^{2}}{n - 1}})}^{- 1} \times 100; {({\hat{C}}_{p} \sqrt{\frac{χ_{n - 1; (1 - α) / 2}^{2}}{n - 1}})}^{- 1} \times 100] .

C_pk:

C_{p k} = \min \{\frac{U S L - μ_{0}}{3 σ_{0}}; \frac{μ_{0} - L S L}{3 σ_{0}}\} (with parameter specification)

or

{\hat{C}}_{p k} = \min \{\frac{U S L - \bar{x}}{3 S}; \frac{\bar{x} - L S L}{3 S}\} (without parameter specification).

An advantage of $C_{pk}$, as well as of $C_{pm}$ (described next), over the $C_{p}$ and P indices (described before) is that the former take process centering into account when estimating process capability [17]. Thus, $C_{pk}$ can also be used in the situation where the process is not operating at the midpoint of the specification interval (off-center process). In this case, $C_{p k} < C_{p}$. The $C_{pk}$ index is usually said to measure the actual capability of the process.
Assuming that the quality characteristic follows a normal distribution, we obtain an approximate 100α% confidence interval on $C_{pk}$ as follows:

[{\hat{C}}_{p k} (1 - z_{(1 + α) / 2} \sqrt{\frac{1}{9 n {\hat{C}}_{p k}^{2}} + \frac{1}{2 (n - 1)}}); {\hat{C}}_{p k} (1 + z_{(1 + α) / 2} \sqrt{\frac{1}{9 n {\hat{C}}_{p k}^{2}} + \frac{1}{2 (n - 1)}})],

(7)

where $z_{(1 + α) / 2}$ represents the $(1 + α) / 2$ percentile point of the standard normal distribution.

C_pm:

C_{p m} = \frac{C_{p}}{\sqrt{1 + V^{2}}} (with parameter specification)

or

{\hat{C}}_{p m} = \frac{{\hat{C}}_{p}}{\sqrt{1 + {\hat{V}}^{2}}} (without parameter specification),

where $V = \frac{μ_{0} - T}{σ_{0}}$ and $\hat{V} = \frac{\bar{x} - T}{S}$ .
According to Montgomery [17], the $C_{pk}$ index does not tell us about the location of the mean in the specification interval, whereas the $C_{pm}$ index is a better indicator of centering.
Moreover, Boyles [3] (BOYLES RA. 1991. The Taguchi Capability Index. Journal of Quality Technology, 23(2): 107-126.) shows that

C_{p m} \geq k \Rightarrow | μ_{0} - T | < \frac{1}{6 k} (U S L - L S L) .

Thus, from a given value of $C_{pm}$, we can place a constraint on the difference between μ₀ and T. For instance, $C_{p m} \geq 2$ implies that $| μ_{0} - T | < \frac{1}{12} (U S L - L S L)$.

Note that $C_{p}$, $C_{pk}$ and $C_{pm}$ values greater than 1, as well as P values lower than 100, are desired.
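The point estimates of the four indices and the approximate interval (7) for $C_{pk}$ can be sketched as below, for the case without parameter specification. The function name, the default confidence level of 0.95 and the data are hypothetical. S is computed with the usual n − 1 divisor, and the target defaults to the midpoint of the specification interval.

```python
import math
from statistics import NormalDist, mean, stdev

def capability(x, lsl, usl, target=None, level=0.95):
    """Estimates of Cp, P, Cpk, Cpm and an approximate CI for Cpk."""
    n, xbar, s = len(x), mean(x), stdev(x)   # stdev uses the n - 1 divisor
    T = (usl + lsl) / 2.0 if target is None else target
    cp = (usl - lsl) / (6.0 * s)
    p = 100.0 / cp
    cpk = min((usl - xbar) / (3.0 * s), (xbar - lsl) / (3.0 * s))
    v = (xbar - T) / s
    cpm = cp / math.sqrt(1.0 + v * v)
    z = NormalDist().inv_cdf((1.0 + level) / 2.0)
    half = z * math.sqrt(1.0 / (9.0 * n * cpk ** 2) + 1.0 / (2.0 * (n - 1)))
    return {"Cp": cp, "P": p, "Cpk": cpk, "Cpm": cpm,
            "Cpk_CI": (cpk * (1.0 - half), cpk * (1.0 + half))}

# Hypothetical measurements with specification limits LSL = 9 and USL = 11.
data = [9.8, 10.1, 10.0, 10.3, 9.9, 10.2, 10.0, 9.7, 10.1, 9.9]
result = capability(data, lsl=9.0, usl=11.0)
for name, value in result.items():
    print(name, value)
```

Because the sample mean of this data set sits at the target, the estimates of $C_p$, $C_{pk}$ and $C_{pm}$ essentially coincide. Shifting the data off-center would lower $C_{pk}$ and $C_{pm}$ but leave $C_p$ unchanged.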

For an extensive summary of confidence intervals for various PCR-type capability indices (including $C_{p}$, $C_{pk}$ and $C_{pm}$), see Kotz & Lovelace [12] (KOTZ S & LOVELACE CR. 1998. Process Capability Indices in Theory and Practice. Arnold, London.).

Finally, it is also important to note that a process capability analysis can also be performed via graphical techniques, such as histograms and control charts.

2.2 Multivariate statistical analysis

In practice, most process monitoring and control scenarios involve several related variables. Applying univariate control charts to each individual variable is not a good solution, since it is inefficient and can lead to erroneous conclusions [17]. Thus, multivariate methods that consider the variables jointly are needed.

In multivariate statistical quality control, we generally use the multivariate normal distribution to describe the joint behavior of continuous quality characteristics (multivariate normal process), i.e. $X = (X_{1},..., X_{p})' \sim N_{p} (μ, Σ)$ , where $X_{1},..., X_{p}$ denote the p variables (quality characteristics of interest), $μ = (μ_{1},..., μ_{p})'$ is the vector of the means of the X’s, and Σ is the p × p covariance matrix (whose main diagonal elements are the variances of the X’s and the off-diagonal elements are the covariances).

Some of the most widely used control charts for monitoring the variability (two approaches) and mean (Hotelling $T^2$ chart) in the multivariate case, as well as capability indices ($MC_{p}$, $MC_{pk}$, $MC_{pm}$, $MP_{PC}$, $Mp_{1}$ and $Mp_{2}$), are briefly described in the following subsubsections.

2.2.1 Variability control charts

In this subsubsection, we present control charts for monitoring the variability of a multivariate normal process, based on two different approaches discussed in Montgomery [17].

First approach: the statistic plotted on the control chart for the i-th sample, $i = 1, ..., m$, is

W_{i} = - p n + p n \ln (n) - n \ln (\frac{| A_{i} |}{| Σ_{0} |}) + t r (Σ_{0}^{- 1} A_{i}),

(8)

where Σ₀ is the (known/specified) p × p covariance matrix, $A_{i} = (n - 1) S_{i}$, with $S_{i}$ being the sample covariance matrix for sample i, and tr is the trace operator. The control chart only has an upper control limit, given by $U C L = χ_{p (p + 1) / 2; 0.9973}^{2}$.
This first approach is a direct extension of the univariate $S^2$ control chart. For details on the $S^2$ control chart, see Montgomery [17].
In practice, Σ (the true covariance matrix) will usually be estimated from preliminary samples. In this case, we simply replace Σ₀ by S (the sample covariance matrix) in (8).

Second approach: consists of plotting the sample generalized variance, $| S_{i} |, i = 1,..., m$ , which is a widely used measure of multivariate dispersion. The control limits of the chart are given by

| Σ_{0} | (b_{1} \pm 3 b_{2}^{0.5}) (with parameter specification)

or

\frac{| S |}{b_{1}} (b_{1} \pm 3 b_{2}^{0.5}) (without parameter specification),

where $b_{1} = \frac{1}{{(n - 1)}^{p}} Π_{k = 1}^{p} (n - k)$ and $b_{2} = \frac{1}{{(n - 1)}^{2 p}} Π_{k = 1}^{p} (n - k) [Π_{l = 1}^{p} (n - l + 2) - Π_{l = 1}^{p} (n - l)]$ .
Montgomery ¹⁷ suggests using univariate control charts for variability (e.g., the S and R charts, described in Section 2.1.1) in conjunction with the control chart for |S|.
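The constants $b_{1}$ and $b_{2}$ and the resulting limits for the |S| chart (with parameter specification) can be computed directly from the expressions above. The following is a minimal sketch; the function name `generalized_variance_limits` and the inputs are hypothetical and only serve as an illustration.

```python
import math

def generalized_variance_limits(n, p, det_sigma0):
    """Return (b1, b2, LCL, UCL) for the |S| chart with known |Sigma_0|."""
    prod_n_minus_k = math.prod(n - k for k in range(1, p + 1))
    prod_n_minus_l_plus_2 = math.prod(n - l + 2 for l in range(1, p + 1))
    b1 = prod_n_minus_k / (n - 1) ** p
    b2 = prod_n_minus_k * (prod_n_minus_l_plus_2 - prod_n_minus_k) / (n - 1) ** (2 * p)
    half_width = 3 * math.sqrt(b2)
    lcl = det_sigma0 * (b1 - half_width)
    ucl = det_sigma0 * (b1 + half_width)
    return b1, b2, lcl, ucl

# Hypothetical setting: n = 10, p = 2, |Sigma_0| = 1
# Here b1 = 72/81 and b2 = 2736/6561
b1, b2, lcl, ucl = generalized_variance_limits(10, 2, 1.0)
```

In this setting the lower limit computed from the formula is negative; since a generalized variance cannot be negative, such a lower limit is commonly replaced by zero in practice.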

2.2.2 Hotelling T ² chart

The Hotelling T² control chart is the multivariate extension of the univariate Shewhart $\bar{x}$ chart (described in Section 2.1.1). It is also the most widely used multivariate procedure for monitoring the mean vector of the process. In this subsubsection, we present two versions of the Hotelling T² chart: one for grouped data $(n > 1)$ , and another for individual observations $(n = 1)$ .

Grouped data: the test statistic plotted on the control chart for sample i, $i = 1,2,..., m$ , is

T_{i}^{2} = n ({\bar{x}}_{i} - μ_{0})' Σ_{0}^{- 1} ({\bar{x}}_{i} - μ_{0}) (with parameter specification)

or

T_{i}^{2} = n ({\bar{x}}_{i} - \bar{\bar{x}})' S^{- 1} ({\bar{x}}_{i} - \bar{\bar{x}}) (without parameter specification),

where μ₀ is the (known/specified) vector of in-control means for each quality characteristic, ${\bar{x}}_{i} = {(\frac{1}{n} \sum_{j = 1}^{n} x_{1 i j},..., \frac{1}{n} \sum_{j = 1}^{n} x_{p i j})}^{'}$ , with $x_{i j} = {(x_{1 i j},..., x_{p i j})}^{'}$ being the vector of quality characteristics for the j^th observation of the i^th sample, $\bar{\bar{x}} = {(\frac{1}{m} \sum_{i = 1}^{m} {\bar{x}}_{1 i},..., \frac{1}{m} \sum_{i = 1}^{m} {\bar{x}}_{p i})}^{'}$ , and S is the (p × p) average of the sample covariance matrices S_i , with elements ${\bar{S}}_{k g} = \frac{1}{m} \sum_{i = 1}^{m} S_{k g i} = \frac{1}{m (n - 1)} \sum_{i = 1}^{m} \sum_{j = 1}^{n} (x_{k i j} - {\bar{x}}_{k i}) (x_{g i j} - {\bar{x}}_{g i})$ for $k, g = 1,2,..., p$ .
For the first case (with parameter specification), the upper limit on the control chart is $χ_{p; 0.9973}^{2}$ . For the second case (without parameter specification), there are two possible upper limits, depending on the phase of control chart usage (Alt ¹). As explained in Montgomery ¹⁷, phase I analysis consists of using the Hotelling T² chart for establishing control, i.e. for testing whether the process was in control when the m groups were drawn and the sample statistics $\bar{\bar{x}}$ and S were calculated (retrospective analysis), while in phase II the chart is used to monitor future production, based on the in-control set of observations obtained in phase I. The phase I upper limit for the Hotelling T² control chart is given by

UCL = \frac{p (m - 1) (n - 1)}{m n - m - p + 1} F_{p, m n - m - p + 1; 0.9973} .

In phase II, the upper limit is

UCL = \frac{p (m + 1) (n - 1)}{m n - m - p + 1} F_{p, m n - m - p + 1; 0.9973},

where $F_{p, m n - m - p + 1; 0.9973}$ is the 99.73th percentile of the F distribution with p numerator degrees of freedom and $m n - m - p + 1$ denominator degrees of freedom.
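For grouped data with parameter specification, the T² statistic and its chi-square limit are easy to sketch in the bivariate case ($p = 2$). The chi-square distribution with 2 degrees of freedom has the closed-form quantile $- 2 \ln (1 - q)$ , so no statistical library is needed. The function names and the numerical values below are hypothetical; the F-based limits for the case without parameter specification require an F quantile function and are not covered by this sketch.

```python
import math

def hotelling_t2_known(xbar, mu0, sigma0, n):
    """T_i^2 = n (xbar - mu0)' Sigma0^{-1} (xbar - mu0) for p = 2."""
    d1, d2 = xbar[0] - mu0[0], xbar[1] - mu0[1]
    det = sigma0[0][0] * sigma0[1][1] - sigma0[0][1] * sigma0[1][0]
    # Quadratic form d' Sigma0^{-1} d written out for a 2x2 matrix
    quad = (sigma0[1][1] * d1 ** 2
            - (sigma0[0][1] + sigma0[1][0]) * d1 * d2
            + sigma0[0][0] * d2 ** 2) / det
    return n * quad

def chi2_ucl_two_df(prob=0.9973):
    """Quantile of the chi-square distribution with 2 degrees of freedom."""
    return -2.0 * math.log(1.0 - prob)

# Hypothetical in-control parameters and two sample mean vectors (n = 5)
mu0 = [10.0, 20.0]
sigma0 = [[1.0, 0.5], [0.5, 1.0]]
ucl = chi2_ucl_two_df()
t2_a = hotelling_t2_known([10.5, 20.0], mu0, sigma0, n=5)  # small shift
t2_b = hotelling_t2_known([12.0, 19.0], mu0, sigma0, n=5)  # large shift
signals = [t2 > ucl for t2 in (t2_a, t2_b)]
```

With these values, the first sample plots below the limit and the second one above it, so only the second sample would signal an out-of-control mean vector.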

Individual observations: in this situation (very common in the chemical and process industries), the Hotelling T ² statistic becomes

T_{i}^{2} = {(x_{i} - μ_{0})}^{'} Σ_{0}^{- 1} (x_{i} - μ_{0}) (with parameter specification),

for $i = 1,2,..., m$ , where $x_{i} = {(x_{1 i},..., x_{p i})}^{'}$ and the upper limit of the control chart is still given by $χ_{p; 0.9973}^{2}$ ; or

T_{i}^{2} = {(x_{i} - \bar{x})}^{'} S^{- 1} (x_{i} - \bar{x}) (without parameter specification),

where $\bar{x} = {(\frac{1}{m} \sum_{i = 1}^{m} x_{1 i},..., \frac{1}{m} \sum_{i = 1}^{m} x_{p i})}^{'}$ and the upper control limit is given by

UCL = \frac{{(m - 1)}^{2}}{m} β_{p / 2, (m - p - 1) / 2; 0.9973} (Phase I)

(Tracy et al. ²⁹), where $β_{p / 2, (m - p - 1) / 2; 0.9973}$ is the 99.73th percentile of the beta distribution with parameters $p / 2$ and $(m - p - 1) / 2$ ; or

UCL = \frac{p (m + 1) (m - 1)}{m^{2} - m p} F_{p, m - p; 0.9973} (Phase II) .

However, in the case of individual observations, the estimator of the covariance matrix Σ must be chosen with care. One option is the usual estimator obtained by simply pooling all m observations, i.e.

S_{1} = \frac{1}{m - 1} \sum_{i = 1}^{m} (x_{i} - \bar{x}) {(x_{i} - \bar{x})}^{'} .

A second estimator, originally suggested by Holmes and Mergen ⁹, considers the differences between successive pairs of observations, i.e.

S_{2} = \frac{1}{2} \frac{V' V}{(m - 1)},

where

V = [\begin{matrix} v_{1}^{'} \\ ⋮ \\ v_{m - 1}^{'} \end{matrix}] = [\begin{matrix} (x_{2} - x_{1})' \\ ⋮ \\ (x_{m} - x_{m - 1})' \end{matrix}] .

For other estimators of Σ, see, e.g., Sullivan and Woodall ²⁷.
Interesting alternatives to the T² and generalized variance (|S|) control charts for monitoring bivariate processes (namely, the ZMAX, VMAX and MCMAX charts) are presented in Machado et al. ¹⁵.
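To make the two covariance estimators for individual observations concrete, the sketch below computes S₁ (pooling all m observations) and S₂ (successive differences) in plain Python. The function names and the small data set are hypothetical. The data follow a perfect linear trend, which shows how S₁ absorbs the trend into the variance estimate while S₂ only reflects the step-to-step variation.

```python
def pooled_covariance(x):
    """S_1: usual sample covariance matrix of the m observations."""
    m, p = len(x), len(x[0])
    mean = [sum(row[k] for row in x) / m for k in range(p)]
    return [[sum((row[k] - mean[k]) * (row[g] - mean[g]) for row in x) / (m - 1)
             for g in range(p)] for k in range(p)]

def successive_difference_covariance(x):
    """S_2 = (1/2) V'V / (m - 1), with rows of V given by x_{i+1} - x_i."""
    m, p = len(x), len(x[0])
    v = [[x[i + 1][k] - x[i][k] for k in range(p)] for i in range(m - 1)]
    return [[0.5 * sum(row[k] * row[g] for row in v) / (m - 1)
             for g in range(p)] for k in range(p)]

# Hypothetical individual observations with a steady drift in both variables
data = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0], [5.0, 10.0]]
s1 = pooled_covariance(data)                  # [[2.5, 5.0], [5.0, 10.0]]
s2 = successive_difference_covariance(data)   # [[0.5, 1.0], [1.0, 2.0]]
```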

2.2.3 Multivariate process capability analysis

Multivariate process capability indices (MPCIs) can be constructed using several different approaches. For a discussion of these approaches, see, e.g., Shinde and Khadse ²⁶. In this subsubsection, we focus on MPCIs for multivariate normal process data using principal component analysis (PCA). For details on this multivariate statistical technique, see, e.g., Chapter 8 of Johnson and Wichern ¹⁰. Among the PCA-based MPCIs, we highlight MC_p, MC_pk and MC_pm (Wang and Chen ³³), MP_PC (Veevers ³⁰), and Mp₁ and Mp₂ (Shinde and Khadse ²⁶). Such MPCIs are briefly described below.

MC_p, MC_pk and MC_pm (Wang and Chen ³³):

MC_{p} = {(\prod_{i = 1}^{ν} C_{p; PC_{i}})}^{1 / ν},

(9)

where $C_{p; PC_{i}} = \frac{USL_{PC_{i}} - LSL_{PC_{i}}}{6 σ_{PC_{i}}}$ represents the univariate measure of potential process capability C_p for the i^th principal component (PC_i), ν denotes the number of principal components (PCs) comprising around 90% of the process variability, and $LSL_{PC_{i}}$ , $USL_{PC_{i}}$ and $σ_{PC_{i}}$ represent the lower specification limit, upper specification limit and standard deviation of PC_i, respectively, where

LSL_{PC_{i}} = u_{i}^{'} LSL and USL_{PC_{i}} = u_{i}^{'} USL,

with $u_{1}, u_{2},..., u_{p}$ being the eigenvectors of Σ, and LSL and USL being, respectively, the vectors of lower and upper specification limits of $X = {(X_{1}, X_{2},..., X_{p})}^{'}$ .
When $σ_{PC_{i}}$ is unknown, we estimate it by $S_{PC_{i}}$ (that is, the sample standard deviation of PC_i), thus obtaining

M{\hat{C}}_{p} = {(\prod_{i = 1}^{ν} {\hat{C}}_{p; PC_{i}})}^{1 / ν},

where ${\hat{C}}_{p; PC_{i}} = \frac{USL_{PC_{i}} - LSL_{PC_{i}}}{6 S_{PC_{i}}}$ .
A 100α% confidence interval for MC_p is obtained from Wang and Du ³⁴:

[{(\prod_{i = 1}^{ν} {\hat{C}}_{p; PC_{i}} \sqrt{\frac{χ_{n - 1; (1 - α) / 2}^{2}}{n - 1}})}^{1 / ν}; {(\prod_{i = 1}^{ν} {\hat{C}}_{p; PC_{i}} \sqrt{\frac{χ_{n - 1; (1 + α) / 2}^{2}}{n - 1}})}^{1 / ν}] .

Similarly, Wang and Chen ³³ defined the MPCIs MC_pk and MC_pm by replacing $C_{p; PC_{i}}$ with $C_{pk; PC_{i}}$ and $C_{pm; PC_{i}}$ , respectively, in (9).
The MPCIs proposed by Wang and Chen ³³ have the advantage of being simple and easy to calculate. However, Shinde and Khadse ²⁶ show, through an example, that such indices are incorrect because they assume that the specification limits of different PCs are independent of each other (which is not true).
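The index in (9) is a geometric mean of univariate C_p values computed on the retained PCs, which the following sketch makes explicit. It assumes that the eigenvectors and the standard deviations of the PCs are already available (e.g. from a PCA routine); the function name `mc_p` and all numerical values are hypothetical.

```python
import math

def mc_p(eigvecs, lsl, usl, sd_pc):
    """Geometric mean of C_p over the retained principal components.

    eigvecs : list of the first v eigenvectors u_i (each of length p)
    lsl,usl : vectors of lower/upper specification limits of X
    sd_pc   : standard deviations of the first v principal components
    """
    cps = []
    for u, sd in zip(eigvecs, sd_pc):
        lsl_pc = sum(uk * lk for uk, lk in zip(u, lsl))  # u_i' LSL
        usl_pc = sum(uk * hk for uk, hk in zip(u, usl))  # u_i' USL
        cps.append((usl_pc - lsl_pc) / (6 * sd))
    return math.prod(cps) ** (1 / len(cps))

# Hypothetical example: one retained PC with eigenvector (0.6, 0.8)
index = mc_p([(0.6, 0.8)], lsl=(90.0, 40.0), usl=(110.0, 60.0), sd_pc=(4.0,))
```

In this example the projected limits are 86 and 114, so the index equals 28/24, i.e. about 1.17. Note that the projected limits are obtained one PC at a time, which is precisely the simplification criticized above.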

MP_PC (Veevers ³⁰):

MP_{PC} = \frac{1 + \sqrt{2}}{6 \sqrt{λ_{1}}},

where λ₁ denotes the eigenvalue associated with the first PC of Σ_W (that is, the covariance matrix of W), with W being the (transformed) vector whose elements are $W_{i} = X_{i} / d_{i}$ , where $d_{i} = \frac{1}{2} (USL_{X_{i}} - LSL_{X_{i}})$ for $i = 1,2,..., p$ .
In general, we need to estimate Σ_W, thus obtaining estimates of λ₁ and MP_PC.
Shinde and Khadse ²⁶ show, through an example, that MP_PC may give a misleading assessment of process capability. They also note that this index is applicable only to a rectangular specification region.

Mp₁ and Mp₂ (Shinde and Khadse ²⁶):

M_{p1} = P \{ Y = {(Y_{1},..., Y_{ν})}^{'} \in V \mid Y \sim N_{ν} (μ_{Y} = T_{Y}, Σ_{Y} = \mathrm{diag} (λ_{1},..., λ_{ν})) \}

and

M_{p2} = P \{ Y = {(Y_{1},..., Y_{ν})}^{'} \in V \mid Y \sim N_{ν} (μ_{Y}, Σ_{Y} = \mathrm{diag} (λ_{1},..., λ_{ν})) \},

where Y₁,..., Y_ν denote the first ν PCs (explaining approximately 90% of the process variation), $λ_{1} \geq λ_{2} \geq \dots \geq λ_{p}$ are the eigenvalues of Σ, T_Y is the target vector for Y, and

V = \{ {(y_{1},..., y_{ν})}^{'} \mid LSL \leq U y \leq USL, \text{ where } y = {(y_{1},..., y_{p})}^{'} \text{ is such that } y_{r} = E [Y_{r}], r = ν + 1,..., p \}

is the specification region for Y, where $U = (u_{1},..., u_{p})$ is the matrix of eigenvectors of Σ.
As noted by Shinde and Khadse ²⁶, Mp₁ is analogous to MC_p and Mp₂ is analogous to MC_pk. Therefore, if $M_{p1} \geq 0.9973$ , the process is potentially capable, and if $M_{p2} \geq 0.9973$ , the process is actually capable.
Since the computation of these two MPCIs involves the evaluation of multiple integrals over complicated regions, the authors suggest assessing them based on the empirical probability distribution of the PCs. Their empirical approach is as follows. First, generate two random samples of large size N (e.g., $N \geq 20,000$ ) from the distribution of the first ν PCs with the following mean vectors:
- Sample I: $μ_{Y} = T_{Y}$ ,
- Sample II: μ_Y .
Then, estimate Mp ₁ and Mp ₂ as

\begin{array}{l} {\hat{M}}_{p1} = \frac{\text{number of observations } y = {(y_{1},..., y_{ν})}^{'} \text{ from sample I that belong to } V}{N}, \\ {\hat{M}}_{p2} = \frac{\text{number of observations } y = {(y_{1},..., y_{ν})}^{'} \text{ from sample II that belong to } V}{N} . \end{array}

According to Shinde and Khadse ²⁶, the empirical approach enables us to consider non-hyper-rectangular specification regions (not considered in other existing MPCIs).
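The empirical approach described above amounts to a simple Monte Carlo estimate. The sketch below draws the first ν PCs from independent normal distributions with variances λ₁,..., λ_ν, fixes the remaining PCs at their means, and counts how often $LSL \leq U y \leq USL$ holds. The function name `empirical_mp`, the seed and the numerical inputs are hypothetical. Passing the target vector as mean gives an estimate of Mp₁, and passing the actual process mean gives an estimate of Mp₂.

```python
import random

def empirical_mp(mu_y, lambdas, eigvecs, lsl, usl, n_draws=20000, seed=0):
    """Fraction of simulated PC vectors falling in the specification region.

    mu_y    : means of all p PCs (first v entries: T_Y for Mp1, mu_Y for Mp2)
    lambdas : variances (eigenvalues) of the first v PCs
    eigvecs : list of the p eigenvectors u_1,..., u_p (columns of U)
    """
    rng = random.Random(seed)
    p, v = len(mu_y), len(lambdas)
    inside = 0
    for _ in range(n_draws):
        # Draw the first v PCs independently; keep the others at their means
        y = [rng.gauss(mu_y[r], lambdas[r] ** 0.5) for r in range(v)]
        y += list(mu_y[v:])
        # x = U y, where column r of U is the eigenvector u_r
        x = [sum(eigvecs[r][k] * y[r] for r in range(p)) for k in range(p)]
        if all(lo <= xk <= hi for lo, xk, hi in zip(lsl, x, usl)):
            inside += 1
    return inside / n_draws

# Hypothetical bivariate setting with U equal to the identity matrix
u = [[1.0, 0.0], [0.0, 1.0]]
limits = dict(lsl=[-3.0, -3.0], usl=[3.0, 3.0])
mp1_hat = empirical_mp([0.0, 0.0], [1.0, 1.0], u, **limits)  # mean on target
mp2_hat = empirical_mp([1.5, 0.0], [1.0, 1.0], u, **limits)  # shifted mean
```

In this setting the exact value of Mp₁ is 0.9973² ≈ 0.9946, so the process would be judged not potentially capable under the 0.9973 criterion, and the shifted mean lowers the estimate of Mp₂ further.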

3 SYSTEM ARCHITECTURE

In this section, we present the main modules of the CEP Online system and the web-based approach. The online system is based on four main modules and the web-based approach uses only free software.

3.1 Main modules

The main modules of the proposed system are shown in Figure 1. The quality measurements (variables or attributes) are used as a source of information for the production process. Univariate or multivariate methods are applied to the inserted data, allowing an overall assessment of the process quality. To perform this, we use the R software (R Core Team ¹⁹), which retrieves the information stored in the online database and dynamically calculates descriptive measures, control charts and capability indices. The results are stored again in a database and displayed in several system reports.

Figure 1
Main modules of the CEP Online system.

3.2 Web-based approach

The current number of Internet users exceeds 3.5 billion, as estimated by Internet Live Stats (www.internetlivestats.com), which means that approximately 40% of the world population has an Internet connection today. Given the reach and convenience of the Internet (from a simple connection, an individual can visit pages that are stored on servers anywhere in the world and view many types of content), among other features, we propose an innovative computer system, called CEP Online, as an online statistical tool for monitoring and controlling process quality, as well as for assessing process capability.

The system has a cloud structure, so no local installation is necessary; only a connection to the Internet is required. Therefore, a key characteristic of this approach is that CEP Online can be used on a large scale and accessed from different locations.

The CEP Online system is essentially built upon free software, which can be easily found on the web. Basically, we use an online server with the operating system Linux Ubuntu Server 9.10 and the R software (version 3.3.1) as the computational environment for carrying out statistical analysis. The HTML and PHP web languages were used for the design of the pages and the connections to the database. For data management, we employ MySQL 5.0, which uses the SQL language as interface, and phpMyAdmin version 2.7.0-pl2, which allows the administration of MySQL over the Internet. Using phpMyAdmin, we can create and remove databases; create, delete and change tables; insert, remove, and edit fields; execute SQL code and manipulate key fields.

Figure 2 exhibits the general structure of the system, from user access through to the structure of the database. The following steps show the connection between the user and the system.

Figure 2
General structure of the CEP Online system, as well as the languages used.

- The user accesses the website and enters the login and password in the specified fields;
- The system receives the information and checks the user name and password against the database. If all is in order (i.e. the login and password are authenticated), the system grants access; otherwise, it returns a message saying that the provided information is invalid;
- With access granted, the system asks the user which SPC module to use: univariate or multivariate, depending on the number of quality characteristics considered. Then, the system redirects the user to a page with the available options.

Once logged into the system, the user can insert new process data and evaluate the results related to process quality control and capability. Available options for each SPC module of CEP Online are shown in Figure 3.

Figure 3
Available tools in each SPC module of the CEP Online system.

4 IMPLEMENTATION AND EVALUATION

In this section, we perform an SPC analysis on (univariate) data extracted from Louzada et al. ¹⁴, using the CEP Online system. Due to space limitations, we are unable to provide an SPC analysis on multivariate data as well. The (univariate) data consist of the weights (in grams, g) of chocolate bars made by a certain (fictitious) company. Other relevant information on the data used for this study: $m = 24$ and $n = 15$ (a total of 360 sampling units). The required weights are contained in the interval [90g; 110g].

The following subsections present the (step-by-step) SPC analysis on these data through CEP Online. Note that the graphical analyses are made with R software and displayed in the system as (flexible) interactive JavaScript charts (examples, including source code and live charts, are available at www.highcharts.com).

4.1 Logon to CEP Online

The user can access the system with a login and password provided by CEP Online, entered in the area on the right of the page, as shown in Figure 4.

Figure 4
Login area (www.mwstat.com/novocep).

4.2 SPC module selection

In the next step (page), the user must select the appropriate SPC module (univariate or multivariate) to perform the process quality analysis.

For the present example, which involves only one variable (the chocolate bar’s weight), we choose the univariate SPC module, as shown in Figure 5, and then click the “>>” button.

Figure 5
SPC module selection.

4.3 Data insertion

Data can be inserted into CEP Online directly, i.e. by simply typing (“Manual data entry”), or by reading from external text (.txt) or Excel (.xlsx, .xls or .csv) files. In the blank field, the user must enter the variable name (accents and spaces are not allowed) in the first row and use “enter” to separate data values. The data are then stored in the online database, and the system shows some information about them in the history of uploaded data sets, such as the file name and extension, total number of observations, upload date and time, as well as buttons that allow the user to view (“Visualize”) and delete (“Delete”) entire data sets. Moreover, on this system page, the user must specify the number of samples to be used in the analysis, as well as the sample size value (equal sample sizes) or vector (unequal sample sizes). The user must also select the type of inserted data (“Attribute” or “Variable”). Finally, the user clicks the “Analyze” button to go to the next page.

Figure 6 shows how to fill this system page for the current example.

Figure 6
Data insertion.

4.4 Data confirmation

The user must check the information after inserting data. The system presents a table with some information about the data inserted on the previous page, such as the data set (file) name; number of variables (always one if the user has selected the univariate SPC module), observations and samples; sample size value (or “Different” in the case of unequal sample sizes) and data type (“Attributes” or “Variables”). If something is wrong, e.g. the total number of observations in the data set does not match the number of samples and sample size(s), the system will send an alert message to the user, who should return to the previous page by clicking the “Return” button and correct the misspecified information. If desired, users can perform a descriptive analysis by clicking the “>>” button. Otherwise, they can go directly to the SPC analysis by clicking the “Next” button.
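The consistency check mentioned above (total number of observations versus number of samples and sample sizes) can be sketched as follows. This is only an illustration of the rule, not the actual CEP Online code; the function name `check_data_layout` and the messages are hypothetical.

```python
def check_data_layout(n_observations, n_samples, sample_sizes):
    """Check that the data set matches the declared sampling layout.

    sample_sizes : an int (equal sample sizes) or a list of ints
                   (unequal sample sizes, one entry per sample)
    Returns (ok, message).
    """
    if isinstance(sample_sizes, int):
        expected = n_samples * sample_sizes
    else:
        if len(sample_sizes) != n_samples:
            return False, "Sample size vector does not match the number of samples."
        expected = sum(sample_sizes)
    if expected != n_observations:
        return False, (f"Expected {expected} observations, "
                       f"but the data set has {n_observations}.")
    return True, "Data layout is consistent."

# Chocolate bars example: m = 24 samples of size n = 15, 360 observations
ok, message = check_data_layout(360, 24, 15)
```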

Figure 7 shows this system page for the chocolate bars example.

Figure 7
Data confirmation.

4.5 Descriptive analysis

The optional descriptive analysis displays several summary statistics (total number of observations, minimum and maximum values, first and third quartiles, median, mean, variance, standard deviation and interquartile range), graphics (boxplot, histogram and normal probability plot) and normality tests (Shapiro-Wilk, Anderson-Darling, Lilliefors, Shapiro-Francia and Cramér-von Mises). These summary statistics are also displayed (for each quality characteristic) in the optional descriptive analysis of the multivariate SPC module, in addition to the correlation matrix, graphical tools (scatter plot matrix, chi-square Q-Q plot) and multivariate normality tests (the generalization of the Shapiro-Wilk test by Villasenor-Alva and Gonzalez-Estrada ³², the E-statistic (energy) test by Szekely and Rizzo ²⁸, and Mardia’s test ¹⁶).

The descriptive analysis for the current example is presented in Figures 8 and 9. From these figures, we can observe, among other things, that the normality assumption is reasonable for the chocolate bars’ weights.

Figure 8
Descriptive analysis - part 1.

Figure 9
Descriptive analysis - part 2.

4.6 Control charts selection

On this system page, the user must select the appropriate control charts for the data. If the data are of variable type, the system will show the options: “S and $\bar{x}$ ” or “R and $\bar{x}$ ”, “CuSum” (CuSum chart) and “EWMA” (EWMA chart). It is important to note that the system does not allow the user to select both options, “S and $\bar{x}$ ” and “R and $\bar{x}$ ”, since the S and $\bar{x}$ charts are recommended for $n \geq 10$ or unequal sample sizes; otherwise, the R and $\bar{x}$ charts are the recommended ones (as noted in Section 2.1.1). If the data are of attribute type, the system will present the options: p, np, c and u. Furthermore, the user must define the process condition, i.e. whether there is parameter specification (“with parameter specification”) or not (“without parameter specification”).

Figure 10 shows this system page filled in for the current example, with no parameter specification for the production process. Note that the S and $\bar{x}$ charts are more adequate than the R and $\bar{x}$ ones, since $n = 15 > 10$ .
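The recommendation rule stated above for variable-type data can be written as a small helper. The function name `recommended_variable_charts` is hypothetical and the sketch only encodes the rule as stated here (S and $\bar{x}$ for $n \geq 10$ or unequal sample sizes, R and $\bar{x}$ otherwise); it is not the system's actual code.

```python
def recommended_variable_charts(sample_sizes):
    """Return the recommended pair of Shewhart charts for variable data.

    sample_sizes : an int (equal sample sizes) or a list of ints
    """
    sizes = [sample_sizes] if isinstance(sample_sizes, int) else list(sample_sizes)
    unequal = len(set(sizes)) > 1
    if unequal or min(sizes) >= 10:
        return "S and x-bar"
    return "R and x-bar"

# Chocolate bars example: equal sample sizes with n = 15
choice = recommended_variable_charts(15)  # "S and x-bar"
```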

Figure 10
Control charts selection and process condition.

4.7 SPC analysis

Here, the CEP Online system generates the (interactive) control chart(s) chosen on the previous page. If the data are of variable type, there will be some blank fields in the left area, to be filled with the information needed (lower specification limit - “LSL”, upper specification limit - “USL”, and confidence level - “α”) to perform a capability study of the process (only in the case where the process is under control). This area also shows a confirmation of the type of data and control chart(s) selected in the previous steps.

Figure 11 presents the S and $\bar{x}$ Shewhart control charts for the current example. Note that both the variability and the mean of the quality characteristic (chocolate bars’ weights) are under control. We can thus perform a capability study of this production process.

Figure 11
SPC analysis.

4.8 Process capability analysis

The last page of the univariate SPC module of CEP Online shows a capability report for the process. This report includes a histogram with a shaded specification region, several capability indices (the ones described in Section 2.1.3) and their 100α% confidence intervals (only for some of them). The left area again shows a confirmation of the information (type of data, control chart(s), process condition, lower and upper specification limits, and confidence level) inserted on the previous pages.

From Figure 12, we can observe that the process considered here (chocolate bars’ weights) is not able to meet its specifications, since the C _p , C _pk and C _pm values are much lower than 1 (at the 95% confidence level). Moreover, the P value tells us that this process uses approximately 300% of the specification band.

Figure 12
Process capability analysis

5 FINAL REMARKS

In this paper, a web-based expert system was developed to perform SPC analysis. It presents two statistical modules, one univariate and the other multivariate, which create different types of reports that show the process quality performance. The CEP Online system is built from web languages and free software, and uses the R software to perform instantaneous analyses.

To the best of our knowledge, no other web-based system has been developed to perform online SPC analysis. As we have seen here, the proposed methods are simple, yet cover several well-known univariate and multivariate analyses. Besides, our online system can be considered an innovative software tool, built mainly to help small and medium-sized companies perform SPC and capability analyses at low cost, i.e. with no need for expensive well-known software, such as SAS, SPSS, Minitab and STATISTICA.

The CEP Online system allows the continuous monitoring of the process in a simple and efficient way, taking into account the data typed directly into the system or read from external files. It also allows the study of the ability of the process to meet its specifications/requirements.

In order to promote and popularize the access of information and the statistical science applied to process control and monitoring, the CEP Online system can be used in any company of the country, as well as in SPC training courses.

Since the proposed system has a cloud structure, it does not need to be installed locally, requiring only a connection to the Internet (low cost). This means that CEP Online can be used on a large scale and accessed from different locations and different types of device.

As future prospects, we plan the development and implementation of new statistical methods in both SPC modules, including modules that can be built by other researchers/practitioners, besides the continuous improvement of the presented modules.

Hence, we envision the CEP Online system being used worldwide, giving companies of all sizes and types equal opportunities to perform SPC analysis, even those that cannot pay for statistical software.

ACKNOWLEDGEMENTS

The research is partially funded by the Brazilian organizations, CNPq and FAPESP.

REFERENCES

¹
ALT FB. 1985. Multivariate Quality Control. In: KOTZ S, JOHNSON NL & READ CR (Eds.), The Encyclopedia of Statistical Sciences, John Wiley & Sons, New York, vol. 6, pp. 110-122.
²
BARROS F, BARBOSA E, GONCALVES E & RECCHIA D. 2017. IQCC: Improved Quality Control Charts. http://CRAN.R-project.org/package=IQCC, r package version 0.7.
³
BOYLES RA. 1991. The Taguchi Capability Index. Journal of Quality Technology, 23(2): 107-126.
⁴
BUSABABODHIN P & AMPHANTHONG P. 2016. Copula modelling for multivariate statistical process control: a review. Communications for Statistical Applications and Methods, 23(6): 497-515.
⁵
CANO EL, MOGUERZA JM & REDCHUK A. 2012. Six Sigma with R. Springer, New York.
⁶
CHEN N, ZI X & ZOU C. 2016. A Distribution-Free Multivariate Control Chart. Technometrics, 58(4): 448-459.
⁷
FRAGA GR, PEIXOTO TA & RANGEL JJA. 2018. Simulation optimization in dosing process control system in real time in a free and open-source software. Pesquisa Operacional, 38(2): 273-289.
⁸
GANDY A & KVALOY JT. 2013. Guaranteed Conditional Performance of Control Charts via Bootstrap Methods. Scandinavian Journal of Statistics, 40(4): 647-668.
⁹
HOLMES DS & MERGEN AE. 1993. Improving the performance of the T² control chart. Quality Engineering, 5(4): 619-625.
¹⁰
JOHNSON RA & WICHERN DW. 2007. Applied Multivariate Statistical Analysis. Pearson, New Jersey, 6 ed., 800 pp.
¹¹
KNOTH S. 2018. spc: Statistical Process Control - Calculation of ARL and Other Control Chart Performance Measures. https://CRAN.R-project.org/package=spc, r package version 0.6.0.
¹²
KOTZ S & LOVELACE CR. 1998. Process Capability Indices in Theory and Practice. Arnold, London.
¹³
LOUZADA F & ARA A. 2018. MWStat: A modulated web-based statistical system. Pesquisa Operacional, 38(2): 291-306.
¹⁴
LOUZADA F, DINIZ C, FERREIRA P & FERREIRA E. 2013. Controle Estatístico de Processos: Uma abordagem prática para cursos de Engenharia e Administração. LTC, Rio de Janeiro, 282 pp.
¹⁵
MACHADO MAG, COSTA AFB & CLARO FAE. 2009. Monitoring bivariate processes. Pesquisa Operacional, 29(3): 547-562.
¹⁶
MARDIA KV. 1970. Measures of multivariate skewness and kurtosis with applications. Biometrika, 57(3): 519-530.
¹⁷
MONTGOMERY DC. 2009. Introduction to Statistical Quality Control. John Wiley & Sons, 7 ed., 768 pp.
¹⁸
MOSTAJERAN A, IRANPANAH N & NOOROSSANA R. 2018. An Explanatory Study on the Non-Parametric Multivariate T2 Control Chart. Journal of Modern Applied Statistical Methods, 17(1): 2-27.
¹⁹
R CORE TEAM. 2014. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria, http://www.R-project.org/
²⁰
ROTH T. 2016. qualityTools: Statistical methods for quality science. http://www.r-qualitytools.org, r package version 1.55.
²¹
SAGHIR A & LIN Z. 2015. Control Charts for Dispersed Count Data: An Overview. Quality and Reliability Engineering International, 31(5): 725-739.
²²
SANTOS-FERNÁNDEZ E. 2013. Multivariate Statistical Quality Control Using R. Springer, New York.
²³
SANTOS-FERNÁNDEZ E & SCAGLIARINI M. 2012. MPCI: An R Package for Computing Multivariate Process Capability Indices. Journal of Statistical Software, 47(7): 1-15.
²⁴
SARKAR D. 2008. Lattice: Multivariate Data Visualization with R. Springer, New York.
²⁵
SCRUCCA L. 2004. qcc: An R package for quality control charting and statistical process control. R News, 4(1): 11-17, https://cran.r-project.org/doc/Rnews/
²⁶
SHINDE RL & KHADSE KG. 2008. Multivariate process capability using principal component analysis. Quality and Reliability Engineering International, 25(1): 69-77.
²⁷
SULLIVAN JH & WOODALL WH. 1995. A comparison of multivariate quality control charts for individual observations. Journal of Quality Technology, 28(4): 398-408.
²⁸
SZEKELY GJ & RIZZO ML. 2005. A new test for multivariate normality. Journal of Multivariate Analysis, 93(1): 58-80.
²⁹
TRACY ND, YOUNG JC & MASON RL. 1992. Multivariate control charts for individual observations. Journal of Quality Technology, 24(2): 88-95.
³⁰
VEEVERS A. 1999. Capability indices for multiresponse processes. In: PARK SH & VINING GG (Eds.), Statistical Process Monitoring and Optimization, Marcel Dekker, New York, vol. 28, pp. 185-194.
³¹
VERDIER G. 2013. Application of copulas to multivariate control charts. Journal of Statistical Planning and Inference, 143(12): 2151-2159.
³²
VILLASENOR-ALVA JA & GONZALEZ-ESTRADA E. 2009. A generalization of Shapiro-Wilk’s test for multivariate normality. Communications in Statistics: Theory and Methods, 38(11): 1870-1883.
³³
WANG FK & CHEN JC. 1998. Capability index using principal component analysis. Quality Engineering, 11(1): 21-27.
³⁴
WANG FK & DU TCT. 2000. Using principal component analysis in process performance for multivariate data. Omega, 28(1): 185-194.

Publication Dates

Publication in this collection
09 May 2019
Date of issue
Jan-Apr 2019

History

Received
26 July 2017
Accepted
13 Feb 2019

This is an open-access article distributed under the terms of the Creative Commons Attribution License

[1] ¹
ALT FB. 1985. Multivariate Quality Control. In: KOTZ S, JOHNSON NL & READ CR (Eds.), The Encyclopedia of Statistical Sciences, John Wiley & Sons, New York, vol. 6, pp. 110-122.

[2] ²
BARROS F, BARBOSA E, GONCALVES E & RECCHIA D. 2017. IQCC: Improved Quality Control Charts. http://CRAN.R-project.org/package=IQCC, r package version 0.7.
» http://CRAN.R-project.org/package=IQCC

[3] ³
BOYLES RA. 1991. The Taguchi Capability Index. Journal of Quality Technology, 23(2): 107-126.

[4] ⁴
BUSABABODHIN P & AMPHANTHONG P. 2016. Copula modelling for multivariate statistical process control: a review. Communications for Statistical Applications and Methods, 23(6): 497-515.

[5] ⁵
CANO EL, MOGUERZA JM & REDCHUK A. 2012. Six Sigma with R. Springer, New York.

[6] ⁶
CHEN N, ZI X & ZOU C. 2016. A Distribution-Free Multivariate Control Chart. Technometrics, 58(4): 448-459.

[7] ⁷
FRAGA GR, PEIXOTO TA & RANGEL JJA. 2018. Simulation optimization in dosing process control system in real time in a free and open-source software. Pesquisa Operacional, 38(2): 273-289.

[8] ⁸
GANDY A & KVALOY JT. 2013. Guaranteed Conditional Performance of Control Charts via Bootstrap Methods. Scandinavian Journal of Statistics, 40(4): 647-668.

[9] ⁹
HOLMES DS & MERGEN AE. 1993. Improving the performance of the T² control chart. Quality Engineering, 5(4): 619-625.

[10] ¹⁰
JOHNSON RA & WICHERN DW. 2007. Applied Multivariate Statistical Analysis. Pearson, New Jersey, 6 ed., 800 pp.

[11] ¹¹
KNOTH S. 2018. spc: Statistical Process Control - Calculation of ARL and Other Control Chart Performance Measures. https://CRAN.R-project.org/package=spc, r package version 0.6.0.
» https://CRAN.R-project.org/package=spc

[12] ¹²
KOTZ S & LOVELACE CR. 1998. Process Capability Indices in Theory and Practice. Arnold, London.

[13] ¹³
LOUZADA F & ARA A. 2018. MWStat: A modulated web-based statistical system. Pesquisa Operacional, 38(2): 291-306.

[14] ¹⁴
LOUZADA F, DINIZ C, FERREIRA P & FERREIRA E. 2013. Controle Estatístico de Processos: Uma abordagem prática para cursos de Engenharia e Administração. LTC, Rio de Janeiro, 282 pp.

[15] ¹⁵
MACHADO MAG, COSTA AFB & CLARO FAE. 2009. Monitoring bivariate processes. Pesquisa Operacional, 29(3): 547-562.

[16] ¹⁶
MARDIA KV. 1970. Measures of multivariate skewness and kurtosis with applications. Biometrika, 57(3): 519-530.

[17] ¹⁷
MONTGOMERY DC. 2009. Introduction to Statistical Quality Control. JohnWiley & Sons, 7 ed., 768 pp.

[18] ¹⁸
MOSTAJERAN A, IRANPANAH N & NOOROSSANA R. 2018. An Explanatory Study on the Non-Parametric Multivariate T2 Control Chart. Journal of Modern Applied Statistical Methods, 17(1): 2-27.

[19] ¹⁹
R CORE TEAM. 2014. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria, http://www.R-project.org/
» http://www.R-project.org/

[20] ²⁰
ROTH T. 2016. qualityTools: Statistical methods for quality science. http://www.r-qualitytools.org, r package version 1.55.
» http://www.r-qualitytools.org

[21] ²¹
SAGHIR A & LIN Z. 2015. Control Charts for Dispersed Count Data: An Overview. Quality and Reliability Engineering International, 31(5): 725-739.

[22] ²²
SANTOS-FERNÁNDEZ E. 2013. Multivariate Statistical Quality Control Using R. Springer, New York .

[23] ²³
SANTOS-FERNÁNDEZ E & SCAGLIARINI M. 2012. MPCI: An R Package for Computing Multivariate Process Capability Indices. Journal of Statistical Software, 47(7): 1-15.

[24] ²⁴
SARKAR D. 2008. Lattice: Multivariate Data Visualization with R. Springer, New York .

[25] ²⁵
SCRUCCA L. 2004. qcc: An R package for quality control charting and statistical process control. R News, 4(1): 11-17, https://cran.r-project.org/doc/Rnews/
» https://cran.r-project.org/doc/Rnews/

[26] ²⁶
SHINDE RL & KHADSE KG. 2008. Multivariate process capability using principal component analysis. Quality and Reliability Engineering International, 25(1): 69-77.

[27] ²⁷
SULLIVAN JH & WOODALL WH. 1995. A comparison of multivariate quality control charts for individual observations. Journal of Quality Technology, 28(4): 398-408.

[28] ²⁸
SZEKELY GJ & RIZZO ML. 2005. A new test for multivariate normality. Journal of Multivariate Analysis, 93(1): 58-80.

[29] ²⁹
TRACY ND, YOUNG JC & MASON RL. 1992. Multivariate control charts for individual observations. Journal of Quality Technology, 24(2): 88-95.

[30] ³⁰
VEEVERS A. 1999. Capability indices for multiresponse processes. In: PARK SH & VINING GG (Eds.), Statistical Process Monitoring and Optimization, Marcel Dekker, New York, vol. 28, pp. 185-194.

[31] ³¹
VERDIER G. 2013. Application of copulas to multivariate control charts. Journal of Statistical Planning and Inference, 143(12): 2151-2159.

[32] ³²
VILLASENOR-ALVA JA & GONZALEZ-ESTRADA E. 2009. A generalization of Shapiro-Wilk’s test for multivariate normality. Communications in Statistics: Theory and Methods, 38(11): 1870-1883.

[33] ³³
WANG FK & CHEN JC. 1998. Capability index using principal component analysis. Quality Engineering, 11(1): 21-27.

[34] ³⁴
WANG FK & DU TCT. 2000. Using principal component analysis in process performance for multivariate data. Omega, 28(1): 185-194.

