This paper presents an application that combines Cluster Analysis and Decision Risk Analysis. Cluster Analysis is usually regarded as a tool for Exploratory Data Analysis; here it is applied to a more highly structured problem.

Mixture-model cluster analysis is applied to data on machine and manual white blood cell counts to estimate the lower limit of machine accuracy. The manual counts are treated as accurate. The Lab Manager needs a rule of the form, "Repeat the count manually if the machine count is less than L," and a suitable value of L must be chosen. The method is as follows. A sample of (machine, manual) pairs of counts is obtained, and a mixture of two bivariate Gaussian clusters is fitted to the logarithms of the data. In one cluster the machine and manual results are highly correlated. In the other, which corresponds to low machine counts, the correlation between the two is very low. The cluster analysis yields parameter estimates for the bivariate Gaussian components of the mixture. As a by-product, it also gives an estimate of the marginal distribution of the decision variable, the machine count. The cut-off point L is then defined in terms of this distribution, taking account of the risks involved.
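The last step can be illustrated with a minimal sketch. It is not the paper's procedure, since the abstract does not state the exact rule used to define L. It assumes that the fit was made to natural logarithms, that the marginal of the log machine count in each cluster is Gaussian with the mean and variance taken from the fitted bivariate component, and that L is placed where the posterior probability of the low-count cluster equals the ratio of two hypothetical costs, `cost_repeat` and `cost_miss`. All numeric values and all function names are illustrative assumptions, not estimates or notation from the paper.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a univariate Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def posterior_low(x, w_low, mean_low, var_low, mean_high, var_high):
    """P(low-count cluster | log machine count = x) under the marginal mixture."""
    a = w_low * normal_pdf(x, mean_low, var_low)
    b = (1.0 - w_low) * normal_pdf(x, mean_high, var_high)
    return a / (a + b)


def cutoff(w_low, mean_low, var_low, mean_high, var_high, cost_miss, cost_repeat):
    """Return L on the original count scale.

    Assumed rule (not from the paper): repeat manually when
    posterior_low * cost_miss > cost_repeat, so L sits where the
    posterior equals cost_repeat / cost_miss.
    """
    threshold = cost_repeat / cost_miss
    args = (w_low, mean_low, var_low, mean_high, var_high)
    lo, hi = mean_low, mean_high
    # Between the two cluster means the posterior decreases monotonically,
    # so bisection works as long as the threshold is bracketed there.
    if not (posterior_low(lo, *args) > threshold > posterior_low(hi, *args)):
        raise ValueError("threshold is not bracketed by the two cluster means")
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if posterior_low(mid, *args) > threshold:
            lo = mid
        else:
            hi = mid
    return math.exp(0.5 * (lo + hi))  # back-transform from the log scale


# Illustrative (made-up) marginal parameters on the natural-log scale.
params = dict(w_low=0.2, mean_low=math.log(1.0), var_low=0.25,
              mean_high=math.log(6.0), var_high=0.16)
L = cutoff(**params, cost_miss=10.0, cost_repeat=1.0)
print(f"Repeat the count manually if the machine count is less than {L:.2f}")
```

Under this assumed rule, raising `cost_miss` relative to `cost_repeat` lowers the posterior threshold and so moves L upward, sending more counts for manual repetition.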