METHODS

Data from previous research were used to test the Pareto macro with a variety of variables. Physical variables from northern Lake Winnipegosis, Manitoba, were compared with arcellacean and foraminiferal abundance data that had been compiled using multivariate analysis (Boudreau 1999; Fishbein and Patterson 1993). A three-factor analysis with pH, oxygen content and salinity as the variables was conducted for the abundance of each of Centropyxis aculeata, Difflugia corona, Cribroelphidium gunteri and Lagenodifflugia vas.

Two-level factorial designs were used for the analyses, with each factor analyzed at a high and a low setting. For the three-factor analysis, the data set was divided as shown in Table 1, where A represents pH, B represents oxygen content and C represents salinity.
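As an illustration of the design described above, the following minimal Python sketch enumerates the runs of a two-level, three-factor design. The coding of low as -1 and high as +1 and the function name full_factorial are assumptions made for this example; the run order is whatever itertools.product yields and may not match the layout of Table 1.

```python
from itertools import product

# Factor labels as used in the text: A = pH, B = oxygen content, C = salinity.
FACTORS = ("A", "B", "C")


def full_factorial(n_factors):
    """Return every combination of low (-1) and high (+1) settings.

    The -1/+1 coding is an assumption of this sketch, not taken from Table 1.
    """
    return [tuple(run) for run in product((-1, 1), repeat=n_factors)]


design = full_factorial(len(FACTORS))

for run in design:
    # Show each run as a mapping from factor label to its coded level.
    print(dict(zip(FACTORS, run)))
```

Each of the three factors appears at its high setting in exactly half of the runs, which is what allows every effect to be estimated from the same set of observations.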

A macro for Excel™ was constructed to analyze the data sets. Two-level factorial designs (clear signal designs) were used because they are effective and easy to interpret. With three factors, the analysis includes 2 x 2 x 2 = 8 combinations of high and low parameter values. The main effect of each parameter was estimated as the difference in average foraminiferal or arcellacean abundance between the runs at the high value of that parameter and the runs at its low value. The Excel™ macro constructs a Pareto chart from the input data, which identifies the parameters and the interactions between parameters that contribute most to the abundance of the arcellaceans and the foraminifera. The data used in the analysis are shown in Table 2, and an example of a test run using the data for Cribroelphidium gunteri is shown in Table 3.
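The calculation described above can be sketched in Python as follows. This is not the Excel™ macro itself, only a minimal illustration of the same idea under stated assumptions: levels are coded -1 (low) and +1 (high), the function names effects and pareto_order are hypothetical, and the response values are placeholders invented for the example, not the data of Table 2.

```python
from itertools import combinations
from math import prod


def effects(responses, names="ABC"):
    """Estimate main and interaction effects of a two-level full factorial.

    `responses` maps each run, a tuple of -1/+1 levels, to its response.
    An effect is the mean response where the product of the involved
    factor levels is +1 minus the mean response where it is -1.
    """
    n = len(names)
    out = {}
    for order in range(1, n + 1):
        for idx in combinations(range(n), order):
            label = "".join(names[i] for i in idx)
            hi = [y for run, y in responses.items()
                  if prod(run[i] for i in idx) == 1]
            lo = [y for run, y in responses.items()
                  if prod(run[i] for i in idx) == -1]
            out[label] = sum(hi) / len(hi) - sum(lo) / len(lo)
    return out


def pareto_order(effs):
    """Sort effects by absolute size, largest first, as in a Pareto chart."""
    return sorted(effs.items(), key=lambda kv: abs(kv[1]), reverse=True)


# Placeholder relative abundances (hypothetical, NOT the study's data).
# Keys are (A, B, C) = (pH, oxygen content, salinity).
responses = {
    (-1, -1, -1): 0.665, (-1, -1, 1): 0.685,
    (-1, 1, -1): 0.715, (-1, 1, 1): 0.735,
    (1, -1, -1): 0.215, (1, -1, 1): 0.235,
    (1, 1, -1): 0.365, (1, 1, 1): 0.385,
}

effs = effects(responses)
for label, value in pareto_order(effs):
    print(f"{label:>3}  {value:+.3f}")
```

Sorting by absolute value rather than signed value matters here, because a strongly negative effect contributes as much to the response as a strongly positive one.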

Pareto analysis can also be done graphically. First, the data are plotted as shown in Figure 1A. Then, data in the cube plot are averaged across one dimension, for example [salt], to generate a square plot as shown in Figure 1B.

The data are further reduced by averaging across another dimension, for example [O2], to yield a line plot as shown in Figure 1C.

The effect of going from low pH to high pH on the measured parameter, in this case relative abundance, is determined by subtracting the value at low pH from the value at high pH, in this case 0.277 - 0.674 = -0.397. This result indicates that an increase in pH leads to a decrease in the abundance of this species. The same reduction is repeated along the other dimensions to obtain the effects of the remaining variables, in this example [O2] and [salt].
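The reduction from cube plot to square plot to line plot can be written out numerically. The sketch below assumes a nested-list layout cube[pH][O2][salt] with index 0 for low and 1 for high; the values are placeholders chosen for illustration and are not those of Figure 1.

```python
def mean(values):
    """Arithmetic mean of a sequence of numbers."""
    values = list(values)
    return sum(values) / len(values)


# Placeholder responses (hypothetical, NOT the data plotted in Figure 1).
# Index order: cube[pH][O2][salt], with 0 = low and 1 = high.
cube = [
    [[0.665, 0.685], [0.715, 0.735]],  # low pH
    [[0.215, 0.235], [0.365, 0.385]],  # high pH
]

# Cube -> square: average across the salinity dimension.
square = [[mean(cell) for cell in row] for row in cube]

# Square -> line: average across the oxygen dimension.
line = [mean(row) for row in square]

# Main effect of pH: value at high pH minus value at low pH.
ph_effect = line[1] - line[0]
print(round(ph_effect, 3))
```

With these placeholder numbers the pH effect is about -0.4, a negative effect of the same kind as the worked example in the text. Averaging over a different pair of dimensions gives the effects of [O2] and [salt] in the same way.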

Interaction between factors occurs when the effect of one factor depends on whether another factor is at its high or low value. Two-factor interactions can also be calculated using this analysis. The data are reduced to the square plot, and the interaction is taken as the average of the runs in which the two factors are at matching extremes (i.e., High-High and Low-Low) minus the average of the runs in which the factor levels are mixed (i.e., High-Low and Low-High). If no two-factor interaction is occurring, this difference is zero.
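A minimal Python sketch of this interaction calculation is given below. The function name two_factor_interaction and the square[i][j] layout (index 0 for low, 1 for high) are assumptions of the example, and both sets of values are invented placeholders rather than measured abundances.

```python
def two_factor_interaction(square):
    """Interaction of two factors from a 2 x 2 table of mean responses.

    square[i][j] is the mean response with the first factor at level i
    and the second at level j (0 = low, 1 = high).
    """
    matching = (square[1][1] + square[0][0]) / 2  # High-High and Low-Low
    mixed = (square[1][0] + square[0][1]) / 2     # High-Low and Low-High
    return matching - mixed


# Placeholder square plot (hypothetical values) with a small interaction.
interacting = [[0.675, 0.725], [0.225, 0.375]]

# Placeholder square plot in which the two effects simply add together,
# so the calculated interaction is zero.
additive = [[0.2, 0.3], [0.5, 0.6]]

print(round(two_factor_interaction(interacting), 3))
print(round(two_factor_interaction(additive), 3))
```

In the additive table, moving the second factor from low to high raises the response by 0.1 at either level of the first factor, which is why the matching and mixed averages coincide.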

The Excel™ spreadsheet allows this analysis to be conducted without drawing numerous graphs, which greatly speeds up the data analysis process.