Simple statistics


Univariate statistics

Typical application: Quick statistical description of a univariate sample
Assumptions: None, but variance and standard deviation are most meaningful for normally distributed data
Data needed: Single column of measured or counted data

Displays the following statistics: number of entries (N), smallest value (Min), largest value (Max), mean value (Mean), population variance (the variance of the population as estimated from the sample), sample variance (the variance of the sample itself), population and sample standard deviations (the square roots of the corresponding variances), median, skewness (positive for a tail to the right) and kurtosis (positive for a peaked distribution).
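The statistics listed above can be sketched in a few lines of Python. This is only an illustration, not the program's own code: the function name univariate_summary is hypothetical, and the sketch assumes that the population variance divides by N-1, that the sample variance divides by N, and that skewness and kurtosis are the plain moment-based versions (kurtosis reported as excess over the normal distribution's value of 3). The program may apply different bias corrections.

```python
import statistics


def univariate_summary(values):
    """Return descriptive statistics for one column of data (needs N >= 2)."""
    n = len(values)
    mean = sum(values) / n
    # Sum of squared deviations from the mean, shared by both variances.
    ss = sum((x - mean) ** 2 for x in values)
    population_variance = ss / (n - 1)  # assumption: population estimate, N-1
    sample_variance = ss / n            # assumption: the sample itself, N
    # Third and fourth central moments for skewness and kurtosis.
    m3 = sum((x - mean) ** 3 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n
    return {
        "N": n,
        "Min": min(values),
        "Max": max(values),
        "Mean": mean,
        "population_variance": population_variance,
        "sample_variance": sample_variance,
        "population_sd": population_variance ** 0.5,
        "sample_sd": sample_variance ** 0.5,
        "median": statistics.median(values),
        # Assumption: uncorrected moment estimators.
        "skewness": m3 / sample_variance ** 1.5,
        "kurtosis": m4 / sample_variance ** 2 - 3.0,
    }


summary = univariate_summary([2, 4, 4, 4, 5, 5, 7, 9])
```

For this small column the mean is 5, the sample variance is 4, and the skewness is positive because the largest value, 9, lies farther from the mean than the smallest.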

Diversity statistics

Typical application: Quantifying taxonomical diversity in samples
Assumptions: Representative samples
Data needed: One or more columns, each containing counts of individuals of different taxa down the rows

These statistics apply to association data, where the numbers of individuals are tabulated with taxa in rows and one or more associations in columns. The following statistics are available for each association:

  • Number of taxa
  • Total number of individuals
  • Dominance=1-Simpson index. Ranges from 0 (all taxa are equally present) to 1 (one taxon dominates the community completely).
  • Simpson index=1-dominance. Measures 'evenness' of the community from 0 to 1. Note the confusion in the literature: Dominance and Simpson indices are often interchanged!
  • Shannon index (entropy). A diversity index, taking into account the number of individuals as well as number of taxa. Varies from 0 for communities with only a single taxon to high values for communities with many taxa, each with few individuals.
  • Menhinick's richness index: the ratio of the number of taxa to the square root of the sample size.
  • Margalef's richness index: (S-1)/ln(n), where S is the number of taxa and n is the number of individuals.
  • Equitability. Shannon diversity divided by the logarithm of the number of taxa. This measures the evenness with which individuals are divided among the taxa present.
  • Fisher's alpha: a diversity index, defined implicitly by the formula S=a*ln(1+n/a), where S is the number of taxa, n is the number of individuals and a is Fisher's alpha.

Many of these indices are explained in Harper (1999).
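The indices in the list above can be sketched in Python for a single association (one column of counts). This is a rough illustration under stated assumptions, not the program's implementation: the function names are hypothetical, natural logarithms are assumed throughout, and dominance is assumed to be the sum of squared proportions, since the text defines dominance and the Simpson index only relative to each other. With that formula, the dominance of a perfectly even community is 1/S, which approaches 0 only as the number of taxa grows. Fisher's alpha has no closed form, so the sketch solves S=a*ln(1+n/a) numerically by bisection.

```python
import math


def fisher_alpha(num_taxa, num_individuals, tol=1e-10):
    """Solve S = a*ln(1 + n/a) for a by bisection (requires 0 < S < n)."""
    s, n = num_taxa, num_individuals
    if not 0 < s < n:
        return float("nan")  # no finite positive solution exists

    def excess(a):
        # a*ln(1+n/a) rises monotonically from 0 towards n as a grows.
        return a * math.log1p(n / a) - s

    lo, hi = 1e-12, 1.0
    while excess(hi) < 0:  # expand the bracket until it contains the root
        hi *= 2.0
    while hi - lo > tol * hi:
        mid = 0.5 * (lo + hi)
        if excess(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def diversity_indices(counts):
    """Compute the listed indices for one column of per-taxon counts."""
    counts = [c for c in counts if c > 0]  # ignore absent taxa
    s = len(counts)
    n = sum(counts)
    p = [c / n for c in counts]
    dominance = sum(q * q for q in p)  # assumption: sum of squared proportions
    shannon = -sum(q * math.log(q) for q in p)
    nan = float("nan")
    return {
        "taxa": s,
        "individuals": n,
        "dominance": dominance,
        "simpson": 1.0 - dominance,
        "shannon": shannon,
        "menhinick": s / math.sqrt(n),
        "margalef": (s - 1) / math.log(n) if n > 1 else nan,
        "equitability": shannon / math.log(s) if s > 1 else nan,
        "fisher_alpha": fisher_alpha(s, n),
    }


even = diversity_indices([10, 10, 10, 10])
```

For four equally abundant taxa the equitability is 1 and the Shannon index equals ln(4). A community with a single taxon gives a Shannon index of 0 and a dominance of 1, matching the ranges described in the list.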

Rarefaction

Typical application: Comparing taxonomical diversity in samples of different sizes
Assumptions: ?
Data needed: Single column of counts of individuals of different taxa

Given a column of abundance data for a number of taxa, this module estimates how many taxa you would expect to find in a sample with a smaller total number of individuals. This lets you compare the number of taxa in samples of different sizes. By running rarefaction analysis on your largest sample, you can read off the expected number of taxa for any smaller sample size (including that of the smallest sample). The algorithm is from Krebs (1989). An example application in paleontology can be found in Adrain et al. (2000).
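The calculation described above can be sketched as follows. The sketch uses the standard hypergeometric rarefaction formula, E(S_m) = sum over taxa of [1 - C(N-N_i, m)/C(N, m)], where N is the total number of individuals, N_i is the count for taxon i and m is the smaller sample size. It is an assumption that this is the formula the module takes from Krebs (1989); the function name rarefied_taxa is hypothetical, and a real implementation might use log-gamma functions instead of exact binomial coefficients.

```python
from math import comb


def rarefied_taxa(counts, m):
    """Expected number of taxa in a random subsample of m individuals."""
    counts = [c for c in counts if c > 0]  # ignore absent taxa
    total = sum(counts)
    if not 0 <= m <= total:
        raise ValueError("m must lie between 0 and the total count")
    denom = comb(total, m)
    # comb(total - c, m) / denom is the chance that taxon i is missed entirely;
    # math.comb returns 0 when m exceeds total - c, so such taxa always appear.
    return sum(1 - comb(total - c, m) / denom for c in counts)


# Expected taxa when a 10-individual sample is rarefied to 2 individuals.
expected = rarefied_taxa([5, 3, 2], 2)
```

Evaluating the function at every m from 1 up to the total gives the rarefaction curve for the sample. At m equal to the total it returns the observed number of taxa.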
