Statistical Tests:
Choosing and Understanding the Appropriate Statistical Test for Ecological Studies


Use this dichotomous key to determine the type of simple statistical analysis that you should perform on your ecological data. The terms are general of necessity; for further guidance, consult your Instructor and TAs. To proceed, choose the option that is most appropriate for your data and experimental design and follow the suggestions therein. A glossary is included below that briefly explains the terms used in the key.

When you have finished navigating through this chart, you can then analyze your data most effectively with sophisticated, but easy to use, software packages such as SPSS or SAS.

If you would like to know more about the statistical tests mentioned below, consult any of many excellent introductory statistics textbooks. I used what has arguably been the standard in the field for decades, entitled Biometry, by Robert R. Sokal and F. James Rohlf, 1994, from W. H. Freeman and Company Press, ISBN: 0716724111. In addition, see below for a few websites that will help you deepen your understanding of statistics.

GOOD LUCK!


1a. The data adhere to the assumptions of normality (Parametric statistical tests)
Go to 2
1b. The data either do not adhere to the assumptions of normality or have not been tested (Non-parametric statistical tests)
Go to 8


2a. 1 independent variable (IV) with 2 treatments, 1 dependent variable (DV) of continuous value
t-test or single-classification ANOVA
2b. Not as above (more than 1 IV, more than 2 treatments, or more than 1 DV)
Go to 3


3a. 1 IV with 3 or more treatments
Go to 4
3b. 2 or more IV, each with 1 or more treatments
Go to 6


4a. 1 IV with 3 or more treatments, 1 DV of continuous value, and wish to test for within-group influences
Nested ANOVA
4b. 1 IV with 3 or more treatments but different from above (with 2 or more DV, or do NOT wish to test for within-group influences)
Go to 5


5a. 1 IV with 3 or more treatments, 1 DV of continuous value
Single Classification ANOVA
5b. 1 IV with 3 or more treatments, 2 or more DV of continuous value
MANOVA


6a. 2 IV
Go to 7
6b. 3 or more IV, with two or more treatments per IV
Multivariate analysis (not covered in this key)


7a. 2 IV, only 1 treatment in each IV, 1 DV of continuous value
Linear Regression
7b. 2 IV, each with 2 or more treatments, 1 DV of continuous value
Two-Way ANOVA


8a. 1 IV
Go to 9
8b. 2 or more IV
Friedman's Method for Randomized Blocks


9a. 1 IV, only 2 treatments, 1 DV
Go to 10
9b. 1 IV, 3 or more treatments, 1 DV
Kruskal-Wallis Test or Chi-squared Goodness of Fit Test


10a. 1 IV, 2 treatments, 1 DV, DV is categorical
Chi-squared Test
10b. 1 IV, 2 treatments, 1 DV, DV is either continuous or ordinal
Go to 11


11a. 1 IV, 2 treatments, 1 DV, DV is either continuous or ordinal, paired replicates
Wilcoxon's Signed-Rank Test
11b. 1 IV, 2 treatments, 1 DV, DV is either continuous or ordinal, unpaired replicates
Mann-Whitney U Test or Kolmogorov-Smirnov 2-Sample Test
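
The key above can also be walked programmatically. The following is a minimal sketch in Python, not part of the original key: the function name and all parameter names are hypothetical, the second lead of couplet 9 is read here as 3 or more treatments so that the two leads do not overlap, and designs that the key does not describe are not validated.

```python
def choose_test(parametric, n_iv, n_treatments, n_dv=1,
                dv_type="continuous", nested=False, paired=False):
    """Walk the dichotomous key and return the suggested test name.

    n_treatments is the number of treatments per independent variable.
    All argument names are illustrative assumptions, not terms from the key.
    """
    if parametric:                                   # couplet 1, first lead -> 2
        if n_iv == 1 and n_treatments == 2 and n_dv == 1:
            return "t-test or single-classification ANOVA"      # couplet 2
        if n_iv == 1:                                # couplet 3 -> 4
            if n_dv == 1 and nested:
                return "Nested ANOVA"                # couplet 4
            if n_dv == 1:
                return "Single Classification ANOVA" # couplet 5
            return "MANOVA"                          # couplet 5, 2 or more DV
        if n_iv == 2:                                # couplet 6 -> 7
            if n_treatments == 1:
                return "Linear Regression"           # couplet 7
            return "Two-Way ANOVA"
        return "Multivariate analysis (not covered in this key)"
    # couplet 1, second lead -> 8 (non-parametric tests)
    if n_iv >= 2:
        return "Friedman's Method for Randomized Blocks"
    if n_treatments > 2:                             # couplet 9
        return "Kruskal Wallis Test or Chi-squared Goodness of Fit Test"
    if dv_type == "categorical":                     # couplet 10
        return "Chi-squared Test"
    if paired:                                       # couplet 11
        return "Wilcoxon's Signed-Rank Test"
    return "Mann-Whitney U Test or Kolmogorov-Smirnov 2-Sample Test"


# Example: normality not tested, one IV with two treatments, paired replicates
print(choose_test(False, n_iv=1, n_treatments=2, paired=True))
```

A function like this is only a bookkeeping aid; deciding whether the normality assumptions hold still has to be done before calling it.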



Glossary


ANOVA - An acronym for a general category of tests called Analyses of Variance, using summed squares of deviation from the mean value. The broad category of ANOVA tests has many types of subtests, including Single Classification ANOVA, MANOVA, Two-Way ANOVA, Nested ANOVA.

Categorical Variable Format - Also called frequency or discontinuous variable formats. Used when the dependent variable data are collected in such a way as to record the frequency of occurrences in each of 2 or more categories.

Chi-Squared Test - A non-parametric alternative to the t-test using categorical data.

Continuous Variable Format - The most common form of data, encountered when at least theoretically the data could assume an infinite number of values between any two fixed points, such as measuring the length, area, volume, weight, angle, temperature, periods of time, percentages, or rates.

Dependent Variable - The variable that is measured for its response to the treatment conditions being controlled or compared by the experimenter in the form of the independent variable.

Friedman's Method for Randomized Blocks - A non-parametric alternative to the Two-Way ANOVA that uses ranked data, but ranked within each block. A block is determined by the conditions specified by the two independent variables. Data are ranked and differences in the resultant scores between groups are analyzed for statistical significance.

Homoscedasticity - Homogeneity of variances (or the attribute of not being statistically different) between the different experimental groups (or treatments) being compared. This can be determined using Bartlett's Test for Homogeneity of Variances if there are more than two treatment groups. This is one of the three assumptions of normality along with random sampling and a normal distribution.

Independent Variable - The variable that the experimenter manipulates. This can be an actual manipulation or the basis of the data collection site when an experimenter is comparing different types of habitats. The dependent variable is analyzed for whether it changes in response to the independent variable.


Kolmogorov-Smirnov 2-Sample Test - A non-parametric alternative to the t-test, used when there are only two treatment groups being compared. This test differs from the Mann-Whitney U Test in that it compares the distributions of the data rather than simply ranking the observations. It is also less sensitive to how the data are ordered than the Mann-Whitney U Test.

Kruskal-Wallis Test - A non-parametric alternative to the Single Classification ANOVA, used when there is a single independent variable with several treatments. Data are ranked and differences in the resultant scores between groups are analyzed for statistical significance.

Linear Regression - A parametric statistical test that requires that the normality assumptions are met. This technique is used when the experimenter is interested in knowing what the relationship is between two independent variables and how they may collectively and individually determine the dependent variable.

Mann-Whitney U Test - A non-parametric alternative to the t-test, used when there are only two treatment groups being compared. This test differs from the Kolmogorov-Smirnov 2-Sample Test in that it uses ranked data rather than the distributions of the data. It is also more sensitive to how the data are ordered than the Kolmogorov-Smirnov 2-Sample Test. Data are ranked and differences in the resultant scores between groups are analyzed for statistical significance.
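
To make the ranking step concrete, here is a small sketch of how the U statistic can be computed from the rank sums of the pooled observations. The function names and the measurements are made up for illustration, and the sketch stops at the statistic itself; judging significance still requires a table of critical values or a statistics package.

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of observations tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1          # mean of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def mann_whitney_u(a, b):
    """Return (U_a, U_b) computed from the rank sum of group a."""
    ranks = average_ranks(list(a) + list(b))
    rank_sum_a = sum(ranks[:len(a)])
    u_a = rank_sum_a - len(a) * (len(a) + 1) / 2
    u_b = len(a) * len(b) - u_a           # the two U values sum to n_a * n_b
    return u_a, u_b


# Hypothetical measurements from two unpaired treatment groups
group_a = [1.2, 3.4, 2.2, 5.1]
group_b = [4.0, 6.3, 5.8, 7.7]
print(mann_whitney_u(group_a, group_b))   # (1.0, 15.0)
```

The smaller of the two U values is the one usually compared against the critical value.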

MANOVA or Multivariate ANOVA - A parametric statistical test that requires that the normality assumptions are met. This ANOVA variant is used when there are multiple dependent variables, but only a single independent variable. Therefore, it is most useful when a researcher has collected data on multiple dependent variables and is interested in knowing how strongly these data are determined by the single independent variable.

Multivariate Statistical Analyses - A set of tests that are used when there are 3 or more independent variables with two or more treatments per variable. This is a more sophisticated suite of analyses that includes Multiple Regression, Principal Components Analysis, Factor Analysis, Discriminant Factor Analysis, Cluster Analysis, and Canonical Correlation Analysis. These are not included in this decision key because they are outside the scope of the SEE-U program.

Nested ANOVA - A parametric statistical test that requires that the normality assumptions are met. This ANOVA variant is used when there is only a single independent variable with multiple treatments and when the experimenter is interested in knowing whether a procedural feature may have influenced the results. Examples of these features include having multiple people collect the same data, differences in experimental growth media, and using different pieces of equipment to collect the same data.

Non-Parametric Statistics - A general class of statistics that do not require the normality assumptions to be met. In general, these statistical tests are less powerful than the parametric statistics. As a consequence, they often fail to detect statistically significant differences that are actually present.

Normal Distribution - A probability distribution that has a symmetrical bell-shaped curve. The data are evenly distributed around the mean trait value. This is one of the three assumptions of normality along with random sampling and homoscedasticity.

Normality Assumptions - The assumptions that must be met to perform a parametric statistical test. These include homoscedasticity, normal distribution, and random sampling.

Ordinal Variable Format - Also called ranked variables. Used when a researcher wishes to rank the intensity of an event, as when heat level is broken into low, medium, high, and very high.

Paired Replicates - A way to collect data such that the same individual is exposed to both of the treatment types. For example, when the same thirty individuals are asked to perform the same tasks under high versus low temperature conditions.

Parametric Statistics - A general class of statistics that require the normality assumptions to be met. In general, these statistical tests are more powerful than the non-parametric statistics. As a consequence, they are more likely to detect statistically significant differences when they are actually present.

Random Sampling - Occurs when the areas or individuals to be used for each of the experimental treatment conditions are randomly assigned. This can be most easily accomplished using a random numbers table or random number generator for each replicate. This is one of the three assumptions of normality along with homoscedasticity and a normal distribution.

Single-Classification ANOVA - A parametric statistical test that requires that the normality assumptions are met. This ANOVA variant is used to analyze a single dependent variable when there is a single independent variable with 3 or more treatment types.

Treatment - Also called sample, value, or condition. This is the way in which the experimenter has set up the experiment by using a variety of forms of the independent variable. For example, if a researcher is interested in knowing the impact of fragmentation on abundance of a single species, she may choose three levels of fragmentation: high, medium, and low. This independent variable (level of fragmentation) would have three treatments (high, medium, and low).

t-test - A parametric statistical test that requires that the normality assumptions are met. This is the simplest ANOVA variant, and it is used when there is only a single independent variable with two treatment types and only a single dependent variable. This is a special case of the Single-Classification ANOVA.
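
The "special case" relationship can be checked numerically: with exactly two treatments, the square of the pooled-variance t statistic equals the F statistic of a single-classification ANOVA on the same data. The sketch below uses only the Python standard library; the function names and the measurements are invented for illustration.

```python
from statistics import mean


def pooled_t(a, b):
    """Student's two-sample t statistic using the pooled variance."""
    na, nb = len(a), len(b)
    ma, mb = mean(a), mean(b)
    ss_a = sum((x - ma) ** 2 for x in a)
    ss_b = sum((x - mb) ** 2 for x in b)
    pooled_var = (ss_a + ss_b) / (na + nb - 2)
    return (ma - mb) / (pooled_var * (1 / na + 1 / nb)) ** 0.5


def anova_f(groups):
    """F statistic of a single-classification (one-way) ANOVA."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    k, n = len(groups), len(all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Hypothetical measurements for two treatments of one independent variable
low = [4.1, 5.0, 4.7, 5.3, 4.4]
high = [5.9, 6.4, 5.6, 6.8, 6.1]

t = pooled_t(low, high)
f = anova_f([low, high])
print(round(t ** 2, 6), round(f, 6))  # the two values agree
```

This is why the key lists either test for the two-treatment case: both lead to the same conclusion.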

Two-Way ANOVA - A parametric statistical test that requires that the normality assumptions are met. This ANOVA variant is used when there are two independent variables, each with 2 or more treatment types, but only a single dependent variable. This test looks for interactions between the two independent variables, as well as how much of an influence each of the independent variables has had on its own.

Wilcoxon's Signed-Rank Test - A non-parametric alternative to the paired t-test, used when the data are paired between treatment types. Like the Mann-Whitney U Test, this test uses only ranked data and not the raw data as in the Kolmogorov-Smirnov 2-Sample Test. Data are ranked and differences in the resultant scores between groups are analyzed for statistical significance.
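
As a rough illustration of how the paired ranking works, the sketch below computes the two signed-rank sums for a set of paired replicates. The function name and the scores are hypothetical, zero differences are dropped (one common convention), and only the statistic is computed, not its significance.

```python
def wilcoxon_signed_rank(first, second):
    """Return (W_plus, W_minus) for paired samples, dropping zero differences."""
    diffs = [y - x for x, y in zip(first, second) if y != x]
    abs_sorted = sorted(abs(d) for d in diffs)

    def rank_of(value):
        # Mean of the ranks occupied by this absolute difference (handles ties)
        lowest = abs_sorted.index(value) + 1
        highest = lowest + abs_sorted.count(value) - 1
        return (lowest + highest) / 2

    w_plus = sum(rank_of(abs(d)) for d in diffs if d > 0)
    w_minus = sum(rank_of(abs(d)) for d in diffs if d < 0)
    return w_plus, w_minus


# Hypothetical scores for the same six individuals under two conditions
low_temp = [12, 15, 9, 14, 11, 13]
high_temp = [14, 15, 8, 19, 14, 17]
print(wilcoxon_signed_rank(low_temp, high_temp))  # (14.0, 1.0)
```

The smaller of the two sums is the value usually looked up in a table of critical values.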





Online Resources

Hyperstat: An Online Statistical Textbook is available from David M. Lane of Rice University. This is an excellent textbook, as well as a centralized clearinghouse for a great diversity of information, websites, online statistics courses, and textbook review. It should be your first stop if you are interested in learning more about statistics.

Statistics At Square One is available from T. D. V. Swinscow of the University of Southampton. This online textbook is very useful in helping you to develop an understanding of the bases of statistics, including probability, error, and distributions, as well as describing the simpler statistical procedures.

StatSoft, The Electronic Textbook is available from StatSoft Inc., the makers of the Statistica software package. This is an excellent introductory to advanced textbook that briefly describes the basics of statistics and some simple tests, but the strength here is the thorough descriptions of multivariate statistical tests.

StatSoft, The Electronic Textbook also has an excellent and thorough statistical glossary.

The Comparison of Statistical Methods Chart is useful for comparing the efficacy of different statistical analytical methods and is available from the US Environmental Protection Agency.

Statistics.com is a useful clearinghouse with links to a great diversity of courses (online and in-person), texts, software downloads, jobs for statisticians, and tutorials for understanding the field.




Resource Page Copyright © 2002 by J. Danoff-Burg.
All Rights Reserved.