NUMERICAL ANALYSIS OF BIOLOGICAL AND ENVIRONMENTAL DATA

Lecture 11. Hypothesis Testing

HYPOTHESIS TESTING
• Randomisation tests
  • Simple introductory example
• Monte Carlo permutation tests
  • Types of permutation tests
• Palaeoecological stratigraphical data
  • Time-duration tests
  • Impacts of volcanic ash on terrestrial and aquatic systems
  • Impacts of land-use on lake-water acidity
  • Impacts of Picea abies (spruce) on lake-water acidity
• Ecological data
  • RDA as a tool for reduced-rank regression
  • Impact of seasonal sheep grazing on grasslands
  • Weeds in Sweden
  • Short-term vegetational change in fen meadow
  • Field eco-toxicology experiments
  • Other numerical tools
• Spatial biogeographical data
  • Mantel tests
  • Partial Mantel test
• Spatial and temporal data
• Importance of permutation tests

RANDOMISATION TESTS: SIMPLE INTRODUCTORY EXAMPLE
Mandible lengths of male and female jackals in the Natural History Museum.
Is there any evidence of a difference in mean lengths between the two sexes? The male mean is larger than the female mean.
Null hypothesis (H0): no difference in mean lengths between the two sexes; any observed difference is purely due to chance.
If H0 is consistent with the data, there is no reason to reject it in favour of the alternative hypothesis that males have a larger mean than females.

Classical hypothesis testing: t-test for comparison of 2 means
Assume that values for group 1 are a random sample from a normal distribution with mean $\mu_1$ and standard deviation $\sigma$, and that values for group 2 are a random sample from a normal distribution with mean $\mu_2$ and standard deviation $\sigma$.
$H_0: \mu_1 = \mu_2$
$H_1: \mu_1 > \mu_2$

Test the null hypothesis with an estimate of the common within-group s.d.
• $S = \sqrt{\{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2\} / (n_1 + n_2 - 2)}$
• $T = (\bar{x}_1 - \bar{x}_2) / \left(S \sqrt{1/n_1 + 1/n_2}\right)$
• If H0 is true, T will be a random value from the t-distribution with $n_1 + n_2 - 2$ d.f.
Jackal data
• $\bar{x}_1$ = 113.4 mm, $s_1$ = 3.72 mm
• $\bar{x}_2$ = 108.6 mm, $s_2$ = 2.27 mm
• $s$ = 3.08 mm
• T = 3.484 with 18 d.f.
The probability of a value this large is 0.0013 if the null hypothesis is true. The sample result is nearly significant at the 0.1% level. Strong evidence against the null hypothesis; support for the alternative hypothesis.
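The pooled standard deviation and T can be checked directly from the summary statistics quoted above. The following is a minimal Python sketch; the function name `pooled_t` is hypothetical, and the group sizes of 10 are taken from the later slide that allocates 20 lengths into two groups of 10. Because the inputs are rounded summary values, the result may differ from the slide in the last decimal place.

```python
import math

def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Return (pooled SD, T statistic, degrees of freedom) for a two-sample t-test."""
    # Pooled within-group variance, weighting each group by its degrees of freedom
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2)
    s = math.sqrt(pooled_var)
    # Difference in means divided by its standard error
    t = (mean1 - mean2) / (s * math.sqrt(1 / n1 + 1 / n2))
    return s, t, n1 + n2 - 2

# Summary statistics from the jackal example (males first, females second)
s, t, df = pooled_t(113.4, 3.72, 10, 108.6, 2.27, 10)
print(f"S = {s:.2f} mm, T = {t:.3f}, d.f. = {df}")
```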

ASSUMPTIONS OF T-TEST • Random sampling of individuals from the populations of interest • Equal population standard deviations for males and females • Normal distributions within groups

ALTERNATIVE APPROACH
• If there is no difference between the two sexes, then the length distribution in the two groups will just be a typical result of allocating 20 lengths at random into 2 groups, each of size 10. Compare the observed difference with the distribution of differences found with random allocation.
TEST:
• Find the mean scores for males and females and the difference D0 for the observed data.
• Randomly allocate 10 lengths to the male group and the remaining 10 to the female group. Calculate D1.
• Repeat many times (e.g. n = 999) to find an empirical distribution of D that occurs by random allocation: the RANDOMISATION DISTRIBUTION.
• If D0 looks like a 'typical' value from this randomisation distribution, conclude that the allocation of lengths to males and females is essentially random and thus there is no difference in length values. If D0 is unusually large, say in the top 5% tail of the randomisation distribution, the observed data are unlikely to have arisen if the null hypothesis is true. Conclude that the alternative model is more plausible.
• If D0 is in the top 1% tail, significant at the 1% level.
• If D0 is in the top 0.1% tail, significant at the 0.1% level.
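The test steps above can be sketched in a few lines of Python. This is an illustrative sketch only: the function name `randomisation_test` is hypothetical, and the raw measurements below are illustrative values chosen so that the group means match the 113.4 mm and 108.6 mm quoted in the lecture (the slides do not list the raw data). The observed D0 is counted as one member of the randomisation distribution.

```python
import random

def randomisation_test(group1, group2, n_random=999, seed=1):
    """One-sided randomisation test for mean(group1) > mean(group2)."""
    rng = random.Random(seed)
    n1 = len(group1)
    pooled = list(group1) + list(group2)
    n2 = len(pooled) - n1
    d0 = sum(group1) / n1 - sum(group2) / n2   # observed difference D0
    count = 1                                  # D0 itself is part of the distribution
    for _ in range(n_random):
        rng.shuffle(pooled)                    # random allocation to the two groups
        d = sum(pooled[:n1]) / n1 - sum(pooled[n1:]) / n2
        if d >= d0 - 1e-12:                    # tolerance guards against float ties
            count += 1
    return d0, count / (n_random + 1)

# Illustrative mandible lengths (mm); assumed values, not taken from the slides
males = [120, 107, 110, 116, 114, 111, 113, 117, 114, 112]
females = [110, 111, 107, 108, 110, 105, 107, 106, 111, 111]

d0, p = randomisation_test(males, females)
print(f"D0 = {d0:.1f} mm, one-sided p = {p:.3f}")
```

With 999 random allocations the smallest attainable significance level is 1/1000, so more allocations are needed to resolve very small p-values.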

The distribution of the differences observed between the mean for males and the mean for females when 20 measurements of mandible lengths are randomly allocated, 10 to each sex. 4999 randomisations.

$\bar{x}_1$ = 113.4 mm, $\bar{x}_2$ = 108.6 mm, $D_0$ = 4.8 mm
Only nine differences were 4.8 or more, including D0: six were equal to 4.8 and two were greater than 4.8.
Significance level = 9/5000 = 0.0018 = 0.18% (cf. t-test 0.0013 = 0.13%)
20C10 = 184,756, so 5000 is only 2.7% of all possibilities.
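The arithmetic on this slide can be reproduced as follows (a short Python check; the variable names are illustrative).

```python
import math

n_total = 5000      # 4999 random allocations plus the observed allocation
n_extreme = 9       # differences of 4.8 mm or more, including D0

p = n_extreme / n_total                  # Monte Carlo significance level
n_possible = math.comb(20, 10)           # ways of choosing 10 of the 20 lengths
fraction_sampled = n_total / n_possible  # share of all allocations examined

print(f"p = {p:.4f}")
print(f"possible allocations = {n_possible}")
print(f"fraction sampled = {fraction_sampled:.1%}")
```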

THREE MAIN ADVANTAGES
• Valid even without random samples.
• Easy to take account of particular features of data.
• Can use 'non-standard' test statistics.
Randomisation tests tell us if a certain pattern could or could not have arisen by chance. They are completely specific to the data set.

RANDOMISATION TESTS AND MONTE CARLO PERMUTATION TESTS
If all data arrangements are equally likely: RANDOMISATION TEST, with random sampling of the randomisation distribution. Otherwise: MONTE CARLO PERMUTATION TEST.
Validity depends on the validity of the permutation type for the particular data-type: time-series stratigraphical data, spatial grids, repeated measurements (BACI). All require particular types of permutations.

USEFUL REVIEWS
PALAEOECOLOGY
• Birks (1993) INQUA Newsletter on Data-Handling Methods 14, 2-8.
ECOLOGY
• Crowley (1992) Ann. Review Ecology & Systematics 23, 405-447.
• Strong (1980) Synthese 43, 271-285.
• Harvey et al. (1983) Ann. Rev. Ecol. System. 14, 189-211.

MONTE CARLO PERMUTATION TESTS
ROUND LOCH OF GLENHEAD
The pH change 1874-1931 (17.3-7.3 cm) is very marked. Is it any different from other pH fluctuations over the last 10,000 years?
Null hypothesis: no different from rates of pH change in pre-acidification times.
Randomly resample with replacement 1,000 times to create temporally ordered data of the same thickness as the interval of interest: a time-duration or elapsed-time test.
As the time series contains unequal depth intervals between pH estimates, it is not possible for each bootstrapped time series to contain exactly 10 cm. Instead, samples are added to the time series until the depth interval equals or exceeds 10 cm.

[Figure axis: rate (pH change per cm).]

STATISTICAL METHODS FOR TESTING COMPETING CAUSAL HYPOTHESES
• Basic statistical model (also with covariables).
• Statistical testing by Monte Carlo permutation tests to derive empirical statistical distributions.
• Variance partitioning or decomposition to evaluate different hypotheses.

BASIS OF MONTE CARLO PERMUTATION TESTS IN CANOCO
• Null hypothesis: species are unrelated to environmental data.
• Alternative hypothesis: species are related to environmental data.
STATISTICAL TEST
• Select an appropriate test statistic to express how strongly the species data respond to the environment (e.g. r, t-ratio, F-ratio).
• Calculate the test statistic for the data, S0.
• Determine a reference distribution for the test statistic under the null hypothesis. The reference distribution shows the values to be expected under the null hypothesis that species are unrelated to environment.
• Calculate the significance level, i.e. the probability that S0 or larger values occur in the reference distribution.
The crux of all standard tests is that the reference distribution can be derived mathematically from the assumptions of the test, e.g. the F-ratio in regression or ANOVA follows the F-distribution, hence tables of the F-distribution.

MONTE CARLO PERMUTATION TEST
• The reference distribution is determined from the data themselves, without the assumption of normality and without mathematical derivations. Its basis lies in the observation that under the null hypothesis the samples in the species data can be randomly linked with the samples in the environmental data.
• Under the null hypothesis, each permutation of the samples is equally likely. Each permutation leads to a new data set from which the test statistic can be calculated. The reference distribution is therefore the distribution of the test statistic in the permuted data sets.
STEPS
1. Choose a test statistic that expresses how strongly the species data respond to the environment. In CANOCO there are two test statistics that both have the form of an F-ratio.
2. Calculate the test statistic for the data, F0.
3. Generate K new data sets that are equally likely under the null hypothesis. In CANOCO this is done by randomly permuting the samples in the species data (response data) while keeping the environmental data (and covariable data) fixed.

4. Calculate the test statistic for each new data set.
5. Calculate the Monte Carlo significance level.
Cannot generate all possible permutations, so should do a reasonable number. If K = 10,000, there is little random variation, but this is not strictly necessary and takes computer time. For a 5% significance level, a good compromise is at least 199 permutations for the test.
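The five steps can be illustrated with a deliberately simple case: one response variable, one environmental variable, and the ordinary regression F-ratio as test statistic. This is a sketch under those assumptions, not CANOCO's multivariate implementation; the function names and the data are hypothetical. The significance level is computed as (number of permuted F-ratios greater than or equal to F0, plus 1) / (K + 1), so that the observed value counts as one member of the reference distribution.

```python
import random

def f_ratio(x, y):
    """F-ratio of the simple linear regression of y on x (1 and n - 2 d.f.)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    syy = sum((b - my) ** 2 for b in y)
    ss_reg = sxy ** 2 / sxx          # sum of squares explained by x
    rss = syy - ss_reg               # residual sum of squares
    return ss_reg / (rss / (n - 2))

def permutation_test(x, y, k=199, seed=0):
    """Monte Carlo permutation test: permute the response, keep x fixed."""
    rng = random.Random(seed)
    f0 = f_ratio(x, y)               # step 2: statistic for the observed data
    y_perm = list(y)
    n_ge = 0
    for _ in range(k):               # steps 3 and 4: K permuted data sets
        rng.shuffle(y_perm)
        if f_ratio(x, y_perm) >= f0:
            n_ge += 1
    return f0, (n_ge + 1) / (k + 1)  # step 5: Monte Carlo significance level

# Hypothetical data: a response that increases along an environmental gradient
env = list(range(1, 13))
resp = [2.1, 2.9, 3.2, 4.8, 4.9, 6.5, 6.8, 8.1, 9.4, 9.6, 11.2, 11.9]

f0, p = permutation_test(env, resp)
print(f"F0 = {f0:.1f}, p = {p:.3f}")
```

With K = 199 the smallest attainable significance level is 1/200 = 0.005, which is why 199 permutations are adequate for a test at the 5% level but not for resolving much smaller p-values.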

TYPES OF PERMUTATION TESTS AVAILABLE IN CANOCO 3 & 4
• Validity of a permutation test depends on the type of permutation being appropriate for the particular research design.
• Completely randomised designed experiments: completely random permutation is appropriate. Unrestricted permutation yields completely random permutations.
• Data from a time series, line transect, or rectangular spatial grid: restricted permutation.
  • For 'linear' data, bend the series into a circle so that start and end meet. Randomly match X and Y.
  • For a grid, wrap the data around a torus.
• Randomised block design: permutation must be conditioned on blocks.
  • e.g. with farm types as covariables, permutation within blocks guarantees that permutations are within farms.
• Repeated measurement design: each unit must have been recorded the same number of times.
  • Data for consecutive samples in time must be given consecutive numbers in the input files. Use covariables to define blocks.
• BACI: before-after-control-impact studies.
  • abundance = plot effect + time + lime effect + error
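Two of the restricted schemes above are easy to express as permutations of sample indices. The following Python sketch is illustrative only and is not CANOCO's implementation; the function names are hypothetical. A cyclic shift corresponds to bending a time series or transect into a circle and rotating it, and the block scheme shuffles samples only within their own block.

```python
import random

def cyclic_shift(indices, rng):
    """Time series or line transect: join the ends into a circle and rotate
    by a random offset, so the order of neighbours is preserved."""
    k = rng.randrange(len(indices))
    return indices[k:] + indices[:k]

def permute_within_blocks(blocks, rng):
    """Randomised block design: return a sample order in which samples are
    exchanged only with other samples from the same block."""
    order = list(range(len(blocks)))
    members_by_block = {}
    for i, b in enumerate(blocks):
        members_by_block.setdefault(b, []).append(i)
    for members in members_by_block.values():
        shuffled = members[:]
        rng.shuffle(shuffled)
        for position, source in zip(members, shuffled):
            order[position] = source
    return order

rng = random.Random(0)
print(cyclic_shift(list(range(8)), rng))                    # rotated series
farms = ["A", "A", "A", "B", "B", "B", "C", "C", "C"]       # hypothetical blocks
print(permute_within_blocks(farms, rng))                    # shuffled within farms
```

A toroidal shift for a spatial grid follows the same idea, applying an independent cyclic shift to rows and to columns.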

Fig. 1. Ecological implications of the different randomization procedures for the correlation between two spatially auto-correlated variables. Here each variable sampled over a two-dimensional area is represented by a layer. a) Randomization that implies no spatial structure among variables or species; b) Restricted randomization within region, which implies some degree of spatial structure at the regional scale but no spatial structure within regions; c) Restricted randomization keeping the sequence of the data fixed by doing a toroidal shift (i.e. the spatial pattern at the variable or species is preserved); and d) Restricted randomization based on the degree of spatial autocorrelation of the observed data.

ADDITIONAL PERMUTATION TESTS AVAILABLE IN CANOCO 4
Split-Plot Design
• Hierarchical design with two levels of units: WHOLE PLOTS containing SPLIT PLOTS, e.g. samples within different mountains, plots within stands, plots along transects, quadrats within time series (permanent plots).
• Can permute WHOLE PLOTS or SPLIT PLOTS or both. Whole plots should be of equal size.
• The effect of environmental variables that vary within WHOLE PLOTS can be tested by permuting split plots completely at random within whole plots, without permuting whole plots.
• Whole plots restrict the permutations in the same way as blocks but without the need to define block-defining covariables.
• Within WHOLE PLOTS or SPLIT PLOTS, the units can then be a time series or line transect, a spatial grid, or freely exchangeable.

WHAT IS SHUFFLED IN CANOCO PERMUTATION TESTS?
1. If there are no covariables in the analysis (or covariables are used only to define blocks), it does not matter whether the samples in the species data or in the environmental data are permuted. Wide choice of possible test statistics.
   • The null hypothesis of the test is the OVERALL NULL MODEL: species are unrelated to the environmental data (within blocks, if defined). A simple hypothesis test, comparable to the overall F-test in regression analysis.
2. If covariables are present, the question is whether one variable has an effect on the species after taking into account the effect of another variable.
   • e.g. does nutrient pollution affect species composition after taking into account the natural variation in salinity of the water?
   • In regression analysis, such questions are addressed by a t-test or, if the effect of more than one variable is of interest, a partial F-test: PARTIAL or CONDITIONAL tests.
   • Need a multivariate form of such a test that does not assume multivariate normality.

$Y = ZB + XC + E$
where Y = species data, Z = covariables, X = environmental variables, B and C = regression coefficients, E = random errors.
• Effect of the environmental variables X on the species data Y in the presence of the covariables Z.
• Test the null hypothesis that all elements of C = 0 when the elements of B are unknown.
To test H0: C = 0 (i.e. the effect of X), proposed solutions:
1. Permute the rows of the species data Y.
2. Permute the rows of the environmental data X (CANOCO 2).
3. Permute the residuals Er of the regression of Y on Z: REDUCED MODEL or NULL MODEL (CANOCO 3, 4).
4. Permute the residuals Ef of the regression of Y on Z and X: FULL MODEL (CANOCO 3, 4).

Evaluation of proposals
1. Attractive, simple, keeps close to the study design. However, Y values obtained at different values of Z are not exchangeable under the null hypothesis if Z has an effect (B ≠ 0). Wrong type I error, low power if Z is important.
2. When permuting rows of X, the correlations between X and Z change, so the correlation structure is lost. Type I error is increased. No logical basis for testing any interaction effects.
3 & 4. Explicitly use the regression model, because residuals cannot be calculated without a model: MODEL-BASED PERMUTATIONS.
   • Both give correct type I errors if the F-ratio is used as test statistic and the number of degrees of freedom is large (n – p – q > 10).
(In CANOCO, when one talks of samples in the species data being permuted, one strictly means that the samples of the RESIDUALISED SPECIES DATA are permuted. The residualisation is with respect to Z in the reduced model (proposal 3) and with respect to X and Z in the full model (proposal 4).)

Model-based permutation
Permutation test of the effect of X, adjusted for the possible effect of Z.
1. Choose the test statistic: the F-ratio of the partial F-test of the null hypothesis C = 0.
   • Regress the species data Y on the covariable data Z, then add the environmental variables X to the regression, giving residual sums of squares $RSS_z$ and $RSS_{z+x}$, respectively.
   • Calculate the F-ratio for testing C = 0. Its numerator is based on the sum of all canonical eigenvalues and its denominator on the residual sum of squares with covariables and environmental variables fitted.
2. Calculate the test statistic for the data, F0.
3. Generate K new data sets that are equally likely under the null hypothesis, in two substeps:
   • Regress Y on Z, yielding fitted values Ŷ and residuals E, with E = Y – Ŷ.
   • Permute the rows of E to yield E* and calculate new data Y* = Ŷ + E*.

4. Calculate the test statistic for each new data set Y*, as in step 1 with Y* replacing Y, giving F-ratios F1, F2, ..., FK.
5. Calculate the Monte Carlo significance level: place F0 among F1, F2, ..., FK and determine the proportion of values greater than or equal to F0.
This is the procedure for the "Test of significance for all canonical axes" or "Test based on the trace statistic".
ALSO: a test statistic based on the first canonical eigenvalue (axis one), $\lambda_1$:
$F = \lambda_1 / (RSS_{z+1} / (n - p - q))$
where $RSS_{z+1}$ is the residual sum of squares of the model with the covariables fitted and the first ordination axis of the environmental data (rank 1 restriction on the matrix of regression coefficients C).
Maximum power against the alternative hypothesis (H1) that there is a single dominating gradient that determines the relation between species and environment.
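The reduced-model procedure (regress Y on Z, permute the residuals, rebuild Y* = Ŷ + E*, recompute the partial F-ratio) can be sketched for the simplest case of one response, one covariable z, and one environmental variable x. This Python sketch is illustrative and is not CANOCO's code: the function names and data are hypothetical, and the residual degrees of freedom are taken as n - 3 (intercept, z, x), which is an assumption standing in for the n - p - q of the lecture. The residual sum of squares after adding x is obtained by regressing the z-adjusted response on the z-adjusted x.

```python
import random

def fit_simple(y, z):
    """Fitted values and residuals of the regression of y on z with intercept."""
    n = len(y)
    mz, my = sum(z) / n, sum(y) / n
    slope = sum((a - mz) * (b - my) for a, b in zip(z, y)) / sum((a - mz) ** 2 for a in z)
    fitted = [my + slope * (a - mz) for a in z]
    return fitted, [b - f for b, f in zip(y, fitted)]

def partial_f(y, x, z):
    """Partial F-ratio for x after z has been fitted."""
    n = len(y)
    _, ey = fit_simple(y, z)                 # residuals of y on z
    _, ex = fit_simple(x, z)                 # residuals of x on z
    rss_z = sum(e * e for e in ey)
    rss_zx = rss_z - sum(a * b for a, b in zip(ex, ey)) ** 2 / sum(a * a for a in ex)
    return (rss_z - rss_zx) / (rss_zx / (n - 3))

def reduced_model_test(y, x, z, k=199, seed=0):
    """Permute residuals of the regression of y on z (reduced model)."""
    rng = random.Random(seed)
    f0 = partial_f(y, x, z)
    fitted, e_star = fit_simple(y, z)
    n_ge = 0
    for _ in range(k):
        rng.shuffle(e_star)                               # permute residuals E
        y_star = [f + e for f, e in zip(fitted, e_star)]  # Y* = fitted + E*
        if partial_f(y_star, x, z) >= f0:
            n_ge += 1
    return f0, (n_ge + 1) / (k + 1)

# Hypothetical data: covariable z, environmental variable x, response y
z = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
x = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8]
y = [3.0, 1.6, 5.0, 2.7, 6.7, 9.9, 5.2, 8.6, 8.8, 7.3, 9.7, 12.1]

f0, p = reduced_model_test(y, x, z)
print(f"partial F0 = {f0:.1f}, p = {p:.3f}")
```

A full-model version would instead permute the residuals of the regression of y on both z and x.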

TESTS OF STATISTICAL SIGNIFICANCE IN CANONICAL ANALYSIS Comparison of the methods of permutation of raw data or residuals in terms of the permuted fractions of variation, in the presence or absence of a matrix of covariables W. Fractions of variation: (a) is the variation of matrix Y explained by X alone; (c) the variation explained by W alone; (b) the variation explained jointly by X and W, and (d) the residual variation.

Permutation of raw data may result in unstable (often inflated) type I error when the covariable contains outliers. This does not occur, however, when using restricted permutations of raw data within groups of a qualitative covariable, which gives an exact test.
Not enough is known yet to decide whether to use the reduced or the full model. This needs experiments with simulated data, assessment of type I and type II errors, precision of p-values, etc.

WHAT IS PERMUTED IN CANOCO – A SUMMARY If no covariables or conditioning variables, biological data or environmental data can be permuted. In partial models with conditioning variables, cannot permute biological data as they are dependent on the conditional variables and cannot permute the constraining environmental data as they correlate with the conditional variables. Residuals are exchangeable if they are independent and identically distributed. Reduced model permutes residuals after considering the conditional variables. Full model permutes residuals after considering the conditional variables and the constraining variables.

DISTANCE-BASED REDUNDANCY ANALYSIS
Legendre & Anderson (1999) Ecol. Monogr.
Extends RDA and its associated battery of permutation tests to use any ecologically reasonable similarity or dissimilarity coefficient.
• Calculate the dissimilarity coefficient (DC) between samples.
• Transform to principal co-ordinates, including a correction for negative eigenvalues, to give Y.
• Create a matrix of dummy variables (or environmental variables) X (and Z).
• Do an RDA using Y and X (and Z).
• Use permutation tests appropriate for the particular model.

REGRESSION AND ANALYSIS OF VARIANCE USING PERMUTATION TESTS
Ideal for data with non-normal error structure.
1. Multiple linear regression (ordinary or forced through the origin) with permutation tests (permutation of values y; permutation of residuals of the full regression model).
   MLR www.fas.umontreal.co/biol/legendre
2. Non-parametric multivariate analysis of variance. Use any symmetric distance or dissimilarity measure as the measure of sample differences. Assess statistical significance by permutation tests (restricted permutation of raw data; permutation of residuals of the full model; permutation of residuals of the reduced model).
   Designs: one-way design; two-way nested design; two-way crossed design.
   Anderson, M.J. (2001) Austral. Ecology 26, 32-46
   NPMANOVA and PERMANOVA_2factor www.stat.auckland.ac.nz/~mja

3. Generalised distance-based multivariate analysis for a linear model. Multivariate multiple regression of any symmetric distance matrix for response variables Y (n x m). Predictor variables X (n x q) contain the variables of interest for testing the multivariate null hypothesis of no relationship between Y and X on the basis of the distance matrix chosen. X may contain the codes of an ANOVA model (design matrix) or one or more explanatory variables (e.g. environmental variables). Can include covariables in the analysis.
   Permutations: unrestricted permutation of raw data; residuals under a reduced model; residuals under a full model.
   McArdle, B.H. & Anderson, M.J. (2001) Ecology 82, 290-297
   Anderson, M.J. & Robinson, J. (2001) Australian and New Zealand J. Statistics 43, 75-88
   DISTLM www.stat.auckland.ac.nz/~mja
   DISTLM-forward www.stat.auckland.ac.nz/~mja

WARNINGS
1. 'Non-parametric' does not mean 'no assumptions'.
   Traditional parametric univariate ANOVA assumptions:
   (i) distribution of errors is normal
   (ii) data are independent
   (iii) any treatment effects are additive
   (iv) variances are homogeneous among groups
   Permutation tests allow only assumption (i) to be ignored. Permutation tests still require replicate observations to be independent and identically distributed, and the errors to come from the same distribution (same common variance) and to be independent of one another. Use of permutation tests does not avoid the assumptions of independence or homogeneous variances and, as a linear ANOVA model is being used, the assumption of additivity is still important.

2. The assumption of independence applies to observations, not variables.
3. Significant results can be caused by heterogeneous dispersions or variances.
4. Pair-wise a posteriori tests are not corrected for experiment-wise type I error rates. Thus, if one uses 0.05 for the tests, one can expect a significant result in 1 out of every 20 tests by chance alone. Do obtain the exact Monte Carlo permutation value. The onus is on the user to decide how to interpret the p-value obtained (e.g. use a Bonferroni correction?).
5. Cannot enumerate all possible permutations. Well worth using 999 or even 4999 permutations. Good programs will tell you what the maximum number of possible permutations is for a particular test (data and design matrix).
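As an illustration of warning 4, a Bonferroni correction compares each pair-wise p-value with alpha divided by the number of tests (equivalently, multiplies each p-value by the number of tests). The Python sketch below uses a hypothetical function name and hypothetical p-values; it is one possible way of handling experiment-wise error, not a recommendation made in the lecture.

```python
def bonferroni(p_values, alpha=0.05):
    """Return (raw p, adjusted p, significant?) for each test."""
    m = len(p_values)
    return [(p, min(1.0, p * m), p <= alpha / m) for p in p_values]

# Expected number of false positives among 20 independent tests at alpha = 0.05
print(20 * 0.05)

# Hypothetical Monte Carlo p-values from three pair-wise comparisons
for raw, adjusted, significant in bonferroni([0.001, 0.02, 0.04]):
    print(f"raw p = {raw:.3f}, adjusted p = {adjusted:.3f}, significant = {significant}")
```

Note that an adjusted threshold below 1/(K + 1) can never be reached with K permutations, which is a further reason to use 999 or more permutations when several tests are made.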

TIME-DURATION TEST
Round Loch of Glenhead example
Is the pH change between 1874 and 1931 any different from any other pH fluctuations in the last 10,000 years?
Randomly resample with replacement 1000 times to make temporally ordered data of the same duration or thickness as 1874-1931.
Compare the observed pH change with the pH change in the 1000 bootstraps. How many times is the rate of interest (Obs) equalled or exceeded in the 1000 bootstraps?
$$\text{Exact Monte Carlo probability} = \frac{(\text{number of bootstraps} \ge \text{Obs}) + 1\ (\text{for Obs})}{\text{number of bootstraps} + 1\ (\text{for Obs})}$$
p = 0.021
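The logic of the time-duration test can be sketched as follows. Everything in this Python example is an assumption made for illustration: the function names, the synthetic depth and pH series, the observed rate, and the specific way an interval is drawn (pick a random starting sample and add successive samples until the depth interval equals or exceeds the target thickness). It is not the code or the data used for Round Loch of Glenhead.

```python
import random

def monte_carlo_p(obs, simulated):
    """(number of simulated values >= obs, plus 1) / (number simulated, plus 1)."""
    n_ge = sum(1 for s in simulated if s >= obs)
    return (n_ge + 1) / (len(simulated) + 1)

def random_interval_rate(depths, ph, min_thickness, rng):
    """Draw one temporally ordered interval of at least min_thickness cm and
    return its absolute rate of pH change per cm."""
    while True:
        start = rng.randrange(len(depths) - 1)
        end = start + 1
        # add successive samples until the interval is thick enough
        while end < len(depths) and depths[end] - depths[start] < min_thickness:
            end += 1
        if end < len(depths):        # otherwise ran off the core: draw again
            return abs(ph[end] - ph[start]) / (depths[end] - depths[start])

rng = random.Random(42)

# Synthetic pre-acidification record with unequal depth intervals (hypothetical)
depths, d = [], 20.0
for _ in range(80):
    depths.append(d)
    d += rng.choice([1.0, 2.0, 4.0])
ph = [5.6 + rng.gauss(0, 0.1) for _ in depths]

obs_rate = 0.04                      # hypothetical observed rate (pH units per cm)
rates = [random_interval_rate(depths, ph, 10.0, rng) for _ in range(1000)]
p = monte_carlo_p(obs_rate, rates)
print(f"Monte Carlo p = {p:.3f}")
```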

ASSESSING IMPACTS OF LAACHER SEE VOLCANIC ASH ON TERRESTRIAL AND AQUATIC ECOSYSTEMS

Map showing the location of Laacher See (red star), as well as the location of the sites investigated (blue circle). Numbers indicate the amount of Laacher See Tephra deposition in millimetres (modified from van den Bogaard, 1983).

Loss-on-ignition of cores Hirschenmoor HI-1 and Rotmeer RO-6. The line marks the transition from the Allerød (II) to the Younger Dryas (III) biozone. LST = Laacher See Tephra.

Diatoms in cores HI-1 and RO-6 grouped according to life-forms. LST = Laacher See Tephra.

[Figure labels: Younger Dryas, Allerød.]

Diatom-inferred pH values for cores HI-1 and RO-6. The interpolation is based on distance-weighted least-squares (tension = 0.01). The line marks the transition from the Allerød (II) to the Younger Dryas (III) biozone. LST = Laacher See Tephra.

Data
RESPONSE VARIABLES (% data)
• Terrestrial pollen and spores (9, 31 taxa)
• Aquatic pollen and spores (6, 8 taxa)
• Diatoms (42, 54 taxa)
EXPLANATORY VARIABLES
• Exp $x^{-\alpha t}$, with $\alpha$ = 0.5, x = 100, t = time
• Time
• YD, AL (211 years)
NUMERICAL ANALYSIS
• (Partial) redundancy analysis
• Restricted (stratigraphical) Monte Carlo permutation tests
• Variance partitioning
• Log-ratio centring because of % data
