360 pages

*"During the entire course of my Ph.D. I've been (embarrassingly) looking for a way to teach myself the fundamentals of statistical analysis. At this point in my education, I've come to realize that oftentimes, simply knowing the basics is enough for you to properly apply even the most complex analytical methods. ‘Statistics for Terrified Biologists’ has been just such a book - it was more than worth the $40 I spent on it, and while my 'book clubs' aren't meant to be reviews, I highly recommend the book to anyone who's in a similar predicament to my own."* –Carlo Artieri's Blog Book Club

The typical biology student is “hardwired” to be wary of any task involving mathematics or statistical analysis, but the plain fact is that much of biology requires interpreting experimental data with statistical methods.

This unique textbook aims to demystify statistical formulae for the average biology student. Written in a lively and engaging style, *Statistics for Terrified Biologists* draws on the author’s 30 years of lecturing experience. One of the foremost entomologists of his generation, van Emden has an extensive track record for successfully teaching statistical methods to even the most guarded of biology students.

For the first time, basic methods are presented using straightforward, jargon-free language. Students are taught to use simple formulae accurately to interpret what is being measured with each test and statistic, while at the same time learning to recognize overall patterns and guiding principles. Complemented by simple illustrations and useful case studies, this is an ideal statistics resource for undergraduate biology and environmental science students who lack confidence in their mathematical abilities.


Contents

Preface

1 How to use this book
- Introduction
- The text of the chapters
- What should you do if you run into trouble?
- Elephants
- The numerical examples in the text
- Boxes
- Spare-time activities
- Executive summaries
- Why go to all that bother?
- The bibliography

2 Introduction
- What are statistics?
- Notation
- Notation for calculating the mean

3 Summarizing variation
- Introduction
- Different summaries of variation
- Range
- Total deviation
- Mean deviation
- Variance
- Why n − 1?
- Why the squared deviations?
- The standard deviation
- The next chapter
- Spare-time activities

4 When are sums of squares NOT sums of squares?
- Introduction
- Calculating machines offer a quicker method of calculating sums of squares
- Added squares
- The correction factor
- Avoid being confused by the term “sum of squares”
- Summary of the calculator method of calculating down to standard deviation
- Spare-time activities

5 The normal distribution
- Introduction
- Frequency distributions
- The normal distribution
- What per cent is a standard deviation worth?
- Are the percentages always the same as these?
- Other similar scales in everyday life
- The standard deviation as an estimate of the frequency of a number occurring in a sample
- From per cent to probability
- Executive summary 1 – The standard deviation

6 The relevance of the normal distribution to biological data
- To recap
- Is our observed distribution normal?
- Checking for normality
- What can we do about a distribution that clearly is not normal?
- Transformation
- Grouping samples
- Doing nothing!
- How many samples are needed?
- Factors affecting how many samples we should take
- Calculating how many samples are needed

7 Further calculations from the normal distribution
- Introduction
- Is “A” bigger than “B”?
- The yardstick for deciding
- Derivation of the standard error of a difference between two means
- Step 1 – from variance of single data to variance of means
- Step 2 – from variance of single data to “variance of differences”
- Step 3 – the combination of Steps 1 and 2; the standard error of difference between means (s.e.d.m.)
- Recap of the calculation of s.e.d.m. from the variance calculated from the individual values
- The importance of the standard error of differences between means
- Summary of this chapter
- Executive summary 2 – Standard error of a difference between two means
- Spare-time activities

8 The t-test
- Introduction
- The principle of the t-test
- The t-test in statistical terms
- Why t?
- Tables of the t-distribution
- The standard t-test
- The procedure
- The actual t-test
- t-test for means associated with unequal variances
- The s.e.d.m. when variances are unequal
- A worked example of the t-test for means associated with unequal variances
- The paired t-test
- Pair when possible
- Executive summary 3 – The t-test
- Spare-time activities

9 One tail or two?
- Introduction
- Why is the analysis of variance F-test one-tailed?
- The two-tailed F-test
- How many tails has the t-test?
- The final conclusion on number of tails

10 Analysis of variance – What is it? How does it work?
- Introduction
- Sums of squares in the analysis of variance
- Some “made-up” variation to analyze by Anova
- The sum of squares table
- Using Anova to sort out the variation in Table C
- Phase 1
- Phase 2
- SqADS – an important acronym
- Back to the sum of squares table
- How well does the analysis reflect the input?
- End Phase
- Degrees of freedom in Anova
- The completion of the End Phase
- The variance ratio
- The relationship between “t” and “F”
- Constraints on the analysis of variance
- Adequate size of experiment
- Equality of variance between treatments
- Testing the homogeneity of variance
- The element of chance: randomization
- Comparison between treatment means in the analysis of variance
- The least significant difference
- A caveat about using the LSD
- Executive summary 4 – The principle of the analysis of variance

11 Experimental designs for analysis of variance
- Introduction
- Fully randomized
- Data for analysis of a fully randomized experiment
- Prelims
- Phase 1
- Phase 2
- End Phase
- Randomized blocks
- Data for analysis of a randomized block experiment
- Prelims
- Phase 1
- Phase 2
- End Phase
- Incomplete blocks
- Latin square
- Data for the analysis of a Latin square
- Prelims
- Phase 1
- Phase 2
- End Phase
- Further comments on the Latin square design
- Split plot
- Executive summary 5 – Analysis of a randomized block experiment
- Spare-time activities

12 Introduction to factorial experiments
- What is a factorial experiment?
- Interaction
- If there is no interaction
- What if there is interaction?
- How about a biological example?
- Measuring any interaction between factors is often the main/only purpose of an experiment
- How does a factorial experiment change the form of the analysis of variance?
- Degrees of freedom for interactions
- The similarity between the “residual” in Phase 2 and the “interaction” in Phase 3
- Sums of squares for interactions

13 2-Factor factorial experiments
- Introduction
- An example of a 2-factor experiment
- Analysis of the 2-factor experiment
- Prelims
- Phase 1
- Phase 2
- End Phase (of Phase 2)
- Phase 3
- End Phase (of Phase 3)
- Two important things to remember about factorials before tackling the next chapter
- Analysis of factorial experiments with unequal replication
- Executive summary 6 – Analysis of a 2-factor randomized block experiment
- Spare-time activity

14 Factorial experiments with more than two factors
- Introduction
- Different “orders” of interaction
- Example of a 4-factor experiment
- Prelims
- Phase 1
- Phase 2
- Phase 3
- To the End Phase
- Addendum – Additional working of sums of squares calculations
- Spare-time activity

15 Factorial experiments with split plots
- Introduction
- Deriving the split plot design from the randomized block design
- Degrees of freedom in a split plot analysis
- Main plots
- Subplots
- Numerical example of a split plot experiment and its analysis
- Calculating the sums of squares
- End Phase
- Comparison of split plot and randomized block experiment
- Uses of split plot designs
- Spare-time activity

16 The t-test in the analysis of variance
- Introduction
- Brief recap of relevant earlier sections of this book
- Least significant difference test
- Multiple range tests
- Operating the multiple range test
- Testing differences between means
- Suggested “rules” for testing differences between means
- Presentation of the results of tests of differences between means
- The results of the experiments analyzed by analysis of variance in Chapters 11–15
- Spare-time activities

17 Linear regression and correlation
- Introduction
- Cause and effect
- Other traps waiting for you to fall into
- Extrapolating beyond the range of your data
- Is a straight line appropriate?
- The distribution of variability
- Regression
- Independent and dependent variables
- The regression coefficient (b)
- Calculating the regression coefficient (b)
- The regression equation
- A worked example on some real data
- The data (Box 17.2)
- Calculating the regression coefficient (b) – i.e. the slope of the regression line
- Calculating the intercept (a)
- Drawing the regression line
- Testing the significance of the slope (b) of the regression
- How well do the points fit the line? – the coefficient of determination (r²)
- Correlation
- Derivation of the correlation coefficient (r)
- An example of correlation
- Is there a correlation line?
- Extensions of regression analysis
- Nonlinear regression
- Multiple linear regression
- Multiple nonlinear regression
- Analysis of covariance
- Executive summary 7 – Linear regression
- Spare-time activities

18 Chi-square tests
- Introduction
- When and where not to use χ²
- The problem of low frequencies
- Yates’ correction for continuity
- The χ² test for “goodness of fit”
- The case of more than two classes
- χ² with heterogeneity
- Heterogeneity χ² analysis with “covariance”
- Association (or contingency) χ²
- 2×2 contingency table
- Fisher’s exact test for a 2×2 table
- Larger contingency tables
- Interpretation of contingency tables
- Spare-time activities

19 Nonparametric methods (what are they?)
- Disclaimer
- Introduction
- Advantages and disadvantages of the two approaches
- Where nonparametric methods score
- Where parametric methods score
- Some ways data are organized for nonparametric tests
- The sign test
- The Kruskal–Wallis analysis of ranks
- Kendall’s rank correlation coefficient
- The main nonparametric methods that are available

Appendix 1 How many replicates

Appendix 2 Statistical tables

Appendix 3 Solutions to “Spare-time activities”

Appendix 4 Bibliography

Index