Types of data
Nominal = data labelled according to category, with no order
Ordinal = data labelled according to category with an intrinsic order, but without equal differences between consecutive levels, e.g. ASA grade or pain score (mild, moderate, severe)
Parametric = data labelled according to category, with an intrinsic order and equal distances between consecutive intervals. A.k.a. continuous
• Can be either
o Interval data = equal differences between numbers without a natural zero, e.g. Celsius scale (zero does not mean zero energy)
o Ratio data = equal differences between numbers with a natural zero (i.e. a zero that means complete absence of the thing being measured), e.g. Kelvin scale

Estimation = the use of sample data to estimate population parameters

Central tendency = a single-value representation of a set of data
Mode = most commonly occurring value
Median = middle value in an ordered list of data (where the list contains an even number of observations, the median is the average of the two central observations)
Mean = average value
Parametric data can be represented by mean, median and mode
Non-parametric data can be represented by median and mode only

Measures of Dispersion
Spread of data = distribution
Range = difference between largest and smallest values – limited use
Quartiles = express the distribution in quarters
Inter-quartile range = difference between the 1st and 3rd quartiles (ignores the first and last quarters of the data)
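A quick illustration of these summary measures using Python's standard library (the data values are illustrative only):

```python
# Central tendency and quartiles with the standard library (Python 3.8+)
import statistics

data = [2, 3, 3, 4, 5, 5, 6, 7]          # even number of observations

print(statistics.mean(data))              # 4.375
print(statistics.median(data))            # 4.5 = average of the two central values (4 and 5)
print(statistics.mode(data))              # 3 (first of the tied modes 3 and 5)
q1, q2, q3 = statistics.quantiles(data, n=4)
print(q3 - q1)                            # inter-quartile range
```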
Variance = measures spread using all the data = calculate the difference between each value and the mean, square these differences, then add them up and divide by the number of observations (n - 1 for a sample)
Standard deviation (SD) = the square root of the variance – converts the variance back into the units of the original data

Normal distribution = bell-shaped curve, symmetrical about a central axis (which corresponds to the mean, mode and median). The standard normal curve has a mean of 0 and an SD of 1. Area under the curve = 1.
68% of values lie within +/- 1 SD of the mean
95% of values lie within +/- 2 SD of the mean
99.7% of values lie within +/- 3 SD of the mean

Standard error of the mean (SEM) = quantifies uncertainty in the estimate of the mean
SEM = SD / √n (standard deviation divided by the square root of the sample size)

Skew = values are clustered on one side of the distribution and sparse on the other

Null hypothesis = no change is seen – i.e. the observations are the same
Alternative hypothesis = a change is seen

p-value
• = probability of a result occurring by chance if the null hypothesis is true
• the lower the p-value, the lower the chance the observation occurred by chance (i.e. the null hypothesis is unlikely)
• p-value of 0.05 = 5% chance
• p-value >0.05 = the null hypothesis is not accepted as true, but merely not rejected
• p-value <0.05 = significant – i.e. the null hypothesis is rejected
• p-value <0.01 = highly significant

Error
Type 1 error = false positive = seeing a difference where there isn't one
Type 2 error = false negative = not seeing a difference where there is one
Type 1 = Now you see it
Type 2 = Now you don’t
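The dispersion formulas above can be checked numerically – a minimal sketch using Python's statistics module (sample n - 1 form; data illustrative):

```python
# Variance, SD and SEM for a small illustrative sample
import math
import statistics

data = [2, 3, 3, 4, 5, 5, 6, 7]

variance = statistics.variance(data)   # sum of squared deviations from the mean / (n - 1)
sd = statistics.stdev(data)            # square root of the variance, in the original units
sem = sd / math.sqrt(len(data))        # standard error of the mean = SD / sqrt(n)
```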
Experimental design aims to minimise error. Error occurs due to:
• Random error – due to intrinsic variation in samples (reduced by increasing sample size)
• Systematic error – a.k.a. bias (not reduced by increasing sample size)

Bias = systematic error resulting in incorrect estimation of statistical parameters
• Selection bias = groups aren't comparable (reduced by randomisation)
• Measurement bias = error occurring in measuring variables (e.g. equipment error or observer bias – reduced by blinding and by standardising equipment)

Confounding = the association between study factors is distorted by other variables

Reduce error by:
• Randomisation (reduces selection bias) = each subject has an equal chance of being in either group
• Blinding (reduces measurement bias)
o Single blinded = the subject doesn't know which group they are in
o Double blinded = neither the subject nor the observer knows which group the subject is in
• Adequate sample size (the ideal sample size can be calculated by power analysis)

Power of a study = probability of correctly rejecting the null hypothesis when it is false (i.e. the ability to detect a significant difference if one exists). Sample size depends on (see the sketch below):
• Effect size = difference in effect between treatment and control groups (the larger the effect size, the smaller the sample size needed)
• Beta value = probability of a type 2 error, conventionally 20% (i.e. a power of 80% is needed)
• Alpha value = significance level, conventionally a p-value of 0.05
• Distribution of the data = parametric or non-parametric

Assessing the distribution of data (i.e. whether parametric tests apply) – either the Kolmogorov-Smirnov test or Q-Q (quantile-quantile) plots
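A rough illustration of how these inputs drive sample size, using the standard normal-approximation formula for comparing two means (a sketch only; sigma and delta below are hypothetical values, and real studies should use dedicated power software):

```python
# Approximate per-group sample size for comparing two group means
from scipy.stats import norm

alpha, power = 0.05, 0.80
sigma = 10.0      # assumed SD of the outcome (hypothetical)
delta = 5.0       # minimum difference worth detecting (hypothetical)

z_alpha = norm.ppf(1 - alpha / 2)   # 1.96 for a two-sided test at 0.05
z_beta = norm.ppf(power)            # 0.84 for 80% power (beta = 0.20)

n_per_group = 2 * ((z_alpha + z_beta) * sigma / delta) ** 2
print(round(n_per_group))           # ~63 per group; shrinks as delta grows
```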
Assessing significance of data = calculating p-values. This requires an appropriate test for the type of data being examined.

Parametric tests = applicable to data that is normally distributed
• Student's t-test
o Assesses the null hypothesis that the sample mean is the same as a known population mean
o t = (sample mean – known mean) / SE of sample mean
o When the means are the same, t = 0
o As the sample mean deviates from the population mean, t increases and the p-value decreases – i.e. the probability that the data came from a different population increases
• Student's paired t-test
o Examines paired data (i.e. data from the same subjects, before and after an intervention)
o Interested in differences within individuals, NOT between populations
o t = (mean difference before and after) / SE of the difference
• ANOVA = analysis of variance (compares means across more than two groups)

One-tailed tests
• look for a difference in one direction only (above or below the null value)
Two-tailed tests
• look for a difference in either direction (above and below the null value)

Non-normal distribution
Nominal = Chi-squared test
• Compares observed values (seen in the sample) with expected values (calculated by extrapolating known population data to the study sample)
• Calculated by doing the following:
o For each observed number, subtract the corresponding expected number: (O – E)
o Square that: (O – E)²
o Divide that by the corresponding expected number: (O – E)²/E
o Repeat this for every cell
o Add all the individual (O – E)²/E values together = this is the chi-squared statistic for the table
• In order to analyse the result you will need:
o A pre-determined level of significance – usually 0.05
o The degrees of freedom (df) for the data = number of values minus the number of restrictions
E.g. if you have 4 numbers with the restriction that they must add up to 50, the first 3 numbers can be anything (e.g. 5, 10 and 15), but the fourth number is then fixed (here it must be 20, in order to make 50). Therefore the degrees of freedom = 4 – 1 = 3
• Having calculated these, the chi-squared value is applied to a chi-squared distribution table. If the calculated chi-squared statistic corresponds to a p-value of 0.05 or less, the null hypothesis can be rejected.
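A minimal sketch of the full calculation, using hypothetical observed/expected counts and scipy only to look up the p-value (in place of a printed chi-squared table):

```python
# Chi-squared goodness-of-fit statistic, built cell by cell as above
from scipy.stats import chi2

observed = [25, 15, 10, 50]    # hypothetical sample counts
expected = [20, 20, 20, 40]    # hypothetical expected counts (same total, 100)

chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1         # 4 categories, totals fixed -> df = 3
p_value = chi2.sf(chi_sq, df)  # survival function = 1 - CDF

print(chi_sq, p_value)         # 10.0, ~0.019 -> reject the null at 0.05
```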
Ordinal = Wilcoxon signed-rank test, Mann-Whitney U test

How to perform the Mann-Whitney U test
1. Call one sample A and the other B.
Sample A = 7; 3; 6; 2; 4; 3; 5; 5
Sample B = 3; 5; 6; 4; 6; 5; 7; 5
2. Combine the samples into one group, and rank in ascending order (A listed before B where values tie):
Value: 2  3  3  3  4  4  5  5  5  5  5  6  6  6  7  7
Group: A  A  A  B  A  B  A  A  B  B  B  A  B  B  A  B
3. Look at each B in turn and count the number of A's preceding it. Add these up to get a U value:
U = 3 + 4 + 6 + 6 + 6 + 7 + 7 + 8 = 47
4. Look at each A in turn and count the number of B's preceding it. Add these up to get a U value:
U = 0 + 0 + 0 + 1 + 2 + 2 + 5 + 7 = 17
5. Take the smaller of the two U values and compare it with the probability table entry for the sample sizes. The table value gives the probability that the difference between the two sets of data could have occurred by chance.
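The counting procedure in steps 2–4 can be reproduced in a few lines of Python. Note the tie handling: ties are broken by listing A before B, matching the worked example above; standard software (e.g. scipy.stats.mannwhitneyu) shares ties equally between groups, so its U may differ slightly.

```python
# Mann-Whitney U by direct counting, as in the worked example
A = [7, 3, 6, 2, 4, 3, 5, 5]
B = [3, 5, 6, 4, 6, 5, 7, 5]

# Step 3: for each B, count the A's preceding it (ties: A listed first)
U_B = sum(sum(1 for a in A if a <= b) for b in B)
# Step 4: for each A, count the B's preceding it
U_A = sum(sum(1 for b in B if b < a) for a in A)

print(U_B, U_A)       # 47 17
U = min(U_A, U_B)     # 17 -> compare against a Mann-Whitney table for n = 8, 8
```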
Type of data and which test to use

             | 2 groups,          | Same subjects,       | >2 groups,         | Serial
             | different subjects | before and after     | different subjects | measurements
Continuous   | Unpaired t-test    | Paired t-test        | ANOVA              | Repeated measures ANOVA
Ordinal      | Mann-Whitney       | Wilcoxon signed-rank | Kruskal-Wallis     | Friedman
Nominal      | Chi-squared        | McNemar's test       | Chi-squared        | Cochran's Q
General Definitions
Sensitivity
• Probability of a test correctly identifying a true positive (a positive result in someone with the disease)
Specificity
• Probability of a test correctly identifying a true negative
Positive predictive value
• Probability that a person has the disease given a positive test result
Negative predictive value
• Probability that a person does not have the disease given a negative test result
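These four measures fall out of a standard 2x2 table of test result against disease status – a minimal sketch with hypothetical counts:

```python
# Hypothetical 2x2 table (illustrative numbers only):
#                 disease +   disease -
# test positive     TP = 90     FP = 30
# test negative     FN = 10     TN = 870
TP, FP, FN, TN = 90, 30, 10, 870

sensitivity = TP / (TP + FN)   # 0.90
specificity = TN / (TN + FP)   # ~0.97
ppv = TP / (TP + FP)           # 0.75
npv = TN / (TN + FN)           # ~0.99
```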
Risk
• Number of events occurring in a study group divided by the total number of subjects in that group
Relative risk (RR)
• Ratio of the risk in the treatment group to the risk in the control group
• = risk in treatment group / risk in control group
Absolute risk reduction (ARR)
• Difference in event rates between treatment and control groups
• = risk in control group – risk in treatment group
Relative risk reduction (RRR)
• % reduction in events in the treatment group compared with the control group
• = 1 – relative risk
Odds
• Ratio of the probability of an event occurring to the probability of it not occurring
Odds ratio (OR)
• Ratio of the odds of an event occurring in one group to the odds of it occurring in another group
Number needed to treat (NNT)
• Number of patients who need to be treated to prevent one adverse outcome
• Ideally as low as possible
• = 1 / absolute risk reduction
Number needed to harm (NNH)
• Number of patients who need to be treated to cause one adverse event
• A low NNH means harm occurs frequently – suggests a low therapeutic index
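A worked example tying the risk measures above together (the trial counts are hypothetical):

```python
# Hypothetical trial: 10/100 events on treatment vs 20/100 on control
events_rx, n_rx = 10, 100
events_ctl, n_ctl = 20, 100

risk_rx = events_rx / n_rx             # 0.10
risk_ctl = events_ctl / n_ctl          # 0.20

rr = risk_rx / risk_ctl                # relative risk = 0.5
arr = risk_ctl - risk_rx               # absolute risk reduction = 0.1
rrr = 1 - rr                           # relative risk reduction = 0.5 (50%)
nnt = 1 / arr                          # number needed to treat = 10

odds_rx = risk_rx / (1 - risk_rx)      # ~0.11
odds_ctl = risk_ctl / (1 - risk_ctl)   # 0.25
odds_ratio = odds_rx / odds_ctl        # ~0.44
```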