PSYB01 - Lec 9 - Inferential Stats and Hypothesis Testing (near-verbatim)

University of Toronto St. George
Connie Boudens

Inferential Statistics and Hypothesis Testing: Nov 22, 2012 – Lec 9

Descriptive statistics summarize a data set; they describe its characteristics
▪ What the average and standard deviation are for the data set that you collected
▪ But you want to go beyond the data that you collected
▪ Descriptive stats are important because you need to describe to ppl exactly what your data looked like, but you are also trying to make conclusions about a larger group using your sample
▪ You want a representative sample, and you want the sample to be sufficiently large, so you can draw inferences about the population that the sample comes from
▪ Polling organizations are good at getting representative samples; they can predict how the population will behave and draw conclusions about the population by using the sample
▪ Can you make conclusions about the population using the sample that you have? This isn't as straightforward as it sounds. You can't just go by the average; you have to do more work

Inferential statistics are used to draw inferences about a population from a sample.

Two main methods used in inferential statistics:
1. Estimation of population parameters (statistics at the population level)
   – Sample is used to estimate the mean and SD for the population
     • You're never going to know exactly what the pop'n mean and SD are, but the sample can help estimate what they are
   – A confidence interval (CI) is constructed
     • CI is usually around the mean
     • CI allows you to say with a certain level of certainty that the population mean falls within a relatively narrow range; ex) 99% certain that the mean for the population falls between 100 and 105
   – Bigger sample size = narrower CI
     • This is good because a narrower CI means a more precise estimate
2. Hypothesis testing
   – Null hypothesis
   – Alternative hypothesis / research hypothesis
   – Ex) research question: are UTSC students smarter than the general public?
     • Hypothesize: yes, they are smarter than the general public
   – With hypothesis testing, you generate 2 competing hypotheses
     • 1) Null hypothesis: assumes that there are no differences; the average IQ of UTSC students is the same as the IQ for the population
     • 2) Alternative/research hypothesis: there is a difference in IQ between UTSC students and the population
   – A research paper will usually mention the alternative/research hypothesis, not the null
   – In hypothesis testing, always assume that your null hypothesis is true, even if you believe your alternative hypothesis is true; the researcher must present enough evidence to reject the null hypothesis
     • Have to get a sample of UTSC students (n=50) and test them; mean IQ = 121. But does this apply to the general UTSC population? Inferential stats must be done

Why would my sample differ from the population?
▪ If your sample is perfectly representative of the UTSC population then you have no problem, but this will never be the case; there will always be some variation
▪ 2 basic reasons why a sample may differ from the population

Two sources of deviation:
• Systematic error
   – Due to bias in your sample
   – If you've done a good job you've eliminated these
   – Some potential problems: you asked for volunteers and told them they were going to do an IQ test, so the ppl who participate are usually those with high IQs (smart ppl will do IQ tests because it makes them feel good about themselves); or the person scoring marks them really high because they know the study is about IQ; etc. These are sources of systematic error, and you can eliminate them
• Sampling error
   – Samples will vary from each other in random fashion
   – Unavoidable
   – Any sample chosen will vary from the population in some way, and samples will all be different from each other
   – They are not all going to have the same mean or the same SD
   – You won't know how the samples would vary because you're only taking one sample; it would cost lots of resources to draw every possible sample and you have good methods
     for extrapolating to the population, so you would only use one sample
   – What you want to know is if the sample is representative of the overall population
   – Eliminating your sources of bias and accounting for other potential research problems (confounding/extraneous variables, getting a good sample) all goes towards removing systematic error
   – But you're always going to be left with sampling error (unavoidable)

You need to figure out if the mean in your sample is due to sampling error or to a real difference between UTSC students and the general public, because it could be the case that it's sampling error
• Let's say the sample that you initially pick is an unusual sample with IQ 125 (for UTSC students), but the samples that you pick after that have IQ = 100, 98, 105, etc. If that happens it is problematic; you can draw a bigger sample OR use inferential stats
• So, with inferential stats, you can use this one sample and see if it is representative of the total UTSC pop'n
• Last week:
   o Bell-shaped histogram: normally distributed scores / bell distribution
   o A lot of statistical techniques are based on the assumption that data are distributed like this in the pop'n
   o Series of histograms with larger and larger sample sizes; a curve is drawn over this type of histogram
   o Ex) height = normally distributed in the pop'n
   o Histogram A = 100 women categorized into height types; height of bar represents # of ppl in that category
   o Histogram = not that many women who are really short or really tall; a lot in the middle
   o So, you want to know whether your sample is an accurate representation of the pop'n
     → Need the distribution of sample means to figure this out = also a bell-shaped distribution, but a special kind of distribution
     → It is made up of the means of all random samples that you could've drawn from a population
     → If n=50: drawing every sample of 50 UTSC students that you could and then giving them all IQ tests; you would have thousands of samples and eventually
       will have all of the samples you could possibly draw
     → So give all of these samples IQ tests; you will wind up with a distribution where few samples are on either end and most of the samples are somewhere in the middle. This is the sampling distribution: all the sample means that you could possibly draw are in this distribution, and it is a normal distribution
     → The sampling distribution is normal if one of 2 conditions is true: 1) the population is normal, or 2) you have a relatively large sample size (n = 30 or more)
     → If one of these 2 things is true and you were to plot the means of all samples, you would get a normal distribution

Your analysis will be based on:
• The Distribution of Sample Means: the collection of sample means for all the possible random samples of a particular size (n) that can be obtained from a population
• It will be almost perfectly normal if either:
   – the population is normal, or
   – the n of the sample is large
• This distribution has a mean that is equal to the population mean
   – The average of the means of all of your samples is the same as the population mean
• With UTSC students, you have a sampling distribution of the means made up of all possible samples of 75 students: test all people, find the mean for each sample, plot the means on a graph
   – There is an actual distribution of IQ scores at UTSC, but this distribution isn't known and isn't knowable
   – Sampling distribution of the means = theoretical; you can't go thru and get every sample of 75 students because it would take an extremely long time

So, you have a mean of 100 with SD of 10 and you have a theoretical distribution (= sampling distribution of the means)
• What do you do now?
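The distribution of sample means is theoretical, but a small simulation can make it concrete. The sketch below is illustrative only: the simulated population (20,000 scores with mean 100 and SD 15), the number of samples drawn, and the helper name `simulate_sample_means` are all assumptions, not values from lecture. It draws many random samples of n = 75 and compares the mean and SD of the resulting sample means with the population mean and with population SD / sqrt(n).

```python
import math
import random
import statistics


def simulate_sample_means(population, n, num_samples, seed=0):
    """Draw num_samples random samples of size n; return the list of their means."""
    rng = random.Random(seed)
    return [statistics.fmean(rng.sample(population, n)) for _ in range(num_samples)]


# Hypothetical population of IQ-like scores (mean 100, SD 15 are assumed values).
pop_rng = random.Random(42)
population = [pop_rng.gauss(100, 15) for _ in range(20_000)]

pop_mean = statistics.fmean(population)
pop_sd = statistics.pstdev(population)

# A few thousand samples stand in for "every possible sample of 75".
sample_means = simulate_sample_means(population, n=75, num_samples=2_000)

mean_of_means = statistics.fmean(sample_means)   # should sit close to pop_mean
sd_of_means = statistics.stdev(sample_means)     # should sit close to pop_sd / sqrt(n)
theoretical_se = pop_sd / math.sqrt(75)

print(f"population mean      : {pop_mean:.2f}")
print(f"mean of sample means : {mean_of_means:.2f}")
print(f"SD of sample means   : {sd_of_means:.2f}")
print(f"pop SD / sqrt(n)     : {theoretical_se:.2f}")
```

The individual sample means differ from one another (sampling error), but they cluster around the population mean, and their spread is much smaller than the spread of individual scores.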
• You can calculate the SD of the sampling distribution (the theoretical distribution of all possible samples) from the SD of the sample. It allows you to … because you know the sampling distribution is normal (the sample size is 75, which is large enough) and you also know the SD of this sampling distribution, you can figure out where in this theoretical distribution your sample actually comes from
• Remember: this is a theoretical distribution of all samples of 75 that you could possibly draw, but now you can figure out what the chances are that your sample comes from the low end, the high end, or somewhere in the middle
   o It tells you how likely it is that your sample actually comes from this theoretical normal distribution
   o With a normal distribution, we know how much of the area underneath the curve falls between the markings, and the markings at the bottom are in units of SD
   o This is a standard normal distribution, so the 0 marking is the mean, and 34.1% of the area falls between the mean and one SD above the mean
   o It's important to know what the SD is because it helps you visualize what this distribution looks like
   o Let's say the estimated SD of this sampling distribution is
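As a rough worked example of locating one sample within the theoretical distribution, the sketch below estimates the SD of the sampling distribution (the standard error) as sample SD / sqrt(n), expresses the sample mean in standard-error units, and builds a confidence interval around it. The n = 50 and mean IQ = 121 come from the UTSC example in these notes; the sample SD of 15 and the null-hypothesis mean of 100 are assumed values chosen only for illustration, and the function names are hypothetical. It uses the normal curve throughout, which is a simplification.

```python
import math
from statistics import NormalDist


def standard_error(sample_sd, n):
    """Estimated SD of the distribution of sample means: sample SD / sqrt(n)."""
    return sample_sd / math.sqrt(n)


def z_for_sample_mean(sample_mean, null_mean, se):
    """Distance of the sample mean from the null-hypothesis mean, in standard errors."""
    return (sample_mean - null_mean) / se


def confidence_interval(sample_mean, se, level=0.95):
    """Normal-approximation confidence interval around the sample mean."""
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)
    return sample_mean - z_crit * se, sample_mean + z_crit * se


# Area under the standard normal curve between the mean and one SD above it.
area_mean_to_one_sd = NormalDist().cdf(1) - NormalDist().cdf(0)

# From the notes: n = 50, sample mean IQ = 121.
# Assumed for illustration: sample SD = 15, null-hypothesis mean = 100.
n, sample_mean = 50, 121
sample_sd, null_mean = 15, 100

se = standard_error(sample_sd, n)
z = z_for_sample_mean(sample_mean, null_mean, se)

# Chance of a sample mean at least this high if the null hypothesis were true
# (upper-tail area of the standard normal, computed with erfc for accuracy).
p_upper_tail = 0.5 * math.erfc(z / math.sqrt(2))

ci_low, ci_high = confidence_interval(sample_mean, se)

# Same sample SD with a larger sample: the interval gets narrower.
ci_low_200, ci_high_200 = confidence_interval(sample_mean, standard_error(sample_sd, 200))

print(f"area between mean and +1 SD: {area_mean_to_one_sd:.3f}")
print(f"standard error: {se:.2f}, z: {z:.2f}, upper-tail p: {p_upper_tail:.2e}")
print(f"95% CI with n=50 : ({ci_low:.2f}, {ci_high:.2f})")
print(f"95% CI with n=200: ({ci_low_200:.2f}, {ci_high_200:.2f})")
```

Under these assumed numbers the sample mean sits far out in the upper tail of the distribution expected under the null hypothesis, which is the kind of evidence needed to reject it. The last two lines show the earlier point that a bigger sample gives a narrower confidence interval.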