Internal and external validity
Research strategies

- Results vs. interpretation
- Internal validity
- External validity
  • The “undergraduate participant” problem
  • Cross-species generalizations (Smith, Minda, & Washburn, 2004)
  • The “But it’s not real life!” argument
- Research strategies

Results and interpretation
- Analyze the data, then interpret the data.
- Data analysis and data interpretation are two completely different things.
- Researchers can (and often do) come to different interpretations starting from identical data.

Interpretation example: Fictional clinical psychology experiment
- A clinical psychologist decides to evaluate the efficacy of the medication R22 in treating anxiety.
  • 40 participants are randomly assigned to one of two conditions: R22 or placebo.
  • The experiment uses a double-blind procedure.
  • Pre-treatment assessments show that both groups have a similar level of anxiety.
  • Finally, a post-treatment measure of anxiety is taken.

Fictional study results
- Eight participants in the R22 condition improved, compared to only four in the placebo condition. Therefore, R22 is an effective treatment for anxiety.
- However, the majority of the participants who withdrew did so because the experimental situation made them feel anxious!

Conclusions
- Attrition: A loss of participants during a study that may lead to biased results.
- The data for the participants who completed the study were not in question. The interpretation was, however.
- Any component of a research study that raises doubts about the quality of the research process or the interpretation of the research results is a threat to validity.
- Attrition is more likely when a study runs for a long time.
- In research where attrition is an issue, researchers go to great lengths to find out who dropped out and whether something sets those participants apart. Dropouts are a concern because there is no way of knowing whether their data would have changed the final results.
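
The attrition problem in the fictional R22 study can be made concrete with a little arithmetic. The sketch below is only an illustration: the counts of 8 and 4 improved participants come from the example, but the group sizes (20 per condition) and the dropout counts are hypothetical, since the example does not state them. It compares the improvement rate among participants who completed the study with a worst-case rate that counts every dropout as "not improved".

```python
def improvement_rates(randomized, dropped_out, improved):
    """Return (completers-only rate, worst-case rate) for one condition.

    completers-only: improved / participants who finished the study
    worst-case:      improved / everyone randomized, counting every
                     dropout as "not improved"
    """
    completed = randomized - dropped_out
    if completed <= 0 or improved > completed:
        raise ValueError("inconsistent counts")
    return improved / completed, improved / randomized


# The 8 and 4 improvers come from the example. The group sizes (20) and
# the dropout counts (10 and 2) are HYPOTHETICAL values for illustration.
r22 = improvement_rates(randomized=20, dropped_out=10, improved=8)
placebo = improvement_rates(randomized=20, dropped_out=2, improved=4)

print(f"R22     completers-only: {r22[0]:.0%}, worst-case: {r22[1]:.0%}")
print(f"Placebo completers-only: {placebo[0]:.0%}, worst-case: {placebo[1]:.0%}")
```

With these hypothetical counts, the completers-only comparison (80% vs. 22%) looks far more favorable to R22 than the worst-case comparison (40% vs. 20%). The data are identical in both cases; only the treatment of the dropouts, and therefore the interpretation, differs.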

Internal validity
- Internal validity: The extent to which a research study produces a single, unambiguous explanation for the relationship between two variables.
- How strongly do the results allow you to defend your interpretation, your hypothesis, and your theory?
- Measurement validity ≠ study validity.
- Different factors are always at play and happening simultaneously. Ideally, you would be able to consider all of them.

Guy’s corollary of Stanovich’s (2007) connectivity principle
- When interpreting research results, look for the straightforward, common “scientific” sense, and boring explanations before putting forward a spectacular one.
  • Attrition
  • Environmental variables
  • Assignment bias
  • History
  • Maturation
  • Instrumentation
  • Testing effects
  • Regression towards the mean

Environmental variables
- Any changes in the testing environment (people, places, time) may have an impact on the results, and thus, on internal validity.
  • People
  • Places
  • Times
- Even in an ordinary lab experiment (e.g., a learning experiment), environmental variables arise as soon as participants come into the lab: some people are tested in the morning, others in the evening. Temperature can differ too: the morning may be cool and the afternoon unbearably hot.
- These effects are worth keeping in mind, but most of the time they do not matter, because participants are not assigned to experimental conditions in a way that makes them matter: the morning participants go through the same conditions as the afternoon participants.

The effect of context on memory
- Godden and Baddeley (1975): The context in which learning takes place influences memory.
- 16 experienced divers learned lists of 40 words in one of two environments and recalled them in one of the two environments.
- Recall was better when the words were recalled in the same environment in which they were learned; that is, performance is better when the learning and recall conditions match. This is described as episodic cue encoding: the environment serves as a cue for recall.

Godden and Baddeley (1975) results

Assignment bias
- A threat to internal validity that occurs when the process used to assign different participants to different treatments produces groups of individuals with noticeably different characteristics.
  • Private vs. public school debates (e.g., Lubienski & Lubienski, 2006).
- A principal of a private school says that its students do very well on standardized tests and attributes this to the quality of the teachers.
- What can lead to these different results?
  • Socioeconomic status and home environment.
  • The teachers could be better. To test this, the demographics of the children must be the same across schools; only then can you test whether the teachers differ.
  • Ask yourself whether there is a better or different explanation.

History
- A threat to internal validity from any outside event that occurs during the time that a research study is being conducted and has an influence on the participants’ scores.
- Bahrick (1984). Semantic Memory Content in Permastore: Fifty Years of Memory for Spanish Learned in School.
- 773 participants were tested from 1 to 50 years after taking Spanish course(s) at the university level.

Results
- Even 50 years after the courses ended, a substantial proportion of the participants’ knowledge of Spanish remains.
- The difference in performance between students who obtained As and those who obtained Cs remains constant for up to 50 years.

Bahrick’s interpretation
- Bahrick wanted to see how well people could remember content from academic courses after the passage of time.
- He tested people who had either just finished the course or had taken the same class up to 50 years earlier.
- He used recognition tests, production tests, short reading passages, etc.: the kinds of exercises students would have to learn in class or complete on a test.
- Is there a threat to validity (history)?
- “In addition to taking a test of knowledge of Spanish, subjects completed a questionnaire designed to provide information about Spanish instruction; grades obtained in Spanish courses; and various opportunities to read, write, speak, or listen to Spanish and other Romance languages during the retention interval” (p. 3).

Results
- Even after 50 years, people still remembered a certain amount of Spanish. A lot of forgetting occurs early on (the first 5-7 years); after about 7 years, performance starts to level off.

Permastore
- Permastore: State of knowledge that remains unchanged for 25 years or more following a 0 to 6 year period of accelerated decline.
- Almost none of the participants used Spanish again between taking the exam and being retested.

Maturation
- A threat to internal validity from any physiological or psychological changes that occur in a participant during the time that a research study is being conducted and that can influence the participant’s scores.
  • Especially important at both ends of the life span (research with infants, children, and the elderly).
  • Try to find a control group that is similar to the experimental group. For example, for a 67-year-old man with a university degree, find a man with the same characteristics.

Instrumentation
- A threat to internal validity from changes in the measurement instrument that occur during the time a research study is being conducted.
  • Grading
  • Observational studies
- The easiest example is grading. A history professor teaching a large second-year class has 200 exams, each with 10 pages of writing, to grade in one week. Will that person grade the 1st, the 100th, and the 200th exam the same way?
  • To reduce this problem, look for very specific things and force yourself to check every answer against the grading sheet.

Testing effects
- A threat to internal validity that occurs when participants are exposed to more than one treatment and their responses are affected by an earlier treatment.
  • Practice
  • Fatigue
  • Carry-over: learning something in one condition and transferring that knowledge to another condition.

Regression toward the mean
- A statistical phenomenon in which extreme scores (high or low) on a first measurement tend to be less extreme on a second measurement.
- After an extreme score, an even more extreme score is highly unlikely, so the next score will tend to fall closer to the middle.
  • Galton (1886) – Regression towards mediocrity in hereditary stature.
  • My “psychic” experiment
  • Sophomore slumps: sports commentators talk about these. An athlete has a fantastic first season and then does not do as well in the second. After a very good season, it is hard to have an even better one the next season.

Guy’s corollary of Stanovich’s (2007) connectivity principle
- When interpreting research results, look for the straightforward, common “scientific” sense, and boring explanations before putting forward a spectacular one.
- Attrition, environmental variables, assignment bias, history, maturation, instrumentation, testing effects, regression towards the mean.

Conclusions about threats to internal validity
- Experimental psychologists are trained to:
  • catch these threats in non-scientific reports
  • avoid or minimize these threats in their research
- Given that experimental psychologists rarely commit blatant methodological errors that threaten internal validity, how are the more subtle errors caught?

Replications
- We catch errors by replicating and expanding published research studies.
- Extraneous variable: Any variable that exists within a study other than the variables being studied.
  • A problem if it turns into a confounding variable.
- Confounding variable: An extraneous variable that is allowed to change systematically along with the variables being studied and that threatens internal validity.
  • The variable you are studying changes systematically with other, outside variables.
- If you can keep getting the same results, then there is a good chance the original study was accurate.
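
The difference between a harmless extraneous variable and a confounding variable can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not a model of any real study: the treatment has no true effect, time of day is the extraneous variable, and the baseline score (50), the morning bonus (5 points), and the noise level are all hypothetical values. In the confounded design, every treatment participant is tested in the morning and every control participant in the afternoon; in the balanced design, session times alternate within each group.

```python
import random
from statistics import mean


def simulate(confounded, n_per_group=2000, seed=0):
    """Return mean(treatment) - mean(control) for a treatment with NO true effect.

    Scores depend only on time of day (morning sessions score higher)
    plus random noise. If `confounded` is True, time of day changes
    systematically with the condition; otherwise it is balanced.
    """
    rng = random.Random(seed)
    morning_bonus = 5.0  # hypothetical size of the time-of-day effect
    scores = {"treatment": [], "control": []}
    for group in scores:
        for i in range(n_per_group):
            if confounded:
                morning = (group == "treatment")  # time of day tracks condition
            else:
                morning = (i % 2 == 0)  # half of each group tested in the morning
            score = 50.0 + (morning_bonus if morning else 0.0) + rng.gauss(0, 10)
            scores[group].append(score)
    return mean(scores["treatment"]) - mean(scores["control"])


confounded_diff = simulate(confounded=True)
balanced_diff = simulate(confounded=False)
print(f"confounded design: difference = {confounded_diff:.2f}")
print(f"balanced design:   difference = {balanced_diff:.2f}")
```

In the confounded design, the group difference should land near the 5-point time-of-day bonus even though the treatment does nothing, whereas in the balanced design it should land near zero. Time of day is present in both designs; it only threatens internal validity when it varies systematically with the condition.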
