PSY 3213L: Chapters 4-14 (course notes)


Department: Clinical and Health Psychology
Course: PSY 3213L
Professor: Kilimenko
Semester: Fall

Chapter 4

Two scientific study activities:
- Exploratory data collection and analysis: aimed at classifying behaviors, identifying potentially important variables, and identifying relationships between those variables and the behaviors.
- Hypothesis testing: evaluating potential explanations for the observed relationships.

Causal relationship: one variable directly or indirectly influences another.
- Unidirectional: A influences B but not vice versa.
- Bidirectional: each variable influences the other.

Correlational relationship: changes in one variable accompany changes in another, but no proper testing has been done to show that they actually influence each other.
- When changes in one variable tend to accompany a specific change in another, the variables are said to covary.

Correlational research: determines whether two variables covary and, if so, establishes the direction, magnitude, and form of the observed relationship.
- Nonexperimental: observe the values of two or more variables and determine what relationships exist between them.
- No attempt is made to manipulate variables; behavior is observed "as is."
- Makes it possible to predict the probable value of one variable from the value of another.
- The variable used to predict is the predictor variable; the variable whose value is being predicted is the criterion variable.

Two problems with the correlational method:
- Third-variable problem
  o You want to show that variation in one observed variable could only be due to the influence of the other observed variable.
  o A third, usually unobserved, variable may influence both variables, causing them to vary together even though no direct relationship exists between them.
  o You must examine the effect of each potential third variable to determine whether it accounts for the observed relationship.
- Directionality problem
  o The direction of causality is sometimes difficult to determine.

Reasons for choosing correlational research:
- Manipulating the variables may be impossible or unethical.
- It can provide a rich source of hypotheses that can later be tested experimentally.
- You want to see how naturally occurring variables relate in the real world.

Experimental research:
- Incorporates a high degree of control over the variables in the study.
- Can establish causal relationships among the variables.
- Requires manipulation of one or more independent variables and control over extraneous variables.

Manipulating the independent variable:
- The independent variable is chosen by the experimenter.
- The specific conditions associated with each level are called treatments.
- By manipulating the independent variable, you hope to show that changes in its levels cause changes in the behavior recorded.
- The group receiving the treatment is the experimental group; the other group is the control group.

Extraneous variables: variables that may affect the behavior you wish to investigate but are not of interest for the present experiment.

Uncontrolled variability:
- Makes it difficult or impossible to detect any effects of the independent variable, because it produces chance differences in behavior across the levels of the independent variable.
- Hold extraneous variables constant: make sure all treatments are exactly alike except for the independent variable.
- Randomize their effects across treatments: let them even out so they cannot be mistaken for effects of the independent variable.
- Random assignment allows you to use inferential statistics to evaluate the probability with which chance alone could have produced the observed differences.

Strengths and limitations of the experimental approach:
- It can tell you whether changes in one variable actually caused changes in the other, and it lets you identify and describe causal relationships.
- Limitation: you cannot use the experimental method if you cannot manipulate your hypothesized causal variables.
- Limitation: the tight control over extraneous factors required to clearly reveal the effects of the independent variable can limit generality.

Experiments vs. demonstrations:
- A demonstration lacks an independent variable; it simply exposes a single group to one treatment condition and measures the behavior.
- Useful for showing that "this happens and not that," but demonstrations are not experiments and do not show causal relationships.

Internal and external validity:
- Internal validity: the ability of your research design to adequately test the hypothesis it was designed to test.
- In an experiment, this means the independent variable caused the observed variation in the dependent variable.
- In a correlational study, it means changes in the value of your criterion variable relate solely to changes in the value of your predictor variable.
- Internal validity is threatened to the extent that extraneous variables can provide alternative explanations for the findings of the study.

Confounding variables: when two or more variables combine in such a way that their effects cannot be separated.
- Confounding is less problematic when the confounding variable is known to have little or no effect on the dependent or criterion variable, or when its known effect can be taken into account in the analysis.
- The best you can do is substitute what you believe to be less serious threats to internal validity for more serious ones.

Threats to internal validity:
- History (an event occurs between two observations)
- Maturation (effects of age or fatigue)
- Testing (a pretest sensitizes participants to what you are investigating)
- Instrumentation (unobserved changes in observers' criteria or in instrument calibration)
- Statistical regression (extreme scores tend to move closer to the population average on retesting)
- Biased selection of subjects (groups differ initially and that difference drives the change; usually happens when preexisting groups are used rather than assigning subjects to groups at random)
- Experimental mortality (loss of participants)

External validity:
- The degree to which results can be extended beyond the limited research setting and sample in which they were obtained.
- A study low in external validity may tell us little about how people react in the real world.
- In basic research the objective is to gain insight into underlying mechanisms rather than to discover relationships that generalize, so threats to external validity may be less relevant.
- Threats become more relevant when the findings are expected to be applied directly to a real-world setting.

Choice of setting is affected by:
- Costs
- Convenience
- Ethical considerations
- The research question

Laboratory setting:
- Gain important control over variables that could affect the results, including extraneous variables that could affect the dependent variable.
- May lose generality.
- Simulation
  o Used when manipulation in the real world would be unethical, expensive, or time consuming.
  o Retains control while providing relatively realistic conditions.
- Designing a simulation
  o Observe and study the real-world situation carefully and identify its crucial elements.
  o The more realistic the simulation, the greater the chance that results will apply to the real-world phenomenon.
- Realism
  o Mundane realism: the simulation mirrors a real-world event.
  o Experimental realism: the simulation psychologically involves the participant in the experiment.

Field research:
- Conducted in the participants' natural environment.
- A field experiment manipulates an independent variable and measures a dependent variable, so it has all the qualities of a lab experiment.
- Advantage: results can be generalized to the real world.
- Disadvantage: little control over potential confounding variables (low internal validity); extraneous variables can obscure or distort the effects of the independent variable.

Sampling and participant selection:
- We know that probability sampling strategies are most likely to give us a representative sample.
- An inclusion criterion might be declaring a major in a heavily math-oriented field (e.g., mathematics, computer science, physics).
- An exclusion criterion might be test anxiety (e.g., test scores might be lower not because of stereotype threat but because of test anxiety that decreases performance in all testing situations).

Random assignment controls extraneous factors because it randomly distributes personal characteristics that can influence the outcome across conditions. There are two ways to randomly assign participants to experimental conditions (a code sketch follows this section):
1. Free random assignment: the experimenter uses a random number table or a random-number generator on a calculator or computer to assign participants to groups. There is no attempt to measure and use personal characteristics as part of the random assignment process.
2. Matched random assignment: information about subject characteristics is collected and used to identify similar participants; after the match is made, participants are then randomly assigned to groups. This strategy ensures an equal distribution of critical personal characteristics across experimental conditions.

Bias can be introduced into studies by both the experimenter and the participant. Direct knowledge of the study hypothesis, the nature of the experimental manipulation, and group assignment can lead to subtle differences in the ways that experimenters and participants interact in the research setting. Restricting knowledge of the experiment through "blind" procedures can help eliminate this bias.
- In a single-blind procedure, a laboratory assistant who does not know the study hypothesis administers the experimental manipulation and also does not know the experimental condition to which the participant was assigned. Having a naive intermediary between the experimenters who designed the study and the research participants prevents experimenter expectancies from influencing study results.
- In a double-blind procedure, neither the person administering the experimental manipulation nor the participant knows the study hypothesis and group assignment. The prototypic double-blind study is a randomized study of medication: participants receive either the active drug or a pill that looks, smells, and tastes exactly like the drug but without the active ingredient, and neither the experimenter nor the participants know which is being administered until after the study is over.

Within-subjects designs: performance is compared within individual participants. Order and sequence effects are major sources of error in within-subjects designs.
- Order effects produce changes in performance based on the position of a condition in the experiment, not the manipulation in that specific condition.
- Practice effects are order effects: in cognitive experiments, performance is usually lower on the first task because participants are unfamiliar with the setting; once they become familiar with what is required, performance increases.
- Fatigue effects are also order effects: performance is worse in later conditions because participants are tired.
- Sequence effects are produced by characteristics of the experimental manipulation. For example, in a study of weight perception, participants judge a weight as lighter if it follows a heavy rather than a light weight, and as heavier if it follows a light rather than a heavy weight. Sequence effects are caused by an interaction between order and specific aspects of the manipulation.

Summary of controls:
- We always try to build controls that minimize or eliminate confounds and threats to validity into our studies.
- The most general control in experimental research is adequate preparation of the research setting.
- Extraneous variables can be controlled by carefully selecting who is in the study through inclusion and exclusion criteria and by randomly assigning participants to experimental groups.
- Single-blind and double-blind procedures can help control for experimenter bias.
- Control groups in between-groups experimental designs should be as similar as possible to the experimental groups.
- Counterbalancing is an effective control in within-subjects designs.
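A minimal Python sketch of the two random-assignment strategies described above. The participant IDs and the anxiety scores used as the matching variable are made up for the example:

    import random

    participants = ["P01", "P02", "P03", "P04", "P05", "P06", "P07", "P08"]

    # Free random assignment: shuffle the list and split it in half.
    random.shuffle(participants)
    half = len(participants) // 2
    experimental, control = participants[:half], participants[half:]

    # Matched random assignment: order participants on a measured characteristic
    # (hypothetical anxiety scores), pair adjacent participants, then flip a coin
    # within each pair to decide who goes to which group.
    scores = {"P01": 12, "P02": 30, "P03": 14, "P04": 28,
              "P05": 19, "P06": 21, "P07": 25, "P08": 11}
    ranked = sorted(scores, key=scores.get)          # order by the matching variable
    matched_experimental, matched_control = [], []
    for i in range(0, len(ranked), 2):               # take adjacent pairs
        pair = [ranked[i], ranked[i + 1]]
        random.shuffle(pair)                         # random assignment within the pair
        matched_experimental.append(pair[0])
        matched_control.append(pair[1])

    print(experimental, control)
    print(matched_experimental, matched_control)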
Hypothesis testing

Hypothesis testing is one of the most important concepts in statistics. It is how we decide whether:
- Effects actually occurred.
- Treatments have effects.
- Groups differ from each other.
- One variable predicts another.

Null hypothesis: nothing happened. You test your sample statistic against the value based on the null-hypothesis sampling distribution. If your sample statistic and the null-hypothesis value are close, you conclude that they are not different; you did not find an effect in your study.

Alternative hypothesis: something happened. If your sample statistic and the null-hypothesis value are not close, you conclude that they are different; you found an effect in your study.

Steps in hypothesis testing:
1. Come up with your hypothesis (for example, college students sleep less than other folks).
2. Generate a sample (pick a set of college students).
3. Calculate your summary statistics (for example, the mean and SD of the number of hours that college students sleep per night).
4. Determine the statistical test that will compare your summary statistic against the value determined by your null hypothesis. (You would use the single-sample t-test for college students' sleep.)
5. Calculate the test statistic from your summary statistics. The formula differs for each type of test, but the basic concept is the same: you calculate how far your sample is from the null hypothesis, taking into account that sample values of a statistic vary by chance when smaller samples are taken from a larger population. The standard error (SE) tells us how much they vary.
6. Derive the appropriate sampling distribution, or refer to one already listed in the tables in your statistics book; your computer program can also give you this information.
7. Choose the cut-off value on your sampling distribution that tells you your sample statistic is very far from the null hypothesis and thus unlikely. We call this cut-off value the alpha level or significance level.
8. Decide whether to reject the null hypothesis or fail to reject it by comparing your test statistic to the cut-off value.
9. Draw your conclusion. If you reject the null hypothesis, your result is statistically significant, meaning it is unlikely to have occurred by luck or chance alone. If you fail to reject the null, you conclude that you did not find an effect or difference in this study.
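As an illustration of steps 2 through 9, here is a minimal sketch using a single-sample t-test in Python. The sleep values and the null value of 8 hours are made up for the example:

    import numpy as np
    from scipy import stats

    # Steps 2-3: sample of college students' nightly sleep hours (hypothetical data).
    sleep_hours = np.array([6.5, 7.0, 5.5, 6.0, 7.5, 6.8, 5.9, 6.2, 7.1, 6.4])
    mean, sd = sleep_hours.mean(), sleep_hours.std(ddof=1)

    # Steps 4-8: single-sample t-test against the null value (8 hours), alpha = .05.
    t_stat, p_value = stats.ttest_1samp(sleep_hours, popmean=8.0)
    alpha = 0.05

    # Step 9: draw the conclusion.
    if p_value < alpha:
        print(f"t = {t_stat:.2f}, p = {p_value:.4f}: reject the null (statistically significant)")
    else:
        print(f"t = {t_stat:.2f}, p = {p_value:.4f}: fail to reject the null")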
Errors and power:
- You can make an error or two when you test hypotheses: you might say things are different when they are not, or you may miss a relationship that really exists. These are called Type I and Type II errors, respectively.
- A Type I error (saying there is a difference when there is not one) has probability equal to your alpha level (.05 or .01).
- A Type II error (missing a real difference or correlation) has probability equal to beta.
- Power is the probability of correctly rejecting a false null hypothesis; power = 1 - beta. Many experts recommend a power of .80, which means you have an 80% chance of finding a difference when one is really there. Depending on conditions, you may have a good or bad chance of finding the desired result.

To increase power you can:
1. Try to increase the effect size or the strength of the relationship.
2. Decrease experimental error.
3. Use a higher alpha level (say .05 as compared to .01); note that this increases power but also the Type I error rate.
4. Increase sample size.
5. Use matched samples or a covariance technique.

Summary:
- We always test a null hypothesis against an alternative/research hypothesis.
- If a sample is close to the null value, we conclude that nothing happened in the study. If a sample is far away from the null value, we reject the null hypothesis and conclude that something happened.
- The logic of hypothesis testing is counterintuitive (or backwards): we test whether nothing happened (our sample value is close to the null) in order to conclude that something happened.
- There are two types of error in hypothesis testing. Type I errors occur when we conclude that there is a difference when there is not. Type II errors occur when we conclude that there is no difference when there is.
- Statistical power is the probability of correctly detecting a true difference; we want to maximize it (a sample-size sketch follows).
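A minimal sketch of a power calculation along the lines recommended above, assuming the statsmodels package is available. The effect sizes used here are illustrative values, not from the notes:

    from statsmodels.stats.power import TTestIndPower

    analysis = TTestIndPower()

    # Per-group sample size needed to reach power = .80 for a medium effect
    # (d = 0.5) at alpha = .05.
    n_per_group = analysis.solve_power(effect_size=0.5, alpha=0.05, power=0.80)
    print(round(n_per_group))   # roughly 64 per group

    # A larger effect size (d = 0.8) lowers the required n, which is the
    # trade-off described in the list above.
    print(round(analysis.solve_power(effect_size=0.8, alpha=0.05, power=0.80)))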
Video: name and identify the four major types of validity

- Internal validity: is the independent variable responsible for the observed changes in the dependent variable? (objective)
  o Allows inference of causation; it is a function of the procedures and study design.
  o Confounds occur when two potentially effective variables are allowed to covary simultaneously; the confound could be just as responsible for the change in the dependent variable and shows up at the same time.
  o Requires a high level of constraint and adequately controlled concerns.
- Statistical validity: are the statistical tests accurate? (objective)
- External validity: do the results apply to the broader population? (subjective)
- Construct validity: is our theory the best explanation for the results? (subjective; the most subjective, built on an accumulation of evidence)

Name and define the major confounding variables / threats to validity:
- Validity check
- Instrumentation
- Maturation
- History
- Regression to the mean
- Testing/practice effects
- Selection bias
- Attrition / differential mortality
- Sequence effects
- Diffusion of treatments
- Compensatory rivalry
- Resentful demoralization

Why are some forms of study validity considered more objective and others more subjective? Judgments can be reduced to existing rules, and the rules are clearer and more agreed upon for internal validity than for external validity.

Be able to describe specific threats to validity in additional sample studies and explain how each study could be redesigned to reduce the threats.

Threats to validity explained (Movie 2):
- Maturation: passage of time, development.
- History: a competing event that offers another explanation; it covaries with the treatment.
- Testing/practice effects: being tested once changes the second testing; the measurement process itself changes when measures are taken at two points in time, and participants might remember the questions.
- Regression to the mean: a leveling effect; extreme scores rarely end up extreme again, while mid-range scores change little. It is rare to see high scores stay high and low scores stay low.
- Selection bias: the choice of participants is limited or nonrandom.
- Attrition (front end) / differential mortality (back end): you lose people (they move, no longer want to participate, die), and you may lose certain kinds of people, especially within a small environment.
- Diffusion of treatment: students in different groups talk to each other, so the control group gets the benefit of the treatment by virtue of hanging out; you may conclude there was no effect because there is no difference, since everyone effectively received the treatment through shared information.
- Compensatory rivalry: the control group, receiving no treatment, gets motivated to do better than the treatment group to show the treatment is not needed.
- Resentful demoralization: the opposite of compensatory rivalry; members of the control group, or those given the less desirable treatment, are upset and motivated to do worse.
- Differential attrition/mortality: the group being measured ends up different because certain people were lost.

Subject and experimenter effects:
- Participants may be aware they are in a study.
- Subject effects include demand characteristics and placebo effects.
- Experimenters are also active: they can introduce bias, and subjects may try to behave the way they think they are supposed to behave.
- Hawthorne effect: knowing you are in an experiment and reacting to that awareness (demand characteristics also occurred in the Hawthorne studies, along with feedback, pay, and production pressures).
- Outguessing the experimenter.
- Demand characteristics: participants respond to subtle cues about what is expected.
- Placebo effect: expectations that the treatment will work produce change.
- Experimenter effects: subtle biases in observation, recording, and measurement; unlike fraud, this behavior is outside our awareness.

Construct validity:
- Why did the study work? Did we manipulate what we intended to? Does the dependent measure get at what we say it does? Is our theory the best explanation for the results?
- Inadequate preoperational explication of constructs: did you think through the theory and definitions of the constructs before you measured?
- Mono-operation bias: the construct was operationalized in a single fashion (e.g., only self-report measures, only interviews), which does not adequately assess it.
- Use process measures if you are trying to link the treatment to the effect: did you manipulate what you said you did?
- Did the dependent variables intended to measure the same construct actually do so? Include alternative measures that should not be affected by the treatment, and ask whether there is too much overlap on irrelevant factors among your dependent measures.

Did my treatment cause the outcome? Chapters 8, 9, 13, 14

Observational designs do not involve manipulating the independent variable.

Behavioral categories:
- Define what is being recorded; make sure the categories are well defined and not ambiguous (cultural traditions may not be agreed on).
- Become familiar with the behaviors and make a list, do preliminary observations, and search the literature.

Recording methods:
- Frequency method: count how often a behavior occurs within a time period.
- Duration method: record how long a behavior lasts.
- Interval method: divide the observation period into time intervals and record whether a behavior occurs within each interval (intervals should be short enough for only one behavior to occur).

Complexity: how to make your observations.
- Time sampling: scan the group for a specific time, alternating between periods of observation and periods of recording.
- Individual sampling: observe a single subject over a given time period, then repeat for other individuals.
- Event sampling: observe only one behavior.

Recording:
- Use recording devices and have multiple observers watch the video independently.
- A hidden camera can be concealed better than you can hide yourself.
- Use audio recorders instead of taking notes: your eyes stay focused on the subject and it is faster. Disadvantage: recording equipment may disturb subjects.

Reliability of observations:
- Disagreement may occur if you have not clearly defined the behavioral categories.
- Interrater reliability provides an empirical index of observer agreement; establishing it helps ensure observations are accurate and reproducible.
- The simplest way to assess interrater reliability is to evaluate percent agreement.
  o It should be as high as possible; 70% is considered acceptable.
  o If agreement is defined as an exact match, percent agreement underestimates interrater agreement.
  o It gives only a raw estimate of agreement; when chance agreement is extremely high, percent agreement overestimates true agreement.
- Cohen's kappa assesses agreement relative to the amount that would be expected by chance. You need to determine the proportion of actual agreement and the agreement expected by chance (a computation sketch follows this section).
  o Step 1: tabulate the ratings in a confusion matrix.
  o Step 2: compute the proportion of actual agreement.
  o Step 3: find the proportion of expected agreement by multiplying the corresponding row and column totals.
  o Any value of .70 or greater indicates acceptable reliability.
- Pearson product-moment correlation: if observers agree, Pearson r will be strong and positive. However, observers can be highly correlated even when they disagree, as long as the magnitudes of their recorded scores increase and decrease similarly.
- Intraclass correlation coefficient: used for the reliability of observations scaled on an interval or ratio scale of measurement.
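A minimal sketch of percent agreement and Cohen's kappa for two observers, using hypothetical behavior codes. The chance-agreement term plays the role of the row-by-column totals from the confusion matrix:

    # Two observers' codes for the same 10 behavior intervals (hypothetical data).
    rater_a = ["play", "aggress", "play", "idle", "play", "idle", "aggress", "play", "idle", "play"]
    rater_b = ["play", "aggress", "idle", "idle", "play", "idle", "play", "play", "idle", "play"]

    n = len(rater_a)

    # Percent agreement: proportion of intervals coded identically.
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Chance agreement: for each category, multiply the two raters' base rates.
    categories = set(rater_a) | set(rater_b)
    p_chance = sum((rater_a.count(c) / n) * (rater_b.count(c) / n) for c in categories)

    # Cohen's kappa corrects observed agreement for chance agreement.
    kappa = (p_observed - p_chance) / (1 - p_chance)
    print(f"percent agreement = {p_observed:.2f}, kappa = {kappa:.2f}")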
Observer bias:
- When observers know the goal of a study or the hypothesis, their observations are influenced by this information; use a blind observer.
- Bias also occurs when observers interpret what they see rather than simply recording behavior.

Quantitative and qualitative data:
- Quantitative data are expressed numerically.
- Qualitative data are written records of observed behavior; you cannot apply standard descriptive and inferential statistics to such data.

Naturalistic observation:
- Unobtrusive observations do not alter the natural behaviors of subjects: be hidden, or habituate the subjects to your presence (and to any video equipment).
- Captures behavior in the real world, so it has high external validity.
- You cannot use naturalistic observation to investigate the underlying causes of the behaviors.
- It requires you to be there, engaged, the entire time.

Ethnography:
- The researcher becomes immersed in the behavioral or social system being studied.
- Used primarily to study and describe the functioning of cultures through social interactions and expression between people and groups; done in a field setting.
- Participant observation: the researcher is part of the group. Nonparticipant observation: the researcher observes as a nonmember of the group.
- Minimize people altering their behavior by training participant observers not to interfere, or by using observers who are blind to the hypotheses.
- Remove the problem of reactivity by observing covertly. A covert participant is part of the group but does not disclose his or her status as a researcher.
- Gaining access may be hard: you need to get past the gatekeepers, the protectors of the group. Another way into the group is to use guides and informants who convince the gatekeepers that your aims are legitimate and the study is worthwhile.
- The first step in analyzing ethnographic data is an initial reading of the field notes; the second step is to code any systematic patterns.
- Ethnography is purely descriptive in nature; it cannot explain why.

Sociometry:
- Identifying and measuring interpersonal relationships within a group.
- Sociometry can be used as the sole research tool to map interpersonal relationships; a sociogram maps the choices of friends.

Case history:
- Observation of a single case; not an experimental design but a demonstration.
- No manipulation of the IV, so causes cannot be determined.

Archival research:
- A nonexperimental strategy that involves studying existing records; all factors pertaining to observational research also apply to archival research.
- You must gain access to the archived material, and a practical concern is the completeness of the records.
- Purely descriptive: it may identify interesting trends or correlations but cannot establish causal relationships.

Content analysis:
- Analyzing a written or spoken record for the occurrence of categories or events (pauses, negative comments, behaviors, etc.); court proceedings are one example.
- Usually uses archival sources, and all considerations for observational research apply; it is an observational technique.
- Should be objective (a clear set of rules), systematic (information assigned to categories, including material both for and against your personal position), and should have generality (fit within a theoretical, empirical, or applied context).
- When performing content analysis you need clear operational definitions of terms. Materials need to be examined before you develop categories. The recording unit is the element of the material that you are going to record; the context unit is the context within which the word was used.
- Who will do the analysis? Coders may be affected by bias, so use a blind observer and more than one observer to evaluate interrater reliability. Content analysis of a biased sample may produce biased results.
- Content analysis is purely descriptive. Durability can be a problem: findings can be invalidated over time.

Meta-analysis:
- The conclusions of a narrative review may not accurately reflect the strength of the relationships examined, so a meta-analysis is used: a set of statistical procedures that allow you to combine or compare results across studies. It is a form of archival research.
- A meta-analysis of meta-analyses is called a second-order meta-analysis.
- Three steps: identify relevant variables, locate relevant research to review, and conduct the proper meta-analysis.
- Step 1: identify the variables, focusing only on those related to your topic. What you record is driven by the research question, and the information needed depends on the meta-analytic technique you use.
- Step 2: locate the research to use. The file drawer phenomenon (unpublished null results) inflates the Type I error rate. To deal with it, attempt to uncover studies that never reached print, and estimate the extent of its impact on your analysis by determining the number of studies that would have to be in the file drawer before serious biasing takes place.
- Step 3: apply the technique. The first technique compares studies; doing a meta-analysis comparing studies is analogous to conducting an experiment using human or animal subjects. The second technique combines studies to determine the average effect of a variable across studies.
- Comparing effect sizes is more desirable than looking at p values, because a p value only tells you the likelihood of making a Type I error.
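A minimal sketch of combining studies by averaging a standardized effect size (Cohen's d), using made-up group means and a simple sample-size weighting; published meta-analyses typically use inverse-variance weights instead:

    # Each tuple: (mean_treatment, mean_control, pooled_sd, total_n) - hypothetical values.
    studies = [
        (105.0, 100.0, 10.0, 40),
        (108.0, 100.0, 12.0, 60),
        (102.0, 100.0,  9.0, 30),
    ]

    ds, weights = [], []
    for m_t, m_c, sd, n in studies:
        d = (m_t - m_c) / sd          # Cohen's d: standardized mean difference
        ds.append(d)
        weights.append(n)             # weight each study by its sample size

    mean_d = sum(d * w for d, w in zip(ds, weights)) / sum(weights)
    print([round(d, 2) for d in ds], round(mean_d, 2))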
Drawbacks of meta-analysis:
- The quality of published research may vary, and research in new areas is often rejected from refereed journals.
- Quality ratings should be made twice: once after reading the method section alone, then after reading the method and results sections together.
- A common criticism is that it is difficult to understand how studies with widely varying materials, measures, and methods can be compared; the core issue is whether or not the differing methods are related to different effect sizes.

Chapter 9

In a field survey you directly ask people about their behavior, and from the answers you can draw inferences about the factors underlying that behavior. One major ethical concern is whether and how you will maintain the anonymity of your participants and the confidentiality of their responses.

Designing a questionnaire:
- Clearly define the topic of your study.
- Keep it focused: too much in a survey, or a survey that is too long, can confuse and overburden respondents.
- Demographics are used as predictor variables; what you specifically measure (e.g., voter preference) is the criterion variable.
- Administer your questionnaire to a pilot group to make sure it is reliable and valid.

Open-ended questions:
- The respondent answers in his or her own words.
- Drawback: respondents may not understand exactly what you are looking for, or may omit some answers; open-ended responses can also make summarizing the data difficult.

Restricted items:
- Provide a limited number of specific response alternatives, which controls the participant's range of responses.
- Easier to summarize and analyze, but not as rich in information.

Partially open-ended items:
- Resemble restricted items but provide an additional "other" category and an opportunity to give answers not listed.

Item formatting:
- Help respondents separate the question from the response categories that follow.
- Make any special instructions intended to clarify a question a part of the question itself.
- Provide check boxes, blank spaces, or numbers, and place all alternatives in a single column.

Rating scales:
- Scales with fewer than 10 points are frequently used, but you should not go below 5 points.
- Labeled end points (anchors) keep interpretation from drifting; labeling all points provides more accurate information, and a reasonable compromise is to label the ends and the middle.
- Keep in mind both the psychological phenomenon underlying the scale and the scale itself.
- Likert scale: an agreement-disagreement rating scale.

Assembling your questionnaire:
- Use a coherent, visually pleasing format.
- Demographic items should not be placed first.
- Early questions should be interesting and engaging, apply to everybody, and be easy to answer.
- Maintain continuity and organization; question order affects answers only when people are poorly educated.
- Place objectionable questions after less objectionable ones.
- Use graphics; verbal and graphical considerations relate to how your questions are worded and presented.

Mail surveys:
- Mailed directly to the participant.
- Nonresponse bias: some people fail to complete the questionnaire. Develop strategies to increase the return rate: use multiple contacts and include a small token of appreciation (less money tends to work better).
- Lower cost; consider this method first.

Internet surveys:
- Distributed via email or the web; keep them short and simple.
- Quick and easy, and can yield a large data set.
- Internet users may not be representative of the general population.

Telephone surveys:
- Contact participants by phone and have an interviewer (or an automated system) ask the questions, with responses given by voice or touch-tone telephone.

Group-administered surveys:
- People may participate because little effort is required, but surveys may not be treated as seriously when completed in a group.
- Anonymity cannot be ensured, and the right to decline participation may be harder to exercise.

Face-to-face interviews:
- You speak with participants directly.
- In a structured interview you prepare the questions in advance; in an unstructured interview you have a general idea of the topics but no fixed sequence of questions.
- Structured: all participants are asked the same things in the same order, which makes responses easier to summarize and analyze, but a highly structured interview may miss some important information.
- Unstructured interviews may be hard to code later on.
- Experimenter bias and demand characteristics can become a problem; run a pilot test.

Reliability of a questionnaire

Repeated measures (test-retest):
- Administer the questionnaire once, allow time to pass, then administer it again; consider how long to wait.
- Too short an interval may result in participants remembering the questions and the answers they gave, which leads to an artificially high level of test-retest reliability; wait too long and reliability could be low.
- Test-retest may be problematic when you are measuring ideas that fluctuate with time, for issues where individuals are likely to remember their answers from the first testing, and for questionnaires that are long and boring.

Parallel forms:
- The forms must be equivalent: the same number of items and the same response format.
- Eliminates the possibility that rapidly changing attitudes will result in low reliability.

Single administration:
- Split-half: the questionnaire is split in half and two scores are computed and correlated (see the sketch after this section).
  o Works best when the questionnaire is limited to a specific area.
  o Each score is based on a limited set of items, which can reduce reliability.
  o Do not use it if it is not clear how the splitting should be done; some use an odd-even split.
  o Apply the Kuder-Richardson formula; the higher the number, the greater the reliability (.75 is moderate).
  o For Likert-format items, coefficient alpha is used.

Increasing reliability:
- Increase the number of items on your questionnaire.
- Standardize administration procedures.
- Score carefully.
- Use clear, well-written, and appropriate items.
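A minimal sketch of the odd-even split-half approach described above, with the Spearman-Brown correction (covered later in these notes) applied to estimate full-length reliability; the item scores are hypothetical:

    import numpy as np

    items = np.array([          # rows = respondents, columns = 10 questionnaire items
        [4, 5, 4, 4, 5, 3, 4, 5, 4, 4],
        [2, 1, 2, 2, 1, 2, 3, 1, 2, 2],
        [3, 3, 4, 3, 3, 3, 3, 4, 3, 3],
        [5, 4, 5, 5, 4, 5, 5, 4, 5, 5],
        [1, 2, 1, 1, 2, 1, 1, 2, 1, 2],
    ])

    odd_half = items[:, 0::2].sum(axis=1)    # items 1, 3, 5, ...
    even_half = items[:, 1::2].sum(axis=1)   # items 2, 4, 6, ...

    r_half = np.corrcoef(odd_half, even_half)[0, 1]   # correlation between the halves
    r_full = 2 * r_half / (1 + r_half)                # Spearman-Brown correction
    print(round(r_half, 2), round(r_full, 2))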
Validity of the questionnaire:
- Content validity: assesses whether the questions cover the range of behaviors normally associated with the construct.
- Construct validity: established by showing that the questionnaire's results agree with predictions based on theory.
- Criterion-related validity: established by correlating the results with those of another established measure.
  o Concurrent validity: a measure of the same dimension is administered at the same time.
  o Predictive validity: the results are correlated with some behavior that would be expected to occur later.

Sampling:
- A representative sample closely matches the characteristics of the population.
- Random sampling: every member of the population has an equal chance of appearing in the sample.
- Simple random sampling: randomly selecting a certain number of individuals from the population. Randomness does not guarantee a representative sample; combat this by selecting a large sample.
- To obtain a more representative sample, divide the population into segments and select an equal-sized sample from each segment (stratified sampling).
- Proportionate sampling: the proportions of people in the population's subgroups are reflected in your sample.
- Systematic sampling: often used in conjunction with stratified sampling; select every xth element after a random start.
- Cluster sampling: the basic sampling unit is a group of participants rather than the individual participant; saves time and is cost effective.
  o Multistage sampling: identify large clusters, randomly select among them, then sample within the selected clusters.

Sample size:
- An economic sample includes enough participants to ensure a valid survey and no more.
- Sample size depends on the amount of acceptable error and the expected magnitude of the population proportions.
- The deviation of sample characteristics from those of the population is called sampling error.
- Look at the literature to see what margin of error was used, consider the magnitude of the differences you expect to find, and design a small pilot study.

Chapter 13

Data layout:
- Unstacked format: create separate columns for the scores from each treatment.
- Stacked format: all scores go in a single dependent-variable column, with another column identifying the treatment level for each score.
- A quantitative independent variable can be entered as its numeric value, but a qualitative independent variable must be assigned a number for each level; this is called dummy coding.
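A minimal sketch of a stacked layout with dummy coding, plus the corresponding unstacked view, assuming the pandas library and made-up scores:

    import pandas as pd

    # Stacked layout: one row per observation, a column naming the treatment
    # level, and a column for the dependent variable.
    data = pd.DataFrame({
        "treatment": ["drug", "drug", "placebo", "placebo", "control", "control"],
        "score":     [14,      16,     11,        10,         9,         8],
    })

    # Dummy coding: each level of the qualitative IV becomes its own 0/1 column.
    dummies = pd.get_dummies(data["treatment"], prefix="treat")
    stacked = pd.concat([data, dummies], axis=1)

    # Unstacked layout: a separate column (here, list) of scores per treatment level.
    unstacked = {level: group["score"].tolist() for level, group in data.groupby("treatment")}

    print(stacked)
    print(unstacked)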
Grouped data:
- Taking an average yields one score that characterizes an entire distribution, but it may not represent the performance of any individual subject.
- A curve resulting from plotting averaged data may not reflect the true nature of the psychological phenomenon being studied.

Individual data:
- Makes the most sense when it reflects the effect of the independent variable more faithfully than data averaged over the entire group.
- Look at both the grouped and the individual data.

Graphing:
- A graph represents data in a two-dimensional space: the horizontal (x) axis carries the independent variable and the vertical (y) axis carries the dependent variable.

Bar graphs (a plotting sketch follows this section):
- The length of each bar represents the value of the dependent variable.
- Error bars show the precision of the estimate: the variability of scores around it.
- A bar graph is the best method of graphing when your independent variable is categorical (the x axis is categorical/qualitative).

Line graphs:
- Work when the x axis is continuous and quantitative.
- Positively accelerated: the curve is flat at first and becomes progressively steeper along the x axis.
- Negatively accelerated: the curve is steep at first, becomes progressively flatter, and levels off at a maximum or minimum.
- Monotonic: the function is uniformly increasing or decreasing; nonmonotonic: the function contains reversals of direction.

Scatter plots:
- Used with the correlational strategy; include the line of best fit, its equation, and the coefficient of correlation.
- Helpful when you calculate a measure of correlation.

Pie graphs:
- For data in the form of proportions or percentages.
- If a piece is pulled out, it is called an exploded pie graph, and it emphasizes the proportion represented by that slice.

Frequency distribution:
- A set of mutually exclusive categories into which you sort the actual values observed in your data, together with a count of the number of data values falling into each category.

Histograms:
- Resemble bar graphs, but the bars are drawn touching each other.
- The y axis shows a frequency count.

Stemplots:
- Simplify the job of displaying distributions; they are easy to construct and have the advantage over histograms and tables of preserving all the actual values present in the data.
- Inherently create class widths of ten.
- Not useful when the data sets become too large.
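A minimal sketch of a bar graph with error bars for a categorical independent variable, assuming matplotlib and made-up scores; the error bars here show the standard error of each group mean:

    import numpy as np
    import matplotlib.pyplot as plt

    groups = ["Control", "Low dose", "High dose"]
    scores = [np.array([8, 9, 7, 10, 9]),
              np.array([11, 12, 10, 13, 11]),
              np.array([15, 14, 16, 13, 15])]

    means = [g.mean() for g in scores]
    sems = [g.std(ddof=1) / np.sqrt(len(g)) for g in scores]   # standard errors

    plt.bar(groups, means, yerr=sems, capsize=4)
    plt.xlabel("Treatment condition (independent variable)")
    plt.ylabel("Mean score (dependent variable)")
    plt.show()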
Skewed distributions:
- A skewed distribution has a long tail trailing off in one direction and a short tail in the other.
- Positively skewed: the long tail goes to the right; negatively skewed: the long tail goes off to the left.
- A normal distribution is symmetric and hill shaped (the bell curve).

Measures of center:
- Mode: the most frequent score; a distribution with two modes is bimodal. Appropriate for nominal and ordinal scales.
- Median: the middle score when scores are ordered from lowest to highest; with two middle scores, take their average. Appropriate for ordinal scales.
- Mean: sensitive to the distance between scores; appropriate for interval or ratio scales.
  o If the distribution is normal, use the mean as the measure of center.
  o If it is negatively skewed, the mean underestimates the center; if positively skewed, the mean overestimates the center.
  o Neither the mean nor the median accurately represents the center if your distribution is bimodal.

Measures of spread:
- Range: the simplest and least informative; it does not take into account the magnitude of the scores between the extremes and is very sensitive to outliers.
- Interquartile range: order the scores and divide them into four equal parts; less sensitive to extreme scores.
- Variance: the average squared deviation from the mean.
- Standard deviation: the most popular measure of spread.

Five-number summary:
- The minimum, the 1st quartile, the median, the 3rd quartile, and the maximum.
- The interquartile range is Q3 - Q1.

Association and regression:
- Pearson product-moment correlation coefficient (Pearson r): requires measures scaled on interval or ratio scales.
  o Pearson r ranges from +1 through 0 to -1. A positive correlation represents a direct relationship; a negative correlation indicates an inverse relationship; a correlation of 0 means no relationship exists.
  o The magnitude of the correlation coefficient tells you the degree of linear relationship; both +1 and -1 represent perfect linear relationships.
  o A parabola-shaped scatter indicates a curvilinear relationship.
- Point-biserial correlation: applies when one variable is continuous and the other dichotomous; the dichotomous variable is dummy coded. Its magnitude partly depends on the proportion of participants falling into each of the dichotomous categories; if the numbers in each category are not equal, the maximum attainable value is less than +1 or -1.
- Spearman rank-order correlation: used either when your data are scaled on an ordinal scale or when you want to determine whether the relationship between variables is monotonic.
- Phi coefficient: used when both variables being correlated are measured on a dichotomous scale.
- Bivariate regression: finds the straight line that best fits the data plotted on a scatter plot; the best-fitting line is the one that minimizes the sum of the squared distances between each data point and the line, as measured along the y axis.
- The coefficient of determination is the square of the correlation coefficient.
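A minimal sketch tying several of these ideas together (five-number summary, IQR, Pearson r, the least-squares line of best fit, and the coefficient of determination), assuming numpy and scipy and made-up predictor/criterion values:

    import numpy as np
    from scipy import stats

    x = np.array([2, 4, 5, 7, 8, 10, 11, 13])        # predictor variable
    y = np.array([10, 14, 15, 19, 22, 24, 27, 30])   # criterion variable

    # Five-number summary and interquartile range for the criterion scores.
    q1, median, q3 = np.percentile(y, [25, 50, 75])
    print("5-number summary:", y.min(), q1, median, q3, y.max(), "IQR =", q3 - q1)

    # Pearson r and the least-squares regression line.
    r, p = stats.pearsonr(x, y)
    slope, intercept, r_value, p_value, stderr = stats.linregress(x, y)
    print(f"r = {r:.2f}, r^2 (coefficient of determination) = {r**2:.2f}")
    print(f"best-fitting line: y = {slope:.2f}x + {intercept:.2f}")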
Review questions:
- According to the text, you can increase the reliability of your questionnaire by standardizing administration procedures and by writing clear, appropriate questions.
- In cluster sampling, you identify naturally occurring groups (for example, classes in a school) and sample some of those groups.
- If you decide to assess reliability with multiple tests and use alternate forms of your questionnaire, you would use parallel forms to assess reliability.
- Labeling each point on a scale versus labeling only the end points usually does not significantly affect the responses participants give to a question.
- Ways of assessing the reliability of a questionnaire: administer the same questionnaire (or a parallel form) to the same participants more than once, or administer the questionnaire once and assess internal consistency.
- A drawback of an open-ended item is that the responses obtained may be difficult to code and analyze.
- A sample consisting of participants who are not representative of the population is a biased sample.
- The advantage of restricted items over open-ended items is that they provide more control over the range of responses.
- Dr. Loo administers a long and boring questionnaire concerning attitudes that tend to fluctuate over time; when assessing its reliability, he should avoid using test-retest.
- To write good survey items: use simple words rather than complex words; make the stem of a question short and easy to understand but use complete sentences; avoid vague questions in favor of more precise ones.
- You would compare two studies in a meta-analysis if you wanted to find out whether the studies produced significantly different results.
- Cohen's kappa is used to evaluate interrater reliability.
- In meta-analysis, the file drawer phenomenon inflates the probability of making a Type I error.
- The text proposes that the Pearson product-moment correlation can be used as an index of interrater reliability.
- A blind observer is one who is unaware of the hypotheses being tested in an observational study.
- If you can identify one behavior as more important than another in an observational study, you can then use event sampling.
- When comparing studies, looking at effect sizes is the preferred technique.
- In an observational study of patients in a psychiatric ward, alternating 5-minute periods of observation with 5-minute periods of recording behavior is an example of time sampling.
- The unit of analysis in a meta-analysis should be how variable X affects variable Y.
- The major advantage of using grouped data is convenience: a single score (e.g., the mean) can be calculated to represent a group.
- 1 - r^2 gives you the coefficient of nondetermination.
- A histogram presents a frequency distribution graphically as a series of bars representing the classes, whose heights indicate the number of cases falling into each class.
- Examining individual scores makes the most sense when you have repeated measures of the same behavior.
- It is a good idea to explore your data using EDA techniques before conducting any statistical tests because EDA can reveal defects in your data that may warrant corrective action before the inferential analysis, can help you determine which summary statistics are appropriate for your data, and may reveal unsuspected influences in your data.
- An advantage of the stemplot over the histogram is that only the stemplot preserves the actual scores.
- Dummy coding variables involves assigning numerical values (for example, 0 and 1) to categorical variables.
- The standard error of estimate provides an estimate of the amount of error in prediction.
- In a stemplot of scores ranging from 11 to 83, a score of 42 would be located at a stem value of 4.
- If your treatment did not have an effect on your dependent variable, you can assume that the means representing each group in your experiment are independent estimates of a single population mean.
- Data transformations are used to adjust data to meet the assumptions of statistical tests.
- Serious violations of one or more of the assumptions underlying parametric statistics may lead you to commit a Type I error more or less often than the stated alpha level.
- If your dependent variable is a dichotomous yes/no response, you can compare the proportion of subjects saying yes in the experimental group to the proportion saying yes in the control group using the z test for the difference between two proportions.
- A one-tailed test is used if you are interested in whether the obtained value of the statistic falls in one particular tail of the sampling distribution for that statistic.
- When an interaction is present, main effects are not interpreted, because your independent variables do not have simple effects on your dependent variable.
- When the effect of one independent variable on your dependent variable changes over the levels of a second, an interaction is present.
- If you want the average of 10 scores to equal 100, you can choose any numbers you want for 9 of the scores, but the 10th score will have to be whatever number makes the average equal 100; thus the degrees of freedom equal 9.
- Because of probability pyramiding, you should not conduct too many post hoc comparisons, even if you predicted particular differences between means.
- At a given significance level, a one-tailed test is more likely to detect real differences between means than a two-tailed test.

Chapter 5

The four scales of measurement typically discussed in psychological statistics:
- Nominal: the lowest scale; numbers are assigned to categories, and which number goes with which category is completely arbitrary (e.g., gender). Gives identity; you can count cases.
- Ordinal: has magnitude as well as identity; it can tell us whether one case has more or less of a property, but the distances between values are not equal and are unknown. No arithmetic is permitted.
- Interval: equal distances between values, but no true zero point; the number 0 is arbitrary (e.g., IQ tests, temperature). You can calculate means and SDs; common in nomothetic research.
- Ratio: has a true zero; the highest level, permitting all mathematical operations. Zero means absence of the property; these are very strong variables.
- Some scales are treated as "approximately interval."

Properties of the abstract number system:
- Identity: each number has a particular meaning.
- Magnitude: numbers have an inherent order from smaller to larger.
- Equal intervals: the differences between adjacent numbers are the same across the scale.
- Absolute/true zero: the zero point represents the absence of the property being measured.

Interval scales have the properties of identity, magnitude, and equal intervals. Equal intervals allow us to say how many units apart two cases are, but interval scales do not have a true zero point; the number 0 is arbitrary.

Ratio scales have all the properties of the abstract number system: identity, magnitude, equal intervals, and an absolute/true zero, which allows us to say how many times greater one case is than another. Scales with an absolute zero and equal intervals are considered ratio scales.

Ordinal scales can be split into:
- Ranked preferences: they do not tell us how much, just more or less; the things we like more are ranked higher.
- Assigned ranks: used to select a smaller subset or to show an individual's relative placement in a larger group.
Both are considered ordinal because they have the properties of identity and magnitude.
Interval scales can have a zero point, but it does not mean the absence of whatever is being measured, so you cannot make statements involving multiplication or division.

Ratio scales:
- Have all the properties of the abstract number system: identity, magnitude, equal intervals, and an absolute/true zero.
- All mathematical operations are permitted; 0 represents the absence of the behavior.

Likert-type ratings:
- Used in surveys where respondents rate how much they agree or disagree.
- Sometimes treated as ordinal scales, sometimes as interval or approximately equal-interval.
- They have the properties of identity and order: identity lets us know whether the respondent agrees or disagrees, and order means each number represents a rating that is more or less than the others.
- Psychologists disagree about whether the intervals between scale points are equal.
- We create measures by adding up the individual Likert-type ratings or calculating an average; a sum or total of item responses gives a broader range of scores than the individual ratings. What scale this yields is controversial: the ordinal view says we cannot assume equal gaps between points, while the interval view says a respondent looking at the scale is making equal-interval judgments. The debate is mainly between ordinal and interval.

Scale of measurement matters because it determines the mathematical operations that are permitted for those variables, and the permitted operations determine which statistics can be applied to the data. How a variable is measured also determines its precision.

Reliability = dependability. Validity means the study (or measure) does what it should do: it measures what we said it measures, with the right test and the right score; an invalid measure is not truthful or correct. Good assessment tools support rejecting the null hypothesis or accepting the research hypothesis.

Three types of reliability:
- Interrater: two observers watching the same behavior should produce scores that agree with each other.
- Internal consistency: people should respond in a consistent way to all of the questions.
- Test-retest: if you give people a test more than once, they should get about the same score each time.

The Spearman-Brown split-half coefficient: split the test into two random halves; if the summed scale were reliable, you would expect the two halves to correlate close to 1.0.

Cronbach's alpha: rather than a single split, it takes into account all possible split halves; if alpha is close to 1.0, your test is reliable.

How do we establish validity?
- Internal: inside the experiment; the study is well designed and free of confounds.
- External: outside the experiment; the results generalize.
- Construct: think "concept"; we are manipulating and measuring concepts, and the measure must truly represent what we had in mind. Established by correlating instruments with similar measures, predicting a specific behavior or criterion, and being successfully used in a wide range of studies.

Operational definition: specifies how observations are made, what is observed, and how behavior is recorded. The more complex the behavior being recorded, the more difficult it is to achieve good interrater reliability.

Face-to-face interviews: best when you need to establish rapport with your participant. Disadvantages: the social situation created might bias the participants' responses, and they are expensive to administer.

Telephone interviews: offer some social distance, so it might be easier for participants to answer a sensitive question.
You can still answer participants' questions, and the cost is less than face-to-face. Disadvantage: it is easier for people to deny a request over the phone.

Survey method: offers privacy and low cost; participants can choose when it is convenient to sit down and answer the survey. Disadvantages: participants cannot ask questions, and response rates are low.

Three types of survey questions: open-ended, closed-ended, and partially closed.
- Open-ended: important for complete answers, but they require more effort from the participant.
- Closed-ended: multiple choice, ranks, or Likert-type ratings; easy for participants and require the least effort, but not appropriate when the expected responses are too complex.
- Partially closed: used when you have a good idea of the range of expected responses but want to give participants the opportunity to give an answer that is rare or that you did not consider.

Writing survey items: make sure each item addresses only one issue, and avoid bias.

A high Cronbach's alpha means that people who score high on one item tend to score high on the others. A high correlation between administrations means that the same or similar responses were given both times and the instrument or question is relatively stable.

The measurement process

Major task: represent variables numerically. Begin with a conceptual definition (theory) and create an operational definition.

Major considerations: types of variables, types of scales, reliability, and validity.

Conceptual variable: a theoretical construct, an abstract idea that is not tangible (e.g., self-esteem). A conceptual definition is not, in terms of measurement, an operational definition.

Operational definition: the procedure by which the researcher measures the construct and/or manipulates the variable.
- Bridgman: define variables in terms of the operations needed to produce them. What is it that will indicate the construct?
- In research, every variable needs to be defined; this requires thinking in behavioral terms: what do we have to do to know we have observed something of interest?
- The more careful and complete the operational definition, the more precise the measurement of the variable will be.

True score: the true part of the observed score (the score under perfect measurement).
Error score: the difference between the observed score and the true score; error shows up as measurement "fluctuation."
- Method error: due to characteristics of the test or the testing situation; it is not random but reflects systematic error (e.g., a scale that is always off by 5).
- Trait error: due to individual characteristics, the random aspects of people; the unique characteristics or experiences of the subject. (Take-home message.)

Increasing reliability and decreasing error:
- Increase the number of items.
- Eliminate unclear questions.
- Use both easy and difficult questions.
- Minimize the effects of external events.
- Standardize instructions, with clear definitions of the behaviors and events to be rated.
- Maintain consistent scoring procedures and provide feedback about discrepancies.
- Decrease response-set biases such as social desirability.

A reliable score is one that is relatively free from measurement fluctuation. Reliability is measured using correlation coefficients (or percentages) and is referred to as an r.

Reliability coefficients:
- Indicate relative consistency.
- r can range from -1.0 to 1.0.
- Cronbach's coefficient alpha and KR-20 (internal consistency measures) range from 0 to 1.0; it is statistically possible for them to be less than 0, but that is highly problematic.
- Percentages range from 0 to 100%.

Types of reliability:
- Test-retest: a measure of stability; administer the same test at two different times and correlate the scores from the two administrations.
- Parallel forms: a measure of equivalence; give two separate forms of the test to the same people and correlate the scores from the two forms (the correlation should be rather high).
- Interrater reliability: a measure of agreement; have raters rate behavior and then determine the amount of agreement between them (percentage of agreements).
- Internal consistency: a measure of how consistently each item measures the same underlying construct; correlate performance on each item with overall performance across participants (Cronbach's coefficient alpha).

Cronbach's coefficient alpha uses the item variances and the variance of the sum across items (a computation sketch follows this section). It looks like a correlation coefficient: it should fall between 0 and 1 (negative values are problematic), and values in the upper .7 range or above are usually desired. Adding items that people answer consistently pushes the coefficient up; adding items that produce an inconsistent set of responses pulls reliability down.

Rule of thumb:
- A minimum of .70 for research; a higher value is needed when a test is used in an applied setting for clinical decisions.
- Why is an arbitrary cutoff problematic? Do the math in the reliability formula: it is up to the user to determine how much error variance to tolerate, and although you can make a scale more reliable by adding redundant items, the real value added may be minimal. Add good, different items; you do not want to overload participants just to get a good-looking score.

Split-half reliability: split an existing measure in half (e.g., even vs. odd items) and compare the two halves. Parallel forms use two stand-alone versions. Cronbach's alpha is an internal-consistency approach.

Validity:
- Scores from a valid test should correlate with other similar variables and should predict the variables they are supposed to predict.
- Validity refers to the accumulating evidence that provides a scientific basis for score meanings and interpretations: the accuracy of the interpretation of scores.
- Validity refers to the test's results or scores, not the test itself.
- Validity ranges from low to high and must be interpreted within the testing context (people, place, time).

Four kinds of validity:
- Face
- Content
- Criterion (concurrent and predictive)
- Construct

Content validity: a measure of how well the items represent the entire universe of items; ask an expert whether the items assess what you want them to.

Criterion validity:
- Concurrent: a measure of how well a test estimates a criterion; select a criterion and correlate scores on the test with scores on the criterion in the present.
- Predictive: a measure of how well a test predicts a criterion; select a criterion and correlate test scores with scores on the criterion in the future.

Construct validity: a measure of how well a test assesses some underlying construct.
- Assess the underlying construct on which the test is based and correlate these scores with the test scores.
- Correlate the new test with an established test.
- Show that people with and without certain traits score differently.
- Determine whether the tasks required on the test are consistent with the theory guiding test development.

Multitrait-multimethod matrix: displays the associations between indicators (measures and methods) aimed at a given trait or construct; everything above the diagonal mirrors everything below. Look for convergence and divergence:
- Convergent validity: different methods measuring the same construct yield similar results.
- Discriminant validity: different methods measuring different constructs yield different results (less convergence); different methods should not be highly related.

A valid test must be reliable, but a reliable test need not be valid.
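A minimal sketch of Cronbach's alpha computed from its standard formula, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores), assuming numpy and a small hypothetical item-by-respondent matrix:

    import numpy as np

    items = np.array([          # rows = respondents, columns = items (hypothetical)
        [4, 5, 4, 4, 5],
        [2, 1, 2, 2, 1],
        [3, 3, 4, 3, 3],
        [5, 4, 5, 5, 4],
        [1, 2, 1, 1, 2],
    ])

    k = items.shape[1]                                  # number of items
    sum_item_vars = items.var(axis=0, ddof=1).sum()     # sum of item variances
    total_var = items.sum(axis=1).var(ddof=1)           # variance of total scores

    alpha = (k / (k - 1)) * (1 - sum_item_vars / total_var)
    print(round(alpha, 2))    # values around .70 or higher are usually considered acceptable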
Selecting measures
Psychometric characteristics
- type of reliability
- type of validity
subject and measure considerations
- was the measure developed/validated on comparable samples?
- reading level, motor performance, age-related demands
- social desirability / response-set issues
Novel measures
- unknown or not-yet-established psychometric properties
- gather data on the measure with a pilot sample first
- problematic psychometrics undermine study conclusions and inferences
Chapter 6
We use inferential statistics to make an inference from the sample back to the population. The validity of that inference depends on how representative the sample (the subset) is of the population from which we have drawn it.
Sampling procedures
1. probability sampling: includes a random-chance component
2. non-probability sampling
The random component gives us confidence that our sample is a reasonably good representation of the population.
Probability sampling
- random or chance based
- every person has an equal and independent chance of being selected
non-independent sampling
- e.g., participants referred by a friend
- such participants tend to share similar values
different kinds of probability sampling (a brief sketch contrasting simple, systematic, and proportionate stratified sampling appears at the end of this section)
- simple: straightforward; used when the population is homogeneous for the characteristic
- systematic: not truly random; every nth person from the frame
- stratified: subgroups differ substantially; treats them as if they were two or more separate populations and then randomly samples within each
- proportionate: subgroups differ in size; stratifies the population into relevant subgroups, then randomly samples within each subgroup in numbers equal to its proportion in the population
- cluster: used when it is impossible or impractical to identify every person in the population
- multistage: the most sophisticated; used in large studies needing a representative national sample (zip code…street…address)
Non-probability sampling strategies
- haphazard: introduces bias into the study and should be avoided ("man on the street" technique)
- convenience: selects a particular group but does not sample all of a population
- purposive: targets a particular group of people with characteristics that are hard to find in the general population
Non-probability sampling occurs when it is practically impossible to use probability sampling: time and expense constraints, or the frequency of the behavior or characteristic of interest is so low in the population that a more targeted strategy is needed.
All probability sampling requires a sampling frame: a population defined and available through records.
Lecture
Population: all possible individuals making up a group of interest in a study
Sample: a relatively small number of individuals drawn from a population for inclusion in a study
Unrepresentative sample: sampling error
Subpopulation: a small segment of the defined population
Generalization: the ability to apply findings from a sample to a larger population
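A minimal sketch contrasting simple random, systematic, and proportionate stratified sampling. The sampling frame of 100 student IDs and the two strata (60% vs. 40% of the frame) are hypothetical, chosen only to make the proportions easy to follow.

import random

frame = list(range(1, 101))   # hypothetical sampling frame of 100 IDs
lower = frame[:60]            # hypothetical stratum 1 (60% of the population)
upper = frame[60:]            # hypothetical stratum 2 (40% of the population)

# Simple random sampling: every member has an equal, independent chance.
simple = random.sample(frame, 10)

# Systematic sampling: every nth person from the frame (not truly random).
n = len(frame) // 10
systematic = frame[::n][:10]

# Proportionate stratified sampling: random sampling within each subgroup,
# with sample sizes matching each subgroup's share of the population.
stratified = random.sample(lower, 6) + random.sample(upper, 4)

print(simple, systematic, stratified, sep="\n")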
Nonrandom sampling: participants are not randomly chosen.
Types:
- convenience
- snowballing
- quota
convenience: participants who are readily available or who will volunteer to participate
snowballing: you select initial participants who meet some criterion and then acquire additional participants through referrals from your initial participants
quota: a convenience sample composed of subgroups similar in number to their proportions in the population; specific people are targeted because they match a criterion
pros and cons
- pros: sampling from a small subject pool is easier and requires less time, money, and effort
- cons: nonrandom samples have less external validity, so the generality of the results is compromised
internet research
- nonrandom, because internet users are not representative of the population; needing to know how to use a computer and the internet limits who participates
- advantage: a broader range of participants
- disadvantage: generalization (external validity)
lab research
- solicit participants with flyers, etc.
- subject pools; issues with subject pools: if students are punished for not participating, that can skew the data
field research
- take the lab to the participants (knock on doors, visit churches, etc.)
- or set up a situation and wait for participants
voluntary participation: ALL participation is voluntary
- volunteer bias: bias in a sample that results from using volunteer participants exclusively
- volunteer bias can affect internal validity; it could distort the relationship between the IV and the DV
- results from volunteers only cannot be generalized to the population, so it also hurts external validity
ways to reduce volunteer bias
- reduce the reasons people have for not volunteering
- emphasize that participation contributes to science
- make participation sound appealing and non-threatening
- avoid tasks that are stressful
- state the importance of the research and that participants would be helping science
Review questions
Question 1 of 50
An example(s) of poor research in Psychology, presented in the video on Research Methods, was/were __________.
A. None of the choices are correct.
B. Dr. Bettelheim's study of child autism
C. intensive behavioral treatments of autism
D. both B and C
Answer Key: B
Question 2 of 50
Which of the following is NOT a purpose of conducting a literature search?
A. Prevent you from carrying out a study that has already been done
B. Identify questions that need to be answered
C. Provide ideas and justification for designing a study
D. none of the above
Answer Key: D
Question 3 of 50
________ theories are considered the best type of theory because they propose a new structure to explain a phenomenon.
A. Analogical
B. Descriptive
C. Empirical
D. Fundamental
Answer Key: D
Question 4 of 50
The information processing theory of memory (3 types of memory: sensory, short-term, and long-term) is a descriptive theory because it
A. describes the concept and explains how it affects memory ability.
B. uses an analogy to describe a concept.
C. only describes features of the concept without explaining it.
D. creates a new concept to explain the relationship between two variables.
Answer Key: C
Question 5 of 50
According to Michael Shermer's presentation of Why People Believe Weird Things, the problem with having a theory is that _________________________________.
A. they tend to contain individuals' own personal biases
B. they can never be falsified
C. they have to be tested
D. they are not easy to develop
Answer Key: A