Chapter 7


Department: Psychology
Course Code: PSYC31H3
Professor: Zachariah Campbell

Chapter 7: Tests of Intellectual Abilities

Psychometric Theory

Test Construction

STEP 1: Begin with a domain (the type of behaviour that the researcher wants to sample)
STEP 2: Generate test items to measure the concept
STEP 3: Give the test to multiple individuals
• Test items are added or deleted depending on their usefulness
STEP 4: Factor analysis (a statistical procedure in which all of the scores are correlated with one another to determine the variables or factors that account for the most variance in the data) may be done to determine the degree to which each item is related to the test’s construct
STEP 5: Standardization: uniformity of procedure in administering and scoring a test
Establishment of norms: conversion of the raw scores of the sample group into percentiles in order to construct a normal distribution that allows future test takers to be ranked
• Many tests use a sample of the larger population of the US, often stratified by age, gender, and ethnicity
STEP 6: Scores: an individual’s scores on tests have no real value until they are compared to other individuals’ scores
• Raw scores can be converted to scaled scores: conversion of a participant’s raw score on a test or a version of the test to a common scale that allows for a numerical comparison among participants
  o Standard score: any score expressed in units of standard deviations of the distribution of scores in the population, with the mean at zero
  o Mean: the arithmetic mean of a list of numbers is the sum of all the members of the list divided by the number of items on the list
  o Standard deviation: square root of the variance; it is usually employed to compare the variability of different groups
• Example: WISC-IV. Mean = 100, SD = 15

Reliability

Reliability: extent to which a test is repeatable and yields consistent scores
• Test reliability: consistency of scores obtained by the same individual when retested with the identical test or an equivalent form of the test

Measures of test reliability make it possible to estimate what proportion of the total variance (in a population of samples, the mean of the square of the differences between the respective samples and their mean) is error variance (variance due to any condition that is irrelevant to the purpose of the test; uniform testing conditions try to prevent this).

Reliability of a test is represented by the correlation coefficient: a number between -1 and +1 that measures the degree to which two variables are linearly related
• Positive: ‘as X increases, so does Y’
  o Most reliability correlations fall between 0 and +1
  o A reputable test has a reliability coefficient of 0.80 or higher
• Negative: ‘as X increases, Y decreases’

Forms of reliability:
• Test-retest reliability: involves administering the test to the same group of people at least twice; the first set of scores is then correlated with the second set of scores
• Alternate-form reliability: the same people are tested with one form of a test on one occasion and with another, equivalent form on a second occasion; the correlation between the scores represents the reliability coefficient
• Split-half reliability: a measure of the reliability of a test based on the correlation between scores on two halves of the test, often the odd and even numbered test items
• Interitem consistency: administering a test once and finding the consistency of responses across all items in the test; based on performance on each item
  o The difference between the split-half coefficient and the Kuder-Richardson (interitem) coefficient serves as a rough index of the heterogeneity of the test

Validity

Validity: how valid a test is depends on what the test measures and how well it measures that subject

A test can be statistically reliable but have no relationship to the variables that the test designer wants to study.
• As with reliability data, validity information is represented as a correlation coefficient
• The standard approach is the correlation between the test score and a criterion measure

Types of validity
• Content validity: extent to which the items of a test or procedure are in fact a representative sample of that which is to be measured
  o Depends on whether the items for the test are appropriate for the content area and whether there are enough of the various types of items to sample the behaviour
  o Can suffer from subjective factors or bias in the researcher’s choice of items
• Face validity: refers to how the test appears to the test taker in terms of the content to be measured
  o Face-valid tests are clearer about what constructs they are measuring, which may lead subjects to be less truthful
  o Not a correlation between the test and any measure
• Criterion validity: the effectiveness of the test in predicting behavioural criteria on which psychologists agree
  o Example: intelligence tests predict academic achievement
• Construct validity: refers to whether a scale measures the unobservable social construct (e.g. fluid intelligence) that it purports to measure; it is related to the theoretical ideas behind the trait under consideration
  o Correlations between new and older tests are often used as proof that the new test is valid
  o Internal consistency: measure based on the correlation between different items on the same test (or the same subscale on a larger test); measures whether several items that propose to measure the same general construct produce similar scores
    ▪ Convergent validation: a test must demonstrate that it correlates highly with other variables with which it theoretically should
    ▪ Discriminant validation: a test should not correlate with variables from which it should differ
    ▪ Multitrait-multimethod design: experimental design for the direct evaluation of convergent and discriminant validation
  o Construct validity is actually an outdated term. These days, the term ‘validity’ is used instead because all types of validity ultimately conform to the construct

Meta-analysis: evaluation of multiple studies using factor analytic methodology
• May find more positive findings
• Effect sizes are taken into account. An effect size is the magnitude of a difference in the data or the size of a correlation; it complements statistical significance (statistical evidence that a difference in the data from a study is unlikely to have occurred by chance)

Ethics in Testing and Assessment

Three documents that specify the proper construction and development of testing materials:
1. APA’s Ethical Principles of Psychologists and Code of Conduct
   a. Specific principles related to competence, privacy and confidentiality, and assessment
2. Standards for Educational and Psychological Testing
   a. Appropriate technical and professional standards to be followed in the construction, evaluation, interpretation, and application of psychological assessment
3. International Test Commission’s International Guidelines for Test Use
   a. Includes the need for high standards when developing and using tests
   b. Need for professional behaviour

History of Intellectual Assessment

The earliest accounts of assessment are from the Chinese Empire, which used assessment for civil service exams
Greece: tests within the educational system covered physical and intellectual abilities
European universities relied on formal examinations

A great deal happened in the 19th century:
• The mental hygiene movement led to individuals with mental disorders being separated from criminals
  o Development of diagnostic classification systems began with Kraepelin
  o Also important was the ability to determine the literacy of the two groups to see who would benefit from treatment

The needs of school systems also created a need to determine the level of ability in students
• Used to identify gifted vs. average vs. those who may need special education services
• Testing has been used to separate current abilities and predict future benefits from various types of educational strategies

Military purposes
• Advancements in technology meant that soldiers in different positions needed different kinds of abilities
• Testing helped determine the literacy and potential of soldiers as they entered service
• Testing was also used to assess the various types of difficulties that arose from participation in the service
  o Neuropsychological tests were used to determine the extent of brain damage

Jean Esquirol (1772-1840) was the first to make a distinction between individuals based on testing information, in 1838.
• Wrote a text on the various types of mental retardation
• Concluded that the individual’s use of language was the most dependable criterion for intellectual level

Edward Seguin (1812-1880) spent most of his career trying to ed
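
Worked sketch for STEP 6 and for test-retest reliability: the standard-score conversion and the reliability correlation described in these notes can be illustrated in a few lines of Python. This is a minimal sketch, not the procedure of any particular test. The raw scores are made up, the function names (`standard_scores`, `to_scaled`, `pearson_r`) are hypothetical, and using the population standard deviation of the norm sample is an assumption. The only values taken from the notes are the WISC-IV scale (mean = 100, SD = 15) and the 0.80 reliability guideline.

```python
from statistics import mean, pstdev


def standard_scores(raw_scores):
    """Express each raw score in SD units from the sample mean (mean at zero)."""
    m = mean(raw_scores)
    sd = pstdev(raw_scores)  # assumption: population SD of the norm sample
    return [(x - m) / sd for x in raw_scores]


def to_scaled(z, scale_mean=100.0, scale_sd=15.0):
    """Map a standard score onto a common scale (defaults follow the WISC-IV example)."""
    return scale_mean + scale_sd * z


def pearson_r(xs, ys):
    """Correlation coefficient between two equally long score lists (-1 to +1)."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)


# Made-up raw scores for five people on a first and a second administration.
first_administration = [10, 12, 14, 16, 18]
second_administration = [11, 13, 13, 17, 19]

# STEP 6: raw scores -> standard scores -> scaled scores on a 100/15 scale.
z_scores = standard_scores(first_administration)
scaled_scores = [to_scaled(z) for z in z_scores]

# Test-retest reliability: correlate the first set of scores with the second.
retest_reliability = pearson_r(first_administration, second_administration)
meets_guideline = retest_reliability >= 0.80  # guideline quoted in the notes
```

The same `pearson_r` helper could be applied to scores on the odd and even items of a single administration to approximate split-half reliability, although a real analysis would normally rely on an established statistics package rather than hand-written helpers.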