Deck 4: Reliability

Question
Classical Test Theory assumes

A)the length of a test has no bearing on its reliability.
B)measurement errors occur systematically.
C)it is not possible to estimate true scores.
D)the distribution of random errors is the same for every respondent.
Question
Who developed methods for evaluating sources of error in behavioral research?

A)Edward Thorndike
B)Kuder and Richardson
C)Charles Spearman
D)Cronbach
Question
If we repeatedly administered the same test to the same individual, the standard deviation of the person's score would be the

A)standard error of the mean.
B)variance.
C)reliability of the test.
D)standard error of measurement.
Question
Theoretically, if Susie repeatedly took the 6th grade achievement test, you would be able to find her true score by finding the ____ of the distribution of her scores.

A)mean
B)standard deviation
C)variance
D)standard error of measurement
Question
Theoretically, reliability is

A)the correlation of the observed test score with the true score.
B)the square root of the ratio of true to the observed score.
C)the ratio of true to the observed score squared.
D)not possible to define.
Question
Assuming the "rubber yardstick" shrinks and expands at random, what can be said about the distribution of scores from the rubber yardstick?

A)It will have a mean of zero (0).
B)It will be normal.
C)It will have a standard error of zero (0).
D)It will be skewed.
Question
The work of Charles Spearman combined what two measurement concepts?

A)mean and variance
B)sample statistics and population parameters
C)sampling error and correlation
D)reliability and validity
Question
What is Spearman known for?

A)Working out the basics of reliability theory
B)Developing the notion of sampling error
C)Creating methods for measuring error
D)Developing multivariate analysis
Question
What is Cronbach known for?

A)Developing measures to evaluate sources of error
B)Creating the basics of multivariate analysis
C)Developing the basics of contemporary measurement theory
D)Distinguishing between objective and subjective measures
Question
We can get an idea of how much measurement error is present in a score through the

A)true score.
B)observed score.
C)standard error of the mean.
D)standard error of measurement.
Question
Because classical test theory assumes a person's true score is the same over time, repeating the same test over and over gives a distribution of scores that reflects what?

A)systematic error
B)random error
C)reliability
D)internal consistency
Question
The basic theory of reliability was first worked out by

A)Karl Pearson.
B)Charles Spearman.
C)Julian Stanley.
D)Lee Cronbach.
Question
According to classical test theory, errors of measurement are

A)always overestimates of true score.
B)always underestimates of true score.
C)random.
D)constant.
Question
When creating a test, one generally uses a subset of items to represent a larger construct. This is known as

A)a population parameter.
B)domain sampling.
C)a sampling error.
D)descriptive statistics.
Question
When talking about errors in terms of psychological testing, we are referring to the fact that:

A)someone got an answer incorrect.
B)there is always some inaccuracy in the measurement.
C)the test was inappropriate for that particular group.
D)the score is too subjective to be accurate.
Question
Repeated use of the same test typically results in different scores. How does classical test theory account for this?

A)poor test validity
B)systematic variability
C)random error
D)inattention
Question
An observed score is composed of

A)the residual and the true score.
B)the criterion and the predictor.
C)the measurement error and the predictor.
D)the true score and the measurement error.
Question
Which of the following is an important distinction between systematic errors and random errors?

A)Random errors are more likely than systematic errors to cause errors in conclusions.
B)Systematic errors occur only in objective measures and random errors occur only in subjective measures.
C)Random errors can be eliminated by careful wording of test items.
D)Systematic errors are extremely rare among psychological tests.
Question
If you have three clocks in your house, and every clock is 10 minutes fast, this is an example of

A)systematic error.
B)random error.
C)measurement error.
D)a rubber yardstick.
Question
Classical Test Theory assumes that

A)errors are systematic.
B)errors are random.
C)true scores cannot be estimated.
D)the length of a test has no bearing on its reliability.
Question
Sources of error associated with time sampling are measured using

A)the test-retest method.
B)the split half method.
C)KR 20.
D)the alpha method.
Question
How does the domain sampling model conceptualize reliability?

A)The absolute value of the difference between the standard error of measurement and the variance
B)The ratio of variance of the observed scores on the short version of a test and the variance of the long-run true scores
C)The sum of squares of the difference between the observed and true scores
D)The ratio of the number of sample items to the number of domain items, multiplied by the mean of the sample distribution
Question
Professor Pine constructed five different short history tests by randomly drawing questions from the huge pool of all possible questions about the current material. He has created

A)randomly parallel tests.
B)a large sample size.
C)systematic errors.
D)attenuation effects.
Question
In the domain sampling model, the error that is being considered is the error caused by

A)choosing the wrong domain.
B)systematic error.
C)using a limited sample of items.
D)random error.
Question
Tests designed according to item response theory

A)are no longer considered useful.
B)can only be used with non-objective material.
C)yield more reliable results with fewer items.
D)provide low-tech methods for field use.
Question
The difference between David's two typing tests, one at the beginning of the semester and one at the end, reflects the fact that he typed quite a few term papers during the semester. This reflects

A)attenuation.
B)random error.
C)practice effects.
D)domain sampling.
Question
A split-half correlation, KR 20, and coefficient alpha are all used to evaluate

A)standard errors of measurement.
B)internal consistency.
C)variance.
D)validity.
Question
Why might different random samples of domain items yield different estimates of the true score?

A)sampling error
B)poor reliability
C)respondent error
D)item bias
Question
Which of the following would tend to provide the most conservative estimate of split-half reliability?

A)the Phillips method
B)the Spearman-Brown formula
C)coefficient alpha
D)the odd-even reliability coefficient
Question
Suppose you were trying to estimate the reliability of a whole test on the basis of the correlation between scores on the two halves of the test. In order to correct for using scores based on the halves, you might use the

A)KR 20.
B)alpha method.
C)Spearman-Brown formula.
D)split half method.
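Study note: the correction this card describes can be sketched numerically. The Spearman-Brown formula for a test of doubled length is commonly written as r_full = 2r / (1 + r), where r is the correlation between the two halves. The function name and the sample correlation of .65 below are illustrative assumptions, not part of the deck.

```python
def spearman_brown_full_length(half_correlation: float) -> float:
    """Estimate whole-test reliability from the correlation between two half-tests."""
    if not -1.0 < half_correlation <= 1.0:
        raise ValueError("half_correlation must be greater than -1 and at most 1")
    # Doubling the test length: r_full = 2r / (1 + r)
    return 2 * half_correlation / (1 + half_correlation)


# Hypothetical split-half correlation of .65 (not a value from the deck)
corrected = spearman_brown_full_length(0.65)
print(f"half-test r = 0.65 -> estimated full-test reliability = {corrected:.3f}")
```

The corrected estimate is always at least as large as a positive half-test correlation, which is why an uncorrected split-half value understates the reliability of the full test.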
Question
A reliability coefficient of .60 suggests that

A)64% of the variance on the test is error.
B)40% of the variance on the test is error.
C)78% of the variance on the test is error.
D)the test can be used for clinical purposes but not for research.
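Study note: a small sketch of the arithmetic behind this kind of item, assuming the classical definition used elsewhere in this deck (reliability as the ratio of true-score variance to observed-score variance, with observed variance equal to true variance plus error variance). The variance values below are made up for illustration.

```python
def reliability_from_variances(true_variance: float, error_variance: float) -> float:
    """Reliability = true-score variance / observed-score variance."""
    if true_variance < 0 or error_variance < 0:
        raise ValueError("variances cannot be negative")
    observed_variance = true_variance + error_variance
    if observed_variance == 0:
        raise ValueError("observed variance must be positive")
    return true_variance / observed_variance


# Hypothetical variances (not from the deck)
reliability = reliability_from_variances(true_variance=30.0, error_variance=10.0)
error_share = 1.0 - reliability  # proportion of observed variance that is error
print(f"reliability = {reliability:.2f}, error share = {error_share:.2f}")
```

Under this definition, the share of observed variance attributable to error is simply one minus the reliability coefficient.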
Question
If a researcher is attempting to assess the reliability of a measure of depression, the method of choice would be

A)internal consistency.
B)time sampling.
C)the test-retest method.
D)more than one of these.
Question
Federal government guidelines require that a test be

A)standardized for use among all U.S. sub-populations.
B)factor analyzed before it can be used to make employment decisions.
C)reliable before it can be used to make employment decisions.
D)reliable above the .90 level.
Question
The Spearman-Brown formula corrects for deflated reliability due to

A)half-length tests.
B)small sample size.
C)systematic error.
D)poor test item construction.
Question
Dr. Janine developed two equivalent forms of a test and administered them both, in counter-balanced order, to a group of people on the same day in order to assess reliability. What is this called?

A)test-retest
B)parallel forms
C)split-half
D)KR 20
Question
Dr. Smith is trying to determine the reliability of a new personality test. Two randomly parallel tests, A and B, have a correlation of .81. What is the estimated reliability of the new personality test?

A).81
B)-.9
C).9
D).81/ t
Question
The problems created by using a limited number of items to represent a larger and more complicated construct are explicitly considered in the ____ model.

A)multivariate
B)random sampling
C)domain sampling
D)standard error of measurement
Question
The method for estimating the internal consistency of a test that simultaneously considers all possible ways of splitting the items is the

A)Spearman-Brown formula.
B)Kuder-Richardson formula.
C)Cronbach's alpha.
D)the odd-even method.
Question
Upon repeated applications of the same test, performance on the second application may be affected by previous experience on the test. This is known as

A)attenuation.
B)a carryover effect.
C)shrinkage.
D)selected recall.
Question
As opposed to reliability based on the classical test theory, ____ focuses on the range of item difficulty that is useful in assessing an individual's ability.

A)domain sampling
B)internal consistency
C)coefficient alpha
D)item response theory
Question
Which of the following is a problem in evaluating the agreement between observers in behavioral studies?

A)The observers are usually not trained.
B)The behaviors being studied are usually not directly observable.
C)There will always be some agreement by chance.
D)There is no method for evaluating the agreement between observers.
Question
In order to determine the unidimensionality of a test, you can use

A)factor analysis.
B)split half reliability.
C)parallel forms assessment.
D)the Spearman-Brown prophecy formula.
Question
Test constructors can improve test reliability by

A)increasing the number of items.
B)decreasing the number of items.
C)retaining items that have the most face validity.
D)reducing the item to total correlation.
Question
Correction for attenuation is used

A)to estimate the validity of a test.
B)to correct for tests that are short.
C)to correct for tests that are long.
D)to estimate the true correlation between variables that have been measured with error.
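Study note: the correction for attenuation is usually written as r_true = r_xy / sqrt(r_xx * r_yy), where r_xy is the observed correlation and r_xx, r_yy are the reliabilities of the two measures. The sketch below assumes that standard form; the function name and the numbers are hypothetical.

```python
import math


def correct_for_attenuation(r_xy: float, r_xx: float, r_yy: float) -> float:
    """Estimate the correlation between two variables if both were measured without error."""
    if r_xx <= 0 or r_yy <= 0:
        raise ValueError("reliabilities must be positive")
    return r_xy / math.sqrt(r_xx * r_yy)


# Hypothetical values: observed r = .30, reliabilities .80 and .60
estimated = correct_for_attenuation(r_xy=0.30, r_xx=0.80, r_yy=0.60)
print(f"observed r = 0.30 -> disattenuated estimate = {estimated:.3f}")
```

Because both reliabilities are at most 1, the corrected estimate is never smaller in magnitude than the observed correlation.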
Question
Measures of test-retest reliability are sometimes considered inappropriate for the evaluation of health status because

A)health status tests should not be given at multiple points in time.
B)variations in health status may be related to true changes over time rather than measurement error.
C)there is no domain of health status.
D)health status is too complicated to measure.
Question
The difference between KR 20 and coefficient alpha is

A)KR 20 can be used to evaluate time sampling problems while alpha cannot.
B)Alpha can be used to evaluate time sampling problems while KR 20 cannot.
C)KR 20 can only be used for items scored right or wrong but Alpha can be used for items in any format.
D)Alpha can only be used for items scored right or wrong but KR 20 can be used for items in any format.
Question
The kappa statistic is used to

A)assess the level of agreement among several observers.
B)estimate the correlation between a continuous variable and an artificially dichotomous variable.
C)estimate the percentage of disagreement between observers.
D)estimate the validity of behavioral observation.
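Study note: a minimal sketch of how kappa adjusts raw agreement for chance, using the two-rater form (Cohen's kappa): kappa = (p_observed - p_expected) / (1 - p_expected). The function name, the category labels, and the ratings are invented for illustration; versions of the statistic for more than two observers are not shown here.

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Chance-corrected agreement between two raters who coded the same cases."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("both raters must code the same non-empty set of cases")
    n = len(ratings_a)
    # Proportion of cases on which the raters gave the same code
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Agreement expected by chance, from each rater's marginal proportions
    counts_a = Counter(ratings_a)
    counts_b = Counter(ratings_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


# Hypothetical codes from two observers for ten behavior samples
rater_a = ["agg", "agg", "calm", "calm", "agg", "calm", "agg", "agg", "calm", "calm"]
rater_b = ["agg", "calm", "calm", "calm", "agg", "calm", "agg", "agg", "agg", "calm"]
kappa = cohens_kappa(rater_a, rater_b)
print(f"kappa = {kappa:.2f}")
```

In this made-up data the raters agree on 8 of 10 cases, but half of that agreement would be expected by chance, so kappa is lower than the raw agreement rate.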
Question
If the same test, given at different points in time to the same test takers, yields different scores, then the method typically used to assess this source of error is

A)test-retest.
B)alternate forms/parallel forms.
C)split-half.
D)KR 20.
Question
Jennifer read a report in which the agreement between raters of children's aggressive behavior was .50, indicating

A)the raters agreed at chance levels.
B)agreement was poor.
C)agreement was excellent.
D)agreement was moderate.
Question
Which of the following is true of the parallel forms method?

A)It is the most often used method for estimating reliability.
B)It provides one of the most rigorous methods for estimating reliability.
C)It is largely ineffective with psychological tests.
D)Sophisticated computer programs have made it unnecessary.
Question
Standard errors of measurement are used to

A)determine whether an observed score is the "true" score.
B)determine the standard deviation of the scores.
C)calculate the exact true score.
D)create confidence intervals around specific observed test scores.
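Study note: a sketch of how a standard error of measurement (SEM) might be turned into an interval around an observed score, assuming the usual formula SEM = SD * sqrt(1 - r) and a normal-theory multiplier of 1.96 for a 95% interval. The function names and the values (SD = 15, r = .91, observed score = 100) are hypothetical.

```python
import math


def standard_error_of_measurement(sd: float, reliability: float) -> float:
    """SEM = SD * sqrt(1 - reliability)."""
    if sd < 0 or not 0.0 <= reliability <= 1.0:
        raise ValueError("sd must be non-negative and reliability within [0, 1]")
    return sd * math.sqrt(1.0 - reliability)


def confidence_interval(observed: float, sem: float, z: float = 1.96):
    """Symmetric interval around an observed score; z = 1.96 assumes a 95% level."""
    return observed - z * sem, observed + z * sem


# Hypothetical test: SD = 15, reliability = .91, observed score = 100
sem = standard_error_of_measurement(sd=15.0, reliability=0.91)
low, high = confidence_interval(observed=100.0, sem=sem)
print(f"SEM = {sem:.2f}, 95% interval = ({low:.2f}, {high:.2f})")
```

The interval narrows as reliability rises, which is one way to see why the SEM is useful when interpreting an individual's score.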
Question
The preferred method for assessing the level of agreement between observers is the

A)kappa statistic
B)Spearman coefficient
C)coefficient alpha
D)rank-order statistic
Question
Approximately what value must a reliability coefficient have for most purposes in basic research?

A).90
B).50
C).70
D).30
Question
What is the impact of carryover effects on test-retest reliability?

A)Test-retest reliability is not influenced by carryover effects.
B)Carryover effects result in an overestimation of reliability.
C)Carryover effects result in an underestimation of reliability.
D)Test-retest reliability increases carryover effects.
Question
The reliability of a difference score is

A)equal to the reliability of the most reliable of the two measures.
B)equal to the reliability of the least reliable of the two measures.
C)the average reliability of the two measures.
D)expected to be lower than the reliability of either of the two measures.
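Study note: a numeric sketch of why difference scores tend to be unreliable, assuming the textbook formula for two measures expressed as standardized scores: r_diff = (0.5 * (r_11 + r_22) - r_12) / (1 - r_12), where r_11 and r_22 are the reliabilities of the two measures and r_12 is the correlation between them. The function name and the values are hypothetical.

```python
def difference_score_reliability(r_11: float, r_22: float, r_12: float) -> float:
    """Reliability of a difference between two standardized measures."""
    if r_12 >= 1.0:
        raise ValueError("r_12 must be below 1 for the formula to be defined")
    return (0.5 * (r_11 + r_22) - r_12) / (1.0 - r_12)


# Hypothetical measures: reliabilities .90 and .70, correlated .70 with each other
r_diff = difference_score_reliability(r_11=0.90, r_22=0.70, r_12=0.70)
print(f"reliability of the difference score = {r_diff:.2f}")
```

With these made-up inputs the difference score is far less reliable than either measure; the more strongly the two measures correlate, the less reliable variance is left in their difference.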
Question
Difference scores are created by

A)subtracting one test score from another.
B)subtracting the true score from a predicted score.
C)eliminating error from true scores.
D)giving a test to two different individuals.
Question
The standard error of measurement allows us to

A)estimate the degree to which a test provides inaccurate readings.
B)have an acceptable margin of error.
C)determine the source of error.
D)avoid any measurement error.
Question
Which of the following is used to estimate the number of items that should be added to a test to achieve a specified reliability?

A)KR 20
B)coefficient alpha
C)Spearman-Brown prophecy formula
D)split-half technique
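Study note: a sketch of the length calculation this card refers to, assuming the general Spearman-Brown relation solved for the length multiplier: N = r_desired * (1 - r_current) / (r_current * (1 - r_desired)). It also assumes any added items are comparable in quality to the existing ones. The function name, the reliabilities, and the 20-item test are hypothetical.

```python
import math


def length_multiplier(current_reliability: float, desired_reliability: float) -> float:
    """How many times longer the test must be to reach the desired reliability."""
    if not 0.0 < current_reliability < 1.0 or not 0.0 < desired_reliability < 1.0:
        raise ValueError("both reliabilities must be strictly between 0 and 1")
    return (desired_reliability * (1.0 - current_reliability)) / (
        current_reliability * (1.0 - desired_reliability)
    )


# Hypothetical 20-item test with reliability .70; target reliability .90
multiplier = length_multiplier(current_reliability=0.70, desired_reliability=0.90)
current_items = 20
items_to_add = math.ceil(current_items * multiplier) - current_items
print(f"multiplier = {multiplier:.2f}, items to add = {items_to_add}")
```

Rounding the required total length up before subtracting keeps the estimate from falling short of the target.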
Question
Which of the following is a source of measurement error?

A)respondent sampling
B)scorer sampling
C)internal consistency
D)external consistency
Question
Items are probably measuring the same thing when the correlation between an item and the total score

A)is high.
B)is low.
C)approaches 0.
D)is negative.
Question
What is the most useful indicator of reliability for the interpretation of individual scores?

A)split-half variance
B)item sampling
C)test-retest
D)standard error of measurement
Question
Explain how someone might decide how reliable is "reliable enough" for a measure. What settings might warrant more stringent criteria for reliability, and why?
Question
Describe some of the advantages and disadvantages associated with behavioral observation techniques. Provide examples.
Question
Briefly discuss each of the APA's standards for reliability.
Question
The reliability coefficient is

A)the mean of the observed scores.
B)the variance of the observed scores.
C)the ratio of the mean of the true scores on a test to the mean of the observed scores.
D)the ratio of the variance of the true scores on a test to the variance of the observed scores.
Question
In the domain sampling model, the reliability of a test increases as

A)the number of items increases.
B)the number of items decreases.
C)the number of test administrations increases.
D)the number of test administrations decreases.
Question
The intercorrelations among items within the same test are referred to as

A)interrater reliability.
B)discriminability.
C)standard errors of measurement.
D)internal consistency.
Question
Reliability theory combines De Moivre's concept of sampling error with Pearson's concept of _____________ in the context of measurement.

A)coefficient alpha
B)internal consistency
C)product moment correlation
D)domain sampling
Question
Describe the reasons for the large movement from Classical Test Theory to Item Response Theory.
Question
For which of these constructs is it most appropriate to measure test-retest reliability?

A)IQ
B)Depression
C)Literacy
D)Blood pressure
Question
Classical Test Theory is based on certain assumptions. Discuss these basic assumptions and the theory behind them, and then address the challenges to any of these assumptions.
Question
Interrater reliability is of concern in

A)personality testing.
B)behavioral observation studies.
C)factor analysis.
D)parallel forms assessment.
Question
Tests will be most reliable if they are

A)multidimensional.
B)unidimensional.
C)brief.
D)criterion-referenced.
Question
Carryover effects only affect reliability when changes over time are

A)large.
B)systematic.
C)random.
D)due to practice effects.
Question
There are several methods to estimate reliability. Compare and contrast the different methods of reliability discussed in this chapter, stressing the importance of coefficient alpha.
Question
The formula used to estimate how long a test must be to achieve a desired level of reliability is

A)kappa
B)prophecy
C)Spearman
D)Thorndike
Question
The prophecy formula is used to

A)predict expected values.
B)estimate how long a test must be to achieve a desired level of reliability.
C)estimate how long a test must be to achieve a desired level of validity.
D)calculate variability.
Question
Discuss the challenges to the use of difference scores.
Question
Classical test theory assumes that

A)there are no errors in measurement.
B)each person has a true score.
C)observed scores almost always reflect true ability.
D)errors of measurement are systematic.
1
Classical Test Theory assumes

A)the length of a test has no bearing on its reliability.
B)measurement errors occur systematically.
C)it is not possible to estimate true scores.
D)the distribution of random errors is the same for every respondent.
Answer: D
2
Who developed methods for evaluating sources of error in behavioral research?

A)Edward Thorndike
B)Kuder and Richardson
C)Charles Spearman
D)Cronbach
Answer: D
3
If we repeatedly administered the same test to the same individual, the standard deviation of the person's score would be the

A)standard error of the mean.
B)variance.
C)reliability of the test.
D)standard error of measurement.
Answer: D
4
Theoretically, if Susie repeatedly took the 6th grade achievement test, you would be able to find her true score by finding the ____ of the distribution of her scores.

A)mean
B)standard deviation
C)variance
D)standard error of measurement
فتح الحزمة
افتح القفل للوصول البطاقات البالغ عددها 79 في هذه المجموعة.
فتح الحزمة
k this deck
5
Theoretically, reliability is

A)the correlation of the observed test score with the true score.
B)the square root of the ratio of true to the observed score.
C)the ratio of true to the observed score squared.
D)not possible to define.
فتح الحزمة
افتح القفل للوصول البطاقات البالغ عددها 79 في هذه المجموعة.
فتح الحزمة
k this deck
6
Assuming the "rubber yardstick" shrinks and expands at random, what can be said about the distribution of scores from the rubber yardstick?

A)It will have a mean of zero (0).
B)It will be normal.
C)It will have a standard error of zero (0).
D)It will be skewed.
فتح الحزمة
افتح القفل للوصول البطاقات البالغ عددها 79 في هذه المجموعة.
فتح الحزمة
k this deck
7
The work of Charles Spearman combined what two measurement concepts?

A)mean and variance
B)sample statistics and population parameters
C)sampling error and correlation
D)reliability and validity
فتح الحزمة
افتح القفل للوصول البطاقات البالغ عددها 79 في هذه المجموعة.
فتح الحزمة
k this deck
8
What is Spearman known for?

A)Working out the basics of reliability theory
B)Developing the notion of sampling error
C)Creating methods for measuring error
D)Developing multivariate analysis
فتح الحزمة
افتح القفل للوصول البطاقات البالغ عددها 79 في هذه المجموعة.
فتح الحزمة
k this deck
9
What is Cronbach known for?

A)Developing measures to evaluate sources of error
B)Creating the basics of multivariate analysis
C)Developed the basics of contemporary measurement theory
D)Distinguished between objective and subjective measures
فتح الحزمة
افتح القفل للوصول البطاقات البالغ عددها 79 في هذه المجموعة.
فتح الحزمة
k this deck
10
We can get an idea of how much measurement error is present in a score through the

A)true score.
B)observed score.
C)standard error of the mean.
D)standard error of measurement.
فتح الحزمة
افتح القفل للوصول البطاقات البالغ عددها 79 في هذه المجموعة.
فتح الحزمة
k this deck
11
Because classic test theory assumes a person's true score is the same over time, repeating the same test over and over gives a distribution of scores that reflect what?

A)systematic error
B)random error
C)reliability
D)internal consistency
فتح الحزمة
افتح القفل للوصول البطاقات البالغ عددها 79 في هذه المجموعة.
فتح الحزمة
k this deck
12
The basic theory of reliability was first worked out by

A)Karl Pearson.
B)Charles Spearman.
C)Julian Stanley.
D)Lee Cronbach.
فتح الحزمة
افتح القفل للوصول البطاقات البالغ عددها 79 في هذه المجموعة.
فتح الحزمة
k this deck
13
According to classical test theory, errors of measurement are

A)always overestimates of true score.
B)always underestimates of true score.
C)random.
D)constant.
فتح الحزمة
افتح القفل للوصول البطاقات البالغ عددها 79 في هذه المجموعة.
فتح الحزمة
k this deck
14
When creating a test, one generally uses a subset of items to represent a larger construct. This is known as

A)a population parameter.
B)a domain sampling.
C)a sampling error.
D)descriptive statistics.
فتح الحزمة
افتح القفل للوصول البطاقات البالغ عددها 79 في هذه المجموعة.
فتح الحزمة
k this deck
15
When talking about errors in terms of psychological testing, we are referring to the fact that:

A)someone got an answer incorrect.
B)there is always some inaccuracy in the measurement.
C)the test was inappropriate for that particular group.
D)the score is too subjective to be accurate.
فتح الحزمة
افتح القفل للوصول البطاقات البالغ عددها 79 في هذه المجموعة.
فتح الحزمة
k this deck
16
Repeated use of the same test typically results in different scores. How does classical test theory account for this?

A)poor test validity
B)systematic variability
C)random error
D)inattention
فتح الحزمة
افتح القفل للوصول البطاقات البالغ عددها 79 في هذه المجموعة.
فتح الحزمة
k this deck
17
An observed score is composed of

A)the residual and the true score.
B)the criterion and the predictor.
C)the measurement error and the predictor.
D)the true score and the measurement error.
فتح الحزمة
افتح القفل للوصول البطاقات البالغ عددها 79 في هذه المجموعة.
فتح الحزمة
k this deck
18
Which of the following is an important distinction between systematic errors and random errors?

A)Random errors are more likely than systematic errors to cause errors in conclusions.
B)Systematic errors occur only in objective measures and random errors occur only in subjective measures.
C)Random errors can be eliminated by careful wording of test items.
D)Systematic errors are extremely rare among psychological tests.
فتح الحزمة
افتح القفل للوصول البطاقات البالغ عددها 79 في هذه المجموعة.
فتح الحزمة
k this deck
19
If you have three clocks in your house, and every clock is 10 minutes fast, this is an example of

A)systematic error.
B)random error.
C)measurement error.
D)a rubber yardstick.
فتح الحزمة
افتح القفل للوصول البطاقات البالغ عددها 79 في هذه المجموعة.
فتح الحزمة
k this deck
20
Classical Test Theory assumes that

A)errors are systematic.
B)errors are random.
C)true scores cannot be estimated.
D)the length of a test has no bearing on its reliability.
فتح الحزمة
افتح القفل للوصول البطاقات البالغ عددها 79 في هذه المجموعة.
فتح الحزمة
k this deck
21
Sources of error associated with time sampling are measured using

A)the test-retest method.
B)the split half method.
C)KR 20.
D)the alpha method.
فتح الحزمة
افتح القفل للوصول البطاقات البالغ عددها 79 في هذه المجموعة.
فتح الحزمة
k this deck
22
How does the domain sampling model conceptualize reliability?

A)The absolute value of the difference between the standard error of measurement and the variance
B)The ratio of variance of the observed scores on the short version of a test and the variance of the long-run true scores
C)The sum of squares of the difference between the observed and true scores
D)The ratio of the number of sample items to the number of domain items, multiplied by the mean of the sample distribution
فتح الحزمة
افتح القفل للوصول البطاقات البالغ عددها 79 في هذه المجموعة.
فتح الحزمة
k this deck
23
Professor Pine constructed five different short history tests by randomly drawing questions from the huge pool of all possible questions about the current material. He has created

A)randomly parallel tests.
B)a large sample size.
C)systematic errors.
D)attenuation effects.
فتح الحزمة
افتح القفل للوصول البطاقات البالغ عددها 79 في هذه المجموعة.
فتح الحزمة
k this deck
24
In the domain sampling model, the error that is being considered is the error caused by

A)choosing the wrong domain.
B)systematic error.
C)using a limited sample of items.
D)random error.
فتح الحزمة
افتح القفل للوصول البطاقات البالغ عددها 79 في هذه المجموعة.
فتح الحزمة
k this deck
25
Tests designed according to item response theory

A)are no longer considered useful.
B)can only be used with non-objective material
C)yield more reliable results with fewer items
D)provide low-tech methods for field use.
فتح الحزمة
افتح القفل للوصول البطاقات البالغ عددها 79 في هذه المجموعة.
فتح الحزمة
k this deck
26
The difference between David's two typing tests, one at the beginning of the semester and one at the end, reflects the fact that he typed quite a few term papers during the semester. This reflects

A)attenuation.
B)random error.
C)practice effects.
D)domain sampling.
فتح الحزمة
افتح القفل للوصول البطاقات البالغ عددها 79 في هذه المجموعة.
فتح الحزمة
k this deck
27
A split-half correlation, KR 20, and coefficient alpha are all used to evaluate

A)standard errors of measurement.
B)internal consistency.
C)variance.
D)validity.
فتح الحزمة
افتح القفل للوصول البطاقات البالغ عددها 79 في هذه المجموعة.
فتح الحزمة
k this deck
28
Why might different random samples of domain items yield different estimates of the true score?

A)sampling error
B)poor reliability
C)respondent error
D)item bias
فتح الحزمة
افتح القفل للوصول البطاقات البالغ عددها 79 في هذه المجموعة.
فتح الحزمة
k this deck
29
Which of the following would tend to provide the most conservative estimate of split-half reliability?

A)the Phillips method
B)the Spearman-Brown formula
C)coefficient alpha
D)the odd-even reliability coefficient
فتح الحزمة
افتح القفل للوصول البطاقات البالغ عددها 79 في هذه المجموعة.
فتح الحزمة
k this deck
30
Suppose you were trying to estimate the reliability of a whole test on the basis of the correlation between scores on the two halves of the test. In order to correct for using scores based on the halves, you might use the

A)KR 20.
B)alpha method.
C)Spearman-Brown formula.
D)split half method.
فتح الحزمة
افتح القفل للوصول البطاقات البالغ عددها 79 في هذه المجموعة.
فتح الحزمة
k this deck
31
A reliability coefficient of .60 suggests that

A)64% of the variance on the test is error.
B)40% of the variance on the test is error.
C)78% of the variance on the test is error.
D)the test can be used for clinical purposes but not for research.
فتح الحزمة
افتح القفل للوصول البطاقات البالغ عددها 79 في هذه المجموعة.
فتح الحزمة
k this deck
32
If a researcher is attempting to assess the reliability of a measure of depression, the method of choice would be

A)internal consistency.
B)time sampling.
C)the test-retest method.
D)more than one of these.
فتح الحزمة
افتح القفل للوصول البطاقات البالغ عددها 79 في هذه المجموعة.
فتح الحزمة
k this deck
33
Federal government guidelines require that a test be

A)standardized for use among all U.S. sub-populations.
B)factor analyzed before it can be used to make employment decisions.
C)reliable before it can be used to make employment decisions.
D)reliable above the .90 level.
فتح الحزمة
افتح القفل للوصول البطاقات البالغ عددها 79 في هذه المجموعة.
فتح الحزمة
k this deck
34
The Spearman Brown formula corrects for deflated reliability due to

A)half-length tests.
B)small sample size.
C)systematic error.
D)poor test item construction.
فتح الحزمة
افتح القفل للوصول البطاقات البالغ عددها 79 في هذه المجموعة.
فتح الحزمة
k this deck
35
Dr. Janine developed two equivalent forms of a test and administered them both, in counter-balanced order, to a group of people on the same day in order to assess reliability. What is this called?

A)test- retest
B)parallel forms
C)split-half
D)KR 20
فتح الحزمة
افتح القفل للوصول البطاقات البالغ عددها 79 في هذه المجموعة.
فتح الحزمة
k this deck
36
Dr. Smith is trying to determine the reliability of a new personality test. Two randomly parallel tests, A and B, have a correlation of .81. What is the estimated reliability of the new personality test?

A).81
B)-.9
C).9
D).81/ t
فتح الحزمة
افتح القفل للوصول البطاقات البالغ عددها 79 في هذه المجموعة.
فتح الحزمة
k this deck
37
The problems created by using a limited number of items to represent a larger and more complicated construct are explicitly considered in the ____ model.

A)multivariate
B)random sampling
C)domain sampling
D)standard error of measurement
فتح الحزمة
افتح القفل للوصول البطاقات البالغ عددها 79 في هذه المجموعة.
فتح الحزمة
k this deck
38
The method for estimating the internal consistency of a test that simultaneously considers all possible ways of splitting the items is the

A)Spearman Brown formula.
B)Kuder-Richardson formula.
C)Cronbach's alpha.
D)the odd-even method.
فتح الحزمة
افتح القفل للوصول البطاقات البالغ عددها 79 في هذه المجموعة.
فتح الحزمة
k this deck
39
Upon repeated applications of the same test, performance on the second application may be affected by previous experience on the test. This is known as

A)attenuation.
B)a carryover effect.
C)shrinkage.
D)selected recall.
فتح الحزمة
افتح القفل للوصول البطاقات البالغ عددها 79 في هذه المجموعة.
فتح الحزمة
k this deck
40
As opposed to reliability based on the classical test theory, ____ focuses on the range of item difficulty that is useful in assessing an individual's ability.

A)domain sampling
B)internal consistency
C)coefficient alpha
D)item response theory
فتح الحزمة
افتح القفل للوصول البطاقات البالغ عددها 79 في هذه المجموعة.
فتح الحزمة
k this deck
41
Which of the following is a problem in evaluating the agreement between observers in behavioral studies?

A)The observers are usually not trained.
B)The behaviors being studied are usually not directly observable.
C)There will always be some agreement by chance.
D)There is no method for evaluating the agreement between observers.
فتح الحزمة
افتح القفل للوصول البطاقات البالغ عددها 79 في هذه المجموعة.
فتح الحزمة
k this deck
42
In order to determine the unidimensionality of a test, you can use

A)factor analysis.
B)split half reliability.
C)parallel forms assessment.
D)the Spearman-Brown prophecy formula.
فتح الحزمة
افتح القفل للوصول البطاقات البالغ عددها 79 في هذه المجموعة.
فتح الحزمة
k this deck
43
Test constructors can improve test reliability by

A)increasing the number of items.
B)decreasing the number of items.
C)retaining items that have the most face validity.
D)reducing the item to total correlation.
فتح الحزمة
افتح القفل للوصول البطاقات البالغ عددها 79 في هذه المجموعة.
فتح الحزمة
k this deck
44
Correction for attenuation is used

A)to estimate the validity of a test.
B)to correct for tests that are short.
C)to correct for tests that are long.
D)to estimate the true correlation between variables that have been measured with error.
فتح الحزمة
افتح القفل للوصول البطاقات البالغ عددها 79 في هذه المجموعة.
فتح الحزمة
k this deck
45
Measures of test-retest reliability are sometimes considered inappropriate for the evaluation of health status because

A) health status tests should not given at multiple points in time.
B)variations in health status may be related to true changes over time rather than measurement error.
C)there is no domain of health status.
D)health status is too complicated to measure.
46
The difference between KR 20 and coefficient alpha is

A)KR 20 can be used to evaluate time sampling problems while alpha cannot.
B)Alpha can be used to evaluate time sampling problems while KR 20 cannot.
C)KR 20 can only be used for items scored right or wrong but Alpha can be used for items in any format.
D)Alpha can only be used for items scored right or wrong but KR 20 can be used for items in any format.
47
The kappa statistic is used to

A)assess the level of agreement among several observers.
B)estimate the correlation between a continuous variable and an artificially dichotomous variable.
C)estimate the percentage of disagreement between observers.
D)estimate the validity of behavioral observation.
48
If the same test, given at different points in time to the same test takers, yields different scores, then the method typically used to assess this source of error is

A)test-retest.
B)alternate forms/parallel forms.
C)split-half.
D)KR 20.
49
Jennifer read a report in which the agreement between raters of children's aggressive behavior was .50, indicating

A)the raters agreed at chance levels.
B)agreement was poor.
C)agreement was excellent.
D)agreement was moderate.
50
Which of the following is true of the parallel forms method?

A)It is the most often used method for estimating reliability.
B)It provides one of the most rigorous methods for estimating reliability.
C)It is largely ineffective with psychological tests.
D)Sophisticated computer programs have made it unnecessary.
51
Standard errors of measurement are used to

A)determine whether an observed score is the "true" score.
B)determine the standard deviation of the scores.
C)calculate the exact true score.
D)create confidence intervals around specific observed test scores.
52
The preferred method for assessing the level of agreement between observers is the

A)kappa statistic
B)Spearman coefficient
C)coefficient alpha
D)rank-order statistic
53
Approximately what value must a reliability coefficient have for most purposes in basic research?

A).90
B).50
C).70
D).30
54
What is the impact of carryover effects on test-retest reliability?

A)Test-retest reliability is not influenced by carryover effects.
B)Carryover effects result in an overestimation of reliability.
C)Carryover effects result in an underestimation of reliability.
D)Test-retest reliability increases carryover effects.
55
The reliability of a difference score is

A)equal to the reliability of the more reliable of the two measures.
B)equal to the reliability of the less reliable of the two measures.
C)the average reliability of the two measures.
D)expected to be lower than the reliability of either of the two measures.
56
Difference scores are created by

A)subtracting one test score from another.
B)subtracting the true score from a predicted score.
C)eliminating error from true scores.
D)giving a test to two different individuals.
57
The standard error of measurement allows us to

A)estimate the degree to which a test provides inaccurate readings.
B)have an acceptable margin of error.
C)determine the source of error.
D)avoid any measurement error.
58
Which of the following is used to estimate the number of items that should be added to a test to achieve a specified reliability?

A)KR 20
B)coefficient alpha
C)Spearman-Brown prophecy formula
D)split-half technique
59
Which of the following is a source of measurement error?

A)respondent sampling
B)scorer sampling
C)internal consistency
D)external consistency
60
Items are probably measuring the same thing when the correlation between an item and the total score

A)is high.
B)is low.
C)approaches 0.
D)is negative.
61
What is the most useful indicator of reliability for the interpretation of individual scores?

A)split-half variance
B)item sampling
C)test-retest
D)standard error of measurement
62
Explain how someone might decide how reliable is "reliable enough" for a measure. What settings might warrant more stringent criteria for reliability, and why?
63
Describe some of the advantages and disadvantages associated with behavioral observation techniques. Provide examples.
64
Briefly discuss each of the APA's standards for reliability.
65
The reliability coefficient is

A)the mean of the observed scores.
B)the variance of the observed scores.
C)the ratio of the mean of the true scores on a test to the mean of the observed scores.
D)the ratio of the variance of the true scores on a test to the variance of the observed scores.
66
In the domain sampling model, the reliability of a test increases as

A)the number of items increases.
B)the number of items decreases.
C)the number of test administrations increases.
D)the number of test administrations decreases.
67
The intercorrelation among items within the same test is referred to as

A)interrater reliability.
B)discriminability.
C)standard errors of measurement.
D)internal consistency.
68
Reliability theory combines De Moivre's concept of sampling error with Pearson's concept of _____________ in the context of measurement.

A)coefficient alpha
B)internal consistency
C)product moment correlation
D)domain sampling
69
Describe the reasons for the large movement from Classical Test Theory to Item Response Theory.
70
For which of these constructs is it most appropriate to measure test-retest reliability?

A)IQ
B)Depression
C)Literacy
D)Blood pressure
71
Classical Test Theory is based on certain assumptions. Discuss these basic assumptions and the theory behind them, and then address the challenges to any of these assumptions.
72
Interrater reliability is of concern in

A)personality testing.
B)behavioral observation studies.
C)factor analysis.
D)parallel forms assessment.
73
Tests will be most reliable if they are

A)multidimensional.
B)unidimensional.
C)brief.
D)criterion-referenced.
74
Carryover effects only affect reliability when changes over time are

A)large.
B)systematic.
C)random.
D)due to practice effects.
75
There are several methods for estimating reliability. Compare and contrast the methods discussed in this chapter, stressing the importance of coefficient alpha.
76
The formula used to estimate how long a test must be to achieve a desired level of reliability is

A)kappa
B)prophecy
C)Spearman
D)Thorndike
77
The prophecy formula is used to

A)predict expected values.
B)estimate how long a test must be to achieve a desired level of reliability.
C)estimate how long a test must be to achieve a desired level of validity.
D)calculate variability.
78
Discuss the challenges to the use of difference scores.
79
Classical test theory assumes that

A)there are no errors in measurement.
B)each person has a true score.
C)observed scores almost always reflect true ability.
D)errors of measurement are systematic.