Deck 14: Classroom Assessment and Grading

Question
A test or rating scale is objective to the extent that it

A) is free of biases of the administrators and scorers.
B) measures only one, or only a very few variables.
C) predicts an important and realistic criterion.
D) yields the same score each time an individual takes it.
Question
Identify the type of objective test item that is most appropriate for measuring the following specific learning outcome: "Select the best reason for a specific action from a given list of alternatives."

A) Essay
B) Multiple-choice
C) Short-answer
D) True-false
Question
Objective tests are generally more reliable than essay tests because objective tests can

A) be corrected for guessing a response correctly.
B) contain more independent items measuring achievement.
C) eliminate subjective judgment in their preparation.
D) measure almost any important educational attainment.
Question
The most important use of essay tests is to

A) measure simple learning outcomes.
B) measure complex learning outcomes.
C) reduce grading time.
D) sample a wide variety of learning outcomes.
Question
When you write multiple-choice items, you should use

A) as much wording as possible in the distractors.
B) distractors that require fine discriminations.
C) "none of the above" less frequently than "all of the above."
D) stems that present a single problem.
Question
The key feature of authentic assessments is

A) development of tests by professional evaluators.
B) high test-retest reliability.
C) testing in a realistic context.
D) use of essays as the primary form of testing.
Question
A major difference between formative and summative tests is the

A) format of the test items.
B) interpretation of the test data.
C) preparation of the test directions.
D) role played by content validity in the two tests.
Question
Which one of the following strategies does NOT tend to increase the reliability of essay test grades?

A) Base your ratings on a model answer that you have constructed.
B) Grade all essay items for each student in turn based on a pre-established point system.
C) Have students place their names on the back of their test papers.
D) Score all responses to one essay item before moving on to the next item.
Question
What type of test would provide the most useful information for the following question: "Are the learning outcomes for the new unit sequenced appropriately?"

A) Diagnostic
B) Formative
C) Placement
D) Summative
Question
All of the following statements are true of essay tests EXCEPT:

A) Each question should give students a precise task.
B) Less material can be covered in essay than in multiple-choice tests.
C) Students should be able to answer the questions in a few words for the sake of efficiency.
D) Questions should measure the higher-level objectives.
Question
"We will have weekly quizzes, but your final grade will be based only on the midterm and final exam." This decision implies that the quizzes are to be used for what type of evaluation?

A) Criterion-referenced
B) Formative
C) Norm-referenced
D) Summative
Question
Which one of the following sources would be the LEAST likely product to be found in a student's portfolio?

A) Artistic products
B) Peer comments
C) Standardized test results
D) Written products
Question
Starch and Elliott's classic 1912 experiments dealing with the extent of teachers' personal values and biases in scoring essay tests demonstrated what measurement issue that still plagues essay scoring today?

A) Objectivity
B) Relevance
C) Reliability
D) Validity
Question
The term objective as used in objective testing refers to the

A) content goals of the items.
B) goal(s) of the test.
C) type of material covered.
D) way the test is scored.
Question
What type of test would provide the most useful information for the following question: "Are students making satisfactory progress in learning the metric system?"

A) Diagnostic
B) Formative
C) Placement
D) Summative
Question
At the beginning of the semester, Mr. Rumstead gave a formative test for the purposes of setting objectives. At the end of the course he gave the same test to determine grades. The second time this test was given, it was used as what type of test?

A) Aptitude
B) Diagnostic
C) Formative
D) Summative
Question
The most defensible practice for scoring essay tests is to evaluate

A) all parts of one student's paper before going on to the next student's paper.
B) each one of the items for all students with reference to its respective model answers.
C) each question as acceptable or unacceptable and assign equal weight to each question.
D) the response for each question with regard to content, organization, and mechanics, with each factor weighted equally.
Question
Exhibitions differ from portfolios because exhibitions

A) are authentic assessments.
B) involve an immediate audience.
C) use criterion-referenced standards.
D) use norm-referenced standards.
Question
What guideline for writing multiple-choice items is violated in the following item stem? "A norm-referenced test is..."

A) Each alternative must fit the grammatical form of the item.
B) Item stems should be stated in simple terms.
C) The stem should include a complete question.
D) Unessential details should be omitted in the item stem.
Question
Which one of the following actions is a limitation of multiple-choice tests?

A) Allow for bluffing
B) Are difficult to grade
C) Can be difficult to prepare
D) Cannot measure higher-order learning
Question
The validity of any test is related directly to the

A) difficulty of the test.
B) evaluation of expert reviewers.
C) length of the test.
D) purpose of the test.
Question
A typical criterion-referenced report card that reports student learning tends to be

A) complex and time-consuming for teachers.
B) constructive for group comparisons.
C) convenient but not helpful for many students.
D) practical for elementary grades but not for high school.
Question
Ms. Bateman writes "the answer is just great" as the only comment on a student's paper. Based on Woolfolk's discussion of feedback, this type of feedback is

A) less appropriate if Ms. Bateman is a tenth-grade teacher than a fourth-grade teacher.
B) more appropriate if Ms. Bateman is a tenth-grade teacher than a fourth-grade teacher.
C) more appropriate for an essay test than for a multiple-choice test.
D) rarely appropriate regardless of grade level or type of test.
Question
The type of skills that would be most effective for teachers to have in conducting conferences with students and their families is skill in

A) academic knowledge.
B) communication.
C) creativity.
D) problem solving.
Question
A popular grading system to use for combining grades from many assignments is to use

A) an average of the grades from all sources.
B) an average of all of the norm-referenced scores.
C) a point system.
D) percentage grading.
Question
When teachers want to give different weights to tests and assignments, it is most appropriate to use what type of grading system?

A) Criterion-referenced
B) Normal curve
C) Percentage
D) Point or norm-referenced
Question
Mr. Garren has been emphasizing authentic testing in his social studies class. Which one of the following will be a likely result of this emphasis?

A) Fewer essay tests
B) More exhibitions by students of their work
C) More mastery grading of performances
D) More reliable grading of students
Question
A local high school developed a math achievement test and used the results to determine the selection of students for an advanced placement course with a limited number of seats. What type of test should be used?

A) Criterion-referenced
B) Diagnostic
C) Norm-referenced
D) Standardized
Question
Grades should be tied to meaningful course objectives so that

A) high-ability students will be motivated to pursue worthwhile goals.
B) low-ability students will be motivated to pursue worthwhile goals.
C) students do not have to choose between learning and making a grade.
D) the course objectives can be tested fairly and reliably.
Question
Which one of the following situations requires a norm-referenced evaluation?

A) Assessing whether an individual has been drinking too much to drive
B) Certifying whether a newly graduated education student can perform satisfactorily as a teacher
C) Hiring one manager from a pool of ten applicants for a large department store
D) Reporting to parents about how much students have learned during the semester
Question
Criterion-referenced tests are used primarily to assess

A) achievement of general instructional goals.
B) each student's achievement compared to other students.
C) mastery of specific objectives.
D) the range of achievement in a large group.
Question
Paper-and-pencil exercises, direct observations of performances, development of portfolios, and creation of artifacts are all methods of

A) assessment.
B) evaluation.
C) measurement.
D) testing.
Question
Which one of the following procedures would improve the reliability and validity of grading short essay tests, thus refuting the complaint of sensitivity to bias and variability in grading?

A) Administering more pretests
B) Grading on the curve
C) Implementing a contract system
D) Using a scoring rubric
Question
A recommended procedure for authentic assessment is

A) grading on the curve in order to determine overall performance scores.
B) having students participate in developing the rating scales and scoring rubrics to be used in evaluation.
C) using authentic testing initially with higher-achieving students, with gradual integration of other students.
D) using only clearly defined, highly structured tasks or problems.
Question
The most important attribute of a norming sample is that it should be

A) completely random.
B) large and diverse.
C) limited in size.
D) similar to future test-takers.
Question
When a test actually measures what it purports to measure, the test is said to be

A) credible.
B) reliable.
C) usable.
D) valid.
Question
What strategy is recommended instead of assigning a failing grade to students' poor work?

A) Consider the work to be incomplete.
B) Give students support in revising the work.
C) Maintain high standards for students' work.
D) Take responsibility for the students' poor work.
Question
A test or any assessment instrument is objective to the extent that it

A) is free of biases of the test administrators and scorers.
B) measures only a very few variables at a time.
C) predicts an important and realistic criterion.
D) yields the same score each time an individual takes it.
Question
For which one of the following situations would a criterion-referenced test be the most appropriate measure to use?

A) Assessing the range of abilities in a large, mixed-ability group of students
B) Comparing students' general ability in specific subject areas such as English, algebra, or general science
C) Measuring mastery of basic competencies in addition and subtraction
D) Selecting candidates for a teaching position when only a few openings are available
Question
A school administrator wants to identify the top 10 percent of the Grade 12 students in order to recommend them for scholarship competition at the highest rated university in Canada. What type of test would serve the administrator's purpose?

A) Criterion-referenced
B) Diagnostic
C) Norm-referenced
D) Standardized
Question
Which of the following is NOT considered an appropriate accommodation in testing?

A) Allowing for the test to be written in a different room, minimizing distractions.
B) Giving a different test for a student to complete that covers easier content.
C) Allowing the student to complete a long test in two smaller segments with a short break in between.
D) Including fewer items on each page of the test.
Question
With regard to the practice of retaining or "holding back" students with failing grades, Woolfolk's general recommendation is that

A) promotion should include resource room assignments as well as one-to-one tutoring when needed.
B) promotion underscores the idea that poor performances bring negative consequences.
C) retention is usually better for self-esteem and performance than undeserved promotion.
D) students should be promoted with their peers but provided with extra help in the summer or the next year.
Question
Descriptive rating scales are a type of scoring rubric.
Question
Common types of authentic assessments include portfolios and exhibitions.
Question
It is generally best to put as little wording as possible into the stem of a multiple-choice question relative to the distractors.
Question
Having students assist in the development of rating scales and scoring rubrics can lead to improved learning.
Question
Multiple-choice questions may be used to test higher-level objectives.
Question
It is considered desirable to grade one essay question for the entire class before grading the next one for any student.
Question
Do evaluations of portfolios, exhibitions, and other types of authentic assessment products have the same measurement concerns as those for classroom and standardized tests?

A) No, because authentic assessments are constructed by the students themselves.
B) No, because authentic assessments are inherently valid and fair to students.
C) Yes, because authentic assessments have the same potential for invalidity and unfairness.
D) Yes, because authentic assessments promote learning better than paper-and-pencil tests.
Question
The dual marking system refers to

A) using two teachers to establish inter-rater reliability on the same scoring of a student test.
B) both the teacher and the student scoring the student's work using the same rubric and comparing the results.
C) establishing a marking system that ensures validity and reliability.
D) establishing two grades (one for actual achievement and the second including a judgment about effort in the final grade).
Question
In a student portfolio, the teacher should determine what evidence of work should be included.
Question
A multiple-choice item format is preferable to a matching format when related concepts are to be linked.
Question
What is the benefit of using a behaviour-content matrix as described in the Woolfolk text?

A) It can help teachers develop test items appropriate to each combination of skill and content.
B) It can help teachers determine potential sources of testing anxiety experienced by students.
C) It can help teachers assess the validity and reliability of testing items.
D) It can help identify gaps in students' knowledge and skills within a test.
Question
Which is true of most standardized achievement tests used in Canada?

A) They are norm-referenced.
B) They are criterion-referenced.
C) They are free of cultural bias.
D) They are helpful in measuring aptitude.
Question
Michael's LSAT (Law School Admission Test) scores correlate with his grade point average achieved in his first year of law school. Therefore, the LSAT is useful in deciding who to admit to law school due to

A) good curriculum alignment.
B) good construct-related evidence of validity.
C) good criterion-related evidence of validity.
D) good evidence of absence of assessment bias.
Question
What term is used to describe a system in which each student works for a particular grade according to agreed-upon standards?

A) contract system
B) norm-referencing system
C) criterion-referencing system
D) dual marking system
Question
Establishing scoring standards is more difficult for making norm-referenced decisions than for making criterion-referenced decisions.
Question
Which of the following examples is NOT related to the halo effect?

A) Jim is a nice student who never causes any trouble in class. The teacher gives him a B- instead of a C+ (which more accurately reflects his true grade).
B) Alice tries really hard in class. The teacher holds a conference with her parents to explore why effort is not impacting achievement.
C) Marley is very disruptive and disrespectful. She turns in a paper that, although it would be scored as a Level 4 on the rubric, receives a Level 3.
D) Mahsa completes a presentation on the Canadian Rockies. Although factually incorrect, the teacher is very impressed with the use of media in her presentation and slightly raises her grade due to this effort.
Question
When a test developer calculates an estimate of how much students' scores vary due to unreliability, they are most interested in

A) a confidence interval.
B) measures of central tendency.
C) standard error of measurement.
D) construct validity.
Question
Authentic tests are generally easier than conventional tests to grade objectively.
Question
Authentic assessments are procedures that directly assess student performances in "real-life" situations.
Question
Criterion-referenced grading systems use standards of subject mastery and learning to determine grades.
Question
Because authentic assessments require students to perform, it is perfectly logical to __________, because the whole point of instruction is for students to do well on these tasks.
Question
Norm-referenced tests are used to indicate progress toward specific competency levels.
Question
Criterion-referenced grading is more appropriate than norm-referenced grading for authentic assessments.
Question
Measurement is the quantitative component of evaluation.
Question
Assessment is the term used to describe the process of gathering information about students' learning outcomes.
Question
A general term for tests designed to assess performances in realistic contexts is __________ testing.
Question
The incorrect answers in a multiple-choice question are called alternatives or ________.
Question
Both criterion-referenced and norm-referenced report cards indicate student progress toward specific goals.
Question
Tests in which the scoring of items does not require interpretation are said to be ________.
Question
Grading on the curve is an example of a criterion-referenced system.
Question
A type of authentic assessment in which students' work is collected to demonstrate progress and achievement in a particular area is a(n) __________.
Question
A general term for the type of testing that is used to guide planning and identify students' needs is ________ assessment.
Question
A general term for the type of testing that is used to determine final achievement in a course is ________ assessment.
Question
Reliability is the degree to which a test measures what it is supposed to measure.
Question
A test must be both reliable and valid in order to be useful.
Question
Teachers who hold lower expectations for ethnic minority students may also be negatively influencing the students' testing performance and assessment experiences.
Question
Criterion-referenced assessment is valuable in determining mastery of basic skills.
Question
When individual differences in achievement are to be reported, a norm-referenced grading system is appropriate.
1
A test or rating scale is objective to the extent that it

A) is free of biases of the administrators and scorers.
B) measures only one, or only a very few variables.
C) predicts an important and realistic criterion.
D) yields the same score each time an individual takes it.
Answer: A) is free of biases of the administrators and scorers.
2
Identify the type of objective test item that is most appropriate for measuring the following specific learning outcome: "Select the best reason for a specific action from a given list of alternatives."

A) Essay
B) Multiple-choice
C) Short-answer
D) True-false
Answer: B) Multiple-choice
3
Objective tests are generally more reliable than essay tests because objective tests can

A) be corrected for guessing a response correctly.
B) contain more independent items measuring achievement.
C) eliminate subjective judgment in their preparation.
D) measure almost any important educational attainment.
Answer: B) contain more independent items measuring achievement.
4
The most important use of essay tests is to

A) measure simple learning outcomes.
B) measure complex learning outcomes.
C) reduce grading time.
D) sample a wide variety of learning outcomes.
5
When you write multiple-choice items, you should use

A) as much wording as possible in the distractors.
B) distractors that require fine discriminations.
C) "none of the above" less frequently than "all of the above."
D) stems that present a single problem.
6
The key feature of authentic assessments is

A) development of tests by professional evaluators.
B) high test-retest reliability.
C) testing in a realistic context.
D) use of essays as the primary form of testing.
7
A major difference between formative and summative tests is the

A) format of the test items.
B) interpretation of the test data.
C) preparation of the test directions.
D) role played by content validity in the two tests.
8
Which one of the following strategies does NOT tend to increase the reliability of essay test grades?

A) Base your ratings on a model answer that you have constructed.
B) Grade all essay items for each student in turn based on a pre-established point system.
C) Have students place their names on the back of their test papers.
D) Score all responses to one essay item before moving on to the next item.
9
What type of test would provide the most useful information for the following question: "Are the learning outcomes for the new unit sequenced appropriately?"

A) Diagnostic
B) Formative
C) Placement
D) Summative
10
All of the following statements are true of essay tests EXCEPT:

A) Each question should give students a precise task.
B) Less material can be covered in essay than in multiple-choice tests.
C) Students should be able to answer the questions in a few words for the sake of efficiency.
D) Questions should measure the higher-level objectives.
11
"We will have weekly quizzes, but your final grade will be based only on the midterm and final exam." This decision implies that the quizzes are to be used for what type of evaluation?

A) Criterion-referenced
B) Formative
C) Norm-referenced
D) Summative
12
Which one of the following sources would be the LEAST likely product to be found in a student's portfolio?

A) Artistic products
B) Peer comments
C) Standardized test results
D) Written products
13
Starch and Elliott's classic 1912 experiments dealing with the extent of teachers' personal values and biases in scoring essay tests demonstrated what measurement issue that still plagues essay scoring today?

A) Objectivity
B) Relevance
C) Reliability
D) Validity
14
The term objective as used in objective testing refers to the

A) content goals of the items.
B) goal(s) of the test.
C) type of material covered.
D) way the test is scored.
15
What type of test would provide the most useful information for the following question: "Are students making satisfactory progress in learning the metric system?"

A) Diagnostic
B) Formative
C) Placement
D) Summative
16
At the beginning of the semester, Mr. Rumstead gave a formative test for the purposes of setting objectives. At the end of the course he gave the same test to determine grades. The second time this test was given, it was used as what type of test?

A) Aptitude
B) Diagnostic
C) Formative
D) Summative
17
The most defensible practice for scoring essay tests is to evaluate

A) all parts of one student's paper before going on to the next student's paper.
B) each one of the items for all students with reference to its respective model answers.
C) each question as acceptable or unacceptable and assign equal weight to each question.
D) the response for each question with regard to content, organization, and mechanics, with each factor weighted equally.
18
Exhibitions differ from portfolios because exhibitions

A) are authentic assessments.
B) involve an immediate audience.
C) use criterion-referenced standards.
D) use norm-referenced standards.
19
What guideline for writing multiple-choice items is violated in the following item stem? "A norm-referenced test is..."

A) Each alternative must fit the grammatical form of the item.
B) Item stems should be stated in simple terms.
C) The stem should include a complete question.
D) Unessential details should be omitted in the item stem.
20
Which one of the following actions is a limitation of multiple-choice tests?

A) Allow for bluffing
B) Are difficult to grade
C) Can be difficult to prepare
D) Cannot measure higher-order learning
21
The validity of any test is related directly to the

A) difficulty of the test.
B) evaluation of expert reviewers.
C) length of the test.
D) purpose of the test.
22
A typical criterion-referenced report card that reports student learning tends to be

A) complex and time-consuming for teachers.
B) constructive for group comparisons.
C) convenient but not helpful for many students.
D) practical for elementary grades but not for high school.
23
Ms. Bateman writes "the answer is just great" as the only comment on a student's paper. Based on Woolfolk's discussion of feedback, this type of feedback is

A) less appropriate if Ms. Bateman is a tenth-grade teacher than a fourth-grade teacher.
B) more appropriate if Ms. Bateman is a tenth-grade teacher than a fourth-grade teacher.
C) more appropriate for an essay test than for a multiple-choice test.
D) rarely appropriate regardless of grade level or type of test.
24
The type of skills that would be most effective for teachers to have in conducting conferences with students and their families is skill in

A) academic knowledge.
B) communication.
C) creativity.
D) problem solving.
25
A popular grading system to use for combining grades from many assignments is to use

A) an average of the grades from all sources.
B) an average of all of the norm-referenced scores.
C) a point system.
D) percentage grading.
26
When teachers want to give different weights to tests and assignments, it is most appropriate to use what type of grading system?

A) Criterion-referenced
B) Normal curve
C) Percentage
D) Point or norm-referenced
27
Mr. Garren has been emphasizing authentic testing in his social studies class. Which one of the following will be a likely result of this emphasis?

A) Fewer essay tests
B) More exhibitions by students of their work
C) More mastery grading of performances
D) More reliable grading of students
28
A local high school developed a math achievement test and used the results to determine the selection of students for an advanced placement course with a limited number of seats. What type of test should be used?

A) Criterion-referenced
B) Diagnostic
C) Norm-referenced
D) Standardized
29
Grades should be tied to meaningful course objectives so that

A) high-ability students will be motivated to pursue worthwhile goals.
B) low-ability students will be motivated to pursue worthwhile goals.
C) students do not have to choose between learning and making a grade.
D) the course objectives can be tested fairly and reliably.
30
Which one of the following situations requires a norm-referenced evaluation?

A) Assessing whether an individual has been drinking too much to drive
B) Certifying whether a newly graduated education student can perform satisfactorily as a teacher
C) Hiring one manager from a pool of ten applicants for a large department store
D) Reporting to parents about how much students have learned during the semester
31
Criterion-referenced tests are used primarily to assess

A) achievement of general instructional goals.
B) each student's achievement compared to other students.
C) mastery of specific objectives.
D) the range of achievement in a large group.
32
Paper-and-pencil exercises, direct observations of performances, development of portfolios, and creation of artifacts are all methods of

A) assessment.
B) evaluation.
C) measurement.
D) testing.
33
Which one of the following procedures would improve the reliability and validity of grading short essay tests, thus refuting the complaint of sensitivity to bias and variability in grading?

A) Administering more pretests
B) Grading on the curve
C) Implementing a contract system
D) Using a scoring rubric
34
A recommended procedure for authentic assessment is

A) grading on the curve in order to determine overall performance scores.
B) having students participate in developing the rating scales and scoring rubrics to be used in evaluation.
C) using authentic testing initially with higher-achieving students, with gradual integration of other students.
D) using only clearly defined, highly structured tasks or problems.
35
The most important attribute of a norming sample is that it should be

A) completely random.
B) large and diverse.
C) limited in size.
D) similar to future test-takers.
36
When a test actually measures what it purports to measure, the test is said to be

A) credible.
B) reliable.
C) usable.
D) valid.
37
What strategy is recommended instead of assigning a failing grade to students' poor work?

A) Consider the work to be incomplete.
B) Give students support in revising the work.
C) Maintain high standards for students' work.
D) Take responsibility for the students' poor work.
38
A test or any assessment instrument is objective to the extent that it

A) is free of biases of the test administrators and scorers.
B) measures only a very few variables at a time.
C) predicts an important and realistic criterion.
D) yields the same score each time an individual takes it.
39
For which one of the following situations would a criterion-referenced test be the most appropriate measure to use?

A) Assessing the range of abilities in a large, mixed-ability group of students
B) Comparing students' general ability in specific subject areas such as English, algebra, or general science
C) Measuring mastery of basic competencies in addition and subtraction
D) Selecting candidates for a teaching position when only a few openings are available
40
A school administrator wants to identify the top 10 percent of the Grade 12 students in order to recommend them for scholarship competition at the highest rated university in Canada. What type of testing would serve the administrator's purpose?

A) Criterion-referenced
B) Diagnostic
C) Norm-referenced
D) Standardized
41
Which of the following is NOT considered an appropriate accommodation in testing?

A) Allowing for the test to be written in a different room, minimizing distractions.
B) Giving a different test for a student to complete that covers easier content.
C) Allowing the student to complete a long test in two smaller segments with a short break in between.
D) Including fewer items on each page of the test.
42
With regard to the practice of retaining or "holding back" students with failing grades, Woolfolk's general recommendation is that

A) promotion should include resource room assignments as well as one-to-one tutoring when needed.
B) promotion underscores the idea that poor performances bring negative consequences.
C) retention is usually better for self-esteem and performance than undeserved promotion.
D) students should be promoted with their peers but provided with extra help in the summer or the next year.
43
Descriptive rating scales are a type of scoring rubric.
44
Common types of authentic assessments include portfolios and exhibitions.
45
It is generally best to put as little wording as possible into the stem of a multiple-choice question relative to the distractors.
46
Having students assist in the development of rating scales and scoring rubrics can lead to improved learning.
47
Multiple-choice questions may be used to test higher-level objectives.
48
It is considered desirable to grade one essay question for the entire class before grading the next one for any student.
49
Do evaluations of portfolios, exhibitions, and other types of authentic assessment products have the same measurement concerns as those for classroom and standardized tests?

A) No, because authentic assessments are constructed by the students themselves.
B) No, because authentic assessments are inherently valid and fair to students.
C) Yes, because authentic assessments have the same potential for invalidity and unfairness.
D) Yes, because authentic assessments promote learning better than paper-and-pencil tests.
50
The dual marking system refers to

A) using two teachers to establish inter-rater reliability on the same scoring of a student test.
B) both the teacher and student scoring student work using the same rubric and comparing.
C) establishing a marking system that ensures validity and reliability.
D) establishing two grades (one for actual achievement and a second that includes judgment about effort in the final grade).
51
In a student portfolio, the teacher should determine what evidence of work should be included.
52
A multiple-choice item format is preferable to a matching format when related concepts are to be linked.
53
What is the benefit of using a behaviour-content matrix as described in the Woolfolk text?

A) It can help teachers develop test items appropriate to each combination of skill and content.
B) It can help teachers determine potential sources of testing anxiety experienced by students.
C) It can help teachers assess the validity and reliability of testing items.
D) It can help identify gaps in students' knowledge and skills within a test.
54
Which is true of most standardized achievement tests used in Canada?

A) They are norm-referenced.
B) They are criterion-referenced.
C) They are free of cultural bias.
D) They are helpful in measuring aptitude.
55
Michael's LSAT (Law School Admission Test) scores correlate with his grade point average achieved in his first year of law school. Therefore, the LSAT is useful in deciding who to admit to law school due to

A) good curriculum alignment.
B) good construct-related evidence of validity.
C) good criterion-related evidence of validity.
D) good evidence of absence of assessment bias.
56
What term is used to describe a system in which each student works for a particular grade according to agreed-upon standards?

A) contract system
B) norm-referencing system
C) criterion-referencing system
D) dual marking system
57
Establishing scoring standards is more difficult for making norm-referenced decisions than for making criterion-referenced decisions.
58
Which of the following examples is NOT related to the halo effect?

A) Jim is a nice student who never causes any trouble in class. The teacher gives him a B- instead of a C+ (which more accurately reflects his true grade).
B) Alice tries really hard in class. The teacher holds a conference with her parents to explore why effort is not impacting achievement.
C) Marley is very disruptive and disrespectful. She turns in a paper that, although it would be scored as a Level 4 on the rubric, receives a Level 3.
D) Mahsa completes a presentation on the Canadian Rockies. Although the presentation is factually incorrect, the teacher is very impressed with the use of media in it and slightly raises her grade due to this effort.
59
When a test developer calculates an estimate of how much students' scores vary due to unreliability, they are most interested in

A) a confidence interval.
B) measures of central tendency.
C) standard error of measurement.
D) construct validity.
60
Authentic tests are generally easier than conventional tests to grade objectively.
61
Authentic assessments are procedures that directly assess student performances in "real-life" situations.
62
Criterion-referenced grading systems use standards of subject mastery and learning to determine grades.
63
Because authentic assessments require students to perform, it is perfectly logical to __________, because the whole point of instruction is for students to do well on these tasks.
64
Norm-referenced tests are used to indicate progress toward specific competency levels.
65
Criterion-referenced grading is more appropriate than norm-referenced grading for authentic assessments.
66
Measurement is the quantitative component of evaluation.
67
Assessment is the term used to describe the process of gathering information about students' learning outcomes.
68
A general term for tests designed to assess performances in realistic contexts is __________ testing.
69
The incorrect answers in a multiple-choice question are called alternatives or ________.
70
Both criterion-referenced and norm-referenced report cards indicate student progress toward specific goals.
71
Tests in which the scoring of items does not require interpretation are said to be ________.
72
Grading on the curve is an example of a criterion-referenced system.
73
A type of authentic assessment in which students' work is collected to demonstrate progress and achievement in a particular area is a(n) __________.
74
A general term for the type of testing that is used to guide planning and identify students' needs is ________ assessment.
75
A general term for the type of testing that is used to determine final achievement in a course is ________ assessment.
76
Reliability is the degree to which a test measures what it is supposed to measure.
77
A test must be both reliable and valid in order to be useful.
78
Teachers who hold lower expectations for ethnic minority students may also be negatively influencing the students' testing performance and assessment experiences.
79
Criterion-referenced assessment is valuable in determining mastery of basic skills.
80
When individual differences in achievement are to be reported, a norm-referenced grading system is appropriate.