University of Phoenix Material

Validity and Reliability Matrix

For each of the tests of reliability and validity listed on the matrix, prepare a 50-100-word description of the test’s application and under what conditions these types of reliability would be used as well as when it would be inappropriate. Then prepare a 50-100-word description of each test’s strengths and a 50-100-word description of each test’s weaknesses.

|TEST OF RELIABILITY |APPLICATION AND APPROPRIATENESS |STRENGTHS |WEAKNESSES |
|Internal Consistency |This method determines the consistency of the items that compose a particular test. To gauge the reliability of the test, one needs an evaluation that measures only one construct or area. The evaluation is administered to a group of people at one time. The administrator then examines the responses to see whether the items, which are all of the same design or construct, yielded the same or similar results from all who took the evaluation. If the responses are similar, the instrument is considered internally consistent and therefore reliable. If the responses vary a great deal, in other words there is no consistency in the responses, the instrument is not considered a reliable measurement. There are several ways to estimate internal consistency, such as the split-half method and Cronbach’s alpha; test-retest and parallel forms, by contrast, are separate types of reliability estimate. The split-half method is a common way to gauge the internal consistency reliability of various survey instruments. (http://www.socialresearchmethods.net/kb/reltypes.php) |The strength lies in comparing the participants’ responses to look for consistency. Because the same items are used for comparison, the responses should all fall within an expected range. This eliminates certain variables present in other forms of comparison. |All items need to be of the same design or construct to eliminate variables. This is fairly easy to achieve in some tests, such as a math test, but much more difficult for more complex or subjective measurements. (http://www.socialresearchmethods.net/kb/reltypes.php) |
|Split-half |This is used to determine internal reliability. Split-half reliability is exactly what the name says: the test is split into two equal parts. All items of the same design or construct are divided into two parts. The entire evaluation is presented to a sample group of participants. After the administration, a total score is calculated for each half. The scores on the two halves are then compared to see if the results are similar. The comparison of the two parts determines the reliability of the whole. (http://www.socialresearchmethods.net/kb/reltypes.php) |This is a fairly easy way to measure and determine the relationship between the two parts. If the two parts are similar, the instrument is considered internally consistent and reliable. Because of the simplicity of this method, it is often used to determine the reliability of various survey instruments. (http://www.socialresearchmethods.net/kb/reltypes.php) |In this method multiple items need to be developed, all using the same design or construct. This could be difficult for more complex constructs. (http://www.socialresearchmethods.net/kb/reltypes.php) |
|Test/retest |This method presents the same test to the same sample group on two different occasions. It is typically applied to a control group, which is administered a “pretest” and a “posttest”. No substantial changes are made to the construct between the two tests, and there is no “instructing” of the control group between the two testing sessions. A critical factor in this method is the amount of time permitted between the two tests. The correlation between the two tests is expected to be higher if the time between testing sessions is short. The longer the time frame between the tests, the lower the correlation between them is likely to be. In test-retest we can therefore obtain different estimates of reliability depending on the length of time between testing sessions. (http://www.socialresearchmethods.net/kb/reltypes.php) |Only one form of the test needs to be used or developed. The same control group is used for both testing sessions, eliminating the need to gather a second group. |There is no information about reliability until the posttest. If the reliability turns out to be low after the posttest, all the time between the testing sessions has been wasted. The amount of time between testing sessions could have a major impact. The number of people involved in the control group is also a factor. The test developer also needs to design multiple items which measure the same construct. This could be a weakness for some constructs. (http://www.socialresearchmethods.net/kb/reltypes.php) |
|Parallel and alternate forms |For this measurement of reliability, test developers need to create two forms of the test that are considered to be equal or parallel to each other. The easiest way to do this is to create a large bank of questions. The questions all need to address the same construct. The questions are then randomly assigned to two different tests. Both tests are administered to the same group of people, and the results of both tests are compared. The estimate of reliability is the similarity between the results. This approach relies on the randomly divided questions being parallel or equivalent. In order to be reliable, the two parallel forms have to be designed so that they are equal. Each test should be independent of the other, and the two should be found to be equivalent measures. (http://www.socialresearchmethods.net/kb/reltypes.php) |Test designers can develop one large item bank. This bank of items can be used to develop alternate measures of the same thing. |Test designers need to develop a large bank of items, all of the same construct. This is time consuming and difficult. It is also very challenging for complex or subjective constructs. In practice, Cronbach’s alpha, an estimate of internal consistency computed from a single form, is the most often used alternative. |

|TEST OF VALIDITY |APPLICATION AND APPROPRIATENESS |STRENGTHS |WEAKNESSES |
|Face validity |Face validity judges how well a measure or procedure appears to measure a specific criterion. There are many questions to be considered, such as: Does it seem like a reasonable way to gain the needed information? Is it well designed? Will it be reliable? Face validity does not depend on established theories for support. This is in direct contrast to content validity (Fink, 1995). (http://writing.colostate.edu/guides/research/relval/com2b5.cfm) |This is a quick and easy first step to determine validity. It is a “face value” way to estimate whether the test will measure a certain criterion. Since it does not require extensive examination, the determination can be made by an amateur. |This is only an estimate. First impressions can be misleading. There is no guarantee that the test will be an accurate measurement of the criterion. |
|Content validity |Content validity determines the likelihood that the intended content of a specific domain will be measured (Carmines & Zeller, 1991, p. 20). Content validity relies on a theoretical approach to determine if a test is actually measuring all domains within a certain criterion. This approach involves subject matter experts, referred to as SMEs. The purpose of the SME is to evaluate test items using the test specifications. Content validity is rather uncomplicated for concrete information or domains. However, if the intent is to measure more abstract concepts, such as socio-cultural domains, it becomes much more complicated. For example, if a researcher needs to evaluate an attitude like self-esteem, they must first determine what is contained within the relevant domain for that attitude. (http://writing.colostate.edu/guides/research/relval/com2b5.cfm) |In order to determine content validity, researchers must first very clearly define the domains under study. Content-related evidence relies on subject matter experts (SMEs) who evaluate test items against the test specifications before they become part of the assessment. |This method can be expensive and time consuming because of the need to involve subject matter experts. These SMEs will either need to be located and hired, or time will have to be spent training someone to become the subject matter expert. (http://writing.colostate.edu/guides/research/relval/com2b5.cfm) |
|Criterion related |Criterion-related validity is also known as instrumental validity. It is used to determine the accuracy of a measure or a procedure by comparing it with another measure or procedure that has already been determined to be valid. Criterion-related validity refers to measures used for prediction or estimation. There are two types of criterion-related validity: concurrent and predictive validity. Criterion-related validity is often used in the selection of employees. Prospective employees are presented with a battery of tests. Employees are then selected based on their performance on the tests. The assumption is that the person who does well on the tests will also perform well on the job. (http://writing.colostate.edu/guides/research/relval/com2b5.cfm) |There are two different estimators, which provides for a wider range of application. Employee selection tests are often used as predictors of job performance. |Accuracy is often hard to predict. Just because someone performs well on a test does not always predict how well they will perform in other areas. At the same time, some people perform poorly on tests but do extremely well in other areas or situations. |

References:
Types of Reliability, retrieved on 7/1/2001 from http://www.socialresearchmethods.net/kb/reltypes.php
Validity, retrieved on 7/1/2011 from http://writing.colostate.edu/guides/research/relval/com2b5.cfm
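
Illustration: the split-half and Cronbach’s alpha estimates named in the reliability matrix can be computed from a table of item scores. The following is only a minimal sketch under stated assumptions, not a prescribed procedure. The function names and the sample `responses` data are hypothetical, and the items are split into odd and even positions for repeatability, whereas the matrix describes a random split. The same `pearson` helper could also be applied to pretest and posttest totals for a test/retest estimate.

```python
from statistics import mean, pvariance


def pearson(x, y):
    """Pearson correlation between two equally long lists of scores."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def split_half(scores):
    """Correlate each participant's total on the two halves of the test.

    scores: one list of item scores per participant.
    Assumption: items are split by position (odd vs. even), not at random.
    """
    first_half = [sum(row[0::2]) for row in scores]
    second_half = [sum(row[1::2]) for row in scores]
    return pearson(first_half, second_half)


def cronbach_alpha(scores):
    """Cronbach's alpha from item variances and total-score variance."""
    k = len(scores[0])  # number of items
    item_variances = [pvariance([row[i] for row in scores]) for i in range(k)]
    total_variance = pvariance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical data: five participants answering four items on a 1-5 scale.
responses = [
    [4, 5, 4, 5],
    [2, 2, 3, 2],
    [5, 4, 5, 5],
    [1, 2, 1, 2],
    [3, 3, 4, 3],
]

print("split-half correlation:", round(split_half(responses), 3))
print("Cronbach's alpha:", round(cronbach_alpha(responses), 3))
```

Values close to 1 suggest that the items behave consistently; values near 0 suggest that the instrument is not a reliable measurement of a single construct.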