AMCLASS Validation Studies

There are no standard definitions of the descriptive terminology used to characterize the configuration, severity, site of lesion, or asymmetry of an audiogram. Nor are there independent measures against which definitions of these terms can be validated. The best method for validating AMCLASS definitions is to compare AMCLASS categories against the judgments of a panel of expert audiologists.

To validate the rules for determining configuration, severity, and site of lesion, we selected 231 audiograms from a clinical database and asked a panel of five expert judges (four audiologists and one otologist) to select a configuration, severity, and site of lesion category for each. From their responses, a consensus was determined for each audiogram, that is, the category chosen most frequently by the panel of judges. This made it possible to compare the judgments of each judge with the consensus and to compare the categories selected by AMCLASS with the consensus. If AMCLASS agreement with the consensus is as high as the agreement between the average judge and the consensus, we can claim that AMCLASS is as good as an average expert audiologist.
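The consensus comparison described above can be sketched in a few lines of Python. This is a minimal illustration rather than the procedure actually used in the study: the category names, the toy panel data, and the function names `consensus` and `percent_agreement` are all hypothetical, and the tie-breaking rule is an assumption because the text does not say how ties among judges were resolved.

```python
from collections import Counter


def consensus(labels):
    """Return the category chosen most often by the panel for one audiogram.

    Assumption: ties go to the category that appears first in the list of
    judges. The source text does not describe how ties were handled.
    """
    return Counter(labels).most_common(1)[0][0]


def percent_agreement(a, b):
    """Percentage of cases on which two equal-length label lists match."""
    if len(a) != len(b):
        raise ValueError("label lists must have the same length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


# Hypothetical toy data: each row is one audiogram, each column one judge.
panel = [
    ["Flat", "Flat", "Sloping", "Flat", "Flat"],
    ["Sloping", "Sloping", "Sloping", "Rising", "Sloping"],
    ["Rising", "Flat", "Rising", "Rising", "Flat"],
    ["Flat", "Sloping", "Sloping", "Sloping", "Flat"],
]
# Hypothetical classifier output for the same four audiograms.
amclass = ["Flat", "Sloping", "Flat", "Sloping"]

# One consensus label per audiogram.
consensus_labels = [consensus(row) for row in panel]

# Transpose so that each entry holds one judge's labels across all audiograms.
judge_columns = [list(col) for col in zip(*panel)]

# Agreement of each judge with the consensus, then the panel average.
judge_scores = [percent_agreement(col, consensus_labels) for col in judge_columns]
avg_judge_vs_consensus = sum(judge_scores) / len(judge_scores)

# Agreement of the classifier with the consensus.
amclass_vs_consensus = percent_agreement(amclass, consensus_labels)

print(consensus_labels)
print(avg_judge_vs_consensus, amclass_vs_consensus)
```

In this toy example the classifier matches the consensus on a larger share of cases than the average judge does, which is the pattern the validation criterion looks for.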

A separate validation study was conducted for interaural asymmetry. The same expert panel judged 199 audiograms to be symmetrical or asymmetrical.

Agreement among judges, consensus, and AMCLASS is summarized in the table below. Interjudge agreement was surprisingly low. The average agreement between pairs of judges for configuration indicates that judges disagreed on about one third of the cases. On average, they disagreed on about a fifth of the cases for severity and about a quarter of the cases for site of lesion. The low interjudge agreement indicates that there is little consistency in how audiograms are described, even among highly experienced audiologists.

The average agreement between judges and AMCLASS was higher than the average interjudge agreement. This indicates that on average the judges agreed with AMCLASS more often than they agreed with each other.
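The two averages compared here can be computed as follows. This is a hedged sketch on made-up data, not the study's own analysis code: the labels "S" (symmetrical) and "A" (asymmetrical), the three-judge panel, and the helper names are hypothetical and chosen only to keep the example short.

```python
from itertools import combinations


def percent_agreement(a, b):
    """Percentage of cases on which two equal-length label lists match."""
    if len(a) != len(b):
        raise ValueError("label lists must have the same length")
    return 100.0 * sum(x == y for x, y in zip(a, b)) / len(a)


def mean_pairwise_agreement(judges):
    """Average percent agreement over every distinct pair of judges."""
    pairs = list(combinations(judges, 2))
    return sum(percent_agreement(a, b) for a, b in pairs) / len(pairs)


def mean_agreement_with(judges, reference):
    """Average percent agreement between each judge and a reference labeling."""
    return sum(percent_agreement(j, reference) for j in judges) / len(judges)


# Hypothetical toy data: each inner list is one judge's labels for four audiograms.
judges = [
    ["S", "A", "S", "S"],
    ["S", "A", "A", "S"],
    ["A", "A", "S", "S"],
]
# Hypothetical classifier output for the same audiograms.
amclass = ["S", "A", "S", "S"]

interjudge = mean_pairwise_agreement(judges)
judges_vs_amclass = mean_agreement_with(judges, amclass)

print(round(interjudge, 1), round(judges_vs_amclass, 1))
```

With these toy labels the judges agree with the classifier more often than with each other, mirroring the relationship between the first two columns of the table.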

The most important comparison for assessing AMCLASS is between the last two columns of the table. For configuration, severity, and asymmetry, agreement between AMCLASS and consensus is higher than the average agreement between judges and consensus. This indicates that AMCLASS performs better than the average expert. For site of lesion, agreement between AMCLASS and consensus was slightly lower than the average agreement between judges and consensus. This occurred because the judges did not use the "Sensorineural or Mixed" category as expected. This category was included for cases of profound hearing loss in which the bone conduction thresholds cannot be measured because of the maximum output limits of audiometers. The judges tended to categorize these cases as Sensorineural. Rather than adjust the rules to maximize agreement with the judges, as was done for configuration, severity, and asymmetry, we chose to retain this definition of "Sensorineural or Mixed" because a conductive component cannot be ruled out in these cases.


| | Interjudge Agreement (%) | Judges v. AMCLASS (%) | Judges v. Consensus (%) | AMCLASS v. Consensus (%) |
|---|---|---|---|---|
| Configuration | 67.6 | 75.3 | 83.1 | 89.6 |
| Severity | 82.6 | 85.5 | 88.2 | 92.2 |
| Site of Lesion | 74.4 | 77.2 | 86.1 | 84.8 |
| Asymmetry | 76.9 | 83.2 | 86.2 | 91.0 |