Attribute Agreement Analysis

Humans vary by nature. Because of this, the judgement made by one person will not necessarily match that of another person under identical conditions. Take the case of a judiciary system: what happens if laws are not applied consistently across people? Ultimately it leads to the failure of the system. How, then, can consistency be ensured?

To understand this we need to monitor the system, analyse the variation, and take appropriate action. Why does the variation happen? We need to understand its causes. Variation can happen due to the subjectivity of judgement; here the attribute that needs to be analysed for variation is the 'judgement' itself. Statistical tools such as Minitab can be used to perform this kind of Attribute Agreement Analysis (AAA).

Now there is another type of variation. In the case of measurements taken with Vernier calipers, for example, the effect of subjective judgement is small, but the repeatability and reproducibility of the gauge measurements come into the picture. This kind of variation can be analysed using Gauge R&R techniques in Minitab.

Both AAA and Gauge R&R are commonly applied to analyse variation in a measurement system. They do not address actual process variation, which is what we handle with Statistical Process Control (SPC). Viewed from the top, then, there are two types of variation within a system:

  • Actual Process variation
  • Measurement System Variation

A Practical Scenario where AAA can be applied

In a project we collect base parameters such as size, effort, schedule, and defects. From these we derive parameters such as productivity and defect density. Defect density can also be broken down by defect category: for example, logical defect density, documentation defect density, and so on.

Consider a scenario where the logical defect density found in different modules (modules of the same nature) in a project was being analysed. One module, say "A", had a higher logical defect density than the other modules. The organizational baseline for the parameter is 0.1 to 0.5 defects/KLOC, i.e. the expected process variation lies within 0.1 to 0.5 defects/KLOC; for this one module the value exceeded it. The project manager triggered a root cause analysis and decided to take appropriate action. Each defect was analysed in depth. The analysis found that the issue was with defect type mapping and had nothing to do with the logic of the code: for that particular module a good number of documentation defects had been mapped to logical defects, and hence the logical defect density rose drastically.
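The derived-parameter check in this scenario is simple arithmetic. Below is a minimal sketch of it in Python; the module names and defect/size figures are hypothetical, while the 0.1 to 0.5 defects/KLOC baseline comes from the scenario above.

```python
# Organizational baseline for logical defect density (defects/KLOC),
# as given in the scenario above.
BASELINE_LOW, BASELINE_HIGH = 0.1, 0.5

def logical_defect_density(logical_defects, size_kloc):
    """Derived parameter: logical defects per thousand lines of code."""
    return logical_defects / size_kloc

# Hypothetical modules: (logical defects found, module size in KLOC).
modules = {
    "A": (14, 20.0),   # 0.70 defects/KLOC -> outside the baseline
    "B": (6, 18.0),    # within the baseline
    "C": (5, 25.0),    # within the baseline
}

for name, (defects, kloc) in modules.items():
    dd = logical_defect_density(defects, kloc)
    status = "ok" if BASELINE_LOW <= dd <= BASELINE_HIGH else "OUT OF BASELINE"
    print(f"Module {name}: {dd:.2f} defects/KLOC ({status})")
```

A module flagged "OUT OF BASELINE" here is exactly the trigger for the root cause analysis described above.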

Now suppose the same reviewer had reviewed the other modules too, and done their defect mapping as well. The logical defect density of every module might then be on the higher side due to his or her incorrect defect mapping, and inside the project this would not trigger further analysis, because there is no assignable cause (in SPC terms, no out-of-control points). What happens then? The wrong defect classification flows into the organizational database; from there, improvement initiatives are triggered to reduce logical defects, while the analysis of the actual defects (perhaps documentation defects) stays hidden behind the incorrect mapping. This ultimately affects the process performance objectives of the organization.

To understand this kind of variation in an attribute, an analysis needs to be triggered. An attribute agreement analysis evaluates the consistency of ratings within each appraiser, across appraisers, and against a standard or known value.
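The "versus a standard" part of the analysis can be sketched very simply: compare each appraiser's classifications against the known-correct ones and report the percentage of matches. The appraiser names, defect labels, and sample ratings below are all hypothetical.

```python
# Known-correct classifications for five sample defects (the "standard").
standard = ["logical", "documentation", "logical", "documentation", "logical"]

# Hypothetical ratings by two appraisers of the same five defects.
ratings = {
    "Appraiser 1": ["logical", "documentation", "logical", "documentation", "logical"],
    "Appraiser 2": ["logical", "logical", "logical", "documentation", "logical"],
}

def agreement_vs_standard(rated, standard):
    """Percentage of items on which an appraiser matches the standard."""
    matches = sum(r == s for r, s in zip(rated, standard))
    return 100.0 * matches / len(standard)

for name, rated in ratings.items():
    print(f"{name}: {agreement_vs_standard(rated, standard):.0f}% agreement with standard")
```

Tools like Minitab additionally report within-appraiser agreement (the same appraiser rating the same item twice) and between-appraiser agreement, but the idea is the same counting exercise.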

Attribute agreement analysis produces key statistics that tell us whether the results are due to random chance, or whether our judgement appears to be better (or worse) than random chance.

Kappa and Kendall's statistics are used to evaluate the agreement: the kappa statistic is used for nominal data, whereas Kendall's coefficient is used for ordinal (ordered) data.
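For two appraisers rating nominal categories, the relevant statistic is Cohen's kappa: observed agreement corrected for the agreement expected by chance from each rater's label frequencies. A minimal pure-Python sketch, with hypothetical rating data ("L" for logical, "D" for documentation):

```python
from collections import Counter

def cohens_kappa(rater1, rater2):
    """Cohen's kappa for two raters over nominal labels.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed
    proportion of agreement and p_e the proportion expected by chance
    from the two raters' marginal label frequencies.
    """
    n = len(rater1)
    p_o = sum(a == b for a, b in zip(rater1, rater2)) / n
    c1, c2 = Counter(rater1), Counter(rater2)
    labels = set(rater1) | set(rater2)
    p_e = sum((c1[l] / n) * (c2[l] / n) for l in labels)
    return (p_o - p_e) / (1 - p_e)

# Hypothetical defect classifications by two appraisers.
r1 = ["L", "L", "D", "D", "L", "D"]
r2 = ["L", "L", "D", "L", "L", "D"]
print(round(cohens_kappa(r1, r2), 3))  # → 0.667
```

Here the raters agree on 5 of 6 items (p_o ≈ 0.833), but roughly half of that agreement would be expected by chance (p_e = 0.5), so kappa lands at about 0.67, below the AIAG "good system" threshold discussed below.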

Kappa can range from -1 to +1. A value of +1 shows perfect agreement, 0 shows that the agreement is no better than chance, and a negative kappa means the agreement is worse than chance. AIAG recommends kappa > 0.75 for a good system, and treats kappa < 0.4 as a poor system. In the case of Kendall's coefficient, a value greater than 0.8 is required for the AAA results to be acceptable. If the results are not acceptable, corrective action should be triggered.
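For ordinal data with several appraisers, one of the Kendall statistics reported in AAA is the coefficient of concordance (Kendall's W), which measures how closely the appraisers' rankings agree. A minimal sketch for the untied case, with hypothetical rankings:

```python
def kendalls_w(rankings):
    """Kendall's coefficient of concordance for m raters ranking n items.

    rankings: list of m lists, each a permutation of ranks 1..n (no ties).
    W = 12 * S / (m^2 * (n^3 - n)), where S is the sum of squared
    deviations of the items' rank totals from their mean.
    W = 1 means perfect agreement; W = 0 means no agreement.
    """
    m, n = len(rankings), len(rankings[0])
    totals = [sum(r[i] for r in rankings) for i in range(n)]
    mean_total = m * (n + 1) / 2
    s = sum((t - mean_total) ** 2 for t in totals)
    return 12 * s / (m ** 2 * (n ** 3 - n))

# Hypothetical severity rankings of three defects by two appraisers.
print(kendalls_w([[1, 2, 3], [1, 2, 3]]))  # → 1.0 (perfect agreement)
print(kendalls_w([[1, 2, 3], [3, 2, 1]]))  # → 0.0 (completely opposed)
```

Against the acceptance criterion above, only the first pair of rankings (W = 1.0 > 0.8) would pass; the second would trigger corrective action.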