Table window

Confusion Matrix

 

To assess the accuracy of an image classification, it is common practice to create a confusion matrix. In a confusion matrix, your classification results are compared to additional ground truth information. The strength of a confusion matrix is that it identifies the nature of the classification errors, as well as their quantities.

 

Tip: The output cross table of a Cross operation on two maps that use a class or ID domain can also be shown in matrix form. For more information, refer to Cross : functionality.

Preparation:

The details of these steps are described in How to calculate a confusion matrix.

Dialog box options:

First column:

Select the column with the same name as the ground truth map (or test set).

Second column:

Select the column with the same name as the output map of the Classify operation.

Frequency:

Select the NPix column.

The confusion matrix appears in a secondary window.
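The way the three dialog box options fit together can be sketched outside the program. The following is a minimal illustration in Python, not the program's own implementation: it assumes the cross table has been exported as records of (ground truth class, classified class, NPix). The names cross_rows and build_confusion_matrix are hypothetical, and the figures are the 'forest' ground truth row of the example on this page.

```python
from collections import defaultdict

# Hypothetical cross-table records: (first column, second column, frequency),
# i.e. (ground truth class, classified class, NPix).
cross_rows = [
    ("forest", "forest", 440),
    ("forest", "bush", 40),
    ("forest", "bare", 30),
    ("forest", "water", 10),
    ("forest", "unclass", 10),
]


def build_confusion_matrix(rows):
    """Return {ground_truth_class: {classified_class: pixel_count}}."""
    matrix = defaultdict(lambda: defaultdict(int))
    for truth, classified, npix in rows:
        # Rows of the matrix come from the first column,
        # columns of the matrix from the second column.
        matrix[truth][classified] += npix
    return {truth: dict(counts) for truth, counts in matrix.items()}


matrix = build_confusion_matrix(cross_rows)
print(matrix["forest"])
```

Combinations that do not occur in the cross table simply have no entry here; in the displayed matrix they correspond to cells with value 0.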

Note:

Interpretation of a confusion matrix:

Consider the following example of a confusion matrix:

  

                          CLASSIFICATION RESULTS
GROUND TRUTH   forest   bush   crop   urban   bare   water   unclass    ACC
forest            440     40      0       0     30      10        10   0.83
bush               20    220      0       0     40      10        20   0.71
crop               10     10    210      10     50      10        60   0.58
urban              20      0     20     240    100      10        40   0.56
bare                0      0     10      10    230       0        10   0.88
water               0     20      0       0      0     240        10   0.89
REL              0.90   0.76   0.88    0.92   0.51    0.86

Average accuracy    = 74.25%
Average reliability = 80.38%
Overall accuracy    = 73.15%

In the example above:

Explanation:

Accuracy (also known as producer's accuracy): The figures in column Accuracy (ACC) present the accuracy of your classification, that is, the fraction of correctly classified pixels with regard to all pixels of that ground truth class. For each class of ground truth pixels (row), the number of correctly classified pixels is divided by the total number of ground truth or test pixels of that class. For example, for the 'forest' class, the accuracy is 440/530 = 0.83, meaning that approximately 83% of the 'forest' ground truth pixels also appear as 'forest' pixels in the classified image.
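The accuracy calculation can be reproduced with a few lines of Python. This is a sketch using the figures of the example matrix; the names CLASSES, MATRIX and accuracy are chosen for illustration only.

```python
# Ground truth rows of the example matrix. Each list holds the pixel counts
# per classified class: forest, bush, crop, urban, bare, water, unclass.
CLASSES = ["forest", "bush", "crop", "urban", "bare", "water"]
MATRIX = {
    "forest": [440, 40, 0, 0, 30, 10, 10],
    "bush": [20, 220, 0, 0, 40, 10, 20],
    "crop": [10, 10, 210, 10, 50, 10, 60],
    "urban": [20, 0, 20, 240, 100, 10, 40],
    "bare": [0, 0, 10, 10, 230, 0, 10],
    "water": [0, 20, 0, 0, 0, 240, 10],
}


def accuracy(matrix, classes):
    """Diagonal element divided by the row total, per ground truth class."""
    result = {}
    for i, name in enumerate(classes):
        row = matrix[name]
        # The row total includes pixels that remained unclassified.
        result[name] = row[i] / sum(row)
    return result


acc = accuracy(MATRIX, CLASSES)
for name in CLASSES:
    print(f"{name:7s} {acc[name]:.2f}")
```

Note that the row total includes the 'unclass' column, so ground truth pixels that were left unclassified lower the accuracy of their class.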

Reliability (also known as user's accuracy): The figures in row Reliability (REL) present the reliability of classes in the classified image, that is, the fraction of correctly classified pixels with regard to all pixels classified as this class in the classified image. For each class in the classified image (column), the number of correctly classified pixels is divided by the total number of pixels which were classified as this class. For example, for the 'forest' class, the reliability is 440/490 = 0.90, meaning that approximately 90% of the 'forest' pixels in the classified image actually represent 'forest' on the ground.
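The reliability calculation works the same way, but on column totals instead of row totals. The sketch below again uses the figures of the example matrix; the names CLASSES, MATRIX and reliability are illustrative.

```python
# Ground truth rows of the example matrix. Each list holds the pixel counts
# per classified class: forest, bush, crop, urban, bare, water, unclass.
CLASSES = ["forest", "bush", "crop", "urban", "bare", "water"]
MATRIX = {
    "forest": [440, 40, 0, 0, 30, 10, 10],
    "bush": [20, 220, 0, 0, 40, 10, 20],
    "crop": [10, 10, 210, 10, 50, 10, 60],
    "urban": [20, 0, 20, 240, 100, 10, 40],
    "bare": [0, 0, 10, 10, 230, 0, 10],
    "water": [0, 20, 0, 0, 0, 240, 10],
}


def reliability(matrix, classes):
    """Diagonal element divided by the column total, per classified class."""
    result = {}
    for j, name in enumerate(classes):
        # Total number of test pixels that were classified as this class.
        column_total = sum(matrix[truth][j] for truth in classes)
        result[name] = matrix[name][j] / column_total
    return result


rel = reliability(MATRIX, CLASSES)
for name in CLASSES:
    print(f"{name:7s} {rel[name]:.2f}")
```

The 'unclass' column has no matching ground truth class and therefore no diagonal element, which is why the REL row shows no figure for it.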

The average accuracy is calculated as the sum of the accuracy figures in column Accuracy divided by the number of classes in the test set.

The average reliability is calculated as the sum of the reliability figures in row Reliability divided by the number of classes in the test set.

The overall accuracy is calculated as the total number of correctly classified pixels (diagonal elements) divided by the total number of test pixels.
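The three summary figures can be checked against the example matrix as follows. This is a sketch in Python; the variable names are illustrative.

```python
# Ground truth rows of the example matrix. Each list holds the pixel counts
# per classified class: forest, bush, crop, urban, bare, water, unclass.
CLASSES = ["forest", "bush", "crop", "urban", "bare", "water"]
MATRIX = {
    "forest": [440, 40, 0, 0, 30, 10, 10],
    "bush": [20, 220, 0, 0, 40, 10, 20],
    "crop": [10, 10, 210, 10, 50, 10, 60],
    "urban": [20, 0, 20, 240, 100, 10, 40],
    "bare": [0, 0, 10, 10, 230, 0, 10],
    "water": [0, 20, 0, 0, 0, 240, 10],
}

n = len(CLASSES)
diagonal = [MATRIX[name][i] for i, name in enumerate(CLASSES)]
row_totals = [sum(MATRIX[name]) for name in CLASSES]
column_totals = [sum(MATRIX[name][j] for name in CLASSES) for j in range(n)]

# Mean of the per-class accuracy figures (column ACC).
average_accuracy = sum(d / r for d, r in zip(diagonal, row_totals)) / n
# Mean of the per-class reliability figures (row REL).
average_reliability = sum(d / c for d, c in zip(diagonal, column_totals)) / n
# Correctly classified pixels divided by all test pixels.
overall_accuracy = sum(diagonal) / sum(row_totals)

print(f"Average accuracy    = {average_accuracy:.2%}")
print(f"Average reliability = {average_reliability:.2%}")
print(f"Overall accuracy    = {overall_accuracy:.2%}")
```

Here 1580 of the 2160 test pixels lie on the diagonal. The averages are taken over the unrounded per-class figures; averaging the two-decimal values shown in the table gives slightly different results.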

From the example above, you can conclude that the test set classes 'crop' and 'urban' were difficult to classify: many test set pixels of these classes were excluded from the 'crop' and 'urban' classes in the classified image (accuracy 0.58 and 0.56), so the areas of these classes in the classified image are probably underestimated. On the other hand, class 'bare' in the classified image is not very reliable (reliability 0.51): many test set pixels of other classes were included in the 'bare' class, so the area of the 'bare' class in the classified image is probably overestimated.
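To see where these conclusions come from, the off-diagonal elements can be summed per row and per column. The sketch below does this for the example matrix; the function names omitted and committed are chosen for illustration and are not terms used by the program.

```python
# Ground truth rows of the example matrix. Each list holds the pixel counts
# per classified class: forest, bush, crop, urban, bare, water, unclass.
CLASSES = ["forest", "bush", "crop", "urban", "bare", "water"]
MATRIX = {
    "forest": [440, 40, 0, 0, 30, 10, 10],
    "bush": [20, 220, 0, 0, 40, 10, 20],
    "crop": [10, 10, 210, 10, 50, 10, 60],
    "urban": [20, 0, 20, 240, 100, 10, 40],
    "bare": [0, 0, 10, 10, 230, 0, 10],
    "water": [0, 20, 0, 0, 0, 240, 10],
}


def omitted(matrix, classes, name):
    """Test pixels of class `name` that did not end up in class `name`."""
    i = classes.index(name)
    return sum(matrix[name]) - matrix[name][i]


def committed(matrix, classes, name):
    """Test pixels of other classes that were classified as `name`."""
    j = classes.index(name)
    return sum(matrix[other][j] for other in classes if other != name)


for name in ("crop", "urban"):
    print(name, "pixels excluded from their class:", omitted(MATRIX, CLASSES, name))
print("pixels of other classes included in 'bare':", committed(MATRIX, CLASSES, "bare"))
```

In the example, 150 of the 360 'crop' test pixels and 190 of the 430 'urban' test pixels were excluded from their own class, while 220 of the 450 pixels classified as 'bare' belong to other ground truth classes, 100 of them to 'urban'.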

Note:

The results of your confusion matrix depend strongly on the selection of ground truth / test set pixels. You may find yourself in a chicken-and-egg situation with your sample set, the classification result and your test set. On the one hand, you want as many correct sample set pixels as possible so that the classification will be good; on the other hand, you also need an ample number of correct ground truth pixels for the test set to be able to assess the accuracy and reliability of your classification. Using the same data for both the sample set and the test set will produce far too optimistic figures in the confusion matrix. A sound approach is to keep the two sets separate, for example by using half of your ground truth data for the sample set and the other half for the test set.
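One possible way to divide ground truth data into two independent halves is a random split per class. The sketch below is an assumption about how this could be done outside the program, not a description of a built-in operation; the function split_ground_truth and the pixel identifiers are hypothetical.

```python
import random


def split_ground_truth(pixels_by_class, seed=0):
    """Randomly split the ground truth pixels of each class into two halves.

    Returns (sample_set, test_set); no pixel appears in both sets.
    """
    rng = random.Random(seed)  # fixed seed so the split can be repeated
    sample_set, test_set = {}, {}
    for name, pixels in pixels_by_class.items():
        shuffled = list(pixels)
        rng.shuffle(shuffled)
        half = len(shuffled) // 2
        sample_set[name] = shuffled[:half]
        test_set[name] = shuffled[half:]
    return sample_set, test_set


# Hypothetical ground truth: pixel identifiers grouped by class.
ground_truth = {
    "forest": list(range(0, 100)),
    "water": list(range(100, 160)),
}

sample_set, test_set = split_ground_truth(ground_truth)
for name in ground_truth:
    print(name, len(sample_set[name]), len(test_set[name]))
```

Splitting per class keeps every class represented in both sets. Neighbouring pixels tend to resemble each other, so when ground truth was collected as areas it may be preferable to split by area rather than by individual pixel.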
