How to calculate a confusion matrix

To assess the accuracy of an image classification, it is common practice to create a confusion matrix. In a confusion matrix, your classification results are compared to additional ground truth information. The strength of a confusion matrix is that it identifies the nature of classification errors, as well as their quantities.
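As a minimal sketch of the idea, the following Python counts how often each ground truth class was assigned to each classified class. The class names and pixel lists are hypothetical illustration data, and the convention used here (rows for ground truth, columns for classification results) is an assumption of this sketch.

```python
from collections import Counter

# Hypothetical example data: one entry per test pixel.
ground_truth = ["forest", "forest", "forest", "water", "water", "grass", "grass", "grass"]
classified = ["forest", "forest", "grass", "water", "water", "grass", "forest", "grass"]


def confusion_matrix(truth, predicted):
    """Return (classes, matrix), where matrix[i][j] counts the pixels whose
    ground truth class is classes[i] and whose classified class is classes[j]."""
    classes = sorted(set(truth) | set(predicted))
    counts = Counter(zip(truth, predicted))
    matrix = [[counts[(t, p)] for p in classes] for t in classes]
    return classes, matrix


classes, matrix = confusion_matrix(ground_truth, classified)
for name, row in zip(classes, matrix):
    print(f"{name:8s}", row)
```

The diagonal holds the correctly classified pixels. Every off-diagonal cell shows both the kind of error (which class was mistaken for which) and how many pixels are affected.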

To obtain a confusion matrix:

  1. Have the output raster map of your image classification available; check the Properties of the classified image to know which domain and georeference are used by the classified image.
  2. Create a raster map that contains additional ground truth information; such a map is also known as the test set. There are several ways to create a test set; these are briefly described here, and the end of this topic gives more details on creating a test set.
  3. Make sure that the ground truth/test set raster map does not contain the same pixels as the sample set raster map from the training phase. Your accuracy assessment will show overly optimistic figures when the pixels in the sample set (on which the classification is based) are also used in the test set (with which the classification results are checked).

  4. Perform a Cross operation with your ground truth map and the classified image to obtain a cross table.
  5. To start the Cross operation:

    In the Cross dialog box:

  6. In the table window displaying the cross table, open the View menu and choose Confusion matrix.
  7. In the Confusion matrix dialog box:

When you click OK, the confusion matrix is displayed in a matrix window.
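The steps above can be sketched outside the program as follows. This is an illustration of the principle only, not the program's own implementation: the small rasters, the `UNDEF` marker and the function names `cross` and `to_matrix` are all hypothetical, and the sketch assumes that pixels without ground truth are simply skipped.

```python
from collections import Counter

UNDEF = None  # stand-in for pixels without a class (assumption of this sketch)

# Hypothetical 3x4 rasters with the same number of rows and columns.
test_set = [
    ["forest", "forest", UNDEF, "water"],
    [UNDEF, "grass", "grass", "water"],
    ["forest", UNDEF, "grass", UNDEF],
]
classified = [
    ["forest", "grass", "grass", "water"],
    ["forest", "grass", "grass", "water"],
    ["forest", "water", "forest", "water"],
]


def cross(map_a, map_b):
    """Count every combination of classes that occurs at the same pixel position."""
    if len(map_a) != len(map_b) or any(
        len(row_a) != len(row_b) for row_a, row_b in zip(map_a, map_b)
    ):
        raise ValueError("maps must have the same number of rows and columns")
    counts = Counter()
    for row_a, row_b in zip(map_a, map_b):
        for a, b in zip(row_a, row_b):
            if a is not UNDEF and b is not UNDEF:
                counts[(a, b)] += 1
    return counts


def to_matrix(cross_table):
    """Pivot the cross table: rows are test set classes, columns are classified classes."""
    classes = sorted({name for pair in cross_table for name in pair})
    return classes, [[cross_table[(t, p)] for p in classes] for t in classes]


cross_table = cross(test_set, classified)
classes, matrix = to_matrix(cross_table)
for (truth, result), npix in sorted(cross_table.items()):
    print(truth, result, npix)
```

The cross table lists one record per occurring class combination with its pixel count; the confusion matrix is the same information rearranged into a square table.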

For more information on the interpretation of a confusion matrix, refer to Table window : Confusion matrix.
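As a rough aid to reading the matrix, the sketch below derives three figures that are commonly reported with a confusion matrix. The names used here are assumptions of this sketch: "accuracy" is the fraction of ground truth pixels of a class that were classified correctly, "reliability" is the fraction of pixels assigned to a class that really belong to it, and "overall accuracy" is the fraction of all test pixels on the diagonal. The matrix values are hypothetical.

```python
# Hypothetical confusion matrix: rows are ground truth, columns are classified.
classes = ["forest", "grass", "water"]
matrix = [
    [2, 1, 0],
    [1, 2, 0],
    [0, 0, 2],
]


def summarize(matrix):
    """Return per-class accuracy, per-class reliability and overall accuracy."""
    n = len(matrix)
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    correct = [matrix[i][i] for i in range(n)]
    total = sum(row_totals)
    if total == 0:
        raise ValueError("confusion matrix contains no pixels")
    # A class without pixels in a row or column gets None instead of a ratio.
    accuracy = [c / t if t else None for c, t in zip(correct, row_totals)]
    reliability = [c / t if t else None for c, t in zip(correct, col_totals)]
    overall = sum(correct) / total
    return accuracy, reliability, overall


accuracy, reliability, overall = summarize(matrix)
for name, acc, rel in zip(classes, accuracy, reliability):
    print(f"{name:8s} accuracy={acc:.2f} reliability={rel:.2f}")
print(f"overall accuracy={overall:.2f}")
```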

Additional information on creating a test set:

  1. Creating a test set yourself with the pixel editor using a background map:
  2. Using an existing raster map or polygon map of the area:
    When you have a recent and reliable polygon or raster map of the area, you can directly use this map as the ground truth map.
  3. When you have collected ground truth data in the field, a methodologically sound approach is to use half of these data for the sample set and the other half for the test set.
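A half-and-half division of field data might be sketched as follows. The observations are hypothetical, and shuffling with a fixed seed before cutting the list in two is an assumption of this sketch (the text only asks for two halves). The final check confirms that no pixel position ends up in both sets.

```python
import random

# Hypothetical field observations: (row, column, class name).
field_samples = [
    (12, 40, "forest"), (13, 41, "forest"), (55, 10, "water"),
    (56, 11, "water"), (80, 72, "grass"), (81, 70, "grass"),
    (30, 33, "forest"), (90, 15, "grass"),
]


def split_in_half(samples, seed=0):
    """Shuffle a copy of the samples reproducibly and cut it into two halves."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


sample_set, test_set = split_in_half(field_samples)

# The test set must not reuse pixels from the sample set.
sample_pixels = {(row, col) for row, col, _ in sample_set}
test_pixels = {(row, col) for row, col, _ in test_set}
if not sample_pixels.isdisjoint(test_pixels):
    raise ValueError("sample set and test set share pixels")

print(len(sample_set), "training observations,", len(test_set), "test observations")
```

With few observations per class, a purely random split can leave a class under-represented in one half; dividing each class separately is one possible refinement.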
