8 Classification accuracy

To illustrate classification accuracy, we compare two classifications of migratory flapping flight. The first uses the built-in classify_flap function; the second uses pressure change. The two are then compared using compare_classifications and compare_confusion_matrix.

8.1 Classify migratory flapping flight

We first perform a classification. Here we use the example of a hoopoe, as its migratory flight can be predicted using classify_flap.

This creates a timetable of migratory flight events which can be visualised using classification$timetable, as seen below:

Table 8.1: Migration timetable (first 10 rows)
   start                end                  Duration (h)
3  2016-08-06 20:20:00  2016-08-07 01:50:00   5.500000
4  2016-08-07 19:40:00  2016-08-08 09:15:00  13.583333
5  2016-08-08 19:30:00  2016-08-09 04:10:00   8.666667
6  2016-08-09 21:15:00  2016-08-10 01:30:00   4.250000
7  2016-08-10 22:30:00  2016-08-10 23:50:00   1.333333
8  2016-08-21 18:45:00  2016-08-22 04:15:00   9.500000

This classification is fairly accurate, so we use it as the reference dataset against which to compare a second classification based on large pressure changes, i.e. large changes in altitude.

8.2 Set up the reference dataset

The second classification is based on pressure (30-minute data resolution), whereas the reference classification is based on activity (5-minute resolution). The activity classification is therefore brought to the same resolution as the pressure data using the create_merged_classification function.
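
The idea of carrying a timetable of events over to coarser timestamps can be sketched in base R. This is only an illustration under stated assumptions, not the code behind create_merged_classification: the helper label_from_timetable is hypothetical, the 30-minute timestamps are invented, the time zone is assumed to be UTC, and the two events are copied from Table 8.1.

```r
# Hypothetical helper (not part of the package): label each timestamp as
# "Migration" if it falls inside any event of the timetable, else "Other".
label_from_timetable <- function(times, start, end) {
  in_event <- vapply(
    seq_along(times),
    function(i) any(times[i] >= start & times[i] <= end),
    logical(1)
  )
  ifelse(in_event, "Migration", "Other")
}

# Two events taken from Table 8.1 (time zone assumed to be UTC)
start <- as.POSIXct(c("2016-08-06 20:20:00", "2016-08-07 19:40:00"), tz = "UTC")
end   <- as.POSIXct(c("2016-08-07 01:50:00", "2016-08-08 09:15:00"), tz = "UTC")

# Assumed 30-minute pressure timestamps covering the first event
pressure_times <- seq(
  as.POSIXct("2016-08-06 19:00:00", tz = "UTC"),
  as.POSIXct("2016-08-07 03:00:00", tz = "UTC"),
  by = "30 min"
)

reference <- label_from_timetable(pressure_times, start, end)
table(reference)
```

Labelling by interval membership keeps the reference tied to the event boundaries rather than to any particular 5-minute reading, which is one reasonable way to compare data recorded at different resolutions.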

8.3 Set up the prediction data

Hoopoes seem to perform large altitudinal changes during migratory flight, so we perform a very rough classification by specifying that any pressure change greater than 2 hPa corresponds to a migratory flight (this is for illustrative purposes only, and should not be used as a definitive classification method).
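
A minimal sketch of such a threshold rule in base R is given here. The pressure values are made up, and taking the change between consecutive readings is an assumption about how "change" is measured; only the 2 hPa threshold comes from the text.

```r
# Assumed 30-minute pressure readings in hPa (made-up values for illustration)
pressure <- c(1010.2, 1010.0, 1009.8, 1004.1, 998.7,
              995.3, 995.1, 1001.9, 1008.6, 1008.5)

# Absolute change relative to the previous reading;
# the first reading has no predecessor, so its change is NA
pressure_change <- c(NA, abs(diff(pressure)))

# Rough rule from the text: a change greater than 2 hPa is labelled migration
threshold <- 2
prediction <- ifelse(!is.na(pressure_change) & pressure_change > threshold,
                     "Migration", "Other")

data.frame(pressure, pressure_change, prediction)
```

Note that a bird flying level produces almost no pressure change between readings and is labelled "Other" by this rule, which is the kind of omission discussed at the end of this chapter.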

8.4 Compare the two classifications

We can then compare the two classifications point by point using the compare_classifications function.

This puts both classifications side by side and shows, for each point, how many of the classifications assigned each class, as well as whether the two agree, as can be seen below.

Table 6.2: Comparison of both classifications (first 10 rows)
reference prediction Migration Other agreement
Other Other 0 2 TRUE
Other Other 0 2 TRUE
Other Other 0 2 TRUE
Other Other 0 2 TRUE
Other Other 0 2 TRUE
Other Other 0 2 TRUE
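
The structure of the table above can be reproduced with a few lines of base R. This is a sketch of the general idea rather than the implementation of compare_classifications, and the six labels are invented for illustration.

```r
# Made-up labels for six timestamps (illustration only)
reference  <- c("Other", "Other", "Migration", "Migration", "Other", "Migration")
prediction <- c("Other", "Migration", "Migration", "Other", "Other", "Migration")

comparison <- data.frame(reference, prediction, stringsAsFactors = FALSE)

# Number of classifications that assigned each class at each timestamp
labels <- comparison[, c("reference", "prediction")]
comparison$Migration <- rowSums(labels == "Migration")
comparison$Other     <- rowSums(labels == "Other")

# The two classifications agree when they give the same label
comparison$agreement <- comparison$reference == comparison$prediction

comparison
```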

8.5 Create a confusion matrix

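A confusion matrix compares predicted and reference points and is used to estimate the following quantities for each class (TP, FP, TN and FN are the numbers of true positives, false positives, true negatives and false negatives):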

  • Errors of Commission measure false positives, i.e. points that were predicted to be part of a class to which they do not belong (probability that a prediction of the class is incorrect, FP/(TP+FP)).
  • Errors of Omission measure false negatives, i.e. points of a class that were predicted to be in a different class (probability that a point of the class was missed, FN/(TP+FN)).
  • Producer Accuracy or Recall represents the probability that a point of a class was not missed by the classification, TP/(TP+FN).
  • User Accuracy or Precision represents the probability that a point predicted to be in a class really belongs to it, TP/(TP+FP).
  • Overall Accuracy represents the proportion of all points that were correctly predicted, (TP+TN)/(TP+TN+FP+FN).
  • Kappa Coefficient measures the agreement between the classification and the truth beyond what is expected by chance. The agreement expected by chance is ((TN+FP)(TN+FN) + (FN+TP)(FP+TP)) / (TP+FP+TN+FN)², and Cohen's kappa is (overall accuracy - chance agreement) / (1 - chance agreement).

Table 8.2: Confusion Matrix
                    Ref Other  Ref Migration  Row_Total  Commission_Error  Users_accuracy  Total_accuracy  Kappa_Coeff
Pred Other          29907      104            30011      0.0034654         0.9965346       NA              NA
Pred Migration      80         726            806        0.0992556         0.9007444       NA              NA
Col_Total           29987      830            30817      NA                NA              NA              NA
Omission_Error      0.0026678  0.1253012      NA         NA                NA              NA              NA
Producers_accuracy  0.9973322  0.8746988      NA         NA                NA              NA              NA
Total_accuracy      NA         NA             NA         NA                NA              0.9940293       NA
Kappa_Coeff         NA         NA             NA         NA                NA              NA              0.9483213
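
The entries of the table above can be recomputed by hand from its four counts. The sketch below does this in base R, treating "Migration" as the positive class; it is a check on the arithmetic, not the code of compare_confusion_matrix, and the variable names are chosen here for illustration.

```r
# Counts copied from the confusion matrix, with "Migration" as the positive class
TP <- 726    # predicted Migration, reference Migration
FP <- 80     # predicted Migration, reference Other
FN <- 104    # predicted Other, reference Migration
TN <- 29907  # predicted Other, reference Other
n  <- TP + FP + FN + TN

users_accuracy     <- TP / (TP + FP)   # precision
producers_accuracy <- TP / (TP + FN)   # recall
commission_error   <- FP / (TP + FP)   # 1 - user's accuracy
omission_error     <- FN / (TP + FN)   # 1 - producer's accuracy
overall_accuracy   <- (TP + TN) / n

# Agreement expected by chance, from the row and column totals
expected_agreement <- ((TN + FP) * (TN + FN) + (FN + TP) * (FP + TP)) / n^2

# Cohen's kappa: agreement beyond chance
kappa <- (overall_accuracy - expected_agreement) / (1 - expected_agreement)

round(c(users = users_accuracy, producers = producers_accuracy,
        commission = commission_error, omission = omission_error,
        overall = overall_accuracy, expected = expected_agreement,
        kappa = kappa), 4)
```

With these counts the chance-agreement term comes out at about 0.948, which matches the Kappa_Coeff entry in the table, whereas Cohen's kappa as conventionally defined is about 0.88. This suggests that the tabulated value is the chance-agreement term rather than kappa itself, so it is worth checking which quantity is needed before reporting it.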

8.6 Overall accuracy

The total accuracy is 99.4%, which is high. Most of the error comes from the omission of some migration periods by the prediction, i.e. there are periods where the bird is performing a migratory flight but remains at the same altitude, and these are therefore missed by the classification. This affects 104 of the 830 reference migration points (an omission error of 12.5%), or about 0.3% of all points.