No. | Approach | Features | Number of features | Classifier | Dataset (no. of classes) = no. of files | Recognition accuracy (recall) | Sampling frequency |
---|---|---|---|---|---|---|---|
1 | Croker et al. [11] | Frequency and time | 5 | ED | Frog (5) = 100 | Recall = 85%, accuracy = 89% | 16 kHz |
2 | Dang et al. [12] | Envelope Extraction | Not specified | Matched filtering | Frog (3) = not specified | Accuracy = 90% | < 10 kHz |
3 | Wei et al. [13] | From Gradient Projection for Sparse Reconstruction | Featureless, using a sparse representation | Their own \(\ell_1\)-minimization sparse-approximation-based classifier | Frog (14) = 228; crickets (20) = 663 | Recall \(\approx\) 98% (frog); recall \(\approx\) 50% (crickets) | 24 kHz |
4 | Colonna et al. [10] | Wavelet | 4 | k-NN | Anurans (9) = 49 syllables | 96.25%, 94.16%, 86.96% | 44.1 kHz, 11 kHz, 5.5 kHz |
5 | Algobail et al. [19] | Time | 2 | ED | Animals (7) = 114 | 81.34% | 44.1 kHz |
6 | Our scheme | Wavelet | 2 | MD, ED | Animals (12) = 587 | 85.59%, 86.06% | 8 kHz |