www.scielo.br/aabc Automated bioacoustic identification of species

Research into the automated identification of animals by bioacoustics is becoming more widespread mainly due to difficulties in carrying out manual surveys. This paper describes automated recognition of insects (Orthoptera) using time domain signal coding and artificial neural networks. Results of field recordings made in the UK in 2002 are presented which show that it is possible to accurately recognize 4 British Orthoptera species in natural conditions under high levels of interference. Work is under way to increase the number of species recognized.


INTRODUCTION
Recognition of insect, animal and bird species from their calls has been employed for many years for identifying individuals and locating animals. However, such "manual" surveys are slow, time consuming and rely heavily on the surveyor's expert knowledge of the group under investigation. Surveys also generally take place at infrequent intervals, primarily because of the time required, which makes long-term trends difficult to interpret. Rapid advances in computing and electronics are leading to the development of automated recognition systems capable of providing long-term continuous unattended monitoring in inhospitable regions. These systems can also be designed for hand-held use, and applications range from rapid biodiversity assessment, especially in acoustically rich habitats (Riede 1993), to electronic identification guides, acoustic autecology and the detection and recognition of pest species. Research into automated bioacoustic species identification is more mature in some fields than others. Table I gives some examples of bioacoustic research.
E-mail: edc1@ohm.york.ac.uk
This paper describes the development of a novel bioacoustic signal recognition system (IBIS, Intelligent Bioacoustic signal Identification System) and its application to the recognition of British Orthoptera. The technique employed is a purely time domain method known as Time Domain Signal Coding (TDSC) which, when coupled with an artificial neural network (ANN) classifier, provides a powerful vehicle for bioacoustic signal analysis and recognition. It has been successfully tested on 25 species of British Orthoptera with 99% recognition accuracy (Chesmore et al. 1997, Chesmore 2000, 2001, Chesmore and Nellenbach 2001) and 10 species of Japanese bird with 100% accuracy (Chesmore 1999, 2001). However, these results were for signals with a high signal to noise ratio (SNR). This paper describes results of field trials where the SNR is more variable and sounds are corrupted by interference from other natural and man-made noise sources. Each recording was "manually" examined for echemes (first order assemblages of syllables) and songs of varying quality; these were extracted to separate files for training and testing purposes.

Signal Analysis and Recognition
The basic principle of TDSC is to characterize the "shape" of the waveform between successive zero-crossings of the signal (termed an "epoch"). Full details of the algorithm can be found in Chesmore (2001). The output of the coding process is a stream of codewords (1 per epoch) describing changes in the shape of the waveform over time. Further processing is carried out in 2 ways: accumulation of the frequency of occurrence of each codeword (the S-matrix), and accumulation of the frequency of occurrence of pairs of codewords (the A-matrix). The A-matrix is employed in this application.
Recognition of sounds via A-matrices is carried out using an artificial neural network (ANN) which takes the A-matrix as input and has an output for each species (or sound) to be recognized. The system operates in 2 phases: a training phase and an operational phase. In the training phase, high quality examples of the sounds that are to be identified (known as exemplars) are used to train the ANN so that the correct ANN output is activated. Training occurs by repeated presentation of the sounds and modification of the weights within the network in such a way as to reduce the overall error between the current outputs and the desired outputs. Training continues until the overall error is below a given threshold. Once trained, the system is ready to use and unknown sounds can be classified. Each of the outputs gives a value between 0.0 (zero match) and 1.0 (perfect match); the unknown sound is recognized as the class whose output has the highest value. The type of ANN used in this application is a standard multilayer perceptron (MLP) with backpropagation training.
Upon listening to the recordings, it was discovered that many other sounds were present, mainly man-made, and it was decided to include these sounds for recognition.
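The coding stage described above can be illustrated with a minimal sketch. The actual TDSC codebook is given in Chesmore (2001) and is not reproduced here; the `codeword` function below is a hypothetical toy codebook that combines a clipped epoch duration with a clipped count of local minima, and all function names and parameters are assumptions made for illustration only.

```python
from collections import Counter

def split_epochs(samples):
    """Split a waveform into epochs: runs of samples between zero-crossings.
    Zero-valued samples are treated as positive (an arbitrary choice)."""
    epochs, current, prev_sign = [], [], None
    for x in samples:
        sign = x >= 0
        if prev_sign is not None and sign != prev_sign:
            epochs.append(current)  # sign change closes the current epoch
            current = []
        current.append(x)
        prev_sign = sign
    if current:
        epochs.append(current)  # trailing (possibly partial) epoch is kept
    return epochs

def count_minima(epoch):
    """Count interior local minima of |x|: a crude shape descriptor."""
    mags = [abs(x) for x in epoch]
    return sum(1 for i in range(1, len(mags) - 1)
               if mags[i] < mags[i - 1] and mags[i] < mags[i + 1])

def codeword(epoch, max_duration=4, max_shape=1):
    """Hypothetical codebook: one integer per (duration, shape) pair."""
    d = min(len(epoch), max_duration)         # 1 .. max_duration
    s = min(count_minima(epoch), max_shape)   # 0 .. max_shape
    return (d - 1) * (max_shape + 1) + s

def encode(samples):
    """Stream of codewords, one per epoch."""
    return [codeword(e) for e in split_epochs(samples)]

def s_matrix(codes, n_codes):
    """Frequency of occurrence of each codeword."""
    counts = Counter(codes)
    return [counts.get(c, 0) for c in range(n_codes)]

def a_matrix(codes, n_codes, lag=1):
    """Frequency of occurrence of codeword pairs separated by `lag` epochs."""
    m = [[0] * n_codes for _ in range(n_codes)]
    for a, b in zip(codes, codes[lag:]):
        m[a][b] += 1
    return m

# Toy waveform: five epochs with durations 3, 3, 1, 2 and 4 samples.
samples = [1, 2, 1, -1, -3, -1, 2, -2, -1, 3, 1, 3, 2]
codes = encode(samples)
S = s_matrix(codes, n_codes=8)
A = a_matrix(codes, n_codes=8)
```

In this sketch the A-matrix would be flattened into a vector before being presented to a classifier; the choice of a lag of 1 epoch between paired codewords is also an assumption.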
Representative sounds for each category (insect, animal, man-made) were selected, stored as separate .wav files and used to train the ANN. The following 13 sound sources were used in training:
• 4 grasshopper species;
• 1 blow fly sound (wing beats of unknown species);
• 4 bird sounds (3 different alarm calls of undetermined origin and Chiffchaff Phylloscopus collybita);
• 2 vehicle (car) sounds (metaled road and dirt road);
• 1 single engine light aircraft sound;
• 1 background sound (sound when no other sources present; includes wind noise).
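The training phase of an MLP with backpropagation (repeated presentation of exemplars, weight updates that reduce the output error, and stopping once the error falls below a threshold) can be sketched as follows. This is a generic illustration, not the network used in the study: the layer sizes, learning rate, error threshold and the three toy input vectors (standing in for flattened A-matrices) are all assumptions.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

class TinyMLP:
    """One-hidden-layer perceptron with sigmoid units (illustrative only)."""

    def __init__(self, n_in, n_hidden, n_out, seed=0):
        rng = random.Random(seed)
        # One row of weights per unit; the last entry of each row is the bias.
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
                   for _ in range(n_hidden)]
        self.w2 = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
                   for _ in range(n_out)]

    def forward(self, x):
        h = [sigmoid(sum(w * v for w, v in zip(row, x + [1.0])))
             for row in self.w1]
        y = [sigmoid(sum(w * v for w, v in zip(row, h + [1.0])))
             for row in self.w2]
        return h, y

    def train_step(self, x, target, lr):
        """One backpropagation update; returns the squared error before it."""
        h, y = self.forward(x)
        # Output deltas for squared error with sigmoid outputs.
        d_out = [(yk - tk) * yk * (1.0 - yk) for yk, tk in zip(y, target)]
        # Hidden deltas, computed with the output weights before they change.
        d_hid = [hj * (1.0 - hj) *
                 sum(d_out[k] * self.w2[k][j] for k in range(len(y)))
                 for j, hj in enumerate(h)]
        for k, row in enumerate(self.w2):
            for j, v in enumerate(h + [1.0]):
                row[j] -= lr * d_out[k] * v
        for j, row in enumerate(self.w1):
            for i, v in enumerate(x + [1.0]):
                row[i] -= lr * d_hid[j] * v
        return sum((yk - tk) ** 2 for yk, tk in zip(y, target))

def train(net, inputs, targets, lr=0.5, error_threshold=0.01, max_epochs=20000):
    """Present all exemplars repeatedly until the mean error is small enough."""
    for epoch in range(max_epochs):
        total = sum(net.train_step(x, t, lr) for x, t in zip(inputs, targets))
        if total / len(inputs) < error_threshold:
            return epoch + 1
    return max_epochs

# Toy exemplars: three classes, each with one input vector and a one-hot target.
inputs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
targets = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
net = TinyMLP(n_in=3, n_hidden=4, n_out=3)
epochs_used = train(net, inputs, targets)
```

In the setting described here the network would have 13 outputs, one per sound source, and the inputs would be A-matrices computed from the exemplar .wav files.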

RESULTS
Testing of the recognition system was carried out in 3 ways: recognition of single echemes, recognition of whole songs and recognition of sounds in 2 s intervals. The latter approach does not rely on a priori knowledge of the signals (e.g. start of echeme or song) but simply allocates a sound to each 2 s interval; this leads to the possibility of generating continuous sound maps.
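The interval-based approach can be sketched as a fixed-length segmentation followed by classification of each segment. The helper names, the non-overlapping windows, the dropping of a trailing partial window and the toy `classify` function are assumptions for illustration; in practice `classify` would wrap the TDSC coding and the trained ANN.

```python
def split_intervals(samples, sample_rate, seconds=2.0):
    """Cut a recording into consecutive, non-overlapping fixed-length windows.
    A trailing window shorter than `seconds` is dropped."""
    size = int(round(sample_rate * seconds))
    return [samples[i:i + size]
            for i in range(0, len(samples) - size + 1, size)]

def sound_map(samples, sample_rate, classify, seconds=2.0):
    """Label each window; returns (start_time_in_seconds, label) pairs."""
    windows = split_intervals(samples, sample_rate, seconds)
    return [(i * seconds, classify(w)) for i, w in enumerate(windows)]

# Toy example: 5 s of "audio" at 10 samples per second.
def toy_classify(window):
    # Hypothetical stand-in for the real recognizer.
    return "sound" if max(abs(x) for x in window) > 0.5 else "background"

recording = [0.0] * 20 + [0.9, -0.8] * 10 + [0.0] * 10
labels = sound_map(recording, sample_rate=10, classify=toy_classify)
```

At the 44.1 kHz sampling rate used for the recordings, each 2 s window would contain 88,200 samples.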

Recognition of Single Echemes
Echeme duration for the 4 species under consideration is approximately 2 s. Echemes were manually extracted from the recordings and stored as separate .wav files. Table III gives results for the 4 species, which were recognized from among the 13 sounds. Recognition results whose output value falls below a threshold are discarded in order to remove low-confidence results. It is evident that recognition accuracy for a threshold of 0.9 is between 81.8% (C. parallelus) and 100% (O. viridulus), whereas with no threshold these values drop to 64.3% and 97% respectively; for M. maculatus the drop is from 90% to 41.2%.
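The effect of the threshold can be illustrated with a short sketch. It assumes that a result is rejected when the highest ANN output is below the threshold and that accuracy is then computed over the accepted results only; the exact scoring used for Table III is not stated here, and the labels and output values below are invented toy data.

```python
def recognize(outputs, labels, threshold=0.0):
    """Return the label of the highest output, or None if it is below threshold."""
    best = max(range(len(outputs)), key=outputs.__getitem__)
    if outputs[best] < threshold:
        return None  # rejected: no output is confident enough
    return labels[best]

def accuracy(results, truths):
    """Percentage of accepted (non-None) results that match the true label."""
    kept = [(r, t) for r, t in zip(results, truths) if r is not None]
    if not kept:
        return None
    return 100.0 * sum(r == t for r, t in kept) / len(kept)

# Toy data: two hypothetical classes and three test sounds.
labels = ["species_a", "species_b"]
truths = ["species_a", "species_b", "species_b"]
ann_outputs = [[0.95, 0.10], [0.60, 0.55], [0.30, 0.92]]

no_threshold = [recognize(o, labels) for o in ann_outputs]
with_threshold = [recognize(o, labels, threshold=0.9) for o in ann_outputs]
```

In the toy data the second sound is misclassified with a low output value, so applying the threshold rejects it and raises the accuracy over the remaining results, at the cost of leaving that sound unlabelled.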

Recognition of Whole Songs
Figure 1 shows results of whole song recognition with varying threshold. Recognition for a threshold of 0.9 is between 80% and 100% for the 4 species.
O. viridulus has a song that can last for more than 30 s, so a 10 s segment was selected.

Continuous Sound Recognition
It is possible to simply recognize sound on a short time scale without any a priori knowledge of the signals, thus reducing computational overheads in […] that the grasshopper (O. viridulus) has been recognized correctly, as have the light aircraft and a bird alarm call. This approach has considerable potential for general sound mapping applications where both sound pressure level and sound type could be monitored.

CONCLUSIONS
This paper has shown that it is possible to accurately and reliably recognize sounds in a noisy field environment. One important aspect of this research is that the techniques employed are suitable for implementation on hand-held or stand-alone field deployable devices, leading to the potential for long-term continuous monitoring. Much work has still to be carried out, in particular on better wave shape descriptors and on the separation of multiple simultaneous calls. TDSC is not limited to insect sounds, and a real-time hand-held recognition system is being developed for British bats.

ACKNOWLEDGMENTS
The author would like to acknowledge English Nature, the Forestry Commission, East Riding County Council of Yorkshire and the Yorkshire Wildlife Trust for granting access to some of the sites.

Fig. 1 - Graph of whole song recognition with varying threshold. The y-axis is accuracy in percent and the x-axis is threshold level (0.5-0.9).

TABLE I Examples of automated bioacoustic species identification.
[…] on a Sony MZ-R90 portable minidisc recorder with a Sony ECM-MS907 condenser microphone and transferred to a PC (Dell Inspiron 8100) via a standard sound card. The sounds were sampled at 44.1 kHz and stored as 16-bit signed mono .wav format files using the Avisoft-SASLab Pro software package. Table II lists the 6 species encountered during the recording sessions; only 4 are used in the acoustic study.

TABLE II Orthoptera species recorded in Yorkshire in 2002.
* Species not used in this acoustic study. ** Does not produce any acoustic signal.