
Classification of Flying Insects with high performance using improved DTW algorithm based on hidden Markov model

ABSTRACT

Insects play a significant role in human life. They pollinate major food crops consumed around the world, while insect pests consume and destroy major crops. To control diseases and pests, research in entomology is ongoing using chemical, biological and mechanical approaches. Data on flying insects often change over time, and the classification of such data is a central issue; time series mining tasks that involve classification are therefore increasingly critical. Most time series data mining algorithms rely on similarity search, so the time taken for similarity search is the bottleneck, and the search often yields inaccurate results and poor performance. In this paper, a novel classification method based on the dynamic time warping (DTW) algorithm is proposed. The DTW algorithm is deterministic and is poorly suited to modeling stochastic signals. It is improved here by implementing nonlinear median filtering (NMF). The recognition accuracy of conventional DTW algorithms is lower than that of the hidden Markov model (HMM) when both use the same voice activity detection (VAD) and noise reduction, namely running spectrum filtering (RSF) and dynamic range adjustment (DRA). NMF seeks the median distance for every reference time series, and the recognition accuracy is much improved. In this research work, optical sensors are used to record the sound of insect flight, with invariance to interference from ambient sounds. The implementation of our tool has two parts: an optical sensor to record the "sound" of insect flight, and software that leverages the sensor information to automatically detect and identify flying insects.

Key words:
classification; dynamic time warping; nonlinear median filtering; optical sensors; running spectrum filtering; voice activity detection

INTRODUCTION

Entomologists have long tried to eliminate unwanted insects. Some blanket methods are available and have proven successful, but they are costly and create environmental problems. Automated tools are needed for detecting and monitoring insect species that threaten biological resources in both productive and native ecosystems, particularly for pest management and biosecurity. Additionally, sensors could be deployed to obtain detailed spatio-temporal information about insect density, related to environmental conditions, for precision agriculture. Smart sensors are being developed with the aim of protecting the ecosystem by counting and classifying insects, so that the substance used to eradicate harmful insects can be applied only at the target location.

Many novel and relevant applications have emerged with the development of data mining tools and techniques. As a result, there is now a class of intelligent sensors capable of collecting information about the environment and making decisions based on the input data. One such sensor uses a laser and machine learning techniques to classify species of insects. This sensor is considered an important step in the development of intelligent traps. Such traps can attract and selectively capture insect species of interest, and release all other species back into the environment. The unwanted insects are then tracked and killed by the traps.

On the other hand, insects play a vital role in maintaining the ecological balance. For instance, insects are food sources for other animal species, they assist plant reproduction and improve agricultural production through pollination and seed dispersal, and they are responsible for producing many substances useful to humans, such as honey, wax and silk. The existing trap releases insect species that are not pests or disease vectors, limiting the impact of its presence on the environment.

The scientific challenges in the area of ecology are as interesting as the potential applications of this technology. There is a need for intelligent sensors capable of processing large amounts of data, which arrive as a continuous stream of information. Earlier sensors have limited memory, so it is not feasible to store the entire data stream for later processing, and entomologists consequently need smart sensors. Sensors must therefore process the data stream in real time, identify events of interest, and discard data consisting of other events and background noise. For a sensor coupled to an intelligent trap, identifying events related to insects is not enough: all events must also be classified by insect species in real time. Hence researchers have used a form of classification in which a trap can decide whether to capture or release an insect according to its species.

To do so, the data must first be collected. Most studies show that image-based identification does not produce accurate results, so acoustic information in the form of temporal data is needed.

The wingbeat sounds of flying insects, together with additional features, improve the classification performance. These additional features include species-specific flight activity with respect to circadian rhythms, captured as the time of intercept. Time series data mining algorithms are needed to mine the temporal acoustic data of flying insects. Most of the surveyed literature shows that the classic Dynamic Time Warping (DTW) measure is the best measure in most domains. It has been applied in robotics, medicine, biometrics, music and speech processing, climatology, aviation, gesture recognition, user interfaces, industrial processing, geology, astronomy, space exploration, wildlife monitoring, etc. The DTW algorithm is fast and has low complexity, but its recognition accuracy is poor.

The improved DTW algorithm uses nonlinear median filtering to improve the recognition accuracy. The hidden Markov model is more complex and achieves very high accuracy because it uses training.

Voice activity detection finds the beginning and end of the recorded voice. Noise reduction techniques such as cepstrum mean subtraction (CMS), running spectrum filtering (RSF), and dynamic range adjustment (DRA) produce high recognition accuracy in HMM techniques even at a low signal-to-noise ratio. These techniques are not applied in the DTW algorithm.

RELATED WORK

The following papers were surveyed in developing the proposed work.

Previous research shows that acoustic microphones have been used to collect data [6, 25]. Sound attenuates according to the inverse square law: the sound intensity captured by a microphone drops to one ninth if an insect flies three times farther away from it. If more sensitive microphones are used in pursuit of greater accuracy, they also pick up wind noise from the environment. Moreover, data collection becomes very difficult under harsh climatic conditions. Some studies resorted to forcing nocturnal insects to fly by tapping and prodding them under bright halogen lights, or recorded flying insects in confined spaces or under extreme temperatures [4, 20], which has drawn criticism. In certain special cases, insects were tethered with string to confine them within the range of the microphone [4]. It is hard to produce results that hold in natural conditions with such insect data.

The difficulty of obtaining data has limited the amount of data available for such research. However, it is known that classification models built with more data perform better [12, 24].

Poor-quality data and sparse data have led researchers to learn very complicated classification models. The use of neural networks [7] produces good results in such cases. However, neural networks have many parameters and settings, including the interconnection pattern between different layers of neurons, the learning process for updating the weights of the interconnections, the activation function that converts a neuron's weighted input to its output activation, etc. [8]. It is difficult to overstate how optimistic the results of neural network experiments can be.

Semi-supervised learning (SSL) is applied where labeled data are limited [28]. Self-training is a general framework with few assumptions. In self-training, a classifier is first trained with a small number of labeled data; it then classifies the unlabeled data and adds the most confidently classified object to the labeled set. The classifier re-trains itself using the new labeled set, and the procedure is repeated until adding new data to the labeled set no longer increases the accuracy of the classifier or some other stopping criterion is met.
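The self-training loop described above can be sketched in Python as follows. This is a minimal illustration, not the method of the cited work: it assumes a 1-NN classifier with Euclidean distance, uses the nearest-neighbour distance as a stand-in for classifier confidence, and stops after a fixed number of rounds. The function name `self_train` and its parameters are hypothetical.

```python
import math

def self_train(labeled, unlabeled, rounds):
    """Grow a labeled set by repeatedly labeling the most confident object.

    labeled:   list of (vector, label) pairs, vectors are tuples of floats
    unlabeled: list of vectors
    rounds:    maximum number of objects to move (assumed stopping criterion)
    """
    labeled = list(labeled)
    pool = list(unlabeled)
    for _ in range(rounds):
        if not pool:
            break
        best = None
        for idx, x in enumerate(pool):
            # 1-NN prediction; a smaller distance is treated as higher confidence
            dist, label = min((math.dist(x, v), lab) for v, lab in labeled)
            if best is None or dist < best[0]:
                best = (dist, idx, label)
        _, idx, label = best
        # Move the most confidently classified object into the labeled set
        labeled.append((pool.pop(idx), label))
    return labeled
```

Because each newly labeled object immediately serves as a reference, labels can spread along a chain of nearby unlabeled objects, which is the intended effect of self-training and also its main risk when an early label is wrong.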

A similarity metric combined with 1-NN classification, either shape-based or structure-based, is applied to classify noisy or long data [27]. Transforming time series data into an alternative space can improve the performance of classifiers.

FLYING INSECT DETECTION AND CLASSIFICATION

Dynamic Time Warping

Dynamic Time Warping (DTW) is a time-series data mining algorithm that compares and aligns two sequences of data points.

Consider two sequences A and B, composed respectively of n and m feature vectors.

A = a1, a2, .... , ai, ...., an

B = b1, b2, .... , bj, ...., bm

Each feature vector is d-dimensional and is represented as a point in a d-dimensional space.

Dynamic time warping proceeds by warping the time axis iteratively until an optimal match between the two sequences is found.

In the figure above, which is an example of two sequences of data points with only 1 dimension, the time axis is warped so that each data point in the green sequence is optimally aligned to a point in the blue sequence.
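As an illustration of the alignment described above, the following Python sketch computes the classic DTW distance between two sequences of d-dimensional feature vectors by dynamic programming. It assumes Euclidean distance between feature vectors and the usual three-step pattern (insertion, deletion, match); the function name `dtw_distance` is hypothetical.

```python
import math

def dtw_distance(a, b):
    """Accumulated cost of the optimal warping path between sequences a and b.

    a, b: lists of feature vectors (tuples of floats of equal dimension d).
    """
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])  # local Euclidean distance
            cost[i][j] = d + min(cost[i - 1][j],      # a[i-1] repeats b[j-1]
                                 cost[i][j - 1],      # b[j-1] repeats a[i-1]
                                 cost[i - 1][j - 1])  # one-to-one step
    return cost[n][m]
```

For one-dimensional data, each point is passed as a one-element tuple, for example `dtw_distance([(0.0,), (1.0,)], [(0.0,), (1.0,), (1.0,)])`. The table has n × m cells, so the cost grows with the product of the two sequence lengths.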

DTW With Nonlinear Median Filtering

The conventional DTW algorithm is discrete and is therefore modified using the Sakoe-Chiba and Itakura constraints to improve the recognition accuracy. The warping function and adjustment window of DTW with Sakoe-Chiba and DTW with Itakura are shown in the figures below.

Figure 3
A.1 Time warping of two sound sequences

Figure 3
B.1 Sakoe-Chiba's DTW algorithm

Figure 3
B.2 Itakura's DTW algorithm
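To make the idea of an adjustment window concrete, the Python sketch below restricts DTW to a Sakoe-Chiba band, so that element i of one sequence may only be aligned with elements j satisfying |i - j| ≤ w. It is a sketch under stated assumptions: scalar samples, absolute difference as the local distance, and a symmetric step pattern. The Itakura parallelogram constraint is not shown, and the name `dtw_banded` is hypothetical.

```python
import math

def dtw_banded(a, b, window):
    """DTW distance between scalar sequences a and b within a Sakoe-Chiba band."""
    n, m = len(a), len(b)
    # The band must be at least |n - m| wide for the end point to be reachable
    w = max(window, abs(n - m))
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        # Only cells inside the band are filled; the rest stay at infinity
        for j in range(max(1, i - w), min(m, i + w) + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],
                                 cost[i][j - 1],
                                 cost[i - 1][j - 1])
    return cost[n][m]
```

With `window=0` and equal-length inputs the path is forced onto the diagonal, which reduces to a point-by-point comparison; widening the band allows more warping at the price of more cells to fill.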

Clustering of Output Sequences based on HMM

This approach consists of two steps. First, a pairwise distance between observed sequences is computed from a symmetrized similarity. This similarity is obtained by training an HMM for each sequence and then computing the log-likelihood (LL) of each sequence under each model. This information is used to build an LL matrix, which is then used to cluster the sequences into K groups using a hierarchical algorithm.

Second, one HMM is trained for each cluster, and the resulting K models are merged into a "composite" global HMM, in which each cluster HMM forms a disjoint part. This initial estimate is then refined. As a result, a global HMM modeling all the data is obtained. The number of clusters is selected using a cross-validation method.

A discrete-time hidden Markov model λ can be viewed as a Markov model whose states are not directly observed. Instead, each state is characterized by a probability distribution function that models the observations corresponding to that state. More formally, an HMM is defined by the following entities:

S = {S_1, S_2, ..., S_N}, the set of possible (hidden) states.

The transition matrix A = {a_ij}, representing the probability of moving from state S_i to state S_j: a_ij = P[q_{t+1} = S_j | q_t = S_i], 1 ≤ i, j ≤ N.

The emission matrix B = {b(o | S_j)}, indicating the probability of emitting symbol o ∈ V when the system state is S_j. V can be a discrete alphabet or a continuous set (e.g. V = ℝ), in which case b(o | S_j) is a probability density function.

π = {π_i}, the initial state probability distribution: π_i = P[q_1 = S_i], 1 ≤ i ≤ N.

The HMM is denoted by the triplet λ = (A, B, π).
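Given these definitions, the log-likelihood of an observation sequence under a model λ = (A, B, π) can be computed with the forward algorithm. The Python sketch below assumes a discrete alphabet with observations encoded as integer symbol indices, and it rescales the forward variables at each step to avoid numerical underflow. The function name `log_likelihood` is hypothetical.

```python
import math

def log_likelihood(obs, A, B, pi):
    """Return log P(obs | lambda) for a discrete HMM lambda = (A, B, pi).

    obs: list of symbol indices
    A:   A[i][j] = probability of moving from state i to state j
    B:   B[j][o] = probability of emitting symbol o in state j
    pi:  pi[i]   = probability of starting in state i
    """
    n_states = len(pi)
    # Initialisation: alpha_1(i) = pi_i * b(o_1 | S_i)
    alpha = [pi[i] * B[i][obs[0]] for i in range(n_states)]
    log_prob = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Induction: alpha_t(j) = (sum_i alpha_{t-1}(i) * a_ij) * b(o_t | S_j)
            alpha = [sum(alpha[i] * A[i][j] for i in range(n_states)) * B[j][obs[t]]
                     for j in range(n_states)]
        scale = sum(alpha)
        if scale == 0.0:
            return -math.inf  # the sequence is impossible under this model
        log_prob += math.log(scale)
        alpha = [x / scale for x in alpha]  # rescale so alpha sums to one
    return log_prob
```

The sum of the logged scale factors equals the log of the unscaled forward probability, so the result is the same quantity that an unscaled forward pass would give, without the underflow on long sequences.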

Consider a given set of N sequences {O_1 ... O_N} to be clustered. The algorithm performs the following steps:

  1. Train one HMM λ_i for each sequence O_i.

  2. Compute the distance matrix D = {D(O_i, O_j)}, representing a similarity measure between sequences. This is done either from the forward probabilities P(O_j | λ_i) or by devising a measure of distance between models.

  3. Use a pairwise distance-matrix-based method to perform the clustering.
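Steps 2 and 3 can be sketched in Python once the log-likelihood matrix is available. The sketch assumes that `ll[i][j]` holds log P(O_j | λ_i) from already trained models (training itself is not shown). The particular symmetrization used here, which averages how much worse each model explains the other sequence than its own, is one common choice and an assumption on our part, as is the use of average-linkage agglomerative clustering. The function names are hypothetical.

```python
def symmetrized_distances(ll):
    """Turn a log-likelihood matrix into a symmetric distance matrix.

    ll[i][j] is assumed to be log P(O_j | lambda_i).
    """
    n = len(ll)
    return [[0.5 * ((ll[i][i] - ll[i][j]) + (ll[j][j] - ll[j][i]))
             for j in range(n)] for i in range(n)]

def agglomerative(dist, k):
    """Average-linkage hierarchical clustering of items 0..n-1 into k groups."""
    clusters = [[i] for i in range(len(dist))]

    def linkage(c1, c2):
        # Mean pairwise distance between the members of two clusters
        return sum(dist[i][j] for i in c1 for j in c2) / (len(c1) * len(c2))

    while len(clusters) > k:
        # Find and merge the closest pair of clusters
        _, p, q = min((linkage(clusters[p], clusters[q]), p, q)
                      for p in range(len(clusters))
                      for q in range(p + 1, len(clusters)))
        clusters[p] = clusters[p] + clusters[q]
        del clusters[q]
    return [sorted(c) for c in clusters]
```

A call such as `agglomerative(symmetrized_distances(ll), k)` returns k lists of sequence indices; each list would then be used to train one cluster HMM.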

Clustering with DTW and HMM

When fitting an HMM to a set of sequences, the induction algorithm will try to fit all the sequences in the set equally well. Because the number of states is set in advance and not learned from the data, it is not clear how the states are "allocated" to the different sequences. It is likely that the states' observation probability distributions will cover the regions of the observation space most often visited by the given sequences, and that the state transition probabilities will be changed accordingly.

Therefore, if we assume that the sequences in a training set were generated by some hidden Markov models and our task is to identify these models, then it is advantageous to start the HMM clustering algorithm with even an approximate initial clustering. If the majority of sequences in the initial cluster come from the same model, then it is likely that the learned compromise HMM will be closer to this one model. Since the DTW clustering technique can provide a good initial partition, the HMM clustering algorithm is initialized with it. For each cluster in the DTW partitioning, an HMM is created by applying the fixed-point operation described in the previous section to the sequences of the cluster. The remaining sequences from each DTW cluster are then checked against the HMMs of the other DTW clusters. Finally, if any sequences are still unassigned to an HMM, they are placed in a set that is clustered solely by HMM clustering.

METHODOLOGY

Instruments to Record Flying Sounds

The most important components of our framework are a sensor, consisting of a phototransistor array with an electronic board, and a laser line pointing at the phototransistor array. When an insect flies across the laser beam, its wings partially occlude the light, causing small light fluctuations. The phototransistor array captures these fluctuations as changes in current, and the signal is then filtered and amplified by the electronic board.

The output of the electronic board is recorded as audio data in the MP3 format. Each MP3 file lasts 6 hours, and a new file starts recording immediately after the previous one reaches 6 hours, so the data are continuous and form a time series. The length of each MP3 file is limited by the device firmware rather than by disk space. The MP3 standard is lossy and optimized for human perception of speech and music. However, most flying insects produce sounds that are well within the range of human hearing, and comparison with lossless recordings indicates that no exploitable information is lost.

Voice Activity Detection

The proposed work proceeds in two stages: feature extraction, followed by classification. Both the feature extraction and the classification arise naturally from models for source separation. Because humans tend to perceive spectral features of audio, at least on short time scales, it is natural to use frequency-domain rather than time-domain features in audio processing. In source separation, it is typical to work with invertible transforms such as the Short-Time Fourier Transform (STFT), because it is necessary to recover the time-domain signals.

Classifying each time frame as either speech or non-speech is straightforward. We simply sum up the speech activations

where K_S is the total number of speech features, to produce a single activity number a_t for each frame. After median filtering a_t to produce a smoothed estimate ã_t, we classify a frame as speech if ã_t > c and as non-speech otherwise. The user can adjust the threshold c depending on the desired tradeoff between false positives and false negatives.

The classification algorithm depends only on the speech activations and not on the noise activations. This ensures that our algorithm is robust to non-stationary noise environments, where the signal-to-noise ratio may fluctuate.
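The frame classification rule can be sketched in Python as follows. The sketch assumes the speech activations are already available as a K_S × T table (how they are estimated is not shown), and it uses a sliding median whose window is truncated at the edges of the signal. The function name `classify_frames` and the `width` parameter are hypothetical.

```python
from statistics import median

def classify_frames(speech_activations, threshold, width=3):
    """Label each frame as speech (True) or non-speech (False).

    speech_activations: speech_activations[k][t] is the activation of
                        speech feature k in frame t (K_S rows, T columns)
    threshold:          the decision threshold c
    width:              odd length of the median filter window (assumed)
    """
    n_frames = len(speech_activations[0])
    # a_t: sum of all speech activations in frame t
    a = [sum(row[t] for row in speech_activations) for t in range(n_frames)]
    half = width // 2
    # Smoothed estimate: median of a over a window centred on t
    smoothed = [median(a[max(0, t - half): t + half + 1]) for t in range(n_frames)]
    return [s > threshold for s in smoothed]
```

The median filter removes isolated single-frame spikes that a plain threshold on a_t would label as speech, while leaving sustained activity intact. Raising the threshold trades false positives for false negatives.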

The algorithm with KL divergence, called Block KL-NMF, is:

The Dynamic Time Warping with Nonlinear Median Filter

Assume the matching distance matrix is

(1) Sorting the distances for every reference word in ascending order yields D'_m

(2) Computing the median by the NMF.

(3) In the approach we propose herein, the recognized word is the one corresponding to
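One plausible reading of the three steps above is sketched below in Python. It rests on assumptions that the text does not fully spell out: each reference word has several templates, the DTW distances from the input to those templates are already computed, an order-k filter takes the median of the k smallest distances for each word, and the recognized word is the one with the smallest filtered distance. Under this reading, order 1 reduces to the usual minimum-distance rule. The function name `recognize` is hypothetical.

```python
from statistics import median

def recognize(distances, order):
    """Pick the reference word with the smallest median-filtered DTW distance.

    distances: dict mapping each reference word to the list of DTW distances
               between the input and that word's templates
    order:     number of smallest distances fed to the median (assumed
               meaning of the NMF order)
    """
    scores = {}
    for word, dists in distances.items():
        ranked = sorted(dists)                  # step (1): ascending sort
        scores[word] = median(ranked[:order])   # step (2): median of the first `order`
    return min(scores, key=scores.get)          # step (3): smallest filtered distance
```

The median makes the decision less sensitive to a single template that happens to match noise closely: a word with one accidental near match but otherwise large distances no longer wins once the order exceeds 1.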

RESULTS AND DISCUSSIONS

The performance of DTW compared with HMM is shown in the following tables.

Table 1
DTW recognition accuracy for 10 dB and 20 dB SNR white and babble noise, with and without NMF

Table 2
Recognition accuracy (%) of Itakura DTW with NMF and VAD

Table 3
Recognition accuracy (%) of Symmetric Sakoe-Chiba DTW with NMF and VAD

Table 4
Recognition accuracy (%) of Asymmetric Sakoe-Chiba DTW with NMF and VAD

The tables above show that the recognition accuracy of all three DTW algorithms is much improved by the NMF. The Itakura DTW is better than the other systems. In Table 2, the recognition accuracy of DTW with NMF is 85.04% for 10 dB SNR and 97.08% for 20 dB SNR in white noise. In babble noise, the recognition accuracy of DTW with NMF is 77.38% for 10 dB SNR and 92.82% for 20 dB SNR. The HMM-based recognition accuracy is 85.6% for 10 dB SNR and 97.5% for 20 dB SNR in white noise, and 77.0% for 10 dB SNR and 95.7% for 20 dB SNR in babble noise.

The figure below presents the recognition accuracy of DTW with NMF for various filter orders. DTW with order-1 NMF is the same as the conventional DTW approach, which uses the minimum distance for recognition, and it produces the poorest accuracy. Accuracy improves with increasing NMF order and reaches its maximum at an order of around 9. The proposed method achieves a recognition accuracy of 85%, whereas conventional DTW achieves 77.36%, at 10 dB SNR. Accuracy therefore rises with the NMF order up to about order 9.

The bar graph shows the recognition rates of the three DTW algorithms with NMF in white noise. The recognition rate of Itakura DTW is the best among the DTW algorithms and is very close to that of HMM in white noise.

Figure 4
A.1 Cages for Data Gathering

Figure 4
A.2 Components of the Acoustic Sensors

Figure 5:
Accuracy of DTW with NMF vs. Filter order

Figure 6:
The accuracy (%) of three DTW algorithms with NMF of order 9 at 10 and 20 dB SNR.

CONCLUSION

In this paper, pseudo-acoustic optical sensors are used to produce vastly superior data, and dynamic time warping with nonlinear median filtering, combined with a hidden Markov model through clustering, is used for robust and general classification based on intrinsic and extrinsic features of the insect's flight behavior. The accuracy achieved by the proposed work enables researchers to apply it in entomological research as well. This work improves recognition accuracy: DTW with NMF yields 85.04% accuracy, compared to 77.36% for conventional Itakura DTW at 10 dB SNR. This means that our new DTW method approaches the accuracy of the HMM-based method.

In the proposed work, manual intervention is needed to set the initial parameters and domain-dependent settings. Handling situations where the patterns are very rare also remains open. Future work should address these issues.

REFERENCES

  • 1
    P. Chazal, M. O'Dwyer, R.B. Reilly, "Automatic classification of heartbeats using ECG morphology and heartbeat interval features," IEEE Trans Biomed Eng, 51: 1196-1206, 2004
  • 2
    E. J. Keogh, Q. Zhu, B. Hu, Y. Hao, X. Xi, L. Wei, C. A. Ratanamahatana, The UCR Time Series Classification/Clustering Homepage, 2011
  • 3
    P. Sykacek, S.J. Roberts, "Bayesian time series classification," NIPS : 937-944, 2001
  • 4
    P. Sykacek, S.J. Roberts, "Bayesian time series classification," NIPS : 937-944, 2001
  • 5
    P. Belton, R. Costello, "Flight sounds of the females of some mosquitoes of Western Canada." Entomologia experimentalis et applicata, 26(1), 105-114, 1979
  • 6
    S. Boll, "Suppression of acoustic noise in speech using spectral subtraction," IEEE Transactions on Acoustics, Speech and Signal Processing, 27(2), 113-120, 1979
  • 7
    Y. Chen, Supporting Materials, https://sites.google.com/site/insectclassification/, 2013
  • 8
    Y. Chen, B. Hu, E. Keogh, G. Batista, "DTW-D: time series semi-supervised learning from a single example." In Proceedings of the 19th ACM SIGKDD international conference on Knowledge discovery and data mining, pp. 383-391, 2013
  • 9
    MA. Deakin, "Formulae for insect wingbeat frequency." Journal of Insect Science, 10(96): 1, 2010
  • 10
    YP. Mack, M. Rosenblatt, "Multivariate k-nearest neighbor density estimates." Journal of Multivariate Analysis, 9(1), 1-15, 1979
  • 11
    P. Mermelstein, "Distance measures for speech recognition, psychological and instrumental," Pattern Recognition and Artificial Intelligence, 116, 374-388, 1976.
  • 12
    A. Moore, RH. Miller, "Automated identification of optically sensed aphid (Homoptera: Aphidae) wingbeat waveforms." Ann. Entomol. Soc. Am. 95: 1-8, 2002
  • 13
    L. Wei and E. Keogh, "Semi-Supervised Time Series Classification." SIGKDD 2006.
  • 14
    X. Xi, E. J. Keogh, C. R. Shelton, L. Wei, C. A. Ratanamahatana, "Fast time series classification using numerosity reduction," ICML: 1033-40, 2006
  • 15
    G. Batista, E. Keogh, A. Mafra-Neto, E. Rowton, "Sensors and software to allow computational entomology, an emerging application of data mining." KDD: 761-764, 2011.
  • 16
    Ding, G. Trajcevski, P. Scheuermann, X. Wang, E. J. Keogh. "Querying and mining of time series data. Experimental comparison of representations and distance measures." PVLDB 1(2): 1542-1552, 2008.
  • 17
    L. Morales, M. P. Arbetman, S. A. Cameron, and M. A. Aizen. "Rapid ecological replacement of a native bumble bee by invasive species." Frontiers in Ecology and the Environment, 2013.
  • 18
    P. Chazal, M. O'Dwyer, R.B. Reilly, "Automatic classification of heartbeats using ECG morphology and heartbeat interval features," IEEE Trans Biomed Eng, 51: 1196-1206, 2004
  • 19
    N. Begum, B. Hu, T. Rakthanmanon, E. Keogh, "A Minimum Description Length Technique for Semi-Supervised Time Series Classification." Integration of Reusable Systems., Springer International Publishing: 171-192, 2014.
  • 20
    N. Begum, B. Hu, T. Rakthanmanon, E. Keogh, "Towards a Minimum Description Length Based Stopping Criterion for Semi-Supervised Time Series Classification." IRI : 333 - 340, 2013.
  • 21
    G. Batista, E. J. Keogh, A. Mafra-Neto, E. Rowton, "SIGKDD demo: Sensors and Software to allow Computational Entomology, an Emerging Application of Data Mining." KDD'11: 761-764, 2011.
  • 22
    Taylor, MDR. Jones, "The circadian rhythm of flight activity in the mosquito Aedes aegypti (L.): the phase-setting effects of light-on and light-off." Journal of Experimental Biology 51(1): 59-70, 1969
  • 23
    J. Shotton, T. Sharp, A. Kipman, A. Fitzgibbon, M. Finocchio, A. Blake, M. Cook, R. Moore, "Real-time human pose recognition in parts from single depth images." Communications of the ACM, 56(1), 116-124, 2013
  • 24
    SSC. Rund, SJ. Lee, BR. Bush, GE. Duffield, "Strain- and sex-specific differences in daily flight activity and the circadian clock of Anopheles gambiae mosquitoes." Journal of Insect Physiology 58: 1609-19, 2012
  • 25
    KS. Repasky, JA. Shaw, R. Scheppele, C. Melton, JL. Carsten, LH. Spangler, "Optical detection of honeybees by use of wing-beat modulation of scattered laser light for locating explosives and land mines." Appl. Opt., 45: 1839-1843, 2006
  • 26
    SC. Reed, CM. Williams, LE. Chadwick, "Frequency of wing-beat as a character for separating species races and geographic varieties of Drosophila." Genetics 349-361, 1942
  • 27
    Bagnall, A., Davis, L.M., Hills, J., Lines, J.: Transformation Based Ensembles for Time Series Classification. In: SDM. vol. 12, pp. 307-318. SIAM (2012)
  • 28
    M. A. Ranzato, M. Szummer, "Semi-supervised learning of compact document representations with deep networks," ICML: 792-799, 2008

Publication Dates

  • Publication in this collection
    2016

History

  • Received
    03 Feb 2016
  • Accepted
    14 July 2016
Instituto de Tecnologia do Paraná - Tecpar Rua Prof. Algacyr Munhoz Mader, 3775 - CIC, 81350-010 Curitiba PR Brazil, Tel.: +55 41 3316-3052/3054, Fax: +55 41 3346-2872 - Curitiba - PR - Brazil
E-mail: babt@tecpar.br