HYPERSPECTRAL IMAGE DENOISING USING MULTIPLE LINEAR REGRESSION AND BIVARIATE SHRINKAGE WITH 2-D DUAL-TREE COMPLEX WAVELET IN THE SPECTRAL DERIVATIVE DOMAIN

In this paper, a new denoising method is proposed for hyperspectral remote sensing images, and tested on both the simulated and the real-life datacubes. Predicted datacube of the hyperspectral images is calculated by multiple linear regression in the spectral domain based on the strong spectral correlation of the useful signal and the inter-band uncorrelation of the random noise terms in hyperspectral images. A two dimensional dual-tree complex wavelet transform is performed in the spectral derivative domain, where the noise level is elevated temporarily to avoid signal deformation during the wavelet denoising, and then the bivariate shrinkage is used to shrink the wavelet coefficients. Simulated experimental results demonstrate that the proposed method obtains better results than the other denoising methods proposed in the reference, improves the signal to noise ratio up to 0.5dB to 10dB. The real-life data experiment shows that the proposed method is valid and effective.


Introdutcion
Hyperspectral remote sensing images can be viewed as three-dimensional data consisted of onedimensional spectral information and two-dimensional spatial information.The reliability of the information delivered by hyperspectral remote sensing applications highly depends on the quality of the captured data.Although over the last decades the development of imaging spectrometers is flourishing, the hyperspectral remote sensing image is still affected by many complex factors during the procedures of acquisition and transmission, which will produce a mass of noises.It will affect the target detection, classification and segmentation of hyperspectral image.Therefore, it is necessary to study the denoising method for hyperspectral images (Sun and Luo, 2009).The minimum noise fraction and smoothing filters are widely used for the hyperspectral image noise reduction.However, as the minimum noise fraction and smoothing filters have a negative effect on sharp signal when removing the noise from the noisy image, the spectral curve will be damaged (Green, et al, 1988;Lee, et al, 1990).In recent years, many noise reduction methods for hyperspectral images have been constantly introduced with the development of hyperspectral remote sensing technology.Most of noise reduction methods combine correlation of spatial and spectral domain.Currently, wavelet is widely used for denoising methods (Donoho, 1995;Donoho and Johnstone, 1994;Donoho and Johnstone, 1995).Signal denoising based on the wavelet thresholding and shrinkage has the near-optimal property in the minimax sense.Mihcak et al. (1999) utilize the observed noisy data in a local neighbourhood to perform an approximate maximum a posteriori estimation of the variance for each coefficient.Then, an approximate minimum mean squared error estimation procedure is used to deal with the noisy image coefficients.Sendur and Selesnick (2002) develop a bivariate shrinkage function for image denoising.Their results show that the estimated wavelet coefficients depend on the parent coefficients.The smaller the parent coefficients are, the greater the shrinkage is.Chen andQian (2008, 2009) combine wavelet denoising with dimensionality reduction for hyperspectral image by using bivariate wavelet thresholding, wavelet packets, and principal component analysis (PCA).Othman and Qian (2006) develop a noise reduction algorithm, the hybrid spatial-spectral noise-reduction algorithm (HSSNR) for hyperspectral datacube.The algorithm resorts to the spectral derivative domain, where the noise level is elevated temporarily to avoid signal deformation during the wavelet denoising, and it benefits from the dissimilarity of the signal regularity in the spatial and spectral domains.Recently, Chen and Qian (2011) develop a new denoising method to avoid removing the fine features of the datacubes during denoising process.PCA is used to decorrelate the fine features of the datacube from noise.Bivariate wavelet thresholding is used in denoising the low-energy PCA output channels.And a 1-D signal denoising method is proposed by using the dual-tree complex Bol.Ciênc.Geod., sec.Artigos, Curitiba, v. 22, n o 4, p.822 -834, out -dez, 2016.
wavelet transform and considering a local neighborhood during the thresholding process.In this paper the proposed method in (Chen and Qian, 2011) is referred to as PCABS.
Signal-to-noise ratio (SNR) is a key parameter on measuring the hyperspectral image quality.A sufficiently high SNR can be achieved by adopting some excessive measures in the instrument design, such as increasing the size of the optical system, the integration time, or the detector area.However, these measures are prohibitively expensive, particularly in the spaceborne instruments (Chen and Qian, 2011).So noise reduction algorithms provide a high-effective solution which is becoming more and more affordable in terms of computational time and expense, due to the availability of the advanced computing devices.The reference (Chen, et al, 2010) shows that, for 3-D data cube, it is necessary to perform 1-D spectral signature denoising in addition to 2-D denoising for every spectral band.Xu, et al (2013) propose denoising method for hyperspectral remote sensing image based on multiple linear regression and wavelet shrinkage.To get better performance, curvelet transform is employed for hyperspectral image denoising, which overcomes the major drawback that wavelet cannot really represent twodimensional objects with edges sparsely (Xu, et al, 2013).
In this paper, due to the strong spectral correlation, the multiple linear regression (MLR) and the spectral derivative are employed to remove the one-dimensional spectral noise.Since the predicted datacubes obtained by MLR have a relatively high SNR, the noise is subtle to distinguish from the useful signals.In order to avoid removing the useful signal during the denoising process, the noise level is elevated temporarily by transforming the datacube into the spectral derivative domain.Then two-dimensional dual-tree complex wavelet transform (DTCWT2) is performed on the datacube in the spectral derivative domain and the wavelet coefficients are shrank by bivariate wavelet thresholding to remove the spatial noise.
Mathematical tools applied in this paper are introduced in Section 2. Section 3 proposes the new denoising algorithm.The noise model is given in Section 4, and the simulated and the real-life data experiment are performed.Section 5 draws the conclusion.

Mathematical Tools
In this section, several mathematical tools are introduced for proposing the noise reduction frame for the hyperspectral image.First, the MLR theory is introduced, which has been widely applied to denoise the hyperspectral images (Roger and Arnold, 1996;Gao, et al, 2008;Bioucas-Dias and Nascimento, 2008;Acito, et al, 2011).In this paper, MLR is considered to perform 1-D spectral noise reduction.For 2-D spatial denoising, three mathematical tools spectral derivative, 2-D dual-tree complex wavelet and Bivariate shrinkage (BS) are introduced.The useful datacube obtained by MLR has high SNR, so it is necessary to transform the useful datacube into the spectral derivative domain to evaluate the noise level, in order to avoid removing the fine features as the noise during the denoising process in the spatial domain.To overcome the shortcomings of the traditional wavelet, the complex wavelet is introduced, which is nearly shift invariant, and directionally selective in higher dimensions.In terms of computational complexity and denoising performance, the bivariate wavelet thresholding is introduced as an effective and low-complexity image denoising method.

MLR Model
Assume that the hyperspectral image has B spectral bands and each band of hyperspectral image has MN  pixels.Let X denote a PB  matrix of the B spectral observed vectors of size P ( is the -th k column vector of the matrix X .Namely, lexicographically arranging -th k band image into a column vector X k .In this paper, Xk is the vector predicted for the signal X k of band k .The L adjacent bands (not including itself) are utilized to perform MLR, that is X X ( 1, 2, , ) where the , the least squares estimator of the regression vector

Spectral derivative domain
It is well known that hyperspectral data in derivative domain is sensitive to noise, but it is relatively less sensitive in the spectral domain (Othman and Qian, 2006;Tsai and Philpot, 1998).
Transforming hyperspectral datacube into the spectral derivative domain is equivalent to highpass filtering, where the noise level is elevated.The derivative of the spectral band image is defined as follows ( , , ) ( , , ) ( , , ) where ( , , ) x i j  stands for the pixel value of the position ( , ) ij in spectral band center  and   is the distance between the adjacent bands.In computation, the equation ( 5) is also given as follows ( , , ) ( , , 1) ( , , ), 1, 2, where ( , , ) x i j k stands for the pixel value of the position ( , ) ij in band k , ( , , ) x i j k  stands for the pixel value of the derivative domain and B is the total number of spectral bands in hyperspectral image.

2-D dual-tree complex wavelet
Wavelet has a good time-frequency-localization property.As a result of multi-resolution approach, the signal power is concentrated in the low frequency subband by wavelet transform, and the high frequency subband describes non-stationary characteristics of the signal well.However, the real wavelet transform suffers from four fundamental, intertwined shortcomings, which are oscillations, shift variance, aliasing and lack of directionality.
The complex wavelet transform (CWT) is proposed to overcome the four shortcomings.The dual-tree complex wavelet transform (DTCWT) is the relative enhancement to the discrete wavelet transform (DWT) (Selesnick, et al, 2005).Comparing to DWT, DTCWT has two important additional properties, which is nearly shift invariant, and directionally selective in higher dimensions.This means that the squared magnitude of a given complex wavelet coefficient provides an accurate measure of spectral energy at a particular location in space, scale and orientation.The CWT can also give a substantial performance boost to DWT noise reduction algorithms.It also means that CWT-based algorithm will automatically be almost shift invariant, thus reducing many of the artefacts of the critically sampled DWT.And oscillations and aliasing have also been overcome during performing the DTCWT.
The effective performances of several other denoising algorithms using the CWT have also been described in (Sendur and Selesnick, 2002;Ye and Lu, 2003).The CWT obtains better results than the DWT, which is an important reason for using the CWT in this paper.

Bivariate shrinkage (BS)
It is important to choose the wavelet thresholding for wavelet-based image denoising algorithms.In this paper, bivariate wavelet thresholding is introduced because of its effectiveness and lowcomplexity (Sendur and Selesnick, 2002).The method exploits the statistical dependencies between a coefficient and its parent.Moreover, the combination of BS with local variance estimation and CWT obtains better results than other wavelet-based image denoising algorithms.x are the wavelet coefficients at the finest scale, and where M is the number of pixels in the 2-D neighbourhood window S .The size of S is chosen to be 77  in terms of denoising performance and computational time.
For the complex-valued coefficients, the real and imaginary parts are not shift invariant individually but their magnitudes are.So when thresholding the complex-valued coefficients of the CWT, it is typically more effective to apply the nonlinearity to the magnitude rather than to the real and imaginary parts separately.

Denoising Algorithm
The denoising process is as follows (Figure 1): (1) The 1-D spectral noise is firstly removed by the MLR, which utilizes L bands (not include itself); (2) The noise level is elevated temporarily by transforming hyperspectral datacube into the spectral derivative domain; (3) Performing the 2-D DTCWT (DTCWT2) on each band in the spectral derivative domain; (4) Bivariate wavelet thresholding is performed to shrink the complex-valued coefficients; (5) Performing the inverse DTCWT2 (IDCWT2) on the complex -valued coefficients which have been shrank;  In the last step of the denoising process, the denoised signal ˆ( , , ) x i j k is retrieved by spectral integration (Othman and Qian, 2006) where the denoised spectral derivative x  is obtained by the inverse DTCWT2 (IDTCWT2).The process is given by According to the equation ( 9), computational error accumulated due to the integration in the spectral derivative of hyperspectral image after denoising, and the error grows with k .In order to suppress the growing error, the moving average filter (MAF) is applied as shown in Figure 1.
The MAF is a simple low-pass filter, and its width of sliding window is 1  , which is also named as the correction window.The integration updating process is given by where ( , , ) x i j k is the denoised signal after correction, ˆ( , , ) x i j k is the denoised signal before correction, ( , , ) x i j l is the noisy signal and 1  is the size of correction window.The reason for using the noisy signal (instead of the pure signal) is that the pure signal is unknown.The size of correction window is 5.

Experimental Results
In this section, the noise model is introduced for simulated experiment.The noise model is composed of both the signal-dependent noise and the signal-independent noise.Different noises can be simulated by choosing different parameters.Then, both the simulated experiment and the real-life data experiment are carried out, compared with the other methods in the literature.

Noise Model
A generalized noise model has been proposed to deal with several different acquisition systems.In this paper, we use the following parametric model (Acito, et al, 2011;Alparone, et al, 2009) () where y is the observed noisy image, x is the noise-free image, () nx is the random noise.The noise corresponding to different pixel positions are statistically independent.In equation ( 12), () nx is composed of both photon and thermal noise and is modeled as a zero mean Gaussian random vector whose covariance matrix depends on the useful signal x .According to the reference (Acito, et al, 2011), the random noise has diagonal covariance matrix whose diagonal entries are the variances of each band 2 ( ) 1, , where kk ux  is regarded as signal-dependent noise and k w is regarded as the signal-independent noise.
Here, the SNR is defined as follows where X P is the power of the pure signals ( , , ) x i j k , and N P is the noise power in the noisy signals ( , , ) )

Experiment
The simulated experiment of the noise reduction is carried out on AVIRIS images, Jasper Ridge and Moffett Field provided by JPL, NASA.The size of datacube is extracted from the AVIRIS images for testing is 256  256  224 (width  height  band).Figure 2 shows the image of band 40.For the AVIRIS image a 28 28 mm  ground sample distance (GSD) datacube is derived by spatially averaging the 44 mm  GSD datacube elevating the SNR to 38.45dB.Having such high SNR, this datacube is viewed as a pure datacube (Othman and Qian, 2006), which is used as a reference to measure the SNR before and after denoising.The images are corrupted by noise model described before.In this paper, the SNR of the hyperspectral image for testing is 27.78dB and 30.00dB respectively, and the parameter  is chosen to be 0.25, 1 and 4, respectively.In order to illustrate the superiority of the proposed algorithm in this paper, the proposed method is compared with the algorithm HSSNR proposed in (Othman and Qian,, 2006), and the algorithm PCABS proposed in (Chen and Qian, 2011).In this paper, L is equal to 223, that is, all bands (not including itself) have been utilized to perform MLR.The DTCWT pack is acquired from the web: http://dap.rice.edu/.The types of wavelet are LeGall 5, 3 tap filters and quarter sample shift orthogonal (Q-Shift) 10, 10 tap filters, and the level of decomposition is 6, by which the best results have been obtained.Table 1 and Table 2 show the SNR of the hyperspectral image Jasper Ridge and Moffett Field denoised by the proposed method, the PCABS and the HSSNR.The results of Table 1 and Table 2 indicate that the proposed method is better than the HSSNR method and the PCABS method for hyperspectral image denoising in terms of SNR.We use OMIS (operational modular imaging spectrometer) data developed by the Shanghai Institute of Technical Physics of the Chinese Academy of Sciences for the real-life data experiment to verify the correctness and performance of algorithm.It has 128 spectral bands ranging from visible to thermal infrared wavelength.The size of datacube extracted from OMIS data for testing is 256  256  128 (width  height  band).We perform denoising for the original OMIS data.Figure 3 shows band #20, #60 #90 of the original image and the denoised image obtained by PCABS and the proposed method, and the difference between the original image and the denoised image obtained by PCABS and the proposed method in this paper.The difference is considered as the noise removed from the original image.Therefore the less fine details of the image in the difference means the better denoised results.From the Figure 3 (d) and (e), the image (d) contains more fine details than the image (e), which is easy to distinguish by the human eyes.The result shows that the PCABS method removes the finer details during the denoising process.Therefore, the proposed algorithm is superior to the PCABS in terms of removing noise and simultaneously maintaining fine features during the denoising process.To see statistics of the images to show the difference among the three images, we make the covariance matrix with them as follows:

5．CONCLUSION
A denoising method of hyperspectral remote sensing image based on the combination of MLR and bivariate shrinkage with 2-D dual-tree complex wavelet transform is proposed in this paper.Due to the strong spectral correlation of the useful signal and the weak spectral correlation of the random noise in hyperspectral images, MLR is employed to decouple the spectral correlation.To elevate the noise level, the hyperspectral image is transformed in the derivative domain.Bivariate shrinkage with dual-tree complex wavelet transform is performed to remove the spatial noise.
In the simulated experiment, AVIRIS images Jasper Ridge and Moffett Field are tested by HSSNR, PCABS and our proposed method.The results show that the all three methods improve the image quality in terms of SNR.For the 27.78dB noisy image, the proposed method improves the SNR up to 10dB, which is about 5dB higher than HSSNR, and 0.5dB to 0.8dB higher than PCABS.For the 30dB noisy image, the proposed method improves the SNR up to 9.5dB to 9.8dB, which is about 4.3dB higher than HSSNR, and about 0.7dB higher than PCABS.Among the three methods, our proposed method performs best in the simulated experiment.For the reallife OMIS data, the experiment results show that our proposed method maintains more fine details of the image during the denoising.This is better than the other two denoising methods which have already proven to perform well in the literature.

Figure 3 :
Figure 3: Band #20, #60 #90 of (a) the original image and (b) the denoised image obtained by the PCABS, (c) the denoised image obtained by the proposed method, and (d) the difference between the original image and the denoised image obtained by PCABS, (e) the difference between the original image and the denoised image obtained by the proposed method.

Table 1 :
SNR of Jasper Ridge by the HSSNR, PCABS, and the proposed algorithm (dB)

Table 2 :
SNR of Moffett Field by the HSSNR, PCABS, and the proposed algorithm (dB)