Is Texture Denoising Efficiency Predictable?

Images of different origin contain textures, and textural features in such regions are frequently employed in pattern recognition, image classification, information extraction, etc. Noise, which is often present in the analyzed images, might prevent a proper solution of basic tasks in the aforementioned applications and is worth suppressing. This is not an easy task since even the most advanced denoising methods destroy texture to a greater or lesser degree while removing noise. Thus, it is desirable to predict the filtering behavior before any denoising is applied. This paper studies the efficiency of texture image denoising for different noise intensities and several filter types under different visual quality criteria (quality metrics). It is demonstrated that the most efficient existing filters provide very similar results. From the obtained results, it is possible to generalize and employ the prediction strategy earlier proposed for denoising techniques based on the discrete cosine transform. Accuracy of such a predicti...


Introduction
Texture regions are found in almost all natural scene images, where they can occupy different percentages of the image area. 18 Textures play an important role in geomorphometry, 21,45 content-based image retrieval, 9 remote sensing, 19,40,46 pattern recognition and classification, 25,47 etc. Meanwhile, texture features are often masked or distorted by noise present in the acquired images. This noise can be of different types (additive, multiplicative, signal-dependent 13,46,48 ) and origins, being inherent to different types of images (optical, radar, medical, hyperspectral). Therefore, the task is to remove this noise while preserving the texture features as carefully as possible. 17,32,35,36,46,54 One might expect that this task of efficient texture denoising, which was already relevant a decade or two ago, 17,32,46 has been successfully solved with all the recent advances in image denoising (nonlocal filtering methods). 11,12,14,26,46 However, this is not true. As shown by Milanfar and Chatterjee, 7 the potential of the nonlocal filtering approach is limited for textural images. This has been empirically confirmed in Ref. 16 for highly textural images from the TID2013 database. 27 It has been shown in Refs. 35 and 36 that problems in noise removal arise for filters based on the discrete cosine transform (DCT) 16,22,26 and for one of the most advanced nonlocal filtering methods, the BM3D (block matching three dimensional) filter. 11 Then, one might think that denoising techniques based on other principles are able to cope with noise in texture images in a better way. Analysis carried out in recent papers (see Refs. 
34 and 36) shows that this is also not true for many advanced and efficient modern filters such as translation invariant wavelet shrinkage (TI-WS), 10 Bayesian least squares of Gaussian scale mixtures (BLS-GSM), 31 nonlocal means (NLM), 4 a filter based on principal component analysis with local pixel grouping (LPG-PCA) in the spatial domain, 6 and spatially adaptive iterative filtering (SAIF). 50 Recently, powerful clustering-based denoising schemes have been proposed: KSVD 43 and KLLD, 3 which use learned dictionaries in different ways, and a filter based on gradient histogram preservation (GHP). 54 Their analysis has shown that most of the aforementioned denoising techniques perform similarly (approximately at the same level as the standard DCT-based filter 22,26 and the BM3D filter 11 ) whilst the NLM and LPG-PCA filters 4,6 perform significantly worse. Nevertheless, it is worth performing a more careful analysis, for several reasons. The paper in Ref. 36 does not present data for the filters GHP, LPG-PCA, SAIF, KSVD, and KLLD. 3,6,43,50,54 The paper in Ref. 34 gives data for these filters in the form of scatter-plots used to predict filtering efficiency, which are difficult to analyze and compare. Besides, data are given as scatter-plots only for one denoising efficiency metric, the improvement of peak signal-to-noise ratio (IPSNR), whilst for visual quality metrics such as the Peak Signal-to-Noise Ratio accounting for the Human Visual System and masking (PSNR-HVS-M), 30 the Multiscale Structural Similarity Index Measure (MSSIM), 52 and the Feature Similarity Index Measure (FSIM) 53 only efficiency approximation curves are presented. 34 Here we would like to mention two important aspects of texture filtering that explain our attention to the listed metrics. Firstly, the visual quality of denoised images is important for many applications. 23,36,49 Thus, it is expedient to employ adequate visual quality metrics 23,28,41 in the analysis of filtering efficiency. 
Secondly, the positive effect of denoising (noise suppression) is often comparable to the negative effect of texture smearing or distortion. 36 Then, the following question arises: is it worth applying denoising at all? Accompanying questions are: can we predict the expedience of filtering for each particular case, and is it possible to reliably decide whether to carry out or to skip denoising?
As for visual quality and the metrics that can be employed to characterize it, there are no commonly accepted and fully reliable metrics. Studies in this direction continue. 23,28 If one wants a reliable assessment, it is reasonable to employ several adequate visual quality metrics and to check the consistency of the conclusions based on their analysis. Below, we follow this approach and consider the aforementioned metrics PSNR-HVS-M, MSSIM, and FSIM, which are among the best for the case of grayscale image denoising. 28 Concerning the prediction of denoising efficiency: by efficiency we mean the value(s) of some parameter(s) (indicator(s)) that can quantitatively characterize changes in image quality due to filtering. These can be the improvement of PSNR (IPSNR), the reduction of the output mean square error (MSE) compared to the noise variance in the original image, or improvements of visual quality metrics. The idea that such indicators of denoising efficiency can be predicted (estimated before image denoising is applied) has been put forward in Refs. 1 and 8. The approach proposed in Ref. 8 requires considerable computation; thus, the time needed to derive a prediction indicator is comparable to that of the filtering procedure itself, which restricts the practical application of this approach.
In fact, one needs a simple, fast and sufficiently accurate way to predict the efficiency quantitatively. The approach in Ref. 1 is fast and simple. It implies calculating one input parameter over a limited (small) number of 8 × 8 pixel blocks for which a 2D DCT is performed. This input parameter is then used for calculating the output parameter. The input and output parameters are linked by a function that can also be called an approximation (prediction) curve. This curve can be obtained by regression in offline mode (in advance, before applying it for prediction). Due to the small number of blocks and the simple calculations needed in them to determine the input parameter, the prediction can be carried out much faster than filtering. An important question is then: how accurately is the efficiency indicator estimated?
A more or less extensive analysis of this accuracy is performed in Refs. 1, 34, 35 and 36, showing that the accuracy depends upon many factors, including which input and output parameters are used, how the regression is calculated, which filter is analyzed, etc. Good prediction characteristics have already been obtained for many modern filters if IPSNR is predicted. However, the prediction is considerably less accurate for visual quality metrics. There exist methods to improve this accuracy 33,38,51 that, in particular, deal with the joint use of two or more input parameters. However, these options have not yet been tested for texture image denoising.
Therefore, the main contributions of this paper are the following. Firstly, a thorough analysis of the denoising efficiency for texture images corrupted by noise of different intensity is performed using several metrics. It allows comparing the filtering efficiency for the aforementioned set of filters and giving well-motivated practical recommendations on their use. In particular, it is shown that the NLM and LPG-PCA filters 29,30 do not perform well for texture images. Secondly, the prediction accuracy is analyzed and ways to improve it are proposed. Also, it is shown that the use of the input PSNR as the second input parameter allows improving the prediction accuracy.
The paper is structured as follows. Section 2 briefly describes the image/noise model, the considered test images, the analyzed efficiency criteria (metrics) and the filtering techniques. Section 3 deals with the analysis of denoising efficiency. Approaches to the prediction of denoising efficiency indicators are considered in Sec. 4. New solutions to the prediction are presented in Sec. 5. Finally, the conclusions follow.

2. Image-Noise Model, Test Images, Metrics and Filters

2.1. Image-noise model and used test images
In our study, we use a typical simple observation model for noisy grayscale images (or components of multichannel images):

$$I^{n}_{ij} = I^{tr}_{ij} + n_{ij}. \qquad (1)$$

Here $i, j$ denote pixel indices, $I^{n}_{ij}$ is the observed (noisy) image value, $I^{tr}_{ij}$ and $n_{ij}$ are the true image value and noise, respectively, $i = 1, \ldots, I_{Im}$ and $j = 1, \ldots, J_{Im}$; $I_{Im}$ and $J_{Im}$ define the image size. It is well understood that (1) is an idealized noise model.
Recall that texture is of prime interest. So, the images to be tested have to be either fully textural or contain large textured areas. Besides, we analyze the case of grayscale images here and are interested in the generality of the results. Taking these aspects into consideration, the 12 textural images presented in Fig. 1 were used in the experiments, the same as in our previous paper. 36 Nine images, with indices 1-8 and 11, have been taken from the USC-SIPI Image Database 44 and can be treated as fully textural images where the texture is the same over the entire image. Two other test images, #9 and #10 (Baboon and Grass), have been widely used in optical image analysis. The last test image (#12) is a good example of aerial remote sensing images of terrain having a complex (textural) structure; such textures are often used in geomorphometric analysis. 45
The noise $n_{ij}$ in the observation model (1) is supposed to be zero-mean additive white Gaussian noise (AWGN), which is the most widely used model in the image processing literature 36,48 and is quite adequate for many practical situations. Moreover, if the noise is signal-dependent or multiplicative, processing is often carried out using a homomorphic or a variance-stabilizing transform 24,38,40 that makes the noise additive and close to Gaussian in the transformed images subject to filtering. Note that the case of spatially correlated noise is more complicated, and some additional pre-whitening may be required. A more complex model has been briefly considered in Ref. 36. Consideration of spatially correlated noise models falls outside the scope of this paper; this could be a subject for future work.
A parameter that characterizes AWGN intensity is its variance $\sigma_0^2$ or standard deviation (STD) $\sigma_0$. In general, the noise variance values used in filter performance analysis vary in a very wide range; one can meet noise STD values up to 100 in some recent studies. In our opinion, it is enough to use three practical values. STD = 5 relates to the case of hardly noticeable noise 16 for 8-bit images; noise with such an STD will be further treated as low-intensity noise. STD = 10 corresponds to middle-intensity noise, and STD = 15 relates to intensive noise that can be regarded as annoying.
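The observation model with the three considered noise levels can be illustrated by a minimal sketch. It assumes NumPy is available; the function name `add_awgn`, the fixed seed and the flat mid-gray stand-in for a test image are hypothetical choices made only for illustration, and no clipping to the 8-bit range is applied so that the noise stays strictly additive.

```python
import numpy as np

def add_awgn(image, sigma0, seed=None):
    """Return image + zero-mean white Gaussian noise with STD sigma0."""
    rng = np.random.default_rng(seed)
    noise = rng.normal(loc=0.0, scale=sigma0, size=np.shape(image))
    # No clipping or rounding here, so the noise remains exactly additive.
    return np.asarray(image, dtype=np.float64) + noise

# Hypothetical stand-in for a true image: a flat mid-gray field.
true_image = np.full((256, 256), 128.0)

# One noise realization per considered STD value (5, 10 and 15).
noisy = {s: add_awgn(true_image, s, seed=0) for s in (5, 10, 15)}

# Empirical STD of the added noise, to check that it matches sigma0.
measured_std = {s: float(np.std(img - true_image)) for s, img in noisy.items()}
```

In an experiment of the kind described here, `true_image` would be replaced by an actual test image and several independent realizations would be generated per noise level.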

Used metrics
Denoising efficiency can be analyzed and quantitatively characterized in many different ways. Probably the most common is to present and analyze the output MSE (for a given noise variance)

$$MSE_{out} = \frac{1}{I_{Im} J_{Im}} \sum_{i=1}^{I_{Im}} \sum_{j=1}^{J_{Im}} \left( I^{f}_{ij} - I^{tr}_{ij} \right)^2,$$

where $I^{f}_{ij}$ is the denoised image value for the $ij$th pixel. It is also possible to employ the ratio $MSE_{out}/\sigma_0^2$ (Ref. 1). One more standard parameter is the output PSNR or the efficiency indicator called improvement of PSNR, determined as $IPSNR = 10 \log_{10}(\sigma_0^2 / MSE_{out})$ and expressed in dB. Peculiarities of the Human Visual System (HVS) are in one way or another taken into account in the visual quality metrics (also called HVS-metrics). For example, the metric PSNR-HVS-M 30 is determined as

$$PSNR\text{-}HVS\text{-}M = 10 \log_{10} \left( 255^2 / MSE_{HVS\text{-}M} \right),$$

where $MSE_{HVS\text{-}M}$ is the mean square error calculated in the DCT domain for 8 × 8 pixel blocks with weighting that takes into account two peculiarities of the HVS: lower sensitivity to distortions at higher spatial frequencies and masking effects, typical for texture. PSNR-HVS-M is also expressed in dB and, similarly to IPSNR, it is reasonable to calculate and analyze the improvement of PSNR-HVS-M (IPSNR-HVS-M) by subtracting the input PSNR-HVS-M value from the output value. 36 Besides, we consider two other metrics called MSSIM 52 and FSIM. 53 These metrics are among the best in the literature, especially for conventional types of distortions such as different types of noise, blur, and distortions caused by lossy compression or filtering of noisy images. 27 In contrast to PSNR-HVS-M, the MSSIM and FSIM values vary from 0 to 1, where unity corresponds to perfect visual quality. Similarly to IPSNR-HVS-M, one can analyze the improvement of MSSIM (IMSSIM) and the improvement of FSIM (IFSIM), which are determined as the differences between their values after and before denoising. A question is: how informative are IMSSIM and IFSIM without knowing MSSIM or FSIM for the original image? 
Note that MSSIM or FSIM for original (noisy) images cannot be accurately determined in practice since the true image is absent.
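The conventional indicators, the output MSE and IPSNR, are simple enough to be sketched directly. The snippet below assumes NumPy; the function names and the toy data are hypothetical, and the visual quality metrics (PSNR-HVS-M, MSSIM, FSIM) are deliberately left out because they require their reference implementations.

```python
import numpy as np

def mse(image_a, image_b):
    """Mean square error between two images of equal size."""
    a = np.asarray(image_a, dtype=np.float64)
    b = np.asarray(image_b, dtype=np.float64)
    return float(np.mean((a - b) ** 2))

def ipsnr(true_image, denoised_image, sigma0):
    """Improvement of PSNR in dB: 10*log10(sigma0^2 / MSE_out)."""
    return float(10.0 * np.log10(sigma0 ** 2 / mse(denoised_image, true_image)))

def psnr(true_image, test_image, peak=255.0):
    """PSNR in dB for 8-bit data (peak value 255 is assumed)."""
    return float(10.0 * np.log10(peak ** 2 / mse(test_image, true_image)))

# Toy check: a residual error of 2 gray levels everywhere gives MSE_out = 4;
# with sigma0 = 4 (input variance 16) the noise power is reduced four times.
true_image = np.full((16, 16), 100.0)
denoised_image = true_image + 2.0
example_ipsnr = ipsnr(true_image, denoised_image, sigma0=4.0)
```

A positive IPSNR means that the output MSE is smaller than the input noise variance; a value near zero or below indicates that filtering brings no benefit in the MSE sense.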

Considered filters
We have already mentioned above which filters will be used. Our particular goal is to consider state-of-the-art filtering methods that belong to different classes. Let us briefly describe them. The DCT-based filter performs data processing in 8 × 8 pixel fully overlapping blocks using hard thresholding of the DCT coefficients with the threshold set to $2.7\sigma_0$. The BM3D filter 11 employs a search for similar blocks (patches) and two stages of 3D DCT-based processing of the grouped patches with thresholding and weighted aggregation. The principles of operation of the wavelet-based filter TI-WS 10 and the nonlocal means (NLM) filter 4 are well known. The BLS-GSM filter 31 exploits a complex model of wavelet coefficient statistics for their thresholding. Another considered filter, LPG-PCA, uses principal component analysis with local pixel grouping to suppress noise in the spatial domain. 50 This filter employs vector variables to represent neighboring pixels as training patches for block matching. LPG-PCA can be performed iteratively to improve the denoising efficiency, with the noise level adjusted starting from the second iteration.
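The DCT-based filter described above can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: NumPy is assumed, the DCT is taken to be orthonormal (so that AWGN keeps the variance $\sigma_0^2$ in every coefficient), the DC coefficient is always preserved, and the overlapping block estimates are aggregated by plain averaging. The function names and the toy image are hypothetical.

```python
import numpy as np

def dct_matrix(n=8):
    """Orthonormal DCT-II matrix D; D @ block @ D.T is the 2D DCT of a block."""
    k = np.arange(n).reshape(-1, 1)
    m = np.arange(n).reshape(1, -1)
    d = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * m + 1) * k / (2 * n))
    d[0, :] = np.sqrt(1.0 / n)
    return d

def dct_filter(noisy, sigma0, block=8, beta=2.7):
    """Hard-threshold DCT coefficients in fully overlapping blocks and average."""
    noisy = np.asarray(noisy, dtype=np.float64)
    d = dct_matrix(block)
    acc = np.zeros_like(noisy)      # sum of block estimates per pixel
    weight = np.zeros_like(noisy)   # number of estimates per pixel
    rows, cols = noisy.shape
    for r in range(rows - block + 1):
        for c in range(cols - block + 1):
            coeffs = d @ noisy[r:r + block, c:c + block] @ d.T
            dc = coeffs[0, 0]
            coeffs[np.abs(coeffs) < beta * sigma0] = 0.0  # hard thresholding
            coeffs[0, 0] = dc  # assumption: the block mean is never zeroed
            acc[r:r + block, c:c + block] += d.T @ coeffs @ d  # inverse 2D DCT
            weight[r:r + block, c:c + block] += 1.0
    return acc / weight

# Toy example: flat image corrupted by AWGN with sigma0 = 10.
rng = np.random.default_rng(0)
true_image = np.full((32, 32), 128.0)
noisy_image = true_image + rng.normal(0.0, 10.0, true_image.shape)
denoised = dct_filter(noisy_image, sigma0=10.0)
mse_out = float(np.mean((denoised - true_image) ** 2))
```

On such a flat image almost all AC coefficients fall below the threshold, which is the favorable case; on a texture, many informative coefficients are comparable to $2.7\sigma_0$ and are removed together with the noise.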
Iterative denoising aimed at improving filtering efficiency has also been applied to spatial domain filters. The spatially adaptive iterative filter 43 (SAIF, available at https://users.soe.ucsc.edu/~htalebi/SAIF.php) iteratively processes the local image content using some base filter (we have used an NLM filter for this purpose) and automatically optimizes the number of iterations with respect to the mean-squared error estimated by the SURE risk estimator. 42 We have also studied nonlocal denoising techniques in the transform domain. Two efficient clustering-based denoising schemes have been proposed recently: KSVD 3 and KLLD. 6 These filters learn dictionaries in different ways. KSVD (available at http://www.cs.technion.ac.il/~elad/software/) represents the signal in a sparse and redundant form and learns flexible and sparse dictionaries that are compact and provide an efficient representation of sample data. This scheme has demonstrated its advantages in 3D denoising. Alternatively, KLLD (available at https://users.soe.ucsc.edu/~priyam/K-LLD/) employs local weight functions based on steering kernel regression as features and uses the SURE risk estimator to improve denoising efficiency. A special texture-oriented denoising method based on gradient histogram preservation 54 (GHP, available at http://www4.comp.polyu.edu.hk/~cslzhang/code) enforces the gradient histogram of the denoised image to be close to the reference gradient histogram of the original image. The method estimates the reference histogram from noisy observations of unknown images. For the NLM, LPG-PCA, KSVD, and KLLD filters, we have used the parameter values recommended by the authors in their scripts at the above-mentioned sites.

Analysis of Texture Denoising Efficiency
Recall that our simulations have been carried out for the 12 test images depicted in Fig. 1. Ten realizations of AWGN with the aforementioned STD values (5, 10, and 15) have been added to each test image. For each test image and a given noise variance, denoising has been performed by each of the filters described above, namely, DCTF, BM3D, BLS-GSM, NLM, LPG-PCA, SAIF, KSVD, KLLD, and GHP. For each denoised image, the following parameters have been calculated:
• $MSE_{out}$, then $PSNR_{out}$ and IPSNR;
• $MSE_{HVS\text{-}M}$, then the output PSNR-HVS-M and IPSNR-HVS-M;
• the output value $MSSIM_{out}$ and, then, IMSSIM;
• the output value $FSIM_{out}$ and, then, IFSIM.
It is supposed that the input values of the considered metrics (obtained for input, i.e. noisy, images) are known or pre-estimated. The output metric values obtained for a given test image and a given noise variance have then been averaged over the analyzed noise realizations. Note that the values of IPSNR and IPSNR-HVS-M change from one realization to another by no more than 0.2 dB, whereas the values of the other metrics vary only slightly.
The results for the noise STD equal to 5 are presented in Fig. 2. The first and most obvious observation is that the data for the seven best filters (BM3D, BLS-GSM, DCT, SAIF, KSVD, KLLD, GHP) are very similar: the plots for them practically coincide (to show the difference in performance of these filters more clearly, Figs. 2(c) and 2(d) present the plots for IPSNR and IPSNR-HVS-M using another scale). Meanwhile, the data for two filters, namely LPG-PCA and NLM, are significantly worse than for the other filters according to all quality metrics.
The second observation is that, according to all visual quality metrics (in terms of their improvements), even the best filters do not, in fact, noticeably improve the quality of the processed texture images. Meanwhile, the use of the LPG-PCA or NLM filters can lead to considerable degradation of the denoised image compared to the original. According to the metrics IPSNR and IPSNR-HVS-M, the test image #10 seems to be the most "unfavorable" for NLM. This filter smears the texture (see an example in Fig. 3(c)), thus degrading the visual quality. According to the metrics IMSSIM and IFSIM, the test images #5 and #12 are the most unfavorable for LPG-PCA.
The third observation is that, according to IPSNR, the image quality can be slightly improved for some test images such as #3 and #8. However, the improvement is so small (0.5-1 dB in the former case and 1.5-2.0 dB in the latter case) that filtering does not seem worth applying to these test images either if the noise is not intensive (for STD of about 5 or less). Comparing image fragments in Figs. 3(a)
Here, the difference in the filtering efficiency for the considered test images becomes essential. For example, IPSNR reaches 4 dB for the test image #8 whilst IPSNR for the test image #10 is less than 1 dB for the best filters. Visual quality does not improve significantly for any of the filters, even the best ones. The IPSNR-HVS-M exceeds 1 dB only in a few cases; the metrics IMSSIM and IFSIM are also close to zero. Although IPSNR is about 4 dB for the test image #8, the visual quality metrics do not indicate any improvement due to filtering. One fragment of the noise-free, noisy and filtered test image #10 is presented in Fig. 5. Although the noise is quite intensive, it is seen only in quasi-homogeneous regions (such as leaves or grass) that occupy a small percentage of the total image area. Meanwhile, noise is masked by texture in other regions. Denoising removes noise well in the aforementioned quasi-homogeneous regions. In the textural fragments, noise is partly removed whilst texture is partly smeared. In aggregate, there is an impression that the visual quality has been slightly improved by filtering, and it is possible to "agree" with the visual quality metrics, which indicate a small improvement.
Analysis of the data in Fig. 4 shows that there is a certain agreement between the results for the metrics IPSNR and IPSNR-HVS-M. At least, the largest improvements are observed for the test images #3, #8, #9, and #12. The metrics IMSSIM and IFSIM are, in general, in agreement with each other, indicating the largest improvements for the test images #9 and #12. On average, we can state that the best performance is provided by the GHP and BM3D filters, although five others perform only a little worse.
Finally, let us consider the data obtained for intensive noise. The results are presented in Fig. 6. Compared to the earlier considered cases of less intensive noise, the improvements due to denoising are larger. The values of IPSNR vary from 1 to 5 dB for most filters. Moreover, even for the filters LPG-PCA and NLM, the IPSNR values are positive for all test images except one (#10). Concerning the visual quality metrics, LPG-PCA and NLM do not improve image visual quality. The other filters, in most cases, provide some improvement of the visual quality, although this improvement is not large. According to the presented results, the GHP filter performs slightly better than its counterparts, including BM3D. However, the GHP filter is slower and requires more memory than BM3D. Note that most filters improve the quality of the test image #8 to the largest degree.
The presented examples demonstrate that even if IPSNR-HVS-M exceeds 1 dB or IMSSIM exceeds 0.01, the improvement of visual quality after denoising is not obvious. There are even more problems with IPSNR, at least in the case of texture image denoising. The presented example (Fig. 7) demonstrates that even IPSNR values of about 3-4 dB do not guarantee that the visual quality of the texture images has been improved by filtering. An empirical rule can be the following: if a filter provides IPSNR over 3 dB and IPSNR-HVS-M over 1 dB, then one can expect that the filtering is expedient. Certainly, other empirical rules for deciding whether to perform denoising or to skip it can be applied.
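The empirical rule stated above is easy to express as a predicate. The sketch below is plain Python; the function name and the option to override the thresholds are hypothetical additions, while the default thresholds (3 dB and 1 dB) are the ones quoted in the text.

```python
def filtering_is_expedient(ipsnr_db, ipsnr_hvs_m_db,
                           ipsnr_threshold_db=3.0, hvs_threshold_db=1.0):
    """Empirical rule: denoise only if both improvements exceed their thresholds."""
    return ipsnr_db > ipsnr_threshold_db and ipsnr_hvs_m_db > hvs_threshold_db

# Hypothetical (measured or predicted) improvements for three images.
decisions = [
    filtering_is_expedient(4.0, 1.2),  # both thresholds exceeded
    filtering_is_expedient(4.0, 0.3),  # large IPSNR, negligible visual gain
    filtering_is_expedient(2.0, 1.5),  # IPSNR below 3 dB
]
```

The second case corresponds to the situation discussed for the test image #8, where a noticeable IPSNR is not accompanied by an improvement according to the visual quality metrics.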
The differences between the input images and the denoised images for the DCTF, NLM, BM3D, and K-SVD filters are presented in Ref. 34. It is shown there that the differences in texture regions are larger than those in homogeneous regions. The largest distortions are introduced by the NLM filter.

Filtering Efficiency Prediction
By denoising efficiency prediction, we mean that some indicator (metric) able to quantitatively and adequately characterize filter performance can be estimated without (before) denoising itself. Then, this indicator is used (analyzed) in order to make a decision: whether it is worth denoising this image, which filter to use, which parameters of the chosen filter to employ, etc.
The prediction procedure used here, proposed in Ref. 42 and further advanced in Refs. 33, 37, 38 and 51, is based on the following assumptions:
• There are input statistical parameters that can jointly or separately describe image complexity and noise intensity for an image to be denoised (we assume here that the noise type and parameters are either known or pre-estimated in advance with an appropriate accuracy; currently, there are methods that provide an accurate estimation in a blind manner, see Ref. 11).
• There are indicators that are able to adequately characterize filtering performance.
• There is a strict dependence between the input parameters and these indicators (available in advance, before filtering) that allows estimating the indicator(s) with a certain accuracy.
Then, the prediction itself for a given noisy image presumes the following steps: calculating the input parameters and using them as arguments for estimating the output parameters (denoising efficiency indicators).
Keeping this in mind, it becomes clear that there are certain requirements for such a prediction. The main requirements are the following: (1) input parameters have to be calculated considerably faster than denoising; (2) output parameters have to be estimated (predicted) accurately enough for further processing (decision making, filter parameter setting, etc.). Next, a set of subtasks needs to be solved: (1) Which input parameters to apply? (2) Which output parameters to use? (3) How to establish the dependence between input and output parameters? (4) What accuracy of output parameter estimation (prediction) is appropriate for solving further tasks and how to provide such accuracy? It is difficult to thoroughly study all these subtasks within the scope of one paper. Thus, let us discuss what is already known and point out which subtasks will be considered below.
Earlier studies 33,38,51 have shown the following. Firstly, there exist quite simple statistics of DCT coefficients which can serve as good input parameters. They are, e.g., the mean probabilities $P_{\beta}$ that absolute values of AC DCT coefficients in a limited number of 8 × 8 pixel blocks do not exceed $\beta\sigma_0$, where $\beta$ is a non-negative value of the order 0.5-2.0 and the AWGN STD $\sigma_0$ is assumed a priori known or accurately pre-estimated. By a limited number of blocks, we mean that the local estimates $\hat{P}_q, q = 1, \ldots, Q$ are obtained in $Q$ (at least 300-500) randomly selected blocks, which allows estimating the aforementioned probabilities quite accurately. 33 In general, the estimated $P_{\beta}$ depends upon image properties and on the block positions, but the error in the determination of $P_{\beta}$ is not the dominant factor in prediction accuracy.
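The calculation of this input parameter can be sketched as follows. The snippet assumes NumPy and an orthonormal 8 × 8 DCT (under which AWGN has the same STD in every coefficient); the function names, the default of 500 blocks, the uniform random choice of block positions and the flat toy image are hypothetical choices for illustration only.

```python
import numpy as np

def dct_matrix(n=8):
    """Orthonormal DCT-II matrix D; D @ block @ D.T is the 2D DCT of a block."""
    k = np.arange(n).reshape(-1, 1)
    m = np.arange(n).reshape(1, -1)
    d = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * m + 1) * k / (2 * n))
    d[0, :] = np.sqrt(1.0 / n)
    return d

def estimate_p_beta(noisy, sigma0, beta, n_blocks=500, block=8, seed=None):
    """Mean fraction of AC DCT coefficients with |value| <= beta * sigma0."""
    noisy = np.asarray(noisy, dtype=np.float64)
    rng = np.random.default_rng(seed)
    d = dct_matrix(block)
    rows, cols = noisy.shape
    local = np.empty(n_blocks)  # local estimates, one per selected block
    for q in range(n_blocks):
        r = rng.integers(0, rows - block + 1)  # random top-left corner
        c = rng.integers(0, cols - block + 1)
        coeffs = d @ noisy[r:r + block, c:c + block] @ d.T
        below = np.abs(coeffs) <= beta * sigma0
        below[0, 0] = False  # the DC coefficient is excluded
        local[q] = below.sum() / (block * block - 1)
    return float(local.mean())

# Toy check on a flat image: AC coefficients then contain noise only.
rng = np.random.default_rng(1)
flat_noisy = 128.0 + rng.normal(0.0, 10.0, (128, 128))
p2 = estimate_p_beta(flat_noisy, sigma0=10.0, beta=2.0, seed=2)
p05 = estimate_p_beta(flat_noisy, sigma0=10.0, beta=0.5, seed=2)
```

For a noise-only block the estimates approach the Gaussian probabilities of about 0.95 for $\beta = 2$ and about 0.38 for $\beta = 0.5$; the presence of texture moves energy into the AC coefficients and lowers both values.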
Secondly, it has already been demonstrated that IPSNR can be predicted quite accurately for DCT-based filters 1,33,38,51 and some other denoising techniques. 34 Accuracy can be characterized in different ways; the root mean square error (RMSE) of the estimates is one of the most adequate and widely used quantitative criteria. 5 The results presented in Ref. 34 show that the RMSE for IPSNR is smaller than 1 dB for most considered filters if $P_{0.5}$ is used as the only input parameter and simple dependences of exponential or polynomial types are employed in the scatter-plot data regression.
Now, we come to the methodology of obtaining the approximating (prediction, regression) curves that can be calculated in advance. Figure 8 presents examples of two scatter-plots obtained for the metrics IPSNR (a) and IPSNR-HVS-M (b) versus the input parameter $P_2$. The vertical coordinate of a scatter-plot point corresponds to the metric value and the horizontal coordinate (the argument) to the input parameter value for a given test image corrupted by AWGN with a certain variance and then denoised by the considered filter. The data in Fig. 8 are presented for a DCT-based filter, where 40 test images have been used and STD values equal to 3, 5, 10, 15, 20, 25, and 30 cover a wide range of possible values of the input and output parameters. The examples in Fig. 8 additionally show regression curves where the following simple fitted exponential functions are used: $IPSNR = 0.012 \exp(6.7 P_2)$ and $IPSNR\text{-}HVS\text{-}M = 0.002 \exp(8.3 P_2)$. Visual analysis of the data in Fig. 8 shows the following. The general tendencies in the dependences of IPSNR and IPSNR-HVS-M on $P_2$ are clear: if $P_2$ is larger, the metric is larger too. Meanwhile, a comparison of the scatter-plots also reveals that it is much harder to predict IPSNR-HVS-M than IPSNR since the former scatter-plot exhibits a larger dispersion of points. This is confirmed by the data in Ref. 34, where the RMSE values for IPSNR-HVS-M are mostly larger than 1 dB.
One more criterion that directly characterizes the fitting (regression) and significantly influences the prediction accuracy is the goodness-of-fit parameter $R^2$ (see Ref. 5), which should approach unity if the dependence of the output parameter on the input parameter is strong and the fitting is performed well. As demonstrated in Ref. 34, $R^2$ for most filters exceeds 0.9 for IPSNR and is smaller than 0.9 for IPSNR-HVS-M. Thus, an accurate prediction of IPSNR-HVS-M is more problematic. A similar situation holds for IMSSIM and IFSIM, for which $R^2$ is about 0.85 for most filters. 34 This means that improving the prediction is both more important and more problematic for the visual quality metrics.
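The exponential curves quoted for Fig. 8 and the two accuracy criteria (RMSE and $R^2$) can be sketched in plain Python. The curve coefficients are taken from the text; the function names and the small data set are hypothetical and serve only to show how the criteria are computed, so they do not reproduce any reported result.

```python
import math

def predict_ipsnr(p2):
    """Regression curve quoted for Fig. 8(a): IPSNR = 0.012 * exp(6.7 * P2)."""
    return 0.012 * math.exp(6.7 * p2)

def predict_ipsnr_hvs_m(p2):
    """Regression curve quoted for Fig. 8(b): 0.002 * exp(8.3 * P2)."""
    return 0.002 * math.exp(8.3 * p2)

def rmse(measured, predicted):
    """Root mean square error of the predictions."""
    n = len(measured)
    return math.sqrt(sum((m - p) ** 2 for m, p in zip(measured, predicted)) / n)

def r_squared(measured, predicted):
    """Goodness of fit: 1 - (residual sum of squares) / (total sum of squares)."""
    mean_m = sum(measured) / len(measured)
    ss_res = sum((m - p) ** 2 for m, p in zip(measured, predicted))
    ss_tot = sum((m - mean_m) ** 2 for m in measured)
    return 1.0 - ss_res / ss_tot

# Hypothetical scatter-plot points: (P2, measured IPSNR in dB).
points = [(0.40, 0.3), (0.60, 0.8), (0.75, 1.6), (0.85, 3.9), (0.92, 5.5)]
measured = [y for _, y in points]
predicted = [predict_ipsnr(x) for x, _ in points]
fit_rmse = rmse(measured, predicted)
fit_r2 = r_squared(measured, predicted)
```

Both criteria are computed over the whole scatter-plot, so a curve can have a high $R^2$ and still predict poorly in the middle range of the input parameter where the points are most dispersed.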

New Solutions for Prediction of Denoising Efficiency Indicators
As has already been demonstrated above and in Refs. 33, 34 and 51, it is more problematic to predict visual quality metrics than IPSNR. Several ways to improve the prediction (to increase $R^2$ and to reduce the fitting RMSE) have already been proposed. They presume finding a better input parameter, 33 searching for and employing a better approximation function, 51 and using two or more input parameters 38 aggregated in one way or another. In the latter case, different statistics of local probabilities (calculated in the analyzed blocks) have been employed: mean, median, variance, skewness, kurtosis. Below, we propose and study another approach where two input parameters are used, of which the first is statistical and the second characterizes the quality of the original image subject to denoising.
Let us explain why we expect this approach to be reasonable. Firstly, the scatter-plots in Figs. 8(b) and 9 can be divided into three regions:
• $P_2 < 0.5$ or $P_{0.5} < 0.25$, which mainly correspond to highly textural images corrupted by non-intensive noise, for which denoising is practically useless since the improvement of quality according to all considered metrics is negligible;
• $0.5 \le P_2 < 0.9$ or $0.25 \le P_{0.5} < 0.35$ (which relate to either middle complexity images or textural images corrupted by quite intensive noise), for which there is an essential diversity of metric values and denoising seems to be expedient for many, but not all, images;
• $P_2 \ge 0.9$ or $P_{0.5} \ge 0.35$, for which it is worth employing denoising with a high probability of a positive result.
So, the main task is to improve the prediction for the second region ($0.5 \le P_2 < 0.9$ or $0.25 \le P_{0.5} < 0.35$). Note that an IMSSIM less than 0.005 or an IPSNR-HVS-M smaller than 0.5 dB practically cannot be considered a visual quality improvement. 37 Secondly, it has been shown in Refs. 26 and 51 that an IPSNR of about 3 dB is not recognized as efficient denoising if the noise in the original images is intensive. Therefore, we can expect that the values of a metric that characterizes the quality of the original (noisy) image can be helpful for better prediction of the metric that describes the image quality improvement due to denoising.
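The three regions listed above amount to a simple classification of an image by its input parameter. The plain Python sketch below uses the boundaries quoted in the text; the function names and the region labels are hypothetical.

```python
def region_from_p2(p2):
    """Classify by P2 using the boundaries 0.5 and 0.9 quoted in the text."""
    if p2 < 0.5:
        return "denoising practically useless"
    if p2 < 0.9:
        return "uncertain, accurate prediction needed"
    return "denoising expedient"

def region_from_p05(p05):
    """Same classification using P0.5 with the boundaries 0.25 and 0.35."""
    if p05 < 0.25:
        return "denoising practically useless"
    if p05 < 0.35:
        return "uncertain, accurate prediction needed"
    return "denoising expedient"

# Hypothetical P2 values for three noisy images.
labels = [region_from_p2(value) for value in (0.42, 0.70, 0.93)]
```

Only the middle region requires a careful quantitative prediction; in the two outer regions the decision is practically determined by the input parameter alone.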
Let us now check our assumption for IPSNR and input PSNR used together. Recall that the input PSNR can be easily determined for a known noise type and parameters and is able to characterize noise intensity irrespective of image complexity.
There are numerous methods to aggregate two or more input parameters. To have an easy option of 2D fitting to the scatter-plot of two arguments (see the example in Fig. 10), let us use the fitting (regression) defined as $M_{out} = a \exp(b P_{0.5} + c M_{inp})$, where $M_{out}$ and $M_{inp}$ are the output and input metrics, respectively, and $a$, $b$, and $c$ are the parameters of the fitted function to be determined.
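One simple way to obtain the parameters of such a function is sketched below. It is an assumption made for illustration, not necessarily the fitting procedure used to obtain the reported results: taking the logarithm turns the model into a linear one, $\ln M_{out} = \ln a + b P_{0.5} + c M_{inp}$, which is solved by ordinary least squares with NumPy. This requires all output metric values to be positive and minimizes relative rather than absolute errors. The function name and the synthetic data are hypothetical.

```python
import numpy as np

def fit_exp_model(p05, m_inp, m_out):
    """Fit M_out = a * exp(b * P0.5 + c * M_inp) by least squares in the log domain."""
    p05 = np.asarray(p05, dtype=np.float64)
    m_inp = np.asarray(m_inp, dtype=np.float64)
    m_out = np.asarray(m_out, dtype=np.float64)
    # Columns of the design matrix correspond to ln(a), b and c.
    design = np.column_stack([np.ones_like(p05), p05, m_inp])
    coef, *_ = np.linalg.lstsq(design, np.log(m_out), rcond=None)
    return float(np.exp(coef[0])), float(coef[1]), float(coef[2])

# Synthetic noise-free data generated from known parameters, used only to
# check that the procedure recovers them.
rng = np.random.default_rng(0)
p05_samples = rng.uniform(0.05, 0.45, size=200)
psnr_samples = rng.uniform(20.0, 40.0, size=200)
true_a, true_b, true_c = 0.12, 11.46, 0.01
ipsnr_samples = true_a * np.exp(true_b * p05_samples + true_c * psnr_samples)

a_hat, b_hat, c_hat = fit_exp_model(p05_samples, psnr_samples, ipsnr_samples)
```

A direct nonlinear least-squares fit in the original domain would weight large metric values more strongly and may give somewhat different parameters on real scatter-plots.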
Analysis of the data in Fig. 10 shows the following. The tendency of IPSNR to increase if $P_{0.5}$ increases remains. There is also a tendency for IPSNR to increase if the input PSNR becomes smaller (the scatter-plot in Fig. 10 was obtained for the conventional DCT-based filter, varying the input PSNR from about 20 dB for noise STD = 30 to about 40 dB for noise STD = 3).
As a result of 2D fitting for the case in Fig. 10, we have $a = 0.12$, $b = 11.46$, and $c = 0.01$. The parameters that characterize the fitting accuracy are the following: $R^2 = 0.956$ and RMSE = 0.671. Both parameters confirm that the fitting is quite good and that the RMSE of IPSNR prediction is less than 1 dB. Meanwhile, analysis of the influence of both input parameters shows that the role of $P_{0.5}$ is dominant: whilst the factor $\exp(b P_{0.5})$ varies by about 100 times within the limits of $P_{0.5}$ variation, the factor $\exp(c\,PSNR_{inp})$ varies only by about 10 to 30%. When fitting is performed with only one input parameter, predicting IPSNR as $IPSNR = a \exp(b P_{0.5})$, the parameters are the following: $a = 0.18$, $b = 10.79$, $R^2 = 0.953$ and RMSE = 0.695, i.e. the fitting results are only slightly worse than in the case of two input parameters.
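The relative influence of the two input parameters can be checked numerically with the fitted values quoted above. The plain Python sketch below assumes 8-bit images (peak value 255) when computing the input PSNR from the noise STD; the function names and the example value of $P_{0.5}$ are hypothetical.

```python
import math

A2, B2, C2 = 0.12, 11.46, 0.01   # two-parameter fit quoted in the text
A1, B1 = 0.18, 10.79             # one-parameter fit quoted in the text

def input_psnr(sigma0, peak=255.0):
    """Input PSNR in dB for AWGN with STD sigma0 (8-bit peak value assumed)."""
    return 10.0 * math.log10(peak ** 2 / sigma0 ** 2)

def predict_ipsnr_two(p05, psnr_inp):
    """IPSNR predicted from P0.5 and the input PSNR."""
    return A2 * math.exp(B2 * p05 + C2 * psnr_inp)

def predict_ipsnr_one(p05):
    """IPSNR predicted from P0.5 only."""
    return A1 * math.exp(B1 * p05)

# The input PSNR spans exactly 20 dB between sigma0 = 30 and sigma0 = 3,
# so the PSNR-dependent factor changes by exp(0.01 * 20) over that range.
psnr_factor_ratio = math.exp(C2 * input_psnr(3.0)) / math.exp(C2 * input_psnr(30.0))

# Hypothetical image with P0.5 = 0.30 corrupted by noise with sigma0 = 10.
two_parameter_prediction = predict_ipsnr_two(0.30, input_psnr(10.0))
one_parameter_prediction = predict_ipsnr_one(0.30)
```

The ratio of about 1.2 agrees with the statement that the PSNR-dependent factor varies by only tens of percent, which explains why the one-parameter and two-parameter predictions of IPSNR are close.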
The case considered above is good from different viewpoints. Firstly, the fitting is fine for both one and two input parameters. Secondly, in practice, it is possible to use one input parameter since this procedure is easier and has almost the same accuracy as with two input parameters. Thirdly, the second input parameter can be easily calculated as $\mathrm{PSNR} = 10 \log_{10}(255^2/\sigma_0^2)$. Unfortunately, such favorable conditions do not always take place in practice. For example, consider the case of IMSSIM and input MSSIM, which is hypothetical (the input MSSIM cannot be estimated in practice), although the scatter-plot can be obtained by simulations. The scatter-plot and the fitted function $\mathrm{IMSSIM} = a \exp(b P_{0.5\sigma} + c\,\mathrm{MSSIM}_{inp})$ are presented in Fig. 11 for the conventional DCTF. First of all, the input MSSIM variation range is rather narrow (from about 0.7 to almost unity) although the same wide range of noise standard deviation has been used. This means that, though the range from 0 to 1 is declared for the metric MSSIM, only a part of it is actually used. Moreover, the input values are concentrated in a neighborhood quite close to unity. This makes the use of the metric MSSIM quite complicated (the same relates to FSIM). Secondly, the IMSSIM value depends substantially on both input parameters. In the case of two input parameters, $a = 0.02$, $b = 13.68$, $c = -3.96$, $R^2 = 0.935$, and RMSE = 0.02. The negative value of the parameter $c$ means that the IMSSIM value increases as the input MSSIM value becomes smaller (this can be seen in Fig. 11). Depending upon the input MSSIM value, the IMSSIM value varies by several times within the limits of the input MSSIM variation. Therefore, it is desirable to take the input MSSIM value into account for the prediction (for the case in Fig. 9, the fitting is characterized by $R^2 = 0.856$ and RMSE = 0.029, i.e. it is considerably less accurate).
Since the input MSSIM value is not available (to the best of our knowledge, there are no methods to estimate it), the considered option to improve the prediction cannot be realized in practice.
Let us give two more examples, both for the BM3D filter. Figure 12 represents the scatter-plot for IPSNR-HVS-M versus two input parameters. Fitting leads to $R^2 = 0.852$ and RMSE = 0.954. If IPSNR-HVS-M is predicted using only $P_{0.5\sigma}$, then $R^2 = 0.78$ and RMSE = 1.16. Obviously, a prediction is possible, but its accuracy is worse than in the case of two input parameters. Note, however, that the input PSNR-HVS-M of a noisy image again cannot be estimated in practice. Figure 13 presents the scatter-plot for IFSIM versus two input parameters. Similarly to MSSIM, FSIM values vary in a limited range (0.7-1.0), and IFSIM considerably depends on both input parameters. Fitting is rather good: $R^2 = 0.906$ and RMSE = 0.017, but it is impossible to determine the input FSIM value in practice. If only one input parameter ($P_{0.5\sigma}$) is used, the fitting is considerably worse: $R^2 = 0.836$ and RMSE = 0.022.
Summarizing the obtained results, it is possible to conclude the following. Firstly, a prediction is, in general, possible not only for IPSNR and IPSNR-HVS-M (shown in our previous publications 33,38 ) but for some other HVS-metrics as well, e.g. IMSSIM and IFSIM. Secondly, one potential way to improve the prediction accuracy is to use more than one input parameter.
Some input parameters, such as the input PSNR-HVS-M, MSSIM, or FSIM, cannot be determined. Therefore, we propose to use the input PSNR value as the second input parameter, keeping in mind that it can be determined for a noisy image. The results obtained in this case for the DCTF and BM3D filters are given in Table 1 in the columns denoted as $P_{0.5\sigma}$ + PSNR, under the assumption that the input PSNR value is estimated without error.
As one can see, there is a substantial accuracy improvement for predicting IPSNR-HVS-M compared to the case of using one input parameter (see the data in Table 1, columns marked as $P_{0.5\sigma}$). There is practically no improvement in the accuracy of predicting IMSSIM and IFSIM, and the results are worse than for the hypothetical case (see the data in Fig. 13 and in Table 1, columns marked as $P_{0.5\sigma}$ + RealPar). We have also tried the noise standard deviation as the second input parameter employed alongside $P_{0.5\sigma}$. The obtained data (not presented in Table 1) are very similar to the earlier case using input PSNR. Hence, we prefer applying input PSNR as a more general characteristic of noise. The best results for a practically realizable combination of two input parameters are marked in Table 1 in bold.
Table 2 presents the fitting accuracy characterized by RMSE. The same notations are used. These results agree well with the data in Table 1: if $R^2$ in Table 1 for a given filter and metric is larger, RMSE in Table 2 is smaller. The smallest RMSE for each filter and practically realizable combination of two input parameters is marked in bold. The RMSE values for IPSNR and IPSNR-HVS-M are considerably larger than those for IMSSIM and IFSIM, but the former two metrics are expressed in dB and have unbounded ranges. Also, note that the accuracy of predicting IPSNR-HVS-M is always worse than the accuracy of predicting IPSNR. Similarly, the accuracy of predicting IFSIM is better than the accuracy of predicting IMSSIM.
We have also analyzed another approach to improve the accuracy that was proposed in Ref. 38. The input parameters are some statistics of the local estimates $\hat{P}_q$, $q = 1, \ldots, Q$, of the probabilities in blocks. The best results have been obtained in Ref. 44 for the mean and variance (VarP) of the local estimates. Hence, let us consider this combination in our experiments. The obtained results are presented in the columns marked as $P_{0.5\sigma}$ + VarP.
Analysis shows that there is a considerable improvement of the prediction accuracy for all visual quality metrics compared to the case of using only $P_{0.5\sigma}$. Obviously, there is a difference between the considered metrics. For IPSNR-HVS-M, it is better to use the input PSNR value as the second parameter. On the contrary, it is better to apply the combination $P_{0.5\sigma}$ + VarP for IMSSIM and IFSIM.
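The two statistics of the local estimates can be sketched as follows. The snippet assumes that the local estimates $\hat{P}_q$ have already been computed for the $Q$ blocks and that VarP is the population variance (division by $Q$); the exact normalization is not specified here, so this is an assumption. The function name `mean_and_varp` is hypothetical.

```python
def mean_and_varp(local_estimates):
    """Return the mean and the variance (VarP) of local probability estimates.

    local_estimates: sequence of per-block estimates P_q, q = 1..Q.
    The variance is normalized by Q (population variance), which is an
    assumption made for this sketch.
    """
    q = len(local_estimates)
    if q == 0:
        raise ValueError("at least one local estimate is required")
    mean = sum(local_estimates) / q
    varp = sum((p - mean) ** 2 for p in local_estimates) / q
    return mean, varp


if __name__ == "__main__":
    # Hypothetical block-wise estimates for one noisy image.
    estimates = [0.21, 0.35, 0.28, 0.40, 0.19]
    print(mean_and_varp(estimates))
```

The pair returned by this helper would play the role of the two input parameters in the columns marked as $P_{0.5\sigma}$ + VarP.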
The fitting curve parameters for the best combinations of two input parameters (those marked in bold in Tables 1 and 2) are presented in Table 3. As is seen, IPSNR can be predicted well even if only one input parameter is used (see the data in Tables 1 and 2). The accuracy of the IPSNR-HVS-M prediction is worth improving, although the use of the second input parameter (input PSNR) helps to provide a considerably better accuracy. For IMSSIM and IFSIM, it is worth using VarP as the second input parameter, and the $R^2$ values are then about 0.9. This shows that a rather good prediction is possible, but improving its accuracy is still worth trying. We should also stress that the fitting function parameters for a given metric (e.g. IPSNR) are very close for the DCT and BM3D filters.
One might think that the obtained prediction results relate only to DCT-based filters since one input parameter is $P_{0.5\sigma}$. This is not so. The prediction approaches are considerably more general. To demonstrate this, we have collected data for six filters (namely, DCTF, BM3D, BLSGSM, SAIF, KSVD, and KLLD) into joint scatter-plots with one input parameter if only $P_{0.5\sigma}$ is employed, and with two input parameters (the pairs $P_{0.5\sigma}$ and input PSNR, $P_{0.5\sigma}$ and VarP). Fitting functions for one and two arguments have been obtained. Two examples are presented in Figs. 14 and 15. The scatter-plot in Fig. 14 can be compared to the scatter-plot in Fig. 9. It is seen that the main properties of these scatter-plots are very similar. Moreover, the $R^2$ and RMSE values are very similar, too. The only difference between these scatter-plots is in the number of points (six times more points for the scatter-plot in Fig. 14).
If two parameters are used, the prediction is more accurate (see the data in Fig. 15). The $R^2$ value increases and RMSE decreases considerably.
Instead of presenting all the obtained scatter-plots, we give only the main conclusions and data to save space. For IPSNR and IPSNR-HVS-M, it is worth using input PSNR as the second parameter, whilst it is better to employ VarP as the second parameter for IMSSIM and IFSIM. The obtained $R^2$ and RMSE values are very close to those presented for DCTF and BM3D in Tables 1 and 2. Here, we present only the best results and parameters of the fitting functions in Table 4. As is seen, the values of the analyzed parameters are very close to the corresponding values in Table 3. We can state that the provided approximations can be used for all six filters. In other words, for each particular image to be denoised, it is possible to predict the efficiency indicators for the best existing filters.
Let us come back to the prediction accuracy. Clearly, it considerably depends upon the quality of fitting, but there are also other factors mentioned earlier. Obtaining the scatter-plot and approximating functions can be considered a special learning task. If so, the prediction has to be verified. For this purpose, we have taken an extra 36 images from the ESPL-LIVE HDR Image Quality database 15 that were not used for learning. Scatter-plots have been obtained for the DCTF, and then the $R^2$ and RMSE parameters have been calculated with respect to the earlier obtained approximations (Table 4). The new data are collected in Table 5. Their analysis shows the following. The RMSE values have almost doubled and the $R^2$ values have decreased. The IPSNR-HVS-M prediction using a generalized approximation for all filters is unsatisfactory and still requires improvement. For the other three metrics, we can state that the designed prediction is quite general, stable, and accurate.
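The verification step can be sketched as the computation of $R^2$ and RMSE between the actual improvements measured on held-out images and the values predicted by a previously fitted function. The definitions below ($R^2 = 1 - SS_{res}/SS_{tot}$ and RMSE normalized by the number of points) are the common ones and are assumed here; the function names and the sample numbers are hypothetical.

```python
import math


def rmse(actual, predicted):
    """Root mean square error between actual and predicted values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def r_squared(actual, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    if ss_tot == 0:
        raise ValueError("actual values must not all be equal")
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Hypothetical IPSNR values (dB) for a few verification images.
    actual = [0.8, 2.1, 4.9, 7.6]
    predicted = [1.0, 2.5, 4.2, 8.0]
    print("RMSE =", round(rmse(actual, predicted), 3))
    print("R^2  =", round(r_squared(actual, predicted), 3))
```

Because the fitted parameters are kept fixed, $R^2$ computed this way on new images can be noticeably lower than on the learning set, which is the kind of degradation reported in Table 5.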
From this analysis, it is possible to conclude the following. Firstly, we can recommend using individual approximating functions instead of the generalized approximations given in Table 4. Secondly, it is worth using more than 40 test images to obtain scatter-plots for further curve fitting. Thirdly, some examples of scatter-plots show that more complex functions can be used in fitting to obtain smaller RMSE and larger $R^2$. In spite of all these ideas and recommendations for further improvements of the prediction accuracy, we can state that a prediction of denoising efficiency indicators is possible.
If the answer to the question in the paper title is positive, then the next question is whether to apply denoising or not. The following procedure has been proposed in Ref. 26. The initial step (step 1) is to skip filtering if $P_{0.5\sigma} < 0.25$. For larger $P_{0.5\sigma}$, the rule could be: apply filtering if the predicted IPSNR value exceeds 3.5 dB and the predicted IPSNR-HVS-M value exceeds 1 dB.
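The two-step procedure can be written down as a short decision function. This is a sketch of the rule as stated above, with the thresholds 0.25, 3.5 dB, and 1 dB taken from the text; the function and argument names are hypothetical, and the predicted values are assumed to come from previously fitted approximations.

```python
def should_denoise(p05, predicted_ipsnr_db, predicted_ipsnr_hvs_m_db):
    """Decide whether filtering is worth applying.

    Step 1: skip filtering if P_0.5sigma < 0.25.
    Step 2: otherwise apply filtering only if the predicted IPSNR exceeds
            3.5 dB and the predicted IPSNR-HVS-M exceeds 1 dB.
    """
    if p05 < 0.25:
        return False
    return predicted_ipsnr_db > 3.5 and predicted_ipsnr_hvs_m_db > 1.0


if __name__ == "__main__":
    print(should_denoise(0.20, 5.0, 2.0))  # skipped at step 1
    print(should_denoise(0.30, 4.0, 1.5))  # both thresholds exceeded
    print(should_denoise(0.30, 4.0, 0.7))  # visual quality gain too small
```

Requiring both conditions at step 2 keeps the decision conservative: a gain in the conventional metric alone is not treated as sufficient.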
Step 1 is motivated by an analysis of many scatter-plots, e.g. those given in Figs. 9, 10, and 12. Improvements of the metrics in this case are negligible. The situation changes if $P_{0.5\sigma}$ exceeds 0.25. Then, a rather large IPSNR (>3.5 dB) is needed to guarantee an essential improvement of the image quality. Concerning the visual quality metrics, the following study has been carried out for images in the database TID2013. A reliable denoising efficiency measure is the opinion of observers who have assessed the quality of noisy and filtered images. Note that such images and assessments exist for the databases TID2008 and TID2013. 27,29 These databases include images distorted by AWGN (distortion type #1) and images with distortions due to denoising (distortion type #9). They contain 25 test images, with four and five levels of distortions for TID2008 and TID2013, respectively. In our further analysis, we have used the data for TID2013 since it is more advanced. Each database image is characterized by the corresponding mean opinion score (MOS) value, which can be treated as a reliable assessment of the image visual quality (a higher MOS corresponds to a better visual quality).
The scatter-plot of MOS vs PSNR-HVS-M values is presented in Fig. 16, where the fitted straight lines are also given (red points relate to images corrupted by AWGN; blue points correspond to images with residual distortions after denoising). These lines are in good agreement (the angle between them is small). This shows that the metric PSNR-HVS-M correlates with MOS well enough for the analyzed types of distortions. If PSNR-HVS-M is about 35-40 dB (almost invisible distortions), MOS values for the filtered images are higher than for the images corrupted by AWGN. The situation is slightly different for PSNR-HVS-M smaller than 30 dB. To be sure that denoising is worth applying, one needs the predicted value of IPSNR-HVS-M to be positive. This explains why we have proposed using the threshold equal to 1 dB at the second stage of our procedure.
The metrics IMSSIM and/or IFSIM can potentially be used in decision making as well. However, their peculiarities described above prevent giving simple and direct rules. More studies are necessary to provide such rules.

Conclusions
Analysis of denoising efficiency has been carried out for several modern filters applied to texture images corrupted by AWGN. Different visual quality metrics are employed in the analysis and comparisons. It is demonstrated that noise removal from texture images is complicated, and even the most sophisticated existing filters often have low filtering efficiency in terms of the used metrics. Visual examples confirm this observation. In such situations, it is reasonable to skip denoising in order to save resources.
The corresponding decision can be made automatically based on the prediction of the parameters characterizing the filtering efficiency. Such a prediction can be fast and accurate enough. Several ways to improve the accuracy are put forward. The use of input PSNR as the second input parameter provides a considerable improvement of the prediction accuracy. It is shown that a general prediction approach is possible for the set of the best existing filters. Analysis has demonstrated that the prediction accuracy is worse for visual quality metrics than for the conventional PSNR. Although the prediction accuracy has been improved for visual quality metrics, it is worth continuing research in this direction.