Zum Hauptinhalt springen

Hybrid DCT-HAAR block

Saliba, Mireille ; Farah, Rachelle ; et al.
In: Multimedia Tools and Applications, Jg. 78 (2019-01-05), S. 17265-17286
Online unknown

Hybrid DCT-HAAR block 

In this paper we have constructed a new hybrid transform. We have combined the DCT block and the Haar block into one block which we have called hybrid DCT-HAAR block. This new block combines the advantages of the DCT which works extremely well for highly correlated data and the advantages of the Haar transform which gives superior results for images exhibiting rapid gradient variations. The DCT is applied to the upper left corner of the block. The Haar is applied to the remaining parts of the block. We derived the equations of the matrix M needed to recover the hybrid block. Our method increased the PSNR obtained by the DCT transform and enhanced the edge recovery of the Haar transform.

Keywords: DCT; Haar; Wavelet; Hybrid; PSNR

Introduction

Image compression is an efficient technique of reducing the size of data representing an image that is highly correlated. The purpose of image compression is to save memory and transmission time [[8]]. There are different methods in which image can be compressed. One of the most common compressed graphic image formats is the JPEG format which is based on the Discrete Cosine Transform (DCT) [[12]]. Another most commonly transform used in image compression is the Wavelet transform. One of the simplest wavelet is the Haar wavelet. It is based on doing average and difference operations on the image intensity values [[15]].

There are two kinds of compression [[17]]. Lossless compression involves the preservation of the image as if no detail is lost. Lossy compression, allows less than perfect reproduction of the original image. The advantage of lossy compression (our work) is that it can produce a much smaller compressed file than lossless compression. Note that JPEG applies DCT with a zigzag quantization that makes the compression visually lossless. JPEG 2000 uses the Wavelet Transform in their compression scheme. DCT is well suited for highly correlated data. It does a kind of de-correlation [[1]] for those data and has the property of energy compactness. Energy compactness means that DCT tries to concentrate energy or information in some elements of the block instead of spreading it in the whole block. This helps in increasing the compression since we are only interested in the elements that contain the maximum energy needed. Note that DCT concentrates this energy in the upper left corner of the block where the low frequencies reside [[16]].

On the other hand, the Haar transform [[13]] works well for high frequencies and achieves a high compression ratio while maintaining the quality of the image. It is very well performant for images that have rapid gradient variations.

Combining DCT and DWT was done in [[2], [14]] by applying the discrete cosine transform on the discrete wavelet transform coefficients. In [[9]] Hybrid matrices are generated using combination of various matrices like DCT, Hadamard and Kekre's Transform.

JPEG

The JPEG (Joint Photographic Experts Group) algorithm was founded to compress images with minimal data loss and high compression ratios. The JPEG consists of 4 different phases. First, the image is divided into 8 × 8 blocks of pixels. Next, the DCT can be applied to each block of image to convert it to the frequency domain. Then each one is compressed by performing a quantization process to remove the unnecessary information. Then, the transform coefficients which are quantized are transmitted to the receiver. Therefore, the DCT plays an important role of compression in the JPEG standard that offers a lossy compression of images. The JPEG process is shown in Fig. 1.

Graph: Fig. 1The JPEG process

DCT

The discrete cosine transform (DCT) is a concept for converting a signal into elementary frequency components and is based on dividing images into parts of different frequencies. The DCT de-correlates the image data. The de-correlation is the removal of redundancy between adjacent pixels [[3]]. It is one of the useful properties of Discrete Cosine Transform. After applying the DCT, the autocorrelation becomes very small. The data becomes uncorrelated. We can remove the unnecessary coefficients (small values) and transform the coefficients orderly to encode them independently and uniquely without repeating the same information represented in correlated pixels twice or more.

Equation 1 shows an example of the equation of the Discrete Cosine Transform (DCT)

1 gxu=λucosπ2x+1u2Nwhereλu=2Nu=01Notherwise

Graph

The Discrete Cosine Transform (DCT) basis is shown in Fig. 2.

Graph: Fig. 2The DCT Basis

Wavelet transform

Another very popular technique in image compression is the Discrete Wavelet Transform. The purpose of Wavelet transform is to change the data from time-space domain to time-frequency domain for a better compression. In addition, The Discrete Wavelet Transform can give high compression rate [[11]]. First, the image is passed into two filters: a high (H) and a low (L) pass filter. This process can be repeated in order to get the four sub-bands LL, HL, LH and HH. This is called the 2-D DWT [[11]]. Figure 3 shows a 3-level wavelet decomposition.

Graph: Fig. 3The 3 Level Decomposition of Wavelet

The Wavelet Transform produces a large number of zero or near zero coefficients. The compression is done by thresholding these values.

Haar transform

The Haar transform is a part of the wavelet family. It has its own basis and is the simplest wavelet basis. Equation 2 shows the Haar Transform basis formulas.

2 Ψ(x)=1,0.0x<0.51,0.5x<1.00,otherwisepqx=2p2Ψ2pxq+1

Graph

The Haar Transform Basis for 8 × 8 blocks is shown in Fig. 4.

Graph: Fig. 4The Haar Basis

PSNR

It is regularly necessary to proceed with some measures to quantify the quality of reconstruction of lossy compression. The Peak Signal-to-Noise Ratio (PSNR) is one of these measures [[7], [10]]. It is given by the following equation:

3 PSNR=20log102551nmi=0n1j=0m1fijfij2

Graph

where.

  • nxmis the size of the image,
  • f(i,j)is the pixel of row i and column j of the original image, and;
  • f'(i,j)is the pixel of row i and column j of the reconstructed image

In [[6]] they use suitable graph Fourier transforms (GFTs) to minimize the.

total signal representation cost of each pixel block. In [[5]] clusters of super-pixels are used as coding blocks. A graph transform is computed within each region. In [[4]] steerable DCT (SDCT) enables precise matching of directionality in each image block, achieving improved coding efficiency.

DCT vs HAAR

DCT transform

Let n be a positive integer. The one-dimensional DCT of order n is defined by an n x n matrix C whose entries are

Cij=aicos2j+1π2n

Graph

C=2n1212Λ12cosπ2ncos3π2nΛcos2n1π2nMMΛMcosn1π2ncosn13π2nΛcosn12n1π2n

Graph

The discrete cosine transform, C, has one basic characteristic: it is a real orthogonal matrix.

CTC=1C1=CT

Graph

The 2d- DCT is a separable transform:

We apply the one dimensional DCT first in the horizontal direction F = C*XT. Then we applied it in the vertical direction Y = C*FT.

The 2d-DCT is the matrix:

Y=CCXTT=CXCT

Graph

Since C is an orthogonal matrix we can solve for X using C−1 = CT

X=CTYC

Graph

The 8 × 8 DCT BASIS (C for n = 8):

Graph

And the DCT of an 8 × 8 image block X is: Y=CXCT.

For example the 8 × 8 image block X and its 8 × 8 DCT Y is shown below:

Graph

DCT can be used for image compression because the first few coefficients could capture most of the image. DCT discards some of the high frequency information (edges) and transmit the "important" coefficients (the coefficients that have most information about the image).

The most important values to our eyes will be placed in the upper left corner of the matrix (low frequency). The least important values will be mostly in the lower right corner (Fig. 5) of the matrix (high frequency).

Graph: Fig. 5The DCT Block

Applying this concept to our previous example, we get:

Graph

In general mxm sub-block in the upper left corner of the block is kept while the other entries are put to zero. An illustration is shown in Fig. 6.

Graph: Fig. 6DCT Quantization

HAAR Transform

The 8 × 8 HAAR transform matrix is

H=1811111111111111112222000000002222122000000002200000000220000000022

Graph

The HAAR of an 8 × 8 block X is: Y=HAHT.

It is not difficult to see that that is H orthogonal. H−1 = HT.

The compressed image is obtained from: X = HTYH.

After applying the Haar transform, the values that are less than a threshold α are zeroed. The remaining values are kept. A small compression ratio leads to a small compression. However, the PSNR will be better. An illustration is shown in Fig. 7.

Graph: Fig. 7Haar Quantization

Difference between HAAR and DCT

The energy packing properties of the DCT is better than the Haar transform.

Haar transform gives superior results for images exhibiting rapid gradient variations. Unlike the discrete cosine transform, the HAAR transform is not Fourier-based and therefore wavelets do a better job of handling discontinuities in data.

The basis for the Fourier transform (FT) are sinusoids. To reconstruct a spike like signal (or edges), you will need to superpose infinite number of sinusoids. The FT doesn't work efficiently with non-stationary signals. (Fig. 8).

Graph: Fig. 8The HAAR high frequencies

The HAAR efficiency in handling high frequencies is shown in the following image:

Our contribution is to combine these 2 transforms into one transform which has a new basis which is the combination of the DCT and the HAAR basis. Y = (C + H)X(C + H)T.

Taking advantage of the energy compactness property of DCT and the efficiency of the Haar transform, we suggest a process where a hybrid DCT-Haar block is used. The way this block is formed is illustrated in Fig. 9.

Graph: Fig. 9The DCT-Haar block

Mathematical derivation of the hybrid block

The process

The process consists of the following steps:

  • Dividing the image into 8 × 8 blocks
  • Computing the DCT and Haar transforms for each block based on Eq. 4.
  • ath display="block" xmlns="http://www.w3.org/1998/Math/MathML">O=BABT _ht_

Graph

where.

  • Ais the 8 × 8 original block,
  • Bis the combined 8 × 8 (DCT + Haar) basis (C + H).
  • Ois the combined 8 × 8 (DCT + Haar) block.

Equation 4 is equivalent to an equation of the following form:

5 O=MA˜

Graph

where.

  • (O)is the vectorized version of O,
  • ( A~ )is the vectorized version of A~ ,
  • Mis a 64 × 64 matrix to be determined.
The M matrix

We are interested in the matrix M because it will be the key for block recovery. Note that M is of size n2xn2for an nxn block.

In order to get the general formula of M for any nxn block, a general form of A and B are taken. The results appear in Figs. 10 and 11.

Graph: Fig. 10Matrix M for an 8 × 8 Block

Graph: Fig. 11Some Sub-Matrices of M for an 8 × 8 Block

Figure 11 shows some submatrices of the matrix M. the other sub matrices can be constructed in the same manner because there is a repeated pattern. A feasible code is presented in Fig. 12.

Graph: Fig. 12Algorithm for generating the M matrix

Figure 12 shows how to calculate the n2x n2 Matrix for a given nxn block. It calculates all the components of the M matrix (represented by m(i,j) in the algorithm) given in Fig. 9.

For example m(1,1) = b11xb11, m(1,2) = b11xb12,......

Figures 13, 14, and 15 show the M matrix for DCT, Haar and mixed transform for 4 × 4 blocks.

Graph: Fig. 13M Matrix for Haar Basis for 4 × 4 Blocks

Graph: Fig. 14M Matrix for DCT Basis for 4 × 4 Blocks

Graph: Fig. 15M Matrix for Mixed 4 × 4 Blocks

Block recovery

If M is invertible, then M−1 is used to get A~ . There is nothing that guarantees the invertibility of M. Therefore, we can get A~ by getting the pseudo inverse of M or simply the transpose of M and multiplying the answer by O as shown in Eq. 6.

6 A˜=MO

Graph

Then A~ should be arranged correctly to get the 8 × 8 recovered block A~ .

Simulation and results

To validate our proposed methods, we used a set of 4 grayscale images: Barbara, Boat, Peppers and Lena that are represented respectively in Fig. 16. The size of each image is 512 × 512.

Graph: Fig. 16Images used

The efficiency of our method is tested in this section. It must be compared with the DCT and Haar transforms applied on 8 × 8 blocks using the same quantization method.

In order to compare our method with the DCT transform, different values for the parameter m are used and the PSNR value for each m is calculated. For the Haar blocks chosen in our method, α is set to 15. Plots of m versus PSNR values for different images are shown in Fig. 17.

Graph: Fig. 17DCT Parameter m vs PSNR for Different Images Using DCT and our Method

It is clear that our method gives a higher PSNR than DCT for the same value of m for those images especially when there is more compression. In fact a small value of m means that a bigger portion of the block is set to zero i.e. more compression is involved. More compression means a smaller PSNR value. The relation between m and the compression ratio for the Boat image is shown in Fig. 18.

Graph: Fig. 18Relation between m and Compression Ratio

In order to compare our method with the Haar transform, different values for the parameter α are used and the PSNR value for each α is calculated. For the DCT blocks chosen in our method, m is set to 4. Plots of α versus PSNR values for different images are shown in Fig. 19.

Graph: Fig. 19Haar Threshold α vs PSNR for Different Images Using Haar and our Method

It is clear that our method gives a higher PSNR than Haar for the same value of α for those images especially when there is more compression. In fact a large value of α means that a bigger portion of the block is set to zero i.e. more compression is involved. More compression means a smaller PSNR value. The relation between α and the compression ratio for the Boat image is shown in Fig. 20.

Graph: Fig. 20Relation Between α and Compression Ratio

The results show that our method outperforms the DCT and Haar transforms for those four images. In addition, for the Barbara image where there are a lot of high frequencies due to the big number of edges, it is clear that our method has a better recovery than DCT and Haar.

To investigate more, DCT and our method are applied for the 4 images for the same compression ratio as shown in Figs. 21, 22, 23, and 24.

Graph: Fig. 21Results for the Barbara Image

Graph: Fig. 22Results for the Lena Image

Graph: Fig. 23Results for the Pepper Image

Graph: Fig. 24Results for the Boat Image

The Figures show that our method gives a PSNR better than that of DCT for the same compression ratio. For the same compression ratio, the error of recovery for both the Haar transform and our method is shown in Fig. 25.

Graph: Fig. 25Error Images after Applying Haar and our Method

Figure 25 shows that the error of recovery via the Haar transform in all images is bigger than the error produced from our method. The error reflects the ability to recover the images' edges. Therefore, our method gives a better PSNR than DCT and better edge recovery than Haar. In order to see the performance of our method, the thresholds α of Haar and m of DCT are varied. This is accompanied by presenting the percentage of the sent DCT blocks for the Pepper image. Results are shown in Table 1.

Variation of α and m for Pepper Image

Threshold α for Haar blocks

5

15

25

35

m = 2 (for DCT blocks)

Percentage of DCT blocks

1.147461

53.78418

68.9209

75.56152

Compression ratio (%)

41.18938

78.53937

82.9937

84.62334

PSNR (dB)

41.96088

35.22682

33.40736

32.26997

m = 3 (for DCT blocks)

Percentage of DCT blocks

3.271484

68.79883

80.98145

86.25488

Compression ratio (%)

40.81726

71.31433

74.52049

75.40245

PSNR (dB)

41.97367

35.67078

34.15795

33.32225

m = 4 (for DCT blocks)

Percentage of DCT blocks

7.446289

79.1748

88.5498

92.38281

Compression ratio (%)

39.48746

60.32076

62.00457

62.18891

PSNR (dB)

42.00663

36.29016

35.11305

34.56648

m = 6 (for DCT blocks)

Percentage of DCT blocks

25.24414

92.84668

96.99707

97.90039

Compression ratio (%)

28.40385

25.55332

24.9011

24.56732

PSNR (dB)

42.3059

38.3067

37.9219

37.809

Table 1 shows that for any value of m, increasing α leads to more compression. However, for the same α increasing m leads to less compression. Compression ratio and PSNR are inversely proportional. In fact, increasing the compression leads to a sacrifice in the image quality. The percentage of the DCT blocks taken is affected by the values of α and m. Increasing α for a fixed m leads to take more DCT blocks because big values of α lead to a high compression and low PSNR for Haar blocks. This makes the sender choose more DCT blocks. For a fixed value of α, decreasing m leads to choosing more Haar blocks. This can be explained in the same manner as before.

Overall, examining the values of the table can help in choosing the best α and m to be applied in the algorithm. For α = 15 and m = 4, the compression ratio is acceptable and the PSNR value indicates a good quality of the image. Therefore, those values (for 8 × 8 blocks) are chosen to be applied in the compression algorithm of our method.

We compared our algorithm to DCT to one of the state art approach which is the steerable discrete cosine transform [[4]] using 8 × 8 blocks. Our approach gave higher PSNR for all the images as shown in Table 2.

Comparison of our approach with DCT and Steerable DCT using 8 × 8 block

PSNR in db

DCT

SDCT

DCT + HAAR

LENA

30.12

37.87

39.45

BARBARA

28.51

32.81

35.63

BOAT

23.81

33.30

35.24

Conclusion

This paper combined DCT and Haar transforms into one hybrid block. It exploited the advantages of both transforms while minimizing the weaknesses of each one. While DCT is applied to the upper left corner of the block the remaining parts of the block are transformed via Haar. We derived the equations of the matrix M needed to recover the hybrid block. Our method increased the PSNR obtained by the DCT transform and enhanced the edge recovery of the Haar transform.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

References 1 Ahmed N, Natarajan T, Rao KR. Discrete cosine transform. IEEE Trans Comput. 1974; 23: 90-93356555. 10.1109/T-C.1974.2237840273.65097 2 Benchikh S, Corinthios M (2011) A hybrid image compression technique based on DWT and DCT transforms, ICAIT, pp. 1-8 3 Ding JJ, Huang YW, Lin PY, Pei SC, Chen HH, Wang YH. Two-dimensional orthogonal DCT expansion in trapezoid and triangular blocks and modified JPEG image compression. Image Processing, IEEE Transactions. 2013; 22; 9: 3664-3675. 10.1109/TIP.2013.2268971 4 Fracastoro G, Fosson SM, Magli E. Steerable discrete Cosine transform. IEEE Trans Image Process. 2017; 26; 1: 303-3143580004. 10.1109/TIP.2016.262348907012817 5 Fracastoro G, Verdoja F, Grangetto M, Magli E. Superpixeldriven graph transform for image compression. Proc IEEE Int Conf Image Process. 2015; 2015: 2631-2635 6 Hu W, Cheung G, Ortega A, Au OC. Multiresolution graph Fourier transform for compression of piecewise smooth images. IEEE Trans Image Process. 2015; 24; 1: 419-4333300678. 10.1109/TIP.2014.237805507009804 7 Huynh-Thu Q, Ghanbari M. Scope of validity of PSNR in image/video quality assessment. Electron Lett. 2008; 44; 13: 800. 10.1049/el:20080522 8 Joshi MA, Raval MS, Dandawate YH, Joshi KR, Metkar SP. Image and video compression: fundamentals, techniques, and applications. 2014: Boca Raton; CRC press 9 Kekre HB, Sarode T, Dhannawat R (2012) Image fusion using Kekre's hybrid wavelet transform, ICCICT, pp. 1-6 Kozhemiakin R, Abramov S, Lukin V, Djurović B, Kharkiv ID (2016) Compression ratio prediction for DCT-based coder. 9th International Symposium on Physics and Engineering of Microwaves, Millimeter and Submillimeter Waves (MSMW), pp. 1–4 Lewis AS, Knowles G. Image compression using the 2-D wavelet transform. IEEE Trans Image Process. 1992; 1; 2: 244-250. 10.1109/83.136601 Pennebaker WB, Mitchell JL. JPEG Still Image Data Compression Standard. 1993: New York; Springer Sahoo R, Roy S, Chaudhuri SS (2014) Haar Wavelet Transform image compression using Run Length Encoding, ICCSP, 71-75 Shrestha S, Wahid K (2010) Hybrid DWT-DCT algorithm for biomedical image and video compression applications, ISSPA, pp. 280-283 Talukder KH, Harada K (2007) Haar wavelet based approach for image compression and quality assessment of compressed image. International Journal of Applied Mathematics 3(6) Taubman D, Marcellin M. JPEG2000 Image Compression Fundamentals, Standards and Practice. 2002: New York; Springer Usevitch BE. A tutorial on modern lossy wavelet image compression: foundation of JPEG 2000. IEEE Signal Process Mag. 2001; 18; 5: 22-35. 10.1109/79.952803

By Issam Dagher; Mireille Saliba and Rachelle Farah

Reported by Author; Author; Author

Issam Dagher finished his MS in electrical engineering degree in 1994 from Florida International University, Miami, USA. He finished his PhD in 1997 from University of Central Florida, Orlando USA. He is now an associate professor at the University of Balamand, Lebanon. His area of interests are pattern recognition, neural networks, artificial intelligence, computer vision. He published many papers on these topics.

Mireille Saliba finished her MS in electrical engineering degree in 2016 from the University of Balamand, Elkoura, Lebanon. Her area of interests are image processing and computer vision.

Rachel Farah finished her MS in electrical engineering degree in 2016 from the University of Balamand, Elkoura, Lebanon. Her area of interests are image processing and computer vision.

Titel:
Hybrid DCT-HAAR block
Autor/in / Beteiligte Person: Saliba, Mireille ; Farah, Rachelle ; Dagher, Issam
Link:
Zeitschrift: Multimedia Tools and Applications, Jg. 78 (2019-01-05), S. 17265-17286
Veröffentlichung: Springer Science and Business Media LLC, 2019
Medientyp: unknown
ISSN: 1573-7721 (print) ; 1380-7501 (print)
DOI: 10.1007/s11042-018-7039-5
Schlagwort:
  • Computer Networks and Communications
  • Computer science
  • Haar
  • Dct transform
  • 020207 software engineering
  • 02 engineering and technology
  • Matrix (mathematics)
  • Hardware and Architecture
  • Block (telecommunications)
  • 0202 electrical engineering, electronic engineering, information engineering
  • Media Technology
  • Discrete cosine transform
  • Multimedia information systems
  • Enhanced Data Rates for GSM Evolution
  • Algorithm
  • Computer communication networks
  • Software
Sonstiges:
  • Nachgewiesen in: OpenAIRE
  • Rights: CLOSED

Klicken Sie ein Format an und speichern Sie dann die Daten oder geben Sie eine Empfänger-Adresse ein und lassen Sie sich per Email zusenden.

oder
oder

Wählen Sie das für Sie passende Zitationsformat und kopieren Sie es dann in die Zwischenablage, lassen es sich per Mail zusenden oder speichern es als PDF-Datei.

oder
oder

Bitte prüfen Sie, ob die Zitation formal korrekt ist, bevor Sie sie in einer Arbeit verwenden. Benutzen Sie gegebenenfalls den "Exportieren"-Dialog, wenn Sie ein Literaturverwaltungsprogramm verwenden und die Zitat-Angaben selbst formatieren wollen.

xs 0 - 576
sm 576 - 768
md 768 - 992
lg 992 - 1200
xl 1200 - 1366
xxl 1366 -