
IMAGE PROCESSING USING SINGULAR VALUE DECOMPOSITION (SVD)

Nguyễn Thị Hương
Trường Đại học Hà Nội (Hanoi University)

Tóm tắt: Singular value decomposition (SVD) is one of the methods in the matrix factorization family, first developed by differential geometers. Originally, its purpose was to find a rotation of space under which the inner products of vectors do not change; from this connection, the concept of an orthogonal matrix was formed to describe these special rotations. The SVD was then developed from the properties of orthogonal matrices and diagonal matrices in order to find a matrix that approximates the original matrix. The method has since been applied widely in fields such as differential geometry, linear regression, image processing, clustering, and compression and dimensionality-reduction algorithms. This paper presents the fundamentals of singular value decomposition through concrete examples on matrices, introduces the SVD technique in detail, and describes its application to image processing, where it contributes greatly to saving storage memory and speeding up image processing.

Từ khóa: Image compression, singular value decomposition, quadratic form, Gram–Schmidt process, image processing, rank, orthogonality, orthonormal basis, diagonal matrix.

Abstract: Nowadays data are transmitted in the form of images, graphics, audio, and video. These types of data require a lot of storage capacity and transmission bandwidth. Consequently, the theory of data compression becomes more significant for reducing data redundancy in order to save on the transfer and storage of data. This paper addresses the problem of the lossy compression of images and discusses how linear algebra, and in particular the SVD, is used extensively in the image compression process, resulting in savings of computer memory. The basic idea is that each image can be represented as a matrix; applying the SVD to this matrix yields a reduced matrix, and the image corresponding to the reduced matrix requires much less storage space than the original image.

Keywords: Image compression, singular value decomposition, quadratic form, Gram–Schmidt process, image processing, rank, orthogonal, orthonormal basis, diagonal matrix.

I. INTRODUCTION
The singular value decomposition (SVD) is a generalization of the eigendecomposition, used to analyze rectangular matrices. It plays an important role in many applications: physical and biological processes, mathematical models in economics, data-mining applications that rank documents in very large databases (including the Web), image processing applications, and more. In this paper, we study the SVD applied to image compression.

Image compression is a type of data compression that encodes information in images using fewer bits than the original image representation. The main idea of image compression is to reduce the redundancy of the image and to transfer the data in an efficient form. Image compression takes an important place in several domains, such as web design: reducing an image as much as possible allows websites to load faster and saves the users' bandwidth, and it also reduces the bandwidth of the servers, saving time and money. When talking about compression, we generally take two aspects into account: the image size in pixels and its degree of compression. The nature of the image also plays a significant role. The main goal of the system is to reduce the storage quantity as much as possible while ensuring that the decoded image displayed on the monitor is visually as similar to the original image as possible.

II. IMAGE PROCESSING USING SVD

An m × n pixel image can be represented by an m × n matrix. Suppose we have a 4-megapixel grayscale image, which is 2000 × 2000 pixels, i.e., a 2000 × 2000 matrix. For each pixel, we have some level of black and white, given by an integer between 0 and 255, with 0 representing black and 255 representing white. Each of these integers (and hence each pixel) requires 1 byte to store, resulting in an image of approximately 4 MB. A color image usually has three components: red, green, and blue. Each of these is represented by a matrix, so storing a color image requires three times the space (approximately 12 MB).
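To make the storage arithmetic concrete, here is a minimal sketch (assuming NumPy; the images are synthetic stand-ins, since no real image file accompanies the paper) that counts the bytes of the raw matrix representation:

```python
import numpy as np

# Synthetic 2000 x 2000 grayscale image: one uint8 (one byte) per pixel.
gray = np.random.randint(0, 256, size=(2000, 2000), dtype=np.uint8)
print(gray.nbytes)   # 4000000 bytes, roughly 4 MB

# A color image keeps three such matrices (red, green, blue channels).
color = np.random.randint(0, 256, size=(2000, 2000, 3), dtype=np.uint8)
print(color.nbytes)  # 12000000 bytes, roughly 12 MB
```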
2.1 Singular Value Decomposition (SVD)

The diagonalization theorem plays a part in many interesting applications. Unfortunately, as we know, not all matrices can be factored as A = PDP⁻¹ with D diagonal. However, a factorization A = QDP⁻¹ is possible for any m × n matrix A. A special factorization of this type, called the singular value decomposition, is one of the most useful matrix factorizations in applied linear algebra.

The singular value decomposition is based on the following property of ordinary diagonalization that can be imitated for rectangular matrices: the absolute values of the eigenvalues of a symmetric matrix A measure the amounts by which A stretches or shrinks certain vectors (the eigenvectors). If Ax = λx and ‖x‖ = 1, then

$$\|Ax\| = \|\lambda x\| = |\lambda| \, \|x\| = |\lambda| \qquad (1)$$

If λ₁ is the eigenvalue with the greatest magnitude, then a corresponding unit eigenvector v₁ identifies a direction in which the stretching effect of A is greatest. That is, the length of Ax is maximized when x = v₁, and ‖Av₁‖ = |λ₁|, by (1). This description of v₁ and |λ₁| has an analogue for rectangular matrices that will lead to the singular value decomposition.

EXAMPLE 1. If

$$A = \begin{bmatrix} 4 & 11 & 14 \\ 8 & 7 & -2 \end{bmatrix},$$

then the linear transformation x ↦ Ax maps the unit sphere {x : ‖x‖ = 1} in ℝ³ onto an ellipse in ℝ². Find a unit vector x at which the length ‖Ax‖ is maximized, and compute this maximum length.

SOLUTION. The quantity ‖Ax‖² is maximized at the same x that maximizes ‖Ax‖, and ‖Ax‖² is easier to study. Observe that

$$\|Ax\|^2 = (Ax)^T (Ax) = x^T A^T A x = x^T (A^T A) x$$

Also, AᵀA is a symmetric matrix, since (AᵀA)ᵀ = Aᵀ Aᵀᵀ = AᵀA. So the problem now is to maximize the quadratic form xᵀ(AᵀA)x subject to the constraint ‖x‖ = 1. The maximum value is the greatest eigenvalue λ₁ of AᵀA, and it is attained at a unit eigenvector of AᵀA corresponding to λ₁. For the matrix A in this example,

$$A^T A = \begin{bmatrix} 4 & 8 \\ 11 & 7 \\ 14 & -2 \end{bmatrix} \begin{bmatrix} 4 & 11 & 14 \\ 8 & 7 & -2 \end{bmatrix} = \begin{bmatrix} 80 & 100 & 40 \\ 100 & 170 & 140 \\ 40 & 140 & 200 \end{bmatrix}$$

The eigenvalues of AᵀA are λ₁ = 360, λ₂ = 90, and λ₃ = 0. Corresponding unit eigenvectors are, respectively,

$$v_1 = \begin{bmatrix} 1/3 \\ 2/3 \\ 2/3 \end{bmatrix}, \quad v_2 = \begin{bmatrix} -2/3 \\ -1/3 \\ 2/3 \end{bmatrix}, \quad v_3 = \begin{bmatrix} 2/3 \\ -2/3 \\ 1/3 \end{bmatrix}$$

The maximum value of ‖Ax‖² is 360, attained when x is the unit vector v₁. The vector Av₁ is a point on the ellipse, namely,

$$Av_1 = \begin{bmatrix} 4 & 11 & 14 \\ 8 & 7 & -2 \end{bmatrix} \begin{bmatrix} 1/3 \\ 2/3 \\ 2/3 \end{bmatrix} = \begin{bmatrix} 18 \\ 6 \end{bmatrix}$$

For ‖x‖ = 1, the maximum value of ‖Ax‖ is ‖Av₁‖ = √360 = 6√10. ∎
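The computation in Example 1 is easy to verify numerically. The following sketch (assuming NumPy) diagonalizes AᵀA and confirms that the maximum of ‖Ax‖ on the unit sphere is √360 = 6√10:

```python
import numpy as np

A = np.array([[4.0, 11.0, 14.0],
              [8.0,  7.0, -2.0]])

# A^T A is symmetric, so eigh applies; eigenvalues return in ascending order.
eigvals, eigvecs = np.linalg.eigh(A.T @ A)
print(eigvals)                      # approximately [0, 90, 360]

v1 = eigvecs[:, -1]                 # unit eigenvector for the largest eigenvalue
print(np.linalg.norm(A @ v1) ** 2)  # 360, the maximum of ||Ax||^2 over ||x|| = 1
print(np.linalg.norm(A @ v1))       # 6*sqrt(10) ≈ 18.97
```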
Example 1 suggests that the effect of A on the unit sphere in ℝ³ is related to the quadratic form xᵀ(AᵀA)x. In fact, the entire geometric behavior of the transformation x ↦ Ax is captured by this quadratic form, as we shall see.

The Singular Values of an m × n Matrix

Let A be an m × n matrix. Then AᵀA is symmetric and can be orthogonally diagonalized. Let {v₁, …, vₙ} be an orthonormal basis for ℝⁿ consisting of eigenvectors of AᵀA, and let λ₁, …, λₙ be the associated eigenvalues of AᵀA. Then, for 1 ≤ i ≤ n,

$$\|Av_i\|^2 = (Av_i)^T Av_i = v_i^T A^T A v_i = v_i^T (\lambda_i v_i) = \lambda_i \qquad (2)$$

So the eigenvalues of AᵀA are all nonnegative. By renumbering, if necessary, we may assume that the eigenvalues are arranged so that λ₁ ≥ λ₂ ≥ ⋯ ≥ λₙ ≥ 0. The singular values of A are the square roots of the eigenvalues of AᵀA, denoted by σ₁, …, σₙ and arranged in decreasing order. That is, σᵢ = √λᵢ for 1 ≤ i ≤ n. By equation (2), the singular values of A are the lengths of the vectors Av₁, …, Avₙ.

THEOREM 1. Suppose {v₁, …, vₙ} is an orthonormal basis of ℝⁿ consisting of eigenvectors of AᵀA, arranged so that the corresponding eigenvalues of AᵀA satisfy λ₁ ≥ λ₂ ≥ ⋯ ≥ λₙ, and suppose A has r nonzero singular values. Then {Av₁, …, Av_r} is an orthogonal basis for Col A, and rank A = r.

NUMERICAL NOTE. In some cases, the rank of A may be very sensitive to small changes in the entries of A. The obvious method of counting the number of pivot columns in A does not work well if A is row reduced by a computer: roundoff error often creates an echelon form with full rank. In practice, the most reliable way to estimate the rank of a large matrix A is to count the number of nonzero singular values. Extremely small nonzero singular values are assumed to be zero for all practical purposes, and the effective rank of the matrix is the number obtained by counting the remaining nonzero singular values.
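The Numerical Note translates directly into code. Here is a minimal sketch (assuming NumPy; the relative tolerance is an illustrative choice, not a universal constant):

```python
import numpy as np

def effective_rank(A, rel_tol=1e-6):
    """Count singular values above a relative threshold; treat the rest as zero."""
    s = np.linalg.svd(A, compute_uv=False)   # singular values in decreasing order
    return int(np.sum(s > rel_tol * s[0]))

# An exactly rank-1 matrix, then a perturbation mimicking roundoff error:
B = np.outer([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
B_noisy = B + 1e-8 * np.random.randn(3, 3)

print(np.linalg.svd(B_noisy, compute_uv=False))  # one large sigma, two tiny ones
print(effective_rank(B_noisy))                   # 1: the tiny sigmas count as zero
```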
THE SINGULAR VALUE DECOMPOSITION

The decomposition of A involves an m × n "diagonal" matrix Σ of the form

$$\Sigma = \begin{bmatrix} D & 0 \\ 0 & 0 \end{bmatrix} \qquad (3)$$

where D is an r × r diagonal matrix for some r not exceeding the smaller of m and n. (If r equals m or n or both, some or all of the zero matrices do not appear.)

DEFINITION (The Singular Value Decomposition). Let A be an m × n matrix with rank r. Then there exist an m × n matrix Σ as in (3), for which the diagonal entries of D are the first r singular values of A, σ₁ ≥ σ₂ ≥ ⋯ ≥ σ_r > 0, an m × m orthogonal matrix U, and an n × n orthogonal matrix V such that

$$A = U \Sigma V^T$$

Any factorization A = UΣVᵀ, with U and V orthogonal, Σ as in (3), and positive diagonal entries in D, is called a singular value decomposition (or SVD) of A. The matrices U and V are not uniquely determined by A, but the diagonal entries of Σ are necessarily the singular values of A. The columns of U in such a decomposition are called left singular vectors of A, and the columns of V are called right singular vectors of A.

EXAMPLE 2. Find a singular value decomposition of

$$A = \begin{bmatrix} 1 & -1 \\ -2 & 2 \\ 2 & -2 \end{bmatrix}$$

SOLUTION. First, compute

$$A^T A = \begin{bmatrix} 9 & -9 \\ -9 & 9 \end{bmatrix}$$

The eigenvalues of AᵀA are 18 and 0, with corresponding unit eigenvectors

$$v_1 = \begin{bmatrix} 1/\sqrt{2} \\ -1/\sqrt{2} \end{bmatrix}, \quad v_2 = \begin{bmatrix} 1/\sqrt{2} \\ 1/\sqrt{2} \end{bmatrix}$$

These unit vectors form the columns of V:

$$V = [\,v_1 \ v_2\,] = \begin{bmatrix} 1/\sqrt{2} & 1/\sqrt{2} \\ -1/\sqrt{2} & 1/\sqrt{2} \end{bmatrix}$$

The singular values are σ₁ = √18 = 3√2 and σ₂ = 0. Since there is only one nonzero singular value, the matrix D may be written as the single number 3√2. The matrix Σ is the same size as A, with D in its upper left corner:

$$\Sigma = \begin{bmatrix} 3\sqrt{2} & 0 \\ 0 & 0 \\ 0 & 0 \end{bmatrix}$$

To construct U, first compute Av₁ and Av₂:

$$Av_1 = \begin{bmatrix} 2/\sqrt{2} \\ -4/\sqrt{2} \\ 4/\sqrt{2} \end{bmatrix}, \quad Av_2 = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}$$

As a check on the calculations, verify that ‖Av₁‖ = σ₁ = 3√2. Of course, Av₂ = 0 because ‖Av₂‖ = σ₂ = 0. The only column found for U so far is

$$u_1 = \frac{1}{3\sqrt{2}} Av_1 = \begin{bmatrix} 1/3 \\ -2/3 \\ 2/3 \end{bmatrix}$$

The other columns of U are found by extending the set {u₁} to an orthonormal basis for ℝ³. In this case, we need two orthogonal unit vectors u₂ and u₃ that are orthogonal to u₁. Each vector must satisfy u₁ᵀx = 0, which is equivalent to the equation x₁ − 2x₂ + 2x₃ = 0. A basis for the solution set of this equation is

$$w_1 = \begin{bmatrix} 2 \\ 1 \\ 0 \end{bmatrix}, \quad w_2 = \begin{bmatrix} -2 \\ 0 \\ 1 \end{bmatrix}$$

(Check that w₁ and w₂ are each orthogonal to u₁.) Apply the Gram–Schmidt process (with normalizations) to {w₁, w₂} to obtain

$$u_2 = \begin{bmatrix} 2/\sqrt{5} \\ 1/\sqrt{5} \\ 0 \end{bmatrix}, \quad u_3 = \begin{bmatrix} -2/\sqrt{45} \\ 4/\sqrt{45} \\ 5/\sqrt{45} \end{bmatrix}$$

Finally, set U = [u₁ u₂ u₃], take Σ and Vᵀ from above, and write

$$A = \begin{bmatrix} 1 & -1 \\ -2 & 2 \\ 2 & -2 \end{bmatrix} = \begin{bmatrix} 1/3 & 2/\sqrt{5} & -2/\sqrt{45} \\ -2/3 & 1/\sqrt{5} & 4/\sqrt{45} \\ 2/3 & 0 & 5/\sqrt{45} \end{bmatrix} \begin{bmatrix} 3\sqrt{2} & 0 \\ 0 & 0 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} 1/\sqrt{2} & -1/\sqrt{2} \\ 1/\sqrt{2} & 1/\sqrt{2} \end{bmatrix} \quad \blacksquare$$
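As a numerical check on Example 2, np.linalg.svd recovers the same singular values (assuming NumPy; the signs of individual singular vectors may differ, since U and V are not unique):

```python
import numpy as np

A = np.array([[ 1.0, -1.0],
              [-2.0,  2.0],
              [ 2.0, -2.0]])

U, s, Vt = np.linalg.svd(A)   # full SVD: U is 3x3, Vt is 2x2
print(s)                      # approx [4.243, 0] = [3*sqrt(2), 0]

# Rebuild the 3 x 2 "diagonal" Sigma of equation (3) and verify A = U Sigma V^T.
Sigma = np.zeros((3, 2))
np.fill_diagonal(Sigma, s)
print(np.allclose(A, U @ Sigma @ Vt))   # True
```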
2.2 An Application to Image Processing

Each piece of the SVD is a column vector times a row vector. An m by n matrix has m times n entries (a big number when the matrix represents an image), but a column and a row together have only m + n components, far fewer than m times n. Those (column)(row) pieces are full-size matrices, yet they can be processed with extreme speed: they need only m plus n numbers. Unusually, this image processing application of the SVD comes before the matrix algebra it depends on, so I will start with simple images that involve only one or two pieces.

Right now I am thinking of an image as a large rectangular matrix. The entries aᵢⱼ give the grayscales of all the pixels in the image. Think of a pixel as a small square, i steps across and j steps up from the lower left corner. Its grayscale is a number, often a whole number in the range 0 ≤ aᵢⱼ < 256 = 2⁸. An all-white pixel has aᵢⱼ = 255 = 11111111; that number has eight 1's when the computer writes 255 in binary notation.
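To preview where this is going, here is a minimal compression sketch that keeps only the k largest pieces σᵢuᵢvᵢᵀ (assuming NumPy; the matrix is a smooth synthetic stand-in for a real grayscale image, and k = 20 is an illustrative choice):

```python
import numpy as np

def svd_compress(image, k):
    """Approximate the image by the sum of its k largest rank-one pieces."""
    U, s, Vt = np.linalg.svd(image, full_matrices=False)
    return U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

m, n, k = 512, 512, 20
# Smooth synthetic "image": nearby pixels have similar grayscales,
# so the energy concentrates in the leading singular values.
i, j = np.meshgrid(np.arange(m), np.arange(n), indexing="ij")
image = np.sin(i / 40.0) * np.cos(j / 60.0) + (i + j) / (m + n)

approx = svd_compress(image, k)
print(k * (m + n + 1), "numbers instead of", m * n)            # 20500 vs 262144
print(np.linalg.norm(image - approx) / np.linalg.norm(image))  # near machine precision
```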
You see how an image that has m times n pixels, with each pixel using 8 bits (0 or 1) for its grayscale, becomes an m by n matrix with 256 possible values for each entry aᵢⱼ. In short, an image is a large matrix. To copy it perfectly, we need 8(m)(n) bits of information. High-definition television typically has m = 1080 and n = 1920. Often there are 24 frames each second, and you probably like to watch in color (3 color scales). This requires transmitting (3)(8)(1080 × 1920 × 24) = (3)(8)(49,766,400) ≈ 1.2 × 10⁹ bits per second. That is too expensive, and it is not done: the transmitter can't keep up with the show. When compression is well done, you can't see the difference from the original. Edges in the image (sudden changes in the grayscale) are the hard parts to compress.

Major success in compression will be impossible if every aᵢⱼ is an independent random number. We totally depend on the fact that nearby pixels generally have similar grayscales; an edge produces a sudden jump when you cross over it. Cartoons, with their large constant regions, are more compressible than real-world images.

Lower Rank Image Example

The easiest images to compress are all black, or all white, or all a constant grayscale g, so that the matrix A has the same g in every entry: aᵢⱼ = g. When g = 1 and m = n = 6, here is an extreme example of the central SVD dogma of image processing:

Example. Don't send the 36 entries of

$$A = \begin{bmatrix} 1 & 1 & 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 & 1 & 1 \end{bmatrix};$$

send instead the column vector x = (1, 1, 1, 1, 1, 1)ᵀ together with the rule A = x xᵀ. Then 36 numbers become 12 numbers. With 300 by 300 pixels, 90,000 numbers become 600. And if we define the all-ones vector x in advance, we only have to send one number: the constant grayscale g that multiplies x xᵀ to produce the matrix.

Of course this first example is extreme, but it makes an important point: if there are special vectors like x = ones that can usefully be defined in advance, then image processing can be extremely fast.
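In code, the rank-one trick is a single outer product. A sketch (assuming NumPy):

```python
import numpy as np

x = np.ones(6)       # 6 numbers to send (none at all if x is agreed in advance)
A = np.outer(x, x)   # the receiver rebuilds the 6 x 6 all-ones image

print(A.size, "entries rebuilt from", 2 * x.size, "numbers")   # 36 from 12

# For a constant grayscale g, a single number suffices once x is fixed:
g = 0.73
flag = g * np.outer(x, x)
print(np.all(flag == g))   # True: every pixel has grayscale g
```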
Example. Suppose the flag has two triangles of different colors. The lower left triangle has 1's and the upper right triangle has 0's; the main diagonal is included with the 1's. Here is the image matrix when n = 4. It has full rank r = 4, so it is invertible:

$$\text{Triangular matrix} \quad A = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 1 & 1 & 0 & 0 \\ 1 & 1 & 1 & 0 \\ 1 & 1 & 1 & 1 \end{bmatrix} \quad \text{and} \quad A^{-1} = \begin{bmatrix} 1 & 0 & 0 & 0 \\ -1 & 1 & 0 & 0 \\ 0 & -1 & 1 & 0 \\ 0 & 0 & -1 & 1 \end{bmatrix}$$

With full rank, A has a full set of n singular values σ (all positive). The SVD will produce n pieces σᵢuᵢvᵢᵀ of rank one, and perfect reproduction needs all n pieces. In compression, small σ's can be discarded with no serious loss in image quality. We want to understand and plot the σ's for n = 4 and also for large n. Working by hand, we begin with AAᵀ (a computer proceeds differently):

$$AA^T = \begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & 2 & 2 & 2 \\ 1 & 2 & 3 & 3 \\ 1 & 2 & 3 & 4 \end{bmatrix} \quad \text{and} \quad (AA^T)^{-1} = (A^{-1})^T A^{-1} = \begin{bmatrix} 2 & -1 & 0 & 0 \\ -1 & 2 & -1 & 0 \\ 0 & -1 & 2 & -1 \\ 0 & 0 & -1 & 1 \end{bmatrix}$$

That −1, 2, −1 inverse matrix is included because its eigenvalues all have the form 2 − 2 cos θ. So we know the λ's for AAᵀ and the σ's for A:

$$\lambda = \frac{1}{2 - 2\cos\theta} = \frac{1}{4\sin^2(\theta/2)} \quad \text{gives} \quad \sigma = \sqrt{\lambda} = \frac{1}{2\sin(\theta/2)}$$

The n different angles θ are equally spaced, which makes this example so exceptional:

$$\theta = \frac{\pi}{2n+1}, \ \frac{3\pi}{2n+1}, \ \ldots, \ \frac{(2n-1)\pi}{2n+1} \qquad \left(n = 4 \text{ includes } \theta = \frac{3\pi}{9} \text{ with } 2\sin\frac{\theta}{2} = 1\right)$$

That special case gives λ = 1 as an eigenvalue of AAᵀ when n = 4, so σ = √λ = 1 is a singular value of A. You can check that the vector u = (1, 1, 0, −1) has AAᵀu = u (a truly special case).

The important point is to graph the n singular values of A. Those numbers drop off (unlike the eigenvalues of A, which are all 1), but the drop-off is not steep. So the SVD gives only moderate compression of this triangular flag (the sketch at the end of this section checks these σ's numerically).

Your faithful author has continued research on the ranks of flags. Quite a few are based on horizontal or vertical stripes. Those have rank one: all rows or all columns are multiples of the ones vector (1, 1, …, 1). Armenia, Austria, Belgium, Bulgaria, Chad, Colombia, Ireland, Madagascar, Mali, Netherlands, Nigeria, Romania, Russia (and more) have three stripes. Indonesia and Poland have two! At the other extreme, many flags include diagonal lines. Those could be long diagonals, as in the British flag, or short diagonals coming from the edges of a star, as in the US flag. The text example of a triangle of ones shows how those flag matrices will have large rank, and the rank increases to infinity as the pixel sizes get small. Other flags have circles or crescents or various curved shapes; their ranks are large and also increase to infinity. These are still compressible! The compressed image won't be perfect, but our eyes won't see the difference (with enough terms σᵢuᵢvᵢᵀ from the SVD). Those examples bring out the main purpose of image compression.
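The closed form for the σ's can be checked numerically. A sketch (assuming NumPy) builds the triangular flag for n = 4 and compares np.linalg.svd with 1/(2 sin(θ/2)):

```python
import numpy as np

n = 4
A = np.tril(np.ones((n, n)))                # lower-triangular flag of ones

sigma = np.linalg.svd(A, compute_uv=False)  # computed singular values, descending

# Closed form from the analysis above: theta_k = (2k - 1) * pi / (2n + 1).
k = np.arange(1, n + 1)
theta = (2 * k - 1) * np.pi / (2 * n + 1)
closed_form = 1.0 / (2.0 * np.sin(theta / 2.0))

print(sigma)                                # approx [2.879, 1.000, 0.653, 0.532]
print(np.allclose(sigma, closed_form))      # True: sigma_2 is exactly 1 when n = 4
```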
III. CONCLUSION

Despite the attention it has received in recent years, the SVD in image processing is still in its infancy, and many characteristics of the SVD remain unutilized in image processing. Using SVD techniques, we can save a great deal of computer memory, which can then be used for other purposes.