Image Compression (PCA)

Back in the old days, when a diskette (ha! hipsters can relate) was as useful as today's USB drive, storage was a real problem. In the search for a solution, people came up with what is now known as image compression, which is still in use today (hello, JPEG). In this activity, we show how to compress images using Principal Components Analysis (PCA).

Basically, what PCA does is convert a set of correlated signals into orthogonal functions whose weighted sums can reconstruct any of the signals. (I will not go into the full details here. :PPP) If we have a set of functions f_1, f_2, f_3, ..., f_n, then we can obtain a set of orthogonal functions W_1, W_2, W_3, ..., W_m such that:

f_i = \sum_{n=1}^{m} a_{n,i} W_n

where a_{n,i} is the coefficient of the function W_n for the signal f_i. The functions W_n are called the principal components. However, each W_n has a different percent contribution to the reconstruction of the image. In Scilab, the W_n and the a_{n,i} can be obtained with a single line of code. If X is the matrix containing the signals (each column being an individual signal), the principal components and the coefficients can be obtained by:

[lambda, facpr, comprinc] = pca(X);

where facpr is the matrix containing the principal components and comprinc the matrix containing the coefficients. lambda gives the percent contribution of each principal component to the signals.

In image compression, what we do is cut our image into N x N blocks. We then flatten each block into a 1 x N^2 matrix; each of these is one signal. Finally, we concatenate all the signals into one matrix, which is the input to pca. For this activity, we use N = 10.
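Below is a minimal Scilab sketch of this block-cutting step, for illustration only: the image-reading functions (imread, rgb2gray) assume the SIVP toolbox, the file name is made up, and the image dimensions are assumed to be divisible by N. Note that each flattened block is stored as a row here; transpose X if your pca convention expects signals in columns, as in the text above.

N = 10;                                      // block size
img = double(rgb2gray(imread("cat.jpg")));   // grayscale image (SIVP toolbox assumed)
[rows, cols] = size(img);                    // assumed divisible by N
X = [];                                      // each row will hold one flattened block
for r = 1:N:rows-N+1
    for c = 1:N:cols-N+1
        blk = img(r:r+N-1, c:c+N-1);         // one N x N block
        X = [X; matrix(blk, 1, N^2)];        // flatten to 1 x N^2 and stack
    end
end
[lambda, facpr, comprinc] = pca(X);          // principal components of the blocks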

There will be N^2 = 100 principal components. What we will then do is try to reconstruct the image using the principal components. Because each principal component has a different percent contribution, we will investigate what happens to the reconstruction if we use only n principal components (0 < n ≤ 100).
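Continuing from the sketch above, the truncated reconstruction could look like the following. This is hedged: I am assuming the coefficients in comprinc and the components in facpr recombine as comprinc * facpr', and pca may center the data, so check the documentation of your pca implementation for the exact convention.

n = 10;                                     // number of components to keep
Xr = comprinc(:, 1:n) * facpr(:, 1:n)';     // rank-n approximation of X
// (if pca centered the data, the block means have to be added back here)
recon = zeros(rows, cols);                  // reassemble the blocks into an image
k = 1;
for r = 1:N:rows-N+1
    for c = 1:N:cols-N+1
        recon(r:r+N-1, c:c+N-1) = matrix(Xr(k, :), N, N);
        k = k + 1;
    end
end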

We use a picture of a cat (well, it's a tiger, but... still a cat).

[figure]

A cat.

In the figure below, we compare the original picture (in grayscale) with the reconstructed images.

[figure]

The top left figure is the grayscale image of the cat. The top center is for n = 1, the top right for n = 5, the bottom left for n = 10, the bottom center for n = 50, and the bottom right for n = 100.

From the figure above, we can see that the more principal components we use, the better the quality. However, for n = 50 and n = 100 the images barely differ. This is because of the percent contributions: the later principal components contribute only a small fraction to the reconstruction.
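To quantify this, the lambda output of the pca call above can be used. A hedged one-liner, assuming lambda holds the per-component contributions:

contrib = cumsum(lambda) / sum(lambda);   // cumulative fraction contributed by the first n components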

In the next figure, we show what the principal components look like.

[figure]

Each 10 x 10 block of this picture is one principal component.

We show another example. This one comes from the anime Fate/Zero, where the protagonist is a female swordsman... well, you can search for it if you like.

[figure]

The numbers of principal components used are the same as in the cat figure.

The author gives himself a score of 9/10 for this work. He acknowledges Mr. Tingzon for the help. 🙂

Sources:

Wikipedia

Dr. Maricor Soriano

Image Enhancement (Frequency)

[figure]

We try to enhance images in the frequency domain (hello again, FFT. Haha). For some background, it is recommended to read up on the Fourier transform first. 🙂

Anyway, image enhancement is almost always done in the frequency domain. This is because noise and spurious parts of an image are hard to remove in the image domain itself. These unwanted distortions can, however, be removed in the frequency domain, since they have different (higher) frequencies than the important details of the image. What generally happens is that we multiply the spectrum by a mask to remove the higher frequencies. But a multiplication in the frequency domain is actually a convolution in the image domain. So before everything else, let's discuss convolution! (For a little bit of background, you may want to read my previous post. Or search on Google. Or read Arfken. Whichever.)

Convolution is the smearing of two functions into another function that carries the properties of both. This is in contrast to correlation, where the resulting function measures the similarity of the two functions. In simpler words, convolution is a union of properties, while correlation is an intersection of properties. A convolution is written as:

f(x,y) = \int \int h(x',y')g(x-x', y-y') dx' dy'

or:

f = g*h

The integral is quite cumbersome to work with, especially when coding. Some programming languages have a built-in convolution function, but in general the integral itself has to be worked out. Thankfully, the Fourier transform simplifies things:

\mathcal{F}\{g * h\} = \mathcal{F}\{g\} \, \mathcal{F}\{h\} \quad \Rightarrow \quad F = GH

where \mathcal{F} is the Fourier transform operator. I have already performed some convolutions myself. Figure 1 shows the result of convolving circles of different sizes with a pair of Dirac deltas, together with the Fourier transforms.
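As a quick check of the theorem, here is a minimal Scilab sketch that convolves a circle with two Dirac deltas by multiplying their FFTs. This is a toy example with made-up sizes; it assumes that Scilab's fft/ifft perform the two-dimensional transform when given a matrix.

n = 128;
[X, Y] = meshgrid(1:n, 1:n);
circ = bool2s((X - n/2).^2 + (Y - n/2).^2 < 10^2);   // a small circle at the center
deltas = zeros(n, n);
deltas(n/2, n/4) = 1;  deltas(n/2, 3*n/4) = 1;       // two Dirac deltas

F = fft(circ) .* fft(deltas);    // multiplication in the frequency domain...
conv_img = real(ifft(F));        // ...is convolution in the image domain
// conv_img now shows a copy of the circle at each delta position
// (up to a shift, since the circle is centered rather than at the origin)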

[figure]

Figure 1. The left side shows the result of the convolution in the image domain. Convolving with a sum of Dirac delta functions simply places a copy of the image at each delta's position. In the frequency domain, the convolution is just the multiplication of the Fourier transforms of the two functions.

The Fourier transform of a pair of Dirac deltas symmetric about the y-axis is a sinusoid, while the Fourier transform of a circle is an Airy pattern. The Airy pattern is very prominent on the right side, though the sinusoid is not as discernible. This could be because the Dirac deltas are far from the center and hence correspond to a high frequency. The high-frequency components of an image are the ones that can no longer be discerned when the image is rescaled to a smaller size.

We show some more convolutions, this time between the same Dirac delta pair and a square (Figure 2), and a Gaussian (Figure 3).

[figure]

Figure 2. Convolution of a square with the Dirac delta pair.

[figure]

Figure 3. Convolving a Gaussian with the same Dirac delta pair.

The images above show the end result when the convolution in one domain is carried out as a multiplication in the other. However, I haven't really shown you what happens when you convolve two functions directly in the same plane. Figure 4 shows what happens when you convolve two functions.

We generate Dirac deltas at random positions and convolve them with the following patterns. (I'm sorry, I wasn't able to convolve them with the same set of Dirac deltas; since the positions are random, the sets of deltas differ. :[..)
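A hedged sketch of this step, under the same FFT assumptions as the previous snippet. The pattern here is a made-up 5 x 5 square; grand is Scilab's random number generator.

n = 256;  ndeltas = 10;
deltas = zeros(n, n);
pos = grand(ndeltas, 2, "uin", 20, n - 20);   // random (row, col) positions
for k = 1:ndeltas
    deltas(pos(k, 1), pos(k, 2)) = 1;
end
pat = zeros(n, n);                            // a 5 x 5 square at the center
pat(n/2-2:n/2+2, n/2-2:n/2+2) = 1;
out = real(ifft(fft(deltas) .* fft(pat)));    // the square is copied to each delta
// (shifted by n/2 in each direction, since the pattern is centered)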

[figure]

Figure 4. n is the number of Dirac deltas. A comparison shows that for the x pattern, we can observe ghost spectra.

Figure 4 shows the convolution between the Dirac deltas and a square function, and between the Dirac deltas and the x pattern. A copy of the square or the x pattern simply appears wherever a Dirac delta is placed.

Figure 5 shows the Fourier transform of uniformly spaced Dirac deltas along the x- and y-axes, with different distances between them.

[figure]

Figure 5. The FFT of Dirac deltas along the x- and y-directions with different spacings (1, 2, 4, and 8).
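A quick way to reproduce one panel of Figure 5, under the same assumptions as before (the size and spacing values are arbitrary):

n = 256;  d = 8;                       // image size and delta spacing
deltas = zeros(n, n);
deltas(n/2, 1:d:n) = 1;                // a row of deltas spaced d apart
spectrum = abs(fftshift(fft(deltas)));
// the spectrum is again a set of peaks, now spaced n/d apart:
// wider spacing in the image means finer spacing in frequency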

Image enhancement

Now we finally get to enhance some images! First off, we'll try to enhance a famous image, the one from the Lunar Orbiter, shown in Figure 6.

[figure]

Figure 6. The image from the Lunar Orbiter.

As observed, there are a lot of spurious white lines in the image. This is because the image was transmitted part by part, which introduced some irregularities. We try to remove the white lines by multiplying the image's spectrum by a mask in the Fourier domain. Applying the FFT:

[figure]

Figure 7. The spectrum is shown in log scale, since the low-frequency components have much higher magnitudes and all the other details would otherwise appear black.

We can see a lot of white noise, though its magnitude is smaller. Referring to Figure 5, we can estimate the frequencies of the periodic white lines; in Figure 7, they show up as the straight lines. We multiply by a mask, an example of which is shown in the next figure.

[figure]

Figure 8. The mask that was used.

Observing Figure 8, note that the center is left untouched. This is because, in many images, the important details have low spatial frequencies (this applies to most signals as well), so we have to retain that information. We multiply Figure 8 by Figure 7 and then apply the inverse FFT. The result is shown in Figure 9.
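The whole filtering pipeline, in hedged Scilab form. As before, imread and rgb2gray assume the SIVP toolbox, the file name is made up, and the mask below is only a toy example that blocks the spectrum's axes away from the center; the actual mask in Figure 8 was drawn by hand.

img = double(rgb2gray(imread("lunar.jpg")));   // SIVP toolbox assumed
S = fftshift(fft(img));                        // centered spectrum
[rows, cols] = size(img);                      // assumed even
mask = ones(rows, cols);
mask(rows/2-1:rows/2+1, :) = 0;                // block the horizontal axis
mask(:, cols/2-1:cols/2+1) = 0;                // block the vertical axis
mask(rows/2-10:rows/2+10, cols/2-10:cols/2+10) = 1;  // but keep the low frequencies
enhanced = real(ifft(fftshift(S .* mask)));    // back to the image domain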

[figure]

Figure 9. The output of applying the mask in the Fourier domain.

In Figure 9, the white lines have diminished, though they can still be seen. Thickening the lines in the mask of Figure 8 would remove more of them; however, that risks removing some important details, so we did not do it here.

In Figure 10, we show another example of an enhanced image, a painting by Dr. Vincent Daria, Fredericksborg.

[figure]

Figure 10. The Fredericksborg.

The top left image is the painting before any processing (converted to grayscale). The lower left is its FFT, and the lower right is the mask. After applying the mask to the FFT of the image and taking the inverse FFT, we obtain the top right image. It is noticeably cleaner than the one on the left: the brush strokes are removed.

The author gives himself 10/10 for completing the minimum requirements.

Acknowledgements to Abby for helping.