Novel Approach to Content Based Image Retrieval Using Evolutionary Computing

Content Based Image Retrieval (CBIR) is an active research area in the multimedia domain. One of the challenges of CBIR is to bridge the gap between low level features and high level semantics. In this study we investigate Particle Swarm Optimization (PSO), a stochastic algorithm, and the Genetic Algorithm (GA) for CBIR to narrow this gap. We propose a new CBIR system based on PSO and GA coupled with a Support Vector Machine (SVM). GA and PSO are both evolutionary algorithms and in this study are used to increase the number of relevant images. SVM is used to perform the final classification. To check the performance of the proposed technique, extensive experiments are performed using the Corel dataset. The proposed technique achieves higher accuracy than previously introduced techniques (FEI, FIRM, SIMPLIcity, simple HIST and WH).


INTRODUCTION
The explosive growth of digital content has made Content Based Image Retrieval (CBIR) an attractive research area in the multimedia domain. Different areas such as fashion design, education, entertainment and art galleries generate huge image databases. To get optimum benefit from these databases, an efficient CBIR system is needed. Researchers in the multimedia domain are actively working to improve the performance of CBIR. However, one of the challenges of CBIR is to minimize the gap between low level features and high level semantics. In the beginning, low level image features such as color, texture and shape were the focus of multimedia researchers. CBIR techniques based on low level features are easy to implement and perform well for simple images. However, it is very difficult for visual features to express the semantics of an image, and these algorithms have several limitations when dealing with the broad content of an image database. Therefore, to enhance the performance of CBIR, region based retrieval through image segmentation was introduced. These algorithms try to overcome the problem of low level features by representing the image as multiple objects, which is considered closer to the perception of the human visual system. But these techniques depend mainly on the segmentation results: poor segmentation leads to lower accuracy.

The difference between what the user desires and the representation of the image is known as the semantic gap. To reduce the semantic gap, relevance feedback was introduced. The primary goal of relevance feedback is to include the subjectivity of human perception in the query process and to give the user the opportunity to assess retrieval results. These assessments refine the similarity measures. However, even though relevance feedback is a powerful tool to enhance the performance of CBIR, it still experiences a few problems (Han et al., 2005). Many users prefer not to provide feedback to the system repeatedly; they want results in the first iteration. Semantic based retrieval algorithms try to find the actual semantic meaning of an image and use it for similar image retrieval. However, finding the semantics is a high level cognitive task and difficult to automate.

In this study, we propose a Support Vector Machine (SVM) based CBIR system supported by Particle Swarm Optimization (PSO) and the Genetic Algorithm (GA), namely GAPSO-SVM. Many researchers have used SVM for traditional retrieval methods and relevance feedback (Seo, 2007; Yildizer et al., 2012; Saadatmand-Tarzjan and Moghaddam, 2007). However, the performance of these systems is limited. They treat retrieval as a classification problem in which the training set consists of the relevant and irrelevant images marked by the user as feedback. Because the feedback labeling is imbalanced, i.e., the number of relevant feedbacks is smaller than the number of irrelevant feedbacks, these techniques do not perform as well as users desire. Stagnation, where the search process converges to a suboptimal local solution, is also considered a critical problem, and it can occur more frequently as the database size increases. To resolve this problem, we propose a method in which the support vector machine is trained on the best swarms of the particle swarm optimization algorithm. We use the PSO algorithm proposed by Kennedy and Eberhart (1995). PSO is not only an optimization algorithm but also an effective space exploration technique which avoids premature convergence of populations.
Particle swarm optimization: Particle Swarm Optimization (PSO) is a stochastic technique presented by Kennedy and Eberhart (1995). The algorithm simulates the behavior of a flock of birds flying together in a multi-dimensional space in search of an optimum place, adjusting their movements and distances for a better search. PSO is similar in nature to other evolutionary computing algorithms such as the Genetic Algorithm (GA). PSO starts with a random initialization of the swarm and searches for an optimal solution by updating the generations. This study investigates how PSO can be used to improve the performance of CBIR when combined with a genetically optimized SVM.
PSO has two models, the social and the cognition models. The social model is used for global search, while local search is performed by the cognition model. The idea of this algorithm is to find the global optimum by flying particles through the search space. In PSO, $P_i \in [a, b]$ represents a particle, where $i = 1, 2, 3, \ldots, D$ and $a, b \in R$; $D$ represents the number of dimensions and $R$ the real numbers (Hajira Jabeen and Baig, 2009). Each particle has its own position and velocity, which are initialized randomly at the start. Once initialization is done, the particles have to find new best positions. Every particle maintains its personal best and the global best positions, called $p_{best}$ and $g_{best}$, respectively. The following equations are used to update the velocity and position of each particle:

$$V_i = V_i + C_1 r_1 (p_{best} - X_i) + C_2 r_2 (g_{best} - X_i)$$

$$X_i = X_i + V_i$$

where $X_i$ is the position, $V_i$ is the velocity, $p_{best}$ is the personal best position and $g_{best}$ is the global best position. Similarly, $r_1$ and $r_2$ are two random numbers in the range [0, 1], and $C_1$ and $C_2$ are learning factors weighting the cognition and social components, respectively.
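The update rule described above can be illustrated with a minimal sketch for a single particle. The function name `pso_step` and the default learning factors `c1 = c2 = 2.0` are assumptions made for illustration only; the text does not state the values used.

```python
import random


def pso_step(position, velocity, p_best, g_best, c1=2.0, c2=2.0, rng=random):
    """One velocity/position update for a single particle (no inertia term).

    c1 and c2 are illustrative defaults, not values taken from the paper.
    """
    new_velocity = []
    new_position = []
    for x, v, pb, gb in zip(position, velocity, p_best, g_best):
        # Fresh random numbers in [0, 1) for every dimension.
        r1, r2 = rng.random(), rng.random()
        # Cognition term pulls towards the personal best,
        # social term pulls towards the global best.
        v_next = v + c1 * r1 * (pb - x) + c2 * r2 * (gb - x)
        new_velocity.append(v_next)
        new_position.append(x + v_next)
    return new_position, new_velocity
```

When a particle already sits on both its personal best and the global best, the two attraction terms vanish and the particle simply continues with its current velocity.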
To improve the performance of PSO, different variants have been proposed by researchers (Imran et al., 2010, 2011, 2012). Details about PSO variants can be found in Imran et al. (2013). As a good optimizer, PSO has proved its performance on different practical problems (Parsopoulos and Vrahatis, 2002) and in several domains such as the classification of digital content (Chandramouli, 2007), ad-hoc sensor networks (Yuan et al., 2004), antenna array design (Gies and Rahmat-Samii, 2003) and neural networks (Liu et al., 2004). Researchers in the multimedia domain have also explored PSO to improve the performance of CBIR. Self Organized feature Maps (SOM) were optimized using PSO by Chandramouli et al. (2008). Image retrieval ranking was improved using PSO by Okayama et al. (2008). Wu et al. (2010) applied PSO to fine tune the weights of parameters in similarity computation. To grasp the user semantics, Broilo and De Natale (2010) used PSO as a classifier.

Swarm representation:
The particles of PSO are represented by the feature set and each particle is a feature vector. The particles fly in the search space generated by the features of the image database. To extract the feature set, we used the Color Layout Descriptor (CLD) from MPEG-7 and Haar wavelets.

Color layout descriptor:
The Color Layout Descriptor is one of the MPEG-7 descriptors. To describe the spatial distribution of color in an arbitrarily shaped region, the CLD is the best descriptor (Sikora, 2001). There are four stages to extract the CLD. In the first stage, the input image is partitioned into 64 (8×8) blocks. In the second stage, a single representative color is selected for each block. As a result, a tiny image representation of size 8×8 is obtained. In the third stage, each of the three color components is transformed by an 8×8 DCT, so that three sets of 64 DCT coefficients are obtained. The following equation is used to calculate the DCT of a 2D array $A$ of size $M \times N$:

$$B_{pq} = \alpha_p \alpha_q \sum_{m=0}^{M-1} \sum_{n=0}^{N-1} A_{mn} \cos\frac{\pi (2m+1) p}{2M} \cos\frac{\pi (2n+1) q}{2N}, \quad 0 \le p \le M-1, \; 0 \le q \le N-1 \tag{5}$$

where $\alpha_p = 1/\sqrt{M}$ for $p = 0$ and $\alpha_p = \sqrt{2/M}$ for $1 \le p \le M-1$, and $\alpha_q$ is defined in the same way with $N$ in place of $M$. The values $B_{pq}$ are called the DCT coefficients of $A$. In the final stage, zigzag scanning is performed for each set of coefficients (Fig. 1).
Color feature vector: After zigzag scanning, we obtain three matrices for each block of the Y, Cb and Cr color space. Three feature vectors for an image can be obtained by taking the sum of the three matrices. The resulting feature vector is obtained by horizontally concatenating the three feature vectors.
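The CLD stages described above (block partitioning, representative color, DCT and zigzag scan) can be sketched for a single color channel as follows. This is an illustrative sketch, not the authors' implementation: the function names are hypothetical, the representative color is assumed to be the block mean, and the image sides are assumed to be multiples of 8. The RGB to YCbCr conversion and the final concatenation of the three channels are left out.

```python
import math


def block_means(channel, grid=8):
    """Partition a 2D channel into grid x grid blocks and return each block mean.

    Using the mean as the representative color is an assumption.
    """
    height, width = len(channel), len(channel[0])
    bh, bw = height // grid, width // grid  # assumes sides are multiples of grid
    tiny = []
    for by in range(grid):
        row = []
        for bx in range(grid):
            total = sum(channel[y][x]
                        for y in range(by * bh, (by + 1) * bh)
                        for x in range(bx * bw, (bx + 1) * bw))
            row.append(total / (bh * bw))
        tiny.append(row)
    return tiny


def dct2(a):
    """Orthonormal 2D DCT-II of a square matrix."""
    n = len(a)
    alpha = [math.sqrt(1.0 / n)] + [math.sqrt(2.0 / n)] * (n - 1)
    # cos_table[k][m] = cos(pi * (2m + 1) * k / (2n))
    cos_table = [[math.cos(math.pi * (2 * m + 1) * k / (2 * n)) for m in range(n)]
                 for k in range(n)]
    out = [[0.0] * n for _ in range(n)]
    for p in range(n):
        for q in range(n):
            s = 0.0
            for m in range(n):
                for j in range(n):
                    s += a[m][j] * cos_table[p][m] * cos_table[q][j]
            out[p][q] = alpha[p] * alpha[q] * s
    return out


def zigzag(mat):
    """Read a square matrix in JPEG-style zigzag order."""
    n = len(mat)
    cells = [(r, c) for r in range(n) for c in range(n)]
    # Odd anti-diagonals run top-right to bottom-left, even ones the other way.
    cells.sort(key=lambda rc: (rc[0] + rc[1],
                               rc[0] if (rc[0] + rc[1]) % 2 else rc[1]))
    return [mat[r][c] for r, c in cells]


def cld_channel(channel):
    """64 zigzag-ordered DCT coefficients for one color channel."""
    return zigzag(dct2(block_means(channel)))
```

For a flat channel only the first (DC) coefficient is non-zero, which is why the low-order zigzag coefficients carry most of the color layout information.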

Wavelet packets:
To extract texture features, we used wavelet packets. This is a generalization of wavelet decomposition and can be described by the collection of functions $\{W_J(x) \mid J \in Z^+\}$, obtained from (Rao et al., 2009):

$$2^{(p-1)/2}\, W_{2J}(2^{p-1}x - l) = \sum_m h_{m-2l}\, 2^{p/2}\, W_J(2^p x - m) \tag{6}$$

$$2^{(p-1)/2}\, W_{2J+1}(2^{p-1}x - l) = \sum_m g_{m-2l}\, 2^{p/2}\, W_J(2^p x - m) \tag{7}$$

where $p$ is a scale index, $l$ is the translation index, $W_0(x) = \varphi(x)$ is the scaling function, $W_1(x) = \psi(x)$ is the basic wavelet function and $h_m$ and $g_m$ are the quadratic mirror filters. Wavelet packets are well localized in both time and frequency and thus provide an attractive alternative to pure frequency (Fourier) analysis. For a given orthogonal wavelet function, we obtain a library of bases called wavelet packet bases. Each of these bases offers a particular way of coding signals, reconstructing exact features and preserving global energy. The inverse relationship between wavelet packets of different scales can be seen in Rao et al. (2009):

$$2^{p/2}\, W_J(2^p x - k) = \sum_l h_{k-2l}\, 2^{(p-1)/2}\, W_{2J}(2^{p-1}x - l) + \sum_l g_{k-2l}\, 2^{(p-1)/2}\, W_{2J+1}(2^{p-1}x - l) \tag{8}$$

Equation (8) can be used to calculate the wavelet packets. Coefficients of the coarser scale can be calculated using Eq. (6) and (7). The main difference between normal wavelet decomposition and wavelet packet decomposition is that, instead of splitting only the approximation components, wavelet packets decompose the detail components as well. So, by using this method, a richer analysis becomes possible.

The wavelet packet procedure results in a large number of decompositions and their explicit enumeration is unmanageable. Therefore, it is necessary to find the optimal decomposition with respect to some reasonable criterion. One convenient criterion is the selection of tree nodes on the basis of the best entropy values (Fig. 2).

In this study, we have used the Shannon entropy measure to calculate the entropy. For a set of coefficients $s_i$, this can be calculated as:

$$E(s) = -\sum_i s_i^2 \log\left(s_i^2\right) \tag{9}$$

Using the Shannon entropy, the optimal or best tree can be calculated using the following scheme. A node $N$ will be split into two nodes $N_1$ and $N_2$ if and only if the sum of the entropies of $N_1$ and $N_2$ is less than the entropy of $N$. This is a local criterion based only on the information available at node $N$. It creates a tree which is of much smaller size than the full tree.

Shannon entropy based wavelet packets are used to generate the signature of the database images up to the 3rd level. The formula used to generate the Coiflets wavelet signature is given as:

$$f_r = \frac{\sum C_{ij}}{i \times j} \tag{10}$$

where $f_r$ is the computed wavelet signature (texture feature representation), $C_{ij}$ is the representation of the intensity value of all elements of the sub image and $i \times j$ is the size of the sub image (Rao et al., 2009).

App. Sci. Eng. Technol., 8(6): 691-701, 2014
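The entropy-based splitting rule described above can be sketched on a 1D signal. To keep the example dependency-free, an orthonormal Haar filter pair is used as a stand-in for the Coiflets filters of the paper, and the function names are hypothetical. A node is split only when the summed Shannon entropy of its two children is lower than its own.

```python
import math


def shannon_entropy(coeffs):
    """Non-normalized Shannon entropy: -sum(c^2 * log(c^2)), taking 0*log(0) = 0."""
    total = 0.0
    for c in coeffs:
        energy = c * c
        if energy > 0.0:
            total -= energy * math.log(energy)
    return total


def haar_split(signal):
    """Split an even-length signal into approximation and detail halves.

    Orthonormal Haar filters are a stand-in here, not the Coiflets of the text.
    """
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal) - 1, 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal) - 1, 2)]
    return approx, detail


def best_tree(signal, max_level=3):
    """Return the leaves of the entropy-pruned packet tree as (path, coeffs) pairs."""
    def grow(node, path, level):
        # Stop at the depth limit or when the node can no longer be halved.
        if level == max_level or len(node) < 2 or len(node) % 2:
            return [(path, node)]
        approx, detail = haar_split(node)
        # Local criterion: split only if the children have lower total entropy.
        if shannon_entropy(approx) + shannon_entropy(detail) < shannon_entropy(node):
            return (grow(approx, path + "a", level + 1)
                    + grow(detail, path + "d", level + 1))
        return [(path, node)]

    return grow(list(signal), "", 0)
```

A constant signal is split repeatedly along the approximation branch because its energy concentrates in fewer coefficients, whereas a single impulse is left unsplit because splitting would spread its energy.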

GENETIC ALGORITHM
The Genetic Algorithm is an evolutionary algorithm based on the process of natural evolution (Goldberg, 1989). The solutions of GA are represented as chromosomes, which together form the population. One population is used to generate a new population, in the hope that the new population will be better than the previous one. Individuals of the first population are initialized randomly. After initialization, fitness is computed for every individual of the population. Parents are selected according to their fitness values to generate new offspring. The process of generating new solutions is repeated until some criterion is met; the criterion can be a specific number of populations or a specific fitness of the solution. GA has been utilized in different fields, such as the work done by Galdwell and Johnston (1991) and Baker and Seltzer (1993), including CBIR by Cho and Lee (2002), Gali et al. (2012) and Syam and Rao (2012). To resolve the imbalanced labeling problem of the support vector machine, this study explores multiple arrangements of image features.

GA procedure:
The aim of this study is to model the retrieval process as an optimization problem by searching for the maximum number of relevant images for any query image. The output generated by SVM, as illustrated above, is used by GA to construct the chromosomes.

Construction of the chromosomes:
Chromosomes are defined as: where N represents the number of genes in one chromosome and M is the size of the population. To perform genetic operations, a population of chromosomes is generated. Initially, the feature sets of the query image and the labeled positive images are represented as chromosomes. The following equation represents the parent chromosome: [ , ] where p = 1, 2, 3, ..., 24 and q = 1, 2, 3, ..., 10 index the features obtained using CLD and Coiflets wavelets, respectively.
Population evolution: New solutions are generated through the crossover and mutation operators of the GA. The population size is kept constant at 100. Offspring are generated as follows: where # and $ represent the offspring, and P and C denote the parents and the cut-points. For superiority, three types of chromosomes are selected:
• Original chromosomes representing parents and relevance feedback
• Chromosomes that passed the evaluation test
The following equation is used as the evaluation test: By using this test we select the chromosomes having the smallest distance from the best parents; n is the number of remaining offspring which can become elite in addition to the original parents. We set the mutation rate to 0.05.
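The crossover, mutation and elite selection steps described above can be sketched as follows. Several details are assumptions made only for illustration, because the corresponding equations are not available in the text: a single cut-point is used, mutation adds Gaussian noise with a hypothetical `sigma`, and the evaluation test is approximated by the Manhattan distance to the best parent. Only the mutation rate of 0.05 comes from the text.

```python
import random

MUTATION_RATE = 0.05  # value stated in the text


def crossover(parent_a, parent_b, cut):
    """One-point crossover: swap the tails of two parents at index `cut`."""
    child_a = parent_a[:cut] + parent_b[cut:]
    child_b = parent_b[:cut] + parent_a[cut:]
    return child_a, child_b


def mutate(chromosome, rate=MUTATION_RATE, sigma=0.1, rng=random):
    """Perturb each gene with probability `rate`; sigma is an illustrative choice."""
    return [gene + rng.gauss(0.0, sigma) if rng.random() < rate else gene
            for gene in chromosome]


def manhattan(a, b):
    """Sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def select_elite(offspring, best_parent, n):
    """Keep the n offspring closest to the best parent (assumed evaluation test)."""
    return sorted(offspring, key=lambda c: manhattan(c, best_parent))[:n]
```

For example, `crossover([1, 2, 3, 4], [5, 6, 7, 8], 2)` exchanges the last two genes of the two parents.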

GA fitness function:
The fitness function is important in any evolutionary technique, and the performance of the algorithm depends heavily on it. The following fitness function (Saadatmand-Tarzjan and Moghaddam, 2007) is used in this study: where C is the maximum possible number of relevant images. As the number of images in each class of our image database is known, the value of C is already determined. $y^j_{m,p}$ indicates the p-th member of $y^j_m$ and $\delta(\cdot)$ is the Kronecker delta function (Saadatmand-Tarzjan and Moghaddam, 2007): The result of the fitness function is expressed in terms of 1 and 0: the result is 1 when the system matches the user solution and 0 when it is fully mismatched.

Proposed GAPSO-SVM approach: When the user inputs a query image, the system extracts the color and texture features of the image through CLD and Coiflets wavelets. A similarity measure based on the Manhattan distance is computed between the query image features and the database image features. The retrieved results are ranked and used as input to the PSO for swarm initialization. PSO considers the top 50 images as relevant and the rest as irrelevant. The evolutionary process of the PSO then starts, as discussed in the next section. PSO outputs relevant and irrelevant images, which are used to train the SVM. The SVM then performs the classification and generates the relevant and irrelevant sets of images. The relevant set of images marked by SVM is used by GA. The process of GA is discussed in section IV. GA produces relevant and irrelevant sets of images, which are used by SVM for training and classification. SVM classifies the positive and negative images, and the system then displays the positive images to the user. The flow chart of the proposed approach is shown in Fig. 3.

METHODOLOGY
Calculating distance: An image is described in terms of features in order to calculate the distance. Two feature vectors are extracted from the image, one for color features and the other for texture features. Both vectors are combined to obtain the final feature vector. For the database images, the feature calculation is performed offline.

When the user inputs a query image to search for similar images in the image database, its features are extracted and compared with those of all database images, and the most similar images are displayed to the user based on Eq. (20):

$$Dist(x_q; x_j) = MNHT(x_q, x_j) \tag{20}$$

where MNHT is the Manhattan distance calculated between the query image $x_q$ and the image $x_j$ from the database. $Dist(x_q; x_j)$ is calculated for $j = 1, 2, \ldots, N_{DB}$, where $N_{DB}$ represents the number of images in the database. The results are then sorted by distance from the query image and used as input to the PSO.
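The ranking step described above can be sketched as follows. The function names are hypothetical, and feature vectors are assumed to be plain lists of numbers of equal length. The split into the top 50 images as initially relevant follows the description of the proposed approach.

```python
def manhattan(a, b):
    """Manhattan (L1) distance between two feature vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def rank_database(query, database):
    """Return database indices sorted by increasing distance to the query."""
    distances = [(manhattan(query, features), index)
                 for index, features in enumerate(database)]
    distances.sort()  # ties are broken by the lower index
    return [index for _, index in distances]


def initial_split(ranking, n_relevant=50):
    """Treat the top-ranked images as relevant and the rest as irrelevant."""
    return ranking[:n_relevant], ranking[n_relevant:]
```

The two index lists returned by `initial_split` would then seed the swarm initialization.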

Swarm initialization and fitness evaluation:
The objective of this study is to enhance the performance of CBIR by considering the retrieval process as an optimization problem. For this purpose, PSO and GA are selected. A particle of the PSO is defined as $p_n$, a feature vector from the feature space. Let $P$ be the number of particles with $N_{FB} \le P < N_{DB}$, where $N_{FB}$ is the number of relevant images and $N_{DB}$ is the total number of images in the database. Each particle has an independent speed vector defined as $v^k_n$, $n = 1, 2, \ldots, P$. Any optimization process is highly dependent on the fitness function; in this study the fitness function of Broilo and De Natale (2010) is used, given as:

$$\Psi^k(p_n) = \frac{1}{N^k_{REL}} \sum_{r=1}^{N^k_{REL}} Dist(p^k_n; x^k_r) + \left( \frac{1}{N^k_{IRR}} \sum_{i=1}^{N^k_{IRR}} Dist(p^k_n; x^k_i) \right)^{-1} \tag{21}$$

where $x^k_r$, $r = 1, 2, \ldots, N^k_{REL}$ and $x^k_i$, $i = 1, 2, \ldots, N^k_{IRR}$ are the images in the relevant and irrelevant subsets $X^k_{REL}$ and $X^k_{IRR}$. The distance $Dist(\cdot)$ is calculated by Eq. (20).

If the particle is close to the relevant set and far from the irrelevant images, the function produces a lower value, and vice versa. Therefore, a smaller fitness value means a better particle position. This fitness value is used to reorder the swarm and generate the new ranking. During each iteration, changes occur in the fitness function due to the dynamic changes in $X^k_{REL}$ and $X^k_{IRR}$. For this reason, some relevant images can be moved to the irrelevant zone. Most of the time during the iterations, irrelevant images can dominate the relevant images. This aspect was taken into consideration when assembling the objective function, by making it dependent on the inverse of the distance from the irrelevant images. In this way, as the average distance of the particle from the irrelevant images increases, the fitness depends only on the relevant images.
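A fitness of the kind described above, low when a particle is near the relevant images and far from the irrelevant ones, can be sketched as the mean distance to the relevant set plus the inverse of the mean distance to the irrelevant set. The function names and the handling of a zero mean distance are assumptions for illustration.

```python
def manhattan(a, b):
    """Manhattan (L1) distance between two feature vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def fitness(particle, relevant, irrelevant):
    """Mean distance to relevant images plus inverse mean distance to irrelevant ones.

    Smaller values indicate a better particle position.
    """
    mean_rel = sum(manhattan(particle, x) for x in relevant) / len(relevant)
    mean_irr = sum(manhattan(particle, x) for x in irrelevant) / len(irrelevant)
    if mean_irr == 0.0:
        # Particle coincides with every irrelevant image: worst possible score.
        # This guard is an illustrative choice, not taken from the text.
        return float("inf")
    return mean_rel + 1.0 / mean_irr


def rerank(particles, relevant, irrelevant):
    """Indices of the particles ordered from lowest (best) to highest fitness."""
    scores = [fitness(p, relevant, irrelevant) for p in particles]
    return sorted(range(len(particles)), key=lambda n: scores[n])
```

As the mean distance to the irrelevant set grows, the second term tends to zero and the score is governed by the relevant set alone, matching the behavior described in the text.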
Termination criteria: To understand how the swarm evolves during this optimization process, the elements of the particles need to be defined. Each particle carries two positions: one is the personal best position $p^{best}_n$ and the other is the global best position $g^{best}$. In this study, the way of updating the personal best and global best differs from traditional PSO. The query image input by the user is treated as the global best. At the beginning, the personal best is initialized with the original feature vector. Updating of the personal best depends on the result of Eq. (21): if the fitness of the current position is less than or equal to that of the personal best, the personal best is updated. The speed vector of each particle is updated as:

$$v^{k+1}_n = \omega v^k_n + C_1 r_1 (p^{best}_n - p^k_n) + C_2 r_2 (g^{best} - p^k_n) \tag{22}$$

where $r_1$ and $r_2$ are two random numbers chosen from the range [0, 1] and $\omega$ is the inertia weight, kept static at 0.4. $C_1$ and $C_2$ are learning factors weighting the cognition and social components, respectively.
To update the particles we used the following equation:

$$p^{k+1}_n = p^k_n + v^{k+1}_n \tag{23}$$

After the initialization of each particle's position and velocity, the position and velocity of each particle are updated during each iteration through Eq. (22) and (23). Equation (21) is used to calculate the fitness of the particles. Updating of the personal best depends on the fitness value. The images are ranked from lower to higher fitness, then the relevant repository is updated and returned to the user. This iterative process ends when one of the following conditions is met:
• The target number of iterations is reached, or
• The predefined number of relevant images is retrieved
In our case, we set a specific number of iterations. The output generated from PSO is used to train the SVM.
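The iteration described above can be sketched as a short loop with the query treated as the global best, a static inertia weight of 0.4 and a fixed number of iterations as the stopping rule. The function name `run_pso`, the learning factors `c1 = c2 = 2.0`, the range of the random initial velocities and the iteration count are assumptions for illustration; `fitness_fn` stands for any fitness where lower is better.

```python
import random


def run_pso(particles, query, fitness_fn, iterations=10,
            omega=0.4, c1=2.0, c2=2.0, seed=0):
    """Evolve the swarm for a fixed number of iterations.

    Returns particle indices ranked from lowest to highest personal-best
    fitness, together with the personal-best positions.
    """
    rng = random.Random(seed)
    dim = len(query)
    positions = [list(p) for p in particles]  # copy so inputs stay untouched
    # Random initial velocities; the range [-1, 1] is an illustrative choice.
    velocities = [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in positions]
    p_best = [list(p) for p in positions]
    p_best_fit = [fitness_fn(p) for p in positions]
    g_best = list(query)  # the query image acts as the global best

    for _ in range(iterations):
        for n, pos in enumerate(positions):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                velocities[n][d] = (omega * velocities[n][d]
                                    + c1 * r1 * (p_best[n][d] - pos[d])
                                    + c2 * r2 * (g_best[d] - pos[d]))
                pos[d] += velocities[n][d]
            fit = fitness_fn(pos)
            # Lower fitness is better, so update the personal best on <=.
            if fit <= p_best_fit[n]:
                p_best[n] = list(pos)
                p_best_fit[n] = fit

    ranking = sorted(range(len(positions)), key=lambda n: p_best_fit[n])
    return ranking, p_best
```

Because a personal best is replaced only when the fitness does not increase, the personal-best fitness of every particle is never worse than its starting fitness.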

EXPERIMENTAL SETUP
The proposed GAPSO-SVM approach is validated through extensive experiments on a real image data set. The results of GAPSO-SVM are compared with some previous CBIR techniques. Details about the data set and experiments are provided in the next subsections.

Analysis:
The proposed GAPSO-SVM method is compared with FEI, SIMPLIcity, FIRM, WH (Banerjee et al., 2009; Wang et al., 2001; Chen and Wang, 2002; Karpagam and Rangarajan, 2012) and simple HIST taken from Karpagam and Rangarajan (2012). For comparison, we have used the results described in Karpagam and Rangarajan (2012). The following conclusions are drawn from the results shown in Fig. 4:
• GAPSO-SVM has outperformed the FEI, SIMPLIcity, FIRM, WH and simple HIST methods.
• WH has shown better results than many previous techniques.

Figure 4a describes the precision and recall at different top 'n'. The class wise performance of previous techniques is shown in Fig. 4b. Figure 4c illustrates that the accuracy achieved by the GAPSO-SVM approach is much higher than that of previous techniques. The precision achieved by GAPSO-SVM at different top retrievals is higher than that of WH (Karpagam and Rangarajan, 2012), as shown in Fig. 4d. Figure 4e compares the performance of GAPSO-SVM and FIRM (Chen and Wang, 2002). The performance of GAPSO-SVM on each class is given in Fig. 4f.

CONCLUSION
This study proposed a CBIR technique named GAPSO-SVM, which uses the evolutionary computing techniques PSO and GA to enhance the performance of CBIR. The proposed system uses the Color Layout Descriptor (CLD) of MPEG-7 for color features and wavelet packets for texture features, with PSO and GA coupled with SVM for retrieval. To validate the system performance, the results are compared with the FEI, SIMPLIcity, FIRM, WH and simple HIST methods of CBIR. GAPSO-SVM achieved more than 95% accuracy for the classes Dinosaurs and Flowers. For the classes Africa, Buses, Horses and Mountains, the accuracy of GAPSO-SVM is higher than 70%. Beach, Buildings and Food achieve accuracy higher than 50%. The only class with less than 50% accuracy is Elephants.

Wang, J., J. Li and G. Wiederhold, 2001

Fig. 1: Extraction process of the CLD

Texture feature extraction algorithm:
• Let I be the image of size w×w
• Divide the image I into four bands I_1, I_2, I_3, I_4 of size w/2×w/2 based on the Coiflets wavelet
• Compute signatures f_r for I_2, I_3, I_4
• Take the band I_1 and divide it into four bands I_11, I_12, I_13, I_14 of size w/4×w/4
• Compute signatures f_r for I_12, I_13, I_14
• Take the band I_11 and divide it into four bands I_111, I_112, I_113, I_114 of size w/8×w/8
• Compute signatures f_r for I_111, I_112, I_113, I_114; 10 signatures are now obtained, so stop the process

$$f_r = \frac{\sum C_{ij}}{i \times j}$$

where f_r is the computed wavelet signature (texture feature representation), C_ij is the representation of the intensity value of all elements of the sub image and i×j is the size of the sub image.

Color feature extraction algorithm:
• Image partitioning
o Divide the image into 8×8 blocks
• Representative color selection
o A single representative color is selected from each block
o The selection results in a tiny image icon of size 8×8
o The color space conversion between RGB and YCbCr is applied
• DCT transformation
o The luminance (Y) and the blue and red chrominance (Cb and Cr) are transformed by 8×8 DCT
• Zigzag scanning
o A zigzag scanning is performed on these three sets of DCT coefficients
o As a result we obtain three matrices for each block of the Y, Cb and Cr color space
o Take the sum of each matrix
o Horizontally concatenate the three feature vectors to obtain the final feature vector for an image
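The texture feature extraction algorithm above can be sketched as a three-level pyramid that yields 10 signatures. To stay dependency-free, an orthonormal 2D Haar step stands in for the Coiflets decomposition, the signature is taken literally as the sum of the band coefficients divided by the band size, and the image is assumed to be square with a side divisible by 8. All function names are hypothetical.

```python
def haar_bands(img):
    """One 2D Haar level: return four bands, each half the input size.

    The first band is the approximation; the other three are detail bands.
    Haar is a stand-in for the Coiflets wavelet used in the text.
    """
    half = len(img) // 2
    bands = [[[0.0] * half for _ in range(half)] for _ in range(4)]
    for r in range(half):
        for c in range(half):
            a, b = img[2 * r][2 * c], img[2 * r][2 * c + 1]
            d, e = img[2 * r + 1][2 * c], img[2 * r + 1][2 * c + 1]
            bands[0][r][c] = (a + b + d + e) / 2.0  # approximation
            bands[1][r][c] = (a - b + d - e) / 2.0  # horizontal detail
            bands[2][r][c] = (a + b - d - e) / 2.0  # vertical detail
            bands[3][r][c] = (a - b - d + e) / 2.0  # diagonal detail
    return tuple(bands)


def signature(band):
    """Sum of the band coefficients divided by the band size."""
    size = len(band) * len(band[0])
    return sum(sum(row) for row in band) / size


def texture_signatures(img, levels=3):
    """3 detail signatures per level plus the final approximation: 10 values."""
    features = []
    current = img
    for _ in range(levels):
        approx, horiz, vert, diag = haar_bands(current)
        features.extend(signature(b) for b in (horiz, vert, diag))
        current = approx  # only the approximation band is decomposed further
    features.append(signature(current))
    return features
```

Three detail signatures from each of the three levels plus the last approximation band give the 10 texture features used in the chromosome.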

Fig. 3: Flow chart of the proposed technique
Image database: The Corel database (Wang et al., 2001), which consists of 1000 images, has been used to perform the retrieval process. The Corel database has images from 10 different classes and each class has 100 images. The classes are Elephants, Africa, Beach, Buses, Buildings, Flowers, Dinosaurs, Mountains, Food and Horses. Ten images are randomly selected from each class and used as query images. Different experiments are performed to check the robustness of the technique, and in each experiment a different number of images is retrieved from the system, ranging from 10 to 100 and referred to as top 10, top 20 and so on up to top 100. Precision and recall are computed to measure the performance, where Precision is the number of relevant images retrieved divided by the total number of retrieved images, and Recall is the number of relevant images retrieved divided by the total number of relevant images in the database. Precision is the key measure for evaluating the performance of an algorithm. The value of precision ranges from 0.0 to 1.0, where 1 means 100% accuracy. The precision curve is the average of the precision values of the 10 queries for each class. The precision curve evaluates the effectiveness of a given algorithm and the recall curve evaluates its robustness.
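The precision and recall measures defined above can be sketched as follows. The function names are hypothetical; images are assumed to be identified by hashable ids, and the class-level precision is the plain average over the queries of that class.

```python
def precision_recall(retrieved, relevant):
    """Precision and recall for one query.

    retrieved: ids returned by the system (the top n results).
    relevant:  ids of all database images in the query's class.
    """
    relevant = set(relevant)
    hits = sum(1 for item in retrieved if item in relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


def average_precision_over_queries(results, relevant):
    """Mean precision over several queries of the same class."""
    values = [precision_recall(retrieved, relevant)[0] for retrieved in results]
    return sum(values) / len(values)
```

With 100 images per class, a top 10 retrieval that returns 8 images of the query's class has a precision of 0.8 and a recall of 0.08.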