Optimal Censoring Scheme Selection Based on Artificial Bee Colony Optimization (ABC) Algorithm

Life testing plans are vital for research on reliability and survival analysis. A shortage of test units or limits on test time often prevent an experiment from being continued until all failures are observed. Censoring is therefore an inherently important and well-organized methodology for estimating the model parameters of the underlying distributions. Type I and Type II censoring are the most widely employed censoring schemes. The chief practical problem in designing life testing experiments is the determination of the optimum censoring scheme, that is, the best scheme among all possible censoring schemes according to a specific optimality criterion. This study attempts to determine the optimum censoring scheme by minimizing the total cost of the experiment while keeping the termination time short and the number of failures reasonable. The ABC algorithm is employed to obtain the optimal censoring schemes, with entropy and variance serving as the optimality criteria. The proposed method uses risk analysis to evaluate the efficiency or reliability of the optimal censoring scheme that is determined.


INTRODUCTION
Life testing and reliability studies may not always obtain complete information on the failure times of every experimental unit. The problem under consideration is then to produce results that still allow deductions about the processes or populations involved. The data resulting from such experiments are known as censored data. Censoring is mainly employed to reduce the total test time and the cost of the experiment (Rolski et al., 1999; Hogg and Klugman, 1984; Klugman et al., 2004). The use of various types of experimental designs is limited by both cost and time. Hence, censoring serves as an efficient tool for limiting time, cost or a combination of both, and it allows choosing the best designs that are capable of supporting inference under these constraints (Benckert and Jung, 1974; Beirlant and Teugels, 1992). Obtaining extra information from future independent samples is also of interest. When several independent censored samples are available, the probability can be written unambiguously at all times. This does not hold for multiple independent samples when no distributional assumptions are made. A censoring scheme that balances the total time spent on the experiment, the number of units used in the experiment and the efficiency of the statistical inference based on its outcomes is preferred.
Two types of censoring schemes are in common use: Type-I censoring and Type-II censoring (Mikosch, 1997; Lawless, 1982; Ross, 1994, 1996; Yang and Xie, 2003). In Type-I (time) censoring, the life testing experiment is terminated at a fixed time T. In Type-II (failure) censoring, the experiment is terminated at the r-th failure, where r is fixed in advance. However, the conventional Type-I and Type-II censoring schemes lack the flexibility to remove units at points other than the terminal point of the experiment. To overcome this limitation, a more general censoring scheme called progressive Type-II right censoring has been introduced (Weibull Distribution).
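The three censoring mechanisms described above can be sketched in a few lines of Python. This is an illustrative simulation only and is not taken from the study: the function names, the example lifetimes drawn with `random.weibullvariate` and the example scheme are assumptions made for this sketch.

```python
import random


def type1_censor(lifetimes, T):
    """Type-I (time) censoring: keep only the failures observed up to time T."""
    return sorted(t for t in lifetimes if t <= T)


def type2_censor(lifetimes, r):
    """Type-II (failure) censoring: stop the test at the r-th failure."""
    return sorted(lifetimes)[:r]


def progressive_type2_censor(lifetimes, scheme, rng):
    """Progressive Type-II right censoring.

    After the i-th observed failure, scheme[i] surviving units are
    withdrawn at random. The scheme must satisfy sum(scheme) == n - m.
    """
    n, m = len(lifetimes), len(scheme)
    if sum(scheme) != n - m:
        raise ValueError("scheme must sum to n - m")
    alive = list(lifetimes)
    observed = []
    for r_i in scheme:
        t = min(alive)            # next failure among the units still on test
        alive.remove(t)
        observed.append(t)
        for _ in range(r_i):      # withdraw r_i random survivors
            alive.pop(rng.randrange(len(alive)))
    return observed


if __name__ == "__main__":
    rng = random.Random(1)
    # Hypothetical lifetimes: scale 1.0, shape 2.0 (illustrative values only).
    data = [rng.weibullvariate(1.0, 2.0) for _ in range(10)]
    print(type1_censor(data, T=0.8))
    print(type2_censor(data, r=4))
    print(progressive_type2_censor(data, [2, 0, 1, 3], rng))
```

With n = 10 units and the scheme (2, 0, 1, 3), only m = 4 failure times are observed; the remaining six units are withdrawn during the test rather than at its end.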
Two independent samples that are both Type-II right censored or progressively Type-II censored allow the construction of distribution-free intervals for quantiles, tolerance intervals and prediction intervals (Kalaivani and Somasundaram, 2013; Balakrishnan et al., 2010). The authors have shown that the gains in the maximum coverage probabilities are considerably larger in the two-sample situation than with one sample. Beutner and Cramer (2010) considered nonparametric inference for two independent samples of minimal repair systems. In addition, they showed how prediction intervals for future samples can be constructed until a specified time has been reached. Here again there are gains over the equivalent one-sample scenarios. One may search for approaches that hold for one or more samples (Balakrishnan and Aggarwala, 2000; Rai and Singh, 2008). The determination of optimal progressive censoring schemes has been considered for several criteria under different assumptions. The main objective of this research is to select the optimal censoring scheme for a set of censored data. Artificial Bee Colony optimization is used to find a suitable censoring scheme based on entropy and variance. The optimum censoring scheme is determined by minimizing the total cost of the experiment while keeping the termination time short and the number of failures reasonable.

LITERATURE REVIEW
This section presents a brief review of a handful of recent research works available in the literature. Pradhan and Kundu (2013) estimated the unknown parameters of the Birnbaum-Saunders distribution when the data are progressively Type-II censored. The MLEs of the unknown parameters cannot be obtained in explicit form. Hence, they used the EM algorithm to compute the estimates and obtained the asymptotic variance-covariance matrix numerically. They provided the optimal censoring scheme based on various information measures. The computation of the optimal censoring scheme is fairly computer intensive, and this has led to the development of new censoring schemes. The relative efficiencies of the sub-optimal censoring schemes were found to be moderately large in all cases. Hence, sub-optimal censoring schemes can reasonably be used in practice.
Sen et al. (2013) considered the Bayesian inference of the Linear Hazard Rate (LHR) distribution under a progressive censoring scheme. A combination of both Type I and Type II censoring is presented, based on independent gamma priors for the parameters, to obtain the posteriors as mixtures of gammas. The priors are motivated from a probability matching perspective. A joint credible set is built together with marginal inference and prediction by making use of the posterior distribution of certain quantities of interest. The Bayesian inference reveals a close connection to the frequentist inference results obtained with the Type-II censoring scheme. Bayesian planning strategies help in discovering the optimal progressive censoring schemes based on a variance criterion and a criterion based on the length of a credible interval for percentiles.
Sultan et al. (2014) studied the statistical inference of the unknown parameters of a two-parameter Inverse Weibull (IW) distribution based on a progressively Type-II censored sample. The maximum likelihood estimators are not available in explicit form. For this reason, they introduced approximate maximum likelihood estimators that are available in explicit form. The Bayes and generalized Bayes estimators of the IW parameters and of the reliability function under the squared error and Linex loss functions are presented. Lindley's approximation is used to compute the Bayes and generalized Bayes estimators because they cannot be obtained explicitly. Moreover, the highest posterior density credible intervals of the unknown parameters are computed using the Gibbs sampling technique, and the optimal censoring scheme is then obtained using an optimality criterion. Simulation experiments are carried out to evaluate the effectiveness of the estimators, and two data sets are examined for illustrative purposes.
Awad (2013) examined the possibility of employing entropy-information measures for designing an optimal Type-II progressive censoring scheme, with an illustrative application to a simple form of the Pareto distribution. Several mathematical formulas were derived on the basis of sixteen entropy-information measures for the efficiency of a progressive Type-II censoring scheme. However, these sixteen information measures do not help in selecting an optimal scheme because their values do not depend on the censoring scheme vector. Since the selection of a sup-entropy measure results in an optimal scheme, a Mathematica-7 code was developed for calculating the numerical values of the ten sup-entropy measures dealt with in that study. A numerical example demonstrated that the optimal scheme is a one-step censoring from the left after the detection of the first failure. Kohansal and Rezakhah (2013) expressed the joint Rényi entropy of progressively censored order statistics in terms of an incomplete integral of the hazard function and presented a simple estimate of the joint Rényi entropy of progressively Type-II censored data. A goodness-of-fit test statistic based on the Rényi Kullback-Leibler information with progressively Type-II censored data was set up, and its performance was compared against the leading test statistic. A Monte Carlo simulation study provides evidence that the proposed test statistic offers better power than the leading test statistic against alternatives with monotone increasing, monotone decreasing and non-monotone hazard functions.

PROPOSED METHODOLOGY
Estimation of hazard rate: The Weibull distribution is a continuous probability distribution. It is popular because it can take a wide variety of shapes as its parameters vary, and it has been studied extensively from both the frequentist and the Bayesian perspectives. The probability density function of a Weibull random variable x is as follows:

$$f(x; \lambda, k) = \frac{k}{\lambda}\left(\frac{x}{\lambda}\right)^{k-1} e^{-(x/\lambda)^{k}}, \quad x \ge 0$$

where k > 0 is the shape parameter and λ > 0 is the scale parameter of the distribution. Its complementary cumulative distribution function is a stretched exponential function. The Weibull distribution is related to a number of other probability distributions; in particular, it interpolates between the exponential distribution (k = 1) and the Rayleigh distribution (k = 2). If the quantity x is a "time-to-failure", the Weibull distribution gives a distribution for which the failure rate is proportional to a power of time. The shape parameter k is that power plus one, and hence it can be interpreted directly as follows:
• A value of k < 1 indicates that the failure rate decreases over time. This occurs if there is significant "infant mortality", that is, defective items fail early and the failure rate decreases over time as the defective items are weeded out of the population.
• A value of k = 1 indicates that the failure rate is constant over time. This might imply that random external events are causing mortality or failure.
• A value of k > 1 indicates that the failure rate increases with time. This occurs if there is an "aging" process, that is, parts are more likely to fail as time goes on.
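The three regimes of the shape parameter can be checked numerically. The sketch below assumes the standard shape/scale parameterization of the Weibull distribution stated above; the function names are illustrative and not part of the study.

```python
import math


def weibull_pdf(x, k, lam):
    """Weibull density with shape k and scale lam, for x > 0."""
    return (k / lam) * (x / lam) ** (k - 1) * math.exp(-((x / lam) ** k))


def weibull_reliability(x, k, lam):
    """Reliability (survival) function R(x) = 1 - F(x)."""
    return math.exp(-((x / lam) ** k))


def weibull_hazard(x, k, lam):
    """Hazard rate h(x) = f(x) / R(x), proportional to x ** (k - 1)."""
    return (k / lam) * (x / lam) ** (k - 1)


if __name__ == "__main__":
    for k in (0.5, 1.0, 2.0):
        rates = [weibull_hazard(t, k, 1.0) for t in (0.5, 1.0, 2.0)]
        print(f"k = {k}: hazard at t = 0.5, 1, 2 ->", rates)
```

For k = 0.5 the printed hazard values decrease with t, for k = 1 they are constant and for k = 2 they increase, matching the three cases listed above.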
The Weibull distribution is used in survival analysis, reliability engineering and failure analysis, in industrial engineering to characterize manufacturing and delivery times, in extreme value theory and in weather forecasting. It is well known that the Weibull Probability Density Function (PDF) can be decreasing or unimodal and that the Hazard Function (HF) can be either decreasing or increasing, depending on the shape parameter. Owing to the flexibility of the PDF and HF, the Weibull distribution has been widely used in situations where the data indicate a monotone HF. The Weibull distribution is not applicable when the data indicate a non-monotone or unimodal HF. Hence, in several practical applications, it is first determined whether the hazard rate (Rai and Singh, 2008) is monotone.
Here, entropy is used to find the optimal censoring schemes, and progressive Type II censoring schemes are considered. Progressive censoring has attracted more interest than other censoring schemes in the past few years, especially in reliability analysis. It is a more general censoring mechanism than the traditional Type I and Type II censoring schemes. The approach depends on the maximization of the joint entropy of progressively censored samples.
The entropy H is a measure of the uncertainty of a random variable. Let X be a random variable with Cumulative Distribution Function (CDF) F(x) and Probability Density Function (PDF) f(x). The differential entropy H(X) of the random variable is given in Eq. (1):

$$H(X) = -\int_{-\infty}^{\infty} f(x) \ln f(x)\, dx \quad (1)$$

The entropy is equal to unity minus the expectation of the natural logarithm of the hazard rate, $H(X) = 1 - E[\ln h(X)]$ with $h(x) = f(x)/\{1 - F(x)\}$. Hence, maximization of the entropy is equivalent to minimization of the expectation of the logarithm of the hazard rate. The joint entropy in a progressively Type II censored sample $(X_{1:m:n}, X_{2:m:n}, \ldots, X_{m:m:n})$ can be defined as:

$$H_{1 \ldots m:m:n} = -\int \cdots \int f_{X_{1:m:n}, \ldots, X_{m:m:n}}(x_1, \ldots, x_m) \ln f_{X_{1:m:n}, \ldots, X_{m:m:n}}(x_1, \ldots, x_m)\, dx_1 \cdots dx_m$$

The likelihood function is given in Eq. (2):

$$f_{X_{1:m:n}, \ldots, X_{m:m:n}}(x_1, \ldots, x_m) = c \prod_{i=1}^{m} f(x_i) \{1 - F(x_i)\}^{R_i}, \quad x_1 < x_2 < \cdots < x_m \quad (2)$$

where $c = n (n - R_1 - 1) \cdots (n - R_1 - \cdots - R_{m-1} - m + 1)$. In the case of $H_{1 \ldots m:m:n}$, the difficulty arises from the removals as well as from the expression of $H_{1 \ldots i:m:n}$, which involves integration over i random variables. Hence, simplifying the calculation of $H_{1 \ldots i:m:n}$ is attractive. This study focuses on the properties of the joint entropy in progressively Type II censored Order Statistics (OS). A computational method for the joint entropy based on progressive Type II censoring is employed that reduces the m integrals in the calculation of $H_{1 \ldots m:m:n}$ to none: the computation of the entropy in progressively Type II censored samples simplifies to a sum of entropies $h_{1:n}$ of the smallest OS for varying sample sizes.
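The relation between entropy and hazard rate can be verified numerically for the Weibull distribution. The following sketch is an illustration under stated assumptions rather than the study's computational method: expectations are approximated with a midpoint rule after the substitution u = F(x), and the closed-form Weibull entropy used for comparison is a standard textbook result that does not appear in the text above. All function names are hypothetical.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def weibull_quantile(u, k, lam):
    """Inverse CDF of the Weibull distribution (shape k, scale lam)."""
    return lam * (-math.log1p(-u)) ** (1.0 / k)


def log_pdf(x, k, lam):
    return math.log(k / lam) + (k - 1) * math.log(x / lam) - (x / lam) ** k


def log_hazard(x, k, lam):
    return math.log(k / lam) + (k - 1) * math.log(x / lam)


def expectation(g, k, lam, steps=200_000):
    """Approximate E[g(X)] as the integral of g(F^{-1}(u)) over (0, 1)."""
    total = 0.0
    for i in range(steps):
        u = (i + 0.5) / steps          # midpoint of the i-th sub-interval
        total += g(weibull_quantile(u, k, lam))
    return total / steps


def entropy_direct(k, lam):
    """H(X) = -E[ln f(X)], i.e. Eq. (1)."""
    return -expectation(lambda x: log_pdf(x, k, lam), k, lam)


def entropy_via_hazard(k, lam):
    """H(X) = 1 - E[ln h(X)]."""
    return 1.0 - expectation(lambda x: log_hazard(x, k, lam), k, lam)


def entropy_closed_form(k, lam):
    """Standard closed form for the Weibull differential entropy."""
    return EULER_GAMMA * (1.0 - 1.0 / k) + math.log(lam / k) + 1.0


if __name__ == "__main__":
    k, lam = 2.0, 1.0
    print(entropy_direct(k, lam), entropy_via_hazard(k, lam), entropy_closed_form(k, lam))
```

The three printed values should agree up to the discretization error of the midpoint rule, which illustrates why maximizing the entropy amounts to minimizing the expected log hazard rate.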
Let $X_1, X_2, \ldots, X_n$ be a random sample of size n from the pdf f(x) with cdf F(x) and hazard function h, and let $X_{1:n}, X_{2:n}, \ldots, X_{n:n}$ be the OS corresponding to this sample. Let $(X_{1:m:n}, X_{2:m:n}, \ldots, X_{m:m:n})$ be a progressively Type II censored sample with censoring scheme $(R_1, R_2, \ldots, R_m)$. The entropy in the progressively Type II censored sample $(X_{1:m:n}, X_{2:m:n}, \ldots, X_{m:m:n})$ can be written as in Eq. (3), where $h_{1:\gamma_j}$ is the entropy of the smallest OS of a sample of size $\gamma_j$.
Using the Markov chain property of progressive Type II censoring, the joint density can be written as in Eq. (4). The density of the first order statistic of a sample of size $(n - R_1 - \cdots - R_i - i)$ from the truncated density is g. Therefore, we have:

$$H_{1:m:n, \ldots, m:m:n} = H_{1:m:n} + H_{2|1:m:n} + \cdots + H_{m|m-1:m:n} \quad (5)$$

Thus, the expected entropy can be calculated as in Eq. (6). Conditional on $X_{i:m:n} = x_i$, $X_{i+1:m:n}$ has the same pdf as the first order statistic of the corresponding random sample from the truncated density. Changing the order of integration and using Eq. (5), Eq. (8) and the marginal density function, $H_{1:m:n}$ can be derived; applying the corresponding identity, the result follows. Nevertheless, some integral expressions remain hidden in $h_{1:\gamma_j}$, and these might be difficult to compute.

Estimation of variance and entropy:
Progressive Type II censoring schemes are employed to compute the variance. The analysis considers competing risks data that are progressively Type II censored under the latent failure times model. The Maximum Likelihood Estimators (MLEs) of the unknown parameters are calculated. The MLEs cannot be obtained in explicit form, but they can be obtained by solving a one-dimensional optimization problem. Because the MLEs cannot be obtained in explicit form, Approximate Maximum Likelihood Estimators (AMLEs) that have explicit expressions can be used instead.
For given $R_1, \ldots, R_m$, the following m observations are used to obtain the maximum likelihood estimates: $\{(x_{i:m:n}, 1);\ i \in I_1\}$ and $\{(x_{i:m:n}, 2);\ i \in I_2\}$. Based on these observations, the log-likelihood function $\ln l$ without the additive constant is given in Eq. (10). The profile log-likelihood of α, $p(\alpha)$, is obtained by differentiating with respect to $\lambda_1$ and $\lambda_2$ and equating to zero for fixed α. The MLE of α is obtained by maximizing $p(\alpha)$ with respect to α.
Approximate maximum likelihood estimator: If the random variable X follows a Weibull (α, λ) distribution, then Y = ln X has the extreme value distribution with the pdf given in Eq. (11). It is assumed that $X_{1i}$ and $X_{2i}$ are independent Weibull random variables with parameters (α, $\lambda_1$) and (α, $\lambda_2$), respectively. Therefore, $Y_i = \min\{X_{1i}, X_{2i}\}$ also follows a Weibull (α, λ) distribution, with $\lambda = \lambda_1 + \lambda_2$. Ignoring the cause of failure, the Approximate Maximum Likelihood Estimators (AMLEs) of µ and λ can be obtained as in Eq. (12). In the AMLE procedure, the MLEs of $\lambda_1$ and $\lambda_2$ are used as their AMLEs.
Optimal selection of censoring scheme using ABC algorithm: Once the values of n and m have been assumed or determined, the progressive censoring scheme $\{R_1, \ldots, R_m\}$ with $R_1 + \cdots + R_m = n - m$ is fixed. In practice, selecting an optimal censoring scheme is vital for obtaining maximum information on the unknown parameters. Unless n and m are fixed, the problem does not make much sense, since choosing n = m and making n larger would obviously provide more information on the unknown parameters. Moreover, in practical applications the sample size n and the effective sample size m are always fixed in advance. Therefore, the natural question is whether the progressive censoring scheme $\{R_1, \ldots, R_m\}$ should be selected for convenience or on some scientific basis. For fixed n and m, the possible censoring schemes are all possible choices of $R_1, \ldots, R_m$ as expressed in Eq. (13). Equation (13) denotes the selection of the particular scheme that provides maximum information on the unknown parameters among the available censoring schemes. Two questions usually arise. The first is how to define information measures of the unknown parameters based on particular progressively censored data, and the second is how to compare two information measures based on two different progressive censoring schemes. As a result, two important issues are involved in finding the optimal censoring scheme:
• Find a proper criterion
• Find the best censoring scheme based on this criterion
Both points are essential and neither of them is a trivial issue in this case.
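For fixed n and m, the search space consists of all vectors of non-negative integers $(R_1, \ldots, R_m)$ that sum to n - m. The sketch below enumerates this space for small problems; the function names are illustrative assumptions, and the count $\binom{n-1}{m-1}$ is the standard stars-and-bars result rather than a formula from the study.

```python
from math import comb


def censoring_schemes(n, m):
    """Yield every scheme (R_1, ..., R_m) of non-negative integers summing to n - m."""
    if m < 1 or n < m:
        raise ValueError("need 1 <= m <= n")

    def compose(total, parts):
        # All ways to split `total` into `parts` ordered non-negative integers.
        if parts == 1:
            yield (total,)
            return
        for first in range(total + 1):
            for rest in compose(total - first, parts - 1):
                yield (first,) + rest

    yield from compose(n - m, m)


def count_schemes(n, m):
    """Number of admissible schemes (stars and bars)."""
    return comb(n - 1, m - 1)


if __name__ == "__main__":
    for scheme in censoring_schemes(5, 3):
        print(scheme)
    print(count_schemes(5, 3), "schemes for n = 5, m = 3")
```

The count grows quickly with n and m, which is why exhaustive comparison becomes impractical and a heuristic search over the schemes is attractive.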
Karaboga introduced the ABC algorithm in 2005. The algorithm is motivated by the intelligent foraging behavior of honey bees. In the ABC algorithm, the colony of artificial bees contains three groups of bees: the employed bees, the onlookers and the scouts. A bee that waits on the dance area to decide which censoring scheme to choose is called an onlooker, and a bee that goes to the censoring scheme chosen by the onlooker is called an employed bee. A scout bee performs a random search to discover new sources. The position of a censoring scheme represents a possible solution to the optimization problem, and the value of the scheme corresponds to the quality (fitness) of the associated solution, estimated by Eq. (14), where i = number of censoring schemes and r = censoring schemes. The key steps of the ABC algorithm are as follows.
• Initialize the censoring schemes.
• Repeat:
• Place the employed bees on the censoring schemes.
• Place the onlooker bees on the censoring schemes depending on their nectar amounts.
• Send the scouts to the search area to discover new censoring schemes.
• Memorize the best censoring schemes found so far.
• Until the requirements are met.
The collective intelligence of honey bee swarms comprises three components: employed bees, onlooker bees and scout bees. There are two main behaviors.
Censoring schemes: To select a censoring scheme, a forager bee evaluates various properties. For simplicity, a single quality measure can be considered.

Employed bees:
An employed bee is assigned to a specific censoring scheme. Its function is to share the information about that particular censoring scheme with the other bees in the hive. The information carried by the bee includes the direction, the profitability and the distance.

Unemployed bees:
The unemployed bees are either onlooker bees or scout bees. An onlooker bee looks for censoring schemes with the help of the information provided by the employed bees. A scout bee searches for censoring schemes randomly in the environment.
Each cycle of the ABC algorithm consists of three steps. First, the employed bees are sent to the censoring schemes to evaluate their values; next, the onlooker bees use the information supplied by the employed bees to select censoring schemes; finally, the scout bees are sent out to find new censoring schemes. During the initialization stage, the values of the censoring schemes discovered by the bees are calculated. In the first step of the cycle, the employed bees return to the hive to share the information about the censoring schemes and their values with the bees waiting in the dance area, and the onlooker bees receive this information. Each employed bee then returns to the censoring scheme it visited previously and locates a neighboring censoring scheme by visual comparison.
In the second step of the cycle, an onlooker bee selects a censoring scheme based on the information supplied by the employed bees. The probability of choosing a censoring scheme increases with its quality. On arriving in the region, the onlooker bee selects a neighboring censoring scheme by using the information of the employed bee and by visually comparing the values, as the employed bee does. This visual comparison of values by the bees enables new censoring schemes to be discovered. In the third step of the cycle, new censoring schemes are sought once the censoring schemes already discovered have been exhausted by the bees. A scout bee randomly selects a new censoring scheme and replaces the old censoring scheme with it. The bee with the highest fitness value yields the best solution. The detailed description of the ABC algorithm is as follows:
• Initialize the schemes of the solutions $s_{i,j}$.
• Compute the population.
• Set cycle = 1; the cycle indicates an iterative value.
• Generate a solution $u_{i,j}$ in the neighborhood of $s_{i,j}$ and choose between the two based on the fitness.
• Calculate the probability values $P_i$ for the solutions $s_{i,j}$ using their fitness values.
• Estimate the fitness values of the solutions.
• Normalize the $P_i$ values into (0, 1).
Finally, the optimal censoring scheme is chosen depending on the optimal fitness in ABC.
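The cycle described above can be sketched as a small discrete ABC search over censoring schemes. This is a minimal illustration under several assumptions and not the implementation used in the study: the neighborhood move (shifting one withdrawn unit between two stages) is a discrete stand-in for the continuous ABC update, the fitness transform 1/(1 + cost) and the fitness-proportional onlooker selection follow the commonly cited ABC formulation, and `toy_cost` is a hypothetical objective that merely replaces the entropy and variance criterion. All names and default parameters are illustrative.

```python
import random


def random_scheme(n, m, rng):
    """Random scheme: m non-negative integers summing to n - m."""
    scheme = [0] * m
    for _ in range(n - m):
        scheme[rng.randrange(m)] += 1
    return scheme


def neighbour(scheme, rng):
    """Move one withdrawn unit from stage j to a different stage k."""
    donors = [j for j, r in enumerate(scheme) if r > 0]
    if len(scheme) < 2 or not donors:
        return scheme[:]
    j = rng.choice(donors)
    k = rng.choice([i for i in range(len(scheme)) if i != j])
    new = scheme[:]
    new[j] -= 1
    new[k] += 1
    return new


def fitness(cost):
    """Commonly used ABC fitness transform for a cost to be minimized."""
    return 1.0 / (1.0 + cost) if cost >= 0 else 1.0 + abs(cost)


def abc_select(cost_fn, n, m, n_sources=10, limit=20, cycles=300, seed=0):
    rng = random.Random(seed)
    sources = [random_scheme(n, m, rng) for _ in range(n_sources)]
    costs = [cost_fn(s) for s in sources]
    trials = [0] * n_sources
    b = min(range(n_sources), key=costs.__getitem__)
    best, best_cost = sources[b][:], costs[b]

    def try_improve(i):
        cand = neighbour(sources[i], rng)
        c = cost_fn(cand)
        if c < costs[i]:                       # greedy selection
            sources[i], costs[i], trials[i] = cand, c, 0
        else:
            trials[i] += 1

    for _ in range(cycles):
        for i in range(n_sources):             # employed bee phase
            try_improve(i)
        weights = [fitness(c) for c in costs]  # onlooker bee phase
        for i in rng.choices(range(n_sources), weights=weights, k=n_sources):
            try_improve(i)
        b = min(range(n_sources), key=costs.__getitem__)
        if costs[b] < best_cost:               # memorize the best scheme
            best, best_cost = sources[b][:], costs[b]
        for i in range(n_sources):             # scout bee phase
            if trials[i] > limit:
                sources[i] = random_scheme(n, m, rng)
                costs[i], trials[i] = cost_fn(sources[i]), 0
    return best, best_cost


def toy_cost(scheme):
    """Hypothetical objective: penalize late withdrawals (not the study's criterion)."""
    return sum(i * r for i, r in enumerate(scheme))


if __name__ == "__main__":
    print(abc_select(toy_cost, n=10, m=4))
```

In an actual application, `toy_cost` would be replaced by the negative joint entropy or by a variance-based measure computed for each candidate scheme; the neighborhood move keeps every candidate feasible because it never changes the sum of the scheme.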

RESULTS AND DISCUSSION
The proposed method uses the ABC algorithm to choose the optimal censoring scheme based on entropy and variance.

Dataset:
The censored dataset used here comes from stellar astronomy. The authors search for differences in the properties of stars that do and do not host extrasolar planetary systems. The study targets the abundances of the light elements Beryllium (Be) and Lithium (Li), which are believed to be depleted by internal stellar burning, so that surplus Be and Li should be present only in the planet accretion scenario of metal enrichment.
The columns of the dataset are as follows:
• Star name
• Sample: Type = 1 indicates planet-hosting stars; Type = 2 is the control sample
• Teff: stellar surface temperature (in kelvin)
• Log N (Be): log of the abundance of beryllium scaled to the Sun's abundance (i.e., the Sun has log N (Be) = 0.0)
• Measurement error of log N (Be), based on model fitting of the observed stellar spectrum
• Log N (Li): log of the abundance of lithium scaled to the Sun's abundance
From the values in Table 1, the hazard rate and reliability can be calculated. The values of the hazard rate and reliability are given in Table 2.
In this study, the selection of the optimal censoring scheme is based on the measures of entropy and variance. Table 3 presents the hazard rates that correspond to the entropy values. The optimal censoring scheme results when the values of the entropy and variance measures are large. An optimal censoring scheme gives improved reliability with a decreased hazard rate. This is shown in Fig. 1.

CONCLUSION
This study uses entropy and variance measures as the optimality criteria. The ABC algorithm is used to establish optimal schemes for some of the important lifetime distributions, with maximum entropy and variance as the optimality criterion. The optimal fitness enables the selection of the best censoring scheme. The experimental results indicate that the proposed methodology for selecting the optimal censoring scheme yields better reliability with a reduced hazard rate. Risk analysis is performed to assess the efficiency of the proposed method. The optimal censoring scheme obtained with the proposed approach can produce improved results with any dataset.

First
MLE as the AMLE of λ1 and λ2.
• Star name
• Sample: Type = 1 indicates planet-hosting stars; Type = 2 is the control sample
• Teff: stellar surface temperature (in kelvin)
• Log N (Be): log of the abundance of beryllium scaled to the Sun's abundance (i.e., the Sun has log N (Be) = 0.0)
• Measurement error of log N (Be), based on model fitting of the observed stellar spectrum
• Log N (Li): log of the abundance of lithium scaled to the Sun's abundance
The dataset consists of 39 stars known to host planets (plotted as filled circles) and 29 stars in a control sample (open circles).

Fig. 2: Risk estimation

The concept of Deficiency (Def) is necessary for comparing the different estimators of $\mathrm{VaR}_\alpha(X)$, since the efficiency of $\widehat{\mathrm{VaR}}_\alpha(X)$ depends on the joint efficiency of the estimators $\hat{\theta}$ and $\hat{\beta}$. The risk values are estimated using Eq. (15):

$$\widehat{\mathrm{VaR}}_\alpha(X) = \{-\ln(1 - \alpha)\}^{1/\hat{\beta}}\, \hat{\theta} \quad (15)$$

Table 4 presents the simulated mean and MSE of the different estimators of the Weibull parameters, with θ = 1 and β = 0.5. Table 5 and Fig. 2 show the risk values calculated for various α and β values. The proposed methodology chooses the optimal censoring criterion based on the entropy and variance measures by employing the ABC algorithm.
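Equation (15) is the α-quantile of a Weibull distribution with scale θ and shape β, evaluated at the parameter estimates. A minimal sketch of the computation is given below; the function name is an assumption, and the parameter values in the usage lines simply reuse θ = 1 and β = 0.5 from the simulation setting mentioned above.

```python
import math


def weibull_var(alpha, theta_hat, beta_hat):
    """Plug-in Value at Risk of Eq. (15): {-ln(1 - alpha)}^(1 / beta_hat) * theta_hat."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    if theta_hat <= 0.0 or beta_hat <= 0.0:
        raise ValueError("theta_hat and beta_hat must be positive")
    return theta_hat * (-math.log(1.0 - alpha)) ** (1.0 / beta_hat)


if __name__ == "__main__":
    for alpha in (0.90, 0.95, 0.99):
        print(alpha, weibull_var(alpha, theta_hat=1.0, beta_hat=0.5))
```

Because the exponent is 1/β̂, small shape estimates inflate the risk value strongly, which is one reason the joint efficiency of θ̂ and β̂ matters for the VaR estimator.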

Table 1: Sample data set (Santos et al., 2004)

... and hazard rate for different α and β values

The CASt censoring database (Santos et al., 2004; CASt dataset) depicted in Table 1 is used to illustrate the proposed work. Based on these values, the hazard rate and reliability values are computed.

Table 4: The simulated mean and MSEs of the different estimators of the Weibull parameters