Characteristics and Application of the NHPP Log-Logistic Reliability Model

In this paper, a Nonhomogeneous Poisson Process (NHPP) reliability model based on the two-parameter Log-Logistic (LL) distribution is considered. The essential characteristics of the model are derived and represented graphically. The parameters of the model are estimated by the Maximum Likelihood (ML) and Nonlinear Least Squares (NLS) estimation methods for the case of time domain data. An application demonstrating the flexibility of the considered model is conducted based on five real data sets and three evaluation criteria. We hope this model will serve as an alternative to other useful reliability models for describing real data in reliability engineering.


Introduction
The Log-Logistic (LL) distribution, which results from a simple transformation of the familiar logistic distribution, has been found useful in many areas such as engineering, reliability data analysis, economics and hydrology. In the literature, it is also well known as the Fisk distribution (Fisk, 1961). In some cases, the LL distribution has proved to be a good alternative to the log-normal distribution since it characterizes an increasing hazard rate function. Further, its use is well appreciated in the case of censored data, which are common in reliability and life-testing experiments. The Cumulative Distribution Function (CDF) and Probability Density Function (PDF) of the two-parameter LL distribution can be defined respectively as follows:

$$F(t) = \frac{(t/\alpha)^{\beta}}{1 + (t/\alpha)^{\beta}}, \quad t > 0, \tag{1}$$

and

$$f(t) = \frac{(\beta/\alpha)\,(t/\alpha)^{\beta-1}}{\left[1 + (t/\alpha)^{\beta}\right]^{2}}, \quad t > 0, \tag{2}$$

where α > 0 is the scale parameter, and β > 0 is the shape parameter.

The log-logistic reliability growth model is quite flexible for analyzing reliability data since it can capture the increasing/decreasing nature of the failure occurrence rate per fault. This property has attracted the attention of researchers. (Gokhale and Trivedi, 1998) considered the log-logistic reliability growth model. ML estimation of several existing finite-failure NHPP models, as well as of the log-logistic model, was conducted based on inter-failure times data. They presented an analysis using two real data sets, which encouraged the development of the log-logistic model. (Harishchandra, 2016) considered a software reliability model in which the time between two successive failures is assumed to follow the log-logistic distribution. The parameters of their model were estimated using the ML method in the cases of interval domain data and time domain data. A simulation study and real data were used to examine their model. The results showed that their model performs better than four previously suggested NHPP models.

The rest of the paper is organized as follows. In Section 2, we define the NHPP LL model and provide mathematical formulas and plots of its reliability characteristics. Parameter estimation by the ML and NLS methods is presented in Section 3. The evaluation criteria are presented in Section 4. An application to real data that illustrates the flexibility of the NHPP LL model is given in Section 5. Conclusions are presented in Section 6.

NHPP Log-Logistic Reliability Growth Model
One way to model software failure phenomena is the Nonhomogeneous Poisson Process (NHPP) family of models, characterized by the Mean Value Function (MVF) at time t, m(t). The derivative of the MVF is the failure intensity, h(t), of the software, which ordinarily decreases as faults are detected and removed. If N denotes the expected number of faults that would be detected given infinite testing time and F(t) is a distribution function, then the MVF as presented in (Lyu, 2002) is as follows:

$$m(t) = N\,F(t). \tag{3}$$

By inserting the LL CDF of Eq. (1) in Eq. (3), we obtain the MVF of the NHPP LL model as follows:

$$m(t) = \frac{N\,(t/\alpha)^{\beta}}{1 + (t/\alpha)^{\beta}}, \tag{4}$$

where t is the testing time (with observed failure times $t_i$, i = 1, 2, …, n), N is the number of initial errors, α > 0 is the scale parameter, and β > 0 is the shape parameter.

The failure intensity function corresponding to Eq. (4) is defined as:

$$h(t) = \frac{dm(t)}{dt} = \frac{N\,(\beta/\alpha)\,(t/\alpha)^{\beta-1}}{\left[1 + (t/\alpha)^{\beta}\right]^{2}}, \tag{5}$$

while the model's number of remaining errors function is given by:

$$n(t) = N - m(t) = \frac{N}{1 + (t/\alpha)^{\beta}}. \tag{6}$$

Its error detection rate function is given as follows:

$$d(t) = \frac{h(t)}{N - m(t)} = \frac{(\beta/\alpha)\,(t/\alpha)^{\beta-1}}{1 + (t/\alpha)^{\beta}}. \tag{7}$$

Additionally, the instantaneous mean time between failures (MTBF) is the inverse of the intensity function:

$$\mathrm{MTBF}_{I}(t) = \frac{1}{h(t)} = \frac{\alpha\left[1 + (t/\alpha)^{\beta}\right]^{2}}{N\,\beta\,(t/\alpha)^{\beta-1}}, \tag{8}$$

while the cumulative MTBF can be calculated by:

$$\mathrm{MTBF}_{C}(t) = \frac{t}{m(t)} = \frac{t\left[1 + (t/\alpha)^{\beta}\right]}{N\,(t/\alpha)^{\beta}}. \tag{9}$$

Lastly, we have the conditional reliability function as follows:

$$R(x \mid t) = \exp\left\{-\left[m(t + x) - m(t)\right]\right\}. \tag{10}$$

All the above reliability characteristics of the NHPP LL model are summarized in Table 1, while Figures 1 to 7 show plots of these characteristics for different selected values of the parameters.

Figure 1 shows that the shape of the intensity function varies with the selected value of the shape parameter, and that it reaches a higher peak for the larger value of the parameter N. Figure 2 illustrates the MVF, which represents the variation in the number of faults detected with respect to time. From this figure we can see that the number of faults detected grows quickly at the start of testing and later levels off; a larger value of the parameter N gives a higher MVF. The number of remaining errors function in Figure 3 decreases as the testing time increases; a smaller value of the parameter N gives a lower number of remaining errors. Figure 4 shows the effect of different parameter values on the error detection rate function: when the shape parameter is less than or equal to 1, the error detection rate function decreases monotonically, while when the shape parameter is greater than 1, it increases at the beginning before it starts to decline. In Figure 5, the conditional reliability function decreases with the progress of time; how sharply it decreases depends on the selected parameter values, and a larger value of the parameter N gives lower reliability. The instantaneous and cumulative MTBF functions in Figure 6 and Figure 7, respectively, either increase rapidly with the progress of testing time or show an initial decrease before they start to increase; in both cases a larger value of the parameter N gives a lower MTBF.
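Table 1. Listing of the NHPP LL model's characteristics.

Mean value function (MVF): $m(t) = N\,(t/\alpha)^{\beta} / \left[1 + (t/\alpha)^{\beta}\right]$
Intensity function: $h(t) = N\,(\beta/\alpha)\,(t/\alpha)^{\beta-1} / \left[1 + (t/\alpha)^{\beta}\right]^{2}$
Number of remaining errors function (NRE): $n(t) = N / \left[1 + (t/\alpha)^{\beta}\right]$
Error detection rate function: $d(t) = (\beta/\alpha)\,(t/\alpha)^{\beta-1} / \left[1 + (t/\alpha)^{\beta}\right]$
Instantaneous MTBF (I-MTBF): $1/h(t)$
Cumulative MTBF: $t/m(t)$
Conditional reliability function: $R(x \mid t) = \exp\{-[m(t+x) - m(t)]\}$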
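The characteristics of this section can be evaluated numerically. The following Python sketch assumes the LL parametrization $F(t) = (t/\alpha)^{\beta} / [1 + (t/\alpha)^{\beta}]$ with scale α and shape β; the function names and the parameter values in the short demonstration are illustrative choices, not taken from the paper.

```python
import math


def ll_cdf(t, alpha, beta):
    """Log-logistic CDF, assumed form F(t) = u / (1 + u) with u = (t/alpha)**beta."""
    u = (t / alpha) ** beta
    return u / (1.0 + u)


def mvf(t, N, alpha, beta):
    """Mean value function m(t) = N * F(t)."""
    return N * ll_cdf(t, alpha, beta)


def intensity(t, N, alpha, beta):
    """Failure intensity h(t) = dm/dt, valid for t > 0."""
    u = (t / alpha) ** beta
    # (beta/alpha) * (t/alpha)**(beta - 1) equals (beta / t) * u
    return N * (beta / t) * u / (1.0 + u) ** 2


def remaining_errors(t, N, alpha, beta):
    """Number of remaining errors N - m(t)."""
    return N - mvf(t, N, alpha, beta)


def detection_rate(t, N, alpha, beta):
    """Error detection rate h(t) / (N - m(t))."""
    return intensity(t, N, alpha, beta) / remaining_errors(t, N, alpha, beta)


def instantaneous_mtbf(t, N, alpha, beta):
    """Instantaneous MTBF, the inverse of the intensity."""
    return 1.0 / intensity(t, N, alpha, beta)


def cumulative_mtbf(t, N, alpha, beta):
    """Cumulative MTBF t / m(t)."""
    return t / mvf(t, N, alpha, beta)


def conditional_reliability(x, t, N, alpha, beta):
    """Probability of no failure in (t, t + x]: exp(-(m(t + x) - m(t)))."""
    return math.exp(-(mvf(t + x, N, alpha, beta) - mvf(t, N, alpha, beta)))


# Illustrative parameter values only (hypothetical, not estimates from the paper).
N, alpha, beta = 50.0, 10.0, 2.0
for t in (1.0, 5.0, 10.0, 20.0):
    print(t, mvf(t, N, alpha, beta), intensity(t, N, alpha, beta))
```

Evaluating these functions over a grid of t values for several (N, α, β) combinations is one way to reproduce the qualitative behaviour described for the figures, for example the hump in the error detection rate when β > 1.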

Maximum Likelihood Estimation
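Suppose that we have n observed failure times, denoted by $t_1, t_2, \ldots, t_n$; then the likelihood function of the NHPP model can be written as follows:

$$L(\Theta \mid S) = e^{-m(t_n)} \prod_{i=1}^{n} h(t_i), \tag{11}$$

where Θ is the vector of the NHPP model's parameters, S denotes the observed failure times, and m(t) and h(t) are, respectively, the NHPP model's mean value and intensity functions.

To simplify the mathematical computations, we take the natural logarithm of both sides of Eq. (11):

$$\ln L(\Theta \mid S) = -m(t_n) + \sum_{i=1}^{n} \ln h(t_i). \tag{12}$$

By substituting Eqs. (4) and (5) in Eq. (12), the log-likelihood function of the NHPP LL model, with number of initial errors N, scale parameter α and shape parameter β, can be written as:

$$\ln L = -\frac{N\,(t_n/\alpha)^{\beta}}{1 + (t_n/\alpha)^{\beta}} + \sum_{i=1}^{n} \ln\left[\frac{N\,(\beta/\alpha)\,(t_i/\alpha)^{\beta-1}}{\left[1 + (t_i/\alpha)^{\beta}\right]^{2}}\right], \tag{13}$$

which simplifies to

$$\ln L = -\frac{N\,(t_n/\alpha)^{\beta}}{1 + (t_n/\alpha)^{\beta}} + n \ln N + n \ln \beta - n\beta \ln \alpha + (\beta - 1)\sum_{i=1}^{n} \ln t_i - 2\sum_{i=1}^{n} \ln\left[1 + (t_i/\alpha)^{\beta}\right]. \tag{14}$$

Differentiating Eq. (14) with respect to N, α and β, we have:

$$\begin{aligned}
\frac{\partial \ln L}{\partial N} &= \frac{n}{N} - \frac{(t_n/\alpha)^{\beta}}{1 + (t_n/\alpha)^{\beta}}, \\
\frac{\partial \ln L}{\partial \alpha} &= \frac{\beta}{\alpha}\left[\frac{N\,(t_n/\alpha)^{\beta}}{\left[1 + (t_n/\alpha)^{\beta}\right]^{2}} - n + 2\sum_{i=1}^{n} \frac{(t_i/\alpha)^{\beta}}{1 + (t_i/\alpha)^{\beta}}\right], \\
\frac{\partial \ln L}{\partial \beta} &= -\frac{N\,(t_n/\alpha)^{\beta} \ln(t_n/\alpha)}{\left[1 + (t_n/\alpha)^{\beta}\right]^{2}} + \frac{n}{\beta} + \sum_{i=1}^{n} \ln(t_i/\alpha) - 2\sum_{i=1}^{n} \frac{(t_i/\alpha)^{\beta} \ln(t_i/\alpha)}{1 + (t_i/\alpha)^{\beta}}.
\end{aligned} \tag{15}$$

Setting the three expressions of Eq. (15) to zero, we get the following system of equations:

$$\begin{aligned}
\hat{N} &= \frac{n\left[1 + (t_n/\hat{\alpha})^{\hat{\beta}}\right]}{(t_n/\hat{\alpha})^{\hat{\beta}}}, \\
0 &= \frac{\hat{N}\,(t_n/\hat{\alpha})^{\hat{\beta}}}{\left[1 + (t_n/\hat{\alpha})^{\hat{\beta}}\right]^{2}} - n + 2\sum_{i=1}^{n} \frac{(t_i/\hat{\alpha})^{\hat{\beta}}}{1 + (t_i/\hat{\alpha})^{\hat{\beta}}}, \\
0 &= -\frac{\hat{N}\,(t_n/\hat{\alpha})^{\hat{\beta}} \ln(t_n/\hat{\alpha})}{\left[1 + (t_n/\hat{\alpha})^{\hat{\beta}}\right]^{2}} + \frac{n}{\hat{\beta}} + \sum_{i=1}^{n} \ln(t_i/\hat{\alpha}) - 2\sum_{i=1}^{n} \frac{(t_i/\hat{\alpha})^{\hat{\beta}} \ln(t_i/\hat{\alpha})}{1 + (t_i/\hat{\alpha})^{\hat{\beta}}}.
\end{aligned} \tag{16}$$

The second and third expressions of Eq. (16) do not have a closed-form solution, so numerical methods are needed to obtain the ML estimates $\hat{\alpha}$ and $\hat{\beta}$; then, by substituting $\hat{\alpha}$ and $\hat{\beta}$ in the first expression, $\hat{N}$ can be obtained.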
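The ML procedure can be illustrated with a small numerical sketch. It assumes the LL parametrization $F(t) = (t/\alpha)^{\beta} / [1 + (t/\alpha)^{\beta}]$ and the log-likelihood $\ln L = -m(t_n) + \sum_i \ln h(t_i)$. For fixed α and β, the maximizing value of N is $n / F(t_n)$, so the sketch profiles N out and runs a coarse grid search over (α, β). The grid search is a simple stand-in for the numerical solver, which the paper does not specify; the function names and the failure times are hypothetical.

```python
import math


def log_likelihood(times, N, alpha, beta):
    """ln L = -m(t_n) + sum_i ln h(t_i) for ordered failure times (all > 0)."""
    u_n = (times[-1] / alpha) ** beta
    total = -N * u_n / (1.0 + u_n)
    for t in times:
        log_u = beta * math.log(t / alpha)
        # ln h(t) = ln N + ln(beta / t) + ln u - 2 ln(1 + u)
        total += (math.log(N) + math.log(beta / t) + log_u
                  - 2.0 * math.log1p(math.exp(log_u)))
    return total


def profile_N(times, alpha, beta):
    """Value of N that maximizes ln L for fixed alpha and beta: n / F(t_n)."""
    u_n = (times[-1] / alpha) ** beta
    return len(times) * (1.0 + u_n) / u_n


def fit_ml_grid(times, alphas, betas):
    """Coarse grid search over (alpha, beta) with N profiled out."""
    best = None
    for a in alphas:
        for b in betas:
            N = profile_N(times, a, b)
            ll = log_likelihood(times, N, a, b)
            if best is None or ll > best[0]:
                best = (ll, N, a, b)
    return best


# Hypothetical failure times (not one of the paper's data sets).
times = [3.0, 8.0, 15.0, 24.0, 36.0, 50.0, 70.0, 95.0, 130.0, 180.0]
alphas = [10.0 * k for k in range(1, 31)]
betas = [0.5 + 0.1 * k for k in range(0, 26)]
ll, N_hat, alpha_hat, beta_hat = fit_ml_grid(times, alphas, betas)
print(ll, N_hat, alpha_hat, beta_hat)
```

A grid only locates the maximum to within the grid spacing, and a maximum on the edge of the grid suggests the search range should be widened. In practice the grid result would typically serve as a starting point for a Newton-type or other iterative solver.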

Nonlinear Least Squares Estimation
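Assume that $(t_1, y_1), (t_2, y_2), \ldots, (t_n, y_n)$ are n pairs of observations. The model to be fitted to these data is:

$$y_i = f(t_i, \theta) + \varepsilon_i, \quad i = 1, \ldots, n, \tag{17}$$

where θ is the parameter vector and $\varepsilon_i$ is the error term. The errors $\varepsilon_i$ are assumed to be independent normal random variables, $N(0, \sigma^{2})$, where $\sigma^{2}$ is the variance of the normal distribution. The NLS estimation method determines the values of the unknown parameters that minimize:

$$Q(\theta) = \sum_{i=1}^{n} \left[y_i - f(t_i, \theta)\right]^{2}. \tag{18}$$

By substituting Eq. (4), our considered fitting function, in Eq. (18), the NLS estimates of the NHPP LL model's parameters are obtained by minimizing:

$$Q(N, \alpha, \beta) = \sum_{i=1}^{n} \left[y_i - \frac{N\,(t_i/\alpha)^{\beta}}{1 + (t_i/\alpha)^{\beta}}\right]^{2}. \tag{19}$$

Differentiating Eq. (19) with respect to N, α and β, then equating the resulting expressions to zero, yields the following system of equations, where $F(t_i) = (t_i/\alpha)^{\beta} / [1 + (t_i/\alpha)^{\beta}]$:

$$\hat{N} = \frac{\sum_{i=1}^{n} y_i\,F(t_i)}{\sum_{i=1}^{n} F(t_i)^{2}}, \tag{20}$$

$$\sum_{i=1}^{n} \left[y_i - N\,F(t_i)\right] \frac{(t_i/\alpha)^{\beta}}{\left[1 + (t_i/\alpha)^{\beta}\right]^{2}} = 0, \tag{21}$$

$$\sum_{i=1}^{n} \left[y_i - N\,F(t_i)\right] \frac{(t_i/\alpha)^{\beta} \ln(t_i/\alpha)}{\left[1 + (t_i/\alpha)^{\beta}\right]^{2}} = 0. \tag{22}$$

The estimates of the parameters α and β can be obtained by solving the nonlinear Eqs. (21) and (22) numerically; then, by substituting these estimates in Eq. (20), the estimate of the parameter N can be obtained.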
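The NLS procedure can be sketched in the same spirit. Because the fitting function is linear in N, the least squares value of N for fixed α and β is $\sum y_i F(t_i) / \sum F(t_i)^{2}$, assuming the LL parametrization $F(t) = (t/\alpha)^{\beta} / [1 + (t/\alpha)^{\beta}]$. The sketch below profiles N out and searches a grid over (α, β); the grid search replaces the unspecified numerical solver, and the function names and data are hypothetical.

```python
def ll_cdf(t, alpha, beta):
    """Log-logistic CDF, assumed form F(t) = u / (1 + u) with u = (t/alpha)**beta."""
    u = (t / alpha) ** beta
    return u / (1.0 + u)


def profile_N(ts, ys, alpha, beta):
    """Least squares N for fixed alpha and beta: sum(y*F) / sum(F^2)."""
    F = [ll_cdf(t, alpha, beta) for t in ts]
    return sum(y * f for y, f in zip(ys, F)) / sum(f * f for f in F)


def sse(ts, ys, N, alpha, beta):
    """Sum of squared residuals between observed counts and N * F(t)."""
    return sum((y - N * ll_cdf(t, alpha, beta)) ** 2 for t, y in zip(ts, ys))


def fit_nls_grid(ts, ys, alphas, betas):
    """Coarse grid search over (alpha, beta) with N profiled out."""
    best = None
    for a in alphas:
        for b in betas:
            N = profile_N(ts, ys, a, b)
            q = sse(ts, ys, N, a, b)
            if best is None or q < best[0]:
                best = (q, N, a, b)
    return best


# Hypothetical data: cumulative fault counts ys observed at times ts.
ts = [10.0, 20.0, 40.0, 60.0, 90.0, 130.0, 180.0]
ys = [2, 5, 11, 16, 21, 25, 28]
alphas = [5.0 * k for k in range(1, 41)]
betas = [0.25 * k for k in range(2, 17)]
q, N_hat, alpha_hat, beta_hat = fit_nls_grid(ts, ys, alphas, betas)
print(q, N_hat, alpha_hat, beta_hat)
```

Profiling N reduces the search from three dimensions to two, which keeps even a brute-force grid cheap. A dedicated nonlinear least squares routine would normally refine the grid result.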

Numerical Application
To illustrate the estimation procedures and examine the considered model, an analysis of real software failure data sets is carried out in this section.

Description of Datasets
The data used in our analysis were collected by (Musa, 1980) of Bell Telephone Laboratories, Cyber Security and Information Systems Information Analysis Centre (CSIAC). Tables 2 to 6 present the five selected data sets.

Three evaluation criteria are used in the application. The variation between the predicted and actual values of the observations is calculated by the Mean Square Error (MSE) as follows (Hwang and Pham, 2009):

$$\mathrm{MSE} = \frac{\sum_{i=1}^{n} \left[\hat{m}(t_i) - y_i\right]^{2}}{n - k}, \tag{23}$$

where n is the number of observations, k is the number of unknown model parameters, $y_i$ denotes the number of faults observed up to time $t_i$, and $\hat{m}(t_i)$ denotes the estimated number of faults detected up to time $t_i$ according to the considered model, for i = 1, 2, …, n. A lower MSE indicates less fitting error and thus better performance.

The Theil Statistic (TS) is the average deviation percentage over all periods with regard to the actual values. The closer TS is to zero, the better the prediction capability of the model. It is defined as (Li et al., 2005):

$$\mathrm{TS} = \sqrt{\frac{\sum_{i=1}^{n} \left[\hat{m}(t_i) - y_i\right]^{2}}{\sum_{i=1}^{n} y_i^{2}}} \times 100\%. \tag{24}$$

The coefficient of multiple determination, $R^{2}$, measures how much of the variation in the observed data is explained by the fitted model. It is defined as follows (Xie and Yang, 2003):

$$R^{2} = 1 - \frac{\sum_{i=1}^{n} \left[y_i - \hat{m}(t_i)\right]^{2}}{\sum_{i=1}^{n} \left(y_i - \bar{y}\right)^{2}}, \tag{25}$$

where $\bar{y}$ is the mean of the observed values. It ranges from 0 to 1. The larger $R^{2}$ is, the better the model fits the data.
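The three criteria are simple to compute once fitted values are available. The sketch below assumes their commonly used forms: MSE as the residual sum of squares divided by n − k, TS as the square root of the residual sum of squares over the sum of squared observations, expressed as a percentage, and $R^{2}$ as one minus the ratio of residual to total sum of squares. The function names and the toy numbers are illustrative and are not results from the paper.

```python
import math


def mse(y, m_hat, k):
    """Mean square error with n - k degrees of freedom (k = number of parameters)."""
    n = len(y)
    return sum((m - a) ** 2 for a, m in zip(y, m_hat)) / (n - k)


def theil_statistic(y, m_hat):
    """Theil statistic in percent: 100 * sqrt(sum (m - y)^2 / sum y^2)."""
    num = sum((m - a) ** 2 for a, m in zip(y, m_hat))
    den = sum(a ** 2 for a in y)
    return 100.0 * math.sqrt(num / den)


def r_squared(y, m_hat):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    y_bar = sum(y) / len(y)
    ss_res = sum((a - m) ** 2 for a, m in zip(y, m_hat))
    ss_tot = sum((a - y_bar) ** 2 for a in y)
    return 1.0 - ss_res / ss_tot


# Toy observed and fitted cumulative fault counts (hypothetical values).
y_obs = [1.0, 2.0, 3.0, 4.0, 5.0]
y_fit = [1.0, 2.0, 3.0, 4.0, 6.0]
print(mse(y_obs, y_fit, k=3), theil_statistic(y_obs, y_fit), r_squared(y_obs, y_fit))
```

For the NHPP LL model, k = 3 (N, α and β), and the fitted values would be the estimated MVF evaluated at the observation times.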

Numerical Results and Analysis
The parameter estimates and evaluation criteria results of the NHPP LL model for the five considered data sets using the ML and NLS estimation methods are shown in Table 7 and Table 8, respectively. By comparing the results in these two tables, it is clear that the NHPP LL model provides better values of the MSE, R², and TS criteria when using the NLS estimation method in all cases. For both estimation methods, the ranking of the data sets varies with the evaluation criterion: according to the MSE criterion, the NHPP LL model performs best for Ds-1, while according to TS and R² it performs best for Ds-3. According to all considered criteria, the NHPP LL model performs worst for Ds-5.

Conclusion
As software has become more diverse and widespread, software reliability has become a key concern in the software development process. During the last 47 years, numerous reliability models have been proposed (see Yamada et al., 1983; Goel, 1985; Cai and Lyu, 2007; Yamada, 2013). These models are used to measure software reliability through several characteristics such as the number of remaining errors, the error detection rate, and the mean time between failures. In this paper, we have considered an NHPP model based on the log-logistic distribution, which can capture the increasing/decreasing nature of the hazard function. Several essential characteristics of the studied model, the NHPP LL model, have been obtained and represented graphically. The model's parameters have been estimated using the ML and NLS estimation methods. An application has been conducted using five real data sets and three different evaluation criteria. The considered model displays acceptable performance for the studied real data sets, particularly in the case of Ds-1 and Ds-3. The findings reveal that the NHPP LL model gives a reasonable predictive capability for the studied real failure data.

Figure 1. Plots of the NHPP LL model's intensity function for some selected values of parameters (solid lines indicate N = 50 and dashed lines indicate N = 100).

Figure 3. Plots of the NHPP LL model's NRE function for some selected values of parameters.

Figure 6. Plots of the NHPP LL model's I-MTBF function for some selected values of parameters.

Table 7. Estimated parameter values and comparison criteria results using the MLE method.

Table 8. Estimated parameter values and comparison criteria results using the NLSE method.