Journal of Industrial Engineering and Management Science

Vol: 2016    Issue: 1

Published In:   January 2018

Software Reliability Model Selection Based on Deep Learning with Application to the Optimal Release Problem

Article No: 3    Page: 43-58    doi: https://doi.org/10.13052/jiems2446-1822.2016.003

 1 2 3 4 5 6 7

Software Reliability Model Selection
Based on Deep Learning with
Application to the Optimal
Release Problem

• 1Information Science and Engineering Section, Department of Engineering, Graduate School of Sciences and Technology for Innovation, Yamaguchi University, Ube-shi, Japan
• 2Department of Social Management Engineering, Graduate School of Engineering, Tottori University, Tottori-shi, Japan

Received 1 September 2016; Accepted 10 October 2016
Publication 21 October 2016

Abstract

In the past, many software reliability models have been proposed by several researchers. Also, several model selection criteria such as Akaike’s information criterion, mean square errors, predicted relative error and so on, have been used for the selection of optimal software reliability models. These assessment criteria can be useful for the software managers to assess the past trend of fault data. However, it is very important to assess the prediction accuracy of model after the end of fault data observation in the actual software project. In this paper, we propose a method of optimal software reliability model selection based on the deep learning. Moreover, we show several numerical examples of software reliability assessment in the actual software projects. In particular, we discuss the optimal release time and total expected software cost in terms of the model selection based on the deep learning.

Keywords

• Software reliability model
• optimal model selection
• deep learning
• optimal release time
• software cost

1 Introduction

In actual software development projects, the comprehensive use of the technologies and methodologies in software engineering is needed for improving software quality/reliability, i.e., requirement specification, design, coding, and, testing. In particular, the waterfall model is well known as the sequential phases in software development process. In the past, many software reliability models [13] have been applied to assess the reliability for quality management and testing-progress control of software development. However, it is difficult for the software managers to select the optimal software reliability model for the actual software development project. As an example, the software managers can assess the software reliability for the past data sets by using usual model evaluation criteria. On the other hand, the estimation results based on the past fault data cannot be guaranteed for the future data sets of actual software projects.

The selection method of the optimal software reliability model based on the deep learning is proposed in this paper. Also, several numerical examples of software reliability assessment by using the fault data in the actual software projects are shown. Moreover, we compare the methods to estimate the cumulative numbers of detected faults based on the deep learning with that based on neural network. Furthermore, we discuss the optimal release time and total expected software cost in terms of the model selection.

2 Software Reliability Growth Models

Many software reliability models have been applied to assess the reliability for quality management and testing-progress control of software development. As an example, we show the typical software reliability models with the mean value function representing the expected number of detected faults during (0, t) as follows:

Exponential NHPP (nonhomogeneous Poisson process) model

$E(t)=a(1−e−bt), (1)$

Delayed S-shaped NHPP model

$S(t)=a{1−(1+bt)e−bt}, (2)$

Logarithmic Poisson execution time model

$μ(t)=1θln[λ0θt+1], (3)$

Exponential SDE (stochastic differential equation) model

$SDEe(t)=a{ 1−e−bt+σ22t }, (4)$

S-shaped SDE model

$SDEs(t)=a{ 1−(1+bt)e−bt+σ22t }, (5)$

where a is the expected number of initial inherent faults, b a fault-detection rate per unit time per fault, λ0 the intensity of initial inherent failure, 𝜃 the reduction rate of the failure intensity rate per inherent fault, and σ a positive constant representing a magnitude of the irregular fluctuation. In this paper, we discuss above software reliability growth models.

3 Optimal Software Release Problem

Several optimal software release problems considering software development process have been proposed by many researchers [4, 5]. It is interesting for software developers to predict and estimate the time when we should stop testing in order to develop a highly reliable software system efficiently.

We define the following:

c1: the fixing cost per fault during the testing phase,

c2: the cost per unit time during the testing phase,

c3: the maintenance cost per fault after the testing phase.

Then, the expected software cost in the testing phase can be formulated as:

$C1(t)=c1H(t)+c2t, (6)$

where H(t) means the mean value function of software reliability growth model.

Also, the expected maintenance cost after the release of software is represented as follows:

$C2(t)=c3{ a−H(t) } . (7)$

Consequently, from Equations (6) and (7), the total expected software cost is given by

$C(t)=C1(t)+C2(t). (8)$

The optimum release time t is obtained by minimizing C(t) in Equation (8).

4 Optimal Software Reliability Model Selection Based on Neural Network

Let $wij1$(i = 1,2,…,I; j = 1,2,…,J) be the connection weights from i-th unit on the sensory layer to j-th unit on the association layer, and $wjk2$(j = 1,2,…,J;k = 1,2,…,K) denote the connection weights from j-th unit on the association layer to k-th unit on the response layer. Moreover, xi(i = 1,2,…,I) represent the normalized input values of i-th unit on the sensory layer, and yk(k = 1,2,…,K) are the output values. We apply the actual number of detected faults per unit time Ni(i = 1,2,…,I) to the input values xi(i = 1,2,…,I).

Considering the amount of characteristics for the software reliability models, we apply the following amount of information as parameters λi(i = 1,2,…,I) to the input data xi(i = 1,2,…,I).

• The estimated values of Akaike’s information criterion
• The estimated values of mean square errors
• The estimate of all parameters included in model
• The estimated error between the estimated and actual values of the total number of detected faults in the testing time point of 25%, 50%, and 75% for data sets

The input-output rules of each unit on each layer are given by

$hj=f(∑i=1Iwij1xi), (9) yk=f(∑j=1Jwjk2hj), (10)$

where a logistic activation function f(⋅) which is widely-known as a sigmoid function given by the following equation:

$f(x)=11+e−θx, (11)$

where 𝜃 is the gain of sigmoid function. We apply the multi-layered neural networks by back-propagation in order to learn the interaction among input and output [6]. We define the error function given by the following equation:

$E=12∑k=1K(yk−dk)2, (12)$

where dk(k = 1,2,…,K) are the target input values for the output values. We apply 5 kinds of model type to the target input values dk(k = 1,2,…,5) for the output values, i.e., the exponential NHPP model, the delayed S-shaped NHPP model, the Logarithmic Poisson execution time model, the exponential SDE model, and the S-shaped SDE model, respectively. Then, the number of units I in sensory layer is 37 because of 5 models.

5 Optimal Software Reliability Model Selection Based on Deep Learning

The structure of the deep learning in this paper is shown in Figure 1. In Figure 1, zl(l = 1,2,…,L) and zm(m = 1,2,…,M) mean the pre-training units. Also, on(n = 1,2,…,N) is the amount of compressed characteristics. Several algorithms in terms of deep learning have been proposed [712]. In this paper, we apply the deep neural network to learn the software testing phase.

Figure 1 The structure of deep learning.

As with the neural network, we apply the 4 characteristic quantities of pre-training units discussed above. Moreover, we apply 5 kinds of model type to the amount of compressed characteristics, i.e., the exponential NHPP model, the delayed S-shaped NHPP model, the Logarithmic Poisson execution time model, the exponential SDE model, and the S-shaped SDE model, respectively.

6 Numerical Examples

We focus on actual software project data sets [13] in order to assess the performance of our method.

6.1 Estimation Results Based on Trend Analysis for Each Model

We apply 5 models as the typical two kinds of curve types, i.e., exponential and S-shaped reliability growth model curves for the total number of detected faults, to the actual fault data sets. The estimation results by using 90% of data set for each software development project are shown in Figures 2–4. Similarly, the estimation results by using 80% of data set for each software development project are shown in Figures 5–7. From Figures 2–7, we can confirm that the model best fitted for the past data are not always fitted for the future in fact. Then, it is necessary for the software managers to offer the model best fitted for the future.

6.2 Estimated Results of the Optimal Model Based on the Deep Learning

The estimated results of the optimal model based on the neural network and deep learning are shown in Tables 1 and 2. From Tables 1 and 2, we found that the estimated recognition rates based on the deep learning perform better than that of the neural network. In particular, the estimation results based on the deep learning give a high-recognition rates in terms of the type of model. Moreover, the estimation results based on the deep learning for the long-term prediction by using 80% of data sets perform significantly better than that of neural network.

Figure 2 The estimation results by using 90% of data set of software project 1.

Figure 3 The estimation results by using 90% of data set of software project 2.

Figure 4 The estimation results by using 90% of data set of software project 3.

Figure 5 The estimation results by using 80% of data set of software project 1.

Figure 6 The estimation results by using 80% of data set of software project 2.

Figure 7 The estimation results by using 80% of data set of software project 3.

Table 1 The comparison of prediction accuracy after 90% of actual fault data

 Accurate Model Model Selected by Neural Network Model Selected by Deep Learning S-shape (NHPP) Exp (SDE) Exp (NHPP) S-shape (SDE) Exp (SDE) S-shape (SDE) S-shape (SDE) Exp (SDE) S-shape (NHPP) Recognition Rate by Neural Network Recognition Rate by Deep Learning Type of Model 0% 67% Name of Model 0% 33%

Table 2 The comparison of prediction accuracy after 80% of actual fault data

 Accurate Model Model Selected by Neural Network Model Selected by Deep Learning Exp (NHPP) Exp (SDE) Log/Exp (NHPP) S-shape (NHPP) Exp (SDE) Exp (SDE) S-shape (SDE) Exp (SDE) S-shape (SDE) S-shape (SDE) S-shape (SDE) S-shape (SDE) S-shape (NHPP) S-shape (SDE) S-shape (SDE) S-shape (SDE) S-shape (SDE) S-shape (SDE) Recognition Rate by Neural Network Recognition Rate by Deep Learning Type of Model 67% 83% Name of Model 33% 50%

6.3 Discussion of the Optimal Release Time

We show the comparison results of the estimated optimal software release time and total expected software cost for each software project data in Tables 3–5, respectively. Then, we focus on 4 models without the logarithmic Poisson execution time model because this model assumes the initial inherent faults of infinity. Considering the prediction accuracy after 90% of actual fault data, the best fitted models are “S-shape (NHPP)” and “S-shape (SDE)” as shown in Table 1. Then, we found that “S-shape (NHPP)” and “S-shape (SDE)” is the shortest period of release time and the minimal cost from Tables 3–5. From these results, it is very important to decide the best model in terms of optimal release time, because the model selection has a great influence on the optimal release time and total software cost.

Table 3 The comparison of the estimated optimal software release time and total expected software cost in case of software project 1

 Models Optimal Release Time Total Expected Software Cost Exp (NHPP) 46.070 6827.9 S-shape (NHPP) 22.112 5528.0 Exp (SDE) 45.642 6791.7 S-shape (SDE) 27.040 5832.1

Table 4 The comparison of the estimated optimal software release time and total expected software cost in case of software project 2

 Models Optimal Release Time Total Expected Software Cost Exp (NHPP) 678.35 16872.7 S-shape (NHPP) 74.124 3331.3 Exp (SDE) 229.46 6429.4 S-shape (SDE) 68.989 3193.9

Table 5 The comparison of the estimated optimal software release time and total expected software cost in case of software project 3

 Models Optimal Release Time Total Expected Software Cost Exp (NHPP) 116.99 6415.2 S-shape (NHPP) 29.124 2951.2 Exp (SDE) 65.768 4155.1 S-shape (SDE) 31.475 3051.7

7 Conclusion

In the past, many software reliability models have been proposed. In fact, these software reliability models have been applied to many software development projects. However, it is very difficult for the software managers to select the optimal software reliability model for the actual software development project. As an example, the software managers can assess the software reliability for the past data sets by using usual model evaluation criteria such as the Akaike’s information criterion, mean square errors, predicted relative errors, and so on. On the other hand, the estimation results based on the past fault data cannot guarantee for the prediction of future data sets in actual software projects.

This paper focuses on the learning for the software development projects. We have proposed the selection method of optimal software reliability model based on the deep learning. In particular, it is difficult to assess the reliability by only using the past fault data, because the estimation results based on the past fault data cannot be guaranteed for the future data sets of actual software projects. In this paper, we have compared the methods of reliability assessment based on neural network with that of deep learning. Moreover, we have shown that the proposed method based on the deep learning can assess better than that based on neural network. Thereby, we have found that our method can assess the software reliability in the future with high accuracy based on the past software project data.

Furthermore, we have discussed the optimal release time and total expected software cost in terms of the model selection based on the deep learning. In particular, we have found that it is very important to decide the best model in terms of optimal release time, because the model selection has a great influence on the optimal release time and total software cost.

8 Acknowledgments

This work was supported in part by the Telecommunications Advancement Foundation in Japan, the Okawa Foundation for Information and Telecommunications in Japan, and the JSPS KAKENHI Grant No. 15K00102 and No. 16K01242 in Japan.

References

[1] Lyu, M. R., ed. (1996). Handbook of Software Reliability Engineering. Los Alamitos, CA: IEEE Computer Society Press.

[2] Yamada, S. (2014). Software Reliability Modeling: Fundamentals and Applications. Tokyo/Heidelberg: Springer.

[3] Kapur, P. K., Pham, H., Gupta, A., and Jha, P. C. (2011). Software Reliability Assessment with or Applications. London: Springer.

[4] Yamada, S., and Osaki, S. (1985). Cost-reliability optimal software release policies for software systems. IEEE Trans. Reliab. R-34, 422–424.

[5] Yamada, S., and Osaki, S. (1987). Optimal software release policies with simultaneous cost and reliability requirements. Eur J. Operat. Res. 31, 46–51.

[6] Karnin, E. D. (1990). A simple procedure for pruning back-propagation trained neural networks. IEEE Trans. Neural Netw. 1, 239–242.

[7] Kingma, D. P., Rezende, D. J., Mohamed, S., Welling, M. (2014). “Semi-supervised learning with deep generative models,” in Proceedings of Neural Information Processing Systems, 3581–3589.

[8] Blum, A., Lafferty, J., Rwebangira, M. R., and Reddy, R. (2004). “Semi-supervised learning using randomized mincuts,” in Proceedings of the International Conference on Machine Learning, 1–13.

[9] George, E. D., Dong, Y., Li, D., and Alex, A. (2012). Context-dependent pre-trained deep neural networks for large-vocabulary speech recognition. IEEE Trans. Audio Speech Lang. Process. 20: 30–42.

[10] Vincent, P., Larochelle, H., Lajoie, I., Bengio, Y., and Manzagol, P. A. (2010). Stacked denoising autoencoders: learning useful representations in a deep network with a local denoising criterion. J. Mach. Learn. Res. 11, 3371–3408.

[11] Martinez, H. P., Bengio, Y., and Yannakakis, G. N. (2013). Learning deep physiological models of affect. IEEE Comput. Intell. Mag. 8, 20–33.

[12] Hutchinson, B., Deng, L., and Yu, D. (2013). Tensor deep stacking networks. IEEE Trans Pattern Anal. Mach Intell. 35, 1944–1957.

[13] Brooks, W. D., and Motley R. W. (1980). Analysis of Discrete Software Reliability Models. Technical Report RADC-TR-80-84, Rome Air Development Center, Berlin.

Biographies

Yoshinobu Tamura received the B.S.E., M.S., and Ph.D. degrees from Tottori University in 1998, 2000, and 2003, respectively. From 2003 to 2006, he was a Research Assistant at Tottori University of Environmental Studies. From 2006 to 2009, he was a Lecturer and Associate Professor at Faculty of Applied Information Science of Hiroshima Institute of Technology, Hiroshima, Japan. Since 2009, he has been working as an Associate Professor at Yamaguchi University, Ube, Japan. His research interests include reliability assessment for open source software. He is a regular member of the Institute of Electronics, Information and Communication Engineers of Japan, the Information Processing Society of Japan, the Operations Research Society of Japan, the Society of Project Management of Japan, and the IEEE. Dr. Tamura received the Presentation Award of the Seventh International Conference on Industrial Management in 2004, the IEEE Reliability Society Japan Chapter Awards in 2007, the Research Leadership Award in Area of Reliability from the ICRITO in 2010, and the Best Paper Award of the IEEE International Conference on Industrial Engineering and Engineering Management in 2012.