
A new generalized Weibull family of distributions: mathematical properties and applications

Abstract

We propose a generalized Weibull family of distributions with two extra positive parameters to extend the normal, gamma, Gumbel and inverse Gaussian distributions, among several other well-known distributions. We provide a comprehensive treatment of its general mathematical properties including quantile and generating functions, ordinary and incomplete moments and other properties. We introduce the log-generalized Weibull-log-logistic regression model, a parametric family that includes as sub-models several widely known regression models and that can be applied to censored survival data. We discuss estimation of the model parameters by maximum likelihood and provide two applications to real data.

Introduction

We introduce a generalized family of univariate distributions generated by Weibull random variables. For any baseline cumulative distribution function (cdf) G(x;η) (for \(x \in \mathbb {R}\)) and probability density function (pdf) g(x;η)=d G(x;η)/d x, depending on a parameter vector η of dimension q, the generalized Weibull (“GW” for short) family of distributions is defined by the cdf

$$\begin{array}{@{}rcl@{}} F(x)=\alpha\,\beta\,\int_{0}^{-\log\left[1-G(x;{\boldsymbol{\eta}})\right]} t^{\beta-1}\, \mathrm{e}^{-\alpha\,t^{\beta}} dt= 1-\exp\left\{-\alpha\,\left(-\log\left[1-G(x;{\boldsymbol{\eta}})\right]\right)^{\beta}\right\}. \end{array} $$
((1))

The pdf corresponding to (1) is given by

$$\begin{array}{@{}rcl@{}} f(x)=\frac{\alpha\,\beta\,g(x;{\boldsymbol{\eta}})}{\left[1-G(x;{\boldsymbol{\eta}})\right]}\,\left\{-\log\left[1-G(x;{\boldsymbol{\eta}})\right] \right\}^{\beta-1}\,\exp\left\{-\alpha\,\left(-\log\left[1-G(x;{\boldsymbol{\eta}})\right]\right)^{\beta}\right\}. \end{array} $$
((2))

Hereafter, a random variable X having the generalized Weibull-G (GW-G) density function (2) is denoted by X ∼ GW-G(α,β,η). The aim of this paper is to derive some mathematical properties of X in explicit forms.
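
For readers who want to experiment with the family, a minimal numerical sketch of (1) and (2) is given below; it assumes a baseline available through SciPy (here the standard normal), and the function names are purely illustrative, not part of the paper.

```python
import numpy as np
from scipy.stats import norm

def gw_cdf(x, alpha, beta, G):
    """GW-G cdf (1): F(x) = 1 - exp{-alpha * (-log[1 - G(x)])^beta}."""
    u = -np.log1p(-G(x))              # -log[1 - G(x)]
    return 1.0 - np.exp(-alpha * u**beta)

def gw_pdf(x, alpha, beta, G, g):
    """GW-G pdf (2) built from the baseline cdf G and pdf g."""
    Gx, gx = G(x), g(x)
    u = -np.log1p(-Gx)
    return (alpha * beta * gx / (1.0 - Gx)) * u**(beta - 1) * np.exp(-alpha * u**beta)

# Example: GW-N with a standard normal baseline (eta = (mu, sigma) = (0, 1)).
x = np.linspace(-3, 3, 7)
F = gw_cdf(x, alpha=1.5, beta=2.0, G=norm.cdf)
f = gw_pdf(x, alpha=1.5, beta=2.0, G=norm.cdf, g=norm.pdf)
# For alpha = beta = 1 the family reduces to the baseline distribution:
assert np.allclose(gw_cdf(x, 1.0, 1.0, norm.cdf), norm.cdf(x))
```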

Alzaatreh et al. (2013) proposed a new method of generating families of continuous distributions, called the T-X family, which has the Weibull-X family as a special case. The cdf of the T-X family is given by

$$\begin{array}{@{}rcl@{}} F(x)= \int^{-\log[1-G(x;{\boldsymbol{\eta}})]}_{0} r(t)\,dt = R\{ -\log[\!1-G(x;{\boldsymbol{\eta}})]\}, \end{array} $$
((3))

where R(t) and r(t) are the cdf and pdf of a random variable T, respectively. If a random variable T in (3) has the Weibull distribution, we obtain the GW-G distribution. Each GW-G distribution can be obtained from a specified G distribution. For α=β=1, the G distribution arises as a basic exemplar of the GW-G distribution with a continuous crossover towards cases with different shapes (for example, a particular combination of skewness and kurtosis). The hazard rate function (hrf) of X is given by

$$\begin{array}{@{}rcl@{}} h (x) =\frac{\alpha\,\beta\,g(x;{\boldsymbol{\eta}})}{\left[1-G(x;{\boldsymbol{\eta}})\right]}\,\left\{-\log[1-G(x;{\boldsymbol{\eta}})] \right\}^{\beta-1}. \end{array} $$

We provide explicit expressions for the quantile function (qf), ordinary and incomplete moments, mean deviations, Bonferroni and Lorenz curves, Rényi entropy, Shannon entropy, reliability and some properties of the order statistics.

The paper is outlined as follows. Section 2 provides some special distributions in the GW family. In Section 3, we derive useful expansions for the pdf and cdf of X. We can easily apply these expansions for all GW-G distributions. In Section 4, we obtain the quantile function (qf) of X. In Section 5, we derive explicit expressions for the ordinary and incomplete moments. The moment generating function (mgf) of X is determined in Section 6. Mean deviations, probability weighted moments (PWMs), entropies and reliability are investigated in Sections 7, 8, 9 and 10. In Section 11, we derive an expansion for the density function of the GW order statistics. Some inferential tools are discussed in Section 12. In Section 13, we present a generalization of regression models based on the GW family. The performance of the maximum likelihood estimators (MLEs) is also investigated by a simulation study in this section. In Section 14, we fit some GW-G distributions to two real data sets to demonstrate the potentiality of this family. Finally, Section 15 ends with some conclusions.

Special Weibull-G distributions

The GW family density function (2) allows for greater flexibility of its tails and can be widely applied in many areas of engineering and biology. Here, we present and study some special cases of this family because it extends several widely-known distributions in the literature. The density (2) will be most tractable when the cdf G(x;η) and pdf g(x;η) have simple analytic expressions.

2.1 The generalized Weibull-normal (GW-N) distribution

The GW-N distribution is defined from (2) by taking G(x;η) and g(x;η) to be the cdf and pdf of the normal N(μ,σ^2) distribution, where η=(μ,σ^2)^T. Its density function is given by

$$f_{\mathrm{GW}\text{-}\mathrm{N}}(x)=\frac{\alpha\,\beta\,\phi\left(\frac{x-\mu}{\sigma}\right)}{\sigma\left[1-\Phi\left(\frac{x-\mu}{\sigma}\right)\right]} \left\{-\log\left[1-\Phi\left(\frac{x-\mu}{\sigma}\right)\right]\right\}^{\beta-1} \exp\left\{-\alpha\left[-\log\left(1-\Phi\left(\frac{x-\mu}{\sigma}\right)\right)\right]^{\beta}\right\}, $$
((4))

where \(x \in \mathbb {R}\), \(\mu \in \mathbb {R}\) is a location parameter, σ>0 and α>0 are scale parameters, β>0 is a shape parameter, and ϕ(·) and Φ(·) are the pdf and cdf of the standard normal distribution, respectively. A random variable with density (4) is denoted by X ∼ GW-N(α,β,μ,σ^2). For μ=0 and σ=1, we obtain the standard GW-N distribution. Further, the GW-N distribution with α=β=1 becomes the normal distribution. Plots of the GW-N density function for selected parameter values are displayed in Fig. 1.

Fig. 1

The GW-N density function for some parameter values: a For values β=1.5, μ=0 and σ=1. b For values α=1.5, μ=0 and σ=1

2.2 The generalized Weibull-Gumbel (GW-Gu) distribution

Consider the Gumbel distribution with location parameter \(\mu \in \mathbb {R}\) and scale parameter σ>0, where the pdf and cdf (for \(x\in \mathbb {R}\)) are

$$g(x;{\boldsymbol{\eta}})=\frac 1\sigma \exp\left\{\left(\frac{x-\mu}{\sigma}\right) -\exp\left(\frac{x-\mu}{\sigma}\right) \right\}$$

and

$$G(x;{\boldsymbol{\eta}})=1-\exp\left\{-\exp\left(\frac{x-\mu}{\sigma}\right)\right\}, $$

respectively. In this case η=(μ,σ)^T. The mean and variance are equal to μ−γσ and π^2σ^2/6, respectively, where γ is Euler's constant (γ≈0.57722). Inserting these expressions into (2) gives the GW-Gu density function

$$\begin{array}{@{}rcl@{}} f_{{\mathcal{G}W-G}u}(x)=\frac{\alpha\beta}{\sigma}\, \exp\left\{\beta\Big(\frac{x-\mu}{\sigma}\Big) -\alpha\exp\left(\beta\left(\frac{x-\mu}{\sigma}\right)\right)\right\}, \end{array} $$
((5))

where \(x, \mu \in \mathbb {R}\) and α,β,σ>0. The Gumbel distribution corresponds to α=β=1. Plots of (5) for selected parameter values are displayed in Fig. 2.

Fig. 2

The GW-Gu density function for some parameter values: a For β=1.5, μ=0 and σ=3. b For α=1.5, μ=0 and σ=3

2.3 The generalized Weibull-log-normal (GW-LN) distribution

Let G(x) be the log-normal distribution with cdf

$$G(x;{\boldsymbol{\eta}})=1-\Phi\Big(\frac{-\log(x)+\mu}{\sigma}\Big) $$

for x>0, σ>0 and \(\mu \in \mathbb {R}\), where η=(μ,σ)T. The GW-LN density function (for x>0) reduces to

$$\begin{array}{@{}rcl@{}} f_{{\mathcal{G}W-LN}}(x)&=&\frac{\alpha\beta (\sqrt{2\pi}\,\sigma\,x)^{-1}} {\,\Phi\left(\frac{-\log(x)+\mu}{\sigma}\right)} \exp\left\{-\frac{1}{2}\left[\frac{\log(x)-\mu}{\sigma}\right]^{2}-\alpha \left[-\log\left(\Phi\left(\frac{-\log(x)+\mu}{\sigma}\right)\right)\right]^{\beta}\right\}\\ && \times \,\Big\{-\log\Big[\Phi\Big(\frac{-\log(x)+\mu}{\sigma}\Big)\Big]\Big\}^{\beta-1}. \end{array} $$

For α=β=1, we obtain the log-normal distribution. Figures 3 and 4 display some possible shapes of the GW-LN density and hazard functions, respectively, for some parameter values.

Fig. 3

The GW-LN density function for some parameter values: a For β=2.5, μ=1 and σ=2. b For α=2, μ=1 and σ=2

Fig. 4

Plot of the GW-LN hazard function for some parameter values

2.4 The generalized Weibull-log-logistic (GW-LL) distribution

Consider the log-logistic distribution with shape parameter a>0 and scale parameter γ>0, where the pdf and cdf (for x>0) are given by

$$g(x;{\boldsymbol{\eta}})=\frac{\gamma}{a^{\gamma}}x^{\gamma-1}\left[1+\left(\frac{x}{a}\right)^{\gamma}\right]^{-2}\qquad \text{and} \qquad G(x;{\boldsymbol{\eta}})=1-\frac{1}{1+\left(\frac{x}{a}\right)^{\gamma}},$$

respectively, where η=(a,γ)T. Inserting these expressions into (2) yields the GW-LL density function

$$f_{\mathrm{GW}\text{-}\mathrm{LL}}(x)=\frac{\alpha\beta\gamma\, x^{\gamma-1}}{a^{\gamma}} \left[1+\left(\frac{x}{a}\right)^{\gamma}\right]^{-1} \left\{\log\left[1+\left(\frac{x}{a}\right)^{\gamma}\right]\right\}^{\beta-1} \exp\left\{-\alpha\left[\log\left(1+\left(\frac{x}{a}\right)^{\gamma}\right)\right]^{\beta}\right\}. $$
((6))

The log-logistic distribution corresponds to α=β= 1. Plots of (6) and hazard function for selected parameter values are displayed in Figs. 5 and 6, respectively.
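
A small sketch of the GW-LL density (6) and the corresponding survival function obtained from (1) is given below; the parameter names follow the text, the code itself is only illustrative, and the final check confirms the reduction to the log-logistic baseline at α=β=1.

```python
import numpy as np

def gwll_pdf(x, alpha, beta, a, gamma):
    """GW-LL density (6)."""
    t = np.log1p((x / a)**gamma)            # -log[1 - G(x)] = log(1 + (x/a)^gamma)
    return (alpha * beta * gamma * x**(gamma - 1) / a**gamma
            / (1.0 + (x / a)**gamma)
            * t**(beta - 1) * np.exp(-alpha * t**beta))

def gwll_sf(x, alpha, beta, a, gamma):
    """GW-LL survival function 1 - F(x), with F taken from (1)."""
    t = np.log1p((x / a)**gamma)
    return np.exp(-alpha * t**beta)

# Check the reduction: alpha = beta = 1 recovers the log-logistic baseline pdf.
a, gamma = 1.5, 2.0
x = np.linspace(0.1, 10.0, 50)
baseline = gamma / a**gamma * x**(gamma - 1) * (1.0 + (x / a)**gamma)**(-2)
assert np.allclose(gwll_pdf(x, 1.0, 1.0, a, gamma), baseline)
```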

Fig. 5

The GW-LL density function for some parameter values: a For β=2.5, a=25 and γ=1. b For α=2, a=25 and γ=1

Fig. 6

Plot of the GW-LL hazard function for some parameter values

Useful expansions

For any real parameter c and z ∈ (0,1), it can be proven that

$$\begin{array}{@{}rcl@{}} \left[-\log(1-z)\right]^{c}=z^{c}+c\,\sum^{\infty}_{i=0}p_{i}(c+i)\,z^{i+c+1}, \end{array} $$
((7))

where p_i(c) are Stirling polynomials. The first six polynomials are p_0(w)=1/2, p_1(w)=(2+3w)/24, p_2(w)=(w+w^2)/48, p_3(w)=(−8−10w+15w^2+15w^3)/5760, p_4(w)=(−6w−7w^2+2w^3+3w^4)/11520 and p_5(w)=(96+140w−224w^2−315w^3+63w^5)/2903040. These coefficients are related to the Stirling polynomials by p_{n−1}(w)=S_n(w)/[n!(w+1)] for n≥1, where S_0(w)=1, S_1(w)=(w+1)/2, etc. The proof of the expansion (7) is given in detail by Flajolet and Odlyzko (1990) (see Theorem 3A, page 227) and Flajolet and Sedgewick (2009) (see Theorem VI.2, page 385). In this paper, we adopt the polynomials p_i(w) in accordance with Nielsen (1906) and Ward (1934).
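
A quick numerical sanity check of (7) is easy to run: truncating the sum after the first three correction terms leaves an error of order z^(c+4), so for small z the two sides agree closely. The snippet below is only an illustration of that truncated check, not part of the original development.

```python
import numpy as np

def p0(w): return 0.5
def p1(w): return (2 + 3 * w) / 24
def p2(w): return (w + w**2) / 48

def lhs(z, c):
    return (-np.log1p(-z))**c

def rhs_truncated(z, c):
    """z^c + c * sum_{i=0}^{2} p_i(c + i) z^(i + c + 1), the first terms of (7)."""
    return z**c + c * (p0(c) * z**(c + 1)
                       + p1(c + 1) * z**(c + 2)
                       + p2(c + 2) * z**(c + 3))

z, c = 0.05, 1.7
print(lhs(z, c), rhs_truncated(z, c))   # agree to roughly O(z^(c+4))
```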

Some useful expansions for (1) and (2) can be derived using the concept of exponentiated distributions. For an arbitrary baseline cdf G(x), a random variable is said to have the exponentiated-G (“exp-G”) distribution with power parameter a>0, say X ∼ exp-G(a), if its pdf and cdf are

$$\begin{array}{@{}rcl@{}} h_{a}(x)=a\,G(x)^{a-1} g(x)\,\,\,\,\,\,\text{and}\,\,\,\,\,\,H_{a}(x)=G(x)^{a}, \end{array} $$

respectively. The properties of exponentiated distributions have been studied by many authors in recent years, see Mudholkar and Srivastava (1993) for exponentiated Weibull, Gupta et al. (1998) for exponentiated Pareto, Gupta and Kundu (1999) for exponentiated exponential, Nadarajah (2005) for exponentiated Gumbel, Kakde and Shirke (2006) for exponentiated lognormal, and Nadarajah and Gupta (2007) for exponentiated gamma distributions.

By expanding the exponential function in (1), we can write

$$\begin{array}{@{}rcl@{}} F(x)=\sum^{\infty}_{m=0} \frac{(-1)^{m+2}\,\alpha^{m+1}}{(m+1)!}\,\Big\{-\log\big[1-G(x)\big]\Big\}^{(m+1)\beta} \end{array} $$

and then using (7)

$$\begin{array}{@{}rcl@{}} F(x)\,=\,\sum^{\infty}_{m=0}\! \frac{(-1)^{m+2}\,\alpha^{m+1}}{(m+1)!}\! \left\{\!G(x)^{(m+1)\,\beta}\,+\,(m+1)\beta\!\sum^{\infty}_{i=0}p_{i}[(m+1)\beta+i]G(x)^{i+(m+1)\beta+1}\!\right\}\!. \end{array} $$

Expanding G(x)(m+1)β and G(x)i+(m+1)β+1 in power series, F(x) can be expressed as

$$\begin{array}{@{}rcl@{}} F(x)=\sum_{k=0}^{\infty}w_{k}\,G(x)^{k}=\sum_{k=0}^{\infty}w_{k}\,H_{k}(x), \end{array} $$
((8))

where H k (x) denotes the cdf of the exp-G (k) distribution and

$$\begin{array}{@{}rcl@{}} w_{k}&=&\sum^{\infty}_{m,j=0} \frac{(-1)^{m+j+k+2}\,\alpha^{m+1}}{(m+1)!} {j \choose k} \left\{{(m+1)\beta \choose j} + (m+1)\beta\sum^{\infty}_{i=0}p_{i}\Big[(m+1)\beta+i\Big] \right. \\ && \left.\times\,{i+(m+1)\beta+1 \choose j} \right\}. \end{array} $$
((9))

The corresponding pdf of X can be expressed as

$$\begin{array}{@{}rcl@{}} f(x)=\sum_{k=0}^{\infty}(k+1)\,v_{k}\,G(x)^{k}\,g(x)=\sum_{k=0}^{\infty}v_{k}\,h_{k+1}(x), \end{array} $$
((10))

where h k+1(x) denotes the pdf of the exp-G (k+1) distribution and v k =w k+1. So, several properties of the GW-G distribution can be obtained by knowing those of the exp-G distribution, see, for example, Mudholkar et al. (1995), Gupta and Kundu (2001) and Nadarajah and Kotz (2006a), among others.

Quantile function

Let Q G (u)=G −1(u) be the quantile function (qf) of G for 0<u<1. Inverting F(x)=u in (1), we obtain the qf of X as

$$ x=F^{-1}(u)=Q_{G}\left(1-\exp\left\{-\left[-\alpha^{-1}\,\log(1-u)\right]^{1/\beta}\right\}\right). $$
((11))

Hence, Eq. (11) reveals that the GW-G qf can be expressed in terms of the G qf. Quantiles of interest can be obtained from (11) by substituting appropriate values for u. In particular, the median of X is obtained when u=1/2, expressed by

$$M=Q_{G}\left(1-\exp\left\{-\left[\alpha^{-1}\,\log(2)\right]^{1/\beta}\right\}\right). $$

We can also use (11) for simulating GW-G random variables by setting u as a uniform random variable in the unit interval (0,1). Using the power series expansion in Eq. (11), we have

$$\begin{array}{@{}rcl@{}} x=F^{-1}(u)=Q_{G}\left\{1-\exp\left[-\left(\sum^{\infty}_{k=0} v_{k+1}\,u^{k+1} \right)\right]\right\}. \end{array} $$

where v_k=(−1)^{k+1}/(kα). Hence, the last equation reveals that the GW-G qf can be expressed as the G qf applied to a power series.
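
Eq. (11) translates directly into an inverse-transform simulator. The sketch below (illustrative, with a normal baseline) draws a GW-N sample, evaluates the median from (11) at u=1/2, and compares the sample with the analytic cdf (1) through a Kolmogorov–Smirnov test.

```python
import numpy as np
from scipy.stats import norm, kstest

def gw_qf(u, alpha, beta, baseline_ppf):
    """GW-G quantile function (11): Q_G(1 - exp{-[-log(1-u)/alpha]^(1/beta)})."""
    t = (-np.log1p(-u) / alpha)**(1.0 / beta)
    return baseline_ppf(1.0 - np.exp(-t))

def gw_cdf(x, alpha, beta, baseline_cdf):
    return 1.0 - np.exp(-alpha * (-np.log1p(-baseline_cdf(x)))**beta)

rng = np.random.default_rng(2015)
alpha, beta = 2.0, 1.5
u = rng.uniform(size=10_000)
x = gw_qf(u, alpha, beta, norm.ppf)                 # GW-N(2, 1.5, 0, 1) sample
median = gw_qf(0.5, alpha, beta, norm.ppf)          # closed-form median from (11)
print(median)
print(kstest(x, lambda t: gw_cdf(t, alpha, beta, norm.cdf)))   # should not reject
```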

Moments

From now on, let Y_{k+1} ∼ exp-G(k+1). A first formula for the nth moment of X can be obtained from Eq. (10) as

$$\begin{array}{@{}rcl@{}} \mu_{n}^{\prime}=E(X^{n})=\sum_{k=0}^{\infty} v_{k}\,E\left(Y_{k+1}^{n}\right). \end{array} $$
((12))

Explicit expressions for moments of several exponentiated distributions are given by Nadarajah and Kotz (2006a). They can be used to produce \(\mu _{n}^{\prime }\).

A second formula for \(\mu _{n}^{\prime }\) can be obtained from (10) in terms of the baseline quantile function Q G (u). We obtain

$$\begin{array}{@{}rcl@{}} \mu_{n}^{\prime}=\sum_{k=0}^{\infty} (k+1) \,v_{k}\,\tau(n,k), \end{array} $$
((13))

where the integral can be expressed in terms of the G quantile function

$$\begin{array}{@{}rcl@{}} \tau(n,a)=\int_{-\infty}^{\infty} x^{n}\,G^{a}(x)\,g(x)dx={\int_{0}^{1}} Q_{G}(u)^{n}\,u^{a} du. \end{array} $$
((14))

The ordinary moments of several GW-G distributions can be determined directly from Eqs. (13) and (14). Here, we give two examples. For the first example, we consider the Gumbel distribution with cdf \(G(x)=1-\exp \left \{-\exp \left (\frac {x-\mu }{\sigma }\right)\right \}\). The moments of the exponentiated Gumbel distribution with parameter (k+1) can be obtained from Nadarajah and Kotz (2006a). The nth moment of the GW-Gu distribution becomes

$$E(X^{n})=\sum^{\infty}_{k=0} v_{k}\, (k+1)\, \sum^{n}_{i=0} {n \choose i} \mu^{n-i}\, (-\sigma)^{i} \left(\frac{\partial }{\partial p} \right)^{i}\, \Bigg[(k+1)^{-p} \,\Gamma(p) \Bigg]\Bigg|_{p=1}.$$

For the second example, we consider the generalized Weibull-standard logistic (GW-SL) distribution, where G(x)=(1+e^{−x})^{−1}. A result from (Prudnikov et al. 1986, Section 2.6.13, Eq. 4) gives (for t<1)

$$\begin{array}{@{}rcl@{}} E(X^{n})= \sum_{k=0}^{\infty} (k+1)\,v_{k}\, \left(\frac{\partial}{\partial t}\right)^{n} \,B(t+k+1,1-t) \bigg{|}_{t = 0}, \end{array} $$

where \(B(a,b) = {\int_{0}^{1}} t^{a-1}\,(1-t)^{b-1}\, dt\) is the beta function.

Further, the central moments (μ r ) and cumulants (κ r ) of X can be determined as

$$\begin{array}{@{}rcl@{}} \mu_{r}=\sum_{k=0}^{r}(-1)^{k}\,\binom{r}{k}\,\mu_{1}^{\prime k}\,\mu_{r-k}^{\prime} \qquad \text{and} \qquad \kappa_{r}=\mu_{r}^{\prime}-\sum_{k=1}^{r-1}\binom{r-1}{k-1}\,\kappa_{k}\,\mu_{r-k}^{\prime}, \end{array} $$

respectively, where \(\kappa _{1}=\mu ^{\prime }_{1}\). Plots of the skewness and kurtosis varying the values of α and β for the GW-N and GW-LL distributions are displayed in Figs. 7 and 8, respectively. These plots reveal that the skewness and kurtosis depend on both parameters α and β.
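
The skewness and kurtosis shown in Figs. 7 and 8 can also be reproduced numerically. Instead of the series (13), the sketch below (GW-N case, purely illustrative) integrates the quantile representation μ'_n = ∫_0^1 Q(u)^n du and converts the ordinary moments to central moments; the parameter values are arbitrary.

```python
import numpy as np
from scipy.integrate import quad
from scipy.stats import norm

def gwn_qf(u, alpha, beta, mu=0.0, sigma=1.0):
    """GW-N quantile function obtained from (11) with a normal baseline."""
    t = (-np.log1p(-u) / alpha)**(1.0 / beta)
    return mu + sigma * norm.ppf(1.0 - np.exp(-t))

def moment(n, alpha, beta):
    # mu'_n = E[X^n] = int_0^1 Q(u)^n du, a direct alternative to the series (13).
    return quad(lambda u: gwn_qf(u, alpha, beta)**n, 0.0, 1.0)[0]

alpha, beta = 1.5, 2.0
m1, m2, m3, m4 = (moment(n, alpha, beta) for n in (1, 2, 3, 4))
var = m2 - m1**2
skewness = (m3 - 3 * m1 * m2 + 2 * m1**3) / var**1.5
kurtosis = (m4 - 4 * m1 * m3 + 6 * m1**2 * m2 - 3 * m1**4) / var**2
print(skewness, kurtosis)
```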

Fig. 7

Skewness and kurtosis of the GW-N distribution as a function of β for some values of α and μ=1 and σ=2. a Skewness. b Kurtosis

Fig. 8

Skewness and kurtosis of the GW-LL distribution as a function of β for some values of α and μ=1 and σ=2. a Skewness. b Kurtosis

The incomplete moments play an important role for measuring inequality, for example, the Lorenz and Bonferroni curves, which depend upon the first incomplete moment of a distribution. The nth incomplete moment of X is calculated as

$$\begin{array}{@{}rcl@{}} m_{n}(y) = \int^{y}_{-\infty} x^{n}\,f(x)\,dx = \sum_{k=0}^{\infty} (k+1)\,v_{k}\,\int_{0}^{G(y)} Q_{G}(u)^{n} u^{k} du. \end{array} $$

The last integral can be computed for most baseline G distributions.

Let \(\mu _{n}^{\prime }=E(X^{n})\) be the nth ordinary moment of X calculated from (12) or (13). The rth descending factorial moment of X is

$$\begin{array}{@{}rcl@{}} \mu_{(r)}^{\prime}= E\left[X^{(r)}\right]=E \left[X(X-1)\times\cdots \times(X-r+1)\right] = \sum_{k=0}^{r} s(r,k)\,\mu_{k}^{\prime}, \end{array} $$

where

$$\begin{array}{@{}rcl@{}} s(r,k)=\frac{1}{k!}\,\left[\frac {d^{k}}{d x^{k}} x^{(r)} \right]_{x=0} \end{array} $$

is the Stirling number of the first kind which counts the number of ways to permute a list of r items into k cycles. So, we can obtain the factorial moments from the ordinary moments given before.

Generating function

Here, we provide two formulae for the moment generating function (mgf) M(t)=E(e^{tX}) of X. A first formula for M(t) comes from (10) as

$$\begin{array}{@{}rcl@{}} M(t)=\sum_{k=0}^{\infty} \,v_{k}\,M_{k+1}(t), \end{array} $$

where M k+1(t) is the mgf of Y k+1. Hence, M(t) can be determined from the generating function of the exp-G (k+1) distribution.

A second formula for M(t) can be derived from (10) as

$$\begin{array}{@{}rcl@{}} M(t)=\sum_{k=0}^{\infty} (k+1)\,v_{k}\,\rho(t,k), \end{array} $$
((15))

where ρ(t,a) can be calculated from the parent qf Q G (x) by

$$\begin{array}{@{}rcl@{}} \rho(t,a)=\int_{-\infty}^{\infty} \exp(t x)\,G(x)^{a}\,g(x) dx={\int_{0}^{1}}\exp\left\{t\,Q_{G}(u)\right\}\,u^{a} d u. \end{array} $$
((16))

We can obtain the mgfs of several GW-G distributions directly from Eqs. (15) and (16). For example, the mgf of the GW-SL distribution (for t<1) is given by

$$\begin{array}{@{}rcl@{}} M(t)=\sum_{k=0}^{\infty} (k+1)\,B(t+k+1,1-t)\,v_{k}. \end{array} $$

Mean deviations

The mean deviations about the mean (\(\delta _{1}=E(|X-\mu ^{\prime }_{1}|)\)) and about the median (δ_2=E(|X−M|)) of X can be expressed as

$$\begin{array}{@{}rcl@{}} \delta_{1}=2 \mu^{\prime}_{1}\,F(\mu^{\prime}_{1})-2 m_{1}(\mu^{\prime}_{1}) \qquad\,\, \text{and} \qquad\,\, \delta_{2}=\mu^{\prime}_{1}-2 m_{1}(M), \end{array} $$
((17))

respectively, where \(\mu ^{\prime }_{1}=E(X)\), M = Median(X) is the median given in Section 4, \(F(\mu ^{\prime }_{1})\) is easily calculated from the cdf (1) and \(m_{1}(z)=\int _{-\infty }^{z} x\,f(x) dx\) is the first incomplete moment.

Here, we provide two alternative ways to compute δ 1 and δ 2. First, a general equation for m 1(z) can be derived from (10) as

$$\begin{array}{@{}rcl@{}} m_{1}(z)= \sum_{k=0}^{\infty} v_{k}\,J_{k+1}(z), \end{array} $$

where

$$\begin{array}{@{}rcl@{}} J_{k+1}(z)=\int_{-\infty}^{z} x\,h_{k+1}(x)dx. \end{array} $$
((18))

Equation (18) is the basic quantity to compute the mean deviations of the exp-G distributions. Hence, the mean deviations in (17) depend only on the mean deviations of the exp-G distribution.

A second general formula for m 1(z) can be derived by setting u=G(x) in (18)

$$\begin{array}{@{}rcl@{}} m_{1}(z)=\sum_{k=0}^{\infty} (k+1)\,v_{k}\,\,T_{k+1}(z), \end{array} $$
((19))

where

$$\begin{array}{@{}rcl@{}} T_{k+1}(z)=\int_{0}^{G(z)} Q_{G}(u)\,u^{k}\, du \end{array} $$
((20))

is a simple integral defined from the baseline qf Q G (u).

In a similar way, the mean deviations of any GW-G distribution can be computed from Eqs. (19)–(20). For example, the mean deviations of the GW-SL distribution are determined immediately (by using the generalized binomial expansion) from the function

$$\begin{array}{@{}rcl@{}} T_{k+1}(z)=\frac{1}{\Gamma(z)}\sum_{m=0}^{\infty} \frac {(-1)^{m}\,\Gamma(k+m+1)\,\left[1-\exp(-m z)\right]}{(m+1)!}. \end{array} $$

Applications of the first incomplete moment can be addressed to obtain Bonferroni and Lorenz curves defined for a given probability π by \(B(\pi)= m_{1}(q)/[\pi \mu ^{\prime }_{1}]\) and \(L(\pi)=m_{1}(q)/\mu ^{\prime }_{1}\), respectively, where \(\mu ^{\prime }_{1}=E(X)\) and q=Q_G(1− exp{−[−α^{−1} log(1−π)]^{1/β}}) is the GW-G qf at π obtained from Eq. (11).
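
For a positive baseline such as the log-logistic, the Bonferroni and Lorenz curves follow directly from the first incomplete moment, which in turn is the quantile integral ∫_0^π Q(u) du. A short illustrative sketch (the parameter values are arbitrary, not from the paper):

```python
import numpy as np
from scipy.integrate import quad

def gwll_qf(u, alpha, beta, a, gamma):
    """GW-LL quantile function from (11); the baseline qf is Q_G(v) = a*(v/(1-v))^(1/gamma)."""
    v = 1.0 - np.exp(-(-np.log1p(-u) / alpha)**(1.0 / beta))
    return a * (v / (1.0 - v))**(1.0 / gamma)

def bonferroni_lorenz(pi, alpha, beta, a, gamma):
    # First incomplete moment at the pi-quantile: m1(q) = int_0^pi Q(u) du,
    # and the mean is the same integral over (0, 1).
    m1q = quad(lambda u: gwll_qf(u, alpha, beta, a, gamma), 0.0, pi)[0]
    mean = quad(lambda u: gwll_qf(u, alpha, beta, a, gamma), 0.0, 1.0)[0]
    return m1q / (pi * mean), m1q / mean          # B(pi), L(pi)

print(bonferroni_lorenz(0.5, alpha=2.0, beta=1.5, a=1.0, gamma=3.0))
```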

Probability weighted moments

A very useful mathematical quantity is the probability weighted moment (PWM) of X. The (n,s)th PWM is given by κ_{n,s}=E{X^n F(X)^s} for n,s=0,1,… Using the binomial theorem, κ_{n,s} can be written as

$$\begin{array}{@{}rcl@{}} \kappa_{n,s}&=&\alpha\beta \sum^{s}_{k=0} (-1)^{k}\,{s \choose k} \int^{\infty}_{-\infty} \frac{x^{n}\,g(x)}{[\!1-G(x)]} \left\{-\log[\!1-G(x)] \right\}^{\beta-1} \\ && \times \exp\left\{-\alpha\,\left(-\log\left[1-G(x)\right]\right)^{\beta}\right\} \exp\left\{-k\alpha\big[-\log(1-G(x))\big]^{\beta} \right\}dx. \end{array} $$

Using the power series expansion in the last equation, we have

$$\begin{array}{@{}rcl@{}} \kappa_{n,s}=\alpha\beta \sum^{s}_{k=0}\sum^{\infty}_{m,l=0} \frac{(-1)^{k+m+l} \,\alpha^{m+l}}{m!\,l!\,k^{-m}}\! {s \choose k}\!\! \int^{\infty}_{-\infty} \frac{x^{n}\,g(x)}{[\!1-G(x)]} \!\left\{-\log[\!1-G(x)] \right\}^{\beta(m+l+1)-1}\!dx. \end{array} $$

Using (7) and the power series expansion, we can write

$$\begin{array}{@{}rcl@{}} \frac{\left\{-\log\big[1-G(x)\big]\right\}^{\beta(m+l+1)-1}}{\Big[1-G(x)\Big]}&&= \sum^{\infty}_{q=0} \Bigg\{G(x)^{\beta(m+l+1)+q-1} + \Big[\beta(m+l+1)-1\Big] \Bigg.\\ &&\quad\times \sum^{\infty}_{i=0} p_{i}\big[\beta(m+l+1)+i-1\big]\, G(x)^{\beta(m+l+1)+q+i}\Bigg\}. \end{array} $$

Expanding G(x)β(m+l+1)+q−1 and G(x)β(m+l+1)+q+i, we have

$$\begin{array}{@{}rcl@{}} \kappa_{n,s}= \sum^{\infty}_{r=0} \omega_{r}\, \int^{\infty}_{-\infty} x^{n}\, G^{r}(x)\, g(x)\,dx \end{array} $$
((21))

where

$$\begin{array}{@{}rcl@{}} \omega_{r}&=& \beta\,\sum_{k=0}^{s}\,\sum_{m,l,q,j=0}^{\infty} \frac{(-1)^{k+m+l+j+r}\,\alpha^{m+l+1}}{m!\, l!\, k^{-m}} {s \choose k}{j \choose r} \left\{{\beta(m+l+1)+q-1 \choose j} \right.\\ &&\left. +\ \Big[\beta(m+l+1)+i-1\Big]\sum^{\infty}_{i=0} p_{i}\Big[\beta(m+l+1)+i-1\Big] {\beta(m+l+1)+q+i \choose j} \right\}. \end{array} $$

The quantity κ n,s can be obtained from (21) in terms of the baseline qf by setting G(x)=u. We have

$$\begin{array}{@{}rcl@{}} \kappa_{n,s}= \sum^{\infty}_{r=0} \omega_{r}\,\tau(n,r), \end{array} $$
((22))

where τ(n,r) is given by (14).

Equation (22) can be applied for most baseline G distributions to derive explicit expressions for κ n,s , since the baseline qf can usually be expressed as a power series.

Entropies

An entropy is a measure of variation or uncertainty of a random variable X. Two popular entropy measures are the Rényi and Shannon entropies (Shannon 1951; Rényi 1961). The Rényi entropy of a random variable with pdf f(·) is defined by

$$\begin{array}{@{}rcl@{}} I_{R}(\gamma)=\frac{1}{1-\gamma}\log\int_{0}^{\infty} f^{\gamma} (x) dx \end{array} $$

for γ>0 and γ≠1. The Shannon entropy of a random variable X is defined by E[− log f(X)]. It is the limiting case of the Rényi entropy as γ → 1.

Here, we derive expressions for the Rényi and Shannon entropies when X is a generalized Weibull-G random variable. Using (7), we can write

$$\begin{array}{@{}rcl@{}} \left\{ -\log[\!1-G(x)] \right\}^{\gamma(\beta - 1)}=G(x)^{\gamma(\beta-1)} + \gamma(\beta-1)\sum^{\infty}_{i=0} p_{i}(\gamma\beta-\gamma+i)\,G(x)^{i+\gamma(\beta-1)+1}. \end{array} $$

Expanding G(x)γ(β−1) and G(x)i+γ(β−1)+1, the last equation becomes

$$\begin{array}{@{}rcl@{}} \left\{ -\log[\!1-G(x)] \right\}^{\gamma(\beta - 1)}= \sum_{k=0}^{\infty}\, s_{k}\,G(x)^{k}, \end{array} $$
((23))

where

$$\begin{array}{@{}rcl@{}} {}s_{k}=\sum_{m=0}^{\infty} (-1)^{m+k}\, {m \choose k}\, \left[{\gamma(\beta-1) \choose m} + \gamma(\beta-1)\, \sum_{i=0}^{\infty} \,p_{i}(\gamma\beta-\gamma+i)\, {i+\gamma(\beta-1)+1 \choose m} \right]. \end{array} $$
((24))

By expanding the exponential function and using the results obtained in (23), we can write

$$\begin{array}{@{}rcl@{}} \exp\left\{-\gamma \alpha \left[ -\log(1-G(x)) \right]^{\beta}\right\}=\sum^{\infty}_{\ell=0}\tau_{\ell}\,G(x)^{\ell} \end{array} $$

where τ_ℓ is given by

$$\begin{array}{@{}rcl@{}} \tau_{\ell}=\sum_{j,r=0}^{\infty} \frac{(-1)^{r+\ell+j}\,(\gamma\,\alpha)^{j}}{j!}\,{r \choose \ell}\, \left[ {j\beta \choose r} + j\,\beta\, \sum_{i=0}^{\infty} p_{i}(j\beta+i)\, {i+j \beta+1 \choose j} \right]. \end{array} $$
((25))

So,

$$\begin{array}{@{}rcl@{}} \int_{0}^{\infty} f^{\gamma} (x) dx &=& \int_{0}^{\infty} \frac{(\alpha\,\beta)^{\gamma}\,g(x)^{\gamma}}{\left[1-G(x)\right]^{\gamma}}\,\left\{-\log[\!1-G(x)] \right\}^{\gamma(\beta-1)}\,\exp\left\{-\gamma\alpha\,\left(-\log\left[1-G(x)\right]\right)^{\beta}\right\} dx\\ &=& (\alpha\beta)^{\gamma}\sum^{\infty}_{q,k,\ell=0}\kappa_{q,k,\ell}\,I_{q+k+\ell}\,, \end{array} $$

where

$$ \kappa_{q,k,\ell}= (-1)^{q}\,{-\gamma \choose q}\,s_{k}\,\tau_{\ell}, $$

s_k and τ_ℓ are given by (24) and (25), respectively, and I_{q+k+ℓ} comes from the parent distribution as

$$I_{q+k+\ell}=\int_{0}^{\infty} g(x)^{\gamma}\, {G(x)^{\,q+k+\ell}}\, dx. $$

Hence, the Rényi entropy of X is given by

$$\begin{array}{@{}rcl@{}} I_{R}(\gamma)=\frac{\gamma \log (\alpha\beta) }{1-\gamma} + \frac{1}{1-\gamma} \log \left(\sum_{q,k,\ell=0}^{\infty} \kappa_{q,k,\ell}\,I_{q+k+\ell} \right). \end{array} $$
((26))

The Shannon entropy can be obtained by letting γ → 1 in (26). However, it is easier to derive an expression for it from first principles. Using (2), the Shannon entropy can be expressed as

$$\begin{array}{@{}rcl@{}} E\left[-\log f(X)\right] &=& -\log (\alpha\beta) - E [\log (g(X))] +\, E [\log (1-G(X))]\\ &&+\ (1-\beta)\, E\{\log [-\log(1-G(X))]\} + \alpha\, E \left\{\left[-\log(1-G(X))\right]^{\beta}\right\}. \end{array} $$

Using the series expansion for log(1−z), we can write

$$\begin{array}{@{}rcl@{}} \{\log [-\log(1-G(X))]\}&=&\left[ \log \left\{\sum_{r = 1}^{\infty} \frac {G(X)^{r}}{r} \right\} \right] \\ &=&\log G(X) + \log \left(1+ \sum_{r = 2}^{\infty} \frac {G(X)^{r-1}}{r} \right) \end{array} $$
((27))
$$\begin{array}{@{}rcl@{}} &=&\log G(X) + \sum^{\infty}_{j=1} \frac{(-1)^{j+1}}{j} \left(G(x) \sum_{r = 0}^{\infty} \frac {G(X)^{r}}{r+2} \right)^{j}. \end{array} $$
((28))

Henceforth, we use an equation by Gradshteyn and Ryzhik (2007) for a power series raised to a positive integer n

$$\begin{array}{@{}rcl@{}} \left(\sum_{i=0}^{\infty} a_{i}\,u^{i}\right)^{n}=\sum_{i=0}^{\infty} c_{n,i}\,u^{i}, \end{array} $$
((29))

where the coefficients c n,i (for i=1,2,…) are easily determined from the recurrence equation

$$ c_{n,i}=(i\,a_{0})^{-1}\sum_{m=1}^{i}\,[m\,(n+1)-i]\,a_{m}\,c_{n,i-m}, $$
((30))

where \(c_{n,0}={a_{0}^{n}}\). The coefficient c n,i can be determined from c n,0,…,c n,i−1 and hence from the quantities a 0,…,a i . In fact, c n,i can be given explicitly in terms of the coefficients a i , although it is not necessary for programming numerically our expansions in any algebraic or numerical software.
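
The recurrence (30) is straightforward to program. The sketch below (illustrative, not part of the paper) computes the coefficients c_{n,i} and checks them against an explicit polynomial power computed with NumPy.

```python
import numpy as np

def power_series_power(a, n, nterms):
    """Coefficients c_{n,i} of (sum_i a_i u^i)^n via the recurrence (30)."""
    c = np.zeros(nterms)
    c[0] = a[0]**n
    for i in range(1, nterms):
        s = 0.0
        for m in range(1, i + 1):
            if m < len(a):                       # a_m = 0 beyond the given terms
                s += (m * (n + 1) - i) * a[m] * c[i - m]
        c[i] = s / (i * a[0])
    return c

# Check against an explicit polynomial power: (1 + 2u + 3u^2)^4.
a = [1.0, 2.0, 3.0]
direct = np.polynomial.polynomial.polypow(a, 4)          # exact coefficients
recurrence = power_series_power(a, 4, len(direct))
assert np.allclose(recurrence, direct)
```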

Based on Eq. (29), Eq. (28) can be rewritten as

$$\begin{array}{@{}rcl@{}}{} \log G(X)+ \sum_{j=1}^{\infty}\frac{(-1)^{j+1}}{j}\!\! \left(\!\!G(X) \sum_{r=0}^{\infty} \frac {G(X)^{r}}{r+2}\right)^{j} = \log G(X)+ \sum_{j=1}^{\infty}\frac{(-1)^{j+1}}{j}\sum_{r =0}^{\infty} e_{j,r}\,G(X)^{r+j}, \end{array} $$
((31))

where

$$\begin{array}{@{}rcl@{}} e_{j,r} = 2 r^{-1}\sum_{m=1}^{r} \frac{\left[m(j+1)-r\right]}{m+2}\,\,e_{j,r-m} \end{array} $$

for r=1,2,… and e_{j,0}=2^{−j}. Using the result (31) and expanding log(1−G(x)) in a similar form to (27), the Shannon entropy reduces to

$$\begin{array}{@{}rcl@{}}{\kern1pt} E\left[-\log f(X)\!\right] = -\log (\alpha\beta) - E \left[\log g(X)\right] - \!\sum^{\infty}_{\tau=0} \frac{1}{\tau+1} E\Big[\!G(X)^{\tau+1}\!\Big] + (1-\beta)E\Big[\!\!\log G(X)\Big] \\ +\ (1-\beta)\sum_{j=1}^{\infty}\frac{(-1)^{j+1}}{j}\sum_{r =0}^{\infty} e_{j,r}\,E\Big[G(X)^{r+j}\Big] + \alpha\, E \left\{\left[-\log(1-G(X))\right]^{\beta}\right\}. \end{array} $$

For any real parameter β and G(x) ∈ (0,1), we can write from (7)

$$\begin{array}{@{}rcl@{}} \left[-\log(1-G(X))\right]^{\beta}&=& G(x)^{\beta}\,+ \sum^{\infty}_{\ell=0} p_{\ell}(\beta)\,G(X)^{\ell+\beta+1} \end{array} $$

where p 0(β)=β/2, p 1(β)=β (3β+5)/24, p 2(β)=β (β 2+5β+6)/48, etc. Then, the Shannon entropy for the GW-G family is given by

$$\begin{aligned} E\left[-\log f(X)\right] =\ & -\log (\alpha\beta) - E \left[\log g(X)\right] - \sum^{\infty}_{\tau=0} \frac{1}{\tau+1} E\Big[G(X)^{\tau+1}\Big] + (1-\beta)E\Big[\log G(X)\Big] \\ & + (1-\beta)\sum_{j=1}^{\infty}\frac{(-1)^{j+1}}{j}\sum_{r =0}^{\infty} e_{j,r}\,E\Big[G(X)^{r+j}\Big] + \alpha\,E\Big[G(X)^{\beta}\Big] + \alpha\,\sum^{\infty}_{\ell=0} p_{\ell}\, E\Big[G(X)^{\ell+\beta+1}\Big]. \end{aligned} $$
((32))

The expectations in (32) can be easily evaluated numerically for a given G(·) and g(·). Using (10), they can also be represented as

$$\begin{array}{@{}rcl@{}} E \left[\log G(X)\right] &=& \sum_{k=0}^{\infty} (k+1)\,v_{k} \int_{0}^{\infty} \log\left[G(x)\right]\,G(x)^{k}\, g (x)\, dx = -\sum_{k=0}^{\infty} \frac {v_{k}}{k+1}, \\ E\left[G(X)^{\beta}\right] &=& \sum_{k=0}^{\infty} \frac{(k+1)v_{k}} {\beta+k+1}, \\ E\left[G(X)^{\tau+1}\right] &=& \sum_{k=0}^{\infty} \frac{(k+1) v_{k}}{\tau+k+2}, \\ E\left[G(X)^{r+j}\right] &=& \sum_{k=0}^{\infty} \frac{(k+1) v_{k}}{r+j+k+1}, \\ E\left[G(X)^{\ell+\beta+1}\right] &=& \sum_{k=0}^{\infty} \frac{(k+1) v_{k}}{\ell+\beta+k+2} \end{array} $$

and

$$\begin{array}{@{}rcl@{}} E\left[\log g(X)\right] &=& \sum_{k=0}^{\infty} (k+1) v_{k} \int_{0}^{\infty} \log[g(x)]\,G(x)^{k}\, g(x)\, dx.\qquad \qquad \quad \end{array} $$

The last of these representations can also be expressed in terms of the parent qf Q G (u)=G −1(u) as

$$\begin{array}{@{}rcl@{}} E \left[ \log g (X) \right] = \sum_{k=0}^{\infty} (k+1) v_{k} {\int_{0}^{1}} \log\left\{g\left[Q_{G}(u)\right]\right\}\,u^{k}\,du, \end{array} $$

where the integral can be calculated for most baseline distributions using a power series expansion for Q G (u).
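
Both entropies can also be evaluated directly by quadrature, which gives a convenient numerical check of the expansions above. The sketch below (GW-N case, illustrative; the integration limits are chosen so that the omitted tail mass is negligible for the stated parameter values) confirms that the Rényi entropy approaches the Shannon entropy as γ→1.

```python
import numpy as np
from scipy.integrate import quad
from scipy.stats import norm

def gwn_pdf(x, alpha, beta):
    """GW-N(alpha, beta, 0, 1) density (4), using norm.sf for a stable 1 - Phi."""
    sf = norm.sf(x)
    u = -np.log(sf)
    return alpha * beta * norm.pdf(x) / sf * u**(beta - 1) * np.exp(-alpha * u**beta)

# Limits covering essentially all of the probability mass for alpha = 1.5, beta = 2.
LO, HI = -7.0, 5.0

def renyi_entropy(gamma, alpha, beta):
    integral = quad(lambda x: gwn_pdf(x, alpha, beta)**gamma, LO, HI)[0]
    return np.log(integral) / (1.0 - gamma)

def shannon_entropy(alpha, beta):
    return quad(lambda x: -gwn_pdf(x, alpha, beta) * np.log(gwn_pdf(x, alpha, beta)),
                LO, HI)[0]

alpha, beta = 1.5, 2.0
print(renyi_entropy(1.001, alpha, beta))   # close to the Shannon entropy below
print(shannon_entropy(alpha, beta))
```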

Reliability

Here, we derive the reliability, R = Pr(X_2<X_1), when X_1 ∼ GW-G(α_1,β_1) and X_2 ∼ GW-G(α_2,β_2) are independent random variables. Probabilities of this form have many applications, especially in engineering. Let f_i denote the pdf of X_i and F_i denote the cdf of X_i. By using the representations (8) and (10), we can write

$$\begin{array}{@{}rcl@{}} R =\sum_{j,k=0}^{\infty} w_{j}\,w_{k+1}\,\int_{0}^{\infty} H_{j}(x)\,h_{k+1}(x) dx = \sum_{j,k=0}^{\infty} w_{j}\,w_{k+1}\,R_{jk}, \end{array} $$
((33))

where R_{jk} = Pr(Y_j < Y_{k+1}) is the reliability between the independent random variables Y_j ∼ exp-G(j) and Y_{k+1} ∼ exp-G(k+1). Hence, the reliability for the GW-G random variables is a linear combination of those for exp-G random variables. In the particular case α_1=α_2 and β_1=β_2, Eq. (33) gives R=1/2.
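
R can also be checked by direct numerical integration, since R = Pr(X_2<X_1) = E[F_2(X_1)] = ∫_0^1 F_2(Q_1(u)) du. The sketch below (normal baseline, illustrative) recovers R = 1/2 when the two parameter vectors coincide.

```python
import numpy as np
from scipy.integrate import quad
from scipy.stats import norm

def gwn_cdf(x, alpha, beta):
    return 1.0 - np.exp(-alpha * (-np.log(norm.sf(x)))**beta)

def gwn_qf(u, alpha, beta):
    t = (-np.log1p(-u) / alpha)**(1.0 / beta)
    return norm.ppf(1.0 - np.exp(-t))

def reliability(a1, b1, a2, b2):
    # R = Pr(X2 < X1) = E[F2(X1)] = int_0^1 F2(Q1(u)) du, with a normal baseline.
    return quad(lambda u: gwn_cdf(gwn_qf(u, a1, b1), a2, b2), 0.0, 1.0)[0]

print(reliability(1.5, 2.0, 1.0, 1.0))   # general case
print(reliability(1.5, 2.0, 1.5, 2.0))   # equal parameters: should return 0.5
```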

Order statistics

Order statistics make their appearance in many areas of statistical theory and practice. Suppose X 1,…,X n is a random sample from the GW-G distribution. Let X i:n denote the ith order statistic. The pdf of X i:n can be expressed from (8) and (10) as

$$\begin{array}{@{}rcl@{}} f_{i:n}(x) &=&\,K\, \sum_{j=0}^{n-i} (-1)^{j}\,{n - i \choose j}\,f(x)\,F (x)^{j+i-1} \\ &=&K\, \sum_{j = 0}^{n - i} (-1)^{j}\,{n - i \choose j}\, \left[\sum_{r=0}^{\infty} v_{r}\,(r+1)\,G(x)^{r}\,g(x)\right] \left[\sum_{k=0}^{\infty} w_{k}\,G(x)^{k}\right]^{j+i-1}, \end{array} $$

where K=n!/[(i−1)!\,(n−i)!]. Using (29) and (30), we can write

$$\begin{array}{@{}rcl@{}} \left[\sum_{k=0}^{\infty} w_{k}\,G(x)^{k}\right]^{j+i-1}= \sum_{k=0}^{\infty} f_{j+i-1,k}\,\,G(x)^{k}, \end{array} $$

where f j+i−1,0=(w 0)j+i−1,

$$\begin{array}{@{}rcl@{}} f_{j+i-1,k}=(k\,w_{0})^{-1}\sum_{m=1}^{k}[m(j+i)-k]\,w_{m}\,f_{j+i-1,k-m}, \qquad \text{for}\quad k=1,2,\ldots \end{array} $$

and w k is given by (9). Hence,

$$\begin{array}{@{}rcl@{}} f_{i:n}(x)= \sum_{j=0}^{n-i}\,\sum_{r,k=0}^{\infty}\,m_{\text{\textit{j,r,k}}}\,h_{k+r+1}(x), \end{array} $$
((34))

where

$$\begin{array}{@{}rcl@{}} m_{\text{\textit{j,r,k}}}=\frac {(r+1)(-1)^{j}\,n! \,\,v_{r}\,\,f_{j+i-1,k}} {(k+r+1)\,j!\,(i-1)!\,(n-i-j)!}. \end{array} $$

Equation (34) is the main result of this section. It reveals that the pdf of the GW-G order statistics is a triple linear combination of exp-G density functions. So, several mathematical quantities of these order statistics like ordinary, incomplete and factorial moments, mgf, mean deviations and several others can be obtained from those quantities of generalized Weibull-G distributions. Clearly, the cdf of X i:n can be expressed as

$$\begin{array}{@{}rcl@{}} F_{i:n}(x)=\sum_{j = 0}^{n-i} \sum_{r,k=0}^{\infty} m_{\text{\textit{j,r,k}}}\,H_{k+r+1}(x). \end{array} $$
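
The alternating-sum form of f_{i:n}(x) used at the beginning of this section is easy to evaluate numerically; the sketch below (GW-N case, illustrative) cross-checks it against the familiar closed form K f(x) F(x)^{i−1}[1−F(x)]^{n−i}.

```python
import numpy as np
from math import comb, factorial
from scipy.stats import norm

def gwn_pdf(x, alpha, beta):
    sf = norm.sf(x)
    u = -np.log(sf)
    return alpha * beta * norm.pdf(x) / sf * u**(beta - 1) * np.exp(-alpha * u**beta)

def gwn_cdf(x, alpha, beta):
    return 1.0 - np.exp(-alpha * (-np.log(norm.sf(x)))**beta)

def order_stat_pdf(x, i, n, alpha, beta):
    """f_{i:n}(x) as the alternating sum K * sum_j (-1)^j C(n-i,j) f(x) F(x)^(j+i-1)."""
    K = factorial(n) / (factorial(i - 1) * factorial(n - i))
    f, F = gwn_pdf(x, alpha, beta), gwn_cdf(x, alpha, beta)
    return K * sum((-1)**j * comb(n - i, j) * f * F**(j + i - 1) for j in range(n - i + 1))

# Cross-check against the closed form K * f * F^(i-1) * (1-F)^(n-i).
x, i, n, alpha, beta = 0.3, 2, 5, 1.5, 2.0
f, F = gwn_pdf(x, alpha, beta), gwn_cdf(x, alpha, beta)
K = factorial(n) / (factorial(i - 1) * factorial(n - i))
assert np.isclose(order_stat_pdf(x, i, n, alpha, beta),
                  K * f * F**(i - 1) * (1 - F)**(n - i))
```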

Maximum likelihood estimation

Several approaches for parameter point estimation were proposed in the literature but the maximum likelihood method is the most commonly employed. The maximum likelihood estimates (MLEs) enjoy desirable properties and can be used when constructing confidence intervals and also in test statistics. Large sample theory for these estimates delivers simple approximations that work well in finite samples. Statisticians often seek to approximate quantities such as the density of a test statistic that depend on the sample size in order to obtain better approximate distributions. The resulting approximation for the MLEs in distribution theory is easily handled either analytically or numerically. The goodness of fit statistics including the Akaike information criterion (AIC), Bayesian information criterion (BIC), Consistent Akaike information criterion (CAIC), Anderson-Darling (A) and Cramér–von Mises (W) are computed to compare the fitted models.

Here, we consider estimation of the unknown parameters of the GW-G distribution by the method of maximum likelihood. Let x_1, …, x_n be a sample from (2) and θ=(α,β,η^T)^T be the vector of parameters of dimension (q+2). The log-likelihood function for θ is given by

$$\begin{array}{@{}rcl@{}} l({\boldsymbol{\theta}}) &=& n\left[\log(\alpha) + \log(\beta)\right] + \sum_{i = 1}^{n} \log \left[g(x_{i};{\boldsymbol{\eta}})\right] - \sum_{i = 1}^{n} \log \left[ 1-G(x_{i};{\boldsymbol{\eta}}) \right] \\ &&+\ (\beta-1) \sum_{i = 1}^{n} \log \left\{- \log \left[1-G(x_{i};{\boldsymbol{\eta}}) \right]\right\} -\alpha \sum_{i = 1}^{n} \left\{ - \log\left[1-G(x_{i};{\boldsymbol{\eta}}) \right] \right\}^{\beta}. \end{array} $$
((35))

The score functions for the parameters α, β and η are easily derived analytically as

$$\begin{array}{@{}rcl@{}} U_{\alpha}({\boldsymbol{\theta}})&=&\frac{n}{\alpha}- \sum_{i = 1}^{n} \left\{ - \log [\!1- G(x_{i};{\boldsymbol{\eta}})]\right\}^{\beta}, \\ U_{\beta}({\boldsymbol{\theta}})&=&\frac{n}{\beta}+ \sum_{i = 1}^{n} \log \left\{ - \log [\!1- G(x_{i};{\boldsymbol{\eta}})]\right\} \\ &&- \alpha \sum_{i = 1}^{n} \left\{ - \log [\!1- G(x_{i};{\boldsymbol{\eta}})]\right\}^{\beta} \log \{-\log [\! 1- G(x_{i};{\boldsymbol{\eta}})]\} \\ \text{and}\\ U_{{\boldsymbol{\eta}}}({\boldsymbol{\theta}})\!\! &=&\!\!\sum_{i = 1}^{n} \!\frac{\partial g \left(x_{i}; {\boldsymbol{\eta}}\right)\!/\partial{{\boldsymbol{\eta}}}} { g \left(x_{i};{\boldsymbol{\eta}} \right)} \,+\,\! \sum_{i = 1}^{n}\! \frac{\partial G (x_{i};{\boldsymbol{\eta}})/\partial{{\boldsymbol{\eta}}}}{[\!1-G(x_{i};{\boldsymbol{\eta}})]} \,+\, (1-\beta)\! \sum_{i = 1}^{n}\! \frac{\partial G \left(x_{i};{\boldsymbol{\eta}} \right)\!/\partial{{\boldsymbol{\eta}}}} {\log [\!1- G(x_{i};{\boldsymbol{\eta}})]\, [\!1-\!G(x_{i};{\boldsymbol{\eta}})\!]} \\ &&- \alpha\beta \sum_{i = 1}^{n}\left\{ -\log [\!1-G(x_{i};{\boldsymbol{\eta}})] \right\}^{\beta-1} \left\{\frac{\partial G (x_{i};{\boldsymbol{\eta}})/\partial{{\boldsymbol{\eta}}}}{[\!1-G(x_{i};{\boldsymbol{\eta}})]}\right\}, \end{array} $$

respectively.

The MLE \(\widehat {\boldsymbol {\theta }}\) of θ is obtained by solving the nonlinear likelihood equations U α (θ)=0, U β (θ)=0 and U η (θ)=0. These equations cannot be solved analytically and statistical software can be used to solve them numerically. We can use iterative techniques such as a Newton-Raphson type algorithm to obtain \(\widehat {{\boldsymbol {\theta }}}\). We employ the numerical procedure NLMixed in SAS.
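
The log-likelihood (35) can equally be maximized in any numerical environment; the sketch below is an illustrative Python/SciPy version for the GW-N model (not the SAS code used in the paper), with a log re-parametrization of the positive parameters and starting values that are simply reasonable guesses.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

def negloglik(theta, x):
    """Negative GW-N log-likelihood (35); theta = (log alpha, log beta, mu, log sigma)."""
    alpha, beta = np.exp(theta[0]), np.exp(theta[1])
    mu, sigma = theta[2], np.exp(theta[3])
    z = (x - mu) / sigma
    sf = norm.sf(z)                              # 1 - G(x; eta)
    u = -np.log(sf)                              # -log[1 - G(x; eta)]
    ll = (np.log(alpha * beta / sigma) + norm.logpdf(z)
          - np.log(sf) + (beta - 1) * np.log(u) - alpha * u**beta)
    return -np.sum(ll)

# Simulate a GW-N(2, 1.5, 0, 1) sample by inverse transform, Eq. (11).
rng = np.random.default_rng(1)
uu = rng.uniform(size=300)
x = norm.ppf(1.0 - np.exp(-(-np.log1p(-uu) / 2.0)**(1.0 / 1.5)))

start = np.array([0.0, 0.0, np.mean(x), np.log(np.std(x))])
fit = minimize(negloglik, start, args=(x,), method="Nelder-Mead")
alpha_hat, beta_hat = np.exp(fit.x[0]), np.exp(fit.x[1])
mu_hat, sigma_hat = fit.x[2], np.exp(fit.x[3])
print(alpha_hat, beta_hat, mu_hat, sigma_hat)
```

The log re-parametrization keeps α, β and σ positive without constrained optimization; approximate standard errors can then be obtained by inverting the observed information matrix evaluated numerically at the maximum, as described next.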

Let J(θ)={J_{ab}} be the (q+2)×(q+2) observed information matrix (for a,b=α,β,η), whose elements can be calculated numerically. Based on the approximate multivariate normal \(N_{q+2}(0,J(\widehat {\boldsymbol {\theta }})^{-1})\) distribution of θ̂, we can construct approximate confidence intervals for the model parameters. We can compute the maximum values of the unrestricted and restricted log-likelihoods to obtain likelihood ratio (LR) statistics for testing some sub-models of the GW-G distribution. Hypothesis tests of the type H_0: ψ=ψ_0 versus H_1: ψ≠ψ_0, where ψ is a vector formed with some components of θ and ψ_0 is a specified vector, can be performed using LR statistics. For example, the test of H_0: α=β=1 versus H_1: H_0 is not true is equivalent to comparing the GW-G and G distributions, and the LR statistic is given by

$$w=2\left\{\ell\left(\widehat{\alpha},\widehat{\beta},\widehat{{\boldsymbol{\eta}}}\right)-\ell(1,1,\widetilde{{\boldsymbol{\eta}}})\right\}, $$

where \(\widehat {\alpha }\), \(\widehat {\beta }\) and \(\widehat {{\boldsymbol {\eta }}}\) are the MLEs under H_1 and \(\widetilde {{\boldsymbol {\eta }}}\) is the estimate under H_0.

Regression models

In many practical applications, the lifetimes are affected by explanatory variables such as the cholesterol level, blood pressure, weight and many others. Parametric models to estimate univariate survival functions and for censored data regression problems are widely used. A regression model that provides a good fit to lifetime data tends to yield more precise estimates of the quantities of interest.

Let X be a random variable having the pdf (2). A class of regression models for location and scale is characterized by the fact that the random variable Y= log(X) has a distribution with location parameter μ(v), which depends only on the explanatory variable vector, and a scale parameter σ. Then, we can write

$$Y=\mu(\textbf{v})+\sigma Z, $$

where σ>0 and Z has a distribution that does not depend on v. The random variable Y (for \(y \in \mathbb{R}\)) has density function given by

$$\begin{array}{@{}rcl@{}} f(y;\alpha,\beta,\mu,\sigma)&=&\frac{\alpha\,\beta}{\sigma}\frac{g\left(\frac{y-\mu(\textbf{v})}{\sigma}\right)} {\left[1-G\left(\frac{y-\mu(\textbf{v})}{\sigma}\right)\right]} \left\{-\log\left[1-G\left(\frac{y-\mu(\textbf{v})}{\sigma}\right)\right]\right\}^{\beta-1}\\ && \times \exp\left\{-\alpha\left(-\log\left[1-G\left(\frac{y-\mu(\textbf{v})}{\sigma}\right)\right]\right)^{\beta}\right\}, \end{array} $$
((36))

where the functions G(·) and g(·) are defined in Section 1.

For illustrative purposes, let X be a random variable having the GW-LL density function defined in Section 2.4. The density of Y= log(X), re-parameterized in terms of μ= log(a) and σ=1/γ, is given by

$$\begin{array}{@{}rcl@{}} f(y)&=& \frac{\alpha\beta} {\sigma}\, \exp\Big(\frac{y-\mu}{\sigma}\Big) \left[1+ \exp\Big(\frac{y-\mu}{\sigma}\Big) \right]^{-1} \left\{\log\left[1+\exp\Big(\frac{y-\mu}{\sigma}\Big)\right]\right\}^{\beta-1} \\ &&\times \exp\left\{-\alpha\left[\log\left(1+\exp\Big(\frac{y-\mu}{\sigma}\Big)\right)\right]^{\beta}\right\}, \qquad y \in \Re, \end{array} $$
((37))

where α>0 and β>0 are shape parameters, μ is the location parameter and σ>0 is the scale parameter.

We refer to Eq. (37) as the log-generalized Weibull-log-logistic (LGW-LL) distribution, say Y ∼ LGW-LL(α,β,μ,σ). If X ∼ GW-LL(α,β,a,γ), then Y = log(X) ∼ LGW-LL(α,β,μ,σ). For α=β=1, we obtain the logistic model. The survival function corresponding to (37) is given by

$$\begin{array}{@{}rcl@{}} S(y)=\exp\left\{-\alpha\left[\log\left(1+ \exp\left(\frac{y-\mu}{\sigma}\right)\right)\right]^{\beta}\right\}. \end{array} $$
((38))

Plots of the density function (37) for selected parameter values are displayed in Fig. 9, which show great flexibility for different values of α and β.

Fig. 9

Plots of the LGW-LL density (37) for some parameter values. a For different values of α with β=2.5, μ=0 and σ=1. b For different values of β with α=0.1, μ=0 and σ=1.0. c For different values of α and β with μ=0 and σ=1.0

Now, we define the standardized random variable Z=(Yμ)/σ having the density function

$$\begin{array}{@{}rcl@{}} f(z)=\frac{\alpha\,\beta\,\exp(z)}{\left[1+ \exp(z)\right] } \left\{\log\left[1+\exp(z)\right]\right\}^{\beta-1} \exp\left\{-\alpha\left[\log\left\{1+\exp(z)\right\}\right]^{\beta}\right\}. \end{array} $$
((39))

Next, we propose a linear location-scale regression model linking the response variable y i and the explanatory variable vector \(\textbf {v}_{i}^{T}=(v_{i1},\ldots,v_{\textit {ip}})\) as follows

$$ y_{i} = \textbf{v}_{i}^{T} {\boldsymbol{\tau}} + \sigma\,z_{i}, \,\,i=1, \ldots,n, $$
((40))

where the random error z_i has density function (39), and τ=(τ_1,…,τ_p)^T, σ>0, α>0 and β>0 are unknown parameters. The location of y_i is \(\mu_{i}=\textbf {v}_{i}^{T} {\boldsymbol {\tau }}\). The location parameter vector μ=(μ_1,…,μ_n)^T is represented by a linear model μ=V τ, where V=(v_1,…,v_n)^T is a known model matrix. The LGW-LL model (40) opens new possibilities for fitting many different types of data.

Consider a sample (y 1,v 1),…,(y n ,v n ) of n independent observations, where each random response is defined by y i = min{log(x i ), log(c i )}. We assume non-informative censoring such that the observed lifetimes and censoring times are independent. Let F and C be the sets of individuals for which y i is the log-lifetime or log-censoring, respectively. Conventional likelihood estimation techniques can be applied here. The log-likelihood function for the vector of parameters θ=(α,β,σ,τ T)T from model (40) has the form \(l({\boldsymbol {\theta }})=\sum \limits _{i \in F}l_{i}({\boldsymbol {\theta }})+\sum \limits _{i \in C}l_{i}^{(c)}({\boldsymbol {\theta }})\), where \(l_{i}({\boldsymbol {\theta }})=\log [f(y_{i})]\), \(l_{i}^{(c)}({\boldsymbol {\theta }})=\log [\!S(y_{i})]\), f(y i ) is the density (37) and S(y i ) is the survival function (38) of Y i . The total log-likelihood function for θ reduces to

$$\begin{array}{@{}rcl@{}} l({\boldsymbol{\theta}})&=&r\log\left(\frac{\alpha\,\beta}{\sigma}\right)+\sum_{i \in F}z_{i}-\sum_{i \in F}\log\left[1+\exp(z_{i})\right] +(\beta-1)\sum_{i \in F}\log\left\{\log\left[1+\exp(z_{i})\right]\right\} \\ && -\alpha\sum_{i \in F}\log^{\beta}\left[1+\exp(z_{i})\right] -\alpha\sum_{i \in C}\log^{\beta}\left[1+\exp(z_{i})\right], \end{array} $$
((41))

where r is the number of uncensored observations (failures). The MLE \(\widehat {{\boldsymbol {\theta }}}\) of the vector of unknown parameters can be calculated by maximizing the log-likelihood (41). We use the procedure NLMixed in SAS to calculate the estimate \(\widehat {{\boldsymbol {\theta }}}\). Initial values for β and σ are taken from the fit of the log-Weibull regression model with α=0 and β=1.
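
An illustrative sketch of the censored log-likelihood (41) with a single binary covariate follows; the toy data, variable names and starting values are assumptions made only to exercise the code, not the entomology data analysed later, and the optimizer is a simple SciPy routine rather than the SAS procedure used by the authors.

```python
import numpy as np
from scipy.optimize import minimize

def neg_loglik(theta, y, v, delta):
    """Negative of (41); theta = (log alpha, log beta, log sigma, tau0, tau1);
    delta = 1 for an observed log-lifetime, 0 for a censored one."""
    alpha, beta, sigma = np.exp(theta[:3])
    tau0, tau1 = theta[3], theta[4]
    z = (y - (tau0 + tau1 * v)) / sigma
    w = np.logaddexp(0.0, z)                      # log(1 + exp(z)), overflow-safe
    ll_obs = (np.log(alpha * beta / sigma) + z - w
              + (beta - 1) * np.log(w) - alpha * w**beta)
    ll_cens = -alpha * w**beta                    # log S(y), Eq. (38)
    return -np.sum(delta * ll_obs + (1 - delta) * ll_cens)

# Toy data (purely illustrative): y = log-lifetimes, v = group indicator,
# delta = censoring indicator.
rng = np.random.default_rng(7)
n = 80
v = rng.integers(0, 2, size=n)
y = 1.0 + 0.5 * v + 0.3 * rng.logistic(size=n)     # logistic errors as a stand-in
delta = (rng.uniform(size=n) > 0.2).astype(float)  # roughly 20% censored

start = np.array([0.0, 0.0, np.log(np.std(y)), np.mean(y), 0.0])
fit = minimize(neg_loglik, start, args=(y, v, delta), method="Nelder-Mead")
print(np.exp(fit.x[:3]), fit.x[3:])                # alpha, beta, sigma, tau0, tau1
```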

The elements of the (p+3)×(p+3) observed information matrix \(-\ddot {\textbf {L}}({\boldsymbol {\theta }})\), namely −L_{αα}, −L_{αβ}, −L_{ασ}, −L_{ατ_j}, −L_{ββ}, −L_{βσ}, −L_{βτ_j}, −L_{σσ}, −L_{στ_j} and −L_{τ_jτ_s} (for j,s=1,…,p), can be calculated numerically. Inference on θ can be conducted in the classical way based on the approximate multivariate normal \(N_{p+3}\left (0,-\ddot {\textbf {L}}(\widehat {{\boldsymbol {\theta }}})^{-1}\right)\) distribution for \(\widehat {{\boldsymbol {\theta }}}\). Further, we can use LR statistics for comparing the LGW-LL model with some of its sub-models.

13.1 Simulation

To simulate from the GW-N distribution, we use Eq. (11) with U a uniform random variable on (0,1). We simulate the GW-N(α=2, β, μ=0, σ=1) model for β=1.5 and β=0.5 and sample sizes n = 50, 150 and 300. For each sample size, we compute the MLEs of α, β, μ and σ. Then, we repeat this process 1000 times and compute the average estimates (AEs), biases and mean squared errors (MSEs). The results are reported in Table 1.

Table 1 The AEs, biases and MSEs based on 1000 simulations of the GW-N distribution with α=2, β=1.5 and 0.5, μ=0 and σ=1, for n=50, 150 and 300

Based on the figures in Table 1, we note that the MSEs of the MLEs of α, β, μ and σ decay toward zero as the sample size increases, as usually expected under standard regularity conditions. As the sample size n increases, the mean estimates of the parameters tend to be closer to the true parameter values. This fact supports that the asymptotic normal distribution provides an adequate approximation to the finite sample distribution of the estimates. The usual normal approximation can oftentimes be improved by making bias adjustments to the MLEs. Approximations to the biases of the MLEs in simple models may be obtained analytically. In order to improve the accuracy of these estimates using analytical bias reduction, one needs to obtain several cumulants of log-likelihood derivatives, which are notoriously cumbersome for the proposed model. In Fig. 10, we present the true density and the density evaluated at the average parameter estimates for different sample sizes.
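
A compact version of this simulation exercise can be written as follows (illustrative; the replication count is reduced for speed, and the optimizer settings are simple SciPy defaults rather than the procedure used by the authors).

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

TRUE = dict(alpha=2.0, beta=1.5, mu=0.0, sigma=1.0)

def simulate(n, rng):
    u = rng.uniform(size=n)
    t = (-np.log1p(-u) / TRUE["alpha"])**(1.0 / TRUE["beta"])
    return TRUE["mu"] + TRUE["sigma"] * norm.ppf(1.0 - np.exp(-t))

def negloglik(theta, x):
    alpha, beta = np.exp(theta[0]), np.exp(theta[1])
    mu, sigma = theta[2], np.exp(theta[3])
    z = (x - mu) / sigma
    sf = norm.sf(z)
    u = -np.log(sf)
    return -np.sum(np.log(alpha * beta / sigma) + norm.logpdf(z) - np.log(sf)
                   + (beta - 1) * np.log(u) - alpha * u**beta)

rng = np.random.default_rng(42)
estimates = []
for _ in range(200):                          # the paper uses 1000 replications
    x = simulate(150, rng)
    start = np.array([0.0, 0.0, np.mean(x), np.log(np.std(x))])
    fit = minimize(negloglik, start, args=(x,), method="Nelder-Mead")
    estimates.append([np.exp(fit.x[0]), np.exp(fit.x[1]), fit.x[2], np.exp(fit.x[3])])

est = np.array(estimates)
truth = np.array([TRUE["alpha"], TRUE["beta"], TRUE["mu"], TRUE["sigma"]])
print("AE:  ", est.mean(axis=0))
print("bias:", est.mean(axis=0) - truth)
print("MSE: ", ((est - truth)**2).mean(axis=0))
```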

Fig. 10

Average estimates for different sample sizes of the GW-N distribution for fixed values α = 2, μ = 0, σ = 1 and: ac β = 1.5; df β = 0.5

Applications

In this section, we present two applications to real data. In the first, the computations were performed using the goodness.fit subroutine in the AdequacyModel package of the R software. In the second application, for censored data, the computations were done using the NLMixed subroutine of the SAS software.

14.1 Data: Strengths of glass fibers

The data set (n=63) consists of the strengths of 1.5 cm glass fibers from Smith and Naylor (1987), contained in the gamlss.data library of the R software. Barreto-Souza et al. (2010) fitted the beta generalized exponential (BGE) distribution to these data and proved that its fit is better than those of the beta exponential (BE) (Nadarajah and Kotz 2006b) and generalized exponential (GE) (Gupta and Kundu 1999) distributions. Barreto-Souza et al. (2011) proved that the beta Fréchet (BF) distribution gives a better fit than the Fréchet and exponentiated Fréchet (EF) (Nadarajah and Kotz 2003) distributions. Alzaghal et al. (2013) fitted the exponentiated Weibull-exponential (EWE) distribution to the current data and concluded that this distribution provides a better fit than the BGE and BF distributions. Recently, Bourguignon et al. (2014) fitted the Weibull-exponential (WE) distribution and showed that it is better than the exponentiated Weibull (EW) (Mudholkar and Srivastava 1993) and exponentiated exponential (EE) (Gupta and Kundu 1999) models.

Now, we compare the EWE and WE models with some other GW-G models fitted to these data. We also present the fits of the baseline distributions to compare the gain with the generated distributions. Table 2 provides the MLEs (and the corresponding standard errors in parentheses) of the model parameters and the values of the statistics AIC, BIC, A and W for some models.

Table 2 Estimates of the model parameters for the strengths of glass fibers data, the corresponding SEs (given in parentheses) and the AIC, BIC, A and W statistics

Formal tests for the extra skewness parameters (α,β) in the GW-N distribution are performed using LR statistics as described in Section 12. We compare the GW-N and normal models and the GW-LL and LL models, where the LR values are listed in Table 3. For the strengths of glass fibers data, we reject the null hypotheses of the LR tests in favor of the GW-N and GW-LL distributions thus indicating the gain added by the parameters α and β.

Table 3 LR tests

In order to assess whether the model is appropriate, Fig. 11a and b display the histogram of the current data and the fitted densities of the GW-N, N, GW-LL, GW-Gu, GW-LN, EWE and WE models. The figures in Table 2 and the plots of Fig. 11 indicate that the GW-Gu distribution has a significant gain compared with the other distributions.

Fig. 11

Estimated densities for strengths of glass fibers of the: a GW-N, N, GW-LL and LL models; b GW-Gu, GW-LN, EWE and WE models

Table 4 MLEs of the model parameters for the entomology data, the corresponding SEs (given in parentheses) and the statistics AIC, CAIC and BIC

14.2 Entomology data

The data come from a study carried out at the Department of Entomology of the Luiz de Queiroz School of Agriculture, University of São Paulo, which aimed to assess the longevity of the Mediterranean fruit fly (Ceratitis capitata). The need for this fly to seek food just after emerging from the larval stage has permitted the use of toxic baits for its management in Brazilian orchards for at least fifty years. This pest control technique consists of using small portions of food laced with an insecticide, generally an organophosphate, that quickly kills the flies, instead of using an insecticide alone. Recently, there have been reports of the insecticidal effect of extracts of the neem tree, leading to proposals to adopt various extracts (aqueous extract of the seeds, methanol extract of the leaves and dichloromethane extract of the branches) to control pests such as the Mediterranean fruit fly. The experiment was completely randomized with eleven treatments, consisting of different extracts of the neem tree at concentrations of 39, 225 and 888 ppm. After a preliminary statistical analysis, these eleven treatments were allocated into two groups, namely:

  • Group 1: Control 1 (deionized water); Control 2 (acetone - 5 %); aqueous extract of seeds (AES) (39 ppm); AES (225 ppm); AES (888 ppm); methanol extract of leaves (MEL) (225 ppm); MEL (888 ppm); and dichloromethane extract of branches (DMB) (39 ppm).

  • Group 2: MEL (39 ppm); DMB (225 ppm) and DMB (888 ppm).

For more details, see Silva et al. (2013). The response variable in the experiment is the lifetime of the adult flies in days after exposure to the treatments. The experimental period was set at 51 days, so that the flies that survived beyond this period were considered as censored observations. The total sample size was n=72, because four cases were lost. Therefore, the variables used in this study were: x_i - lifetime of Ceratitis capitata adults in days, δ_i - censoring indicator and v_{i1} - group (1 = group 1, 0 = group 2). We start the analysis of the data considering only the failure (x_i) and censoring (δ_i) data.

Recently, Alexander et al. (2012) analyzed these data using the McDonald-Weibull (McW) distribution with scale parameter β>0 and shape parameter λ>0. We focus on this distribution since it extends various distributions previously discussed in the lifetime literature, such as the beta Weibull (BW) (Lee et al. 2007), Kumaraswamy Weibull (KwW) (Cordeiro et al. 2010) and exponentiated Weibull (EW) (Mudholkar et al. 1995) distributions, among others.

Now, we compare the McW distribution and some of its sub-models. For some fitted models, Table 4 provides the MLEs (and the corresponding standard errors in parentheses) of the parameters and the values of the AIC, BIC and CAIC statistics. The computations were performed using the NLMixed subroutine in SAS. They indicate that the GW-LL model has the lowest AIC, BIC and CAIC values among the fitted models, and therefore it could be chosen as the best model.

In order to assess if the model is appropriate, Fig. 12 a displays the empirical and estimated cumulative distributions for the fitted GW-LL and LL distributions to the current data. Further, Fig. 12 b gives the plots of the empirical survival function and the estimated GW-LL and LL survival functions. These plots indicate the GW-LL model provides a good fit to these data.

Fig. 12

a Estimated GW-LL and LL cdf for the entomology data. b Estimated GW-LL and LL survival function and the empirical survival for the entomology data

Now, we present results by fitting the model

$$\begin{array}{@{}rcl@{}} y_{i}=\tau_{0}+\tau_{1}v_{i1}+\sigma z_{i}, \end{array} $$

where the random variable Y i follows the LGW-LL distribution given in (37). The MLEs of the model parameters and the asymptotic standard errors of these estimates calculated using the NLMixed procedure in SAS are listed in Table 5.

Table 5 MLEs of the parameters from the fitted LGW-LL regression model to the entomology data, the corresponding SEs (given in parentheses), p-values in [ ·] and the statistics AIC, CAIC and BIC

A summary of the values of the measures AIC, CAIC and BIC to compare the LGW-LL and logistic regression models is given in Table 5. We conclude that the fitted LGW-LL regression model has the lowest AIC, CAIC and BIC values compared with those values of the fitted logistic model. Figure 13 provides the plots of the estimated survival function and estimated cdf of the LGW-LL distribution. These plots indicate this regression model provides a good fit to these data.

Fig. 13

Estimated LGW-LL for: a survival function and empirical survival. b cdf and empirical cdf for group 1. c cdf and empirical cdf for group 2

Conclusions

We study some mathematical properties of a new generalized Weibull family of distributions with two extra positive parameters. The family is able to generalize any continuous distribution. We provide some special models, a very useful mixture representation in terms of exponentiated distributions, explicit expressions for the ordinary and incomplete moments, generating function, mean deviations, probability weighted moments, entropies, reliability and order statistics. The model parameters are estimated by the method of maximum likelihood. We introduce a location-scale regression model based on the new family. The importance of the proposed models is illustrated by means of two real life data sets. The new models provide consistently better fits than other competitive models for these data.

References

  • Alexander, C, Cordeiro, GM, Ortega, EMM, Sarabia, JM: Generalized beta-generated distributions. Comput. Stat. Data Anal. 56, 1880–1897 (2012).

  • Alzaatreh, A, Lee, C, Famoye, F: A new method for generating families of continuous distributions. Metron 71, 63–79 (2013).

  • Alzaghal, A, Famoye, F, Lee, C: Exponentiated T-X family of distributions with some applications. Int. J. Stat. Probab. 2, 31 (2013).

  • Barreto-Souza, W, Santos, AH, Cordeiro, GM: The beta generalized exponential distribution. J. Stat. Comput. Simul. 80, 159–172 (2010).

  • Barreto-Souza, W, Cordeiro, GM, Simas, AB: Some results for beta Fréchet distribution. Commun. Stat. Theory Methods 40, 798–811 (2011).

  • Bourguignon, M, Silva, RB, Cordeiro, GM: The Weibull-G family of probability distributions. J. Data Sci. 12, 53–68 (2014).

  • Cordeiro, GM, Ortega, EMM, Nadarajah, S: The Kumaraswamy Weibull distribution with application to failure data. J. Frankl. Inst. 347, 1399–1429 (2010).

  • Flajolet, P, Odlyzko, A: Singularity analysis of generating functions. SIAM J. Discr. Math. 3, 216–240 (1990).

  • Flajolet, P, Sedgewick, R: Analytic Combinatorics. Cambridge University Press, Cambridge (2009).

  • Gradshteyn, IS, Ryzhik, IM: Table of Integrals, Series, and Products, seventh edition. Academic Press, San Diego (2007).

  • Gupta, RC, Gupta, PL, Gupta, RD: Modeling failure time data by Lehman alternatives. Commun. Stat. Theory Methods 27, 887–904 (1998).

  • Gupta, RD, Kundu, D: Generalized exponential distributions. Aust. N. Z. J. Stat. 41, 173–188 (1999).

  • Gupta, RD, Kundu, D: Exponentiated exponential family: an alternative to gamma and Weibull distributions. Biom. J. 43, 117–130 (2001).

  • Kakde, CS, Shirke, DT: On exponentiated lognormal distribution. Int. J. Agric. Stat. Sci. 2, 319–326 (2006).

  • Lee, C, Famoye, F, Olumolade, O: Beta Weibull distribution: some properties and applications to censored data. J. Modern Appl. Stat. Methods 6, 173–186 (2007).

  • Mudholkar, GS, Srivastava, DK: Exponentiated Weibull family for analyzing bathtub failure-rate data. IEEE Trans. Reliab. 42, 299–302 (1993).

  • Mudholkar, GS, Srivastava, DK, Freimer, M: The exponentiated Weibull family: a reanalysis of the bus-motor-failure data. Technometrics 37, 436–445 (1995).

  • Nadarajah, S: The exponentiated Gumbel distribution with climate application. Environmetrics 17, 13–23 (2005).

  • Nadarajah, S, Gupta, AK: The exponentiated gamma distribution with application to drought data. Calcutta Stat. Assoc. Bull. 59, 29–54 (2007).

  • Nadarajah, S, Kotz, S: The exponentiated Fréchet distribution. Interstat Electron. (2003). http://interstat.statjournals.net/YEAR/2003/abstracts/0312001.php.

  • Nadarajah, S, Kotz, S: The exponentiated type distributions. Acta Applicandae Mathematicae 92, 97–111 (2006a).

  • Nadarajah, S, Kotz, S: The beta exponential distribution. Reliab. Eng. Syst. Saf. 91, 689–697 (2006b).

  • Nielsen, N: Handbuch der Theorie der Gammafunktion. Chelsea Publ. Co., New York (1906).

  • Prudnikov, AP, Brychkov, YA, Marichev, OI: Integrals and Series, Vols. 1, 2 and 3. Gordon and Breach Science Publishers, Amsterdam (1986).

  • Rényi, A: On measures of entropy and information, Vol. 1. University of California Press, Berkeley (1961).

  • Shannon, CE: Prediction and entropy of printed English. Bell Syst. Tech. J. 30, 50–64 (1951).

  • Silva, MA, Bezerra-Silva, GCD, Vendramim, JD, Mastrangelo, T: Sublethal effect of neem extract on Mediterranean fruit fly adults. Rev. Bras. Frutic. 35, 93–101 (2013).

  • Smith, RL, Naylor, JC: A comparison of maximum likelihood and Bayesian estimators for the three-parameter Weibull distribution. Appl. Stat. 36, 358–369 (1987).

  • Ward, M: The representation of Stirling's numbers and Stirling's polynomials as sums of factorials. Am. J. Math. 56, 87–95 (1934).


Corresponding author

Correspondence to Edwin M. M. Ortega.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.


Cite this article

Cordeiro, G.M., Ortega, E.M.M. & Ramires, T.G. A new generalized Weibull family of distributions: mathematical properties and applications. J Stat Distrib App 2, 13 (2015). https://doi.org/10.1186/s40488-015-0036-6

