qats.stats
Sub-package for statistics/distributions.
qats.stats.empirical
Basic functions for statistical inference.
Functions overview
| empirical_cdf | Empirical cumulative distribution function given a sample size. |
API
- empirical_cdf(n, kind='mean')#
Empirical cumulative distribution function given a sample size.
- Parameters:
- Returns:
Empirical cumulative distribution function
- Return type:
array
Notes
Gumbel recommended the following quantile formulation: Pi = i/(n+1). This formulation produces a symmetrical CDF in the sense that the same plotting positions will result from the data regardless of whether they are assembled in ascending or descending order.
Jenkinson’s/Beard’s method is based on the “idea that a natural estimate for the plotting position is the median of its probability density distribution”.
A more sophisticated formulation, Pi = (i-0.3)/(n+0.4), approximates the median of the distribution-free estimate of the sample variate to about 0.1% and, even for small values of n, produces parameter estimates comparable to those obtained by maximum likelihood estimation (Bury, 1999, p. 43).
The probability corresponding to the unbiased plotting position can be approximated by the Gringorten formula in the case of the type 1 extreme value distribution.
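The two plotting position formulas quoted in these notes can be sketched in plain Python. This is a minimal illustration, not the library's implementation: the helper name `plotting_positions` is hypothetical, and accepting `kind="median"` alongside the documented default `kind="mean"` is an assumption.

```python
def plotting_positions(n, kind="mean"):
    """Return plotting positions P_1..P_n for an ordered sample of size n.

    Hypothetical helper for illustration; not part of qats.
    """
    if n < 1:
        raise ValueError("n must be a positive integer")
    if kind == "mean":
        # Gumbel's mean rank formulation: Pi = i/(n+1)
        return [i / (n + 1) for i in range(1, n + 1)]
    if kind == "median":
        # Approximate median rank formulation: Pi = (i-0.3)/(n+0.4)
        return [(i - 0.3) / (n + 0.4) for i in range(1, n + 1)]
    raise ValueError(f"unknown kind: {kind!r}")


mean_rank = plotting_positions(4, kind="mean")
median_rank = plotting_positions(4, kind="median")
```

Both formulations are symmetric in the sense described above: the positions of the i-th smallest and i-th largest observation add up to 1.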
References
Plotting positions, About plotting positions
qats.stats.gumbel
Gumbel
class and functions related to Gumbel distribution.
Classes and functions overview
| Gumbel | The Gumbel maxima distribution. |
| bootstrap | Quantify mean and coefficient of variation of Gumbel distribution parameters using parametric bootstrapping |
| lse | Fit Gumbel distribution parameters to sample by method of least square fit to empirical cdf |
| mle | Fit distribution parameters to sample by maximum likelihood estimation |
| msm | Fit Gumbel distribution parameters to sample by method of sample moments |
| plot_fits | Plot data sample versus empirical and fitted cumulative distribution function on linearized Gumbel scales |
| pwm | Fit Gumbel distribution parameters to sample by method of probability weighted moments [7]. |
Class API
- class Gumbel(loc, scale, data=None)#
The Gumbel maxima distribution.
The cumulative distribution function is defined as:
F(x) = exp{-exp[-(x-a)/b]}
where a is location parameter and b is the scale parameter.
- Parameters:
loc (float) – Gumbel location parameter.
scale (float) – Gumbel scale parameter.
data (array_like, optional) – Sample data, used to establish empirical cdf and is included in plots. To fit the Gumbel distribution to the sample data, use
Gumbel.fit()
.
- Attributes:
loc (float) – Gumbel location parameter.
scale (float) – Gumbel scale parameter.
data (array_like) – Sample data.
Examples
To initiate an instance based on parameters, use:
>>> from qats.stats.gumbel import Gumbel
>>> gumb = Gumbel(loc, scale)
If you need to establish a Gumbel instance based on a sample data set, use:
>>> gumb = Gumbel.fit(data, method='msm')
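The cumulative distribution function defined for this class, together with the density and inverse that follow from it, can be written out in a few lines. The sketch below uses only the standard library; the function names are hypothetical and are not the `Gumbel` methods themselves.

```python
import math


def gumbel_cdf(x, loc, scale):
    """F(x) = exp{-exp[-(x-a)/b]} with a = loc and b = scale."""
    return math.exp(-math.exp(-(x - loc) / scale))


def gumbel_pdf(x, loc, scale):
    """Derivative of the CDF with respect to x."""
    z = (x - loc) / scale
    return math.exp(-z - math.exp(-z)) / scale


def gumbel_invcdf(p, loc, scale):
    """Solve F(x) = p for x, valid for 0 < p < 1."""
    return loc - scale * math.log(-math.log(p))


loc, scale = 10.0, 2.5  # illustrative parameter values
p_at_loc = gumbel_cdf(loc, loc, scale)  # equals exp(-1) at x = loc
x90 = gumbel_invcdf(0.9, loc, scale)    # value with 90 % non-exceedance probability
```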
References
Statistical models in applied science., Bury, K.V. (1975), Wiley, New York
Bruk av asymptotiske ekstremverdifordelinger, Haver, S. (2007)
Plotting positions, About plotting positions
Probability weighted moments, Greenwood, J. A.; Landwehr, J.M.; Matalas, N.C.; Wallis, J.R., 1979, Water Resources Research. 15(5): 1049-1054.
Probability weighted moments compared with some traditional techniques in estimating Gumbel parameters and quantiles, Landwehr, J.M.; Matalas, N.C.; Wallis, J.R., 1979, Water Resources Research. 15(5): 1063-1064.
Properties
Distribution coefficient of variation (C.O.V.)
Median rank empirical cumulative distribution function associated with the sample
Distribution kurtosis
Distribution mean value
Distribution median value
Distribution mode value
Mean squared error between the fitted cumulative distribution (a, b) and the empirical distribution
Distribution parameters.
Distribution skewness
Distribution standard deviation
Methods
cdf
([x])Cumulative distribution function (cumulative probability) for specified values x
fit
(data[, method, verbose])Determine distribution parameters by fit to sample.
fit_from_weibull_parameters
(wa, wb, wc, n[, ...])Calculate Gumbel distribution parameters from n independent Weibull distributed variables.
invcdf
([p])Inverse cumulative distribution function for specified probabilities
pdf
([x])Probability density function for specified values x
plot
([filename])Plot cumulative distribution function
plot_linear
([filename])Plot cumulative distribution function on linearized Gumbel scales
rnd
([size, seed])Draw random samples from probability distribution
- property cov#
Distribution coefficient of variation (C.O.V.)
- Returns:
Distribution c.o.v.
- Return type:
- property ecdf#
Median rank empirical cumulative distribution function associated with the sample
- Returns:
Empirical cumulative distribution function
- Return type:
array
Notes
Requires data/sample to be specified.
Gumbel recommended the following mean rank quantile formulation: Pi = i/(n+1). This formulation produces a symmetrical CDF in the sense that the same plotting positions will result from the data regardless of whether they are assembled in ascending or descending order.
A more sophisticated median rank formulation, Pi = (i-0.3)/(n+0.4), approximates the median of the distribution-free estimate of the sample variate to about 0.1% and, even for small values of n, produces parameter estimates comparable to those obtained by maximum likelihood estimation (Bury, 1999, p. 43). A median rank method, Pi = (i-0.3)/(n+0.4), is chosen to approximate the mean of the distribution [2].
The empirical cdf is also used as plotting positions when plotting the sample on probability paper.
- property mse#
Mean squared error between the fitted cumulative distribution (a, b) and the empirical distribution
- Returns:
mean squared error
- Return type:
Notes
Requires data/sample to be specified.
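One plausible reading of this quantity is the average squared difference between the fitted CDF, evaluated at the ordered sample values, and the median rank empirical CDF. The sketch below follows that reading; it is an assumption about the definition, and `gumbel_mse` and the sample values are hypothetical.

```python
import math


def gumbel_mse(sample, loc, scale):
    """Mean squared difference between fitted Gumbel CDF and median rank ECDF."""
    x = sorted(sample)
    n = len(x)
    total = 0.0
    for i, xi in enumerate(x, start=1):
        empirical = (i - 0.3) / (n + 0.4)                     # median rank position
        fitted = math.exp(-math.exp(-(xi - loc) / scale))     # Gumbel maxima CDF
        total += (fitted - empirical) ** 2
    return total / n


sample = [8.2, 9.1, 10.4, 11.0, 12.7, 15.3]  # illustrative data
mse_value = gumbel_mse(sample, loc=10.0, scale=2.5)
```

A smaller value indicates that the fitted distribution follows the plotting positions more closely.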
- property params#
Distribution parameters.
- Returns:
Distribution parameters: (loc, scale).
- Return type:
- property std#
Distribution standard deviation
- Returns:
Distribution standard deviation
- Return type:
- property skew#
Distribution skewness
- Returns:
Distribution skewness
- Return type:
Notes
zetac is the complementary Riemann zeta function (zeta function minus 1). See http://docs.scipy.org/doc/scipy/reference/generated/scipy.special.zetac.html
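For orientation, the standard closed-form moments of the Gumbel maxima distribution can be evaluated without SciPy. This is a sketch under stated assumptions: `gumbel_moments` is a hypothetical helper, and zeta(3) is hard-coded as a constant rather than computed through `zetac`.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant
APERY = 1.2020569031595942        # Riemann zeta(3), i.e. zetac(3) + 1


def gumbel_moments(loc, scale):
    """Closed-form summary statistics of the Gumbel maxima distribution."""
    mean = loc + EULER_GAMMA * scale
    std = math.pi * scale / math.sqrt(6.0)
    return {
        "mean": mean,
        "std": std,
        "cov": std / mean,
        "median": loc - scale * math.log(math.log(2.0)),
        "mode": loc,
        # The skewness does not depend on loc or scale
        "skew": 12.0 * math.sqrt(6.0) * APERY / math.pi ** 3,
    }


stats = gumbel_moments(10.0, 2.5)  # illustrative parameter values
```

Because the distribution is skewed to the right, the mode lies below the median, which lies below the mean.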
- cdf(x=None)#
Cumulative distribution function (cumulative probability) for specified values x
- Parameters:
x (array_like, optional) – Calculate cumulative probability for these values
- Returns:
Cumulative probabilities for specified values x
- Return type:
array
Notes
A range of x values [loc, loc+3*std] is applied if x is not specified.
- classmethod fit(data, method='msm', verbose=False)#
Determine distribution parameters by fit to sample.
- Parameters:
Examples
Assuming data is a sample array/list:
>>> from qats.stats.gumbel import Gumbel
>>> gumb = Gumbel.fit(data, method="msm")
- classmethod fit_from_weibull_parameters(wa, wb, wc, n, verbose=False)#
Calculate Gumbel distribution parameters from n independent Weibull distributed variables.
- Parameters:
Notes
A warning is issued if the Weibull shape parameter is less than 1. In this case, the convergence towards the asymptotic extreme value distribution is slow, and the asymptotic distribution will be non-conservative relative to the exact distribution. The asymptotic distribution is correct with Weibull shape equal to 1 and conservative with Weibull shape larger than 1. These deviations diminish with larger samples. See [1, p. 380].
References
Bury, Karl V., 1975, “Statistical Models in Applied Science”, University of British Columbia, John Wiley & Sons
Examples
>>> from qats.stats.gumbel import Gumbel
>>> gumb = Gumbel.fit_from_weibull_parameters(wa, wb, wc, n)
- invcdf(p=None)#
Inverse cumulative distribution function for specified probabilities
- Parameters:
p (array_like, optional) – Calculate the inverse cumulative distribution function for these probabilities
- Returns:
Values corresponding to the specified quantiles
- Return type:
array
Notes
A range of quantiles from 0.001 to 0.999 is applied if quantiles are not specified.
- pdf(x=None)#
Probability density function for specified values x
- Parameters:
x (array_like, optional) – Calculate probability density for these values
- Returns:
Probability density for specified values x
- Return type:
array
Notes
A range of x values [loc, loc+3*std] is applied if x is not specified.
- plot(filename=None)#
Plot cumulative distribution function
- Parameters:
filename (str, optional) – Save plot as filename, default is to show plot on screen
- plot_linear(filename=None)#
Plot cumulative distribution function on linearized Gumbel scales
- Parameters:
filename (str, optional) – Save plot as filename, default is to show plot on screen
- rnd(size=None, seed=None)#
Draw random samples from probability distribution
- Parameters:
- Returns:
Random sample
- Return type:
array
Examples
Pick 1000 values randomly from a Gumbel distribution
>>> from qats.stats.gumbel import Gumbel
>>> g = Gumbel(loc, scale)
>>> sample = g.rnd(size=1000)
If you want to preset the seed for the random sampling (to be able to repeat the sampling)
>>> from qats.stats.gumbel import Gumbel
>>> g = Gumbel(loc, scale)
>>> sample = g.rnd(size=1000, seed=3)
Functions API
- bootstrap(loc, scale, size, repetitions, method='pwm')#
Quantify mean and coefficient of variation of Gumbel distribution parameters using parametric bootstrapping
- Parameters:
loc (float) – Source distribution location parameter
scale (float) – Source distribution scale parameter
size (int) – Size of bootstrapped sample
method (str, optional) – Method of fit: ‘msm’ = method of sample moments, ‘lse’ = least-square estimation, ‘mle’ = maximum likelihood estimation, ‘pwm’ = probability weighted moments (default).
repetitions (int, optional) – Number of bootstrap samples. default equal to 100
- Returns:
array – Mean distribution parameters
array – Coefficient of variation of distribution parameter
Notes
In statistics, bootstrapping is a method for assigning measures of accuracy to sample estimates (variance, quantiles). This technique allows estimation of the sampling distribution of almost any statistic using only very simple methods. Generally, it falls in the broader class of resampling methods. In this case a parametric model is fitted to the data, and samples of random numbers with the same size as the original data, are drawn from this fitted model. Then the quantity, or estimate, of interest is calculated from these data. This sampling process is repeated many times as for other bootstrap methods. If the results really matter, as many samples as is reasonable, given available computing power and time, should be used. Increasing the number of samples cannot increase the amount of information in the original data, it can only reduce the effects of random sampling errors which can arise from a bootstrap procedure itself. See [5] about bootstrapping.
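The procedure described in these notes can be sketched with the standard library only: draw repeated samples from the source distribution, refit each one, and summarise the spread of the refitted parameters. The sketch assumes a method-of-moments refit and inverse-CDF sampling; all function names are hypothetical and it is not the library's implementation.

```python
import math
import random
import statistics

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def draw_gumbel(rng, loc, scale):
    """Draw one Gumbel (maxima) variate by inverting the CDF."""
    u = rng.random()
    while u == 0.0:  # avoid log(0)
        u = rng.random()
    return loc - scale * math.log(-math.log(u))


def fit_moments(sample):
    """Method-of-moments estimates (loc, scale) for a Gumbel sample."""
    scale = statistics.stdev(sample) * math.sqrt(6.0) / math.pi
    loc = statistics.fmean(sample) - EULER_GAMMA * scale
    return loc, scale


def bootstrap_gumbel(loc, scale, size, repetitions=100, seed=None):
    """Return (mean, c.o.v.) of the refitted (loc, scale) parameters."""
    rng = random.Random(seed)
    fits = [
        fit_moments([draw_gumbel(rng, loc, scale) for _ in range(size)])
        for _ in range(repetitions)
    ]
    mean, cov = [], []
    for values in zip(*fits):  # first all loc estimates, then all scale estimates
        mu = statistics.fmean(values)
        mean.append(mu)
        cov.append(statistics.stdev(values) / mu)
    return mean, cov


m, cv = bootstrap_gumbel(10.0, 2.5, size=50, repetitions=200, seed=1)
```

Increasing `repetitions` stabilises the reported mean and coefficient of variation, while increasing `size` reduces the coefficient of variation itself.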
Examples
To quantify the uncertainty (coefficient of variation) of a Gumbel distribution fitted to a sample with 5 values (using 100 repetitions):
>>> from qats.stats.gumbel import bootstrap
>>> m, cv = bootstrap(10, 2.5, 5, 100)
- lse(x)#
Fit Gumbel distribution parameters to sample by method of least square fit to empirical cdf
- Parameters:
x (array_like) – data sample
- Returns:
distribution loc and scale parameters
- Return type:
floats
Notes
Uses an approximate median rank estimate for the empirical cdf.
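A least-squares fit on linearized Gumbel scales can be sketched as follows. The CDF F(x) = exp{-exp[-(x-a)/b]} gives x = a + b*y with the reduced variate y = -ln(-ln F), so a straight line fitted to the ordered sample against y yields loc and scale. Regressing x on y (rather than y on x) is an assumption of this sketch, and `fit_least_squares` is a hypothetical name.

```python
import math


def fit_least_squares(sample):
    """Fit Gumbel (loc, scale) by linear regression on Gumbel paper."""
    x = sorted(sample)
    n = len(x)
    if n < 2:
        raise ValueError("need at least two observations")
    # Approximate median rank plotting positions for the ordered sample
    p = [(i - 0.3) / (n + 0.4) for i in range(1, n + 1)]
    # Reduced variate of each plotting position
    y = [-math.log(-math.log(pi)) for pi in p]
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxy = sum((yi - y_mean) * (xi - x_mean) for yi, xi in zip(y, x))
    syy = sum((yi - y_mean) ** 2 for yi in y)
    scale = sxy / syy                 # slope of the fitted line
    loc = x_mean - scale * y_mean     # intercept of the fitted line
    return loc, scale


loc_hat, scale_hat = fit_least_squares([8.2, 9.1, 10.4, 11.0, 12.7, 15.3])
```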
- mle(x)#
Fit distribution parameters to sample by maximum likelihood estimation
- Parameters:
x (array_like) – data sample
- Returns:
distribution loc and scale parameters
- Return type:
floats
Notes
The MLE equation set is given in ‘Statistical Distributions’ by Forbes et al. (2010) and is referred to in [4].
- msm(x)#
Fit Gumbel distribution parameters to sample by method of sample moments
- Parameters:
x (array_like) – data sample
- Returns:
distribution loc and scale parameters
- Return type:
floats
Notes
See description in [1] and [2].
- plot_fits(data, filename=None, methods=None)#
Plot data sample versus empirical and fitted cumulative distribution function on linearized Gumbel scales
- Parameters:
data (array_like) – Data sample
filename (str, optional) – Save plot as filename, default is to show plot on screen
methods (tuple, optional) – Methods of fit. Options (default all):
msm = method of sample moments
lse = least-square estimation
mle = maximum likelihood estimation
pwm = probability weighted moments
qats.stats.gumbelmin
GumbelMin
class and functions related to Gumbel (minima) distribution.
Classes and functions overview
| GumbelMin | The Gumbel minima distribution. |
| lse | Fit distribution parameters to sample by method of least square fit to empirical cdf |
| mle | Fit distribution parameters to sample by maximum likelihood estimation |
| msm | Fit distribution parameters to sample by method of sample moments |
Class API
- class GumbelMin(loc=None, scale=None, data=None)#
The Gumbel minima distribution.
The cumulative distribution function is defined as:
F(x) = 1 - exp{-exp[(x-a)/b]}
where a is location parameter and b is the scale parameter.
- Parameters:
loc (float) – Gumbel location parameter.
scale (float) – Gumbel scale parameter.
data (array_like, optional) – Sample data, used to establish empirical cdf and is included in plots. To fit the Gumbel distribution to the sample data, use
GumbelMin.fit()
.
- Attributes:
loc (float) – Gumbel location parameter.
scale (float) – Gumbel scale parameter.
data (array_like) – Sample data.
Examples
To initiate an instance based on parameters, use:
>>> from qats.stats.gumbelmin import GumbelMin
>>> gumb = GumbelMin(loc, scale)
If you need to establish a Gumbel instance based on a sample data set, use:
>>> gumb = GumbelMin.fit(data, method='msm')
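The minima CDF defined for this class and its inverse can be sketched in plain Python. The function names are hypothetical; the last lines illustrate the mirror relation to the maxima distribution, F_min(x; a, b) = 1 - F_max(-x; -a, b), which follows directly from the two CDF definitions.

```python
import math


def gumbelmin_cdf(x, loc, scale):
    """F(x) = 1 - exp{-exp[(x-a)/b]} with a = loc and b = scale."""
    return 1.0 - math.exp(-math.exp((x - loc) / scale))


def gumbelmin_invcdf(p, loc, scale):
    """Solve F(x) = p for x, valid for 0 < p < 1."""
    return loc + scale * math.log(-math.log(1.0 - p))


def gumbelmax_cdf(x, loc, scale):
    """Gumbel maxima CDF, used only to show the mirror relation."""
    return math.exp(-math.exp(-(x - loc) / scale))


loc, scale = 10.0, 2.5  # illustrative parameter values
p_at_loc = gumbelmin_cdf(loc, loc, scale)   # equals 1 - exp(-1) at x = loc
x10 = gumbelmin_invcdf(0.1, loc, scale)     # value with 10 % non-exceedance probability
mirrored = 1.0 - gumbelmax_cdf(-x10, -loc, scale)
```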
References
Bury, K.V. (1975) Statistical models in applied science. Wiley, New York
Haver, S. (2007), “Bruk av asymptotiske ekstremverdifordelinger”
Plotting positions, About plotting positions
Properties
Distribution coefficient of variation (C.O.V.)
Median rank empirical cumulative distribution function associated with the sample
Distribution kurtosis
Distribution mean value
Distribution median value
Distribution mode value
Distribution skewness
Distribution standard deviation
Methods
bootstrap
([size, method, N])Parametric bootstrapping of source distribution
cdf
([x])Cumulative distribution function (cumulative probability) for specified values x
fit
([data, method, verbose])Determine distribution parameters by fit to sample.
fit_from_weibull_parameters
(wa, wb, wc, n[, ...])Calculate Gumbel distribution parameters from n independent Weibull distributed variables.
gp_plot
([showfig, save])Plot data on Gumbel paper (linearized scales)
invcdf
([p])Inverse cumulative distribution function for specified quantiles p
pdf
([x])Probability density function for specified values x
plot
([showfig, save])Plot data on regular scales
rnd
([size, seed])Draw random samples from probability distribution
- property cov#
Distribution coefficient of variation (C.O.V.)
- Returns:
c – distribution c.o.v.
- Return type:
- property ecdf#
Median rank empirical cumulative distribution function associated with the sample
Notes
Gumbel recommended the following mean rank quantile formulation Pi = i/(n+1). This formulation produces a symmetrical CDF in the sense that the same plotting positions will result from the data regardless of whether they are assembled in ascending or descending order.
A more sophisticated median rank formulation, Pi = (i-0.3)/(n+0.4), approximates the median of the distribution-free estimate of the sample variate to about 0.1% and, even for small values of n, produces parameter estimates comparable to those obtained by maximum likelihood estimation (Bury, 1999, p. 43). A median rank method, Pi = (i-0.3)/(n+0.4), is chosen to approximate the mean of the distribution [2].
The empirical cdf is also used as plotting positions when plotting the sample on probability paper.
- property median#
Distribution median value
- Returns:
m – distribution median value
- Return type:
- property std#
Distribution standard deviation
- Returns:
s – distribution standard deviation
- Return type:
- bootstrap(size=None, method='msm', N=100)#
Parametric bootstrapping of source distribution
- Parameters:
- Returns:
array-like – m - mean distribution parameters
array_like – cv - coefficient of variation of distribution parameter
Notes
In statistics, bootstrapping is a method for assigning measures of accuracy to sample estimates (variance,quantiles). This technique allows estimation of the sampling distribution of almost any statistic using only very simple methods. Generally, it falls in the broader class of resampling methods. In this case a parametric model is fitted to the data, and samples of random numbers with the same size as the original data, are drawn from this fitted model. Then the quantity, or estimate, of interest is calculated from these data. This sampling process is repeated many times as for other bootstrap methods. If the results really matter, as many samples as is reasonable, given available computing power and time, should be used. Increasing the number of samples cannot increase the amount of information in the original data, it can only reduce the effects of random sampling errors which can arise from a bootstrap procedure itself. See [5] about bootstrapping.
- cdf(x=None)#
Cumulative distribution function (cumulative probability) for specified values x
- Parameters:
x (array_like) – values
- Returns:
cdf – cumulative probabilities for specified values x
- Return type:
array
Notes
A range of x values [location, location+3*std] are applied if x is not specified.
- fit(data=None, method='msm', verbose=False)#
Determine distribution parameters by fit to sample.
- Parameters:
data (array_like, optional) – Sample.
method ({'msm','lse','mle'}, optional) – Method of fit: ‘msm’ = method of sample moments, ‘lse’ = least-square estimation, ‘mle’ = maximum likelihood estimation.
verbose (bool) – Turn on output of fitted parameters.
Notes
If data is not given, any data stored in the object (self.data) is used.
- fit_from_weibull_parameters(wa, wb, wc, n, verbose=False)#
Calculate Gumbel distribution parameters from n independent Weibull distributed variables.
- Parameters:
Notes
A warning is issued if the Weibull shape parameter is less than 1. In this case, the convergence towards the asymptotic extreme value distribution is slow, and the asymptotic distribution will be non-conservative relative to the exact distribution. The asymptotic distribution is correct with Weibull shape equal to 1 and conservative with Weibull shape larger than 1. These deviations diminish with larger samples. See [1, p. 380].
- gp_plot(showfig=True, save=None)#
Plot data on Gumbel paper (linearized scales)
- Parameters:
showfig (bool) – show figure immediately on screen, default True
save (filename) – save figure to file, default None
- invcdf(p=None)#
Inverse cumulative distribution function for specified quantiles p
- Parameters:
p (array_like) – quantiles (i.e. cumulative probabilities)
- Returns:
x – values corresponding to the specified quantiles
- Return type:
array
Notes
A range of quantiles from 0.001 to 0.999 are applied if quantiles are not specified
- pdf(x=None)#
Probability density function for specified values x
- Parameters:
x (array_like) – values
- Returns:
pdf – probability density function for specified values x
- Return type:
array
Notes
A range of x values [location, location+3*std] are applied if x is not specified.
Functions API
- lse(x)#
Fit distribution parameters to sample by method of least square fit to empirical cdf
- Parameters:
x (array_like) – sample
Notes
Uses an approximate median rank estimate for the empirical cdf.
- mle(x)#
Fit distribution parameters to sample by maximum likelihood estimation
- Parameters:
x (array_like) – sample
Notes
The MLE equation set is given in ‘Statistical Distributions’ by Forbes et al. (2010) and is referred to in [4].
- msm(x)#
Fit distribution parameters to sample by method of sample moments
- Parameters:
x (array_like) – sample
Notes
See description in [1] and [2].
qats.stats.weibull
Weibull
class and functions related to Weibull distribution.
Classes and functions overview
| Weibull | The Weibull class offers miscellaneous functions for working with the Weibull distribution, defined by its cumulative distribution function. |
| bootstrap | Quantify mean and coefficient of variation of Weibull distribution parameters using parametric bootstrapping |
| lse | Fit Weibull distribution parameters to sample by method of least square fit to empirical cdf. |
| mle | Fit Weibull distribution parameters to sample by maximum likelihood estimation |
| mlj | Probability weighted moment Mljk of observation order l, order of cdf j, with emphasis on the right/upper tail (k=0). |
| msm | Fit Weibull distribution parameters to sample by method of sample moments |
| plot_fit | Plot data sample versus empirical and fitted cumulative distribution function on linearized Weibull scales |
| pwm | Fit distribution parameters to sample by method of probability weighted moments |
| pwm2 | Fit distribution parameters to sample by method of probability weighted moments assuming the location parameter is zero. |
| weibull2gumbel | Calculate parameters of the asymptotic Gumbel extreme value distribution (Type 1) for the extreme value of N independent, Weibull distributed variables. |
Class API
- class Weibull(loc, scale, shape, data=None)#
The Weibull class offers miscellaneous functions for working with the Weibull distribution, defined as (cumulative distribution function):
F(x) = 1 - exp{-[(x-a)/b]^c}
where a is location parameter, b is scale parameter and c is shape parameter.
- Parameters:
loc (float) – Weibull location parameter.
scale (float) – Weibull scale parameter.
shape (float) – Weibull shape parameter.
data (array_like, optional) – Sample data, used to establish empirical cdf and is included in plots. To fit the Weibull distribution to the sample data, use
Weibull.fit()
.
- Attributes:
loc (float) – Weibull location parameter.
scale (float) – Weibull scale parameter.
shape (float) – Weibull shape parameter.
data (array_like) – Sample data. Exists only if distribution parameters are estimated from a sample.
Notes
For a Weibull 2-parameter distribution, specify location parameter 0 (zero).
Examples
To initiate an instance based on parameters, use:
>>> from qats.stats.weibull import Weibull
>>> weib = Weibull(loc, scale, shape)
If you need to establish a Weibull instance based on a sample data set, use:
>>> weib = Weibull.fit(data, method='pwm')
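The three-parameter Weibull CDF defined for this class and its inverse can be sketched as below. The function names are hypothetical and the parameter values are only illustrative; setting `loc` to zero gives the 2-parameter distribution mentioned in the notes.

```python
import math


def weibull_cdf(x, loc, scale, shape):
    """F(x) = 1 - exp{-[(x-a)/b]^c}; zero at and below the location parameter."""
    if x <= loc:
        return 0.0
    return 1.0 - math.exp(-(((x - loc) / scale) ** shape))


def weibull_invcdf(p, loc, scale, shape):
    """Solve F(x) = p for x, valid for 0 <= p < 1."""
    return loc + scale * (-math.log(1.0 - p)) ** (1.0 / shape)


loc, scale, shape = 100.0, 15.0, 2.5
p_at_scale = weibull_cdf(loc + scale, loc, scale, shape)  # 1 - exp(-1) for any shape
x99 = weibull_invcdf(0.99, loc, scale, shape)
```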
References
Moment estimators for Weibull parameters and their asymptotic efficiencies, Waloddi Weibull, April 1969, Lausanne Switzerland, Technical report AFML-TR-69-135
Continuous univariate distributions, Volume 1, N.L.Johnson, S.Kotz and N.Balakrishnan, 1994, John Wiley and sons inc.
weibull.com, About location parameter
Plotting positions, About plotting positions
Bootstrapping, Bootstrapping statistics
Estimation of the generalized extreme value distribution by the method of probability weighted moments, Hosking, J. R. M., Wallis, J. R. and Wood, E. F., 1985, Technometrics, 27, pp. 251-261
Estimating the three-parameter Weibull distribution by the method of probability weighted moments with application to medical survival data, Bortolucci, A. A. et.al.
Theory and derivation for Weibull parameter probability weighted moment estimators, Grender, J.M., Dell, T.R., Reich, R.M., 1991 United States Department of Agriculture
Probability weighted moments, Greenwood, J. A.; Landwehr, J.M.; Matalas, N.C.; Wallis, J.R., 1979, Water Resources Research. 15(5): 1049-1054.
Probability weighted moments compared with some traditional techniques in estimating Gumbel parameters and quantiles, Landwehr, J.M.; Matalas, N.C.; Wallis, J.R., 1979, Water Resources Research. 15(5): 1063-1064.
Properties
Distribution coefficient of variation (C.O.V.)
Empirical cumulative distribution function associated with the sample.
Distribution kurtosis.
Distribution mean value
Mean squared error of fitted cumulative distribution (a,b,c) and empirical distribution
Distribution parameters.
Distribution skewness
Distribution standard deviation
Methods
cdf
([x])Cumulative distribution function (cumulative probability) for specified values x
fit
(data[, method, verbose])Establish Weibull class instance by fit to sample.
fromsignal
(x[, method, verbose])Establish Weibull class instance by fit to global maxima from time series signal.
gumbel_parameters
([n])Calculate parameters of the asymptotic Gumbel extreme value distribution (Type 1) for the extreme value of N independent, Weibull distributed variables.
invcdf
([p])Inverse cumulative distribution function for specified quantiles p
pdf
([x])Probability density function for specified values x
plot
([filename])Plot data on regular scales
plot_linear
([filename])Plot data on Weibull paper (linearized scales)
rnd
([size, seed])Draw random samples from probability distribution
- property cov#
Distribution coefficient of variation (C.O.V.)
- Returns:
distribution c.o.v.
- Return type:
- property ecdf#
Empirical cumulative distribution function associated with the sample.
- Returns:
Empirical cumulative distribution function.
- Return type:
array
Notes
A mean rank method is chosen to approximate the mean of the distribution [2].
The empirical cdf is also used as plotting positions when plotting the sample on probability paper.
- property mse#
Mean squared error of fitted cumulative distribution (a,b,c) and empirical distribution
- Returns:
mean squared error
- Return type:
- property params#
Distribution parameters.
- Returns:
Distribution parameters: (loc, scale, shape).
- Return type:
- property std#
Distribution standard deviation
- Returns:
distribution standard deviation
- Return type:
- gumbel_parameters(n=None)#
Calculate parameters of the asymptotic Gumbel extreme value distribution (Type 1) for the extreme value of N independent, Weibull distributed variables.
- Parameters:
n (int) – number of independent weibull distributed variables, default equal to number of peaks (self.data.size)
- Returns:
Gumbel location and scale parameters
- Return type:
See also
Notes
If the sample x is based on, say, a 30-hour simulation but you seek an estimate of e.g. the 3-hour extreme value, then n should be calculated as the nearest integer to:
n = (3 / 30) * nx
where nx is the total number of maxima during the 30 hours.
References
Bury, K.V. (1975), “Statistical models in applied science”
- cdf(x=None)#
Cumulative distribution function (cumulative probability) for specified values x
- Parameters:
x (array_like) – values
- Returns:
cumulative probabilities for specified values x
- Return type:
array
Notes
A range of x values are applied if x is not specified.
- classmethod fit(data, method='msm', verbose=False)#
Establish Weibull class instance by fit to sample.
- Parameters:
data (array_like) – Sample.
method (str, optional) – Method of fit. Available options:
msm = method of sample moments (default)
lse = least-square estimation
mle = maximum likelihood estimation
pwm = probability weighted moments
pwm2 = probability weighted moments, 2-parameter distribution
verbose (bool) – If True, fitted parameters are printed to screen.
- Returns:
Weibull class instance
- Return type:
See also
Examples
Assuming data is a sample array/list:
>>> from qats.stats.weibull import Weibull
>>> weib = Weibull.fit(data, method="msm")
- classmethod fromsignal(x, method='msm', verbose=False)#
Establish Weibull class instance by fit to global maxima from time series signal.
- Parameters:
x (array_like) – Time series signal.
method (str, optional) – Method of fit. See Weibull.fit() for description of options.
verbose (bool, optional) – If True, fitted parameters are printed to screen.
- Returns:
Class instance.
- Return type:
See also
Weibull.fit, qats.stats.find_maxima
Examples
Assuming x is a time series signal:
>>> from qats.stats.weibull import Weibull
>>> weib = Weibull.fromsignal(x, method='msm')
Note that the example above is equivalent to:
>>> from qats.signal import find_maxima
>>> sample, _ = find_maxima(x, local=False, threshold=None, up=True)
>>> weib = Weibull.fit(sample, method='msm')
- invcdf(p=None)#
Inverse cumulative distribution function for specified quantiles p
- Parameters:
p (array_like) – quantiles (i.e. cumulative probabilities)
- Returns:
values corresponding to the specified quantiles
- Return type:
array
Notes
A range of quantiles from 0 to 1 are applied if quantiles are not specified
- pdf(x=None)#
Probability density function for specified values x
- Parameters:
x (array_like) – values
- Returns:
probability density function for specified values x
- Return type:
array
Notes
A range of x values are applied if x is not specified.
- plot(filename=None)#
Plot data on regular scales
- Parameters:
filename (str, optional) – Save plot as filename, default is to show plot on screen
Examples
Plot distribution and show the figure
>>> from qats.stats.weibull import Weibull
>>> distribution = Weibull(100., 15., 2.5)
>>> distribution.plot()
Plot distribution and save the figure as png
>>> from qats.stats.weibull import Weibull
>>> distribution = Weibull(100., 15., 2.5)
>>> distribution.plot(filename="plot.png")
- plot_linear(filename=None)#
Plot data on Weibull paper (linearized scales)
- Parameters:
filename (str, optional) – Save plot as filename, default is to show plot on screen
Examples
Plot distribution and show the figure
>>> from qats.stats.weibull import Weibull
>>> distribution = Weibull(100., 15., 2.5)
>>> distribution.plot_linear()
Plot distribution and save the figure as png
>>> from qats.stats.weibull import Weibull
>>> distribution = Weibull(100., 15., 2.5)
>>> distribution.plot_linear(filename="plot.png")
References
Continuous univariate distributions, Volume 1, N.L.Johnson, S.Kotz and N.Balakrishnan, 1994, John Wiley and sons inc.
Functions API
- bootstrap(loc, scale, shape, size, repetitions, method='pwm')#
Quantify mean and coefficient of variation of Weibull distribution parameters using parametric bootstrapping
- Parameters:
loc (float) – Source distribution location parameter
scale (float) – Source distribution scale parameter
shape (float) – Source distribution shape parameter
size (int) – Size of bootstrapped sample
method (str, optional) – Method of fit. Available options:
msm = method of sample moments
lse = least-square estimation
mle = maximum likelihood estimation
pwm = probability weighted moments (default)
repetitions (int, optional) – Number of bootstrap samples. default equal to 100
- Returns:
array – Mean distribution parameters
array – Coefficient of variation of distribution parameter
Notes
In statistics, bootstrapping is a method for assigning measures of accuracy to sample estimates (variance,quantiles). This technique allows estimation of the sampling distribution of almost any statistic using only very simple methods. Generally, it falls in the broader class of resampling methods. In this case a parametric model is fitted to the data, and samples of random numbers with the same size as the original data, are drawn from this fitted model. Then the quantity, or estimate, of interest is calculated from these data. This sampling process is repeated many times as for other bootstrap methods. If the results really matter, as many samples as is reasonable, given available computing power and time, should be used. Increasing the number of samples cannot increase the amount of information in the original data, it can only reduce the effects of random sampling errors which can arise from a bootstrap procedure itself. See [5] about bootstrapping.
Examples
To quantify the uncertainty (coefficient of variation) of a Weibull distribution fitted to a sample with 5 values (using 100 repetitions):
>>> from qats.stats.weibull import bootstrap
>>> m, cv = bootstrap(10., 5., 2.5, 5, 100)
- lse(x, threshold: float | None = None)#
Fit Weibull distribution parameters to sample by method of least square fit to empirical cdf.
- Parameters:
x (array_like) – sample data
threshold (float, optional) – Fit distribution to data points above this threshold. The threshold is defined as value <0, 1> in the empirical CDF. So with threshold=0.87 the distribution is fitted to the values exceeding the 0.87-quantile of the empirical cumulative distribution function.
- Returns:
Distribution parameters (loc, scale, shape).
- Return type:
tuple (floats)
Notes
Uses what are known as (approximate) mean rank estimates for the empirical cdf.
- mle(x)#
Fit Weibull distribution parameters to sample by maximum likelihood estimation
- Parameters:
x (array_like) – sample data
- Returns:
Distribution parameters (loc, scale, shape).
- Return type:
tuple (floats)
- mlj(sample, l, j)#
Probability weighted moment Mljk of observation order l, order of cdf j, with emphasis on the right/upper tail (k=0).
- Parameters:
- Returns:
Probability weighted moment
- Return type:
Notes
The probability weighted moment Mljk is defined by Greenwood and others (1979) as:
M_{l,j,k} = E[X^l * F^j * (1-F)^k]
where X(F) is the inverse form of the distribution and F is the cumulative distribution function. When j=k=0 and l is a non-negative integer, M_{l,0,0} represents the conventional moment of order l about the origin.
PWMs can be applied either when the small observations are more important than the large observations (j=0), as in strength properties of materials, or when the large observations should have more influence than the smaller observations (k=0), as with tree diameter distribution modelling. Here we have chosen the latter and derived unbiased estimators for the moments M_{l,j,0} (k=0), see eq. 32 in [8]:
M_{l,j,0} = (1 / n) * sum(x[i]^l * binom(i-1, j) / binom(n-1, j))
where i is a counter from j+1 to n and binom() is the binomial coefficient.
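The estimator above translates almost line by line into Python. The sketch below assumes that x[i] denotes the i-th smallest observation (1-based), which is the usual convention for this estimator; the function name `pwm_estimate` is hypothetical and is not the library's `mlj`.

```python
from math import comb


def pwm_estimate(sample, l, j):
    """Unbiased estimate of M_{l,j,0} from the ordered sample."""
    x = sorted(sample)  # x[i-1] is the i-th smallest observation
    n = len(x)
    if not 0 <= j <= n - 1:
        raise ValueError("j must satisfy 0 <= j <= n-1")
    total = 0.0
    for i in range(j + 1, n + 1):
        total += x[i - 1] ** l * comb(i - 1, j) / comb(n - 1, j)
    return total / n


b0 = pwm_estimate([3.0, 1.0, 2.0], l=1, j=0)  # reduces to the sample mean
b1 = pwm_estimate([3.0, 1.0, 2.0], l=1, j=1)  # larger observations weigh more
```

With j=0 every weight equals 1 and the estimator reduces to the conventional sample moment of order l, in line with the remark on M_{l,0,0} above.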
- msm(x)#
Fit Weibull distribution parameters to sample by method of sample moments
- Parameters:
x (array_like) – sample data
- Returns:
The loc, scale and shape distribution parameters
- Return type:
floats
Notes
See description in [1].
- plot_fit(x: ndarray, params: tuple, path: str | None = None)#
Plot data sample versus empirical and fitted cumulative distribution function on linearized Weibull scales
- pwm(x)#
Fit distribution parameters to sample by method of probability weighted moments
- Parameters:
x (array_like) – sample data
- Returns:
The loc, scale and shape distribution parameters
- Return type:
floats
Notes
Details on probability weighted moments are provided in [8].
See also
- pwm2(x)#
Fit distribution parameters to sample by method of probability weighted moments assuming the location parameter is zero.
- Parameters:
x (array_like) – sample data
- Returns:
The scale and shape distribution parameters
- Return type:
floats
Notes
Details on probability weighted moments are provided on p. 14-15 in [8]. Note that only the scale and shape parameters are estimated; the location parameter is assumed to be zero.
- weibull2gumbel(loc, scale, shape, n)#
Calculate parameters of the asymptotic Gumbel extreme value distribution (Type 1) for the extreme value of N independent, Weibull distributed variables.
- Parameters:
- Returns:
Gumbel location and scale parameters
- Return type:
Notes
If the sample x is based on, say, a 30-hour simulation but you seek an estimate of e.g. the 3-hour extreme value, then n should be calculated as the nearest integer to:
n = (3 / 30) * nx
where nx is the total number of maxima during the 30 hours.
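The scaling of n and the conversion itself can be sketched as follows. The formulas are the standard asymptotic result for the largest of n independent Weibull variables (the characteristic largest value u_n solves F(u_n) = 1 - 1/n, and the Gumbel scale is 1/(n*f(u_n))); whether the library uses exactly this form is an assumption, and `weibull_to_gumbel` and the value of `nx` are hypothetical.

```python
import math


def weibull_to_gumbel(loc, scale, shape, n):
    """Asymptotic Gumbel (loc, scale) for the maximum of n Weibull variables."""
    if n <= 1:
        raise ValueError("n must be larger than 1")
    ln_n = math.log(n)
    # Characteristic largest value: F(u_n) = 1 - 1/n
    gumbel_loc = loc + scale * ln_n ** (1.0 / shape)
    # Inverse of n times the Weibull density evaluated at u_n
    gumbel_scale = (scale / shape) * ln_n ** (1.0 / shape - 1.0)
    return gumbel_loc, gumbel_scale


nx = 1200                 # hypothetical number of maxima in the 30-hour simulation
n = round((3 / 30) * nx)  # number of maxima expected in a 3-hour window
gumbel_loc, gumbel_scale = weibull_to_gumbel(0.0, 2.0, 1.5, n)
```

For shape equal to 1 the Gumbel scale equals the Weibull scale for every n, which matches the remark elsewhere in this documentation that the asymptotic distribution is correct in that case.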
References
Bury, K.V. (1975), “Statistical models in applied science”