Statistics: random number generators, PDF and CDF functions, and hypothesis tests.
The module includes basic statistical functions such as mean, variance, and skew. It also includes the following three submodules.
The Rnd module provides random number generators of various distributions.
The Pdf module provides a range of probability density/mass functions of different distributions.
The Cdf module provides cumulative distribution functions.
Please refer to the GSL documentation for details.
val std : ?w:float array ‑> ?mean:float ‑> float array ‑> float
std x calculates the standard deviation of x.
val sem : ?w:float array ‑> ?mean:float ‑> float array ‑> float
sem x calculates the standard error of x, also referred to as the standard error of the mean.
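As a rough illustration of what std and sem compute, here is a minimal sketch in plain OCaml. The names std_sketch and sem_sketch are hypothetical, and the n - 1 denominator is an assumption; the library may normalise differently, and the optional ?w and ?mean arguments are not modelled.

```ocaml
(* Sample mean of a non-empty float array. *)
let mean x = Array.fold_left ( +. ) 0. x /. float_of_int (Array.length x)

(* Sample standard deviation with the n - 1 denominator.
   Assumption: the library may use a different normalisation. *)
let std_sketch x =
  let n = float_of_int (Array.length x) in
  let m = mean x in
  let ss = Array.fold_left (fun acc v -> acc +. ((v -. m) ** 2.)) 0. x in
  sqrt (ss /. (n -. 1.))

(* Standard error of the mean: standard deviation divided by sqrt n. *)
let sem_sketch x = std_sketch x /. sqrt (float_of_int (Array.length x))
```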
val kurtosis : ?w:float array ‑> ?mean:float ‑> ?sd:float ‑> float array ‑> float
kurtosis x returns Pearson's kurtosis of x.
val percentile : float array ‑> float ‑> float
percentile x p returns the p percentile of the data x. p is between 0. and 1. x does not need to be sorted.
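The sketch below shows one common way to compute such a percentile: sort a copy of the data and interpolate linearly between neighbouring order statistics. The name percentile_sketch is hypothetical, and the interpolation rule is an assumption; the library may use a different convention and return slightly different values.

```ocaml
(* Percentile by linear interpolation, with p in [0, 1].
   Assumption: position h = p * (n - 1) in the sorted data. *)
let percentile_sketch x p =
  let s = Array.copy x in
  (* Sort a copy so the caller's array is left untouched. *)
  Array.sort compare s;
  let n = Array.length s in
  let h = p *. float_of_int (n - 1) in
  let lo = int_of_float (floor h) in
  let hi = min (lo + 1) (n - 1) in
  let frac = h -. float_of_int lo in
  s.(lo) +. (frac *. (s.(hi) -. s.(lo)))
```

Under this convention, first_quartile and third_quartile correspond to p = 0.25 and p = 0.75.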
val first_quartile : float array ‑> float
first_quartile x returns the first quartile of x, i.e., the 25th percentile.
val third_quartile : float array ‑> float
third_quartile x returns the third quartile of x, i.e., the 75th percentile.
val rank : ?ties_strategy:[ `Average | `Min | `Max ] ‑> float array ‑> float array
Computes the ranks of a sample.
The ranking order is from the smallest value to the largest. For example, rank [|54.; 74.; 55.; 86.; 56.|] returns [|1.; 4.; 2.; 5.; 3.|].
Note that the ranking starts with one!
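To make the ranking rule concrete, the following plain-OCaml sketch assigns one-based ranks from smallest to largest and gives equal values the average of the ranks they occupy. The name rank_sketch is hypothetical and this is not the library's implementation.

```ocaml
(* One-based ranks, smallest value first; ties receive the average rank. *)
let rank_sketch x =
  let n = Array.length x in
  (* Indices of x ordered by increasing value. *)
  let idx = Array.init n (fun i -> i) in
  Array.sort (fun i j -> compare x.(i) x.(j)) idx;
  let r = Array.make n 0. in
  let i = ref 0 in
  while !i < n do
    (* Find the end of the block of equal values starting at position !i. *)
    let j = ref !i in
    while !j + 1 < n && x.(idx.(!j + 1)) = x.(idx.(!i)) do
      incr j
    done;
    (* Positions !i..!j hold ranks !i+1..!j+1; assign their average. *)
    let avg = float_of_int (!i + !j + 2) /. 2. in
    for k = !i to !j do
      r.(idx.(k)) <- avg
    done;
    i := !j + 1
  done;
  r
```

With the input from the example above, rank_sketch [|54.; 74.; 55.; 86.; 56.|] evaluates to [|1.; 4.; 2.; 5.; 3.|].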
ties_strategy controls which ranks are assigned to equal values:
`Average: the average of ranks is assigned to each value. Default.
`Min: the minimum of ranks is assigned to each value.
`Max: the maximum of ranks is assigned to each value.
val ecdf : float array ‑> float array * float array
ecdf x returns (x', f), the empirical cumulative distribution function f of x at points x'. x' is just x sorted in increasing order with duplicates removed.
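The description above can be sketched in plain OCaml as follows. The name ecdf_sketch is hypothetical; the sketch assumes f at each distinct point is the fraction of observations less than or equal to that point.

```ocaml
(* Empirical CDF: distinct sorted values paired with cumulative fractions. *)
let ecdf_sketch x =
  let n = Array.length x in
  let s = Array.copy x in
  Array.sort compare s;
  (* Walk the sorted data; when the last occurrence of a value is reached,
     record it with the fraction of observations <= that value. *)
  let xs = ref [] and fs = ref [] in
  Array.iteri
    (fun i v ->
      if i = n - 1 || s.(i + 1) <> v then begin
        xs := v :: !xs;
        fs := (float_of_int (i + 1) /. float_of_int n) :: !fs
      end)
    s;
  (Array.of_list (List.rev !xs), Array.of_list (List.rev !fs))
```

For example, ecdf_sketch [|3.; 1.; 2.; 2.|] evaluates to ([|1.; 2.; 3.|], [|0.25; 0.75; 1.|]).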
val metropolis_hastings : (float array ‑> float) ‑> float array ‑> int ‑> float array array
TODO: metropolis_hastings f p n is the Metropolis-Hastings MCMC algorithm. f is the pdf of p.
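Since the entry is still marked TODO, the following is only a sketch of a random-walk Metropolis-Hastings sampler that matches the signature. It assumes that f is a (possibly unnormalised) density, p is the starting point, and n is the number of samples to return. The name metropolis_hastings_sketch, the optional step argument, and the uniform proposal are all illustrative choices, not the library's.

```ocaml
(* Random-walk Metropolis-Hastings with a symmetric uniform proposal. *)
let metropolis_hastings_sketch ?(step = 0.5) f p n =
  let cur = ref (Array.copy p) in
  let f_cur = ref (f !cur) in
  Array.init n (fun _ ->
      (* Propose: perturb every coordinate uniformly in [-step, step]. *)
      let cand =
        Array.map (fun v -> v +. Random.float (2. *. step) -. step) !cur
      in
      let f_cand = f cand in
      (* Accept with probability min (1, f cand / f cur). *)
      if Random.float 1. *. !f_cur < f_cand then begin
        cur := cand;
        f_cur := f_cand
      end;
      (* Record the current state whether or not the proposal was accepted. *)
      Array.copy !cur)
```

Because the proposal is symmetric, the acceptance ratio needs only the two density values. In practice an initial portion of the chain is usually discarded as burn-in.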
val gibbs_sampling : (float array ‑> int ‑> float) ‑> float array ‑> int ‑> float array array
TODO: gibbs_sampling f p n is the Gibbs sampler. f is a sampler based on the full conditional function of all variables.
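As this entry is also marked TODO, here is a minimal sketch of a Gibbs sweep consistent with the signature. It assumes that f state i draws a new value for coordinate i from its full conditional given the current state, that p is the starting point, and that n is the number of sweeps. The name gibbs_sampling_sketch is hypothetical.

```ocaml
(* Gibbs sampler: each sweep resamples every coordinate in turn. *)
let gibbs_sampling_sketch f p n =
  let cur = Array.copy p in
  Array.init n (fun _ ->
      for i = 0 to Array.length cur - 1 do
        (* f sees the most recent values of all other coordinates. *)
        cur.(i) <- f cur i
      done;
      (* Record a snapshot of the state after the sweep. *)
      Array.copy cur)
```

Each coordinate update uses the values already refreshed earlier in the same sweep, which is what distinguishes Gibbs sampling from updating all coordinates from a stale copy.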
val z_test : mu:float ‑> sigma:float ‑> ?alpha:float ‑> ?side:tail ‑> float array ‑> bool * float * float
z_test ~mu ~sigma ~alpha ~side x returns a test decision for the null hypothesis that the data x comes from a normal distribution with mean mu and standard deviation sigma, using the z-test at the alpha significance level. The alternative hypothesis is that the mean is not mu.
The result is (h, p, z): h is true if the test rejects the null hypothesis at the alpha significance level, and false otherwise. p is the p-value and z is the z-score.
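The two-sided case can be sketched in plain OCaml as below. The name z_test_sketch is hypothetical, the default alpha = 0.05 is an assumption, the ?side argument is not modelled, and Float.erfc requires OCaml 4.13 or later.

```ocaml
(* Two-sided one-sample z-test; returns (h, p, z). *)
let z_test_sketch ~mu ~sigma ?(alpha = 0.05) x =
  let n = float_of_int (Array.length x) in
  let m = Array.fold_left ( +. ) 0. x /. n in
  (* z-score of the sample mean under the null hypothesis. *)
  let z = (m -. mu) /. (sigma /. sqrt n) in
  (* Two-sided p-value: P(|Z| > |z|) = erfc (|z| / sqrt 2). *)
  let p = Float.erfc (abs_float z /. sqrt 2.) in
  (p < alpha, p, z)
```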
val t_test : mu:float ‑> ?alpha:float ‑> ?side:tail ‑> float array ‑> bool * float * float
t_test ~mu ~alpha ~side x returns a test decision of the one-sample t-test, which is a parametric test of the location parameter when the population standard deviation is unknown. mu is the population mean, alpha is the significance level.
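The statistic behind the one-sample t-test can be sketched as follows. The name t_statistic_sketch is hypothetical; the sketch returns only the t statistic and its degrees of freedom, since turning them into a p-value needs the Student t distribution, which the OCaml standard library does not provide.

```ocaml
(* One-sample t statistic and degrees of freedom. *)
let t_statistic_sketch ~mu x =
  let n = float_of_int (Array.length x) in
  let m = Array.fold_left ( +. ) 0. x /. n in
  let ss = Array.fold_left (fun acc v -> acc +. ((v -. m) ** 2.)) 0. x in
  (* Sample standard deviation estimated from the data (n - 1 denominator). *)
  let s = sqrt (ss /. (n -. 1.)) in
  let t = (m -. mu) /. (s /. sqrt n) in
  (t, n -. 1.)
```

The only difference from the z-score is that the unknown population standard deviation is replaced by the sample estimate, which is why the reference distribution is Student's t with n - 1 degrees of freedom rather than the normal.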
val t_test_paired : ?alpha:float ‑> ?side:tail ‑> float array ‑> float array ‑> bool * float * float
t_test_paired ~alpha ~side x y returns a test decision for the null hypothesis that the data in x - y comes from a normal distribution with mean equal to zero and unknown variance, using the paired-sample t-test.
val t_test_unpaired : ?alpha:float ‑> ?side:tail ‑> ?equal_var:bool ‑> float array ‑> float array ‑> bool * float * float
t_test_unpaired ~alpha ~side ~equal_var x y returns a test decision for the null hypothesis that the data in vectors x and y come from independent random samples from normal distributions with equal means and equal but unknown variances, using the two-sample t-test. The alternative hypothesis is that the data in x and y come from populations with unequal means.
equal_var indicates whether the two samples have the same variance. If the two variances are not the same, the test is referred to as Welch's t-test.
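The difference between the pooled and Welch variants is easiest to see in the statistic and its degrees of freedom. The sketch below is plain OCaml with hypothetical names; the default equal_var = true is an assumption and may not match the library.

```ocaml
let mean x = Array.fold_left ( +. ) 0. x /. float_of_int (Array.length x)

(* Sample variance with the n - 1 denominator. *)
let var x =
  let m = mean x in
  Array.fold_left (fun acc v -> acc +. ((v -. m) ** 2.)) 0. x
  /. float_of_int (Array.length x - 1)

(* Two-sample t statistic and degrees of freedom. *)
let two_sample_t_sketch ?(equal_var = true) x y =
  let nx = float_of_int (Array.length x) in
  let ny = float_of_int (Array.length y) in
  let vx = var x and vy = var y in
  let diff = mean x -. mean y in
  if equal_var then begin
    (* Pooled variance; degrees of freedom nx + ny - 2. *)
    let sp2 = (((nx -. 1.) *. vx) +. ((ny -. 1.) *. vy)) /. (nx +. ny -. 2.) in
    (diff /. sqrt (sp2 *. ((1. /. nx) +. (1. /. ny))), nx +. ny -. 2.)
  end
  else begin
    (* Welch: separate variances, Welch-Satterthwaite degrees of freedom. *)
    let a = vx /. nx and b = vy /. ny in
    let df =
      ((a +. b) ** 2.) /. ((a *. a /. (nx -. 1.)) +. (b *. b /. (ny -. 1.)))
    in
    (diff /. sqrt (a +. b), df)
  end
```

When the two samples have equal sizes the two statistics coincide and only the degrees of freedom differ.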
val var_test : ?alpha:float ‑> ?side:tail ‑> var:float ‑> float array ‑> bool * float * float
var_test ~alpha ~side ~var x returns a test decision for the null hypothesis that the data in x comes from a normal distribution with variance var, using the chi-square variance test. The alternative hypothesis is that x comes from a normal distribution with a different variance.
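The chi-square variance test is based on the ratio of the sum of squared deviations to the hypothesised variance. A minimal sketch of the statistic, with a hypothetical name and without the p-value (which needs the chi-square distribution), might look like this:

```ocaml
(* Chi-square variance statistic (n - 1) * s^2 / var and its degrees of freedom. *)
let chi2_variance_statistic_sketch ~var x =
  let n = float_of_int (Array.length x) in
  let m = Array.fold_left ( +. ) 0. x /. n in
  (* Sum of squared deviations equals (n - 1) times the sample variance. *)
  let ss = Array.fold_left (fun acc v -> acc +. ((v -. m) ** 2.)) 0. x in
  (ss /. var, n -. 1.)
```

Under the null hypothesis this statistic follows a chi-square distribution with n - 1 degrees of freedom.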
val jb_test : ?alpha:float ‑> float array ‑> bool * float * float
jb_test ~alpha x returns a test decision for the null hypothesis that the data x comes from a normal distribution with an unknown mean and variance, using the Jarque-Bera test.
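The Jarque-Bera statistic combines sample skewness S and kurtosis K as JB = n/6 * (S^2 + (K - 3)^2 / 4). The sketch below computes it in plain OCaml using moments normalised by n; the name jb_statistic_sketch is hypothetical and the library may use different moment estimators.

```ocaml
(* Jarque-Bera statistic from biased (1/n) central moments. *)
let jb_statistic_sketch x =
  let n = float_of_int (Array.length x) in
  let m = Array.fold_left ( +. ) 0. x /. n in
  (* k-th central moment, for k = 2., 3. or 4. *)
  let moment k =
    Array.fold_left (fun acc v -> acc +. ((v -. m) ** k)) 0. x /. n
  in
  let m2 = moment 2. and m3 = moment 3. and m4 = moment 4. in
  let skew = m3 /. (m2 ** 1.5) in
  let kurt = m4 /. (m2 *. m2) in
  n /. 6. *. ((skew *. skew) +. (((kurt -. 3.) ** 2.) /. 4.))
```

For large samples the statistic is approximately chi-square distributed with 2 degrees of freedom under the null hypothesis, so large values indicate departure from normality.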
val fisher_test : ?alpha:float ‑> ?side:tail ‑> int ‑> int ‑> int ‑> int ‑> bool * float * float
fisher_test ~alpha ~side a b c d performs Fisher's exact test for the contingency table
| a, b |
| c, d |
The result is (h, p, z): h is true if the test rejects the null hypothesis at the alpha significance level, and false otherwise. p is the p-value and z is the prior odds ratio.
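One common way to obtain a two-sided p-value for a 2x2 table is to sum the hypergeometric probabilities of all tables with the same margins that are no more likely than the observed one. The sketch below follows that rule; the names are hypothetical, the rule itself is an assumption about what a two-sided test means here, and the returned odds ratio is the sample value a*d / (b*c), which may differ from the quantity the library reports.

```ocaml
(* Binomial coefficient C(n, k) as a float, via a running product. *)
let choose n k =
  let k = min k (n - k) in
  let r = ref 1. in
  for i = 1 to k do
    r := !r *. float_of_int (n - k + i) /. float_of_int i
  done;
  !r

(* Two-sided Fisher exact p-value and sample odds ratio for [a b; c d]. *)
let fisher_sketch a b c d =
  let r1 = a + b and r2 = c + d and c1 = a + c in
  let n = r1 + r2 in
  (* Hypergeometric probability that the top-left cell equals k. *)
  let prob k = choose r1 k *. choose r2 (c1 - k) /. choose n c1 in
  let p_obs = prob a in
  let lo = max 0 (c1 - r2) and hi = min r1 c1 in
  let p = ref 0. in
  for k = lo to hi do
    let pk = prob k in
    (* Small relative tolerance so that equally likely tables are included. *)
    if pk <= p_obs *. (1. +. 1e-7) then p := !p +. pk
  done;
  (* The odds ratio is infinite or undefined when b * c = 0. *)
  (min 1. !p, float_of_int (a * d) /. float_of_int (b * c))
```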
val runs_test : ?alpha:float ‑> ?side:tail ‑> ?v:float ‑> float array ‑> bool * float * float
runs_test ~alpha ~v x returns a test decision for the null hypothesis that the data x comes in random order, against the alternative that it does not, by running the Wald–Wolfowitz runs test. The test is based on the number of runs of consecutive values above or below a reference value. ~v is the reference value; the default value is the median of x.
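Counting runs and standardising the count can be sketched as follows. The name runs_statistic_sketch is hypothetical, the reference value v is required rather than defaulted, and dropping observations equal to v is an assumption about how such values are treated.

```ocaml
(* Number of runs around v and the normal-approximation z statistic. *)
let runs_statistic_sketch ~v x =
  (* true = above v, false = below v; values equal to v are dropped. *)
  let signs =
    Array.to_list x
    |> List.filter (fun a -> a <> v)
    |> List.map (fun a -> a > v)
  in
  let n1 = List.length (List.filter (fun s -> s) signs) in
  let n2 = List.length signs - n1 in
  (* A new run starts at the first element and at every sign change. *)
  let rec count prev acc = function
    | [] -> acc
    | s :: tl -> count (Some s) (if prev = Some s then acc else acc + 1) tl
  in
  let runs = count None 0 signs in
  let f1 = float_of_int n1 and f2 = float_of_int n2 in
  let n = f1 +. f2 in
  (* Mean and variance of the number of runs under random ordering. *)
  let mu = (2. *. f1 *. f2 /. n) +. 1. in
  let var = 2. *. f1 *. f2 *. ((2. *. f1 *. f2) -. n) /. (n *. n *. (n -. 1.)) in
  (runs, (float_of_int runs -. mu) /. sqrt var)
```

Too few runs suggests clustering and too many suggests alternation; both show up as a z statistic far from zero.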
val mannwhitneyu : ?alpha:float ‑> ?side:tail ‑> float array ‑> float array ‑> bool * float * float
mannwhitneyu ~alpha ~side x y computes the Mann-Whitney rank test on samples x and y. If the length of each sample is less than 10 and there are no ties, an exact test is used (see Ying Kuen Cheung and Jerome H. Klotz (1997), The Mann Whitney Wilcoxon distribution using linked list, Statistica Sinica 7, 805-813); otherwise the asymptotic normal distribution is used.
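The asymptotic branch rests on the U statistic and its normal approximation. The sketch below computes U by direct pair counting, which is quadratic in the sample sizes but easy to follow. The name mann_whitney_u_sketch is hypothetical, and the approximation shown omits tie and continuity corrections, which the library may apply.

```ocaml
(* Mann-Whitney U for x versus y and its normal-approximation z statistic. *)
let mann_whitney_u_sketch x y =
  let u = ref 0. in
  Array.iter
    (fun a ->
      Array.iter
        (fun b ->
          (* Each pair with x > y counts 1; each tied pair counts 1/2. *)
          if a > b then u := !u +. 1. else if a = b then u := !u +. 0.5)
        y)
    x;
  let nx = float_of_int (Array.length x) in
  let ny = float_of_int (Array.length y) in
  (* Mean and standard deviation of U under the null hypothesis, no ties. *)
  let mu = nx *. ny /. 2. in
  let sigma = sqrt (nx *. ny *. (nx +. ny +. 1.) /. 12.) in
  (!u, (!u -. mu) /. sigma)
```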
val wilcoxon : ?alpha:float ‑> ?side:tail ‑> float array ‑> float array ‑> bool * float * float
module Rnd : sig ... end
module Pdf : sig ... end
module Cdf : sig ... end