This Maple worksheet accompanies the papers:
Di Nardo E., G. Guarino, D. Senato (2008), A Maple algorithm for polykays and their generalizations, Adv. Appl. Stat. Vol. 8, No. 1, 19-36, http://www.pphmj.com/journals/adas.htm.
Di Nardo E., G. Guarino, D. Senato (2008), A unifying framework for k-statistics, polykays and their generalizations, Bernoulli Vol. 14(2), 440-468. Official Journal of the Bernoulli Society for Mathematical Statistics and Probability, http://isi.cbs.nl/bernoulli/ (download from http://www.unibas.it/utenti/dinardo/lavori.html).
Di Nardo E., G. Guarino, D. Senato (2008), Symbolic computation of moments of sampling distributions, Comp. Stat. Data Analysis Vol. 52, no. 11, 4909-4922 (download from http://arxiv.org/PS_cache/arxiv/pdf/0806/0806.0129v1.pdf or http://www.unibas.it/utenti/dinardo/lavori.html).
A Maple algorithm for k-statistics, polykays and their multivariate generalization
E. Di Nardo* elvira.dinardo@unibas.it, http://www.unibas.it/utenti/dinardo/home.html; Tel: +39 0971205890, Fax: +39 0971205896
G. Guarino** giuseppe.guarino@asl2.potenza.it
D. Senato* domenico.senato@unibas.it
* Dipartimento di Matematica e Informatica, Università degli Studi della Basilicata, Viale dell'Ateneo Lucano n.10, 85100 Potenza, Italy
** Medical School, Università del Sacro Cuore (Rome branch), Largo Agostino Gemelli n.8, 00168 Roma, Italy
Introduction
Abstract: Through the classical umbral calculus, we provide a unifying syntax for single and multivariate k-statistics, polykays and multivariate polykays. Classical umbral calculus is a symbolic method for handling ordinary and exponential formal power series. A feature of the classical umbral calculus is that an umbra has the structure of a random variable but with no reference to a probability space. This brings the umbral syntax closer to statistical methods.
Application Areas/Subject: Combinatorics & algebraic methods in statistics
Keywords: umbral calculus, symmetric polynomials, set partitions, multiset, cumulants, k-statistics, polykays.
See Also: Maple algorithm [3]
Initialization
Subdivision of Multiset and Augmented Symmetric Function to Power Sum
The following functions are used for listing all subdivisions of a multiset. These algorithms are fully discussed in [3].
Subdivision of Multiset
See [3] for a complete discussion of this algorithm.
The output of the following new function is a disjoint union of vectors.
Example
The output of the following function is a multiset subdivision. This algorithm is described in [3]. Remark: here the function makeTab is slightly different from the function makeTab in [3], because the input parameters have changed. In [3] we give only the multiplicities of the elements; in this function we also name the elements. For example, if all subdivisions of the multiset [a,a,b] are needed we call makeTab([a,a,b]), while in [3] the call is makeTab(2,1).
Note: at the end of the function we have inserted a code block that calculates additional values of the coefficients for every term of the final expression. This is useful both in the construction of the k-statistics and in the conversion between augmented symmetric functions and power sums.
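The notion of a subdivision together with its multiplicity can be illustrated by brute force. The sketch below is written in Python rather than Maple and is not the makeTab algorithm of [3]: it simply enumerates every set partition of the positions of the multiset and counts how many of them collapse to the same subdivision. The names set_partitions and multiset_subdivisions are hypothetical and chosen only for this illustration.

```python
from collections import Counter


def set_partitions(items):
    """Yield every partition of the list `items` as a list of blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for partition in set_partitions(rest):
        # Put `first` into each existing block in turn ...
        for i in range(len(partition)):
            yield partition[:i] + [[first] + partition[i]] + partition[i + 1:]
        # ... or into a new block of its own.
        yield [[first]] + partition


def multiset_subdivisions(multiset):
    """Map each distinct subdivision to the number of set partitions giving it.

    A subdivision is stored as a sorted tuple of blocks, each block being a
    sorted tuple of element names.
    """
    positions = list(range(len(multiset)))
    counts = Counter()
    for partition in set_partitions(positions):
        key = tuple(sorted(tuple(sorted(multiset[i] for i in block))
                           for block in partition))
        counts[key] += 1
    return dict(counts)


if __name__ == "__main__":
    for subdivision, mult in multiset_subdivisions(["a", "a", "b"]).items():
        print(subdivision, mult)
```

For the multiset [a,a,b] this finds four distinct subdivisions, with {{a,b},{a}} arising from two different set partitions. The enumeration grows with the Bell numbers, so it is only suitable for small multisets.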
Augmented symmetric function to Power Sum
The output of this function is a symmetric function expressed in terms of power sum symmetric functions, e.g.
Note:
the symbol [a,a,a] represents the subdivision {{a},{a},{a}} and the symbol [a^2, a] represents the subdivision {{a,a},{a}}.
The number associated with each symbol is the product of the factors $(-1)^{k-1}(k-1)!$, where k is the degree of the monomial in each block, multiplied by the integer associated with the same symbol in the output of makeTab.
For example: for the symbol [a^2, a] = {{a,a},{a}} we calculate $[(-1)^{2-1}(2-1)!]\,[(-1)^{1-1}(1-1)!] = [-1][1] = -1$ and we multiply -1 by 3, which is the multiplicity of the subdivision, which gives -3.
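The rule in the note can be checked with a few lines of code. The following Python sketch (not the worksheet's Maple code) covers only the case of a multiset with r equal elements, where subdivisions correspond to integer partitions of r. It assumes that each block of degree k contributes the factor (-1)^(k-1) (k-1)!, which is consistent with the value [-1][1] = -1 in the example. The function names are hypothetical.

```python
from collections import Counter
from math import factorial, prod


def integer_partitions(n, largest=None):
    """Yield the partitions of n as non-increasing tuples of positive parts."""
    largest = n if largest is None else largest
    if n == 0:
        yield ()
        return
    for part in range(min(n, largest), 0, -1):
        for tail in integer_partitions(n - part, part):
            yield (part,) + tail


def multiplicity(parts):
    """Number of set partitions of an r-set whose block sizes are `parts`."""
    r = sum(parts)
    denom = prod(factorial(p) for p in parts)
    denom *= prod(factorial(m) for m in Counter(parts).values())
    return factorial(r) // denom


def aug_to_ps_coefficients(r):
    """Coefficient of the power-sum product indexed by each partition of r.

    Each block of size k contributes (-1)**(k-1) * (k-1)!, and the product
    over blocks is multiplied by the multiplicity of the subdivision.
    """
    return {
        parts: multiplicity(parts)
        * prod((-1) ** (k - 1) * factorial(k - 1) for k in parts)
        for parts in integer_partitions(r)
    }


if __name__ == "__main__":
    print(aug_to_ps_coefficients(3))
```

For r = 3 the coefficients are 1 for (1,1,1), -3 for (2,1) and 2 for (3), that is, the sum of x_i x_j x_k over distinct indices equals p1^3 - 3 p1 p2 + 2 p3, where p_r denotes the r-th power sum.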
Example: how we calculate augToPs([a,a,a])
1) we call makeTab([a,a,a]), which gives the following result:
2) we calculate:
2.1) block :
2.2) block :
2.3) block :
3) Multiplying the multiplicities in step 1) by the results found in step 2), we have:
4) from which we can express the following result:
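The intermediate worksheet outputs for this example are not reproduced here. As a hedged reconstruction, the four steps can be written out as follows, assuming that a block of degree k contributes the factor $(-1)^{k-1}(k-1)!$ and writing $S_r = \sum_i x_i^r$ for the power sums (this notation is chosen for the illustration and may differ from the symbols printed by augToPs).

```latex
\begin{align*}
\text{1) multiplicities:}\quad
  & \{\{a\},\{a\},\{a\}\} \mapsto 1, \qquad
    \{\{a,a\},\{a\}\} \mapsto 3, \qquad
    \{\{a,a,a\}\} \mapsto 1 \\
\text{2) block factors:}\quad
  & (0!)(0!)(0!) = 1, \qquad
    (-1!)(0!) = -1, \qquad
    (+2!) = 2 \\
\text{3) products:}\quad
  & 1 \cdot 1 = 1, \qquad
    3 \cdot (-1) = -3, \qquad
    1 \cdot 2 = 2 \\
\text{4) result:}\quad
  & \sum_{i \ne j \ne k} x_i x_j x_k = S_1^3 - 3\, S_1 S_2 + 2\, S_3
\end{align*}
```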
Examples
the step 3) in the previous note.
A unifying framework for k-statistics, polykays and their multivariate generalizations
The nth k-statistic $k_n$ is the unique symmetric unbiased estimator of the cumulant $\kappa_n$ of a given statistical distribution.
$k_n$ is defined so that $E[k_n] = \kappa_n$.
The symmetric statistic $k_{r, \ldots, t}$ is defined so that
$E[k_{r, \ldots, t}] = \kappa_r \cdots \kappa_t$,
where $\kappa_r$ is a cumulant. These statistics, called polykays, generalize the k-statistics. K-statistics, polykays and their multivariate generalizations are commonly defined in terms of power sums, that is, sums of the rth powers of the data points: $S_r = \sum_{i=1}^{n} X_i^r$.
Note: in polyk we calculate an additional value: the cardinality of subdivision. For every subdivision we calculate where k is its cardinality.
For example has 3 blocks and we calculate
Example: how we calculate Polykays(3)
1) we call makeTab([a,a,a]) and this is the output:
3) Multiplying the multiplicities in step 1) by the results found in step 2), we have:
4) from which we calculate:
=
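For reference, the same construction can be followed by hand for the third k-statistic. The display below is a sketch, not the worksheet output: it assumes the standard expression of $k_r$ as a sum over set partitions $\pi$ of $\{1,\ldots,r\}$, each weighted by $(-1)^{|\pi|-1}(|\pi|-1)!/(n)_{|\pi|}$ with $(n)_k = n(n-1)\cdots(n-k+1)$. The notation $[\lambda]$ for augmented symmetric functions and $S_r$ for power sums is chosen here for illustration.

```latex
\begin{align*}
k_3 &= \frac{1}{n}\,[3] \;-\; \frac{3}{n(n-1)}\,[2,1] \;+\; \frac{2}{n(n-1)(n-2)}\,[1,1,1], \\
[3] &= S_3, \qquad [2,1] = S_1 S_2 - S_3, \qquad [1,1,1] = S_1^3 - 3\,S_1 S_2 + 2\,S_3, \\
k_3 &= \frac{n^2 S_3 - 3\,n\,S_1 S_2 + 2\,S_1^3}{n(n-1)(n-2)}.
\end{align*}
```

The coefficients 1, 3 and 1 in the first line are the multiplicities of the three subdivisions of [a,a,a]; collecting terms over the common denominator gives the last line.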
The following function gives the expression for k-statistics, polykays and their multivariate generalizations, depending on the input parameters:
- to generate k-statistics the parameter is: [ r ]
- to generate polykays the parameter is: [ r ] , [ s ]
- to generate multivariate k-statistics the parameter is: [ r , s ]
- to generate multivariate polykays the parameter is: [ r , s ] , [ u , v]
examples: k-statistics
example: polykays
example: multivariate k-statistics
example: multivariate polykays
Example: steps for k-statistics and polykays construction
k-Statistics
Test previous result
polykays
test previous result
multivariate k-Statistics
multivariate polykays
Replacing symbols with numerical data
Sums of the rth powers of the data points:
This function evaluates a k-statistic or polykay by replacing the symbols with numerical data. The parameters are the following:
- to generate k-statistics the parameters are: [ r ], [ [ n1, n2, ...] ]
- to generate polykays the parameters are: [ [ r ] , [ s ] ], [ [ n1, n2, ...] ]
- to generate multivariate k-statistics the parameters are: [ [ r , s ] ], [ [ n1a, n2a], [ n1b, n2b] , ... ]
- to generate multivariate polykays the parameters are: [ [ r , s ], [ u , v] ], [ [ n1a, n2a], [ n1b, n2b] , ... ]
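To make the numerical evaluation concrete, here is a minimal Python sketch for the univariate k-statistic only. It is not the worksheet's Maple function and does not follow its calling convention: the name k_statistic and its signature are hypothetical. It assumes the standard formula in which each integer partition of r with k parts is weighted by its multiplicity times (-1)^(k-1) (k-1)! / (n)_k, and it evaluates the augmented symmetric functions by brute force over distinct data positions, so it is meant for small samples.

```python
from collections import Counter
from fractions import Fraction
from itertools import permutations
from math import factorial, prod


def integer_partitions(n, largest=None):
    """Yield the partitions of n as non-increasing tuples of positive parts."""
    largest = n if largest is None else largest
    if n == 0:
        yield ()
        return
    for part in range(min(n, largest), 0, -1):
        for tail in integer_partitions(n - part, part):
            yield (part,) + tail


def multiplicity(parts):
    """Number of set partitions of an r-set whose block sizes are `parts`."""
    denom = prod(factorial(p) for p in parts)
    denom *= prod(factorial(m) for m in Counter(parts).values())
    return factorial(sum(parts)) // denom


def augmented_sum(data, parts):
    """Sum of x_{i1}**parts[0] * x_{i2}**parts[1] * ... over distinct positions."""
    return sum(prod(x ** p for x, p in zip(chosen, parts))
               for chosen in permutations(data, len(parts)))


def k_statistic(r, data):
    """Exact r-th k-statistic of integer or Fraction data (requires len(data) >= r)."""
    n = len(data)
    if n < r:
        raise ValueError("the r-th k-statistic needs at least r data points")
    total = Fraction(0)
    for parts in integer_partitions(r):
        k = len(parts)
        falling = prod(n - j for j in range(k))  # (n)_k = n(n-1)...(n-k+1)
        weight = Fraction((-1) ** (k - 1) * factorial(k - 1), falling)
        total += multiplicity(parts) * weight * augmented_sum(data, parts)
    return total


if __name__ == "__main__":
    sample = [1, 2, 3, 5]
    print([k_statistic(r, sample) for r in (1, 2, 3)])
```

For the sample [1, 2, 3, 5] the first two values are the sample mean 11/4 and the unbiased sample variance 35/12, which gives a quick sanity check.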
Examples: k-statistics and polykays
The estimator for the mean is given by $k_1$
The estimator for the variance is given by $k_2$
The estimator for the skewness is given by $k_3 / k_2^{3/2}$
The estimator for the kurtosis is given by $k_4 / k_2^{2}$
The estimator for the is given by
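As an independent cross-check of such estimators, the first four k-statistics have well-known closed forms in the power sums S_1, ..., S_4. The Python sketch below (not the worksheet's Maple code; all function names are hypothetical) evaluates them exactly and forms the ratios k_3 / k_2^(3/2) and k_4 / k_2^2, assuming these are the skewness and kurtosis estimators intended above.

```python
from fractions import Fraction


def power_sums(data, up_to=4):
    """Return [S_1, ..., S_up_to], where S_r is the sum of r-th powers of the data."""
    return [sum(Fraction(x) ** r for x in data) for r in range(1, up_to + 1)]


def first_four_k_statistics(data):
    """Closed-form k_1..k_4 in terms of power sums (needs at least 4 data points)."""
    n = len(data)
    if n < 4:
        raise ValueError("k_4 needs at least 4 data points")
    s1, s2, s3, s4 = power_sums(data)
    k1 = s1 / n
    k2 = (n * s2 - s1 ** 2) / (n * (n - 1))
    k3 = (2 * s1 ** 3 - 3 * n * s1 * s2 + n ** 2 * s3) / (n * (n - 1) * (n - 2))
    k4 = (-6 * s1 ** 4 + 12 * n * s1 ** 2 * s2 - 3 * n * (n - 1) * s2 ** 2
          - 4 * n * (n + 1) * s1 * s3 + n ** 2 * (n + 1) * s4) \
        / (n * (n - 1) * (n - 2) * (n - 3))
    return k1, k2, k3, k4


def skewness_and_kurtosis(data):
    """Ratios k3 / k2**1.5 and k4 / k2**2 as floats (fails if all data are equal)."""
    _, k2, k3, k4 = first_four_k_statistics(data)
    return float(k3) / float(k2) ** 1.5, float(k4 / k2 ** 2)


if __name__ == "__main__":
    print(first_four_k_statistics([1, 2, 3, 5]))
    print(skewness_and_kurtosis([1, 2, 3, 5]))
```

Exact rational arithmetic is used for the k-statistics so that the results can be compared term by term with symbolic output; only the final ratios are converted to floating point.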
Examples: multivariate k-statistics and multivariate polykays
Conclusions
Umbral formulae for k-statistics and polykays, either in single or multivariate cases, share a common algorithm to construct multiset subdivisions. When the multiset has the form {a^(i)}, an efficient way is to resort to integer partitions.
In general we may construct multiset subdivisions by using suitable set partitions, but this procedure is inefficient from a computational point of view [4].
Indeed, a subdivision may occur more than once in the same formula, so it is necessary to build a procedure that generates only distinct subdivisions together with their multiplicities, i.e. the number of corresponding set partitions.
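The saving can be quantified in the simplest case of a multiset with r equal elements. The Python sketch below (an illustration with hypothetical function names, not the makeTab algorithm) counts the distinct subdivisions, which correspond to integer partitions of r, and checks that their multiplicities add up to the number of set partitions of an r-set, i.e. the Bell number.

```python
from collections import Counter
from math import factorial, prod


def integer_partitions(n, largest=None):
    """Yield the partitions of n as non-increasing tuples of positive parts."""
    largest = n if largest is None else largest
    if n == 0:
        yield ()
        return
    for part in range(min(n, largest), 0, -1):
        for tail in integer_partitions(n - part, part):
            yield (part,) + tail


def multiplicity(parts):
    """Number of set partitions of an r-set whose block sizes are `parts`."""
    denom = prod(factorial(p) for p in parts)
    denom *= prod(factorial(m) for m in Counter(parts).values())
    return factorial(sum(parts)) // denom


def bell_number(n):
    """Number of set partitions of an n-set, computed with the Bell triangle."""
    row = [1]
    for _ in range(n):
        new_row = [row[-1]]
        for value in row:
            new_row.append(new_row[-1] + value)
        row = new_row
    return row[0]


if __name__ == "__main__":
    r = 6
    subdivisions = list(integer_partitions(r))
    print(len(subdivisions), "distinct subdivisions versus",
          bell_number(r), "set partitions")
```

For r = 6 there are 11 distinct subdivisions against 203 set partitions, and the gap widens quickly as r grows.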
To accomplish this task, the algorithm makeTab takes into account the connection between multisets and integer partitions, reducing the overall computational complexity. It is possible to build very fast algorithms for k-statistics, polykays and their generalizations by forfeiting the elegant idea of producing only one algorithm for the whole subject; see [5], [6].
References
[1] Di Nardo E., G. Guarino, D. Senato (2008) A Maple algorithm for polykays and their generalizations. Adv. Appl. Stat. Vol. 8, No. 1, 19-36, http://www.pphmj.com/journals/adas.htm.
[2] Di Nardo E., G. Guarino, D. Senato (2008) A unifying framework for k-statistics, polykays and their generalizations. Bernoulli Vol. 14(2), 440-468. Official Journal of the Bernoulli Society for Mathematical Statistics and Probability, http://isi.cbs.nl/bernoulli/ (download from http://www.unibas.it/utenti/dinardo/lavori.html)
[3] Di Nardo E., G. Guarino, D. Senato, Multiset Subdivision, source Maple algorithm located in www.maplesoft.com (submitted)
[4] Di Nardo E., G. Guarino, D. Senato (2008) Symbolic computation of moments of sampling distributions. Comp. Stat. Data Analysis Vol. 52, no. 11, 4909-4922, (download from http://arxiv.org/PS_cache/arxiv/pdf/0806/0806.0129v1.pdf or http://www.unibas.it/utenti/dinardo/lavori.html)
[5] Di Nardo E., G. Guarino, D. Senato (2009), A new method for fast computing unbiased estimators of cumulants. Statistics and Computing Vol. 19, 155-165. (download from http://www.unibas.it/utenti/dinardo/lavori.html)
[6] Di Nardo E., G. Guarino, D. Senato, Fast Maple algorithms for k-statistics, polykays and their multivariate generalization, source Maple algorithm located in www.maplesoft.com (submitted)
Legal Notice: The copyright for this application is owned by the author(s). Neither Maplesoft nor the author are responsible for any errors contained within and are not liable for any damages resulting from the use of this material. This application is intended for non-commercial, non-profit use only. Contact the author for permission if you wish to use this application in for-profit activities