Statistics - Maple Programming Help


Statistics[DataSummary]

compute seven summary statistics for a data sample

 


Calling Sequence

DataSummary(A, options)

Parameters

A - data set or Matrix data set

options - (optional) equation(s) of the form option=value, where option is one of ignore, output, summarize, tableweights, or weights; specify options for the DataSummary function

Options

  

The options argument can contain one or more of the options shown below. Some of these options are described in more detail in the Statistics[DescriptiveStatistics] help page.

• ignore : truefalse; This option controls how missing data is handled by the DataSummary command. Missing items are represented by undefined or Float(undefined). If ignore=false and A contains missing data, most of the computed statistics will be undefined. If ignore=true, all missing items in A are ignored. The default value is false.

• output : default or quantity, where quantity is any of mean, standarddeviation, skewness, kurtosis, minimum, maximum, and cumulativeweight; this option indicates which quantities need to be calculated. The value of this option can also be a list of quantities. In this case the DataSummary command returns a list of the specified quantities in the specified order.

• summarize : false or embed; Display an embedded summary table. The default is false.

• tableweights : list(integer); Relative weights for the column widths of the embedded table. By default all columns have equal weight.

• weights : Vector of data weights. The number of elements in the weights array must be equal to the number of elements in the original data sample. By default all elements in A are assigned weight 1.
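The effect of the ignore option can be seen on a small sample with one missing entry. The following is a minimal sketch; the data values and the name B are made up for illustration and do not come from this help page.

```maple
with(Statistics):

# Hypothetical sample with one missing entry, represented by undefined.
B := Vector([2.0, 4.0, undefined, 6.0, 8.0]):

# Default behavior (ignore=false): the missing entry is kept in the data,
# so most of the returned statistics are expected to be undefined.
DataSummary(B);

# ignore=true: the missing entry is dropped before the statistics are computed.
DataSummary(B, ignore = true);
```

With ignore=true, the statistics should be based on the four remaining values only.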

Description

• The DataSummary function computes seven summary statistics for the data set A: the mean, standard deviation, coefficient of skewness, coefficient of kurtosis, minimum, maximum, and cumulative weight of a data sample. By default the DataSummary command returns a column vector of equations of the form quantity=value, where quantity is one of mean, standarddeviation, skewness, kurtosis, minimum, maximum, or cumulativeweight.

• The first parameter A is the data set, such as a Vector, a Matrix data set, or a DataFrame.
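The cumulative weight is easiest to interpret together with the weights option. The sketch below uses made-up data (the names V and W are hypothetical) and requests only two quantities through the output option, so the result is a list rather than a column vector of equations.

```maple
with(Statistics):

# Hypothetical data and weights; both have four elements,
# as the weights option requires.
V := Vector([1.0, 2.0, 3.0, 4.0]):
W := Vector([1, 1, 2, 4]):

# Request the mean and the cumulative weight, in that order.
DataSummary(V, weights = W, output = [mean, cumulativeweight]);

# The same request without weights: every element then has weight 1.
DataSummary(V, output = [mean, cumulativeweight]);
```

Assuming the weights enter the mean as an ordinary weighted average, the first call would be expected to give a mean of 25/8 = 3.125 and a cumulative weight of 8, while the second call would give a cumulative weight equal to the number of elements.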

Computation

• All computations involving data are performed in floating-point; therefore, all data provided must have type realcons and all returned solutions are floating-point, even if the problem is specified with exact values.

• For more information about computation in the Statistics package, see the Statistics[Computation] help page.
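As a brief illustration of the floating-point rule above, the following sketch passes exact integers and rationals to DataSummary. The list exactData is a made-up example, not taken from this help page.

```maple
with(Statistics):

# Exact integer and rational inputs, all of type realcons.
exactData := [1, 3/2, 2, 5/2, 3]:

# The requested quantities are returned as floating-point numbers,
# not as exact rationals.
DataSummary(exactData, output = [mean, minimum, maximum]);
```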

Examples

> with(Statistics):

> X := RandomVariable(Normal(10, 3)):

> A := Sample(X, 10^4):

The DataSummary command returns a Vector containing the summary statistics.

> DataSummary(A);

    mean = 9.929906967641008
    standarddeviation = 2.9854565077355004
    skewness = 0.03205073815120898
    kurtosis = 2.987474052351289
    minimum = -1.6116642566301707
    maximum = 20.78301180186928
    cumulativeweight = 10000.0

(1)

> DataSummary(A, output = [mean, standarddeviation]);

    [9.92990696764101, 2.98545650773550]

(2)

> DataSummary(A, output = [minimum, mean, standarddeviation, maximum]);

    [-1.61166425663017, 9.92990696764101, 2.98545650773550, 20.7830118018693]

(3)

Consider the following Matrix data set.

> M := Matrix([[3, 1130, 114694], [4, 1527, 127368], [3, 907, 88464], [2, 878, 96484], [4, 995, 128007]]);

    [ 3   1130   114694 ]
    [ 4   1527   127368 ]
    [ 3    907    88464 ]
    [ 2    878    96484 ]
    [ 4    995   128007 ]

(4)

For Matrix inputs, the DataSummary command outputs a Vector containing the corresponding summary statistics by column.

> results := DataSummary(M);

    mean = 3.2
    standarddeviation = 0.8366600265340756
    skewness = -0.30734449954313
    kurtosis = 1.4775510204081639
    minimum = 2.0
    maximum = 4.0
    cumulativeweight = 5.0

    mean = 1087.4
    standarddeviation = 264.5719183889326
    skewness = 0.9339774575409044
    kurtosis = 2.061469467497881
    minimum = 878.0
    maximum = 1527.0
    cumulativeweight = 5.0

    mean = 1.110034 × 10^5
    standarddeviation = 17953.973120175935
    skewness = -0.22301188518436363
    kurtosis = 1.1020141039120812
    minimum = 88464.0
    maximum = 1.280070 × 10^5
    cumulativeweight = 5.0

(5)

To display the summary for one of the columns:

> results[1];

    mean = 3.2
    standarddeviation = 0.8366600265340756
    skewness = -0.30734449954313
    kurtosis = 1.4775510204081639
    minimum = 2.0
    maximum = 4.0
    cumulativeweight = 5.0

(6)
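The values for the first column can be cross-checked by hand. This help page does not state the formulas used, so the sketch below rests on an assumption: the standard deviation uses the divisor n - 1, and the skewness and kurtosis divide the third and fourth central sums by n - 1 and by the matching power of the standard deviation. That normalization was chosen only because it reproduces the numbers printed above; the names colA, n, m, s, skew, and kurt are illustrative.

```maple
# First column of the Matrix data set M from the example above.
colA := [3, 4, 3, 2, 4]:
n := nops(colA):

# Sample mean.
m := add(x, x in colA)/n:

# Sample standard deviation (divisor n - 1).
s := sqrt(add((x - m)^2, x in colA)/(n - 1)):

# Assumed normalization, inferred from the printed values:
# central sums divided by (n - 1) and by s^3 or s^4.
skew := add((x - m)^3, x in colA)/((n - 1)*s^3):
kurt := add((x - m)^4, x in colA)/((n - 1)*s^4):

evalf([m, s, skew, kurt]);
```

Worked by hand, the deviations from the mean 3.2 are -0.2, 0.8, -0.2, -1.2, and 0.8, so the sums of their squares, cubes, and fourth powers are 2.8, -0.72, and 2.896. This gives a standard deviation of sqrt(0.7), about 0.8367, a skewness of about -0.3073, and a kurtosis of 0.724/0.49, about 1.4776, in agreement with the first column's summary.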

If the input is a DataFrame object, then the result is a DataFrame that has the same column labels as the original input, and the row labels correspond to the output quantities requested.

> df := DataFrame(M, columns = [a, b, c]);

         a      b        c
    1    3   1130   114694
    2    4   1527   127368
    3    3    907    88464
    4    2    878    96484
    5    4    995   128007

(7)

> df_results := DataSummary(df);

                         a                      b                     c
    mean                 3.2                    1087.4                1.110034 × 10^5
    standarddeviation    0.8366600265340756     264.5719183889326     17953.973120175935
    skewness             -0.30734449954313      0.9339774575409044    -0.22301188518436363
    kurtosis             1.4775510204081639     2.061469467497881     1.1020141039120812
    minimum              2.0                    878.0                 88464.0
    maximum              4.0                    1527.0                1.280070 × 10^5
    cumulativeweight     5.0                    5.0                   5.0

(8)

> df_results[b];

    DataSeries([1087.4, 264.5719183889326, 0.9339774575409044, 2.061469467497881, 878.0, 1527.0, 5.0], labels = [mean, standarddeviation, skewness, kurtosis, minimum, maximum, cumulativeweight], datatype = anything)

(9)

The summarize option makes it possible to display an embedded table containing the results. Note that the embedded table is only for display and that the returned value of the DataSummary command is unchanged.

> results := DataSummary(df, summarize = embed):

                         a                        b                       c
    mean                 3.20000000000000018      1087.40000000000009     111003.399999999994
    standarddeviation    0.836660026534075563     264.571918388932602     17953.9731201759350
    skewness             -0.307344499543129979    0.933977457540904443    -0.223011885184363629
    kurtosis             1.47755102040816388      2.06146946749788107     1.10201410391208121
    minimum              2.                       878.                    88464.
    maximum              4.                       1527.                   128007.
    cumulativeweight     5.                       5.                      5.

As in the earlier DataFrame example, the value returned and assigned to results is the same:

> results[1];

    DataSeries([3.2, 0.8366600265340756, -0.30734449954313, 1.4775510204081639, 2.0, 4.0, 5.0], labels = [mean, standarddeviation, skewness, kurtosis, minimum, maximum, cumulativeweight], datatype = anything)

(10)

The tableweights option controls the width of columns in an embedded table.

> interface(displayprecision = 4):

> DataSummary(df, summarize = embed, tableweights = [4, 2, 2, 2]):

                         a          b            c
    mean                 3.2000     1087.4000    111003.4000
    standarddeviation    0.8367     264.5719     17953.9731
    skewness             -0.3073    0.9340       -0.2230
    kurtosis             1.4776     2.0615       1.1020
    minimum              2.0000     878.0000     88464.0000
    maximum              4.0000     1527.0000    128007.0000
    cumulativeweight     5.0000     5.0000       5.0000

References

  

Stuart, Alan, and Ord, Keith. Kendall's Advanced Theory of Statistics. 6th ed. London: Edward Arnold, 1998. Vol. 1: Distribution Theory.

Compatibility

• The A parameter was updated in Maple 16.

• The summarize option was introduced in Maple 2016.

• For more information on Maple 2016 changes, see Updates in Maple 2016.

• The Statistics[DataSummary] command was updated in Maple 2019.

• The tableweights option was introduced in Maple 2019.

• For more information on Maple 2019 changes, see Updates in Maple 2019.

See Also

Statistics

Statistics[Computation]

Statistics[DescriptiveStatistics]