ViolinPlot - Maple Help

Statistics

 ViolinPlot
 create violin plots from data

Calling Sequence

 ViolinPlot(X, Y, options, plotoptions)

Parameters

 X - data
 Y - (optional) data to be plotted in conjunction with X. When Y is specified, the command draws half a violin plot for each individual sample in X and in Y. This can make it easier to compare the distributions directly.
 options - (optional) equation(s) of the form option=value, where option is one of datasetlabels, offset, distance, width, mean, symbol, quantiles, interval, divider, orientation, color, filled, scale, method, range, kernel, bins, left, right, or bandwidth; specify options for generating the violin plot
 plotoptions - options to be passed to the plots[display] command

Description

 • The ViolinPlot command generates a violin plot for the specified data. A violin plot is a visualization of the distribution of data consisting of a rotated kernel density plot and markers for the quartiles and the mean.
 • The parameter X is either a single data sample - given as e.g. a Vector - or a list of data samples. Note that the individual samples may be of variable size.
 • The optional parameter Y is either a single data sample - given as e.g. a Vector - or a list of data samples. Note that the individual samples may be of variable size, but the number of individual samples must be equal to the number of samples specified in X.
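 As a minimal sketch of the input forms described above, the following calls pass a single sample, a list of samples, and a pair of lists. The names X1, X2, Y1, and Y2 and the distribution parameters are illustrative assumptions only; the samples deliberately differ in size.

```maple
with(Statistics):
# Hypothetical samples of different sizes
X1 := Sample(Normal(0, 1), 50):
X2 := Sample(Normal(1, 2), 80):
Y1 := Sample(Normal(0, 2), 40):
Y2 := Sample(Normal(2, 1), 100):
# A single sample gives one violin plot
ViolinPlot(X1);
# A list of samples gives one violin plot per sample
ViolinPlot([X1, X2]);
# With Y, each violin is split into a half for X and a half for Y;
# X and Y must contain the same number of samples
ViolinPlot([X1, X2], [Y1, Y2]);
```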

Options

 The options argument can contain one or more of the options shown below. All unrecognized options will be passed to the plots[display] command. See plot/options for details.
 • datasetlabels=default or list
 Data set labels for the individual violin plots. The labels appear along the axes.  By default, the labels are set to 1, 2, 3, etc.
 • offset=realcons
 Initial offset along the x-axis. The default value is 0.
 Note: By default, the view wraps tightly around all visible plot objects and the horizontal axis is marked by data set labels, not regular coordinates, so this option will have no (visual) effect. It is meant for the case where this plot is combined with other plot elements.
 • distance=nonnegative
 This option controls the distance between the violin plots. The default value is 0.25.
 • width=realcons
 This option controls the width of the violin plots. The default value is 0.75.
 The following plot illustrates how the options offset, distance, and width are interpreted.
 – Note that the lengths of the arrows labeled "offset", "width", and "distance" correspond to the values of the offset, width, and distance options, respectively.
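 The layout options can be tried out with a short sketch such as the one below. The sample list S and the label strings are illustrative; the example assumes that strings are acceptable entries in the datasetlabels list.

```maple
with(Statistics):
# Three hypothetical samples with increasing means
S := [seq(Sample(Normal(i, 1), 50), i = 1 .. 3)]:
# Default layout: width = 0.75, distance = 0.25
ViolinPlot(S);
# Narrower violins with more space between them, and custom labels
ViolinPlot(S, width = 0.5, distance = 0.5, datasetlabels = ["A", "B", "C"]);
```

 Because the default view wraps tightly around the plot, changing offset alone is not expected to produce a visible difference here.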
 • mean=true or false
 If this option is set to true then the mean is included in the plot. The default value is true.
 • symbol=name or list
 This option specifies the symbol type for the points representing the mean. By default, the symbol type is diamond. When two data sets are given, X and Y, you can specify one symbol type or a list of two symbol types. Providing a list of two names specifies the symbol types for plotting the points corresponding to the means of X and Y, respectively.
 • quantiles=list of lists
 This option can be used to mark specific quantiles. Quantiles are represented by horizontal lines of specified length. Each sublist contains two elements: the first specifies the quantile to be marked and the second specifies the length of the horizontal line. The default value is [[3/4, 0.5], [1/2, 0.75], [1/4, 0.5]].
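 For illustration, the following sketch replaces the default quartile markers with markers at the 10th, 50th, and 90th percentiles and suppresses the mean. The sample S and the chosen line lengths are assumptions made for this example.

```maple
with(Statistics):
# Hypothetical sample
S := Sample(Normal(0, 1), 200):
# Each sublist is [quantile, length of the horizontal line]
ViolinPlot(S, quantiles = [[9/10, 0.25], [1/2, 0.75], [1/10, 0.25]], mean = false);
```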
 • interval=realcons
 This option controls the amount of space inserted between the two halves of a violin plot. The default value is set to 0.
 • divider=true or false
 If divider is set to true then a vertical line is drawn to separate the halves of a violin plot. By default divider=false in which case no such line is drawn.
 • orientation=horizontal or vertical
 Indicate the orientation of the violin plots. The default is vertical. The option descriptions in this help page assume the orientation is set to vertical as well.
 • color=name, list, or range
 This option specifies colors for the individual data sets. If a range of colors is given, the colors are generated by selecting an appropriate number of equally spaced points in the corresponding hue range. For a list of colors, the behavior depends on whether the optional data Y is specified. In the case where Y is specified, the list can have at most two colors where the halves corresponding to the first data are colored using the first value in the list and the halves corresponding to the optional data are colored using the second value. Otherwise each of the violin plots is colored with the corresponding color in the list.
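 The options that apply when both X and Y are given (symbol, interval, divider, and a two-element color list) can be combined as in the sketch below. The data and the particular color and symbol choices are illustrative assumptions.

```maple
with(Statistics):
# Two hypothetical samples for X and two matching samples for Y
X := [seq(Sample(Normal(i, 1), 60), i = 1 .. 2)]:
Y := [seq(Sample(Normal(i, 2), 60), i = 1 .. 2)]:
# One color per half: the first entry colors the X halves, the second the Y halves
ViolinPlot(X, Y, color = ["Niagara Blue", "Niagara Green"]);
# Insert a gap between the halves and use different mean symbols for X and Y
ViolinPlot(X, Y, interval = 0.1, symbol = [diamond, circle]);
# Draw a dividing line between the halves instead of a gap
ViolinPlot(X, Y, divider = true);
```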
 • filled=true or false
 If the filled option is set to true, the area inside each of the violin plots is filled with a solid color. The color value of a filled area is set to the color of the adjacent curve. The default value is set to true.
 • scale=value
 The option specifies the method used to scale the width of violin plots. The option can be specified in the following ways:
 • scale=width(scalelistX, scalelistY) or scale=width
 In the case where the scaling method is specified as width or width(scalelistX, scalelistY), the violin plots are scaled so that their widths are proportional to the values specified in scalelistX and optionally scalelistY, where the value 1 represents the width that makes the violin plot fill the bounding box defined by the width option exactly.
 In all cases, scalelistX and scalelistY are specified as lists of numeric values.
 The default value for a scalelist is a list of all ones.
 Note that scale=width is the default.
 • scale=area(scaletype, scalelistX, scalelistY) or scale=area
 When the scaling method is specified as area, by default the violin plots are scaled such that their areas are proportional to the values specified in scalelistX and optionally scalelistY, and the maximal width among all of them is given by the width option. Then if scaletype is setwise, the proportional comparison is only within the X data set and within the Y data set. Finally, if scaletype is pairwise, the proportional comparison is within each pair of (X, Y)-data sets plotted next to each other. See the Examples section for a specific example using this option.
 • scale=count(scaletype) or scale=count
 When the scaling method is specified as count, by default the violin plots are scaled such that the area of each violin plot is proportional to the number of observations in the associated data sample, and the maximal width among all of them is given by the width option. If scaletype=setwise, the proportional comparison is only within the X data set and within the Y data set. If scaletype=pairwise, the proportional comparison is within each pair of (X, Y)-data sets plotted next to each other.
 If Y data is specified but only one list of scales is provided, then the specified list of scales is used to scale both sets of violin plots. However, this generates an error if scaletype is set to pairwise.
 Note that the number of elements in scalelistX and scalelistY, when specified, must be at least equal to the number of elements in X and Y, respectively.
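 The three scaling methods can be compared side by side with a sketch like the following. The samples and scale lists are illustrative assumptions; the sample sizes differ on purpose so that scale=count has a visible effect.

```maple
with(Statistics):
# Hypothetical data: the Y samples have twice as many observations as the X samples
X := [seq(Sample(Normal(i, 1), 50), i = 1 .. 2)]:
Y := [seq(Sample(Normal(i, 1), 100), i = 1 .. 2)]:
# Widths proportional to the given values; 1 fills the box set by the width option
ViolinPlot(X, Y, scale = width([1, 0.5], [0.5, 1]));
# Areas proportional to the given values, compared within X and within Y
ViolinPlot(X, Y, scale = area(setwise, [1, 2], [2, 1]));
# Areas proportional to the sample sizes, compared within each (X, Y) pair
ViolinPlot(X, Y, scale = count(pairwise));
```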
 • method=exact or piecewise
 This parameter specifies the method of plotting the kernel density estimate (by default this is piecewise).  For more information, see Statistics[KernelDensity].
 • range=deduce or realrange
 By default this is deduce.  This option is used to specify the vertical range in the violin plot.
 • kernel=gaussian, biweight, epanechnikov, triangular, or rectangular
 The default value is gaussian.  This option allows a non-Gaussian kernel to be used in developing the estimate. For more information, see Statistics[KernelDensity].
 • bins=posint
 The number of bins in which to categorize data points (128 by default). This value must be a power of 2 and is equal to the size of the array returned by KernelDensity when the option method=piecewise is specified.  This parameter is ignored if method=exact.
 • left=realcons
 This option specifies the lower boundary on valid data values. Any data values that are smaller than this value are discarded.  By default, this procedure will impose boundary conditions consistent with the specified range.
 • right=realcons
 This option specifies the upper boundary on valid data values. Any data values that are larger than this value are discarded.  By default, this procedure will impose boundary conditions consistent with the specified range.
 • bandwidth=realcons
 The bandwidth is a positive quantity that specifies the width of the kernel (the amount each data point affects distant portions of the probability density estimate).  Each kernel is scaled such that the bandwidth is equal to the standard deviation of the kernel.
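 To see how the density-estimation options shape the outline of a violin plot, one can vary them on a small data set, as in the sketch below. The particular bandwidth values, kernel, and range are assumptions chosen for illustration.

```maple
with(Statistics):
A := Array([-1., -0.4, -0.2, 0., 0., 0.1, 0.2, 0.7, 0.9]):
# Small bandwidth: each data point mostly affects nearby values
ViolinPlot(A, bandwidth = 0.1, method = exact);
# Larger bandwidth with a non-Gaussian kernel
ViolinPlot(A, bandwidth = 0.5, kernel = triangular, method = exact);
# Piecewise method on a fixed range; bins must be a power of 2
ViolinPlot(A, range = -3 .. 3, bins = 256);
```

 Since bins is ignored when method=exact, it is only given in the last call.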

Notes

 • Note that the labels for the data sets are placed on the axes and should not be confused with coordinates.

Examples

 > with(Statistics):
 > A := Array([-1., -0.4, -0.2, 0., 0., 0.1, 0.2, 0.7, 0.9]):
 > ViolinPlot(A, range = -5 .. 5, bins = 256, bandwidth = 1/4, method = exact)

The commands to create the plot from the Plotting Guide are

 > N := RandomVariable(Normal(0, 1)):
 > PDF(N, 0.):
 > S := Sample(N, 16):
 > ViolinPlot(S, bins = 512, kernel = epanechnikov, color = "Niagara Green", left = -6, right = 6, method = piecewise)
 > ViolinPlot([S], [A], scale = count)
 > C := [seq(Sample(Normal(ln(i), 3), 60), i = 1 .. 20)]:
 > F := [seq(Sample(Normal(sin(i*Pi), 3), 120), i = 1 .. 20)]:
 > ViolinPlot(C[1 .. 3], F[1 .. 3], size = [1000, 500], color = "LightBlue" .. "red", scale = area(pairwise, [1, 1, 2], [2, 3, 1]))

The ViolinPlot command also accepts a Matrix. The columns are understood as individual data samples.

 > R := [seq(Sample(Normal(ln(i), 3), 10), i = 1 .. 3)]:
 > M := Matrix(R, scan = columns)^%T:

Plot options such as title are passed to the plots:-display command:

 > ViolinPlot(M, color = "Niagara Blue", title = "Violin Plots")

Compatibility

 • The Statistics[ViolinPlot] command was introduced in Maple 2017.