Provide descriptive statistics for a dataset.

descriptives(x, ...)

Arguments

x

An object for which a method exists.

...

Additional arguments.

Value

A data.frame with descriptive statistics for x. Its elements are:

name (character): Variable name
type (character): Data type in R, as obtained by class(x)[1]
n (integer): Number of valid observations
missing (numeric): Proportion missing
unique (integer): Number of unique values
mean (numeric): Mean value of non-missing entries, only defined for variables that can be coerced to numeric
median (numeric): Median value of non-missing entries, only defined for numeric variables
mode (integer): For numeric variables: the mode value. For factors: the frequency of the mode value
mode_value (character): For factors: value of the mode
sd (numeric): Standard deviation of non-missing entries, only defined for variables that can be coerced to numeric
v (numeric): Variability coefficient V for factor variables (Agresti, 1990). V is the probability that two independent observations fall in different categories
min (numeric): Minimum value for numeric variables
max (numeric): Maximum value for numeric variables
range (numeric): Range (distance between min and max) for numeric variables
skew (numeric): Skewness. The normalized third central moment of a numeric variable, which reflects the asymmetry of its distribution. A symmetric distribution has a skewness of zero
skew_2se (numeric): Skewness divided by two times its standard error. Absolute values greater than one can be considered "significant" according to a Z-test with a significance level of .05
kurt (numeric): Kurtosis. The normalized fourth central moment of a numeric variable, which reflects its peakedness. A heavy-tailed distribution has high kurtosis (sometimes called leptokurtic); a light-tailed distribution has low kurtosis (sometimes called platykurtic)
kurt_2se (numeric): Kurtosis divided by two times its standard error. Absolute values greater than one can be considered "significant" according to a Z-test with a significance level of .05
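
The definition of v can be made concrete with a small sketch. It assumes that V is one minus the sum of squared category proportions, which is what "the probability that two independent observations fall in different categories" implies. The helper name variability_v() is hypothetical and is not part of the package; the package's own implementation may differ in detail.

```r
# Hypothetical helper, not part of the package: variability coefficient V
# for a categorical variable, taken here as 1 - sum(p_k^2).
variability_v <- function(x) {
  x <- x[!is.na(x)]                      # drop missing entries
  p <- as.numeric(table(x)) / length(x)  # observed category proportions
  1 - sum(p^2)                           # P(two independent draws differ)
}

v_species <- variability_v(iris$Species)
print(v_species)
```

For iris$Species, three equally frequent categories give 1 - 3 * (1/3)^2 = 2/3, which agrees with the v value reported for Species in the Examples section.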

References

Agresti, A. (2012). Categorical data analysis (Vol. 792). John Wiley & Sons.

Examples

descriptives(iris)
#>           name    type   n missing unique     mean median  mode mode_value
#> 1 Sepal.Length numeric 150       0     35 5.843333   5.80  5.80       <NA>
#> 2  Sepal.Width numeric 150       0     23 3.057333   3.00  3.00       <NA>
#> 3 Petal.Length numeric 150       0     43 3.758000   4.35  4.35       <NA>
#> 4  Petal.Width numeric 150       0     22 1.199333   1.30  1.30       <NA>
#> 5      Species  factor 150       0      4       NA     NA 50.00     setosa
#>          sd         v min max range       skew   skew_2se     kurt kurt_2se
#> 1 0.8280661        NA 4.3 7.9   3.6  0.3117531  0.7871027 2.426432 3.082490
#> 2 0.4358663        NA 2.0 4.4   2.4  0.3157671  0.7972372 3.180976 4.041048
#> 3 1.7652982        NA 1.0 6.9   5.9 -0.2721277 -0.6870579 1.604464 2.038279
#> 4 0.7622377        NA 0.1 2.5   2.4 -0.1019342 -0.2573597 1.663933 2.113826
#> 5        NA 0.6666667  NA  NA    NA         NA         NA       NA       NA
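
The skew and skew_2se columns in the output above can be approximated by hand. The sketch below rests on stated assumptions and is not the package's own code: skewness is taken as the third central moment divided by the second central moment raised to the power 1.5, and the standard error is assumed to be the common normal-theory expression sqrt(6n(n-1) / ((n-2)(n+1)(n+3))). The helper name skew_2se_sketch() is hypothetical.

```r
# Hypothetical helper, not part of the package.
skew_2se_sketch <- function(x) {
  x <- x[!is.na(x)]                 # use non-missing entries only
  n <- length(x)
  d <- x - mean(x)                  # deviations from the mean
  skew <- mean(d^3) / mean(d^2)^1.5 # normalized third central moment
  # Assumed normal-theory standard error of the skewness
  se <- sqrt(6 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))
  c(skew = skew, skew_2se = skew / (2 * se))
}

res <- skew_2se_sketch(iris$Sepal.Length)
print(res)
```

Under these assumptions the result for Sepal.Length should land close to the first row of the output above (skew about 0.312, skew_2se about 0.787). Because the absolute value of skew_2se is below one, the skewness would not be considered "significant" at the .05 level.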