The four commands all produce summary statistics. The choice of the command, however, depends on the question we ask.
summarize returns a variety of univariate summary statistics. If no options have been specified for the variables, it displays the number of observations, mean, standard deviation, minimum and maximum values.
. sysuse auto
. sum weight
Variable | Obs Mean Std. Dev. Min Max
-------------+---------------------------------------------------------
weight | 74 3019.459 777.1936 1760 4840
. sum weight, detail produces additional statistics with the detail option.
Weight (lbs.)
-------------------------------------------------------------
Percentiles Smallest
1% 1760 1760
5% 1830 1800
10% 2020 1800 Obs 74
25% 2240 1830 Sum of Wgt. 74
50% 3190 Mean 3019.459
Largest Std. Dev. 777.1936
75% 3600 4290
90% 4060 4330 Variance 604029.8
95% 4290 4720 Skewness .1481164
99% 4840 4840 Kurtosis 2.118403
tabulate produces one- and two-way tables of frequencies.
. tab rep78 produces the distribution of the variable car repair record rep78.
Repair |
Record 1978 | Freq. Percent Cum.
------------+-----------------------------------
1 | 2 2.90 2.90
2 | 8 11.59 14.49
3 | 30 43.48 57.97
4 | 18 26.09 84.06
5 | 11 15.94 100.00
------------+-----------------------------------
Total | 69 100.00
In combination with the categorical variable, tabulate can produce a cross-tabulation of summary statistics by category.
. tab rep78 foreign produces the car repair record frequencies by foreign or domestic car type, where foreign is a categorical variable.
Repair |
Record | Car type
1978 | Domestic Foreign | Total
-----------+----------------------+----------
1 | 2 0 | 2
2 | 8 0 | 8
3 | 27 3 | 30
4 | 9 9 | 18
5 | 2 9 | 11
-----------+----------------------+----------
Total | 48 21 | 69
. tab rep78 foreign, column row reports relative frequencies within the columns and rows when the options column and row are specified.
+-------------------+
| Key |
|-------------------|
| frequency |
| row percentage |
| column percentage |
+-------------------+
Repair |
Record | Car type
1978 | Domestic Foreign | Total
-----------+----------------------+----------
1 | 2 0 | 2
| 100.00 0.00 | 100.00
| 4.17 0.00 | 2.90
-----------+----------------------+----------
2 | 8 0 | 8
| 100.00 0.00 | 100.00
| 16.67 0.00 | 11.59
-----------+----------------------+----------
3 | 27 3 | 30
| 90.00 10.00 | 100.00
| 56.25 14.29 | 43.48
-----------+----------------------+----------
4 | 9 9 | 18
| 50.00 50.00 | 100.00
| 18.75 42.86 | 26.09
-----------+----------------------+----------
5 | 2 9 | 11
| 18.18 81.82 | 100.00
| 4.17 42.86 | 15.94
-----------+----------------------+----------
Total | 48 21 | 69
| 69.57 30.43 | 100.00
| 100.00 100.00 | 100.00
Using by with the categorical or indicator variables produces separate tables of the statistics by each category.
. by foreign: tab rep78 returns the summary statistics in two tables for each foreign and domestic car type.
------------------------------------------------------------
-> foreign = Domestic
Repair |
Record 1978 | Freq. Percent Cum.
------------+-----------------------------------
1 | 2 4.17 4.17
2 | 8 16.67 20.83
3 | 27 56.25 77.08
4 | 9 18.75 95.83
5 | 2 4.17 100.00
------------+-----------------------------------
Total | 48 100.00
-------------------------------------------------------------
-> foreign = Foreign
Repair |
Record 1978 | Freq. Percent Cum.
------------+-----------------------------------
3 | 3 14.29 14.29
4 | 9 42.86 57.14
5 | 9 42.86 100.00
------------+-----------------------------------
Total | 21 100.00
tab1 is a convenience tool for one-way table of frequencies.
. tab1 rep78 foreign headroom returns three tables of frequencies for each variable, equivalent to
. tab rep78
. tab foreign
. tab headroom
-> tabulation of rep78
Repair |
Record 1978 | Freq. Percent Cum.
------------+-----------------------------------
1 | 2 2.90 2.90
2 | 8 11.59 14.49
3 | 30 43.48 57.97
4 | 18 26.09 84.06
5 | 11 15.94 100.00
------------+-----------------------------------
Total | 69 100.00
-> tabulation of foreign
Car type | Freq. Percent Cum.
-----------+-----------------------------------
Domestic | 52 70.27 70.27
Foreign | 22 29.73 100.00
-----------+-----------------------------------
Total | 74 100.00
-> tabulation of headroom
Headroom |
(in.) | Freq. Percent Cum.
------------+-----------------------------------
1.5 | 4 5.41 5.41
2.0 | 13 17.57 22.97
2.5 | 14 18.92 41.89
3.0 | 13 17.57 59.46
3.5 | 15 20.27 79.73
4.0 | 10 13.51 93.24
4.5 | 4 5.41 98.65
5.0 | 1 1.35 100.00
------------+-----------------------------------
Total | 74 100.00
tabulate category, summarize(var) produces one- and two-way tables of means and standard deviations by category on var.
. tab foreign, sum(weight) returns the summary table of the means and standard deviations of the car weight by foreign and domestic car types.
| Summary of Weight (lbs.)
Car type | Mean Std. Dev. Freq.
------------+------------------------------------
Domestic | 3,317.115 695.36374 52
Foreign | 2,315.909 433.00345 22
------------+------------------------------------
Total | 3,019.459 777.19357 74
tabulate var, generate(newvar) creates indicator variables.
. tab foreign, gen(import) generates two new variables import1, indicating whether the car is domestic made, and import2, indicating whether the car is foreign made.
Car type | Freq. Percent Cum.
------------+-----------------------------------
Domestic | 52 70.27 70.27
Foreign | 22 29.73 100.00
------------+-----------------------------------
Total | 74 100.00
tabstat varlist, by(category) statistics (stat1 stat2…) reports summary statistics by groups, where multiple statistics can be specified, including mean, sum, count, standard deviation, percentiles etc.
. tabstat price weight, by(foreign) statistics(n mean sd p50)
reports the number of nonmissing observations, mean, standard deviation, and the 50th percentile (median) of variable price and weight by their car type of foreign or domestic made.
Summary statistics: N, mean, sd, p50
by categories of: foreign (Car type)
foreign | price weight
---------+--------------------
Domestic | 52 52
| 6072.423 3317.115
| 3097.104 695.3637
| 4782.5 3360
---------+--------------------
Foreign | 22 22
| 6384.682 2315.909
| 2621.915 433.0035
| 5759 2180
---------+--------------------
Total | 74 74
| 6165.257 3019.459
| 2949.496 777.1936
| 5006.5 3190
------------------------------
table category1, by(category2) contents (stat1 var stat2 var stat3 var…) is a more flexible command that produces one-, two-, and up-to-seven-way tables of various summary statistics. Up to four variables can be selected as categories and up to five statistics can be specified.
. table rep78, by(foreign) contents(n weight mean weight sd weight p50 weight)
reports the mean and the 50th percentile of car weight by repair record rep78 and foreign/domestic car types. foreign is the main category under which each value of price is specified.
------------------------------------------------------------------
Car type |
and |
Repair |
Record |
1978 | N(weight) mean(weight) sd(weight) med(weight)
----------+-------------------------------------------------------
Domestic |
1 | 2 3,100 523.259 3,100
2 | 8 3,353.8 445.9961 3,465
3 | 27 3,442.2 645.3701 3,350
4 | 9 3,532.2 806.2068 3,700
5 | 2 1,960 226.2742 1,960
----------+-------------------------------------------------------
Foreign |
1 |
2 |
3 | 3 2,010 158.7451 2,070
4 | 9 2,207.8 335.4764 2,160
5 | 9 2,403.3 405.0926 2,240
------------------------------------------------------------------