reshape long varlist, i(i) j(j) converts data from wide to long; reshape wide varlist, i(i) j(j) converts data from long to wide.
varlist lists the variables to be reshaped;
i is the id variable (or variables) of the higher level;
j is the id variable of the lower level;
wide means reshaping to wide;
long means reshaping to long.
The long format has more observations and the wide format has more variables. In the long format, each observation holds the values for one lower-level unit, and these observations are grouped under the values of the higher-level id. In the wide format, each observation is one higher-level unit, and the values for each lower-level unit are stored in their own columns.
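The examples that follow load a dataset named long. If that file is not at hand, a minimal sketch such as the one below should recreate it from the listing. The storage type of course (str1) is an assumption, and save with the replace option will overwrite any long.dta already in the working directory.

```stata
* Recreate the example data by hand.
* Assumption: course is stored as a one-character string (str1).
clear
input id semester str1 course gpa attendance
1 1 A 3.81 1
1 1 B 3.82 2
1 2 A 3.76 3
1 2 B 3.77 4
2 1 A 3.56 4
2 1 B 3.55 3
2 2 A 3.45 2
2 2 B 3.47 1
end

* Check that id, semester and course together identify each observation;
* isid exits with an error if they do not.
isid id semester course

* Save as long.dta in the current working directory.
save long, replace
```

Running isid before a reshape is a cheap way to catch duplicate i-j combinations, which reshape wide would otherwise reject.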
. use long
. list
     +------------------------------------------+
     | id   semester   course    gpa   attend~e |
     |------------------------------------------|
  1. |  1          1        A   3.81          1 |
  2. |  1          1        B   3.82          2 |
  3. |  1          2        A   3.76          3 |
  4. |  1          2        B   3.77          4 |
  5. |  2          1        A   3.56          4 |
     |------------------------------------------|
  6. |  2          1        B   3.55          3 |
  7. |  2          2        A   3.45          2 |
  8. |  2          2        B   3.47          1 |
     +------------------------------------------+
. reshape wide gpa attendance, i(id semester) j(course) string
converts the dataset from long to wide. The string option is needed because course holds string values (A and B) rather than numbers.
(note: j = A B)
Data                               long   ->   wide
-----------------------------------------------------------------------------
Number of obs.                        8   ->       4
Number of variables                   5   ->       6
j variable (2 values)            course   ->   (dropped)
xij variables:
                                    gpa   ->   gpaA gpaB
                             attendance   ->   attendanceA attendanceB
-----------------------------------------------------------------------------
. list
     +---------------------------------------------------+
     | id   semester   gpaA   attend~A   gpaB   attend~B |
     |---------------------------------------------------|
  1. |  1          1   3.81          1   3.82          2 |
  2. |  1          2   3.76          3   3.77          4 |
  3. |  2          1   3.56          4   3.55          3 |
  4. |  2          2   3.45          2   3.47          1 |
     +---------------------------------------------------+
. reshape long
changes it back to the long format. After a reshape, Stata remembers the variables and options, so they do not have to be typed again.
(note: j = A B)
Data                               wide   ->   long
-----------------------------------------------------------------------
Number of obs.                        4   ->       8
Number of variables                   6   ->       5
j variable (2 values)                     ->   course
xij variables:
                              gpaA gpaB   ->   gpa
                attendanceA attendanceB   ->   attendance
-----------------------------------------------------------------------
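The j variable does not have to be a string. As a hedged sketch, assuming the same long dataset as above, the data can instead be spread across semesters, with id and course together serving as the higher-level id. Because semester is numeric, the string option is left out, and the new variables should be named gpa1, gpa2, attendance1 and attendance2.

```stata
* Assumes long.dta (the dataset used in the examples above) is in the
* current working directory.
use long, clear

* One row per id-course pair; semester values become variable suffixes.
reshape wide gpa attendance, i(id course) j(semester)
list

* Return to the long format; the i() and j() settings are remembered.
reshape long
```

Which variables go in i() and which in j() depends on what one row of the wide data is meant to represent: here a student-course pair rather than a student-semester pair.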
collapse (stat1) varlist1 (stat2) varlist2…, by(group varlist) replaces the dataset in memory with summary statistics computed within each group. Available statistics include the mean, median, percentiles, standard deviation, standard error, first and last values, maximum and minimum, among others.
. use long
. collapse (mean) gpa, by(semester id)
returns the mean gpa of each id in each semester, averaged over courses.
. list
     +-----------------------+
     | id   semester     gpa |
     |-----------------------|
  1. |  1          1   3.815 |
  2. |  2          1   3.555 |
  3. |  1          2   3.765 |
  4. |  2          2    3.46 |
     +-----------------------+
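Several statistics can be requested in one collapse call. The sketch below, again assuming the long dataset from above, computes three summaries per student. The names mean_gpa, max_gpa and total_attend are illustrative choices, not part of the original data; the newvar=varname form is what lets two statistics be taken from the same source variable.

```stata
* Assumes long.dta (the dataset used in the examples above) is in the
* current working directory.
use long, clear

* One row per id: mean and maximum gpa, and total attendance.
* mean_gpa, max_gpa and total_attend are illustrative names.
collapse (mean) mean_gpa=gpa (max) max_gpa=gpa (sum) total_attend=attendance, by(id)
list
```

Since collapse overwrites the data in memory, the original observations are gone afterwards unless the dataset is reloaded or was protected with preserve beforehand.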