proc
sql;create
table newdataset asselect
*, count(*) as cty from olddatagroup
by gvkeyhaving cty >
6; quit;* In this data set, the two ID variables are GVKEY and YEAR. I want to count the number of rows
of data for each value of GVKEY and only keep those GVKEYs that have at least 7 rows of data.
This code takes the existing data set named OLDDATA and creates a new dataset named NEWDATATSET
that has in it all the variables in OLDDATA (that is what select * does) *and* it creates a
variable named CTY that is the count of the number of rows of data and names it CTY.
It does this counting for each value of the ID GVKEY the number of rows of data. Finally, it keeps in
the NEWDATASET only those GVKEYs that have at least 7 rows of data.