Minot
Using Stata for Survey Analysis
Example 9: Using “tab…gen” to create dummy variables
egen
This command is an extended version of “generate” that creates a new variable by aggregating
existing data, for example a mean or a total calculated over the whole sample or within groups.
It is a powerful and useful command with no direct equivalent in SPSS. To do the same thing in
SPSS, you would need to create a new file with “aggregate” and merge it with the original file
using “match files.” The syntax is:
egen newvar = fcn(argument) [if exp] [in range] [, by(var)]
where
    newvar is the new variable to be created
    fcn is one of numerous functions such as:
        count( )
        max( )
        min( )
        mean( )
        median( )
        rank( )
        sd( )
        sum( )
    argument is normally just a variable
    var in the optional by() option must be a categorical variable; if by() is omitted,
        the function is calculated over the entire sample
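To make the syntax concrete, here is a minimal sketch on a small made-up dataset. The variable names (region, yield) and the values are hypothetical and are not taken from the BLSS. It illustrates that egen puts the group-level result on every observation in the group, including observations where the argument itself is missing.

```stata
* Toy data: hypothetical names and values, for illustration only
clear
input byte region float yield
1 2.0
1 4.0
2 3.0
2 5.0
2 .
end

* Mean yield in each region, repeated on every row of that region
* (the missing yield is ignored when the mean is calculated)
egen meanyield = mean(yield), by(region)

* Number of nonmissing yield values in each region
egen nyield = count(yield), by(region)

list, sepby(region)
```

In this sketch the last observation has no yield, yet it still receives the region 2 mean. That behaviour is what makes egen convenient for filling in missing values.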
Suppose you want to estimate the demand for rice using the BLSS data. You calculate a price
variable from the data, but it is missing for households that do not buy rice. You can calculate
the average price in each dzongkhag and replace the missing values with that average, as follows
(the example assumes that the dzongkhag identifier is stored in a variable named province):
egen avgprice = mean(price), by(province)
replace price=avgprice if price==.
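The two commands above can be tried on a toy dataset before running them on the survey. The sketch below uses invented values, and the flag variable imputed is a hypothetical addition (not part of the original example) that records which prices were filled in. It uses missing(price) rather than price==. so that extended missing values (.a to .z) would also be caught.

```stata
* Toy data: hypothetical values, for illustration only
clear
input byte province float price
1 10
1 14
1 .
2 20
2 .
end

* Average observed price in each province, placed on every row
egen avgprice = mean(price), by(province)

* Hypothetical flag so imputed prices can be identified later
gen byte imputed = missing(price)

* Fill in missing prices with the province average
replace price = avgprice if missing(price)

list, sepby(province)
```

If no household in a province reports a price, avgprice is itself missing for that province and the prices there remain missing, so it is worth checking for remaining missing values after the replace.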
Here are some other examples:
egen avg = mean(yield)                       creates variable of average yield over entire sample
egen avg2 = median(pcexpend), by(sexhead)    creates variable of median pcexpend for each sex
egen regprod = sum(prod), by(reg4)           creates variable of total production for each region
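A note on the regional total, offered as a sketch rather than a definitive statement: in more recent Stata releases the egen function for totals is documented as total(), and sum() appears to be kept only for backward compatibility. The same result can also be obtained with the bysort prefix instead of the by() option. The data below are invented; only the variable names prod and reg4 come from the example above.

```stata
* Toy data: hypothetical values, for illustration only
clear
input byte reg4 float prod
1 100
1 250
2 300
2 .
end

* Total production in each region using the by() option
egen regprod = total(prod), by(reg4)

* Same calculation using the bysort prefix
bysort reg4: egen regprod2 = total(prod)

* The two approaches should agree on every observation
assert regprod == regprod2

list, sepby(reg4)
```

Note that total() treats missing values as zero, so the observation with missing prod does not make the region 2 total missing.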