Using Stata for Survey Data Analysis
Minot
Page 23
If you use the “if” qualifier, the old values are retained for observations where the “if” condition is false.
You can use the period (.) to represent missing values.
For example,
replace price = avgprice if price > 100000
replaces prices above 100000 with the value of avgprice
replace pcexpend = . if pcexpend <= 0
replaces zero or negative values of pcexpend with a missing value
replace agehead = 25 in 1007
sets agehead to 25 in observation #1007
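The three forms of replace shown above can be tried on a small made-up dataset. This is only a sketch: the data values are invented for illustration, and the observation number is changed from 1007 to 2 because the toy data has three rows. One point worth noting is that Stata treats missing values as larger than any number, so a condition such as price > 100000 is also true when price is missing.

```stata
* Hypothetical toy data; variable names follow the examples above
clear
input price avgprice pcexpend agehead
50000  60000  1200  40
250000 60000  -50   31
.      60000  0     58
end

* Missing values rank above every number, so "price > 100000" is also
* true when price is missing; !missing(price) leaves those rows alone
replace price = avgprice if price > 100000 & !missing(price)

* Set zero or negative expenditure to missing
replace pcexpend = . if pcexpend <= 0

* Change a single observation by its position in the current sort order
replace agehead = 25 in 2

list
```

Without the !missing(price) condition, the third observation would also receive the value of avgprice, which may or may not be what you want.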
Example 8 shows the use of the gen and replace commands to create a new variable called region.
The new variable has three values: 1 for west, 2 for center, and 3 for east.
Example 8: Using “generate” and “replace” to create new variables
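As a rough sketch of the approach this example describes, the commands below build region in steps with gen and replace. The source variable district and the cutoff values are hypothetical and chosen only for illustration; they are not taken from the survey data.

```stata
* Hypothetical: district is a numeric code from 1 to 20, cutoffs are made up
clear
set obs 20
generate district = _n

* Start with the west, then fill in the other regions with replace
generate region = 1 if district <= 7
replace region = 2 if district >= 8 & district <= 14
replace region = 3 if district >= 15 & !missing(district)

* Attach value labels so that tables show names instead of codes
label define regionlbl 1 "west" 2 "center" 3 "east"
label values region regionlbl
tabulate region
```

Observations that satisfy none of the conditions keep a missing value for region, which makes coding gaps easy to spot with tabulate region, missing.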
tabulate … generate
This command is useful for creating a set of dummy variables (variables with a value of 0 or 1)
depending on the value of an existing categorical variable. The syntax is:
tabulate oldvariable, generate(newvariable)
It is easier to explain with an example. Suppose we want to create three dummy variables that
indicate whether a household is in the west, center, or east of Bhutan. We can create three
dummy variables from the variable “region” as follows:
tab region, gen(reg)
This creates three new variables, defined as follows:
reg1=1 if region=1 and 0 otherwise
reg2=1 if region=2 and 0 otherwise
reg3=1 if region=3 and 0 otherwise
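One way to confirm that the dummy variables match these definitions is to compare them with the original variable. The sketch below uses a tiny invented dataset rather than the survey data. It relies on the fact that tabulate with generate() leaves the dummies missing where the original variable is missing, unless the missing option is specified.

```stata
* Hypothetical toy data: six households in three regions
clear
input region
1
1
2
3
3
3
end

* Create reg1, reg2 and reg3 from the categories of region
tabulate region, generate(reg)

* These two counts should be identical
count if region == 1
count if reg1 == 1

* assert stops with an error if any non-missing observation disagrees
assert reg1 == (region == 1) if !missing(region)
```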
In the example below, notice that there are 1746 households in region 1 (west) and the same
number of households for which reg1 = 1.