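Dataset: earns1.dta
Requirements:
1. Generate a new variable lnwkearns equal to log(wkearns).
2. Use the egen command to generate a standardized version of wkearns, named stdwkearns.
3. Use the group() function to divide the data evenly into 9 groups, in ascending order of wkearns.
4. Use the recode() function to divide the data into 3 groups, in ascending order of wkearns, with upper bounds of 157.9 for the first group, 184.83 for the second, and 198.41 for the third. Name the new variable wkearnscat.
5. Repeat task 4 using a combination of the generate and replace commands.
6. Label wkearnscat as "wkearns category".
7. Generate a new variable highearns that equals wkearns when an observation's wkearns is greater than the mean of wkearns, and 0 otherwise.
8. From the observations after 1960, randomly draw 10 years of data and save them as a new file, earns2.dta; from the observations before 1960, randomly draw 2 years of data and save them as a new file, earns3.dta.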
Solution, adapted from the Monster group's and GPT's answers:
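use earns1.dta, clear
* 1. Natural log of wkearns
gen lnwkearns = ln(wkearns)
* 2. Standardized wkearns (mean 0, standard deviation 1)
egen stdwkearns = std(wkearns)
* 3. Nine roughly equal-sized groups in ascending order of wkearns (xtile is used here in place of group())
xtile group = wkearns, nq(9)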
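Task 3 names the group() function, while the line above uses xtile. The two approaches treat ties differently: xtile places equal values of wkearns in the same quantile group, whereas a split of the sorted data by position can separate them. A minimal sketch of the position-based split, assuming the data are in memory; grp_manual is a hypothetical variable name, and the commented-out group(9) call assumes that function is available in your Stata release:

```stata
* Put observations in ascending order of wkearns; stable keeps tied rows in their existing order
sort wkearns, stable
* Observation k of N goes to group ceil(9*k/N), giving 9 nearly equal-sized blocks
gen grp_manual = ceil(9 * _n / _N)
* Form the exercise appears to intend (assumption: group() exists in your release)
* gen grp_fn = group(9)
* Check the group sizes
tabulate grp_manual
```

Observations with a missing wkearns sort last, so in this sketch they end up in the top group; drop or exclude them first if that is not wanted.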
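* 4. Categories by upper bound (157.9, 184.83, 198.41); adjacent rules share a bound so no value falls into a gap, and the first matching rule applies
recode wkearns (min/157.9=1) (157.9/184.83=2) (184.83/198.41=3) (198.41/max=4), generate(wkearnscat)
* 5. Same categories with generate and replace; drop the variable created in step 4 first
drop wkearnscat
gen wkearnscat = .
replace wkearnscat = 1 if wkearns <= 157.9
replace wkearnscat = 2 if wkearns > 157.9 & wkearns <= 184.83
replace wkearnscat = 3 if wkearns > 184.83 & wkearns <= 198.41
* Category 4 is a catch-all for values above 198.41; missing values stay missing
replace wkearnscat = 4 if wkearns > 198.41 & !missing(wkearns)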
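Two points about the categorization above may be worth checking. First, task 4 names the recode() function rather than the recode command; the function returns the first listed bound that is at least as large as its argument, so its result holds the bounds themselves rather than the codes 1 to 3. Second, Stata evaluates comparisons in double precision, and a decimal such as 184.83 stored in a float variable is not exactly equal to the literal 184.83, so an observation sitting exactly on a bound can land in the neighboring category. The sketch below assumes wkearns is stored as float (describe shows the storage type); wkearnscat_fn and wkearnscat_f are hypothetical names that do not appear in the original.

```stata
* Check the storage type of wkearns before deciding whether float() is needed
describe wkearns

* recode() function: first bound >= wkearns, or the last bound if wkearns exceeds them all
gen wkearnscat_fn = recode(wkearns, 157.9, 184.83, 198.41)

* generate/replace with bounds rounded to float precision (assumes wkearns is a float)
gen wkearnscat_f = .
replace wkearnscat_f = 1 if wkearns <= float(157.9)
replace wkearnscat_f = 2 if wkearns > float(157.9) & wkearns <= float(184.83)
replace wkearnscat_f = 3 if wkearns > float(184.83) & wkearns <= float(198.41)

* Cross-tabulate to see which bound each category corresponds to
tabulate wkearnscat_f wkearnscat_fn, missing
```

If wkearns turns out to be stored as double, the float() wrappers are unnecessary and the plain literals can be used instead.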
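* 6. Variable label for wkearnscat
label variable wkearnscat "wkearns category"
* 7. highearns equals wkearns above the mean of wkearns, and 0 otherwise
egen mean_wkearns = mean(wkearns)
gen highearns = wkearns if wkearns > mean_wkearns
replace highearns = 0 if wkearns <= mean_wkearns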
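* 8. Random draws; sample with the count option keeps that many observations (that many years if the data hold one observation per year)
use if year >= 1960 using earns1.dta, clear
sample 10, count
sort year
save earns2.dta, replace
use if year < 1960 using earns1.dta, clear
sample 2, count
sort year
save earns3.dta, replace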
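The sampling step above reloads earns1.dta from disk, so the saved files contain only the original variables, and each run draws a different sample. A hedged alternative sketch, assuming the full data set with the generated variables is still in memory: set a seed for reproducibility and use preserve and restore so both samples are drawn from the same working data. The seed value 12345 is an arbitrary placeholder, not taken from the original.

```stata
* Fix the random-number seed so the same observations are drawn on every run
set seed 12345

* Sample from 1960 onward, then return to the full data
preserve
keep if year >= 1960
sample 10, count
sort year
save earns2.dta, replace
restore

* Sample from before 1960, then return to the full data
preserve
keep if year < 1960
sample 2, count
sort year
save earns3.dta, replace
restore
```

With this variant, earns2.dta and earns3.dta also carry lnwkearns, stdwkearns, wkearnscat, and the other variables created earlier, which may or may not be what the exercise intends.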