ZHAO Xin
(School of Science, Nanjing University of Science and Technology, Nanjing 210094, China)
Simulation-Extrapolation Estimation in Partially Linear Measurement Error Models
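This paper studies a partially linear measurement error model in which the covariate in the nonparametric part cannot be observed directly and only a surrogate variable is available. Estimators of the parameter and the nonparametric function are obtained by combining local linear estimation with the simulation-extrapolation (SIMEX) method, and the asymptotic bias and variance of the proposed estimators are derived under suitable conditions. A comparison of the proposed SIMEX method with the estimator of Liang (2000) shows that SIMEX is effective in handling measurement error. Notably, the SIMEX method requires no assumption on the distribution of the covariate in the nonparametric part.
Keywords: measurement error; partially linear model; surrogate variable; simulation-extrapolation method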
(1.1)
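The rest of the paper is organized as follows. Section 2 introduces the simulation-extrapolation (SIMEX) method and establishes the asymptotic properties of the resulting estimators. Section 3 reports simulation experiments that assess the method and compares it with the estimator proposed by Liang (2000), using numerical summaries to show the advantage of the proposed method. Proofs of the theorems are given in the Appendix.
We construct estimators of the parameter β and the nonparametric function g(·) by combining SIMEX, local linear regression and weighted least squares: the unknown function g(·) is first estimated by local linear regression, and β is then estimated by weighted least squares. The algorithm is as follows:
(i) Simulation
(2.1)
(ii) Estimation
Suppose that g(·) has a continuous second derivative in a neighborhood of x0. Then g(x) can be approximated locally by a linear function, that is,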
g(x)≈g(x0)+g′(x0)(x-x0)=a+b(x-x0),
(2.2)
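As an illustration of the local linear step, the following minimal sketch solves a kernel-weighted least squares problem for (a, b) at a single point x0. The Gaussian kernel, the bandwidth `h`, and the function name `local_linear` are assumptions made for this example; the paper's own weighting in (2.2) may differ.

```python
import numpy as np

def local_linear(x0, x, y, h):
    """Local linear fit at x0.

    Minimizes sum_i K((x_i - x0)/h) * (y_i - a - b*(x_i - x0))**2
    and returns (a_hat, b_hat), which estimate g(x0) and g'(x0).
    The Gaussian kernel K is an assumed choice.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    k = np.exp(-0.5 * ((x - x0) / h) ** 2)          # kernel weights
    D = np.column_stack([np.ones_like(x), x - x0])  # design: intercept, slope
    WD = D * k[:, None]                             # W D with W = diag(k)
    # Normal equations of weighted least squares: (D' W D) coef = D' W y
    a_hat, b_hat = np.linalg.solve(D.T @ WD, WD.T @ y)
    return a_hat, b_hat

# Usage: for exactly linear data the fit recovers g(x0) and g'(x0)
x = np.linspace(0.0, 1.0, 50)
a_hat, b_hat = local_linear(0.4, x, 2.0 + 3.0 * x, h=0.2)
```

Because the local model is itself linear, the fit is exact for linear g regardless of the bandwidth; for curved g the bandwidth controls the trade-off between bias and variance.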
Direct calculation gives the estimator of β as
where
(iii) Extrapolation
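The three steps (simulation, estimation, extrapolation) can be sketched in code as follows. This is a minimal illustration under stated assumptions, not the paper's exact estimator. It assumes the model Y = Zβ + g(X) + ε with surrogate W = X + U, where U is normal with known standard deviation `sigma_u`. It uses a Gaussian kernel, a profile least squares fit of β standing in for the weighted least squares estimator, and a quadratic extrapolation function. All function names, the grid of λ values and the toy data are hypothetical.

```python
import numpy as np

def smoother_matrix(w, h):
    """Local linear smoother matrix S; row i holds the weights of the fit at w[i]."""
    d = w[None, :] - w[:, None]                  # d[i, j] = w_j - w_i
    k = np.exp(-0.5 * (d / h) ** 2)              # Gaussian kernel (assumed)
    s1 = (k * d).sum(axis=1, keepdims=True)
    s2 = (k * d ** 2).sum(axis=1, keepdims=True)
    s = k * (s2 - d * s1)                        # local linear equivalent weights
    return s / s.sum(axis=1, keepdims=True)

def fit_beta(y, z, w, h):
    """Estimation step: smooth out g, then least squares on the residuals."""
    S = smoother_matrix(w, h)
    z_res = z - S @ z
    y_res = y - S @ y
    return np.linalg.lstsq(z_res, y_res, rcond=None)[0]

def simex_beta(y, z, w, sigma_u, h, lambdas=(0.0, 0.5, 1.0, 1.5, 2.0), B=50, seed=0):
    """SIMEX estimate of beta with quadratic extrapolation to lambda = -1."""
    rng = np.random.default_rng(seed)
    lambdas = np.asarray(lambdas, dtype=float)
    means = []
    for lam in lambdas:
        # Simulation step: add extra error with variance lam * sigma_u**2
        fits = [
            fit_beta(y, z, w + np.sqrt(lam) * sigma_u * rng.standard_normal(w.size), h)
            for _ in range(B)
        ]
        means.append(np.mean(fits, axis=0))      # average over the B replicates
    means = np.array(means)                      # shape (len(lambdas), p)
    # Extrapolation step: fit a quadratic in lambda per component, evaluate at -1
    a2, a1, a0 = np.polyfit(lambdas, means, deg=2)
    return a2 - a1 + a0

# Toy data under an assumed design, for illustration only
rng = np.random.default_rng(1)
n = 200
x = rng.uniform(0.0, 1.0, n)
z = rng.standard_normal((n, 1))
y = 1.5 * z[:, 0] + np.sin(2 * np.pi * x) + 0.1 * rng.standard_normal(n)
w = x + 0.1 * rng.standard_normal(n)             # observed surrogate

beta_naive = fit_beta(y, z, w, h=0.1)            # ignores the measurement error
beta_simex = simex_beta(y, z, w, sigma_u=0.1, h=0.1)
```

In this toy design Z is generated independently of X, so both fits should land near the true value 1.5; the difference between the Naive and SIMEX fits is expected to matter more when Z and X are correlated.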
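The asymptotic properties of the estimators of the parameter β and the nonparametric function g(·) are as follows.
Theorem 1 shows that the SIMEX estimator of β has a more complicated variance structure than the Naive estimator. The notation used in the theorem is defined in the Appendix.
Theorem 2. Suppose that conditions (C1)-(C5) in the Appendix hold. Then, as n→∞ and B→∞,
where
Er is an r×r matrix whose first element is 1 and whose other elements are all 0, and r is the dimension of the parameter Q.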
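We generate 500 data sets, with sample sizes n = 50, 100 and 150, respectively.
We use the SIMEX method and the Naive method (which ignores the measurement error) to estimate the parameter β; the results are listed in Table 1.
Table 1. SIMEX and Naive estimates of the parameter β and their standard deviations (SD)
In addition, we compare the SIMEX estimator with the Naive estimator under different measurement error variances σU = 0.2, 0.4, 0.8, with sample size n = 50. The results are listed in Table 2. The estimates of the nonparametric function g(·) for σU = 0.2 are displayed in Figure 1.
Table 1 shows that the SIMEX estimator has smaller bias than the Naive estimator, and that the bias decreases as the sample size increases; however, the standard deviation of the SIMEX estimator is slightly larger than that of the Naive estimator. Table 2 further shows that the bias increases with the measurement error variance.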
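Summaries of this kind (bias, SD, mean squared error) can be computed from replicated estimates as in the sketch below. The function name `mc_summary` and the use of the sample standard deviation (`ddof=1`) are assumptions; the paper does not state which convention it uses.

```python
import numpy as np

def mc_summary(estimates, truth):
    """Monte Carlo bias, SD and MSE of replicated estimates.

    estimates: array of shape (replications,) or (replications, p)
    truth:     true parameter value, scalar or shape (p,)
    """
    est = np.asarray(estimates, dtype=float)
    bias = est.mean(axis=0) - truth              # average estimate minus truth
    sd = est.std(axis=0, ddof=1)                 # sample SD across replications
    mse = np.mean((est - truth) ** 2, axis=0)    # mean squared error
    return bias, sd, mse

# Usage with three hypothetical replicated estimates of a parameter equal to 2
bias, sd, mse = mc_summary([1.0, 2.0, 3.0], 2.0)
```

The mean squared error combines the two other quantities (squared bias plus variance, up to the `ddof` convention), which is why an estimator with smaller bias but slightly larger SD can still be preferable.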
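Table 2. SIMEX and Naive estimates of the parameter β and their standard deviations (SD) under different measurement errors
Figure 1. Estimated curves of the nonparametric function g(·). The solid line is the true curve, the dashed line is the SIMEX estimate, and the dash-dotted line is the Naive estimate. From top to bottom and left to right, the panels correspond to sample sizes n = 50, 100 and 150.
Table 3. Bias of the HL and SIMEX estimates of the parameter β
Table 4. Mean squared error of the HL and SIMEX estimates of the nonparametric function
This paper proposes a SIMEX estimator for the partially linear model with measurement error in the nonparametric part. The asymptotic bias and variance of the parameter estimator are obtained under suitable conditions. Simulation comparisons with the Naive estimator, which ignores the measurement error, confirm the effectiveness and advantage of the proposed SIMEX estimator; a further comparison with the estimator of Liang (2000) indicates that the proposed method is a reasonable way to handle measurement error. Notably, the method requires no distributional assumption on the unobservable covariate Xi.
The proposed method can be extended to more general models, such as partially linear single-index measurement error models and partially linear varying-coefficient measurement error models. One may also consider measurement error models with missing responses. These topics are beyond the scope of this paper and are not discussed in detail here.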
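Proof of Theorem 1. We first state the conditions required by the theorem.
where
As in the SIMEX algorithm, the proof proceeds in three parts: simulation, estimation and extrapolation. Suppose that B is a fixed positive integer.
(i) Simulation
For each b, standard asymptotic theory gives
(1)
(2)
where
(3)
(4)
(ii) Estimation
Define
and
(5)
(6)
(iii) Extrapolation
and
(7)
where
Proof of Theorem 2. The proof is similar to that of Theorem 1 and is omitted.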
[1] Chai G X, Xu K J. Linear wavelet smoothing in semiparametric regression[J]. Chinese Journal of Applied Probability and Statistics, 1995, 15(1): 97-105. (in Chinese)
[2] Ren Z, Chen M H. Bootstrap approximation of L-statistics[J]. Journal of Mathematics for Technology, 1995, 4(11): 78-81. (in Chinese)
[3] Zhou H Z, Chen M H. On the non-uniform convergence rate of von Mises statistics[J]. Journal of Mathematics for Technology, 1995, 4(11): 103-106. (in Chinese)
[4] Carroll R J, Lombard F, Kuchenhoff H, Stefanski L A. Asymptotics for the SIMEX estimation in structural measurement error models[J]. Journal of the American Statistical Association, 1996, 91: 242-250.
[5] Drum M, McCullagh P. Regression models for discrete longitudinal responses: Comment[J]. Statistical Science, 1993, 8(3): 300-301.
[6] Engle R F, Granger C W J, Rice J, et al. Semiparametric estimates of the relation between weather and electricity sales[J]. Journal of the American Statistical Association, 1986, 81: 310-320.
[7] Huang Z S. Empirical likelihood for the parametric part in partially linear errors-in-function models[J]. Statistics and Probability Letters, 2012, 82: 63-66.
[8] Liang H. Generalized partially linear mixed-effects models incorporating mismeasured covariates[J]. Annals of the Institute of Statistical Mathematics, 2009, 61(1): 27-46.
[9] Liang H. Asymptotic normality of parametric part in partially linear models with measurement error in the nonparametric part[J]. Journal of Statistical Planning and Inference, 2000, 86: 51-62.
[10] Liang H Y, Jing B Y. Asymptotic normality in partial linear models based on dependent errors[J]. Journal of Statistical Planning and Inference, 2009, 139(4): 1357-1371.
[11] Robinson P M. Root-n-consistent semiparametric regression[J]. Econometrica, 1988, 56: 931-954.
[12] Schimek M G. Estimation and inference in partially linear models with smoothing splines[J]. Journal of Statistical Planning and Inference, 2000, 91(2): 525-540.
[13] Shen C W, Tsou T S, Balakrishnan N. Robust likelihood inference for regression parameters in partially linear models[J]. Computational Statistics & Data Analysis, 2011, 55(4): 1696-1714.
[14] Wang Q H. Consistent estimators in random censorship semiparametric regression models[J]. Science in China, Series A, 1996, 39: 163-176.
[15] Wang Q H, Jing B Y. Empirical likelihood for partial linear models[J]. Annals of the Institute of Statistical Mathematics, 2003, 55(3): 585-595.
[16] Xue L G, Liu Q. Bootstrap approximation of wavelet estimates in a semiparametric regression model[J]. Acta Mathematica Sinica, English Series, 2010, 26(4): 763-778.
[17] Xue L G, Zhu L X. L_1-norm estimation and random weighting method in a semiparametric model[J]. Acta Mathematicae Applicatae Sinica, 2005, 21(2): 295-302.
[18] Zeger S L, Diggle P J. Semiparametric models for longitudinal data with application to CD4 cell numbers in HIV seroconverters[J]. Biometrics, 1994, 50: 689-699.
[19] Zhu L X, Xue L G. Empirical likelihood confidence regions in a partially linear single-index model[J]. Journal of the Royal Statistical Society: Series B, 2006, 68(3): 549-570.
Simulation-Extrapolation Estimation in Partially Linear Errors-in-Function Models
ZHAO Xin
(Nanjing University of Science and Technology, Nanjing 210094, China)
We consider a partially linear model with measurement errors, in which the covariate in the nonparametric part is not observable but a surrogate variable is available. We propose estimators of the parameter and the nonparametric function by combining local linear regression with the simulation-extrapolation (SIMEX) technique. The asymptotic biases and variances of the proposed estimators are obtained under suitable conditions, and simulations are conducted to illustrate the method. A comparison with the estimator proposed by Liang (2000) indicates that the SIMEX technique handles measurement error effectively. Furthermore, our method does not require any assumption on the distribution of the covariate in the nonparametric part.
Keywords: measurement errors; partially linear model; surrogate variable; SIMEX technique
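Received: 2016-04-01; Revised: 2016-04-22
Supported by the National Natural Science Foundation of China (10871072; 11501292), the Key Project of National Statistical Science Research (2013LZ45), the Fundamental Research Funds for the Central Universities (30920130111015), and the General Program of the Natural Science Foundation of Jiangsu Province (BK20131345).
ZHAO Xin (1992-), female, master's student. Research interest: nonparametric statistical analysis. Email: 2690167203@qq.com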
CLC number: O212.7
Document code: A
Article ID: 1672-1454(2016)04-0012-08