MATH 185 – Take-Home Exam 2

Due Sunday, June 9th, by 11:59 PM

AGREEMENT

By taking this exam, you agree not to discuss the exam with anyone, starting now, whether a classmate or anyone else, and whether in person or through other means, including electronic ones. Please do not post questions on Piazza. Unless otherwise specified, it is acceptable to copy-paste from the lecture or homework solution code.

Problem 1. (Bootstrap tests for goodness-of-fit) We saw in lecture that when it comes to goodness-of-fit (GOF) testing, it is quite “natural” to obtain a p-value by permutation. It is also possible, however, to use the bootstrap for that purpose. Consider the two-sample situation for simplicity, although this generalizes to any number of samples. Thus assume a situation where we observe X_1, ..., X_m iid from F and (independently) Y_1, ..., Y_n iid from G, where F and G are two distributions on the real line. We want to test F = G versus F ≠ G. We may want to use a statistic T = T(X_1, ..., X_m, Y_1, ..., Y_n) for that purpose, and the question is how to obtain a p-value for T via a bootstrap. The idea is, as usual, to estimate the “best” null distribution and bootstrap from that distribution. A natural approach to estimate the null distribution is to simply combine the two samples as one, and estimate the corresponding distribution via the empirical distribution. We thus use the empirical distribution from the combined sample to bootstrap from.

A. Write a function bootGOFdiff(x, y, B = 2000) that takes in two samples as vectors x and y, and a number of replicates B (Monte Carlo samples from the estimated null distribution), and returns the bootstrap GOF p-value for the absolute difference in means T = |X̄ − Ȳ|.

B. Apply your function to the FIFA dataset to compare the wages of players aged 29 or younger (≤ 29 years old) with those of older players (≥ 30 years old).
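The pooled-bootstrap procedure described in Problem 1 can be sketched as follows. This is a minimal illustration in Python rather than a reference solution: the name `boot_gof_diff`, the `rng` argument, and the "+1" correction in the p-value are assumptions that do not appear in the exam text, and the lecture code may use a different convention.

```python
import numpy as np

def boot_gof_diff(x, y, B=2000, rng=None):
    """Bootstrap GOF p-value for T = |mean(x) - mean(y)| (illustrative sketch)."""
    rng = np.random.default_rng(rng)  # accepts None, an int seed, or a Generator
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    m, n = len(x), len(y)

    # Observed statistic.
    t_obs = abs(x.mean() - y.mean())

    # Estimated null distribution: empirical distribution of the combined sample.
    pooled = np.concatenate([x, y])

    # Draw B pairs of samples (sizes m and n) with replacement from the pool
    # and count how often the bootstrap statistic is at least as large.
    count = 0
    for _ in range(B):
        xs = rng.choice(pooled, size=m, replace=True)
        ys = rng.choice(pooled, size=n, replace=True)
        if abs(xs.mean() - ys.mean()) >= t_obs:
            count += 1

    # Assumed convention: add 1 to numerator and denominator so that
    # the p-value is never exactly zero.
    return (count + 1) / (B + 1)
```

Unlike a permutation test, each bootstrap sample is drawn with replacement, so the same observation can appear in both resampled groups.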

Problem 2. (Local Absolute Linear Regression) Local linear regression is a popular smoother. However, because it is based on squared errors, it is not robust. To make it more robust, one option is to use absolute errors instead.
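One way to write this estimator down, given here as a sketch: the kernel K, the evaluation point x_0, and the local coefficients a and b are notation assumed for illustration and do not appear in the exam text. At each point x_0, the squared loss of the usual local linear fit is replaced by the absolute loss, and the fitted value is the local intercept.

```latex
(\hat a, \hat b) = \arg\min_{a,\,b} \sum_{i=1}^{n}
    K\!\left(\frac{x_i - x_0}{h}\right)
    \bigl|\, y_i - a - b\,(x_i - x_0) \,\bigr|,
\qquad
\hat f(x_0) = \hat a .
```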

A. Write a function localAbsLinearRegression(x, y, h, xnew = x) that takes in paired vectors x (predictor) and y (response), and a bandwidth h, and computes the local absolute linear regression (use any kernel of your liking). The function is evaluated at the vector xnew (equal to x by default).

B. Apply your function to the Boeing stock closing prices from 1/01/2018 to 6/01/2019 (see the BA.csv file, which was downloaded from here; some dates are missing for some unknown reason). Plot the data and overlay the fitted curve for a few choices of bandwidth (identified in a legend).

C. Choose the bandwidth by 10-fold cross-validation.
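The steps in Problem 2 can be sketched as follows, under several assumptions that are not in the exam text: a Gaussian kernel, a linear-programming formulation of the weighted absolute-error fit solved with `scipy.optimize.linprog`, and mean absolute prediction error as the cross-validation criterion. The names `local_abs_linear_regression` and `cv_bandwidth` are hypothetical Python counterparts of the function requested in part A, not the course's own code.

```python
import numpy as np
from scipy.optimize import linprog

def local_abs_linear_regression(x, y, h, xnew=None):
    """Local linear fit with absolute loss and a Gaussian kernel (sketch)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xnew = x if xnew is None else np.atleast_1d(np.asarray(xnew, dtype=float))
    k = len(x)
    ones = np.ones(k)
    # LP variables: intercept a, slope b (both free), and u_i >= |residual_i|.
    bounds = [(None, None), (None, None)] + [(0, None)] * k
    fitted = np.full(len(xnew), np.nan)
    for j, x0 in enumerate(xnew):
        d = x - x0
        w = np.exp(-0.5 * (d / h) ** 2)          # Gaussian kernel weights
        c = np.concatenate([[0.0, 0.0], w])      # minimise sum_i w_i * u_i
        # Encode y_i - a - b*d_i <= u_i and -(y_i - a - b*d_i) <= u_i.
        A_ub = np.vstack([
            np.column_stack([-ones, -d, -np.eye(k)]),
            np.column_stack([ones, d, -np.eye(k)]),
        ])
        b_ub = np.concatenate([-y, y])
        res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
        if res.success:
            fitted[j] = res.x[0]                 # local intercept = fit at x0
    return fitted

def cv_bandwidth(x, y, bandwidths, n_folds=10, seed=0):
    """Pick a bandwidth by K-fold CV on mean absolute error (assumed criterion)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(x)), n_folds)
    scores = []
    for h in bandwidths:
        errs = []
        for test_idx in folds:
            train = np.setdiff1d(np.arange(len(x)), test_idx)
            pred = local_abs_linear_regression(x[train], y[train], h, x[test_idx])
            errs.append(np.abs(y[test_idx] - pred))
        scores.append(np.nanmean(np.concatenate(errs)))
    scores = np.array(scores)
    return bandwidths[int(np.argmin(scores))], scores
```

For the stock data, the dates would first need to be converted to numbers (for example, days since the first observation) before being passed as `x`. The linear program is solved once per evaluation point, so this sketch favours clarity over speed.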
