## Preamble: Package Loading
import numpy as np
import ipywidgets as ipw
from IPython.display import display,display_html
import matplotlib.pyplot as plt
from matplotlib import gridspec
import pandas as pd
import json
import kernel as kr
import psc_dbl_sumdisp as psd
By: Eric Penner
The following notebook contains the results of a Monte Carlo exercise conducted on the estimator detailed in 'shi.ipynb' and 'shi_wp.pdf', using data sets generated by 'shi_dgp.ipynb' (see that notebook for details of the DGP).
Important features of each of the following trials are presented here.
# Trial 1.1: 0
inpt_filenames = [['pscout_9_4_1347.json', 'pscout_9_4_1505.json', 'pscout_9_4_1913.json', 'pscout_9_4_1816.json']]
line_nms = [['Oracle', 'Known', 'Unknown', 'Lasso']]
# Trial 1.2: 1
inpt_filenames.append(['pscout_9_4_1232.json', 'pscout_9_4_1837.json', 'pscout_9_4_1969.json', 'pscout_9_4_1655.json'])
line_nms.append(['Oracle', 'Known', 'Unknown', 'Lasso'])
# Trial 1.3: 2
inpt_filenames.append(['pscout_9_4_1636.json', 'pscout_9_4_1191.json', 'pscout_9_4_1510.json'])
line_nms.append(['Oracle', 'Known', 'Lasso'])
# Trial 1.4: 3
inpt_filenames.append(['pscout_9_4_1816.json', 'pscout_9_4_1655.json', 'pscout_9_4_1510.json', 'pscout_9_5_1624.json'])
line_nms.append(['t_inst=15', 't_inst=30', 't_inst=45', 't_inst=100'])
# Trial 2.0: 4
inpt_filenames.append(['pscout_9_5_1624.json', 'pscout_9_6_1331.json', 'pscout_9_6_1606.json'])
line_nms.append(['ncs = 10', 'ncs = 25', 'ncs = 40'])
# Trial 3.0: 5
inpt_filenames.append(['pscout_9_5_1624.json', 'pscout_9_6_1537.json', 'pscout_9_6_1725.json'])
line_nms.append(['ntp = 50', 'ntp = 100', 'ntp = 150'])
# Load each results file; the returned tuple is unpacked positionally below:
# [0] estimation input dictionary, [1] beta estimates, [2] beta summary tables,
# [3] alpha estimates, [4] alpha summary tables.
res_out = [[psd.psc_load(inpt_filenames[k][i]) for i in range(len(inpt_filenames[k]))]
           for k in range(len(line_nms))]
estin_dcts = [[res_out[k][i][0] for i in range(len(inpt_filenames[k]))]
              for k in range(len(line_nms))]
# Recover the DGP summary file behind each data set and load it; its tuple holds
# [0] the DGP input dictionary, [1] the true beta coefficients, [2] the true alpha coefficients.
dgp_sum_filenames = [[estin_dcts[k][i]['input_filename'].replace('pscdata','pscsum')
                      for i in range(len(inpt_filenames[k]))]
                     for k in range(len(line_nms))]
dgp_dicts = [[psd.pscsum_load(dgp_sum_filenames[k][i])
              for i in range(len(dgp_sum_filenames[k]))]
             for k in range(len(line_nms))]
dgpin_dcts = [[dgp_dicts[k][i][0] for i in range(len(inpt_filenames[k]))]
              for k in range(len(line_nms))]
# Merge the estimation and DGP input dictionaries for display in each trial section below.
merged_dcts = [[{**estin_dcts[k][i], **dgpin_dcts[k][i]}
                for i in range(len(inpt_filenames[k]))]
               for k in range(len(line_nms))]
true_bcoeffs = [[dgp_dicts[k][i][1] for i in range(len(inpt_filenames[k]))]
                for k in range(len(line_nms))]
true_acoeffs = [[dgp_dicts[k][i][2] for i in range(len(inpt_filenames[k]))]
                for k in range(len(line_nms))]
bcoeff = [[res_out[k][i][1] for i in range(len(inpt_filenames[k]))]
          for k in range(len(line_nms))]
acoeff = [[res_out[k][i][3] for i in range(len(inpt_filenames[k]))]
          for k in range(len(line_nms))]
btables = [[res_out[k][i][2] for i in range(len(inpt_filenames[k]))]
           for k in range(len(line_nms))]
atables = [[res_out[k][i][4] for i in range(len(inpt_filenames[k]))]
           for k in range(len(line_nms))]
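As a quick check of this bookkeeping, the short sketch below (my addition, not part of the original pipeline) prints, for each trial, the line label, the results file, and the DGP summary file it resolves to, so a mismatched pairing is easy to spot.
# Sketch: map each trial/label to its results file and the DGP summary file behind it.
for k, nms in enumerate(line_nms):
    for i, nm in enumerate(nms):
        print(k, nm, inpt_filenames[k][i], '->', dgp_sum_filenames[k][i])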
For each time period $t \in \{1,2, \ldots, T\}$, component $d \in \{1,2, \ldots, p_1\}$ of $Z_{1jt}$, and cross section $j \in \{1,2,\ldots, q\}$, where $q, T \in \mathbb{N}$, consider the following,
\begin{align}
Y_{jt} &= \beta_0 + [\; Z_{1jt}' \;\; Z_{2jt}' \;] \beta_1 + e_j + \varepsilon_{jt} \\[1em]
Z_{1jt,d} &= \alpha_{0jd} + Z_{2jt}' \alpha_{1jd} + W_{jt}' \alpha_{2jd} + V_{jt,d} \tag{2d} \\[1em]
E(&\varepsilon_{jt} | Z_{1jt} , Z_{2jt}) = E(\varepsilon_{jt} | Z_{1jt}) \neq 0 \\[1em]
E(& V_{jdt} | Z_{1j(t-1)},\ldots,Z_{1j(t-c)},Z_{2jt},W_{jt}) = 0
\end{align}
Let $n_{tp} \equiv T$ be the total number of time periods, $n_{end} \equiv p_1$ the number of endogenous regressors included in the primary regression, $n_{exo} \equiv p_2$ the number of exogenous regressors included in the primary regression, and $n_{tinst} \equiv w$ and $n_{cinst} \equiv w_j$ the total number of available instruments and the number of instruments relevant to each cross section, respectively. Now let,
$$
\begin{align*}
\rho_{er} &= \begin{bmatrix} \rho_{er,1} & \rho_{er,2} & \cdots & \rho_{er,n_{end}} \end{bmatrix} \\
\rho_{inst} &= \begin{bmatrix} \rho_{inst,1} & \rho_{inst,2} & \cdots & \rho_{inst,n_{tinst}-1} \end{bmatrix} \\
\rho_{ex} &= \begin{bmatrix} \rho_{ex,1} & \rho_{ex,2} & \cdots & \rho_{ex,n_{ex}-1} \end{bmatrix}
\end{align*}
$$
so that I can define the covariance matrices for each cross section as follows:
$$
\begin{align*}
cv_{er} = \begin{bmatrix}
1 & \rho_{er,1} & \rho_{er,2} & \cdots & \rho_{er,n_{end}} \\
\rho_{er,1} & 1 & \rho_{er,1} & \cdots & \rho_{er,n_{end}-1} \\
\rho_{er,2} & \rho_{er,1} & 1 & \cdots & \rho_{er,n_{end}-2} \\
\vdots & & & \ddots & \\
\rho_{er,n_{end}} & \rho_{er,n_{end}-1} & \rho_{er,n_{end}-2} & \cdots & 1
\end{bmatrix}
\hspace{1cm}
cv_{ex} = \begin{bmatrix}
1 & \rho_{ex,1} & \rho_{ex,2} & \cdots & \rho_{ex,n_{ex}-1} \\
\rho_{ex,1} & 1 & \rho_{ex,1} & \cdots & \rho_{ex,n_{ex}-2} \\
\rho_{ex,2} & \rho_{ex,1} & 1 & \cdots & \rho_{ex,n_{ex}-3} \\
\vdots & & & \ddots & \\
\rho_{ex,n_{ex}-1} & \rho_{ex,n_{ex}-2} & \rho_{ex,n_{ex}-3} & \cdots & 1
\end{bmatrix}
\end{align*}
$$
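As an illustration, a banded matrix of this form is fully determined by its first row, so it can be built with scipy.linalg.toeplitz. The sketch below is my own; the $\rho$ values are placeholders rather than the ones used in the trials.
from scipy.linalg import toeplitz
# Sketch: ones on the diagonal and rho[k-1] on the k-th off-diagonal,
# so the dimension is len(rho) + 1 (e.g. cv_er is (n_end+1) x (n_end+1)).
def banded_cov(rho):
    return toeplitz(np.concatenate(([1.0], np.asarray(rho))))

cv_er_example = banded_cov([0.5, 0.25])   # placeholder rho_er with n_end = 2
cv_ex_example = banded_cov([0.3])         # placeholder rho_ex with n_ex = 2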
Since every cross section uses some subset of the vector of instruments, the following is the full covariance matrix for all instruments.
$$
CV_{inst} = \begin{bmatrix}
1 & \rho_{inst,1} & \rho_{inst,2} & \cdots & \rho_{inst,n_{tinst}-1} \\
\rho_{inst,1} & 1 & \rho_{inst,1} & \cdots & \rho_{inst,n_{tinst}-2} \\
\rho_{inst,2} & \rho_{inst,1} & 1 & \cdots & \rho_{inst,n_{tinst}-3} \\
\vdots & & & \ddots & \\
\rho_{inst,n_{tinst}-1} & \rho_{inst,n_{tinst}-2} & \rho_{inst,n_{tinst}-3} & \cdots & 1
\end{bmatrix}
$$
As a result, we can construct the covariance matrices for all cross sections,
$$
\begin{align*}
CV_{er} = \begin{bmatrix}
cv_{er} & \mathbf{0}_{(n_{end}+1 \times n_{end}+1)} & \cdots & \mathbf{0}_{(n_{end}+1 \times n_{end}+1)} \\
\mathbf{0}_{(n_{end}+1 \times n_{end}+1)} & cv_{er} & \cdots & \mathbf{0}_{(n_{end}+1 \times n_{end}+1)} \\
\vdots & \vdots & \ddots & \vdots \\
\mathbf{0}_{(n_{end}+1 \times n_{end}+1)} & \mathbf{0}_{(n_{end}+1 \times n_{end}+1)} & \cdots & cv_{er}
\end{bmatrix}
\hspace{1cm}
CV_{ex} = \begin{bmatrix}
cv_{ex} & \mathbf{0}_{(n_{ex} \times n_{ex})} & \cdots & \mathbf{0}_{(n_{ex} \times n_{ex})} \\
\mathbf{0}_{(n_{ex} \times n_{ex})} & cv_{ex} & \cdots & \mathbf{0}_{(n_{ex} \times n_{ex})} \\
\vdots & \vdots & \ddots & \vdots \\
\mathbf{0}_{(n_{ex} \times n_{ex})} & \mathbf{0}_{(n_{ex} \times n_{ex})} & \cdots & cv_{ex}
\end{bmatrix}
\end{align*}
$$
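Because the same block repeats for every cross section, these stacked matrices are block diagonal. Continuing the sketch above with an assumed number of cross sections:
n_cs_example = 3                                            # placeholder number of cross sections
CV_er_example = np.kron(np.eye(n_cs_example), cv_er_example)  # block-diagonal stack of cv_er
CV_ex_example = np.kron(np.eye(n_cs_example), cv_ex_example)  # block-diagonal stack of cv_ex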
Now I generate error terms, instruments, and exogenous variables from multivariate normal distributions. First let
$$
\begin{align*}
Z_{2jt} &= \begin{bmatrix} Z_{2jt,1} & Z_{2jt,2} & \cdots & Z_{2jt,n_{ex}} \end{bmatrix}' \\[10pt]
W_t &= \begin{bmatrix} W_{t,1} & W_{t,2} & \cdots & W_{t,n_{tinst}} \end{bmatrix}' \\[10pt]
\tilde{V}_{jt} &= \begin{bmatrix} V_{jt,1} & V_{jt,2} & \cdots & V_{jt,n_{end}} & \varepsilon_{jt} \end{bmatrix}'
\end{align*}
$$
Then consider $W_{t} \sim N(\mathbf{0}_{n_{tinst} \times 1}, CV_{inst})$,
$$ \begin{bmatrix} Z_{21t}' & Z_{22t}' & \cdots & Z_{2n_{cs}t}' \end{bmatrix}' \sim N(\mathbf{0}_{n_{cs} \cdot n_{exo} \times 1}, CV_{ex}) \hspace{1cm} \text{ and } \hspace{1cm} \begin{bmatrix} \tilde{V}_{1t}' & \tilde{V}_{2t}' & \cdots & \tilde{V}_{n_{cs},t}' \end{bmatrix}' \sim N(\mathbf{0}_{n_{cs} \cdot (n_{end} +1) \times 1}, CV_{er}) $$
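A sketch of the corresponding draws with NumPy follows, reusing banded_cov and the example blocks above; the dimensions, seed, and instrument correlations are placeholders, not the trial settings.
rng = np.random.default_rng(0)                      # placeholder seed
n_tp_example, n_tinst_example = 50, 15              # placeholder n_tp and n_tinst
CV_inst_example = banded_cov(0.4 ** np.arange(1, n_tinst_example))   # placeholder AR(1)-style rho_inst
W = rng.multivariate_normal(np.zeros(n_tinst_example), CV_inst_example, size=n_tp_example)
Z2 = rng.multivariate_normal(np.zeros(CV_ex_example.shape[0]), CV_ex_example, size=n_tp_example)
Vt = rng.multivariate_normal(np.zeros(CV_er_example.shape[0]), CV_er_example, size=n_tp_example)
# W:  (n_tp, n_tinst);  Z2: (n_tp, n_cs*n_ex), stacked by cross section;
# Vt: (n_tp, n_cs*(n_end+1)), each block holding (V_{jt,1},...,V_{jt,n_end}, eps_{jt}).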
I generate the endogenous variables $Z_{1jt,d}$ according to the following equation,
$$ Z_{1jt,d} = \alpha_{0jd} + Z_{2jt}' \alpha_{1jd} + W_{jt}' \alpha_{2jd} + V_{jt,d} \hspace{1cm} \text{ where } \hspace{1cm} \alpha_{0jd} = 1/2 + j/2 $$
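Continuing the sketch for a single cross section $j$ and component $d$ (the coefficient values are placeholders; in the trials only the instruments relevant to cross section $j$ receive non-zero entries in $\alpha_{2jd}$):
n_end_example, n_ex_example = 2, 2                  # placeholder dimensions matching the blocks above
j, d = 1, 1                                         # one cross section, one endogenous component
alpha_0 = 0.5 + j / 2                               # alpha_{0jd} = 1/2 + j/2
alpha_1 = np.full(n_ex_example, 0.5)                # placeholder alpha_{1jd}
alpha_2 = np.zeros(n_tinst_example); alpha_2[:3] = 1.0   # placeholder alpha_{2jd}: few relevant instruments
Z2_j = Z2[:, (j - 1) * n_ex_example:j * n_ex_example]            # exogenous regressors for cross section j
V_jd = Vt[:, (j - 1) * (n_end_example + 1) + (d - 1)]            # error column V_{jt,d}
Z1_jd = alpha_0 + Z2_j @ alpha_1 + W @ alpha_2 + V_jd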
Having generated all primary regressors, I generate the regressand of the primary equation as follows,
$$ Y_{jt} = [\; Z_{1jt}' \;\; Z_{2jt}' \;] \beta_1 + e_j + \varepsilon_{jt} \;\;\;\; \text{ where } \;\;\;\; e_{j} = 1+j/2 $$
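And the regressand, with $e_j = 1 + j/2$ and a placeholder $\beta_1$; for brevity this sketch keeps the single endogenous component generated above, so the regressor block is $[\, Z_{1jt,1} \;\; Z_{2jt}' \,]$:
beta_1 = np.full(1 + n_ex_example, 1.0)                          # placeholder beta_1
e_j = 1 + j / 2                                                  # cross-section effect e_j = 1 + j/2
eps_j = Vt[:, (j - 1) * (n_end_example + 1) + n_end_example]     # epsilon_{jt} column for cross section j
Y_j = np.column_stack([Z1_jd, Z2_j]) @ beta_1 + e_j + eps_j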
A number of variables are used below; here are their descriptions. Refer back to 'psc.ipynb' or 'psc_dgp.ipynb' for more details. A short example of pulling a few of these fields out of the merged dictionaries is sketched after the table.
Variable Name | Description |
---|---|
k_H | Kernel number used for H function estimation |
c_H | Plug-in bandwidth constant for H function estimation |
k_mvd | Kernel number used for multivariate (d>2) density estimation |
c_mvd | Plug-in bandwidth constant for multivariate (d>2) density estimation |
k_uvd | Kernel number used for bivariate density estimation |
c_uvd | Plug-in bandwidth used for bivariate density estimation |
dep_nm | Variable name of the dependent variable |
en_nm | Variable names of each endogenous variable |
ex_nm | Variable names of each exogenous variable |
in_nm | Variable names of instruments relevant to each cross section |
err_vpro | Vector of covariances used to construct the error covariance matrix |
ex_vpro | Vector of covariances used to construct the exogenous-variable covariance matrix |
inst_vpro | Vector of covariances used to construct the instrument covariance matrix |
frc | Indicator for whether the functional form of the control function is forced |
input_filename | File name of the data set used to generate the results |
kwnsub | Indicator for whether the subset of instruments relevant to each cross section is known |
n_end | Number of endogenous variables |
n_exo | Number of exogenous variables |
ncs | Number of cross sections |
nds | Number of DGP data sets |
ntp | Number of time periods |
orcl | Indicator for whether residuals $V$ are observed (=1) or not |
r_seed | Random number generator seed used to generate the data set |
sec_pan | Indicator for whether the secondary-equation data is a panel or not |
c_inst | Number of instruments relevant to each cross section |
t_inst | Total number of instruments |
inc | List of instruments relevant to at least one cross section |
tin | Variable name of the time period index |
cin | Variable name of the cross section index |
lasso | Indicator for lasso estimation |
alph | Lasso penalty value |
epsil | Threshold for averaging "non-zero" coefficients |
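For example, a compact way to compare the headline settings of the first results file in each trial is sketched below; it assumes the listed keys appear in every merged dictionary and falls back to NaN where one is missing.
# Sketch: one row per trial, using a handful of the keys described above.
keys = ['input_filename', 'ncs', 'ntp', 'n_end', 'n_exo', 't_inst', 'c_inst', 'lasso']
summary = pd.DataFrame([{key: merged_dcts[k][0].get(key, float('nan')) for key in keys}
                        for k in range(len(merged_dcts))],
                       index=[nms[0] for nms in line_nms])
summary   # displays the comparison table in the notebook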
Here we examine the sampling distribution of $\hat{\beta}_1$ and $\hat{\alpha}_{1}$.
Here I have merged the dictionary used to generate the underlying data set (you will see its file name below) with the dictionary used to produce the estimates based on that data. For a description of each variable name, please see the Variable Description table above.
merged_dcts[0][0]
Here I display the coefficient vectors $\alpha_{1jd}$ used to generate the data set (rows index cross section and equation), corresponding to the position of its file name in the input file list for this trial above.
Note:
1.) A zero coefficient in the following matrix means that the instrument it multiplies is not relevant to that cross section.
2.) The density of the secondary regression coefficient matrix is 25%
display_html(true_acoeffs[0][0][0])
Here I show characteristics of the sampling distributions of the elements of $\hat{\alpha}_{dj}$.
for i in range(1,len(atables[0])):
    display(atables[0][i][0])
    display(line_nms[0][i])
Here I display the coefficient vector $\beta_1$ used to generate the data set, corresponding to the position of its file name in the input file list for this trial above.
true_bcoeffs[0][0]
Here I show the sampling distribution of the elements of $\hat{\beta}_1$.
psd.tbl_dsp(btables[0],1,5,line_nms[0],1)
psd.coeffden(bcoeff[0],line_nms[0],[-0.4,0.4],9,2,1,0,0)
psd.coeffden(bcoeff[0],line_nms[0],[-0.4,0.4],9,2,0,0,0)
Here we examine the sampling distribution of $\hat{\beta}_1$ and $\hat{\alpha}_{1}$.
Here I have merged the dictionary used to generate the underlying data set (you will see its file name below) with the dictionary used to produce the estimates based on that data. For a description of each variable name, please see the Variable Description table above.
merged_dcts[1][0]
Here I display the coefficient vectors $\alpha_{1jd}$ used to generate the data set (rows index cross section and equation), corresponding to the position of its file name in the input file list for this trial above.
Note:
1.) The density of the secondary regression coefficient matrix is 13%
display_html(true_acoeffs[1][0][0])
Here I show characteristics of the sampling distributions of the elements of $\hat{\alpha}_{dj}$.
for i in range(1,len(atables[1])):
    display(atables[1][i][0])
    display(line_nms[1][i])
Here I display the coefficient vector $\beta_1$ used to generate the data set, corresponding to the position of its file name in the input file list for this trial above.
true_bcoeffs[1][0]
Here I show the sampling distribution of the elements of $\hat{\beta}_1$.
psd.tbl_dsp(btables[1],1,5,line_nms[1],1)
psd.coeffden(bcoeff[1],line_nms[1],[-0.4,0.4],9,2,1,0,0)
psd.coeffden(bcoeff[1],line_nms[1],[-0.4,0.4],9,2,0,0,0)
Here we examine the sampling distribution of $\hat{\beta}_1$ and $\hat{\alpha}_{1}$.
Here I have merged the dictionary used to generate the underlying data set (you will see its file name below) with the dictionary used to produce the estimates based on that data. For a description of each variable name, please see the Variable Description table above.
merged_dcts[2][0]
Here I display the coefficient vectors $\alpha_{1jd}$ used to generate the data set (rows index cross section and equation), corresponding to the position of its file name in the input file list for this trial above.
Note:
1.) A zero coefficient in the following matrix means that the instrument it multiplies is not relevant to that cross section.
2.) The density of the secondary regression coefficient matrix is 2%
display_html(true_acoeffs[2][0][0])
Here I show characteristics of the sampling distributions of the elements of $\hat{\alpha}_{dj}$.
for i in range(1,len(atables[2])):
    display(atables[2][i][0])
    display(line_nms[2][i])
Here I display the coefficient vector $\beta_1$ used to generate the data set, corresponding to the position of its file name in the input file list for this trial above.
true_bcoeffs[2][0]
Here I show the sampling distribution of the elements of $\hat{\beta}_1$.
psd.tbl_dsp(btables[2],1,5,line_nms[2],1)
psd.coeffden(bcoeff[2],line_nms[2],[-0.4,0.4],9,2,1,0,0)
psd.coeffden(bcoeff[2],line_nms[2],[-0.4,0.4],9,2,0,0,0)
Here we examine the sampling distribution of $\hat{\beta}_1$ and $\hat{\alpha}_{1}$ as the total number of instruments increases.
Here I have merged the dictionary used to generate the underlying data set (you will see its file name below) with the dictionary used to produce the estimates based on that data; there is one such merged dictionary per file in the input file list for this trial above.
In accordance with the trial description, the only differences across these merged dictionaries should be the total number of instruments (t_inst) and the file name of the data set used to generate the results.
Here I only display the dictionary for the new case where t_inst = 100.
merged_dcts[3][3]
Here I display the coefficient vectors $\alpha_{1jd}$ used to generate the data sets (rows index cross section and equation), corresponding to the position of its file name in the input file list for this trial above. They should also be identical across data sets.
Note:
1.) A zero coefficient in the following matrix means that the instrument it multiplies is not relevant to that cross section.
2.) In accordance with the description above, they should be identical across results data sets.
3.) Here I only show the parameter matrix for the new case where there are 100 total instruments.
display_html(true_acoeffs[3][3][0])
Here I show the sampling distribution of the elements of $\hat{\alpha}_{dj}$.
for i in range(1,len(atables[3])):
    display(atables[3][i][0])
    display(line_nms[3][i])
Here I display the coefficient vector $\beta_1$ used to generate the data sets, corresponding to the position of its file name in the input file list for this trial above. It should be identical across data sets.
true_bcoeffs[3][0]
Here I show the sampling distribution of the elements of $\hat{\beta}_1$.
psd.tbl_dsp(btables[3],1,5,line_nms[3],1)
psd.coeffden(bcoeff[3],line_nms[3],[-0.4,0.4],9,2,1,0,0)
psd.coeffden(bcoeff[3],line_nms[3],[-0.4,0.4],9,2,0,0,0)
Here we examine the sampling distribution of $\hat{\beta}_1$ and $\hat{\alpha}_{1}$ as the number of cross sections increases.
Here I have merged the dictionary used to generate the underlying data set (you will see its file name below) with the dictionary used to produce the estimates based on that data. For a description of each variable name, please see the Variable Description table above.
merged_dcts[4][0]
Here I display the coefficient vectors $\alpha_{1jd}$ used to generate the data set (rows index cross section and equation), corresponding to the position of its file name in the input file list for this trial above.
Note:
1.) A zero coefficient in the following matrix means that the instrument it multiplies is not relevant to that cross section.
2.) For the sake of presentation, I only show the true parameter matrix for the case where ncs = 25.
# psd.indict_dsp(true_acoeffs[4],2)
display_html(true_acoeffs[4][1][0])
Here I show the sampling distribution of the elements of $\hat{\alpha}_{dj}$.
for i in range(1,len(atables[4])):
    display(atables[4][i][0])
    display(line_nms[4][i])
Here I display the coefficient vector $\beta_1$ used to generate the data set, corresponding to the position of its file name in the input file list for this trial above. It should be identical across data sets.
true_bcoeffs[4][0]
Here I show the sampling distribution of the elements of $\hat{\beta}_1$.
psd.tbl_dsp(btables[4],1,5,line_nms[4],1)
psd.coeffden(bcoeff[4],line_nms[4],[-0.4,0.4],9,2,1,0,0)
psd.coeffden(bcoeff[4],line_nms[4],[-0.4,0.4],9,2,0,0,0)
Here we examine the sampling distribution of $\hat{\beta}_1$ and $\hat{\alpha}_{1}$ as the number of time periods increases.
Here I have merged the dictionary used to generate the underlying data set (you will see its file name below) with the dictionary used to produce the estimates based on that data. For a description of each variable name, please see the Variable Description table above.
merged_dcts[5][0]
Here I display the coefficient vectors $\alpha_{1jd}$ used to generate the data set (rows index cross section and equation), corresponding to the position of its file name in the input file list for this trial above.
Note:
1.) A zero coefficient in the following matrix means that the instrument it multiplies is not relevant to that cross section.
display_html(true_acoeffs[5][0][0])
Here I show characteristics of the sampling distributions of the elements of $\hat{\alpha}_{dj}$, only for the new cases where ntp = 100 and ntp = 150.
# display(psd.cfs_dsp(acoeff[5],atables[5],2,7,line_nms[5]))
for i in range(1,len(atables[5])):
    display(atables[5][i][0])
    display(line_nms[5][i])
Here I display the coefficient vector $\beta_1$ used to generate the data set, corresponding to the position of its file name in the input file list for this trial above.
true_bcoeffs[5][0]
Here I show the sampling distribution of the elements of $\hat{\beta}_1$.
psd.tbl_dsp(btables[5],1,5,line_nms[5],1)
psd.coeffden(bcoeff[5],line_nms[5],[-0.4,0.4],9,2,1,0,0)
psd.coeffden(bcoeff[5],line_nms[5],[-0.4,0.4],9,2,0,0,0)