Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Changed output of an old analysis after version update #7

Open
pdeninis opened this issue Apr 7, 2020 · 7 comments
Open

Changed output of an old analysis after version update #7

pdeninis opened this issue Apr 7, 2020 · 7 comments

Comments

@pdeninis
Copy link

pdeninis commented Apr 7, 2020

Hi, first of everything thank you for your wonderful package!

Time ago I ran with an older version of R an analysis with this data :
datapasq12 <-
structure(list(TREAT = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
2, 2, 2, 2, 2, 2, 2), CAL_PRE = c(5, 6, 5.5, 5.5, 6, 5, 5, 5,
5, 6, 5, 6.5, 5, 4, 4, 4, 5, 5, 6, 5, 6, 6, 5, 6, 4, 5, 5, 5,
6, 7, 6, 6, 4, 5, 5, 6, 5, 6, 4, 5), CAL_CH = c(4, 5, 3.5, 3.5,
4, 3, 3, 3, 4, 5, 3, 4.5, 3, 2, 2, 2, 3, 3, 4, 3, 4, 4, 3, 4,
3, 4, 3, 4, 4, 5, 3, 3, 3, 3, 4, 4, 3, 4, 3, 4)), class = "data.frame", row.names = c(NA,
-40L))

and saved these results along with a graph:

inter.binning(Y = "CAL_CH", D = "TREAT", X = "CAL_PRE", data = datapasq12, weights = NULL, Ylabel = "CAL Gain", Dlabel = "Treatment", Xlabel="CAL pre-test", main="Kernel estimator of Treatment effect")
$est.binning
x0 coef se CI_lower CI_upper
CAL pre-test: [4,5] 5.00 0.30 0.2104 -0.1291 0.7291
CAL pre-test: (5,6] 6.00 -0.75 0.2716 -1.3040 -0.1960
CAL pre-test: (6,7] 6.75 3.50 0.2480 2.9943 4.0057

$binary.treatment
[1] 1

$bin.size
1 2 3
0.60 0.35 0.05

$treat.variation.byGroup
[1] 0.1218 0.1068 0.2222

$X.Lkurtosis
[1] 0.1021

$correctOrder
[1] FALSE

$p.twosided
p.1v2 p.2v3 p.1v3
0.0046 0.0000 0.0000
p values of pairwise t-tests of the binning estimates when there are 2 or 3 bins.

$p.wald
[1] 0.3853
p-value of a Wald test. The NULL hypothesis is that the linear interaction model
and the binning model are statistically equivalent.

$graph

Today I repeated it with R 3.6.1 and I got:

inter.binning(Y = "CAL_CH", D = "TREAT", X = "CAL_PRE", data = datapasq12, weights = NULL, Ylabel = "CAL Gain", Dlabel = "Treatment", Xlabel="CAL pre-test", main="Kernel estimator of Treatment effect")
Error in density.default(data[data[, D] == 0, X]) :
need at least 2 points to select a bandwidth automatically

Please, may you tell me what's the issue and if there is a solution?

Thanks for your attention.
PDN

@pdeninis
Copy link
Author

pdeninis commented Apr 7, 2020

Hi,
I tried recoding TREAT from 1,2 to 0,1.
Now it works, but the ouput is different.
`
datapasq12$TREAT<-TREAT-1
datapasq12 <-
structure(list(TREAT = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1), CAL_PRE = c(5, 6, 5.5, 5.5, 6, 5, 5, 5,
5, 6, 5, 6.5, 5, 4, 4, 4, 5, 5, 6, 5, 6, 6, 5, 6, 4, 5, 5, 5,
6, 7, 6, 6, 4, 5, 5, 6, 5, 6, 4, 5), CAL_CH = c(4, 5, 3.5, 3.5,
4, 3, 3, 3, 4, 5, 3, 4.5, 3, 2, 2, 2, 3, 3, 4, 3, 4, 4, 3, 4,
3, 4, 3, 4, 4, 5, 3, 3, 3, 3, 4, 4, 3, 4, 3, 4)), row.names = c(NA,
-40L), class = "data.frame")

inter.binning(Y = "CAL_CH", D = "TREAT", X = "CAL_PRE", data = datapasq12, weights = NULL, Ylabel = "CAL Gain", Dlabel = "Treatment", Xlabel="CAL pre-test", main="Kernel estimator of Treatment effect")
$type
[1] "binning"

$nbins
[1] 3

$est.lin
X.lvls marg lb ub
1 4.000 0.943263880 0.37488208 1.511645685
2 4.000 0.943263880 0.37488208 1.511645685
3 4.000 0.943263880 0.37488208 1.511645685
4 4.000 0.943263880 0.37488208 1.511645685
5 4.000 0.943263880 0.37488208 1.511645685
6 4.000 0.943263880 0.37488208 1.511645685
7 4.000 0.943263880 0.37488208 1.511645685
8 4.000 0.943263880 0.37488208 1.511645685
9 4.000 0.943263880 0.37488208 1.511645685
10 4.000 0.943263880 0.37488208 1.511645685
11 4.000 0.943263880 0.37488208 1.511645685
12 4.000 0.943263880 0.37488208 1.511645685
13 4.000 0.943263880 0.37488208 1.511645685
14 4.070 0.895655517 0.35080495 1.440506081
15 4.460 0.630408925 0.20783199 1.052985862
16 4.850 0.365162333 0.03778373 0.692540932
17 5.000 0.263144413 -0.04050754 0.566796368
18 5.000 0.263144413 -0.04050754 0.566796368
19 5.000 0.263144413 -0.04050754 0.566796368
20 5.000 0.263144413 -0.04050754 0.566796368
21 5.000 0.263144413 -0.04050754 0.566796368
22 5.000 0.263144413 -0.04050754 0.566796368
23 5.000 0.263144413 -0.04050754 0.566796368
24 5.000 0.263144413 -0.04050754 0.566796368
25 5.000 0.263144413 -0.04050754 0.566796368
26 5.000 0.263144413 -0.04050754 0.566796368
27 5.000 0.263144413 -0.04050754 0.566796368
28 5.000 0.263144413 -0.04050754 0.566796368
29 5.000 0.263144413 -0.04050754 0.566796368
30 5.000 0.263144413 -0.04050754 0.566796368
31 5.000 0.263144413 -0.04050754 0.566796368
32 5.000 0.263144413 -0.04050754 0.566796368
33 5.000 0.263144413 -0.04050754 0.566796368
34 5.000 0.263144413 -0.04050754 0.566796368
35 5.000 0.263144413 -0.04050754 0.566796368
36 5.000 0.263144413 -0.04050754 0.566796368
37 5.000 0.263144413 -0.04050754 0.566796368
38 5.000 0.263144413 -0.04050754 0.566796368
39 5.000 0.263144413 -0.04050754 0.566796368
40 5.000 0.263144413 -0.04050754 0.566796368
41 5.000 0.263144413 -0.04050754 0.566796368
42 5.000 0.263144413 -0.04050754 0.566796368
43 5.000 0.263144413 -0.04050754 0.566796368
44 5.000 0.263144413 -0.04050754 0.566796368
45 5.000 0.263144413 -0.04050754 0.566796368
46 5.000 0.263144413 -0.04050754 0.566796368
47 5.000 0.263144413 -0.04050754 0.566796368
48 5.000 0.263144413 -0.04050754 0.566796368
49 5.000 0.263144413 -0.04050754 0.566796368
50 5.000 0.263144413 -0.04050754 0.566796368
51 5.000 0.263144413 -0.04050754 0.566796368
52 5.000 0.263144413 -0.04050754 0.566796368
53 5.000 0.263144413 -0.04050754 0.566796368
54 5.000 0.263144413 -0.04050754 0.566796368
55 5.000 0.263144413 -0.04050754 0.566796368
56 5.000 0.263144413 -0.04050754 0.566796368
57 5.000 0.263144413 -0.04050754 0.566796368
58 5.000 0.263144413 -0.04050754 0.566796368
59 5.000 0.263144413 -0.04050754 0.566796368
60 5.005 0.259743815 -0.04328244 0.562770071
61 5.200 0.127120519 -0.16103383 0.415274869
62 5.395 -0.005502777 -0.29870072 0.287695166
63 5.500 -0.076915321 -0.38088899 0.227058346
64 5.500 -0.076915321 -0.38088899 0.227058346
65 5.500 -0.076915321 -0.38088899 0.227058346
66 5.675 -0.195936228 -0.52862061 0.136748156
67 5.870 -0.328559524 -0.70561347 0.048494422
68 6.000 -0.416975055 -0.82889363 -0.005056479
69 6.000 -0.416975055 -0.82889363 -0.005056479
70 6.000 -0.416975055 -0.82889363 -0.005056479
71 6.000 -0.416975055 -0.82889363 -0.005056479
72 6.000 -0.416975055 -0.82889363 -0.005056479
73 6.000 -0.416975055 -0.82889363 -0.005056479
74 6.000 -0.416975055 -0.82889363 -0.005056479
75 6.000 -0.416975055 -0.82889363 -0.005056479
76 6.000 -0.416975055 -0.82889363 -0.005056479
77 6.000 -0.416975055 -0.82889363 -0.005056479
78 6.000 -0.416975055 -0.82889363 -0.005056479
79 6.000 -0.416975055 -0.82889363 -0.005056479
80 6.000 -0.416975055 -0.82889363 -0.005056479
81 6.000 -0.416975055 -0.82889363 -0.005056479
82 6.000 -0.416975055 -0.82889363 -0.005056479
83 6.000 -0.416975055 -0.82889363 -0.005056479
84 6.000 -0.416975055 -0.82889363 -0.005056479
85 6.000 -0.416975055 -0.82889363 -0.005056479
86 6.000 -0.416975055 -0.82889363 -0.005056479
87 6.000 -0.416975055 -0.82889363 -0.005056479
88 6.000 -0.416975055 -0.82889363 -0.005056479
89 6.000 -0.416975055 -0.82889363 -0.005056479
90 6.000 -0.416975055 -0.82889363 -0.005056479
91 6.000 -0.416975055 -0.82889363 -0.005056479
92 6.000 -0.416975055 -0.82889363 -0.005056479
93 6.000 -0.416975055 -0.82889363 -0.005056479
94 6.000 -0.416975055 -0.82889363 -0.005056479
95 6.000 -0.416975055 -0.82889363 -0.005056479
96 6.025 -0.433978041 -0.85298180 -0.014974286
97 6.220 -0.566601337 -1.04419757 -0.089005107
98 6.415 -0.699224634 -1.23992442 -0.158524847
99 6.610 -0.831847930 -1.43875686 -0.224939001
100 6.805 -0.964471226 -1.63978206 -0.289160392
101 7.000 -1.097094522 -1.84239653 -0.351792513

$est.bin
x0 coef se CI_lower CI_upper
CAL pre-test: [4,5] 5.00 0.30 0.2104143 -0.1291429 0.7291429
CAL pre-test: (5,6] 6.00 -0.75 0.2716437 -1.3040211 -0.1959789
CAL pre-test: (6,7] 6.75 0.50 0.6273344 -0.7794568 1.7794568

$binaryD
[1] TRUE

$Xlabel
[1] "CAL pre-test"

$Dlabel
[1] "Treatment"

$Ylabel
[1] "CAL Gain"

$de
NULL

$de.co

Call:
density.default(x = data[data[, D] == 0, X])

Data: data[data[, D] == 0, X] (20 obs.); Bandwidth 'bw' = 0.2306

   x               y            

Min. :3.308 Min. :0.0009721
1st Qu.:4.279 1st Qu.:0.0871115
Median :5.250 Median :0.2180965
Mean :5.250 Mean :0.2571851
3rd Qu.:6.221 3rd Qu.:0.3420941
Max. :7.192 Max. :0.8822507

$de.tr

Call:
density.default(x = data[data[, D] == 1, X])

Data: data[data[, D] == 1, X] (20 obs.); Bandwidth 'bw' = 0.3689

   x               y            

Min. :2.893 Min. :0.0006081
1st Qu.:4.197 1st Qu.:0.0400754
Median :5.500 Median :0.1669296
Mean :5.500 Mean :0.1915717
3rd Qu.:6.803 3rd Qu.:0.3562400
Max. :8.107 Max. :0.4481561

$hist.out
$breaks
[1] 4.00 4.05 4.10 4.15 4.20 4.25 4.30 4.35 4.40 4.45 4.50 4.55 4.60 4.65 4.70 4.75 4.80 4.85 4.90 4.95 5.00 5.05 5.10 5.15 5.20 5.25 5.30 5.35 5.40 5.45 5.50 5.55 5.60 5.65 5.70 5.75 5.80 5.85
[39] 5.90 5.95 6.00 6.05 6.10 6.15 6.20 6.25 6.30 6.35 6.40 6.45 6.50 6.55 6.60 6.65 6.70 6.75 6.80 6.85 6.90 6.95 7.00

$counts
[1] 6 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 18 0 0 0 0 0 0 0 0 0 2 0 0 0 0 0 0 0 0 0 12 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 1

$density
[1] 3.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 9.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 6.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
[48] 0.0 0.0 0.5 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.5

$mids
[1] 4.025 4.075 4.125 4.175 4.225 4.275 4.325 4.375 4.425 4.475 4.525 4.575 4.625 4.675 4.725 4.775 4.825 4.875 4.925 4.975 5.025 5.075 5.125 5.175 5.225 5.275 5.325 5.375 5.425 5.475 5.525
[32] 5.575 5.625 5.675 5.725 5.775 5.825 5.875 5.925 5.975 6.025 6.075 6.125 6.175 6.225 6.275 6.325 6.375 6.425 6.475 6.525 6.575 6.625 6.675 6.725 6.775 6.825 6.875 6.925 6.975

$xname
[1] "data[, X]"

$equidist
[1] TRUE

attr(,"class")
[1] "histogram"

$count.tr
[1] 3 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 8 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 8 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1

$tests
$tests$est.binning
x0 coef se CI_lower CI_upper
CAL pre-test: [4,5] 5.00 0.30 0.210 -0.129 0.729
CAL pre-test: (5,6] 6.00 -0.75 0.272 -1.304 -0.196
CAL pre-test: (6,7] 6.75 0.50 0.627 -0.779 1.779

$tests$bin.size
[1] "0.600" "0.350" "0.050"

$tests$treat.variation.byGroup
[1] "1.233" "0.808" "2.000"

$tests$X.Lkurtosis
[1] "0.102"

$tests$correctOrder
[1] FALSE

$tests$p.twosided
[1] "0.005" "0.077" "0.764"

$tests$p.wald
[1] "0.385"

$graph

attr(,"class")
[1] "interflex"
`
Is this a new coding of the function? The results are different, too. And the graph.
The main difference, at least at a first glance, stands in the estimate and CI's width of one of the bins, the one most far away from the line prediction. It was about 3, 2.5 to 3.5, while now is 0.5, -0.75 to 1.25 approximately. The new results would seem more correct.

  1. Is the new release less tolerant about the indicator variable coding? What are its requirement?
  2. Is the new result the correct one?
    Thank you.

PDN

@xuyiqing
Copy link
Owner

xuyiqing commented Apr 7, 2020 via email

@lzy318
Copy link
Collaborator

lzy318 commented Apr 8, 2020

Hi, PDN
I think you are still using the old version, now the "interflex" package on Github is updated to 1.1.1. Could you try the latest version? By the way, I use your data to plot two graphs of the binning estimator and kernel estimator. Here are the plots:
plot_binning
plot_kernel

And the codes are:
inter.binning(Y = "CAL_CH", D = "TREAT", X = "CAL_PRE", data = datapasq, nbins = 3,
weights = NULL, Ylabel = "CAL Gain", Dlabel = "Treatment", vartype = "homoscedastic",
Xlabel="CAL pre-test", main="Binning estimator of Treatment effect")

inter.kernel(Y = "CAL_CH", D = "TREAT", X = "CAL_PRE", data = datapasq,
weights = NULL, Ylabel = "CAL Gain", Dlabel = "Treatment",
Xlabel="CAL pre-test", main="Kernel estimator of Treatment effect")

@pdeninis
Copy link
Author

pdeninis commented Apr 8, 2020

Hi Yiqing Xu and Ziyi Liu
Thank you for quick answering!

I can not find any reference to the parameter bw.adaptive in the interflex documentation.
Please, may you send me a link?

I have just read the answer by Ziyi Liu.
He is right, the interflex version running in my R is 1.0.8. I will update quickly.

I have seen his graphs on my data, the greater difference being on the third bin which is not present in his.

image

However, it is really much different from the old analysis:

image

I did not intervene to reduce the bins to 2, which seems appropriate to judge by the result, as the last bin could be rather underrepresented.
Is it an automatic option or has to be forced in some way?

I will follow your news because I'm very interested in the possibility to apply interflex for diagnostics and alternative estimates of logistic regression !

Thanks again!

@pdeninis
Copy link
Author

pdeninis commented Apr 8, 2020

Oops,

Sorry, I had not yet seen the graphs above, I was referring to that recieved within the email:

image

@lzy318
Copy link
Collaborator

lzy318 commented Apr 8, 2020

Hi, PDN
Thank you for informing me of this, I think the results from the latest version is right, as you said, the histogram in the plot implies the estimates at the third bin is underrepresented. It's expected to see a wide confidence interval at this point. (If you use robust standard error, you will find the CI at this point can't be estimated in the latest version because of your small sample size). I guess the difference you mentioned may be caused by some coding problems in the old version.
If you want to use only 2 bins to draw the plot, you can set the option "nbins" to 2, such as in the following code, and I think that is a proper choice.
inter.binning(Y = "CAL_CH", D = "TREAT", X = "CAL_PRE", data = datapasq, nbins = 2,
weights = NULL, Ylabel = "CAL Gain", Dlabel = "Treatment",
Xlabel="CAL pre-test", main="Binning estimator of Treatment effect")
I am working on the logit/probit version now and may push it in 1~2 months.

@pdeninis
Copy link
Author

pdeninis commented Apr 9, 2020

Being the first attempt to use it, I thought it right that, as a diagnostic, it was manipulated as little as possible.

However, wanting use it routinely, I realize instead that this function should not be thought of as a black box: the adjustment parameters are there precisely to be adapted to the type of data being analyzed. A thorough experimentation of their possibilities is in order!

Thanks for showing it so effectively!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants