Title: | Assessing the Partial Association Between Ordinal Variables |
---|---|
Description: | An implementation of the unified framework for assessing partial association between ordinal variables after adjusting for a set of covariates (Dungang Liu, Shaobo Li, Yan Yu and Irini Moustaki (2020) <doi:10.1080/01621459.2020.1796394> Journal of the American Statistical Association). This package provides a set of tools to quantify, visualize, and test partial associations between multiple ordinal variables. It can produce a number of $phi$ measures, partial regression plots, 3-D plots, and p-values for testing H_0: phi=0 or H_0: phi <= delta. |
Authors: | Xiaorui (Jeremy) Zhu [aut, cre, cph], Shaobo Li [aut], Dungang Liu [ctb, aut], Yuejie Chen [ctb] |
Maintainer: | Xiaorui (Jeremy) Zhu <[email protected]> |
License: | GPL (>=2) |
Version: | 0.1.11 |
Built: | 2025-01-03 04:44:48 UTC |
Source: | https://github.com/xiaoruizhu/passo |
A subset of 2,188 participants of the 2016 American National Election Time Series Study, which was to track the enduring social trend and record the political moment of 2016 (DeBell, 2018). This study consisted of two surveys with same population. The pre-election interview was during the weeks before the 2016 general election, including 4,271 respondents in total. The post-election interview is the re-interview during the weeks after the election, including 3,649 respondents (662 respondents did not complete post-interviews).
data(ANES2016)
data(ANES2016)
A data frame with 2188 rows and 10 variables.
age
Respondent's age in years.
edu.year
Respondent's education year, which is mapped from education
level: MS
=8, HSdrop
=11, HS
=12, Coll
=14,
CCdeg
=15, BAdeg
=17, MAdeg
=19.
education
Respondent's education level.
income.num
Respondent's family income in thousands: an numerical variable.
It is median value of the range of each income
level
income
Respondent's family income level:
'(01) 01. Under $5,000' = 5,
'(02) 02. $5,000-$9,999',
'(03) 03. $10,000-$12,499',
'(04) 04. $12,500-$14,999',
'(05) 05. $15,000-$17,499',
'(06) 06. $17,500-$19,999',
'(07) 07. $20,000-$22,499',
'(08) 08. $22,500-$24,999',
'(09) 09. $25,000-$27,499',
'(10) 10. $27,500-$29,999',
'(11) 11. $30,000-$34,999',
'(12) 12. $35,000-$39,999',
'(13) 13. $40,000-$44,999',
'(14) 14. $45,000-$49,999',
'(15) 15. $50,000-$54,999',
'(16) 16. $55,000-$59,999',
'(17) 17. $60,000-$64,999',
'(18) 18. $65,000-$69,999',
'(19) 19. $70,000-$74,999',
'(20) 20. $75,000-$79,999',
'(21) 21. $80,000-$89,999',
'(22) 22. $90,000-$99,999',
'(23) 23. $100,000-$109,999',
'(24) 24. $110,000-$124,999',
'(25) 25. $125,000-$149,999',
'(26) 26. $150,000-$174,999',
'(27) 27. $175,000-$249,999',
'(28) 28. $250,000 or more' = 250.
PID
Party identification: a numeric variable with value from 1 to 7
representing strong Democrat, strDem
< weak Democrat, weakDem
<
independent Democrat, indDem
< independent independent indind
<
independent Republican, indRep
< weak Republican, weakRep
<
strong Republican, strRep
.
selfLR
The respondent' self-placement about own left-right in 7 ordinal levels
(from extremely liberal to extremely conservative). extLib
: extremely liberal, Lib
: liberal,
sliLib
: slightly liberal, Mod
: moderate, sliCon
:
slightly conservative, Con
: conservative, extCon
: extremely
conservative. extLib
< Lib
< sliLib
<
Mod
< sliCon
< Con
< extCon
.
TrumpLR
The respondent's opinion about Donald Trump's left-right
placement (same scale as selfLR).
ClinLR
The respondent's opinion about Hilary Clinton's left-right
placement (same scale as selfLR).
PreVote
The respondent's voting preference between
Donald Trump and Hilary Clinton two months preceeding the
November election (Pre-election interview). It is a factor with levels
HillaryClinton
and DonaldTrump
.
PreVote.num
Recode the PreVote to numeric values,
'HillaryClinton'=0, 'DonaldTrump'=1.
WeightforPreVote
Pre-election weight of a respondent.
The Pre-election preference is recorded as "PreVote" and the "PreVote.num" is the numeric of it. Observations with missing values, or "No thought" responses have been removed. Respondents expressing a voting preference other than Clinton or Trump have been removed.
DeBell, Matthew, Jon A. Krosnick, Katie Gera, David S. Yeager, and Michael P. McDonald. The turnout gap in surveys: Explanations and solutions. Sociological Methods & Research, 2018. doi:10.1177/0049124118769085
Enamorado, T., Fifield, B., & Imai, K. (2018). User’s guide and codebook for the ANES 2016 time series voter validation supplemental data. Technical Report, American National Election Studies.
head(ANES2016)
head(ANES2016)
A set of visualization tools for the diagnostic of the fitted model in
the partial association analysis. It can provides a plot matrix including Q-Q plots,
residual-vs-fitted plots, residual-vs-covariate plots of all the fitted models.
This function also support the direct diagnostic of the cumulative link regression model
in the class of clm
, glm
, lrm
,
orm
, polr
. Currently, vglm
is not supported.
diagnostic.plot(object, ...) ## Default S3 method: diagnostic.plot(object, ...) ## S3 method for class 'resid' diagnostic.plot(object, output = c("qq", "fitted", "covariate"), ...) ## S3 method for class 'PAsso' diagnostic.plot( object, output = c("qq", "fitted", "covariate"), model_id = NULL, x_name = NULL, ... ) ## S3 method for class 'glm' diagnostic.plot( object, output = c("qq", "fitted", "covariate"), x = NULL, fit = NULL, distribution = qnorm, ncol = NULL, alpha = 1, xlab = NULL, color = "#444444", shape = 19, size = 2, qqpoint.color = "#444444", qqpoint.shape = 19, qqpoint.size = 2, qqline.color = "#888888", qqline.linetype = "dashed", qqline.size = 1, smooth = TRUE, smooth.color = "red", smooth.linetype = 1, smooth.size = 1, fill = NULL, resp_name = NULL, ... ) ## S3 method for class 'clm' diagnostic.plot(object, output = c("qq", "fitted", "covariate"), ...) ## S3 method for class 'lrm' diagnostic.plot(object, output = c("qq", "fitted", "covariate"), ...) ## S3 method for class 'orm' diagnostic.plot(object, output = c("qq", "fitted", "covariate"), ...) ## S3 method for class 'polr' diagnostic.plot(object, output = c("qq", "fitted", "covariate"), ...)
diagnostic.plot(object, ...) ## Default S3 method: diagnostic.plot(object, ...) ## S3 method for class 'resid' diagnostic.plot(object, output = c("qq", "fitted", "covariate"), ...) ## S3 method for class 'PAsso' diagnostic.plot( object, output = c("qq", "fitted", "covariate"), model_id = NULL, x_name = NULL, ... ) ## S3 method for class 'glm' diagnostic.plot( object, output = c("qq", "fitted", "covariate"), x = NULL, fit = NULL, distribution = qnorm, ncol = NULL, alpha = 1, xlab = NULL, color = "#444444", shape = 19, size = 2, qqpoint.color = "#444444", qqpoint.shape = 19, qqpoint.size = 2, qqline.color = "#888888", qqline.linetype = "dashed", qqline.size = 1, smooth = TRUE, smooth.color = "red", smooth.linetype = 1, smooth.size = 1, fill = NULL, resp_name = NULL, ... ) ## S3 method for class 'clm' diagnostic.plot(object, output = c("qq", "fitted", "covariate"), ...) ## S3 method for class 'lrm' diagnostic.plot(object, output = c("qq", "fitted", "covariate"), ...) ## S3 method for class 'orm' diagnostic.plot(object, output = c("qq", "fitted", "covariate"), ...) ## S3 method for class 'polr' diagnostic.plot(object, output = c("qq", "fitted", "covariate"), ...)
object |
The object in the support classes (This function is mainly designed
for |
... |
Additional optional arguments can be passed onto |
output |
A character string specifying what type of output to plot. Default is
|
model_id |
A number refers to the index of the model that needs to be diagnosed. If NULL, all models will be diagnosed. |
x_name |
A string refers to the covariate name that needs to be diagnosed. If NULL, all adjustments will be diagnosed. |
x |
A vector giving the covariate values to use for residual-by-
covariate plots (i.e., when |
fit |
The fitted model from which the residuals were extracted. (Only
required if |
distribution |
Function that computes the quantiles for the reference
distribution to use in the quantile-quantile plot. Default is |
ncol |
Integer specifying the number of columns to use for the plot
layout (if requesting multiple plots). Default is |
alpha |
A single values in the interval [0, 1] controlling the opacity
alpha of the plotted points. Only used when |
xlab |
Character string giving the text to use for the x-axis label in
residual-by-covariate plots. Default is |
color |
Character string or integer specifying what color to use for the
points in the residual vs fitted value/covariate plot.
Default is |
shape |
Integer or single character specifying a symbol to be used for plotting the points in the residual vs fitted value/covariate plot. |
size |
Numeric value specifying the size to use for the points in the residual vs fitted value/covariate plot. |
qqpoint.color |
Character string or integer specifying what color to use for the points in the quantile-quantile plot. |
qqpoint.shape |
Integer or single character specifying a symbol to be used for plotting the points in the quantile-quantile plot. |
qqpoint.size |
Numeric value specifying the size to use for the points in the quantile-quantile plot. |
qqline.color |
Character string or integer specifying what color to use for the points in the quantile-quantile plot. |
qqline.linetype |
Integer or character string (e.g., |
qqline.size |
Numeric value specifying the thickness of the line in the quantile-quantile plot. |
smooth |
Logical indicating whether or not too add a nonparametric
smooth to certain plots. Default is |
smooth.color |
Character string or integer specifying what color to use for the nonparametric smooth. |
smooth.linetype |
Integer or character string (e.g., |
smooth.size |
Numeric value specifying the thickness of the line for the nonparametric smooth. |
fill |
Character string or integer specifying the color to use to fill
the boxplots for residual-by-covariate plots when |
resp_name |
Character string to specify the response name that will be displayed in the figure. |
A "ggplot"
object for supported models. For class "PAsso", it returns a plot in
"gtable"
object that combines diagnostic plots of all responses.
A "ggplot" object based on the input residuals.
A "ggplot" object based on the input residuals.
A plot in "gtable" object that combines diagnostic plots of all responses.
A "ggplot" object based on the residuals generated from glm object.
A "ggplot" object based on the residuals generated from clm object.
A "ggplot" object based on the residuals generated from lrm object.
A "ggplot" object based on the residuals generated from orm object.
A "ggplot" object based on the residuals generated from polr object.
# Import data for partial association analysis data("ANES2016") ANES2016$PreVote.num <- as.factor(ANES2016$PreVote.num) PAsso_3v <- PAsso(responses = c("PreVote.num", "PID", "selfLR"), adjustments = c("income.num", "age", "edu.year"), data = ANES2016, uni.model = "probit", method = c("kendall"), resids.type = "surrogate", jitter = "latent") diag_p1 <- diagnostic.plot(object = PAsso_3v, output = "qq") diag_p2 <- diagnostic.plot(object = PAsso_3v, output = "fitted") diag_p3 <- diagnostic.plot(object = PAsso_3v, output = "covariate") # Simply diagnose a model # Fit cumulative link models fit1 <- ordinal::clm(PreVote.num ~ income.num + age + edu.year, data = ANES2016, link = "logit") # diagnostic.plot plot_qq_1 <- diagnostic.plot(object = fit1, output = "qq") plot_fit_1 <- diagnostic.plot(object = fit1, output = "fitted") plot_cov_1 <- diagnostic.plot(object = fit1, output = "covariate")
# Import data for partial association analysis data("ANES2016") ANES2016$PreVote.num <- as.factor(ANES2016$PreVote.num) PAsso_3v <- PAsso(responses = c("PreVote.num", "PID", "selfLR"), adjustments = c("income.num", "age", "edu.year"), data = ANES2016, uni.model = "probit", method = c("kendall"), resids.type = "surrogate", jitter = "latent") diag_p1 <- diagnostic.plot(object = PAsso_3v, output = "qq") diag_p2 <- diagnostic.plot(object = PAsso_3v, output = "fitted") diag_p3 <- diagnostic.plot(object = PAsso_3v, output = "covariate") # Simply diagnose a model # Fit cumulative link models fit1 <- ordinal::clm(PreVote.num ~ income.num + age + edu.year, data = ANES2016, link = "logit") # diagnostic.plot plot_qq_1 <- diagnostic.plot(object = fit1, output = "qq") plot_fit_1 <- diagnostic.plot(object = fit1, output = "fitted") plot_cov_1 <- diagnostic.plot(object = fit1, output = "covariate")
This function is mainly designed for conducting the partial association analysis. It provides two ways of using:
1. A user-friendly way: only need "responses", "adjustments", and "data". All the rest of the argument will be setted as default (see Arguments for details of default).
2. An advanced way: user can input a list of fitted models by "fitted.models", then the
"responses" and "adjustments" are not necessary. Supported class of cumulative link models in
clm
, glm
, lrm
,
orm
, polr
, vglm
, .
It generates an object that has partial association matrix, marginal association, and some attributes: "arguments" saves c(association, method, resids.type). "responses" contains the names of response variables. The attribute "adjustments" contains the names of covariates. The "summary" function of "PAsso" class of object provides marginal association ' matrix for comparison purpose.
PAsso( responses, adjustments, data, uni.model = c("probit", "logit", "acat"), models = NULL, method = c("kendall", "pearson", "wolfsigma"), resids.type = c("surrogate", "sign", "general", "deviance"), jitter = c("latent", "uniform"), jitter.uniform.scale = c("probability", "response"), fitted.models = NULL, n_draws = 20, association = "partial", ... )
PAsso( responses, adjustments, data, uni.model = c("probit", "logit", "acat"), models = NULL, method = c("kendall", "pearson", "wolfsigma"), resids.type = c("surrogate", "sign", "general", "deviance"), jitter = c("latent", "uniform"), jitter.uniform.scale = c("probability", "response"), fitted.models = NULL, n_draws = 20, association = "partial", ... )
responses |
A string vector that specifies response variables. It requires to be equal or greater than two variables in the data frame. |
adjustments |
A string vector specifies covariates/confounders that need to be adjusted. |
data |
A data.frame including responses and adjustments. |
uni.model |
A character string specifying one single universal model setting for all
responses. Default |
models |
A string vector contains default link functions of fitting models with
respect to each response variable. If |
method |
A string argument to specify correlation coefficient method.
Three choices |
resids.type |
A character string specifying which type of residuals to generate
Current options include
Although |
jitter |
A character string specifying how to generate surrogate residuals.
Current options are
|
jitter.uniform.scale |
A character string specifying the scale on which to perform
the jittering whenever |
fitted.models |
A list contains all the models (S3 objects) you want to
assess for the partial association between ordinal responses after adjusting
for a set of covariates covariates. All of these models should be applied to the
same dataset, having same covariates, same sample size etc. The models in this
list can be an object of class |
n_draws |
A number to specify draws of surrogate residuls
such that the partial correlation coefficients are calculated repeatedly. The final
correlation coefficients are the average of all partial correlation coefficients.
It is the |
association |
An default argument to specify the partial association. Leave this further development of package such that other association analyses can be embedded. |
... |
Additional optional arguments. |
An object of class "PAsso"
is a list containing at least the following
components. It contains the partial correlation matrix and multiple repeats if
n_draws
> 1. This object has "arguments"
attribute saved as c(association, method, resids.type), "responses" attribute, and
"adjustments" attribute.
The list contains:
corr
The estimated correlation matrix(average of rep_MatCorr
)
of partial association after adjusting confounders;
rep_corr
The replications of estimated correlation matrix;
rep_SRs
The replications of surrogate residuals if partial association is applied;
fitted.models
The list stores all fitted.models;
data
The data.frame of original dataset;
mods_n
The sample size of each fitted model;
cor_func
The correlation function after assign different method;
Marg_corr
The marginal association matrix.
Liu, D., Li, S., Yu, Y., & Moustaki, I. (2020). Assessing partial association between ordinal variables: quantification, visualization, and hypothesis testing. Journal of the American Statistical Association, 1-14. doi:10.1080/01621459.2020.1796394
Liu, D., & Zhang, H. (2018). Residuals and diagnostics for ordinal regression models: A surrogate approach. Journal of the American Statistical Association, 113(522), 845-854. doi:10.1080/01621459.2017.1292915
Li, C., & Shepherd, B. E. (2010). Test of association between two ordinal variables while adjusting for covariates. Journal of the American Statistical Association, 105(490), 612-620. doi:10.1198/jasa.2010.tm09386
Li, C., & Shepherd, B. E. (2012). A new residual for ordinal outcomes. Biometrika, 99(2), 473-480. doi:10.1093/biomet/asr073
Franses, P. H., & Paap, R. (2001). Quantitative models in marketing research. Cambridge University Press. doi:10.1017/CBO9780511753794
########################################################### # User-friendly way of using ########################################################### library(MASS) # Import ANES2016 data in "PAsso" data(ANES2016) # User-friendly way of the partial association analysis PAsso_1 <- PAsso(responses = c("PreVote.num", "PID"), adjustments = c("income.num", "age", "edu.year"), data = ANES2016, method = c("kendall")) print(PAsso_1, digits = 4) summary(PAsso_1, digits = 4) ########################################################### # Advanced way of the partial association analysis ########################################################### fit.vote<- glm(PreVote.num ~ income.num+ age + edu.year, data = ANES2016, family = binomial(link = "probit")) fit.PID<- polr(as.factor(PID) ~ income.num+age+edu.year, data = ANES2016, method="probit", Hess = TRUE) PAsso_adv1 <- PAsso(fitted.models=list(fit.vote, fit.PID), method = c("kendall"), resids.type = "surrogate") print(PAsso_adv1, digits = 4) summary(PAsso_adv1, digits = 4)
########################################################### # User-friendly way of using ########################################################### library(MASS) # Import ANES2016 data in "PAsso" data(ANES2016) # User-friendly way of the partial association analysis PAsso_1 <- PAsso(responses = c("PreVote.num", "PID"), adjustments = c("income.num", "age", "edu.year"), data = ANES2016, method = c("kendall")) print(PAsso_1, digits = 4) summary(PAsso_1, digits = 4) ########################################################### # Advanced way of the partial association analysis ########################################################### fit.vote<- glm(PreVote.num ~ income.num+ age + edu.year, data = ANES2016, family = binomial(link = "probit")) fit.PID<- polr(as.factor(PID) ~ income.num+age+edu.year, data = ANES2016, method="probit", Hess = TRUE) PAsso_adv1 <- PAsso(fitted.models=list(fit.vote, fit.PID), method = c("kendall"), resids.type = "surrogate") print(PAsso_adv1, digits = 4) summary(PAsso_adv1, digits = 4)
A plot matrix to display the results of partial association analyses. Upper-triangle contains scatter-plot matrix between each pair of response variables. Lower-triangle contains the partial correlation coefficients adjusted by covariates.
## S3 method for class 'PAsso' plot(x, color = "#444444", shape = 19, size = 2, alpha = 0.5, ...)
## S3 method for class 'PAsso' plot(x, color = "#444444", shape = 19, size = 2, alpha = 0.5, ...)
x |
The object in "PAsso" class that is generated by "PAsso" or "test". |
color |
The color of points. |
shape |
The shapre of points. For more details see the help vignette:
|
size |
The size of points. For more details see the help vignette:
|
alpha |
The value to make the points transparent. For more details see the help vignette:
|
... |
Additional optional arguments to be passed onto. |
A pairwise plot matrix reveals the partial association between ordinal variables.
All the plots are based on surrogate residuals generated from "resides"
function.
Graphics are designed based on ggplot2
and "GGally"
.
A "GGally"
object.
data(ANES2016) summary(ANES2016) PAsso_2v <- PAsso(responses = c("PreVote.num", "PID"), adjustments = c("income.num", "age", "edu.year"), data = ANES2016) plot(PAsso_2v)
data(ANES2016) summary(ANES2016) PAsso_2v <- PAsso(responses = c("PreVote.num", "PID"), adjustments = c("income.num", "age", "edu.year"), data = ANES2016) plot(PAsso_2v)
A list of 3-D P-P plots (or false color level plots when type = "contour"
)
for the inspection of the partial association analysis. Each plot is either 3-D P-P plot or level plot
from an empirical copula trained from the surrogate residuals of a pair of responses.
plot3D(object, y1, y2, ...) ## Default S3 method: plot3D(object, y1, y2, ...) ## S3 method for class 'PAsso' plot3D(object, y1, y2, type = c("surface3D", "contour"), ...)
plot3D(object, y1, y2, ...) ## Default S3 method: plot3D(object, y1, y2, ...) ## S3 method for class 'PAsso' plot3D(object, y1, y2, type = c("surface3D", "contour"), ...)
object |
A PAsso class of object. |
y1 |
A string to specify the first response for the 3D plot. |
y2 |
A string to specify the second response for the 3D plot. If either one of the
y1 or y2 is missing. The |
... |
Additional optional arguments. |
type |
A character string specifying the trace type (e.g. "surface3D", "contour"). "contour" creates a 2D contour plot between u and v. |
All the plots are based on surrogate residuals generated from "residuals"
function in sure
. Graphics are designed based on
PAsso and "plotly"
.
If response y1 or y2 is not specified, a list of "plotly"
objects includes
all pairs of responses will be returned (with name "response 1 v.s. response 2" etc.). If
responses y1 and y2 are specified, returns a 3D plot as "plotly"
object.
# Did not run this to save time # data("ANES2016") # PAsso_3v <- PAsso(responses = c("PreVote.num", "PID", "selfLR"), # adjustments = c("income.num", "age", "edu.year"), # data = ANES2016) # plot3D(PAsso_3v, y1="PID", y2="selfLR") # plot3D(PAsso_3v, y1="PID", y2="selfLR", type = "contour")
# Did not run this to save time # data("ANES2016") # PAsso_3v <- PAsso(responses = c("PreVote.num", "PID", "selfLR"), # adjustments = c("income.num", "age", "edu.year"), # data = ANES2016) # plot3D(PAsso_3v, y1="PID", y2="selfLR") # plot3D(PAsso_3v, y1="PID", y2="selfLR", type = "contour")
Print partial association matrix
## S3 method for class 'PAsso' print(x, digits = max(2, getOption("digits") - 2), ...) ## S3 method for class 'PAsso.test' print(x, digits = max(3L, getOption("digits") - 3L), ...)
## S3 method for class 'PAsso' print(x, digits = max(2, getOption("digits") - 2), ...) ## S3 method for class 'PAsso.test' print(x, digits = max(3L, getOption("digits") - 3L), ...)
x |
A PAsso object for printing out results. |
digits |
A default number to specify decimal digit values. |
... |
Additional optional arguments. |
Print partial association matrix of a PAsso object
# See PAsso for the example.
# See PAsso for the example.
A generic function to simulate surrogate residuals for cumulative link regression models using the latent method described in Liu and Zhang (2017).
It also support the sign-based residuals (Li and Shepherd, 2010), generalized residuals (Franses and Paap, 2001), and deviance residuals for cumulative link regression models.
## S3 method for class 'clm' residuals( object, type = c("surrogate", "sign", "general", "deviance"), jitter = c("latent", "uniform"), jitter.uniform.scale = c("probability", "response"), nsim = 1L, ... ) ## S3 method for class 'ord' residuals( object, type = c("surrogate", "sign", "general", "deviance", "pearson", "working", "response", "partial"), jitter = c("latent", "uniform"), jitter.uniform.scale = c("probability", "response"), nsim = 1L, ... ) ## S3 method for class 'lrm' residuals( object, type = c("surrogate", "sign", "general", "deviance"), jitter = c("latent", "uniform"), jitter.uniform.scale = c("probability", "response"), nsim = 1L, ... ) ## S3 method for class 'orm' residuals( object, type = c("surrogate", "sign", "general", "deviance"), jitter = c("latent", "uniform"), jitter.uniform.scale = c("probability", "response"), nsim = 1L, ... ) ## S3 method for class 'polr' residuals( object, type = c("surrogate", "sign", "general", "deviance"), jitter = c("latent", "uniform"), jitter.uniform.scale = c("probability", "response"), nsim = 1L, ... ) ## S3 method for class 'glm' residuals( object, type = c("surrogate", "sign", "general", "deviance", "pearson", "working", "response", "partial"), jitter = c("latent", "uniform"), jitter.uniform.scale = c("probability", "response"), nsim = 1L, ... ) ## S4 method for signature 'vglm' residuals( object, type = c("surrogate", "sign", "general", "deviance"), jitter = c("latent", "uniform"), jitter.uniform.scale = c("probability", "response"), nsim = 1L, ... ) ## S4 method for signature 'vgam' residuals( object, type = c("surrogate", "sign", "general", "deviance"), jitter = c("latent", "uniform"), jitter.uniform.scale = c("probability", "response"), nsim = 1L, ... ) ## S3 method for class 'PAsso' residuals(object, draw_id = 1, ...)
## S3 method for class 'clm' residuals( object, type = c("surrogate", "sign", "general", "deviance"), jitter = c("latent", "uniform"), jitter.uniform.scale = c("probability", "response"), nsim = 1L, ... ) ## S3 method for class 'ord' residuals( object, type = c("surrogate", "sign", "general", "deviance", "pearson", "working", "response", "partial"), jitter = c("latent", "uniform"), jitter.uniform.scale = c("probability", "response"), nsim = 1L, ... ) ## S3 method for class 'lrm' residuals( object, type = c("surrogate", "sign", "general", "deviance"), jitter = c("latent", "uniform"), jitter.uniform.scale = c("probability", "response"), nsim = 1L, ... ) ## S3 method for class 'orm' residuals( object, type = c("surrogate", "sign", "general", "deviance"), jitter = c("latent", "uniform"), jitter.uniform.scale = c("probability", "response"), nsim = 1L, ... ) ## S3 method for class 'polr' residuals( object, type = c("surrogate", "sign", "general", "deviance"), jitter = c("latent", "uniform"), jitter.uniform.scale = c("probability", "response"), nsim = 1L, ... ) ## S3 method for class 'glm' residuals( object, type = c("surrogate", "sign", "general", "deviance", "pearson", "working", "response", "partial"), jitter = c("latent", "uniform"), jitter.uniform.scale = c("probability", "response"), nsim = 1L, ... ) ## S4 method for signature 'vglm' residuals( object, type = c("surrogate", "sign", "general", "deviance"), jitter = c("latent", "uniform"), jitter.uniform.scale = c("probability", "response"), nsim = 1L, ... ) ## S4 method for signature 'vgam' residuals( object, type = c("surrogate", "sign", "general", "deviance"), jitter = c("latent", "uniform"), jitter.uniform.scale = c("probability", "response"), nsim = 1L, ... ) ## S3 method for class 'PAsso' residuals(object, draw_id = 1, ...)
object |
An object of class |
type |
The type of residuals which should be returned. The alternatives are: "surrogate" (default), "sign", "general", and "deviance". Can be abbreviated.
|
jitter |
When the
|
jitter.uniform.scale |
When the |
nsim |
An integer specifying the number of replicates to use.
Default is |
... |
Additional optional arguments. |
draw_id |
A number refers to the i-th draw of residuals. |
A numeric vector of class c("numeric", "resids")
containing
the simulated surrogate residuals. Additionally, if nsim
> 1,
then the result will contain the attributes:
draws
A matrix with nsim
columns, one for each
is a replicate of the surrogate residuals. Note, they correspond
to the original ordering of the data;
draws_id
A matrix with nsim
columns. Each column
contains the observation number each surrogate residuals corresponds to in
draws
. (This is used for plotting purposes.)
A "resid" object with attributes. It contains a vector or a matrix (nsim>1) of residuals for the adjacent categories model.
A matrix of class c("matrix", "resids")
containing
the simulated surrogate residuals used for the partial association
analysis in PAsso
. Additionally, if rep_num
> 1 in PAsso
,
then the result will contain the attributes:
draws
An array contains all draws of residuals.
Surrogate response values require sampling from a continuous distribution;
consequently, the result will be different with every call to
surrogate
. The internal functions used for sampling from truncated
distributions are based on modified versions of
rtrunc
and qtrunc
.
For "glm"
objects, only the binomial()
family is supported.
Liu, D., Li, S., Yu, Y., & Moustaki, I. (2020). Assessing partial association between ordinal variables: quantification, visualization, and hypothesis testing. Journal of the American Statistical Association, 1-14. doi:10.1080/01621459.2020.1796394
Liu, D., & Zhang, H. (2018). Residuals and diagnostics for ordinal regression models: A surrogate approach. Journal of the American Statistical Association, 113(522), 845-854. doi:10.1080/01621459.2017.1292915
Li, C., & Shepherd, B. E. (2010). Test of association between two ordinal variables while adjusting for covariates. Journal of the American Statistical Association, 105(490), 612-620. doi:10.1198/jasa.2010.tm09386
Franses, P. H., & Paap, R. (2001). Quantitative models in marketing research. Cambridge University Press. doi:10.1017/CBO9780511753794
# Example 1 # Load data with binary response data(ANES2016) # Fit glm model with binomial logit model fit.prevote <- glm(PreVote.num ~ age + edu.year + income.num, data = ANES2016, family = "binomial") # Simulate surrogate residuals res1 <- residuals(fit.prevote, type = "surrogate", jitter="latent", jitter.uniform.scale="response") attr(res1,"arguments") # Example 2 # residuals() function can also work for PAsso object # Load data data("ANES2016") PAsso_1 <- PAsso(responses = c("PreVote.num", "PID"), adjustments = c("income.num", "age", "edu.year"), data = ANES2016) # Extract surrogate residuals from the PAsso object res1 <- residuals(PAsso_1)
# Example 1 # Load data with binary response data(ANES2016) # Fit glm model with binomial logit model fit.prevote <- glm(PreVote.num ~ age + edu.year + income.num, data = ANES2016, family = "binomial") # Simulate surrogate residuals res1 <- residuals(fit.prevote, type = "surrogate", jitter="latent", jitter.uniform.scale="response") attr(res1,"arguments") # Example 2 # residuals() function can also work for PAsso object # Load data data("ANES2016") PAsso_1 <- PAsso(responses = c("PreVote.num", "PID"), adjustments = c("income.num", "age", "edu.year"), data = ANES2016) # Extract surrogate residuals from the PAsso object res1 <- residuals(PAsso_1)
This function summarizes the partial association analysis by providing partial association matrix, marginal association matrix, and a matrix of the models' coefficients. The partial correlation coefficient matrix displays the partial association between each pair of responses after adjusting the covariates. While the marginal coefficient matrix displays association before the adjustment.
## S3 method for class 'PAsso' summary(object, digits = max(3L, getOption("digits") - 2L), ...)
## S3 method for class 'PAsso' summary(object, digits = max(3L, getOption("digits") - 2L), ...)
object |
A PAsso object to draw the summarized results, which includes partial association matrix and a matrix of the models' coefficients. |
digits |
A default number to specify decimal digit values. |
... |
Additional optional arguments. |
For a PAsso object, print its partial association matrix, marginal association matrix, and a matrix of the models' coefficients.
# See PAsso for the example.
# See PAsso for the example.
Simulate surrogate response values for cumulative link regression models using the latent method described in Liu and Zhang (2017).
surrogate( object, method = c("latent", "uniform"), jitter.uniform.scale = c("probability", "response"), nsim = 1L, ... )
surrogate( object, method = c("latent", "uniform"), jitter.uniform.scale = c("probability", "response"), nsim = 1L, ... )
object |
|
method |
Character string specifying which method to use to generate the
surrogate response values. Current options are |
jitter.uniform.scale |
Character string specifying the scale on which to perform
the jittering whenever |
nsim |
Integer specifying the number of bootstrap replicates to use.
Default is |
... |
Additional optional arguments. (Currently ignored.) |
A numeric vector of class c("numeric", "surrogate")
containing
the simulated surrogate response values. Additionally, if nsim
> 1,
then the result will contain the attributes:
boot_reps
A matrix with nsim
columns, one for each
bootstrap replicate of the surrogate values. Note, these are random and do
not correspond to the original ordering of the data;
boot_id
A matrix with nsim
columns. Each column
contains the observation number each surrogate value corresponds to in
boot_reps
. (This is used for plotting purposes.)
Surrogate response values require sampling from a continuous distribution;
consequently, the result will be different with every call to
surrogate
. The internal functions used for sampling from truncated
distributions are based on modified versions of
truncdist:rtrunc
and truncdist:qtrunc
.
For "glm"
objects, only the binomial()
family is supported.
Liu, D., Li, S., Yu, Y., & Moustaki, I. (2020). Assessing partial association between ordinal variables: quantification, visualization, and hypothesis testing. Journal of the American Statistical Association, 1-14. doi:10.1080/01621459.2020.1796394
Liu, D., & Zhang, H. (2018). Residuals and diagnostics for ordinal regression models: A surrogate approach. Journal of the American Statistical Association, 113(522), 845-854. doi:10.1080/01621459.2017.1292915
Nadarajah, S., & Kotz, S. (2006). R Programs for Truncated Distributions. Journal of Statistical Software, 16(Code Snippet 2), 1 - 8. doi:10.18637/jss.v016.c02
# Generate data from a quadratic probit model set.seed(101) n <- 2000 x <- runif(n, min = -3, max = 6) z <- 10 + 3*x - 1*x^2 + rnorm(n) y <- ifelse(z <= 0, yes = 0, no = 1) # Scatterplot matrix pairs(~ x + y + z) # Setup for side-by-side plots par(mfrow = c(1, 2)) # Misspecified mean structure fm1 <- glm(y ~ x, family = binomial(link = "probit")) s1 <- surrogate(fm1) scatter.smooth(x, s1 - fm1$linear.predictors, main = "Misspecified model", ylab = "Surrogate residual", lpars = list(lwd = 3, col = "red2")) abline(h = 0, lty = 2, col = "blue2") # Correctly specified mean structure fm2 <- glm(y ~ x + I(x ^ 2), family = binomial(link = "probit")) s2 <- surrogate(fm2) scatter.smooth(x, s2 - fm2$linear.predictors, main = "Correctly specified model", ylab = "Surrogate residual", lpars = list(lwd = 3, col = "red2")) abline(h = 0, lty = 2, col = "blue2") dev.off() # reset to defaults once finish
# Generate data from a quadratic probit model set.seed(101) n <- 2000 x <- runif(n, min = -3, max = 6) z <- 10 + 3*x - 1*x^2 + rnorm(n) y <- ifelse(z <= 0, yes = 0, no = 1) # Scatterplot matrix pairs(~ x + y + z) # Setup for side-by-side plots par(mfrow = c(1, 2)) # Misspecified mean structure fm1 <- glm(y ~ x, family = binomial(link = "probit")) s1 <- surrogate(fm1) scatter.smooth(x, s1 - fm1$linear.predictors, main = "Misspecified model", ylab = "Surrogate residual", lpars = list(lwd = 3, col = "red2")) abline(h = 0, lty = 2, col = "blue2") # Correctly specified mean structure fm2 <- glm(y ~ x + I(x ^ 2), family = binomial(link = "probit")) s2 <- surrogate(fm2) scatter.smooth(x, s2 - fm2$linear.predictors, main = "Correctly specified model", ylab = "Surrogate residual", lpars = list(lwd = 3, col = "red2")) abline(h = 0, lty = 2, col = "blue2") dev.off() # reset to defaults once finish
This function use bootstrapping to conduct hypothesis testing for the partial association coefficients. It directly applies onto the "PAsso" class of object generated by "PAsso".
test(object, bootstrap_rep = 300, H0 = 0, parallel = FALSE)
test(object, bootstrap_rep = 300, H0 = 0, parallel = FALSE)
object |
An object of "PAsso" class, which is generated by "PAsso" function. |
bootstrap_rep |
The number of bootstrap replications. It may be slow. |
H0 |
null hypothesis of partial correlation coefficient. |
parallel |
logical argument whether conduct parallel for bootstrapping partial association. |
# Import ANES2016 data in "PAsso" data(ANES2016) # Parial association: PAsso_2v <- PAsso(responses = c("PreVote.num", "PID"), adjustments = c("income.num", "age", "edu.year"), data = ANES2016) summary(PAsso_2v, digits=4) PAsso_2v_test <- test(object = PAsso_2v, bootstrap_rep=20, H0=0, parallel=FALSE) PAsso_2v_test
# Import ANES2016 data in "PAsso" data(ANES2016) # Parial association: PAsso_2v <- PAsso(responses = c("PreVote.num", "PID"), adjustments = c("income.num", "age", "edu.year"), data = ANES2016) summary(PAsso_2v, digits=4) PAsso_2v_test <- test(object = PAsso_2v, bootstrap_rep=20, H0=0, parallel=FALSE) PAsso_2v_test