This document shows examples for using the
sjp.glm function of the sjPlot package.
(back to table of content)
Please refer to this document.
library(sjPlot) library(sjmisc) data(efc) # set basic theme options set_theme("forest", axis.title.size = .85, axis.textsize = .85, legend.size = .8, geom.label.size = 3.5)
Please refer to this document
First, we fit a binomial logit-model and create a vector with labels for the predictors.
# create binary response y <- ifelse(efc$neg_c_7 < median(na.omit(efc$neg_c_7)), 0, 1) # create data frame for fitting model df <- data.frame(y = to_factor(y), sex = to_factor(efc$c161sex), dep = to_factor(efc$e42dep), barthel = efc$barthtot, education = to_factor(efc$c172code)) # set variable label for response set_label(df$y) <- "High Negative Impact" # fit model fit <- glm(y ~., data = df, family = binomial(link = "logit"))
sjp.glm function you can plot the odds ratios (or e.g. incidents ratios for poisson models) with confidence intervals as so called forest plots.
# set variable label for service usage set_label(efc$tot_sc_e) <- "Total number of services used by carer" # see distribution... looks like Poisson? sjp.frq(efc$tot_sc_e)
# fit poisson model fit2 <- glm(tot_sc_e ~ neg_c_7 + e42dep + c161sex, data = efc, family = poisson(link = "log")) # fit incident rate ratios, we need three decimal points to see # a difference to the negative binomial model... sjp.glm(fit2, digits = 3)
# fit negative binomial model as well library(MASS) fit3 <- glm.nb(tot_sc_e ~ neg_c_7 + e42dep + c161sex, data = efc) # fit incident rate ratios sjp.glm(fit3, digits = 3)
Due to the log-scaling of the x-axis - which should be done when plotting odds ratios (see here and here) - the x-axis values have an exponential growth. However, you can transform the ticks with
trns.ticks (defaults to
TRUE) to get proportional distances between the values. The x-axis-tick marks are set accordingly.
sjp.glm(fit, trns.ticks = FALSE)
By default, the odds ratios are sorted from highest to lowest value. You can also keep the order of predictors as they were introduced into the model if you set
sjp.glm(fit, sort.est = FALSE)
As you can see, the fitted model contains two continuous variables. The odds ratios for these predictors may a bit more difficult to interprete than categorical or factor variables, because of the missing reference category. Thus, you can also plot predicted probability or incidents of all predictors (covariates, coefficients) with
type = "slope", marginal effects with
type = "eff" and predictions for the response with
type = "pred".
The predicted values from this plot type are based on the intercept’s estimate and each specific term’s estimate. All other co-variates are set to zero (i.e. ignored), which corresponds to
family(fit)$linkinv(eta = b0 + bi * xi) (where
xi is the estimate).
sjp.glm(fit, type = "slope")
A probability curve of all predictors is plotted, which indicates the probability that the event (indicated by the response) occurs for each value of the predictor (not adjusted for remaining co-variates). In the above example, the first facet plot would be interpreted as: with increasing Barthel-Index (which means, better functional / physical status), the probability that caring for a dependent person is negatively perceived, decreases (in short: the less dependent a person I care for is, the less negative is the impact of care).
This kind of plot may be more informative then the odds ratio of
0.97 for the predictor Total score BARTHEL INDEX.
The same works for other model families or link functions. The following shows predicted incidents from the poisson model.
sjp.glm(fit2, type = "slope")
Confidence intervals are shown when
show.ci = TRUE, and data points are not plotted when
scatter.plot = FALSE .
sjp.glm(fit, type = "slope", show.ci = TRUE, scatter.plot = FALSE)
You can also plot single plots for each coefficient when
facet.grid = FALSE. To get selected plots for particular predictors only, pass the term names to the
vars argument. In the following example, only the relationship between barthel and negative impact is shown.
sjp.glm(fit, type = "slope", facet.grid = FALSE, show.ci = TRUE, vars = "barthel")
type = "eff", you can plot marginal effects (predicted marginal probabilities resp. predicted marginal incident rates), where all remaining co-variates are set to the mean. Unlike
type = "slope", this plot type adjusts for co-variates.
# the binary outcome sjp.glm(fit, type = "eff")
# the count outcome sjp.glm(fit2, type = "eff")
As you can see in the above examples, multiple plots for
type = "eff" are plotted as facet grid resp. as facet wrap. Since this does not allow to set a different x-scale for each plot, x-axis are not properly labelled. However,
facet.grid = FALSE produces a single plot for each predictor:
# plot marginal effects, but only for dependency this time sjp.glm(fit, type = "eff", facet.grid = FALSE, vars = "education")
To arrange all predictors of multiple in one plot, as grid, use the
plot_grid() function on multiple plot objects.
plot_grid() requires multiple plots, so you have to set
facet.grid = FALSE to get a
plot.list-value as return value from the function (see
?sjp.lm on Return Value for more details). This allows you arrange multiple plots as grid in one plot, but with different x-axis-scales.
# get list of all plots p <- sjp.glm(fit, type = "eff", facet.grid = FALSE, show.ci = TRUE, prnt.plot = FALSE)$plot.list # plot all marginal effects, as grid, proper x-axes # also, set margins for this example plot_grid(p, margin = c(0.3, 0.3, 0.3, 0.3))
type = "pred", you can plot predicted values for the response, related to specific model predictors. The predicted values of the response are computed, based on the
predict.glm method and corresponds to
predict(fit, type = "response"). This plot type requires the
vars argument to select specific terms that should be used for the x-axis and - optional - as first or second grouping factor. Hence,
vars must be a character vector with the names of one to three model predictors.
# the binary outcome sjp.glm(fit, type = "pred", vars = "barthel")
# the count outcome sjp.glm(fit3, type = "pred", vars = c("neg_c_7", "e42dep"), show.ci = TRUE)
# the count outcome, non faceted sjp.glm(fit2, type = "pred", vars = c("neg_c_7", "e42dep"), facet.grid = FALSE)
# the count outcome, grouped gender and education, w/o data points # and adjusted y-limits, to completely show CI bands sjp.glm(fit2, type = "pred", vars = c("neg_c_7", "c161sex","e42dep"), facet.grid = FALSE, show.ci = TRUE, scatter.plot = FALSE, axis.lim = c(0, 4))