# sjp.glm {sjPlot}

This document shows examples for using the sjp.glm function of the sjPlot package.

Ressources:

• Developer snapshot at GitHub
• Submission of bug reports and issues at GitHub

(back to table of content)

## Data initialization

library(sjPlot)
library(sjmisc)
data(efc)
# set basic theme options
set_theme("forest",
axis.title.size = .85,
axis.textsize = .85,
legend.size = .8,
geom.label.size = 3.5)

## Fitting a logistic model

First, we fit a binomial logit-model and create a vector with labels for the predictors.

# create binary response
y <- ifelse(efc$neg_c_7 < median(na.omit(efc$neg_c_7)), 0, 1)
# create data frame for fitting model
df <- data.frame(y = to_factor(y),
sex = to_factor(efc$c161sex), dep = to_factor(efc$e42dep),
barthel = efc$barthtot, education = to_factor(efc$c172code))
# set variable label for response
set_label(df$y) <- "High Negative Impact" # fit model fit <- glm(y ~., data = df, family = binomial(link = "logit")) ## Plotting estimates of generalized linear models With the sjp.glm function you can plot the odds ratios (or e.g. incidents ratios for poisson models) with confidence intervals as so called forest plots. sjp.glm(fit) ## Plotting incident rate ratios (Poisson) # set variable label for service usage set_label(efc$tot_sc_e) <- "Total number of services used by carer"
# see distribution... looks like Poisson?
sjp.frq(efc$tot_sc_e) # fit poisson model fit2 <- glm(tot_sc_e ~ neg_c_7 + e42dep + c161sex, data = efc, family = poisson(link = "log")) # fit incident rate ratios, we need three decimal points to see # a difference to the negative binomial model... sjp.glm(fit2, digits = 3) # fit negative binomial model as well library(MASS) fit3 <- glm.nb(tot_sc_e ~ neg_c_7 + e42dep + c161sex, data = efc) # fit incident rate ratios sjp.glm(fit3, digits = 3) ## Continuous values at the axis Due to the log-scaling of the x-axis - which should be done when plotting odds ratios (see here and here) - the x-axis values have an exponential growth. However, you can transform the ticks with trns.ticks (defaults to TRUE) to get proportional distances between the values. The x-axis-tick marks are set accordingly. sjp.glm(fit, trns.ticks = FALSE) ## Sorting estimates By default, the odds ratios are sorted from highest to lowest value. You can also keep the order of predictors as they were introduced into the model if you set sort.est to FALSE. sjp.glm(fit, sort.est = FALSE) ## Predictions of coefficients As you can see, the fitted model contains two continuous variables. The odds ratios for these predictors may a bit more difficult to interprete than categorical or factor variables, because of the missing reference category. Thus, you can also plot predicted probability or incidents of all predictors (covariates, coefficients) with type = "slope", marginal effects with type = "eff" and predictions for the response with type = "pred". ### Predicted probabilities or incidents The predicted values from this plot type are based on the intercept’s estimate and each specific term’s estimate. All other co-variates are set to zero (i.e. ignored), which corresponds to family(fit)$linkinv(eta = b0 + bi * xi) (where xi is the estimate).

sjp.glm(fit, type = "slope")

A probability curve of all predictors is plotted, which indicates the probability that the event (indicated by the response) occurs for each value of the predictor (not adjusted for remaining co-variates). In the above example, the first facet plot would be interpreted as: with increasing Barthel-Index (which means, better functional / physical status), the probability that caring for a dependent person is negatively perceived, decreases (in short: the less dependent a person I care for is, the less negative is the impact of care).

This kind of plot may be more informative then the odds ratio of 0.97 for the predictor Total score BARTHEL INDEX.

The same works for other model families or link functions. The following shows predicted incidents from the poisson model.

sjp.glm(fit2, type = "slope")

Confidence intervals are shown when show.ci = TRUE, and data points are not plotted when scatter.plot = FALSE .

sjp.glm(fit, type = "slope", show.ci = TRUE, scatter.plot = FALSE)

You can also plot single plots for each coefficient when facet.grid = FALSE. To get selected plots for particular predictors only, pass the term names to the vars argument. In the following example, only the relationship between barthel and negative impact is shown.

sjp.glm(fit, type = "slope", facet.grid = FALSE,
show.ci = TRUE, vars = "barthel")

### Marginal effects

With type = "eff", you can plot marginal effects (predicted marginal probabilities resp. predicted marginal incident rates), where all remaining co-variates are set to the mean. Unlike type = "slope", this plot type adjusts for co-variates.

# the binary outcome
sjp.glm(fit, type = "eff")

# the count outcome
sjp.glm(fit2, type = "eff")

As you can see in the above examples, multiple plots for type = "eff" are plotted as facet grid resp. as facet wrap. Since this does not allow to set a different x-scale for each plot, x-axis are not properly labelled. However, facet.grid = FALSE produces a single plot for each predictor:

# plot marginal effects, but only for dependency this time
sjp.glm(fit, type = "eff", facet.grid = FALSE, vars = "education")

To arrange all predictors of multiple in one plot, as grid, use the plot_grid() function on multiple plot objects. plot_grid() requires multiple plots, so you have to set facet.grid = FALSE to get a plot.list-value as return value from the function (see ?sjp.lm on Return Value for more details). This allows you arrange multiple plots as grid in one plot, but with different x-axis-scales.

# get list of all plots
p <- sjp.glm(fit, type = "eff", facet.grid = FALSE,
show.ci = TRUE, prnt.plot = FALSE)\$plot.list
# plot all marginal effects, as grid, proper x-axes
# also, set margins for this example
plot_grid(p, margin = c(0.3, 0.3, 0.3, 0.3))

### Predicting values

With type = "pred", you can plot predicted values for the response, related to specific model predictors. The predicted values of the response are computed, based on the predict.glm method and corresponds to predict(fit, type = "response"). This plot type requires the vars argument to select specific terms that should be used for the x-axis and - optional - as first or second grouping factor. Hence, vars must be a character vector with the names of one to three model predictors.

# the binary outcome
sjp.glm(fit, type = "pred", vars = "barthel")

# the count outcome
sjp.glm(fit3, type = "pred", vars = c("neg_c_7", "e42dep"),
show.ci = TRUE)

# the count outcome, non faceted
sjp.glm(fit2, type = "pred", vars = c("neg_c_7", "e42dep"),
facet.grid = FALSE)

# the count outcome, grouped gender and education, w/o data points
# and adjusted y-limits, to completely show CI bands
sjp.glm(fit2, type = "pred", vars = c("neg_c_7", "c161sex","e42dep"),
facet.grid = FALSE, show.ci = TRUE, scatter.plot = FALSE,
axis.lim = c(0, 4))