Understanding the “standard errors are probably not reliable” warning message when fitting mixed models via lme4 #548

annahuynhly · 2025-06-27T03:03:30Z

annahuynhly
Jun 27, 2025

Running these lines of code:

library(lme4)
library(modelbased)
fm1 <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy)
estimate_means(fm1, by= "Subject")

Will produce the following warning message:

Standard errors are probably not reliable. This can happen when random effects are
  involved. You may try `estimate_relation()` instead. You may also try package
  glmmTMB to produce valid standard errors.

That being said, we obtain the same error message if we use the suggested function:

estimate_relation(fm1, by = "Subject")

My main question is: why is it that when random effects are involved, the standard errors are deemed as unreliable?

Currently lme4::predict() has a reasonably reliable se.fit= argument for including the standard errors for prediction. This stackoverflow post discusses extracting the joint covariance matrix of random and fixed effects for merMod objects, and eventually obtaining the standard errors: https://stackoverflow.com/q/78923181/190277

(And my secondary less important question is: why was estimate_relation() suggested? This one could be ignorance on my part...)

strengejacke · 2025-06-30T10:04:58Z

strengejacke
Jun 30, 2025
Maintainer

The problem is that all standard errors are the same. In the SO posting, the explanation is because the design is balanced. But, identical standard errors are also calculated with unequal group numbers, where we might expect different standard errors. Working with glmmTMB and/or estimate_relation() extracts other standard errors (maybe predict() for glmmTMB works different, or we may have implemented it differently compared to lme4):

library(easystats)
library(lme4)
library(glmmTMB)

data(penguins)
fm1 <- lmer(body_mass ~  sex + (1 | species), data = penguins)

estimate_means(fm1, by = "species")
#> Standard errors are probably not reliable. This can happen when random
#>   effects are involved. You may try `estimate_relation()` instead. You may
#>   also try package glmmTMB to produce valid standard errors.
#> Estimated Marginal Means
#> 
#> species   |    Mean |     SE |             95% CI | t(329)
#> ----------------------------------------------------------
#> Adelie    | 3706.68 | 454.93 | [2811.75, 4601.62] |   8.15
#> Chinstrap | 3734.14 | 454.93 | [2839.20, 4629.07] |   8.21
#> Gentoo    | 5082.79 | 454.93 | [4187.85, 5977.72] |  11.17
#> 
#> Variable predicted: body_mass
#> Predictors modulated: species
#> Predictors averaged: sex

dg <- get_datagrid(fm1, by = "species", include_random = TRUE)
predict(fm1, dg, se.fit = TRUE, re.form = NA)
#> $fit
#>        1        2        3 
#> 3840.747 3840.747 3840.747 
#> 
#> $se.fit
#>        1        2        3 
#> 455.2652 455.2652 455.2652

estimate_relation(fm1, by = "species")
#> Standard errors are probably not reliable. This can happen when random
#>   effects are involved. You may try `estimate_relation()` instead. You may
#>   also try package glmmTMB to produce valid standard errors.
#> Model-based Predictions
#> 
#> species   | Predicted |     SE |             95% CI
#> ---------------------------------------------------
#> Adelie    |   3372.89 | 455.27 | [2477.30, 4268.49]
#> Chinstrap |   3400.35 | 455.27 | [2504.75, 4295.94]
#> Gentoo    |   4749.00 | 455.27 | [3853.40, 5644.60]
#> 
#> Variable predicted: body_mass
#> Predictors modulated: species
#> Predictors controlled: sex (female)

fm2 <- glmmTMB(body_mass ~ sex + (1 | species), data = penguins)

estimate_relation(fm2, by = "species")
#> Model-based Predictions
#> 
#> species   | Predicted |    SE |             95% CI
#> --------------------------------------------------
#> Adelie    |   3373.15 | 31.37 | [3311.65, 3434.64]
#> Chinstrap |   3400.86 | 42.05 | [3318.45, 3483.27]
#> Gentoo    |   4748.38 | 34.01 | [4681.73, 4815.03]
#> 
#> Variable predicted: body_mass
#> Predictors modulated: species
#> Predictors controlled: sex (female)

^{Created on 2025-06-30 with reprex v2.1.1}

0 replies

strengejacke · 2025-06-30T10:09:35Z

strengejacke
Jun 30, 2025
Maintainer

I may try the code from the SO posting you linked to. Maybe that helps obtaining accurate SEs. Must check the code with the penguins example.

0 replies

bwiernik · 2025-06-30T16:53:06Z

bwiernik
Jun 30, 2025
Maintainer

I believe the difference is that glmmTMB includes uncertainty from the random effects components, whereas lme4 does that. (we should check with Ben Bolker about that.) We could at least clarify the warning message.

0 replies

strengejacke · 2025-06-30T18:20:55Z

strengejacke
Jun 30, 2025
Maintainer

These are results from both packages:

library(lme4)
#> Loading required package: Matrix
library(glmmTMB)

data(penguins)
d <- data.frame(sex = "female", species = unique(penguins$species))

fm1 <- lmer(body_mass ~  sex + (1 | species), data = penguins)
# population level predictions
p1 <- predict(fm1, newdata = d, re.form = NA, se.fit = TRUE)
# individual level predictions
p2 <- predict(fm1, newdata = d, re.form = NULL, se.fit = TRUE)

fm2 <- glmmTMB(body_mass ~  sex + (1 | species), data = penguins)
# population level predictions
p3 <- predict(fm2, newdata = d, re.form = NA, se.fit = TRUE)
# individual level predictions
p4 <- predict(fm2, newdata = d, re.form = NULL, se.fit = TRUE)

cbind(
  do.call(cbind, p1),
  do.call(cbind, p2),
  do.call(cbind, p3),
  do.call(cbind, p4)
)
#>        fit   se.fit      fit   se.fit      fit   se.fit      fit   se.fit
#> 1 3840.747 455.2652 3372.894 31.41926 3840.796 371.8844 3373.145 31.37384
#> 2 3840.747 455.2652 4748.999 34.03007 3840.796 371.8844 4748.382 34.00714
#> 3 3840.747 455.2652 3400.347 42.10579 3840.796 371.8844 3400.860 42.04746

^{Created on 2025-06-30 with reprex v2.1.1}

And here the Bayesian results, for comparison

fm3 <- brms::brm(body_mass ~  sex + (1 | species), data = penguins)
# population level predictions
p4 <- predict(fm3, newdata = d, re_formula = NA)
# individual level predictions
p5 <- predict(fm3, newdata = d, re_formula = NULL)

p4
#>      Estimate Est.Error     Q2.5    Q97.5
#> [1,] 3790.211  555.4609 2680.449 4898.273
#> [2,] 3794.254  566.1043 2695.202 4893.326
#> [3,] 3799.027  556.9404 2667.236 4912.240

p5
#>      Estimate Est.Error     Q2.5    Q97.5
#> [1,] 3371.963  317.1194 2752.285 3994.157
#> [2,] 4738.856  322.9909 4101.411 5371.909
#> [3,] 3405.087  325.9277 2772.207 4045.131

0 replies

strengejacke · 2025-06-30T18:25:36Z

strengejacke
Jun 30, 2025
Maintainer

I think the population level predictions and SEs make sense, but in this code, random effects are explicitly requested.

library(lme4)
library(modelbased)
fm1 <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy)
estimate_means(fm1, by= "Subject")

We may change the message to something like: Looks like you want predictions for random effects, please add re.form = NULL to your call?

0 replies

strengejacke · 2025-06-30T18:55:50Z

strengejacke
Jun 30, 2025
Maintainer

Ok, this is a lot of output, but the bottom line is: re.form seems to have no effect on standard errors when we call marginaleffects::avg_predictions() (which we do in estimate_means()). But for estimate_relation(), where we rely on predict(), re.form has an effect on standard errors.

library(lme4)
#> Loading required package: Matrix
library(glmmTMB)
library(modelbased)

data(penguins)
d <- data.frame(sex = "female", species = unique(penguins$species))

fm1 <- lmer(body_mass ~  sex + (1 | species), data = penguins, REML = TRUE)
estimate_means(fm1, by = "species", re.form = NA, estimate = "specific")
#> Standard errors are probably not reliable. This can happen when random
#>   effects are involved. You may try `estimate_relation()` instead. You may
#>   also try package glmmTMB to produce valid standard errors.
#> Model-based Predictions
#> 
#> species   |    Mean |     SE |             95% CI | t(329)
#> ----------------------------------------------------------
#> Adelie    | 3840.75 | 455.27 | [2945.15, 4736.34] |   8.44
#> Chinstrap | 3840.75 | 455.27 | [2945.15, 4736.34] |   8.44
#> Gentoo    | 3840.75 | 455.27 | [2945.15, 4736.34] |   8.44
#> 
#> Variable predicted: body_mass
#> Predictors modulated: species
#> Predictors controlled: sex (female)
marginaleffects::avg_predictions(fm1, newdata = d, by = "species", re.form = NA)
#> 
#>    species Estimate Std. Error    z Pr(>|z|)    S 2.5 % 97.5 %
#>  Adelie        3841        455 8.44   <0.001 54.8  2948   4733
#>  Chinstrap     3841        455 8.44   <0.001 54.8  2948   4733
#>  Gentoo        3841        455 8.44   <0.001 54.8  2948   4733
#> 
#> Type: response
estimate_relation(fm1, by = "species", re.form = NA)
#> Standard errors are probably not reliable. This can happen when random
#>   effects are involved. You may try `estimate_relation()` instead. You may
#>   also try package glmmTMB to produce valid standard errors.
#> Model-based Predictions
#> 
#> species   | Predicted |     SE |             95% CI
#> ---------------------------------------------------
#> Adelie    |   3372.89 | 455.27 | [2477.30, 4268.49]
#> Chinstrap |   3400.35 | 455.27 | [2504.75, 4295.94]
#> Gentoo    |   4749.00 | 455.27 | [3853.40, 5644.60]
#> 
#> Variable predicted: body_mass
#> Predictors modulated: species
#> Predictors controlled: sex (female)

estimate_means(fm1, by = "species", re.form = NULL, estimate = "specific")
#> Standard errors are probably not reliable. This can happen when random
#>   effects are involved. You may try `estimate_relation()` instead. You may
#>   also try package glmmTMB to produce valid standard errors.
#> Model-based Predictions
#> 
#> species   |    Mean |     SE |             95% CI | t(329)
#> ----------------------------------------------------------
#> Adelie    | 3372.89 | 455.27 | [2477.30, 4268.49] |   7.41
#> Chinstrap | 3400.35 | 455.27 | [2504.75, 4295.94] |   7.47
#> Gentoo    | 4749.00 | 455.27 | [3853.40, 5644.60] |  10.43
#> 
#> Variable predicted: body_mass
#> Predictors modulated: species
#> Predictors controlled: sex (female)
marginaleffects::avg_predictions(fm1, newdata = d, by = "species", re.form = NULL)
#> Warning: For this model type, `marginaleffects` only takes into account the
#> uncertainty in fixed-effect parameters. This is often appropriate when
#> `re.form=NA`, but may be surprising to users who set `re.form=NULL` (default)
#> or to some other value. Call `options(marginaleffects_safe = FALSE)` to silence
#> this warning.
#> 
#>    species Estimate Std. Error     z Pr(>|z|)    S 2.5 % 97.5 %
#>  Adelie        3373        455  7.41   <0.001 42.8  2481   4265
#>  Chinstrap     3400        455  7.47   <0.001 43.5  2508   4293
#>  Gentoo        4749        455 10.43   <0.001 82.2  3857   5641
#> 
#> Type: response
estimate_relation(fm1, by = "species", re.form = NULL)
#> Standard errors are probably not reliable. This can happen when random
#>   effects are involved. You may try `estimate_relation()` instead. You may
#>   also try package glmmTMB to produce valid standard errors.
#> Model-based Predictions
#> 
#> species   | Predicted |     SE |             95% CI
#> ---------------------------------------------------
#> Adelie    |   3372.89 | 455.27 | [2477.30, 4268.49]
#> Chinstrap |   3400.35 | 455.27 | [2504.75, 4295.94]
#> Gentoo    |   4749.00 | 455.27 | [3853.40, 5644.60]
#> 
#> Variable predicted: body_mass
#> Predictors modulated: species
#> Predictors controlled: sex (female)


fm1 <- glmmTMB(body_mass ~  sex + (1 | species), data = penguins)
estimate_means(fm1, by = "species", re.form = NA, estimate = "specific")
#> Standard errors are probably not reliable. This can happen when random
#>   effects are involved. You may try `estimate_relation()` instead.
#> Model-based Predictions
#> 
#> species   |    Mean |     SE |             95% CI |     z
#> ---------------------------------------------------------
#> Adelie    | 3840.80 | 371.88 | [3111.92, 4569.68] | 10.33
#> Chinstrap | 3840.80 | 371.88 | [3111.92, 4569.68] | 10.33
#> Gentoo    | 3840.80 | 371.88 | [3111.92, 4569.68] | 10.33
#> 
#> Variable predicted: body_mass
#> Predictors modulated: species
#> Predictors controlled: sex (female)
marginaleffects::avg_predictions(fm1, newdata = d, by = "species", re.form = NA)
#> 
#>    species Estimate Std. Error    z Pr(>|z|)    S 2.5 % 97.5 %
#>  Adelie        3841        372 10.3   <0.001 80.7  3112   4570
#>  Chinstrap     3841        372 10.3   <0.001 80.7  3112   4570
#>  Gentoo        3841        372 10.3   <0.001 80.7  3112   4570
#> 
#> Type: response
estimate_relation(fm1, by = "species", re.form = NA)
#> Model-based Predictions
#> 
#> species   | Predicted |    SE |             95% CI
#> --------------------------------------------------
#> Adelie    |   3373.15 | 31.37 | [3311.65, 3434.64]
#> Chinstrap |   3400.86 | 42.05 | [3318.45, 3483.27]
#> Gentoo    |   4748.38 | 34.01 | [4681.73, 4815.03]
#> 
#> Variable predicted: body_mass
#> Predictors modulated: species
#> Predictors controlled: sex (female)

estimate_means(fm1, by = "species", re.form = NULL, estimate = "specific")
#> Standard errors are probably not reliable. This can happen when random
#>   effects are involved. You may try `estimate_relation()` instead.
#> Model-based Predictions
#> 
#> species   |    Mean |     SE |             95% CI |     z
#> ---------------------------------------------------------
#> Adelie    | 3373.15 | 371.88 | [2644.27, 4102.03] |  9.07
#> Chinstrap | 3400.86 | 371.88 | [2671.98, 4129.74] |  9.14
#> Gentoo    | 4748.38 | 371.88 | [4019.50, 5477.26] | 12.77
#> 
#> Variable predicted: body_mass
#> Predictors modulated: species
#> Predictors controlled: sex (female)
marginaleffects::avg_predictions(fm1, newdata = d, by = "species", re.form = NULL)
#> Warning: For this model type, `marginaleffects` only takes into account the
#> uncertainty in fixed-effect parameters. This is often appropriate when
#> `re.form=NA`, but may be surprising to users who set `re.form=NULL` (default)
#> or to some other value. Call `options(marginaleffects_safe = FALSE)` to silence
#> this warning.
#> 
#>    species Estimate Std. Error     z Pr(>|z|)     S 2.5 % 97.5 %
#>  Adelie        3373        372  9.07   <0.001  62.9  2644   4102
#>  Chinstrap     3401        372  9.14   <0.001  63.9  2672   4130
#>  Gentoo        4748        372 12.77   <0.001 121.6  4019   5477
#> 
#> Type: response
estimate_relation(fm1, by = "species", re.form = NULL)
#> Model-based Predictions
#> 
#> species   | Predicted |    SE |             95% CI
#> --------------------------------------------------
#> Adelie    |   3373.15 | 31.37 | [3311.65, 3434.64]
#> Chinstrap |   3400.86 | 42.05 | [3318.45, 3483.27]
#> Gentoo    |   4748.38 | 34.01 | [4681.73, 4815.03]
#> 
#> Variable predicted: body_mass
#> Predictors modulated: species
#> Predictors controlled: sex (female)

^{Created on 2025-06-30 with reprex v2.1.1}

0 replies

annahuynhly · 2025-07-02T14:35:44Z

annahuynhly
Jul 2, 2025
Author

Hello Dr. Lüdecke, thank you so much for investigating this issue. Regarding the last comment where you mention re.form has no effect on standard errors except for estimate_relation(), I’ve looked at the output and it seems like it still doesn’t have an effect on standard errors.

It seems to me that re.form has an effect on the means listed inestimate_means() but not on the standard errors. (Hope I am not going crazy…)

0 replies

strengejacke · 2025-07-02T14:43:09Z

strengejacke
Jul 2, 2025
Maintainer

Sorry, I meant that estimate_relations() gives the same standard errors as estimate_means() for lme4, but not the same for glmmTMB. You're right, it's not that re.form = NA differs from re.form = NULL in both cases for estimate_relation().

We have a vignette on mixed models, but not sure if it clarifies anything ;-)

0 replies

strengejacke · 2025-08-03T19:49:54Z

strengejacke
Aug 3, 2025
Maintainer

Ok, I will remove the message for now and turn this issue into a discussion.

0 replies

Uh oh!

Understanding the “standard errors are probably not reliable” warning message when fitting mixed models via lme4 #548

Uh oh!

annahuynhly Jun 27, 2025

Replies: 9 comments

Uh oh!

Uh oh!

strengejacke Jun 30, 2025 Maintainer

Uh oh!

strengejacke Jun 30, 2025 Maintainer

Uh oh!

bwiernik Jun 30, 2025 Maintainer

Uh oh!

strengejacke Jun 30, 2025 Maintainer

Uh oh!

strengejacke Jun 30, 2025 Maintainer

Uh oh!

strengejacke Jun 30, 2025 Maintainer

Uh oh!

annahuynhly Jul 2, 2025 Author

Uh oh!

strengejacke Jul 2, 2025 Maintainer

Uh oh!

strengejacke Aug 3, 2025 Maintainer

annahuynhly
Jun 27, 2025

strengejacke
Jun 30, 2025
Maintainer

strengejacke
Jun 30, 2025
Maintainer

bwiernik
Jun 30, 2025
Maintainer

strengejacke
Jun 30, 2025
Maintainer

strengejacke
Jun 30, 2025
Maintainer

strengejacke
Jun 30, 2025
Maintainer

annahuynhly
Jul 2, 2025
Author

strengejacke
Jul 2, 2025
Maintainer

strengejacke
Aug 3, 2025
Maintainer