Title: | Spatial Statistical Modeling and Prediction |
---|---|
Description: | Fit, summarize, and predict for a variety of spatial statistical models applied to point-referenced and areal (lattice) data. Parameters are estimated using various methods. Additional modeling features include anisotropy, non-spatial random effects, partition factors, big data approaches, and more. Model-fit statistics are used to summarize, visualize, and compare models. Predictions at unobserved locations are readily obtainable. For additional details, see Dumelle et al. (2023) <doi:10.1371/journal.pone.0282524>. |
Authors: | Michael Dumelle [aut, cre] , Matt Higham [aut] , Ryan A. Hill [ctb] , Michael Mahon [ctb] , Jay M. Ver Hoef [aut] |
Maintainer: | Michael Dumelle <[email protected]> |
License: | GPL-3 |
Version: | 0.9.0 |
Built: | 2024-11-07 03:24:14 UTC |
Source: | https://github.com/usepa/spmodel |
Compute AICc for one or several fitted model objects for which a log-likelihood value can be obtained.
AICc(object, ..., k = 2)
AICc(object, ..., k = 2)
object |
A fitted model object from |
... |
Optionally more fitted model objects. |
k |
The penalty parameter, taken to be 2. Currently not allowed to differ from 2 (needed for generic consistency). |
When comparing models fit by maximum or restricted maximum
likelihood, the smaller the AICc, the better the fit. The AICc contains
a correction to AIC for small sample sizes. The theory of
AICc requires that the log-likelihood has been maximized, and hence,
no AICc methods exist for models where estmethod
is not
"ml"
or "reml"
. Additionally, AICc comparisons between "ml"
and "reml"
models are meaningless – comparisons should only be made
within a set of models estimated using "ml"
or a set of models estimated
using "reml"
. AICc comparisons for "reml"
must
use the same fixed effects. To vary the covariance parameters and
fixed effects simultaneously, use "ml"
.
Hoeting et al. (2006) study AIC and AICc in a spatial context, using the AIC definition
and the AICc definition as
, where
is the sample size
and
is the number of estimated parameters. For
"ml"
, is
the number of estimated covariance parameters plus the number of estimated
fixed effects. For
"reml"
, is the number of estimated covariance
parameters.
If just one object is provided, a numeric value with the corresponding AICc.
If multiple objects are provided, a data.frame
with rows corresponding
to the objects and columns representing the number of parameters estimated
(df
) and the AICc.
spmod <- splm(z ~ water + tarp, data = caribou, spcov_type = "exponential", xcoord = x, ycoord = y ) AICc(spmod) AIC(spmod) BIC(spmod)
spmod <- splm(z ~ water + tarp, data = caribou, spcov_type = "exponential", xcoord = x, ycoord = y ) AICc(spmod) AIC(spmod) BIC(spmod)
Compute analysis of variance tables for a fitted model object or a likelihood ratio test for two fitted model objects.
## S3 method for class 'splm' anova(object, ..., test = TRUE, Terms, L) ## S3 method for class 'spautor' anova(object, ..., test = TRUE, Terms, L) ## S3 method for class 'spglm' anova(object, ..., test = TRUE, Terms, L) ## S3 method for class 'spgautor' anova(object, ..., test = TRUE, Terms, L) ## S3 method for class 'anova.splm' tidy(x, ...) ## S3 method for class 'anova.spautor' tidy(x, ...) ## S3 method for class 'anova.spglm' tidy(x, ...) ## S3 method for class 'anova.spgautor' tidy(x, ...)
## S3 method for class 'splm' anova(object, ..., test = TRUE, Terms, L) ## S3 method for class 'spautor' anova(object, ..., test = TRUE, Terms, L) ## S3 method for class 'spglm' anova(object, ..., test = TRUE, Terms, L) ## S3 method for class 'spgautor' anova(object, ..., test = TRUE, Terms, L) ## S3 method for class 'anova.splm' tidy(x, ...) ## S3 method for class 'anova.spautor' tidy(x, ...) ## S3 method for class 'anova.spglm' tidy(x, ...) ## S3 method for class 'anova.spgautor' tidy(x, ...)
object |
A fitted model object from |
... |
An additional fitted model object. |
test |
A logical value indicating whether p-values from asymptotic Chi-squared
hypothesis tests should be returned. Defaults to |
Terms |
An optional character or integer vector that specifies terms in the model
used to jointly compute test statistics and p-values (if |
L |
An optional numeric matrix or list specifying linear combinations
of the coefficients in the model used to compute test statistics
and p-values (if |
x |
An object from |
When one fitted model object is present, anova()
performs a general linear hypothesis test corresponding to some hypothesis
specified by a matrix of constraints. If Terms
and L
are not specified,
each model term is tested against zero (which correspond to type III or marginal
hypothesis tests from classical ANOVA). If Terms
is specified and L
is not specified, all terms are tested jointly against zero. When L
is
specified, the linear combinations of terms specified by L
are jointly
tested against zero.
When two fitted model objects are present, one must be a "reduced"
model nested in a "full" model. Then anova()
performs a likelihood ratio test.
When one fitted model object is present, anova()
returns a data frame with degrees of
freedom (Df
), test statistics (Chi2
), and p-values
(Pr(>Chi2)
if test = TRUE
) corresponding
to asymptotic Chi-squared hypothesis tests for each model term.
When two fitted model objects are present, anova()
returns a data frame
with the difference in degrees of freedom between the full and reduced model (Df
), a test
statistic (Chi2
), and a p-value corresponding to the likelihood ratio test
(Pr(>Chi2)
if test = TRUE
).
Whether one or two fitted model objects are provided,
tidy()
can be used
to obtain tidy tibbles of the anova(object)
output.
# one-model anova spmod <- splm(z ~ water + tarp, data = caribou, spcov_type = "exponential", xcoord = x, ycoord = y ) anova(spmod) tidy(anova(spmod)) # see terms tidy(anova(spmod))$effects tidy(anova(spmod, Terms = c("water", "tarp"))) # same as tidy(anova(spmod, Terms = c(2, 3))) # likelihood ratio test lmod <- splm(z ~ water + tarp, data = caribou, spcov_type = "none" ) tidy(anova(spmod, lmod))
# one-model anova spmod <- splm(z ~ water + tarp, data = caribou, spcov_type = "exponential", xcoord = x, ycoord = y ) anova(spmod) tidy(anova(spmod)) # see terms tidy(anova(spmod))$effects tidy(anova(spmod, Terms = c("water", "tarp"))) # same as tidy(anova(spmod, Terms = c(2, 3))) # likelihood ratio test lmod <- splm(z ~ water + tarp, data = caribou, spcov_type = "none" ) tidy(anova(spmod, lmod))
Augment accepts a fitted model object and a data set and adds
information about each observation in the data set. New columns always
begin with a .
prefix to avoid overwriting columns in the original
data set.
Augment behaves differently depending on whether the original data or new data
requires augmenting. Typically, when augmenting the original data, only the fitted
model object is specified, and when augmenting new data, the fitted model object
and newdata
is specified. When augmenting the original data, diagnostic
statistics are augmented to each row in the data set. When augmenting new data,
predictions and optional intervals or standard errors are augmented to each
row in the new data set.
## S3 method for class 'splm' augment( x, drop = TRUE, newdata = NULL, se_fit = FALSE, interval = c("none", "confidence", "prediction"), level = 0.95, local, ... ) ## S3 method for class 'spautor' augment( x, drop = TRUE, newdata = NULL, se_fit = FALSE, interval = c("none", "confidence", "prediction"), level = 0.95, local, ... ) ## S3 method for class 'spglm' augment( x, drop = TRUE, newdata = NULL, type.predict = c("link", "response"), type.residuals = c("deviance", "pearson", "response"), se_fit = FALSE, interval = c("none", "confidence", "prediction"), newdata_size, level = 0.95, local = local, var_correct = TRUE, ... ) ## S3 method for class 'spgautor' augment( x, drop = TRUE, newdata = NULL, type.predict = c("link", "response"), type.residuals = c("deviance", "pearson", "response"), se_fit = FALSE, interval = c("none", "confidence", "prediction"), newdata_size, level = 0.95, local, var_correct = TRUE, ... )
## S3 method for class 'splm' augment( x, drop = TRUE, newdata = NULL, se_fit = FALSE, interval = c("none", "confidence", "prediction"), level = 0.95, local, ... ) ## S3 method for class 'spautor' augment( x, drop = TRUE, newdata = NULL, se_fit = FALSE, interval = c("none", "confidence", "prediction"), level = 0.95, local, ... ) ## S3 method for class 'spglm' augment( x, drop = TRUE, newdata = NULL, type.predict = c("link", "response"), type.residuals = c("deviance", "pearson", "response"), se_fit = FALSE, interval = c("none", "confidence", "prediction"), newdata_size, level = 0.95, local = local, var_correct = TRUE, ... ) ## S3 method for class 'spgautor' augment( x, drop = TRUE, newdata = NULL, type.predict = c("link", "response"), type.residuals = c("deviance", "pearson", "response"), se_fit = FALSE, interval = c("none", "confidence", "prediction"), newdata_size, level = 0.95, local, var_correct = TRUE, ... )
x |
|
drop |
A logical indicating whether to drop extra variables in the
fitted model object |
newdata |
A data frame or tibble containing observations requiring prediction.
All of the original explanatory variables used to create the fitted model object |
se_fit |
Logical indicating whether or not a |
interval |
Character indicating the type of confidence interval columns to
add to the augmented |
level |
Tolerance/confidence level. The default is |
local |
A list or logical. If a list, specific list elements described
in |
... |
Other arguments. Not used (needed for generic consistency). |
type.predict |
The scale ( |
type.residuals |
The residual type ( |
newdata_size |
The |
var_correct |
A logical indicating whether to return the corrected prediction
variances when predicting via models fit using |
augment()
returns a tibble with the same class as
data
. That is, if data
is
an sf
object, then the augmented object (obtained via augment(x)
)
will be an sf
object as well. When augmenting newdata
, the
augmented object has the same class as data
.
Missing response values from the original data can be augmented as if
they were a newdata
object by providing x$newdata
to the
newdata
argument (where x
is the name of the fitted model
object). This is the only way to compute predictions for
spautor()
and spgautor()
fitted model objects.
When augmenting the original data set, a tibble with additional columns
.fitted
Fitted value
.resid
Response residual (the difference between observed and fitted values)
.hat
Leverage (diagonal of the hat matrix)
.cooksd
Cook's distance
.std.resid
Standardized residuals
.se.fit
Standard error of the fitted value.
When augmenting a new data set, a tibble with additional columns
.fitted
Predicted (or fitted) value
.lower
Lower bound on interval
.upper
Upper bound on interval
.se.fit
Standard error of the predicted (or fitted) value
tidy.spmodel()
glance.spmodel()
predict.spmodel()
spmod <- splm(z ~ water + tarp, data = caribou, spcov_type = "exponential", xcoord = x, ycoord = y ) augment(spmod) spmod_sulf <- splm(sulfate ~ 1, data = sulfate, spcov_type = "exponential") augment(spmod_sulf) augment(spmod_sulf, newdata = sulfate_preds) # missingness in original data spmod_seal <- spautor(log_trend ~ 1, data = seal, spcov_type = "car") augment(spmod_seal) augment(spmod_seal, newdata = spmod_seal$newdata)
spmod <- splm(z ~ water + tarp, data = caribou, spcov_type = "exponential", xcoord = x, ycoord = y ) augment(spmod) spmod_sulf <- splm(sulfate ~ 1, data = sulfate, spcov_type = "exponential") augment(spmod_sulf) augment(spmod_sulf, newdata = sulfate_preds) # missingness in original data spmod_seal <- spautor(log_trend ~ 1, data = seal, spcov_type = "car") augment(spmod_seal) augment(spmod_seal, newdata = spmod_seal$newdata)
Compare area under the receiver operating characteristic curve (AUROC) for binary (e.g., logistic) models. The area under the ROC curve provides a measure of the model's classification accuracy averaged over all possible threshold values.
AUROC(object, ...) ## S3 method for class 'spglm' AUROC(object, ...) ## S3 method for class 'spgautor' AUROC(object, ...)
AUROC(object, ...) ## S3 method for class 'spglm' AUROC(object, ...) ## S3 method for class 'spgautor' AUROC(object, ...)
object |
A fitted model object from |
... |
Additional arguments to |
The area under the receiver operating characteristic curve.
Robin, X., Turck, N., Hainard, A., Tiberti, N., Lisacek, F., Sanchez, J. C., & Müller, M. (2011). pROC: an open-source package for R and S+ to analyze and compare ROC curves. BMC bioinformatics, 12, 1-8.
spgmod <- spglm(presence ~ elev, family = "binomial", data = moose, spcov_type = "exponential" ) AUROC(spgmod)
spgmod <- spglm(presence ~ elev, family = "binomial", data = moose, spcov_type = "exponential" ) AUROC(spgmod)
A caribou forage experiment.
caribou
caribou
A tibble
with 30 rows and 5 columns:
water: A factor representing whether water was added. Takes values
N
(no water added) and Y
(water added).
tarp: A factor representing tarp cover. Takes values clear
(a clear tarp), shade
(a shade tarp), and none
(no tarp).
z: The percentage of nitrogen.
x: The x-coordinate.
y: The y-coordinate.
These data were provided by Elizabeth Lenart of the Alaska Department of Fish and Game. The data were used in the publication listed in References.
Lenart, E.A., Bowyer, R.T., Ver Hoef, J.M. and Ruess, R.W. 2002. Climate Change and Caribou: Effects of Summer Weather on Forage. Canadian Journal of Zoology 80: 664-678.
coef extracts fitted model coefficients from
fitted model objects. coefficients
is an alias for it.
## S3 method for class 'splm' coef(object, type = "fixed", ...) ## S3 method for class 'splm' coefficients(object, type = "fixed", ...) ## S3 method for class 'spautor' coef(object, type = "fixed", ...) ## S3 method for class 'spautor' coefficients(object, type = "fixed", ...) ## S3 method for class 'spglm' coef(object, type = "fixed", ...) ## S3 method for class 'spglm' coefficients(object, type = "fixed", ...) ## S3 method for class 'spgautor' coef(object, type = "fixed", ...) ## S3 method for class 'spgautor' coefficients(object, type = "fixed", ...)
## S3 method for class 'splm' coef(object, type = "fixed", ...) ## S3 method for class 'splm' coefficients(object, type = "fixed", ...) ## S3 method for class 'spautor' coef(object, type = "fixed", ...) ## S3 method for class 'spautor' coefficients(object, type = "fixed", ...) ## S3 method for class 'spglm' coef(object, type = "fixed", ...) ## S3 method for class 'spglm' coefficients(object, type = "fixed", ...) ## S3 method for class 'spgautor' coef(object, type = "fixed", ...) ## S3 method for class 'spgautor' coefficients(object, type = "fixed", ...)
object |
A fitted model object from |
type |
|
... |
Other arguments. Not used (needed for generic consistency). |
A named vector of coefficients.
spmod <- splm(z ~ water + tarp, data = caribou, spcov_type = "exponential", xcoord = x, ycoord = y ) coef(spmod) coefficients(spmod) coef(spmod, type = "spcov")
spmod <- splm(z ~ water + tarp, data = caribou, spcov_type = "exponential", xcoord = x, ycoord = y ) coef(spmod) coefficients(spmod) coef(spmod, type = "spcov")
Computes confidence intervals for one or more parameters in a fitted model object.
## S3 method for class 'splm' confint(object, parm, level = 0.95, ...) ## S3 method for class 'spautor' confint(object, parm, level = 0.95, ...) ## S3 method for class 'spglm' confint(object, parm, level = 0.95, ...) ## S3 method for class 'spgautor' confint(object, parm, level = 0.95, ...)
## S3 method for class 'splm' confint(object, parm, level = 0.95, ...) ## S3 method for class 'spautor' confint(object, parm, level = 0.95, ...) ## S3 method for class 'spglm' confint(object, parm, level = 0.95, ...) ## S3 method for class 'spgautor' confint(object, parm, level = 0.95, ...)
object |
A fitted model object from |
parm |
A specification of which parameters are to be given confidence intervals (a character vector of names). If missing, all parameters are considered. |
level |
The confidence level required. The default is |
... |
Other arguments. Not used (needed for generic consistency). |
Gaussian-based confidence intervals (two-sided and equal-tailed) for the
fixed effect coefficients based on the confidence level specified by level
.
For spglm()
or spgautor()
fitted model objects, confidence intervals are
on the link scale.
spmod <- splm(z ~ water + tarp, data = caribou, spcov_type = "exponential", xcoord = x, ycoord = y ) confint(spmod) confint(spmod, parm = "waterY", level = 0.90)
spmod <- splm(z ~ water + tarp, data = caribou, spcov_type = "exponential", xcoord = x, ycoord = y ) confint(spmod) confint(spmod, parm = "waterY", level = 0.90)
Compute the Cook's distance for each observation from a fitted model object.
## S3 method for class 'splm' cooks.distance(model, ...) ## S3 method for class 'spautor' cooks.distance(model, ...) ## S3 method for class 'spglm' cooks.distance(model, ...) ## S3 method for class 'spgautor' cooks.distance(model, ...)
## S3 method for class 'splm' cooks.distance(model, ...) ## S3 method for class 'spautor' cooks.distance(model, ...) ## S3 method for class 'spglm' cooks.distance(model, ...) ## S3 method for class 'spgautor' cooks.distance(model, ...)
model |
A fitted model object from |
... |
Other arguments. Not used (needed for generic consistency). |
Cook's distance measures the influence of an observation on a fitted model object. If an observation is influential, its omission from the data noticeably impacts parameter estimates. The larger the Cook's distance, the larger the influence.
A vector of Cook's distance values for each observation from the fitted model object.
augment.spmodel()
hatvalues.spmodel()
influence.spmodel()
residuals.spmodel()
spmod <- splm(z ~ water + tarp, data = caribou, spcov_type = "exponential", xcoord = x, ycoord = y ) cooks.distance(spmod)
spmod <- splm(z ~ water + tarp, data = caribou, spcov_type = "exponential", xcoord = x, ycoord = y ) cooks.distance(spmod)
Create a covariance matrix from a fitted model object.
covmatrix(object, newdata, ...) ## S3 method for class 'splm' covmatrix(object, newdata, cov_type, ...) ## S3 method for class 'spautor' covmatrix(object, newdata, cov_type, ...) ## S3 method for class 'spglm' covmatrix(object, newdata, cov_type, ...) ## S3 method for class 'spgautor' covmatrix(object, newdata, cov_type, ...)
covmatrix(object, newdata, ...) ## S3 method for class 'splm' covmatrix(object, newdata, cov_type, ...) ## S3 method for class 'spautor' covmatrix(object, newdata, cov_type, ...) ## S3 method for class 'spglm' covmatrix(object, newdata, cov_type, ...) ## S3 method for class 'spgautor' covmatrix(object, newdata, cov_type, ...)
object |
A fitted model object (e.g., |
newdata |
If omitted, the covariance matrix of
the observed data is returned. If provided, |
... |
Other arguments. Not used (needed for generic consistency). |
cov_type |
The type of covariance matrix returned. If |
If newdata
is omitted, the covariance matrix of the observed
data, which has dimension n x n, where n is the sample size used to fit object
.
If newdata
is provided, the covariance matrix between the unobserved (new)
data and the observed data, which has dimension m x n, where m is the number of
new observations and n is the sample size used to fit object
.
spmod <- splm(z ~ water + tarp, data = caribou, spcov_type = "exponential", xcoord = x, ycoord = y ) covmatrix(spmod)
spmod <- splm(z ~ water + tarp, data = caribou, spcov_type = "exponential", xcoord = x, ycoord = y ) covmatrix(spmod)
Returns the deviance of a fitted model object.
## S3 method for class 'splm' deviance(object, ...) ## S3 method for class 'spautor' deviance(object, ...) ## S3 method for class 'spglm' deviance(object, ...) ## S3 method for class 'spgautor' deviance(object, ...)
## S3 method for class 'splm' deviance(object, ...) ## S3 method for class 'spautor' deviance(object, ...) ## S3 method for class 'spglm' deviance(object, ...) ## S3 method for class 'spgautor' deviance(object, ...)
object |
A fitted model object from |
... |
Other arguments. Not used (needed for generic consistency). |
For objects estimated using "ml"
or "reml"
,
the deviance is twice the difference in log-likelihoods between the
saturated (perfect-fit) model and the fitted model.
The deviance.
spmod <- splm(z ~ water + tarp, data = caribou, spcov_type = "exponential", xcoord = x, ycoord = y ) deviance(spmod)
spmod <- splm(z ~ water + tarp, data = caribou, spcov_type = "exponential", xcoord = x, ycoord = y ) deviance(spmod)
Create a dispersion parameter initial object that specifies
initial and/or known values to use while estimating the dispersion parameter
with spglm()
or spgautor()
.
dispersion_initial(family, dispersion, known)
dispersion_initial(family, dispersion, known)
family |
The generalized linear model family describing the distribution
of the response variable to be used. |
dispersion |
The value of the dispersion parameter. |
known |
A character vector indicating whether the dispersion parameter is to be
assumed known. The value |
The dispersion_initial
list is later passed to spglm()
or spgautor()
.
The variance function of an individual (given
)
for each generalized linear model family is given below:
family:
poisson:
nbinomial:
binomial:
beta:
Gamma:
inverse.gaussian:
The parameter is a dispersion parameter that influences
.
For the
poisson
and binomial
families, is always
one. Note that this inverse Gaussian parameterization is different than a
standard inverse Gaussian parameterization, which has variance
.
Setting
yields our parameterization, which is
preferred for computational stability. Also note that the dispersion parameter
is often defined in the literature as
, where
is the variance
function of the mean. We do not use this parameterization, which is important
to recognize while interpreting dispersion parameter estimates using
spglm()
or spgautor()
.
For more on generalized linear model constructions, see McCullagh and
Nelder (1989).
A list with two elements: initial
and is_known
.
initial
is a named numeric vector indicating the dispersion parameters
with a specified initial and/or known value. is_known
is a named
numeric vector indicating whether the dispersion parameters in
initial
is known or not. The class of the list
matches the value given to the family
argument.
McCullagh P. and Nelder, J. A. (1989) Generalized Linear Models. London: Chapman and Hall.
# known dispersion value 1 dispersion_initial("nbinomial", dispersion = 1, known = "dispersion")
# known dispersion value 1 dispersion_initial("nbinomial", dispersion = 1, known = "dispersion")
Create a dispersion parameter object for use with other functions.
dispersion_params(family, dispersion)
dispersion_params(family, dispersion)
family |
The generalized linear model family describing the distribution
of the response variable to be used. |
dispersion |
The value of the dispersion parameter. |
The variance function of an individual (given
)
for each generalized linear model family is given below:
family:
poisson:
nbinomial:
binomial:
beta:
Gamma:
inverse.gaussian:
The parameter is a dispersion parameter that influences
.
For the
poisson
and binomial
families, is always
one. Note that this inverse Gaussian parameterization is different than a
standard inverse Gaussian parameterization, which has variance
.
Setting
yields our parameterization, which is
preferred for computational stability. Also note that the dispersion parameter
is often defined in the literature as
, where
is the variance
function of the mean. We do not use this parameterization, which is important
to recognize while interpreting dispersion parameter estimates using
spglm()
or spgautor()
.
For more on generalized linear model constructions, see McCullagh and
Nelder (1989).
A named numeric vector with class family
containing the dispersion.
McCullagh P. and Nelder, J. A. (1989) Generalized Linear Models. London: Chapman and Hall.
dispersion_params("beta", dispersion = 1)
dispersion_params("beta", dispersion = 1)
Compute the empirical semivariogram for varying bin sizes and cutoff values.
esv( formula, data, xcoord, ycoord, cloud = FALSE, bins = 15, cutoff, dist_matrix, partition_factor ) ## S3 method for class 'esv' plot(x, ...)
esv( formula, data, xcoord, ycoord, cloud = FALSE, bins = 15, cutoff, dist_matrix, partition_factor ) ## S3 method for class 'esv' plot(x, ...)
formula |
A formula describing the fixed effect structure. |
data |
A data frame or |
xcoord |
Name of the variable in |
ycoord |
Name of the variable in |
cloud |
A logical indicating whether the empirical semivariogram should
be summarized by distance class or not. When |
bins |
The number of equally spaced bins. The default is 15. Ignored if
|
cutoff |
The maximum distance considered. The default is half the diagonal of the bounding box from the coordinates. |
dist_matrix |
A distance matrix to be used instead of providing coordinate names. |
partition_factor |
An optional formula specifying the partition factor. If specified, semivariances are only computed for observations sharing the same level of the partition factor. |
x |
An object from |
... |
Other arguments passed to other methods. |
The empirical semivariogram is a tool used to visualize and model
spatial dependence by estimating the semivariance of a process at varying distances.
For a constant-mean process, the
semivariance at distance is denoted
and defined as
. Under second-order stationarity,
, where
is the covariance function at distance
h
. Typically the residuals from an ordinary
least squares fit defined by formula
are second-order stationary with
mean zero. These residuals are used to compute the empirical semivariogram.
At a distance h
, the empirical semivariance is
, where
is the number of (unique)
pairs in the set of observations whose distance separation is
h
and
r1
and r2
are residuals corresponding to observations whose
distance separation is h
. In spmodel, these distance bins actually
contain observations whose distance separation is h +- c
,
where c
is a constant determined implicitly by bins
. Typically,
only observations whose distance separation is below some cutoff are used
to compute the empirical semivariogram (this cutoff is determined by cutoff
).
When using splm()
with estmethod
as "sv-wls"
, the empirical
semivariogram is calculated internally and used to estimate spatial
covariance parameters.
If cloud = FALSE
, a tibble (data.frame) with distance bins
(bins
), the average distance (dist
), the average semivariance (gamma
), and the
number of (unique) pairs (np
). If cloud = TRUE
, a tibble
(data.frame) with distance (dist
) and semivariance (gamma
)
for each unique pair.
esv(sulfate ~ 1, sulfate) plot(esv(sulfate ~ 1, sulfate))
esv(sulfate ~ 1, sulfate) plot(esv(sulfate ~ 1, sulfate))
Extract fitted values from fitted model objects. fitted.values
is an alias.
## S3 method for class 'splm' fitted(object, type = "response", ...) ## S3 method for class 'splm' fitted.values(object, type = "response", ...) ## S3 method for class 'spautor' fitted(object, type = "response", ...) ## S3 method for class 'spautor' fitted.values(object, type = "response", ...) ## S3 method for class 'spglm' fitted(object, type = "response", ...) ## S3 method for class 'spglm' fitted.values(object, type = "response", ...) ## S3 method for class 'spgautor' fitted(object, type = "response", ...) ## S3 method for class 'spgautor' fitted.values(object, type = "response", ...)
## S3 method for class 'splm' fitted(object, type = "response", ...) ## S3 method for class 'splm' fitted.values(object, type = "response", ...) ## S3 method for class 'spautor' fitted(object, type = "response", ...) ## S3 method for class 'spautor' fitted.values(object, type = "response", ...) ## S3 method for class 'spglm' fitted(object, type = "response", ...) ## S3 method for class 'spglm' fitted.values(object, type = "response", ...) ## S3 method for class 'spgautor' fitted(object, type = "response", ...) ## S3 method for class 'spgautor' fitted.values(object, type = "response", ...)
object |
A fitted model object from |
type |
|
... |
Other arguments. Not used (needed for generic consistency). |
When type
is "response"
, the fitted values
for each observation are the standard fitted values .
When
type
is "spcov"
the fitted values for each observation
are (generally) the best linear unbiased predictors of the spatial dependent and spatial
independent random error. When type
is "randcov"
, the fitted
values for each level of each random effect are (generally) the best linear unbiased
predictors of the corresponding random effect. The fitted values for type
"spcov"
and "randcov"
can generally be used to check assumptions
for each component of the fitted model object (e.g., check a Gaussian assumption).
The fitted values according to type
.
spmod <- splm(z ~ water + tarp, data = caribou, spcov_type = "exponential", xcoord = x, ycoord = y ) fitted(spmod) fitted.values(spmod) fitted(spmod, type = "spcov")
spmod <- splm(z ~ water + tarp, data = caribou, spcov_type = "exponential", xcoord = x, ycoord = y ) fitted(spmod) fitted.values(spmod) fitted(spmod, type = "spcov")
Return formula used by a fitted model object.
## S3 method for class 'splm' formula(x, ...) ## S3 method for class 'spautor' formula(x, ...) ## S3 method for class 'spglm' formula(x, ...) ## S3 method for class 'spgautor' formula(x, ...)
## S3 method for class 'splm' formula(x, ...) ## S3 method for class 'spautor' formula(x, ...) ## S3 method for class 'spglm' formula(x, ...) ## S3 method for class 'spgautor' formula(x, ...)
x |
A fitted model object from |
... |
Other arguments. Not used (needed for generic consistency). |
The formula used by a fitted model object.
spmod <- splm(z ~ water + tarp, data = caribou, spcov_type = "exponential", xcoord = x, ycoord = y ) formula(spmod)
spmod <- splm(z ~ water + tarp, data = caribou, spcov_type = "exponential", xcoord = x, ycoord = y ) formula(spmod)
Returns a row of model
summaries from a fitted model object. Glance returns the same number of columns for all models
and estimation methods. If a particular summary is undefined for a model
or estimation method (e.g., likelihood statistics for estimation methods
"sv-wls"
or "sv-cl"
of splm()
objects), NA
is returned for that summary.
## S3 method for class 'splm' glance(x, ...) ## S3 method for class 'spautor' glance(x, ...) ## S3 method for class 'spglm' glance(x, ...) ## S3 method for class 'spgautor' glance(x, ...)
## S3 method for class 'splm' glance(x, ...) ## S3 method for class 'spautor' glance(x, ...) ## S3 method for class 'spglm' glance(x, ...) ## S3 method for class 'spgautor' glance(x, ...)
x |
A fitted model object from |
... |
Other arguments. Not used (needed for generic consistency). |
A single-row tibble with columns
n
The sample size.
p
The number of fixed effects.
npar
The number of estimated covariance parameters.
value
The optimized value of the fitting function.
AIC
The AIC.
AICc
The AICc.
BIC
The BIC.
logLik
The log-likelihood.
deviance
The deviance.
pseudo.r.squared
The pseudo r-squared.
stats::AIC()
AICc()
stats::BIC()
logLik.spmodel()
deviance.spmodel()
pseudoR2()
tidy.spmodel()
augment.spmodel()
spmod <- splm(z ~ water + tarp, data = caribou, spcov_type = "exponential", xcoord = x, ycoord = y ) glance(spmod)
spmod <- splm(z ~ water + tarp, data = caribou, spcov_type = "exponential", xcoord = x, ycoord = y ) glance(spmod)
glances()
repeatedly calls glance()
on several
fitted model objects and binds the output together, sorted by a column of interest.
glances(object, ...) ## S3 method for class 'splm' glances(object, ..., sort_by = "AICc", decreasing = FALSE, warning = TRUE) ## S3 method for class 'spautor' glances(object, ..., sort_by = "AICc", decreasing = FALSE, warning = TRUE) ## S3 method for class 'splm_list' glances(object, ..., sort_by = "AICc", decreasing = FALSE, warning = TRUE) ## S3 method for class 'spautor_list' glances(object, ..., sort_by = "AICc", decreasing = FALSE, warning = TRUE) ## S3 method for class 'spglm' glances(object, ..., sort_by = "AICc", decreasing = FALSE, warning = TRUE) ## S3 method for class 'spgautor' glances(object, ..., sort_by = "AICc", decreasing = FALSE, warning = TRUE) ## S3 method for class 'spglm_list' glances(object, ..., sort_by = "AICc", decreasing = FALSE, warning = TRUE) ## S3 method for class 'spgautor_list' glances(object, ..., sort_by = "AICc", decreasing = FALSE, warning = TRUE)
glances(object, ...) ## S3 method for class 'splm' glances(object, ..., sort_by = "AICc", decreasing = FALSE, warning = TRUE) ## S3 method for class 'spautor' glances(object, ..., sort_by = "AICc", decreasing = FALSE, warning = TRUE) ## S3 method for class 'splm_list' glances(object, ..., sort_by = "AICc", decreasing = FALSE, warning = TRUE) ## S3 method for class 'spautor_list' glances(object, ..., sort_by = "AICc", decreasing = FALSE, warning = TRUE) ## S3 method for class 'spglm' glances(object, ..., sort_by = "AICc", decreasing = FALSE, warning = TRUE) ## S3 method for class 'spgautor' glances(object, ..., sort_by = "AICc", decreasing = FALSE, warning = TRUE) ## S3 method for class 'spglm_list' glances(object, ..., sort_by = "AICc", decreasing = FALSE, warning = TRUE) ## S3 method for class 'spgautor_list' glances(object, ..., sort_by = "AICc", decreasing = FALSE, warning = TRUE)
object |
A fitted model object from |
... |
Additional fitted model objects. Ignored
if |
sort_by |
Sort by a |
decreasing |
Whether |
warning |
Whether a warning is displayed when model comparisons violate certain rules.
The default is |
A tibble where each row represents the output of glance()
for
each fitted model object.
lmod <- splm(z ~ water + tarp, data = caribou, spcov_type = "none" ) spmod <- splm(z ~ water + tarp, data = caribou, spcov_type = "exponential", xcoord = x, ycoord = y ) glances(lmod, spmod) glances(lmod, spmod, sort_by = "logLik", decreasing = TRUE)
lmod <- splm(z ~ water + tarp, data = caribou, spcov_type = "none" ) spmod <- splm(z ~ water + tarp, data = caribou, spcov_type = "exponential", xcoord = x, ycoord = y ) glances(lmod, spmod) glances(lmod, spmod, sort_by = "logLik", decreasing = TRUE)
Compute the leverage (hat) value for each observation from a fitted model object.
## S3 method for class 'splm' hatvalues(model, ...) ## S3 method for class 'spautor' hatvalues(model, ...) ## S3 method for class 'spglm' hatvalues(model, ...) ## S3 method for class 'spgautor' hatvalues(model, ...)
## S3 method for class 'splm' hatvalues(model, ...) ## S3 method for class 'spautor' hatvalues(model, ...) ## S3 method for class 'spglm' hatvalues(model, ...) ## S3 method for class 'spgautor' hatvalues(model, ...)
model |
A fitted model object from |
... |
Other arguments. Not used (needed for generic consistency). |
Leverage values measure how far an observation's explanatory variables are relative to the average of the explanatory variables. In other words, observations with high leverage are typically considered to have an extreme or unusual combination of explanatory variables. Leverage values are the diagonal of the hat (projection) matrix. The larger the hat value, the larger the leverage.
A vector of leverage (hat) values for each observation from the fitted model object.
augment.spmodel()
cooks.distance.spmodel()
influence.spmodel()
residuals.spmodel()
spmod <- splm(z ~ water + tarp, data = caribou, spcov_type = "exponential", xcoord = x, ycoord = y ) hatvalues(spmod)
spmod <- splm(z ~ water + tarp, data = caribou, spcov_type = "exponential", xcoord = x, ycoord = y ) hatvalues(spmod)
Provides basic quantities which are used in forming a wide variety of diagnostics for checking the quality of fitted model objects.
## S3 method for class 'splm' influence(model, ...) ## S3 method for class 'spautor' influence(model, ...) ## S3 method for class 'spglm' influence(model, ...) ## S3 method for class 'spgautor' influence(model, ...)
## S3 method for class 'splm' influence(model, ...) ## S3 method for class 'spautor' influence(model, ...) ## S3 method for class 'spglm' influence(model, ...) ## S3 method for class 'spgautor' influence(model, ...)
model |
A fitted model object from |
... |
Other arguments. Not used (needed for generic consistency). |
This function calls residuals.spmodel()
, hatvalues.spmodel()
,
and cooks.distance.spmodel()
and puts the results into a tibble. It is
primarily used when calling augment.spmodel()
.
A tibble with residuals (.resid
), leverage values (.hat
),
cook's distance (.cooksd
), and standardized residuals (.std.resid
).
augment.spmodel()
cooks.distance.spmodel()
hatvalues.spmodel()
residuals.spmodel()
spmod <- splm(z ~ water + tarp, data = caribou, spcov_type = "exponential", xcoord = x, ycoord = y ) influence(spmod)
spmod <- splm(z ~ water + tarp, data = caribou, spcov_type = "exponential", xcoord = x, ycoord = y ) influence(spmod)
Find a suitable set of labels from a fitted model object.
## S3 method for class 'splm' labels(object, ...) ## S3 method for class 'spautor' labels(object, ...) ## S3 method for class 'spglm' labels(object, ...) ## S3 method for class 'spgautor' labels(object, ...)
## S3 method for class 'splm' labels(object, ...) ## S3 method for class 'spautor' labels(object, ...) ## S3 method for class 'spglm' labels(object, ...) ## S3 method for class 'spgautor' labels(object, ...)
object |
A fitted model object from |
... |
Other arguments. Not used (needed for generic consistency). |
A character vector containing the terms used for the fixed effects from a fitted model object.
spmod <- splm(z ~ water + tarp, data = caribou, spcov_type = "exponential", xcoord = x, ycoord = y ) labels(spmod)
spmod <- splm(z ~ water + tarp, data = caribou, spcov_type = "exponential", xcoord = x, ycoord = y ) labels(spmod)
Lake data collected as part of the United States Environmental Protection Agency's 2012 and 2017 National Lakes Assessment and LakeCat.
lake
lake
An sf
object with 102 rows and 9 columns:
comid: A common identifier from NHDPlusV2.
log_cond: The natural logarithm of lake conductivity.
state: The US state: One of Arizona (AZ), Colorado (CO), Nevada (NV), Utah (UT).
temp: Lake catchment 30-year average temperature (in degrees Celsius).
precip: Lake watershed 30-year average precipitation (in centimeters).
elev: Lake elevation (in meters).
origin: Lake origin (human-made or natural).
year: A factor representing year (2012 or 2017).
geometry: POINT
geometry representing coordinates in a NAD83
projection (EPSG: 5070). Distances between points are in meters.
Lake prediction data collected as part of the United States Environmental Protection Agency's 2012 and 2017 National Lakes Assessment and LakeCat.
lake_preds
lake_preds
An sf
object with 10 rows and 8 columns:
comid: A common identifier from NHDPlusV2.
state: The US state: One of Arizona (AZ), Nevada (NV), Utah (UT).
temp: Lake catchment 30-year average temperature (in degrees Celsius).
precip: Lake watershed 30-year average precipitation (in centimeters).
elev: Lake elevation (in meters).
origin: Lake origin (human-made or natural).
year: A factor representing year (2012 or 2017).
geometry: POINT
geometry representing coordinates in a NAD83
projection (EPSG: 5070). Distances between points are in meters.
Find the log-likelihood of a fitted model when estmethod
is "ml"
or "reml"
.
## S3 method for class 'splm' logLik(object, ...) ## S3 method for class 'spautor' logLik(object, ...) ## S3 method for class 'spglm' logLik(object, ...) ## S3 method for class 'spgautor' logLik(object, ...)
## S3 method for class 'splm' logLik(object, ...) ## S3 method for class 'spautor' logLik(object, ...) ## S3 method for class 'spglm' logLik(object, ...) ## S3 method for class 'spgautor' logLik(object, ...)
object |
A fitted model object from |
... |
Other arguments. Not used (needed for generic consistency). |
The log-likelihood.
spmod <- splm(z ~ water + tarp, data = caribou, spcov_type = "exponential", xcoord = x, ycoord = y ) logLik(spmod)
spmod <- splm(z ~ water + tarp, data = caribou, spcov_type = "exponential", xcoord = x, ycoord = y ) logLik(spmod)
Perform leave-one-out cross validation with options for computationally efficient approximations for big data.
loocv(object, ...) ## S3 method for class 'splm' loocv(object, cv_predict = FALSE, se.fit = FALSE, local, ...) ## S3 method for class 'spautor' loocv(object, cv_predict = FALSE, se.fit = FALSE, local, ...) ## S3 method for class 'spglm' loocv( object, cv_predict = FALSE, type = c("link", "response"), se.fit = FALSE, local, ... ) ## S3 method for class 'spgautor' loocv( object, cv_predict = FALSE, type = c("link", "response"), se.fit = FALSE, local, ... )
loocv(object, ...) ## S3 method for class 'splm' loocv(object, cv_predict = FALSE, se.fit = FALSE, local, ...) ## S3 method for class 'spautor' loocv(object, cv_predict = FALSE, se.fit = FALSE, local, ...) ## S3 method for class 'spglm' loocv( object, cv_predict = FALSE, type = c("link", "response"), se.fit = FALSE, local, ... ) ## S3 method for class 'spgautor' loocv( object, cv_predict = FALSE, type = c("link", "response"), se.fit = FALSE, local, ... )
object |
A fitted model object from |
... |
Other arguments. Not used (needed for generic consistency). |
cv_predict |
A logical indicating whether the leave-one-out fitted values
should be returned. Defaults to |
se.fit |
A logical indicating whether the leave-one-out
prediction standard errors should be returned. Defaults to |
local |
A list or logical. If a list, specific list elements described
in |
type |
The scale ( |
Each observation is held-out from the data set and the remaining data
are used to make a prediction for the held-out observation. This is compared
to the true value of the observation and several fit statistics are computed:
bias, mean-squared-prediction error (MSPE), root-mean-squared-prediction
error (RMSPE), and the squared correlation (cor2) between the observed data
and leave-one-out predictions (regarded as a prediction version of r-squared
appropriate for comparing across spatial and nonspatial models). Generally,
bias should be near zero for well-fitting models. The lower the MSPE and RMSPE,
the better the model fit (according to the leave-out-out criterion).
The higher the cor2, the better the model fit (according to the leave-out-out
criterion). cor2 is not returned when object
was fit using
spglm()
or spgautor()
, as it is only applicable here for linear models.
If cv_predict = FALSE
and se.fit = FALSE
,
a fit statistics tibble (with bias, MSPE, RMSPE, and cor2; see Details).
If cv_predict = TRUE
or se.fit = TRUE
,
a list with elements: stats
, a fit statistics tibble
(with bias, MSPE, RMSPE, and cor2; see Details); cv_predict
, a numeric vector
with leave-one-out predictions for each observation (if cv_predict = TRUE
);
and se.fit
, a numeric vector with leave-one-out prediction standard
errors for each observation (if se.fit = TRUE
).
spmod <- splm(z ~ water + tarp, data = caribou, spcov_type = "exponential", xcoord = x, ycoord = y ) loocv(spmod) loocv(spmod, cv_predict = TRUE, se.fit = TRUE)
spmod <- splm(z ~ water + tarp, data = caribou, spcov_type = "exponential", xcoord = x, ycoord = y ) loocv(spmod) loocv(spmod, cv_predict = TRUE, se.fit = TRUE)
Extract the model frame from a fitted model object.
## S3 method for class 'splm' model.frame(formula, ...) ## S3 method for class 'spautor' model.frame(formula, ...) ## S3 method for class 'spglm' model.frame(formula, ...) ## S3 method for class 'spgautor' model.frame(formula, ...)
## S3 method for class 'splm' model.frame(formula, ...) ## S3 method for class 'spautor' model.frame(formula, ...) ## S3 method for class 'spglm' model.frame(formula, ...) ## S3 method for class 'spgautor' model.frame(formula, ...)
formula |
A fitted model object from |
... |
Other arguments. Not used (needed for generic consistency). |
A model frame that contains the variables used by the formula for the fitted model object.
spmod <- splm(z ~ water + tarp, data = caribou, spcov_type = "exponential", xcoord = x, ycoord = y ) model.frame(spmod)
spmod <- splm(z ~ water + tarp, data = caribou, spcov_type = "exponential", xcoord = x, ycoord = y ) model.frame(spmod)
Extract the model matrix (X) from a fitted model object.
## S3 method for class 'splm' model.matrix(object, ...) ## S3 method for class 'spautor' model.matrix(object, ...) ## S3 method for class 'spglm' model.matrix(object, ...) ## S3 method for class 'spgautor' model.matrix(object, ...)
## S3 method for class 'splm' model.matrix(object, ...) ## S3 method for class 'spautor' model.matrix(object, ...) ## S3 method for class 'spglm' model.matrix(object, ...) ## S3 method for class 'spgautor' model.matrix(object, ...)
object |
A fitted model object from |
... |
Other arguments. Not used (needed for generic consistency). |
The model matrix (of the fixed effects), whose rows represent observations and whose columns represent explanatory variables corresponding to each fixed effect.
spmod <- splm(z ~ water + tarp, data = caribou, spcov_type = "exponential", xcoord = x, ycoord = y ) model.matrix(spmod)
spmod <- splm(z ~ water + tarp, data = caribou, spcov_type = "exponential", xcoord = x, ycoord = y ) model.matrix(spmod)
Moose counts and presence in Alaska, USA.
moose
moose
An sf
object with 218 rows and 5 columns.
elev: The elevation.
strat: A factor representing strata (used for sampling). Can take values L
and M
.
count: The count (number) of moose observed.
presence: A binary factor representing whether no moose were observed (value 0
) or at least one moose was observed
(value 1
).
geometry: POINT
geometry representing coordinates in an Alaska
Albers projection (EPSG: 3338). Distances between points are in meters.
Alaska Department of Fish and Game, Division of Wildlife Conservation has released this data set under the CC0 license.
Locations at which to predict moose counts and presence in Alaska, USA.
moose_preds
moose_preds
An sf
object with 100 rows and 3 columns.
elev: The elevation.
strat: A factor representing strata (used for sampling). Can take values L
and M
.
geometry: POINT
geometry representing coordinates in an Alaska
Albers projection (EPSG: 3338). Distances between points are in meters.
Alaska Department of Fish and Game, Division of Wildlife Conservation has released this data set under the CC0 license.
Heavy metals in mosses near a mining road in Alaska, USA.
moss
moss
An sf
object with 365 rows and 10 columns:
sample: A factor with a sample identifier. Some samples were replicated in the field or laboratory. As a result, there are 318 unique sample identifiers.
field_dup: A factor representing field duplicate. Takes values 1
and 2
.
lab_rep: A factor representing laboratory replicate. Takes values 1
and 2
.
year: A factor representing year. Takes values 2001
and 2006
.
sideroad: A factor representing direction relative to the haul road.
Takes values N
(north of the haul road) and S
(south
of the haul road).
log_dist2road: The log of distance (in meters) to the haul road.
log_Zn: The log of zinc concentration in moss tissue (mg/kg).
geometry: POINT
geometry representing coordinates in an Alaska
Albers projection (EPSG: 3338). Distances between points are in meters.
Data were obtained from Peter Neitlich and Linda Hasselbach of the National Park Service. Data were used in the publications listed in References.
Neitlich, P.N., Ver Hoef, J.M., Berryman, S. D., Mines, A., Geiser, L.H., Hasselbach, L.M., and Shiel, A. E. 2017. Trends in Spatial Patterns of Heavy Metal Deposition on National Park Service Lands Along the Red Dog Mine Haul Road, Alaska, 2001-2006. PLOS ONE 12(5):e0177936 DOI:10.1371/journal.pone.0177936
Hasselbach, L., Ver Hoef, J.M., Ford, J., Neitlich, P., Berryman, S., Wolk B. and Bohle, T. 2005. Spatial Patterns of Cadmium, Lead and Zinc Deposition on National Park Service Lands in the Vicinity of Red Dog Mine, Alaska. Science of the Total Environment 348: 211-230.
Plot fitted model diagnostics such as residuals vs fitted values, quantile-quantile, scale-location, Cook's distance, residuals vs leverage, Cook's distance vs leverage, a fitted spatial covariance function, and a fitted anisotropic level curve of equal correlation.
## S3 method for class 'splm' plot(x, which, ...) ## S3 method for class 'spautor' plot(x, which, ...) ## S3 method for class 'spglm' plot(x, which, ...) ## S3 method for class 'spgautor' plot(x, which, ...)
## S3 method for class 'splm' plot(x, which, ...) ## S3 method for class 'spautor' plot(x, which, ...) ## S3 method for class 'spglm' plot(x, which, ...) ## S3 method for class 'spgautor' plot(x, which, ...)
x |
A fitted model object from |
which |
An integer vector taking on values between 1 and 7, which indicates
the plots to return. Available plots are described in Details. If |
... |
Other arguments passed to other methods. |
For all fitted model objects,, the values of which
make the
corresponding plot:
1: Standardized residuals vs fitted values (of the response)
2: Normal quantile-quantile plot of standardized residuals
3: Scale-location plot of standardized residuals
4: Cook's distance
5: Standardized residuals vs leverage
6: Cook's distance vs leverage
For splm()
and spglm()
fitted model objects, there are two additional values of which
:
7: Fitted spatial covariance function vs distance
8: Fitted anisotropic level curve of equal correlation
No return value. Function called for plotting side effects.
spmod <- splm(z ~ water + tarp, data = caribou, spcov_type = "exponential", xcoord = x, ycoord = y ) plot(spmod) plot(spmod, which = c(1, 2, 4, 6))
spmod <- splm(z ~ water + tarp, data = caribou, spcov_type = "exponential", xcoord = x, ycoord = y ) plot(spmod) plot(spmod, which = c(1, 2, 4, 6))
Predicted values and intervals based on a fitted model object.
## S3 method for class 'splm' predict( object, newdata, se.fit = FALSE, scale = NULL, df = Inf, interval = c("none", "confidence", "prediction"), level = 0.95, type = c("response", "terms"), local, terms = NULL, na.action = na.fail, ... ) ## S3 method for class 'spautor' predict( object, newdata, se.fit = FALSE, scale = NULL, df = Inf, interval = c("none", "confidence", "prediction"), level = 0.95, type = c("response", "terms"), local, terms = NULL, na.action = na.fail, ... ) ## S3 method for class 'splm_list' predict( object, newdata, se.fit = FALSE, scale = NULL, df = Inf, interval = c("none", "confidence", "prediction"), level = 0.95, type = c("response", "terms"), local, terms = NULL, na.action = na.fail, ... ) ## S3 method for class 'spautor_list' predict( object, newdata, se.fit = FALSE, scale = NULL, df = Inf, interval = c("none", "confidence", "prediction"), level = 0.95, type = c("response", "terms"), local, terms = NULL, na.action = na.fail, ... ) ## S3 method for class 'splmRF' predict(object, newdata, local, ...) ## S3 method for class 'spautorRF' predict(object, newdata, local, ...) ## S3 method for class 'splmRF_list' predict(object, newdata, local, ...) ## S3 method for class 'spautorRF_list' predict(object, newdata, local, ...) ## S3 method for class 'spglm' predict( object, newdata, type = c("link", "response", "terms"), se.fit = FALSE, interval = c("none", "confidence", "prediction"), level = 0.95, dispersion = NULL, terms = NULL, local, var_correct = TRUE, newdata_size, na.action = na.fail, ... ) ## S3 method for class 'spgautor' predict( object, newdata, type = c("link", "response", "terms"), se.fit = FALSE, interval = c("none", "confidence", "prediction"), level = 0.95, dispersion = NULL, terms = NULL, local, var_correct = TRUE, newdata_size, na.action = na.fail, ... ) ## S3 method for class 'spglm_list' predict( object, newdata, type = c("link", "response", "terms"), se.fit = FALSE, interval = c("none", "confidence", "prediction"), level = 0.95, dispersion = NULL, terms = NULL, local, var_correct = TRUE, newdata_size, na.action = na.fail, ... ) ## S3 method for class 'spgautor_list' predict( object, newdata, type = c("link", "response", "terms"), se.fit = FALSE, interval = c("none", "confidence", "prediction"), level = 0.95, dispersion = NULL, terms = NULL, local, var_correct = TRUE, newdata_size, na.action = na.fail, ... )
## S3 method for class 'splm' predict( object, newdata, se.fit = FALSE, scale = NULL, df = Inf, interval = c("none", "confidence", "prediction"), level = 0.95, type = c("response", "terms"), local, terms = NULL, na.action = na.fail, ... ) ## S3 method for class 'spautor' predict( object, newdata, se.fit = FALSE, scale = NULL, df = Inf, interval = c("none", "confidence", "prediction"), level = 0.95, type = c("response", "terms"), local, terms = NULL, na.action = na.fail, ... ) ## S3 method for class 'splm_list' predict( object, newdata, se.fit = FALSE, scale = NULL, df = Inf, interval = c("none", "confidence", "prediction"), level = 0.95, type = c("response", "terms"), local, terms = NULL, na.action = na.fail, ... ) ## S3 method for class 'spautor_list' predict( object, newdata, se.fit = FALSE, scale = NULL, df = Inf, interval = c("none", "confidence", "prediction"), level = 0.95, type = c("response", "terms"), local, terms = NULL, na.action = na.fail, ... ) ## S3 method for class 'splmRF' predict(object, newdata, local, ...) ## S3 method for class 'spautorRF' predict(object, newdata, local, ...) ## S3 method for class 'splmRF_list' predict(object, newdata, local, ...) ## S3 method for class 'spautorRF_list' predict(object, newdata, local, ...) ## S3 method for class 'spglm' predict( object, newdata, type = c("link", "response", "terms"), se.fit = FALSE, interval = c("none", "confidence", "prediction"), level = 0.95, dispersion = NULL, terms = NULL, local, var_correct = TRUE, newdata_size, na.action = na.fail, ... ) ## S3 method for class 'spgautor' predict( object, newdata, type = c("link", "response", "terms"), se.fit = FALSE, interval = c("none", "confidence", "prediction"), level = 0.95, dispersion = NULL, terms = NULL, local, var_correct = TRUE, newdata_size, na.action = na.fail, ... ) ## S3 method for class 'spglm_list' predict( object, newdata, type = c("link", "response", "terms"), se.fit = FALSE, interval = c("none", "confidence", "prediction"), level = 0.95, dispersion = NULL, terms = NULL, local, var_correct = TRUE, newdata_size, na.action = na.fail, ... ) ## S3 method for class 'spgautor_list' predict( object, newdata, type = c("link", "response", "terms"), se.fit = FALSE, interval = c("none", "confidence", "prediction"), level = 0.95, dispersion = NULL, terms = NULL, local, var_correct = TRUE, newdata_size, na.action = na.fail, ... )
object |
A fitted model object. |
newdata |
A data frame or |
se.fit |
A logical indicating if standard errors are returned.
The default is |
scale |
A numeric constant by which to scale the regular standard errors and intervals.
Similar to but slightly different than |
df |
Degrees of freedom to use for confidence or prediction intervals
(ignored if |
interval |
Type of interval calculation. The default is |
level |
Tolerance/confidence level. The default is |
type |
The prediction type, either on the response scale, link scale (only for
|
local |
A optional logical or list controlling the big data approximation. If omitted,
When |
terms |
If |
na.action |
Missing ( |
... |
Other arguments. Only used for models fit using |
dispersion |
The dispersion of assumed when computing the prediction standard errors
for |
var_correct |
A logical indicating whether to return the corrected prediction
variances when predicting via models fit using |
newdata_size |
The |
For splm
and spautor
objects, the (empirical) best linear unbiased predictions (i.e., Kriging
predictions) at each site are returned when interval
is "none"
or "prediction"
alongside standard errors. Prediction intervals
are also returned if interval
is "prediction"
. When
interval
is "confidence"
, the estimated mean is returned
alongside standard errors and confidence intervals for the mean. For splm_list
and spautor_list
objects, predictions and associated intervals and standard errors are returned
for each list element.
For splmRF
or spautorRF
objects, random forest spatial residual
model predictions are computed by combining the random forest prediction with
the (empirical) best linear unbiased prediction for the residual. Fox et al. (2020)
call this approach random forest regression Kriging. For splmRF_list
or spautorRF
objects,
predictions are returned for each list element.
For splm
or spautor
objects, if se.fit
is FALSE
, predict()
returns
a vector of predictions or a matrix of predictions with column names
fit
, lwr
, and upr
if interval
is "confidence"
or "prediction"
. If se.fit
is TRUE
, a list with the following components is returned:
fit
: vector or matrix as above
se.fit
: standard error of each fit
For splm_list
or spautor_list
objects, a list that contains relevant quantities for each
list element.
For splmRF
or spautorRF
objects, a vector of predictions. For splmRF_list
or spautorRF_list
objects, a list that contains relevant quantities for each list element.
Fox, E.W., Ver Hoef, J. M., & Olsen, A. R. (2020). Comparing spatial regression to random forests for large environmental data sets. PloS one, 15(3), e0229509.
spmod <- splm(sulfate ~ 1, data = sulfate, spcov_type = "exponential", xcoord = x, ycoord = y ) predict(spmod, sulfate_preds) predict(spmod, sulfate_preds, interval = "prediction") augment(spmod, newdata = sulfate_preds, interval = "prediction") sulfate$var <- rnorm(NROW(sulfate)) # add noise variable sulfate_preds$var <- rnorm(NROW(sulfate_preds)) # add noise variable sprfmod <- splmRF(sulfate ~ var, data = sulfate, spcov_type = "exponential") predict(sprfmod, sulfate_preds) spgmod <- spglm(presence ~ elev * strat, family = "binomial", data = moose, spcov_type = "exponential" ) predict(spgmod, moose_preds) predict(spgmod, moose_preds, interval = "prediction") augment(spgmod, newdata = moose_preds, interval = "prediction")
spmod <- splm(sulfate ~ 1, data = sulfate, spcov_type = "exponential", xcoord = x, ycoord = y ) predict(spmod, sulfate_preds) predict(spmod, sulfate_preds, interval = "prediction") augment(spmod, newdata = sulfate_preds, interval = "prediction") sulfate$var <- rnorm(NROW(sulfate)) # add noise variable sulfate_preds$var <- rnorm(NROW(sulfate_preds)) # add noise variable sprfmod <- splmRF(sulfate ~ var, data = sulfate, spcov_type = "exponential") predict(sprfmod, sulfate_preds) spgmod <- spglm(presence ~ elev * strat, family = "binomial", data = moose, spcov_type = "exponential" ) predict(spgmod, moose_preds) predict(spgmod, moose_preds, interval = "prediction") augment(spgmod, newdata = moose_preds, interval = "prediction")
Print fitted model objects and summaries.
## S3 method for class 'splm' print(x, digits = max(3L, getOption("digits") - 3L), ...) ## S3 method for class 'spautor' print(x, digits = max(3L, getOption("digits") - 3L), ...) ## S3 method for class 'summary.splm' print( x, digits = max(3L, getOption("digits") - 3L), signif.stars = getOption("show.signif.stars"), ... ) ## S3 method for class 'summary.spautor' print( x, digits = max(3L, getOption("digits") - 3L), signif.stars = getOption("show.signif.stars"), ... ) ## S3 method for class 'anova.splm' print( x, digits = max(getOption("digits") - 2L, 3L), signif.stars = getOption("show.signif.stars"), ... ) ## S3 method for class 'anova.spautor' print( x, digits = max(getOption("digits") - 2L, 3L), signif.stars = getOption("show.signif.stars"), ... ) ## S3 method for class 'spglm' print(x, digits = max(3L, getOption("digits") - 3L), ...) ## S3 method for class 'spgautor' print(x, digits = max(3L, getOption("digits") - 3L), ...) ## S3 method for class 'summary.spglm' print( x, digits = max(3L, getOption("digits") - 3L), signif.stars = getOption("show.signif.stars"), ... ) ## S3 method for class 'summary.spgautor' print( x, digits = max(3L, getOption("digits") - 3L), signif.stars = getOption("show.signif.stars"), ... ) ## S3 method for class 'anova.spglm' print( x, digits = max(getOption("digits") - 2L, 3L), signif.stars = getOption("show.signif.stars"), ... ) ## S3 method for class 'anova.spgautor' print( x, digits = max(getOption("digits") - 2L, 3L), signif.stars = getOption("show.signif.stars"), ... )
## S3 method for class 'splm' print(x, digits = max(3L, getOption("digits") - 3L), ...) ## S3 method for class 'spautor' print(x, digits = max(3L, getOption("digits") - 3L), ...) ## S3 method for class 'summary.splm' print( x, digits = max(3L, getOption("digits") - 3L), signif.stars = getOption("show.signif.stars"), ... ) ## S3 method for class 'summary.spautor' print( x, digits = max(3L, getOption("digits") - 3L), signif.stars = getOption("show.signif.stars"), ... ) ## S3 method for class 'anova.splm' print( x, digits = max(getOption("digits") - 2L, 3L), signif.stars = getOption("show.signif.stars"), ... ) ## S3 method for class 'anova.spautor' print( x, digits = max(getOption("digits") - 2L, 3L), signif.stars = getOption("show.signif.stars"), ... ) ## S3 method for class 'spglm' print(x, digits = max(3L, getOption("digits") - 3L), ...) ## S3 method for class 'spgautor' print(x, digits = max(3L, getOption("digits") - 3L), ...) ## S3 method for class 'summary.spglm' print( x, digits = max(3L, getOption("digits") - 3L), signif.stars = getOption("show.signif.stars"), ... ) ## S3 method for class 'summary.spgautor' print( x, digits = max(3L, getOption("digits") - 3L), signif.stars = getOption("show.signif.stars"), ... ) ## S3 method for class 'anova.spglm' print( x, digits = max(getOption("digits") - 2L, 3L), signif.stars = getOption("show.signif.stars"), ... ) ## S3 method for class 'anova.spgautor' print( x, digits = max(getOption("digits") - 2L, 3L), signif.stars = getOption("show.signif.stars"), ... )
x |
A fitted model object from |
digits |
The number of significant digits to use when printing. |
... |
Other arguments passed to or from other methods. |
signif.stars |
Logical. If |
Printed fitted model objects and summaries with formatting.
spmod <- splm(z ~ water + tarp, data = caribou, spcov_type = "exponential", xcoord = x, ycoord = y ) print(spmod) print(summary(spmod)) print(anova(spmod))
spmod <- splm(z ~ water + tarp, data = caribou, spcov_type = "exponential", xcoord = x, ycoord = y ) print(spmod) print(summary(spmod)) print(anova(spmod))
Compute a pseudo r-squared for a fitted model object.
pseudoR2(object, ...) ## S3 method for class 'splm' pseudoR2(object, adjust = FALSE, ...) ## S3 method for class 'spautor' pseudoR2(object, adjust = FALSE, ...) ## S3 method for class 'spglm' pseudoR2(object, adjust = FALSE, ...) ## S3 method for class 'spgautor' pseudoR2(object, adjust = FALSE, ...)
pseudoR2(object, ...) ## S3 method for class 'splm' pseudoR2(object, adjust = FALSE, ...) ## S3 method for class 'spautor' pseudoR2(object, adjust = FALSE, ...) ## S3 method for class 'spglm' pseudoR2(object, adjust = FALSE, ...) ## S3 method for class 'spgautor' pseudoR2(object, adjust = FALSE, ...)
object |
A fitted model object from |
... |
Other arguments. Not used (needed for generic consistency). |
adjust |
A logical indicating whether the pseudo r-squared
should be adjusted to account for the number of explanatory variables. The
default is |
Several pseudo r-squared statistics exist for in the literature. We define this pseudo r-squared as one minus the ratio of the deviance of a full model relative to the deviance of a null (intercept only) model. This pseudo r-squared can be viewed as a generalization of the classical r-squared definition seen as one minus the ratio of error sums of squares from the full model relative to the error sums of squares from the null model. If adjusted, the adjustment is analogous to the the classical r-squared adjustment.
The pseudo r-squared as a numeric vector.
spmod <- splm(z ~ water + tarp, data = caribou, spcov_type = "exponential", xcoord = x, ycoord = y ) pseudoR2(spmod)
spmod <- splm(z ~ water + tarp, data = caribou, spcov_type = "exponential", xcoord = x, ycoord = y ) pseudoR2(spmod)
Create a random effects (co)variance parameter initial object that specifies initial and/or known values to use while estimating random effect variances with modeling functions.
randcov_initial(..., known)
randcov_initial(..., known)
... |
Arguments to |
known |
A character vector indicating which random effect variances are to be
assumed known. The value |
A random effect is specified as , where
is the random
effects design matrix and
u
is the random effect. The covariance of
is
, where
is the random effect
variance, and
is the transpose of
.
A list with two elements: initial
and is_known
.
initial
is a named numeric vector indicating the random effect variances
with specified initial and/or known values. is_known
is a named
logical vector indicating whether the random effect variances in
initial
are known or not.
randcov_initial(group = 1) randcov_initial(group = 1, known = "group")
randcov_initial(group = 1) randcov_initial(group = 1, known = "group")
Create a random effects covariance parameter object for use with other functions.
randcov_params(..., nm)
randcov_params(..., nm)
... |
A named vector (or vectors) whose names represent the name of each random
effect and whose values represent the variance of each random effect. If
unnamed, |
nm |
A character vector of names to assign to |
Names of the random effects should match eligible names given to
random
in modeling functions. While with the random
argument to these functions, an
intercept is implicitly assumed, with randcov_params
, an intercept must be
explicitly specified. That is, while with random
, x | group
is shorthand for (1 | group) + (x | group)
, with randcov_params
,
x | group
implies just x | group
, which means that if 1 | group
is also desired, it must be explicitly specified.
A named numeric vector of random effect covariance parameters.
randcov_params(group = 1, subgroup = 2) randcov_params(1, 2, nm = c("group", "subgroup")) # same as randcov_params("1 | group" = 1, "1 | subgroup" = 2)
randcov_params(group = 1, subgroup = 2) randcov_params(1, 2, nm = c("group", "subgroup")) # same as randcov_params("1 | group" = 1, "1 | subgroup" = 2)
Extract residuals from a fitted model object.
resid
is an alias.
## S3 method for class 'splm' residuals(object, type = "response", ...) ## S3 method for class 'splm' resid(object, type = "response", ...) ## S3 method for class 'splm' rstandard(model, ...) ## S3 method for class 'spautor' residuals(object, type = "response", ...) ## S3 method for class 'spautor' resid(object, type = "response", ...) ## S3 method for class 'spautor' rstandard(model, ...) ## S3 method for class 'spglm' residuals(object, type = "deviance", ...) ## S3 method for class 'spglm' resid(object, type = "deviance", ...) ## S3 method for class 'spglm' rstandard(model, ...) ## S3 method for class 'spgautor' residuals(object, type = "deviance", ...) ## S3 method for class 'spgautor' resid(object, type = "deviance", ...) ## S3 method for class 'spgautor' rstandard(model, ...)
## S3 method for class 'splm' residuals(object, type = "response", ...) ## S3 method for class 'splm' resid(object, type = "response", ...) ## S3 method for class 'splm' rstandard(model, ...) ## S3 method for class 'spautor' residuals(object, type = "response", ...) ## S3 method for class 'spautor' resid(object, type = "response", ...) ## S3 method for class 'spautor' rstandard(model, ...) ## S3 method for class 'spglm' residuals(object, type = "deviance", ...) ## S3 method for class 'spglm' resid(object, type = "deviance", ...) ## S3 method for class 'spglm' rstandard(model, ...) ## S3 method for class 'spgautor' residuals(object, type = "deviance", ...) ## S3 method for class 'spgautor' resid(object, type = "deviance", ...) ## S3 method for class 'spgautor' rstandard(model, ...)
object |
A fitted model object from |
type |
|
... |
Other arguments. Not used (needed for generic consistency). |
model |
A fitted model object from |
The response residuals are taken as the response minus the fitted values
for the response: . The Pearson residuals are the
response residuals pre-multiplied by their inverse square root.
The standardized residuals are Pearson residuals divided by the square
root of one minus the leverage (hat) value. The standardized residuals are often used to
check model assumptions, as they have mean zero and variance approximately one.
rstandard()
is an alias for residuals(model, type = "standardized")
.
The residuals as a numeric vector.
augment.spmodel()
cooks.distance.spmodel()
hatvalues.spmodel()
influence.spmodel()
spmod <- splm(z ~ water + tarp, data = caribou, spcov_type = "exponential", xcoord = x, ycoord = y ) residuals(spmod) resid(spmod) residuals(spmod, type = "pearson") residuals(spmod, type = "standardized") rstandard(spmod)
spmod <- splm(z ~ water + tarp, data = caribou, spcov_type = "exponential", xcoord = x, ycoord = y ) residuals(spmod) resid(spmod) residuals(spmod, type = "pearson") residuals(spmod, type = "standardized") rstandard(spmod)
Estimated harbor-seal trends from abundance data in southeast Alaska, USA.
seal
seal
A sf
object with 149 rows and 2 columns:
log_trend: The log of the estimated harbor-seal trends from abundance data.
stock: A seal stock factor with two levels: 8 and 10. The factor levels indicate the type of seal stock (i.e., type of seal). Stocks 8 and 10 are two distinct stocks (out of 13 total stocks) in southeast Alaska.
geometry: POLYGON
geometry representing polygons in an Alaska
Albers projection (EPSG: 3338).
These data were collected by the Polar Ecosystem Program of the Marine Mammal Laboratory of the Alaska Fisheries Science Center of NOAA Fisheries. The data were used in the publication listed in References.
Ver Hoef, J.M., Peterson, E. E., Hooten, M. B., Hanks, E. M., and Fortin, M.-J. 2018. Spatial Autoregressive Models for Statistical Inference from Ecological Data. Ecological Monographs, 88: 36-59. DOI: 10.1002/ecm.1283.
Fit spatial linear models for areal data (i.e., spatial autoregressive models) using a variety of estimation methods, allowing for random effects, partition factors, and row standardization.
spautor( formula, data, spcov_type, spcov_initial, estmethod = "reml", random, randcov_initial, partition_factor, W, row_st = TRUE, M, range_positive = TRUE, cutoff, ... )
spautor( formula, data, spcov_type, spcov_initial, estmethod = "reml", random, randcov_initial, partition_factor, W, row_st = TRUE, M, range_positive = TRUE, cutoff, ... )
formula |
A two-sided linear formula describing the fixed effect structure
of the model, with the response to the left of the |
data |
A data frame or |
spcov_type |
The spatial covariance type. Available options include
|
spcov_initial |
An object from |
estmethod |
The estimation method. Available options include
|
random |
A one-sided linear formula describing the random effect structure
of the model. Terms are specified to the right of the |
randcov_initial |
An optional object specifying initial and/or known values for the random effect variances. |
partition_factor |
A one-sided linear formula with a single term specifying the partition factor. The partition factor assumes observations from different levels of the partition factor are uncorrelated. |
W |
Weight matrix specifying the neighboring structure used.
Not required if |
row_st |
A logical indicating whether row standardization be performed on
|
M |
|
range_positive |
Whether the range should be constrained to be positive.
The default is |
cutoff |
The numeric, distance-based cutoff used to determine |
... |
Other arguments to |
The spatial linear model for areal data (i.e., spatial autoregressive model)
can be written as
, where
is the fixed effects design
matrix,
are the fixed effects,
is random error that is
spatially dependent, and
is random error that is spatially
independent. Together,
and
are modeled using
a spatial covariance function, expressed as
, where
is the dependent error variance,
is a matrix that controls the spatial dependence structure among observations,
is the independent error variance, and
is
an identity matrix. Note that
and
must be non-negative while
must be between the reciprocal of the maximum
eigenvalue of
W
and the reciprocal of the minimum eigenvalue of
W
.
spcov_type
Details: Parametric forms for are given below:
car: , weights matrix
,
symmetry condition matrix
sar: ,
weights matrix
,
indicates matrix transpose
If there are observations with no neighbors, they are given a unique variance
parameter called extra
, which must be non-negative.
estmethod
Details: The various estimation methods are
reml
: Maximize the restricted log-likelihood.
ml
: Maximize the log-likelihood.
By default, all spatial covariance parameters except ie
as well as all random effect variance parameters
are assumed unknown, requiring estimation. ie
is assumed zero and known by default
(in contrast to models fit using splm()
, where ie
is assumed
unknown by default). To change this default behavior, specify spcov_initial
(an NA
value for ie
in spcov_initial
to assume
ie
is unknown, requiring estimation).
random
Details: If random effects are used, the model
can be written as ,
where each Z is a random effects design matrix and each u is a random effect.
partition_factor
Details: The partition factor can be represented in matrix form as , where
elements of
equal one for observations in the same level of the partition
factor and zero otherwise. The covariance matrix involving only the
spatial and random effects components is then multiplied element-wise
(Hadmard product) by
, yielding the final covariance matrix.
Observations with NA
response values are removed for model
fitting, but their values can be predicted afterwards by running
predict(object)
. This is the only way to perform prediction for
spautor()
models (i.e., the prediction locations must be known prior
to estimation).
A list with many elements that store information about
the fitted model object. If spcov_type
or spcov_initial
are
length one, the list has class spautor
. Many generic functions that
summarize model fit are available for spautor
objects, including
AIC
, AICc
, anova
, augment
, BIC
, coef
,
cooks.distance
, covmatrix
, deviance
, fitted
, formula
,
glance
, glances
, hatvalues
, influence
,
labels
, logLik
, loocv
, model.frame
, model.matrix
,
plot
, predict
, print
, pseudoR2
, summary
,
terms
, tidy
, update
, varcomp
, and vcov
. If
spcov_type
or spcov_initial
are length greater than one, the
list has class spautor_list
and each element in the list has class
spautor
. glances
can be used to summarize spautor_list
objects, and the aforementioned spautor
generics can be used on each
individual list element (model fit).
This function does not perform any internal scaling. If optimization is not stable due to large extremely large variances, scale relevant variables so they have variance 1 before optimization.
spmod <- spautor(log_trend ~ 1, data = seal, spcov_type = "car") summary(spmod)
spmod <- spautor(log_trend ~ 1, data = seal, spcov_type = "car") summary(spmod)
Fit random forest residual spatial linear models for areal data (i.e., spatial autoregressive models) using random forest to fit the mean and a spatial linear model to fit the residuals. The spatial linear model fit to the residuals can incorporate a variety of estimation methods, allowing for random effects, partition factors, and row standardization.
spautorRF(formula, data, ...)
spautorRF(formula, data, ...)
formula |
A two-sided linear formula describing the fixed effect structure
of the model, with the response to the left of the |
data |
A data frame or |
... |
Additional named arguments to |
The random forest residual spatial linear model is described by
Fox et al. (2020). A random forest model is fit to the mean portion of the
model specified by formula
using ranger::ranger()
. Residuals
are computed and used as the response variable in an intercept-only spatial
linear model fit using spautor()
. This model object is intended for use with
predict()
to perform prediction, also called random forest
regression Kriging.
A list with several elements to be used with predict()
. These
elements include the function call (named call
), the random forest object
fit to the mean (named ranger
),
the spatial linear model object fit to the residuals
(named spautor
or spautor_list
), and an object can contain data for
locations at which to predict (called newdata
). The newdata
object contains the set of
observations in data
whose response variable is NA
.
If spcov_type
or spcov_initial
(which are passed to spautor()
)
are length one, the list has class spautorRF
and the spatial linear
model object fit to the residuals is called spautor
, which has
class spautor
. If
spcov_type
or spcov_initial
are length greater than one, the
list has class spautorRF_list
and the spatial linear model object
fit to the residuals is called spautor_list
, which has class spautor_list
.
and contains several objects, each with class spautor
.
Fox, E.W., Ver Hoef, J. M., & Olsen, A. R. (2020). Comparing spatial regression to random forests for large environmental data sets. PloS one, 15(3), e0229509.
seal$var <- rnorm(NROW(seal)) # add noise variable sprfmod <- spautorRF(log_trend ~ var, data = seal, spcov_type = "car") predict(sprfmod)
seal$var <- rnorm(NROW(seal)) # add noise variable sprfmod <- spautorRF(log_trend ~ var, data = seal, spcov_type = "car") predict(sprfmod)
Create a spatial covariance parameter initial object that specifies
initial and/or known values to use while estimating spatial covariance parameters
with splm()
, spglm()
, spautor()
, or spgautor()
.
spcov_initial(spcov_type, de, ie, range, extra, rotate, scale, known)
spcov_initial(spcov_type, de, ie, range, extra, rotate, scale, known)
spcov_type |
The spatial covariance function type. Available options include
|
de |
The spatially dependent (correlated) random error variance. Commonly referred to as a partial sill. |
ie |
The spatially independent (uncorrelated) random error variance. Commonly referred to as a nugget. |
range |
The correlation parameter. |
extra |
An extra covariance parameter used when |
rotate |
Anisotropy rotation parameter (from 0 to |
scale |
Anisotropy scale parameter (from 0 to 1).
Not used if |
known |
A character vector indicating which spatial covariance parameters are to be
assumed known. The value |
The spcov_initial
list is later passed to splm()
, spglm()
, spautor()
, or spgautor()
.
NA
values can be given for ie
, rotate
, and scale
, which lets
these functions find initial values for parameters that are sometimes
otherwise assumed known (e.g., rotate
and scale
with splm()
and spglm()
and ie
with spautor()
and spgautor()
).
The spatial covariance functions can be generally expressed as
, where
is
de
above,
is a matrix that controls the spatial dependence structure among observations,
,
is
ie
above, and is and identity matrix.
Note that
and
must be non-negative while
must be positive, except when
spcov_type
is car
or sar
,
in which case must be between the reciprocal of the maximum
eigenvalue of
W
and the reciprocal of the minimum eigenvalue of
W
. Parametric forms for are given below, where
:
exponential:
spherical:
gaussian:
triangular:
circular:
cubic:
pentaspherical:
cosine:
wave:
jbessel: , Bj is Bessel-J function
gravity:
rquad:
magnetic:
matern: ,
, Bk is Bessel-K function wit order
cauchy: ,
pexponential: ,
car: , weights matrix
,
symmetry condition matrix
, observations with no neighbors
are given a unique variance
parameter called
,
.
sar: ,
weights matrix
,
indicates matrix transpose,
observations with no neighbors are given a unique variance
parameter called
,
.
none:
All spatial covariance functions are valid in one spatial dimension. All
spatial covariance functions except triangular
and cosine
are
valid in two dimensions. An alias for none
is ie
.
When the spatial covariance function is car
or sar
, extra
represents the variance parameter for the observations in W
without
at least one neighbor (other than itself) – these are called unconnected
observations. extra
is only used if there is at least one unconnected
observation.
A list with two elements: initial
and is_known
.
initial
is a named numeric vector indicating the spatial covariance parameters
with specified initial and/or known values. is_known
is a named
numeric vector indicating whether the spatial covariance parameters in
initial
are known or not. The class of the list
matches the value given to the spcov_type
argument.
# known de value 1 and initial range value 0.4 spcov_initial("exponential", de = 1, range = 0.4, known = c("de")) # known ie value 0 and known range value 1 spcov_initial("gaussian", ie = 0, range = 1, known = c("given")) # ie given NA spcov_initial("car", ie = NA)
# known de value 1 and initial range value 0.4 spcov_initial("exponential", de = 1, range = 0.4, known = c("de")) # known ie value 0 and known range value 1 spcov_initial("gaussian", ie = 0, range = 1, known = c("given")) # ie given NA spcov_initial("car", ie = NA)
Create a spatial covariance parameter object for use with other functions.
spcov_params(spcov_type, de, ie, range, extra, rotate = 0, scale = 1)
spcov_params(spcov_type, de, ie, range, extra, rotate = 0, scale = 1)
spcov_type |
The spatial covariance function type. Available options include
|
de |
The spatially dependent (correlated) random error variance. Commonly referred to as a partial sill. |
ie |
The spatially independent (uncorrelated) random error variance. Commonly referred to as a nugget. |
range |
The correlation parameter. |
extra |
An extra covariance parameter used when |
rotate |
Anisotropy rotation parameter (from 0 to |
scale |
Anisotropy scale parameter (from 0 to 1).
A value of 1 (the default) implies no scaling.
Not used if |
Generally, all arguments to spcov_params
must be specified, though
default arguments are often chosen based on spcov_type
.
When spcov_type
is car
or
sar
, ie
is assumed to be 0 unless specified otherwise.
For full parameterizations of all spatial covariance
functions, see spcov_initial()
.
A named numeric vector of spatial covariance parameters with class spcov_type
.
spcov_params("exponential", de = 1, ie = 1, range = 1)
spcov_params("exponential", de = 1, ie = 1, range = 1)
Fit spatial generalized linear models for areal data (i.e., spatial generalized autoregressive models) using a variety of estimation methods, allowing for random effects, partition factors, and row standardization.
spgautor( formula, family, data, spcov_type, spcov_initial, dispersion_initial, estmethod = "reml", random, randcov_initial, partition_factor, W, row_st = TRUE, M, range_positive = TRUE, cutoff, ... )
spgautor( formula, family, data, spcov_type, spcov_initial, dispersion_initial, estmethod = "reml", random, randcov_initial, partition_factor, W, row_st = TRUE, M, range_positive = TRUE, cutoff, ... )
formula |
A two-sided linear formula describing the fixed effect structure
of the model, with the response to the left of the |
family |
The generalized linear model family describing the distribution
of the response variable to be used. Available options
|
data |
A data frame or |
spcov_type |
The spatial covariance type. Available options include
|
spcov_initial |
An object from |
dispersion_initial |
An object from |
estmethod |
The estimation method. Available options include
|
random |
A one-sided linear formula describing the random effect structure
of the model. Terms are specified to the right of the |
randcov_initial |
An optional object specifying initial and/or known values for the random effect variances. |
partition_factor |
A one-sided linear formula with a single term specifying the partition factor. The partition factor assumes observations from different levels of the partition factor are uncorrelated. |
W |
Weight matrix specifying the neighboring structure used.
Not required if |
row_st |
A logical indicating whether row standardization be performed on
|
M |
|
range_positive |
Whether the range should be constrained to be positive.
The default is |
cutoff |
The numeric, distance-based cutoff used to determine |
... |
Other arguments to |
The spatial generalized linear model for areal data
(i.e., spatial generalized autoregressive model) can be written as
, where
is the expectation
of the response (
) given the random errors,
is called
a link function which links together the
and
,
is the fixed effects design
matrix,
are the fixed effects,
is random error that is
spatially dependent, and
is random error that is spatially
independent.
There are six generalized linear model
families available: poisson
assumes is a Poisson random variable
nbinomial
assumes is a negative binomial random
variable,
binomial
assumes is a binomial random variable,
beta
assumes is a beta random variable,
Gamma
assumes is a gamma random
variable, and
inverse.gaussian
assumes is an inverse Gaussian
random variable.
The supports for for each family are given below:
family: support of
poisson: ;
an integer
nbinomial: ;
an integer
binomial: ;
an integer
beta:
Gamma:
inverse.gaussian:
The generalized linear model families and the parameterizations of their link functions are given below:
family: link function
poisson: (log link)
nbinomial: (log link)
binomial: (logit link)
beta: (logit link)
Gamma: (log link)
inverse.gaussian: (log link)
The variance function of an individual (given
)
for each generalized linear model family is given below:
family:
poisson:
nbinomial:
binomial:
beta:
Gamma:
inverse.gaussian:
The parameter is a dispersion parameter that influences
.
For the
poisson
and binomial
families, is always
one. Note that this inverse Gaussian parameterization is different than a
standard inverse Gaussian parameterization, which has variance
.
Setting
yields our parameterization, which is
preferred for computational stability. Also note that the dispersion parameter
is often defined in the literature as
, where
is the variance
function of the mean. We do not use this parameterization, which is important
to recognize while interpreting dispersion parameter estimates.
For more on generalized linear model constructions, see McCullagh and
Nelder (1989).
Together, and
are modeled using
a spatial covariance function, expressed as
, where
is the dependent error variance,
is a matrix that controls the spatial dependence structure among observations,
is the independent error variance, and
is
an identity matrix. Note that
and
must be non-negative while
must be between the reciprocal of the maximum
eigenvalue of
W
and the reciprocal of the minimum eigenvalue of
W
. Recall that and
are modeled on the link scale,
not the inverse link (response) scale. Random effects are also modeled on the link scale.
spcov_type
Details: Parametric forms for are given below:
car: , weights matrix
,
symmetry condition matrix
sar: ,
weights matrix
,
indicates matrix transpose
If there are observations with no neighbors, they are given a unique variance
parameter called extra
, which must be non-negative.
estmethod
Details: The various estimation methods are
reml
: Maximize the restricted log-likelihood.
ml
: Maximize the log-likelihood.
Note that the likelihood being optimized is obtained using the Laplace approximation.
By default, all spatial covariance parameters except ie
as well as all random effect variance parameters
are assumed unknown, requiring estimation. ie
is assumed zero and known by default
(in contrast to models fit using spglm()
, where ie
is assumed
unknown by default). To change this default behavior, specify spcov_initial
(an NA
value for ie
in spcov_initial
to assume
ie
is unknown, requiring estimation).
random
Details: If random effects are used, the model
can be written as ,
where each Z is a random effects design matrix and each u is a random effect.
partition_factor
Details: The partition factor can be represented in matrix form as , where
elements of
equal one for observations in the same level of the partition
factor and zero otherwise. The covariance matrix involving only the
spatial and random effects components is then multiplied element-wise
(Hadmard product) by
, yielding the final covariance matrix.
Observations with NA
response values are removed for model
fitting, but their values can be predicted afterwards by running
predict(object)
. This is the only way to perform prediction for
spgautor()
models (i.e., the prediction locations must be known prior
to estimation).
A list with many elements that store information about
the fitted model object. If spcov_type
or spcov_initial
are
length one, the list has class spgautor
. Many generic functions that
summarize model fit are available for spgautor
objects, including
AIC
, AICc
, anova
, augment
, AUROC
, BIC
, coef
,
cooks.distance
, covmatrix
, deviance
, fitted
, formula
,
glance
, glances
, hatvalues
, influence
,
labels
, logLik
, loocv
, model.frame
, model.matrix
,
plot
, predict
, print
, pseudoR2
, summary
,
terms
, tidy
, update
, varcomp
, and vcov
. If
spcov_type
or spcov_initial
are length greater than one, the
list has class spgautor_list
and each element in the list has class
spgautor
. glances
can be used to summarize spgautor_list
objects, and the aforementioned spgautor
generics can be used on each
individual list element (model fit).
This function does not perform any internal scaling. If optimization is not stable due to large extremely large variances, scale relevant variables so they have variance 1 before optimization.
McCullagh P. and Nelder, J. A. (1989) Generalized Linear Models. London: Chapman and Hall.
spgmod <- spgautor(I(log_trend^2) ~ 1, family = "Gamma", data = seal, spcov_type = "car") summary(spgmod)
spgmod <- spgautor(I(log_trend^2) ~ 1, family = "Gamma", data = seal, spcov_type = "car") summary(spgmod)
Fit spatial generalized linear models for point-referenced data (i.e., generalized geostatistical models) using a variety of estimation methods, allowing for random effects, anisotropy, partition factors, and big data methods.
spglm( formula, family, data, spcov_type, xcoord, ycoord, spcov_initial, dispersion_initial, estmethod = "reml", anisotropy = FALSE, random, randcov_initial, partition_factor, local, range_constrain, ... )
spglm( formula, family, data, spcov_type, xcoord, ycoord, spcov_initial, dispersion_initial, estmethod = "reml", anisotropy = FALSE, random, randcov_initial, partition_factor, local, range_constrain, ... )
formula |
A two-sided linear formula describing the fixed effect structure
of the model, with the response to the left of the |
family |
The generalized linear model family describing the distribution
of the response variable to be used. Available options
|
data |
A data frame or |
spcov_type |
The spatial covariance type. Available options include
|
xcoord |
The name of the column in |
ycoord |
The name of the column in |
spcov_initial |
An object from |
dispersion_initial |
An object from |
estmethod |
The estimation method. Available options include
|
anisotropy |
A logical indicating whether (geometric) anisotropy should
be modeled. Not required if |
random |
A one-sided linear formula describing the random effect structure
of the model. Terms are specified to the right of the |
randcov_initial |
An optional object specifying initial and/or known values for the random effect variances. |
partition_factor |
A one-sided linear formula with a single term specifying the partition factor. The partition factor assumes observations from different levels of the partition factor are uncorrelated. |
local |
An optional logical or list controlling the big data approximation.
If omitted,
When |
range_constrain |
An optional logical that indicates whether the range
should be constrained to enhance numerical stability. If |
... |
Other arguments to |
The spatial generalized linear model for point-referenced data
(i.e., generalized geostatistical model) can be written as
, where
is the expectation
of the response (
) given the random errors,
is called
a link function which links together the
and
,
is the fixed effects design
matrix,
are the fixed effects,
is random error that is
spatially dependent, and
is random error that is spatially
independent.
There are six generalized linear model
families available: poisson
assumes is a Poisson random variable
nbinomial
assumes is a negative binomial random
variable,
binomial
assumes is a binomial random variable,
beta
assumes is a beta random variable,
Gamma
assumes is a gamma random
variable, and
inverse.gaussian
assumes is an inverse Gaussian
random variable.
The supports for for each family are given below:
family: support of
poisson: ;
an integer
nbinomial: ;
an integer
binomial: ;
an integer
beta:
Gamma:
inverse.gaussian:
The generalized linear model families and the parameterizations of their link functions are given below:
family: link function
poisson: (log link)
nbinomial: (log link)
binomial: (logit link)
beta: (logit link)
Gamma: (log link)
inverse.gaussian: (log link)
The variance function of an individual (given
)
for each generalized linear model family is given below:
family:
poisson:
nbinomial:
binomial:
beta:
Gamma:
inverse.gaussian:
The parameter is a dispersion parameter that influences
.
For the
poisson
and binomial
families, is always
one. Note that this inverse Gaussian parameterization is different than a
standard inverse Gaussian parameterization, which has variance
.
Setting
yields our parameterization, which is
preferred for computational stability. Also note that the dispersion parameter
is often defined in the literature as
, where
is the variance
function of the mean. We do not use this parameterization, which is important
to recognize while interpreting dispersion estimates.
For more on generalized linear model constructions, see McCullagh and
Nelder (1989).
Together, and
are modeled using
a spatial covariance function, expressed as
, where
is the dependent error variance,
is a correlation matrix that controls the spatial dependence structure among observations,
is the independent error variance, and
is
an identity matrix. Recall that
and
are modeled on the link scale,
not the inverse link (response) scale. Random effects are also modeled on the link scale.
spcov_type
Details: Parametric forms for are given below, where
for
distance between observations:
exponential:
spherical:
gaussian:
triangular:
circular:
cubic:
pentaspherical:
cosine:
wave:
jbessel: , Bj is Bessel-J function
gravity:
rquad:
magnetic:
matern: ,
, Bk is Bessel-K function with order
cauchy: ,
pexponential: ,
none:
ie: (see below for differences compared to none)
There is a slight difference between none
and ie
(the spcov_type
) from above. none
fixes the "ie"
covariance parameter at zero, which should yield a model that has very similar
parameter estimates as one from stats::glm. Note that spglm()
uses
a different likelihood, so direct likelihood-based comparisons (e.g., AIC
)
should not be made to compare an spglm()
model to a glm()
one
(though deviance can be used). ie
allows the "ie"
covariance
parameter to vary. If the "ie"
covariance parameter is estimated near
zero, then none
and ie
yield similar models.
All spatial covariance functions are valid in one spatial dimension. All
spatial covariance functions except triangular
and cosine
are
valid in two dimensions.
estmethod
Details: The various estimation methods are
reml
: Maximize the restricted log-likelihood.
ml
: Maximize the log-likelihood.
Note that the likelihood being optimized is obtained using the Laplace approximation.
anisotropy
Details: By default, all spatial covariance parameters except rotate
and scale
as well as all random effect variance parameters
are assumed unknown, requiring estimation. If either rotate
or scale
are given initial values other than 0 and 1 (respectively) or are assumed unknown
in spcov_initial()
, anisotropy
is implicitly set to TRUE
.
(Geometric) Anisotropy is modeled by transforming a covariance function that
decays differently in different directions to one that decays equally in all
directions via rotation and scaling of the original coordinates. The rotation is
controlled by the rotate
parameter in radians. The scaling
is controlled by the
scale
parameter in . The anisotropy
correction involves first a rotation of the coordinates clockwise by
rotate
and then a
scaling of the coordinates' minor axis by the reciprocal of scale
. The spatial
covariance is then computed using these transformed coordinates.
random
Details: If random effects are used, the model
can be written as ,
where each Z is a random effects design matrix and each u is a random effect.
partition_factor
Details: The partition factor can be represented in matrix form as , where
elements of
equal one for observations in the same level of the partition
factor and zero otherwise. The covariance matrix involving only the
spatial and random effects components is then multiplied element-wise
(Hadmard product) by
, yielding the final covariance matrix.
local
Details: The big data approximation works by sorting observations into different levels
of an index variable. Observations in different levels of the index variable
are assumed to be uncorrelated for the purposes of model fitting. Sparse matrix methods are then implemented
for significant computational gains. Parallelization generally further speeds up
computations when data sizes are larger than a few thousand. Both the "random"
and "kmeans"
values of method
in local
have random components. That means you may get slightly different
results when using the big data approximation and rerunning spglm()
with the same code. For consistent results,
either set a seed via base::set.seed()
or specify index
to local
.
Observations with NA
response values are removed for model
fitting, but their values can be predicted afterwards by running
predict(object)
.
A list with many elements that store information about
the fitted model object. If spcov_type
or spcov_initial
are
length one, the list has class spglm
. Many generic functions that
summarize model fit are available for spglm
objects, including
AIC
, AICc
, anova
, augment
, AUROC
, BIC
, coef
,
cooks.distance
, covmatrix
, deviance
, fitted
, formula
,
glance
, glances
, hatvalues
, influence
,
labels
, logLik
, loocv
, model.frame
, model.matrix
,
plot
, predict
, print
, pseudoR2
, summary
,
terms
, tidy
, update
, varcomp
, and vcov
. If
spcov_type
or spcov_initial
are length greater than one, the
list has class spglm_list
and each element in the list has class
spglm
. glances
can be used to summarize spglm_list
objects, and the aforementioned spglm
generics can be used on each
individual list element (model fit).
This function does not perform any internal scaling. If optimization is not stable due to large extremely large variances, scale relevant variables so they have variance 1 before optimization.
McCullagh P. and Nelder, J. A. (1989) Generalized Linear Models. London: Chapman and Hall.
spgmod <- spglm(presence ~ elev, family = "binomial", data = moose, spcov_type = "exponential" ) summary(spgmod)
spgmod <- spglm(presence ~ elev, family = "binomial", data = moose, spcov_type = "exponential" ) summary(spgmod)
Fit spatial linear models for point-referenced data (i.e., geostatistical models) using a variety of estimation methods, allowing for random effects, anisotropy, partition factors, and big data methods.
splm( formula, data, spcov_type, xcoord, ycoord, spcov_initial, estmethod = "reml", weights = "cressie", anisotropy = FALSE, random, randcov_initial, partition_factor, local, range_constrain, ... )
splm( formula, data, spcov_type, xcoord, ycoord, spcov_initial, estmethod = "reml", weights = "cressie", anisotropy = FALSE, random, randcov_initial, partition_factor, local, range_constrain, ... )
formula |
A two-sided linear formula describing the fixed effect structure
of the model, with the response to the left of the |
data |
A data frame or |
spcov_type |
The spatial covariance type. Available options include
|
xcoord |
The name of the column in |
ycoord |
The name of the column in |
spcov_initial |
An object from |
estmethod |
The estimation method. Available options include
|
weights |
Weights to use when |
anisotropy |
A logical indicating whether (geometric) anisotropy should
be modeled. Not required if |
random |
A one-sided linear formula describing the random effect structure
of the model. Terms are specified to the right of the |
randcov_initial |
An optional object specifying initial and/or known values for the random effect variances. |
partition_factor |
A one-sided linear formula with a single term specifying the partition factor. The partition factor assumes observations from different levels of the partition factor are uncorrelated. |
local |
An optional logical or list controlling the big data approximation.
If omitted,
When |
range_constrain |
An optional logical that indicates whether the range
should be constrained to enhance numerical stability. If |
... |
Other arguments to |
The spatial linear model for point-referenced data
(i.e., geostatistical model) can be written as
, where
is the fixed effects design
matrix,
are the fixed effects,
is random error that is
spatially dependent, and
is random error that is spatially
independent. Together,
and
are modeled using
a spatial covariance function, expressed as
, where
is the dependent error variance,
is a correlation matrix that controls the spatial dependence structure among observations,
is the independent error variance, and
is
an identity matrix.
spcov_type
Details: Parametric forms for are given below, where
for
distance between observations:
exponential:
spherical:
gaussian:
triangular:
circular:
cubic:
pentaspherical:
cosine:
wave:
jbessel: , Bj is Bessel-J function
gravity:
rquad:
magnetic:
matern: ,
, Bk is Bessel-K function with order
cauchy: ,
pexponential: ,
none:
All spatial covariance functions are valid in one spatial dimension. All
spatial covariance functions except triangular
and cosine
are
valid in two dimensions. An alias for none
is ie
.
estmethod
Details: The various estimation methods are
reml
: Maximize the restricted log-likelihood.
ml
: Maximize the log-likelihood.
sv-wls
: Minimize the semivariogram weighted least squares loss.
sv-cl
: Minimize the semivariogram composite likelihood loss.
anisotropy
Details: By default, all spatial covariance parameters except rotate
and scale
as well as all random effect variance parameters
are assumed unknown, requiring estimation. If either rotate
or scale
are given initial values other than 0 and 1 (respectively) or are assumed unknown
in spcov_initial()
, anisotropy
is implicitly set to TRUE
.
(Geometric) Anisotropy is modeled by transforming a covariance function that
decays differently in different directions to one that decays equally in all
directions via rotation and scaling of the original coordinates. The rotation is
controlled by the rotate
parameter in radians. The scaling
is controlled by the
scale
parameter in . The anisotropy
correction involves first a rotation of the coordinates clockwise by
rotate
and then a
scaling of the coordinates' minor axis by the reciprocal of scale
. The spatial
covariance is then computed using these transformed coordinates.
random
Details: If random effects are used (the estimation method must be "reml"
or
"ml"
), the model
can be written as ,
where each Z is a random effects design matrix and each u is a random effect.
partition_factor
Details: The partition factor can be represented in matrix form as , where
elements of
equal one for observations in the same level of the partition
factor and zero otherwise. The covariance matrix involving only the
spatial and random effects components is then multiplied element-wise
(Hadmard product) by
, yielding the final covariance matrix.
local
Details: The big data approximation works by sorting observations into different levels
of an index variable. Observations in different levels of the index variable
are assumed to be uncorrelated for the purposes of model fitting. Sparse matrix methods are then implemented
for significant computational gains. Parallelization generally further speeds up
computations when data sizes are larger than a few thousand. Both the "random"
and "kmeans"
values of method
in local
have random components. That means you may get slightly different
results when using the big data approximation and rerunning splm()
with the same code. For consistent results,
either set a seed via base::set.seed()
or specify index
to local
.
Observations with NA
response values are removed for model
fitting, but their values can be predicted afterwards by running
predict(object)
.
A list with many elements that store information about
the fitted model object. If spcov_type
or spcov_initial
are
length one, the list has class splm
. Many generic functions that
summarize model fit are available for splm
objects, including
AIC
, AICc
, anova
, augment
, BIC
, coef
,
cooks.distance
, covmatrix
, deviance
, fitted
, formula
,
glance
, glances
, hatvalues
, influence
,
labels
, logLik
, loocv
, model.frame
, model.matrix
,
plot
, predict
, print
, pseudoR2
, summary
,
terms
, tidy
, update
, varcomp
, and vcov
. If
spcov_type
or spcov_initial
are length greater than one, the
list has class splm_list
and each element in the list has class
splm
. glances
can be used to summarize splm_list
objects, and the aforementioned splm
generics can be used on each
individual list element (model fit).
This function does not perform any internal scaling. If optimization is not stable due to large extremely large variances, scale relevant variables so they have variance 1 before optimization.
spmod <- splm(z ~ water + tarp, data = caribou, spcov_type = "exponential", xcoord = x, ycoord = y ) summary(spmod)
spmod <- splm(z ~ water + tarp, data = caribou, spcov_type = "exponential", xcoord = x, ycoord = y ) summary(spmod)
Fit random forest spatial residual models for point-referenced data (i.e., geostatistical models) using random forest to fit the mean and a spatial linear model to fit the residuals. The spatial linear model fit to the residuals can incorporate variety of estimation methods, allowing for random effects, anisotropy, partition factors, and big data methods.
splmRF(formula, data, ...)
splmRF(formula, data, ...)
formula |
A two-sided linear formula describing the fixed effect structure
of the model, with the response to the left of the |
data |
A data frame or |
... |
Additional named arguments to |
The random forest residual spatial linear model is described by
Fox et al. (2020). A random forest model is fit to the mean portion of the
model specified by formula
using ranger::ranger()
. Residuals
are computed and used as the response variable in an intercept-only spatial
linear model fit using splm()
. This model object is intended for use with
predict()
to perform prediction, also called random forest
regression Kriging.
A list with several elements to be used with predict()
. These
elements include the function call (named call
), the random forest object
fit to the mean (named ranger
),
the spatial linear model object fit to the residuals
(named splm
or splm_list
), and an object can contain data for
locations at which to predict (called newdata
). The newdata
object contains the set of
observations in data
whose response variable is NA
.
If spcov_type
or spcov_initial
(which are passed to splm()
)
are length one, the list has class splmRF
and the spatial linear
model object fit to the residuals is called splm
, which has
class splm
. If
spcov_type
or spcov_initial
are length greater than one, the
list has class splmRF_list
and the spatial linear model object
fit to the residuals is called splm_list
, which has class splm_list
.
and contains several objects, each with class splm
.
An splmRF
object to be used with predict()
. There are
three elements: ranger
, the output from fitting the mean model with
ranger::ranger()
; splm
, the output from fitting the spatial
linear model to the ranger residuals; and newdata
, the newdata
object, if relevant.
This function does not perform any internal scaling. If optimization is not stable due to large extremely large variances, scale relevant variables so they have variance 1 before optimization.
Fox, E.W., Ver Hoef, J. M., & Olsen, A. R. (2020). Comparing spatial regression to random forests for large environmental data sets. PloS one, 15(3), e0229509.
sulfate$var <- rnorm(NROW(sulfate)) # add noise variable sulfate_preds$var <- rnorm(NROW(sulfate_preds)) # add noise variable sprfmod <- splmRF(sulfate ~ var, data = sulfate, spcov_type = "exponential") predict(sprfmod, sulfate_preds)
sulfate$var <- rnorm(NROW(sulfate)) # add noise variable sulfate_preds$var <- rnorm(NROW(sulfate_preds)) # add noise variable sprfmod <- splmRF(sulfate ~ var, data = sulfate, spcov_type = "exponential") predict(sprfmod, sulfate_preds)
Simulate a spatial beta random variable with a specific mean and covariance structure.
sprbeta( spcov_params, dispersion = 1, mean = 0, samples = 1, data, randcov_params, partition_factor, ... )
sprbeta( spcov_params, dispersion = 1, mean = 0, samples = 1, data, randcov_params, partition_factor, ... )
spcov_params |
An |
dispersion |
The dispersion value. |
mean |
A numeric vector representing the mean. |
samples |
The number of independent samples to generate. The default
is |
data |
A data frame or |
randcov_params |
A |
partition_factor |
A formula indicating the partition factor. |
... |
Additional arguments passed to |
The values of spcov_params
, mean
, and randcov_params
are assumed to be on the link scale. They are used to simulate a latent normal (Gaussian)
response variable using sprnorm()
. This latent variable is the
conditional mean used with dispersion
to simulate a beta random variable.
If samples
is 1, a vector of random variables for each row of data
is returned. If samples
is greater than one, a matrix of random variables
is returned, where the rows correspond to each row of data
and the columns
correspond to independent samples.
spcov_params_val <- spcov_params("exponential", de = 0.2, ie = 0.1, range = 1) sprbeta(spcov_params_val, data = caribou, xcoord = x, ycoord = y) sprbeta(spcov_params_val, samples = 5, data = caribou, xcoord = x, ycoord = y)
spcov_params_val <- spcov_params("exponential", de = 0.2, ie = 0.1, range = 1) sprbeta(spcov_params_val, data = caribou, xcoord = x, ycoord = y) sprbeta(spcov_params_val, samples = 5, data = caribou, xcoord = x, ycoord = y)
Simulate a spatial binomial random variable with a specific mean and covariance structure.
sprbinom( spcov_params, mean = 0, size = 1, samples = 1, data, randcov_params, partition_factor, ... )
sprbinom( spcov_params, mean = 0, size = 1, samples = 1, data, randcov_params, partition_factor, ... )
spcov_params |
An |
mean |
A numeric vector representing the mean. |
size |
A numeric vector representing the sample size for each binomial trial.
The default is |
samples |
The number of independent samples to generate. The default
is |
data |
A data frame or |
randcov_params |
A |
partition_factor |
A formula indicating the partition factor. |
... |
Additional arguments passed to |
The values of spcov_params
, mean
, and randcov_params
are assumed to be on the link scale. They are used to simulate a latent normal (Gaussian)
response variable using sprnorm()
. This latent variable is the
conditional mean used with dispersion
to simulate a binomial random variable.
If samples
is 1, a vector of random variables for each row of data
is returned. If samples
is greater than one, a matrix of random variables
is returned, where the rows correspond to each row of data
and the columns
correspond to independent samples.
spcov_params_val <- spcov_params("exponential", de = 0.2, ie = 0.1, range = 1) sprbinom(spcov_params_val, data = caribou, xcoord = x, ycoord = y) sprbinom(spcov_params_val, samples = 5, data = caribou, xcoord = x, ycoord = y)
spcov_params_val <- spcov_params("exponential", de = 0.2, ie = 0.1, range = 1) sprbinom(spcov_params_val, data = caribou, xcoord = x, ycoord = y) sprbinom(spcov_params_val, samples = 5, data = caribou, xcoord = x, ycoord = y)
Simulate a spatial gamma random variable with a specific mean and covariance structure.
sprgamma( spcov_params, dispersion = 1, mean = 0, samples = 1, data, randcov_params, partition_factor, ... )
sprgamma( spcov_params, dispersion = 1, mean = 0, samples = 1, data, randcov_params, partition_factor, ... )
spcov_params |
An |
dispersion |
The dispersion value. |
mean |
A numeric vector representing the mean. |
samples |
The number of independent samples to generate. The default
is |
data |
A data frame or |
randcov_params |
A |
partition_factor |
A formula indicating the partition factor. |
... |
Additional arguments passed to |
The values of spcov_params
, mean
, and randcov_params
are assumed to be on the link scale. They are used to simulate a latent normal (Gaussian)
response variable using sprnorm()
. This latent variable is the
conditional mean used with dispersion
to simulate a gamma random variable.
If samples
is 1, a vector of random variables for each row of data
is returned. If samples
is greater than one, a matrix of random variables
is returned, where the rows correspond to each row of data
and the columns
correspond to independent samples.
spcov_params_val <- spcov_params("exponential", de = 0.2, ie = 0.1, range = 1) sprgamma(spcov_params_val, data = caribou, xcoord = x, ycoord = y) sprgamma(spcov_params_val, samples = 5, data = caribou, xcoord = x, ycoord = y)
spcov_params_val <- spcov_params("exponential", de = 0.2, ie = 0.1, range = 1) sprgamma(spcov_params_val, data = caribou, xcoord = x, ycoord = y) sprgamma(spcov_params_val, samples = 5, data = caribou, xcoord = x, ycoord = y)
Simulate a spatial inverse gaussian random variable with a specific mean and covariance structure.
sprinvgauss( spcov_params, dispersion = 1, mean = 0, samples = 1, data, randcov_params, partition_factor, ... )
sprinvgauss( spcov_params, dispersion = 1, mean = 0, samples = 1, data, randcov_params, partition_factor, ... )
spcov_params |
An |
dispersion |
The dispersion value. |
mean |
A numeric vector representing the mean. |
samples |
The number of independent samples to generate. The default
is |
data |
A data frame or |
randcov_params |
A |
partition_factor |
A formula indicating the partition factor. |
... |
Additional arguments passed to |
The values of spcov_params
, mean
, and randcov_params
are assumed to be on the link scale. They are used to simulate a latent normal (Gaussian)
response variable using sprnorm()
. This latent variable is the
conditional mean used with dispersion
to simulate a inverse gaussian random variable.
If samples
is 1, a vector of random variables for each row of data
is returned. If samples
is greater than one, a matrix of random variables
is returned, where the rows correspond to each row of data
and the columns
correspond to independent samples.
spcov_params_val <- spcov_params("exponential", de = 0.2, ie = 0.1, range = 1) sprinvgauss(spcov_params_val, data = caribou, xcoord = x, ycoord = y) sprinvgauss(spcov_params_val, samples = 5, data = caribou, xcoord = x, ycoord = y)
spcov_params_val <- spcov_params("exponential", de = 0.2, ie = 0.1, range = 1) sprinvgauss(spcov_params_val, data = caribou, xcoord = x, ycoord = y) sprinvgauss(spcov_params_val, samples = 5, data = caribou, xcoord = x, ycoord = y)
Simulate a spatial negative binomial random variable with a specific mean and covariance structure.
sprnbinom( spcov_params, dispersion = 1, mean = 0, samples = 1, data, randcov_params, partition_factor, ... )
sprnbinom( spcov_params, dispersion = 1, mean = 0, samples = 1, data, randcov_params, partition_factor, ... )
spcov_params |
An |
dispersion |
The dispersion value. |
mean |
A numeric vector representing the mean. |
samples |
The number of independent samples to generate. The default
is |
data |
A data frame or |
randcov_params |
A |
partition_factor |
A formula indicating the partition factor. |
... |
Additional arguments passed to |
The values of spcov_params
, mean
, and randcov_params
are assumed to be on the link scale. They are used to simulate a latent normal (Gaussian)
response variable using sprnorm()
. This latent variable is the
conditional mean used with dispersion
to simulate a negative binomial random variable.
If samples
is 1, a vector of random variables for each row of data
is returned. If samples
is greater than one, a matrix of random variables
is returned, where the rows correspond to each row of data
and the columns
correspond to independent samples.
spcov_params_val <- spcov_params("exponential", de = 0.2, ie = 0.1, range = 1) sprnbinom(spcov_params_val, data = caribou, xcoord = x, ycoord = y) sprnbinom(spcov_params_val, samples = 5, data = caribou, xcoord = x, ycoord = y)
spcov_params_val <- spcov_params("exponential", de = 0.2, ie = 0.1, range = 1) sprnbinom(spcov_params_val, data = caribou, xcoord = x, ycoord = y) sprnbinom(spcov_params_val, samples = 5, data = caribou, xcoord = x, ycoord = y)
Simulate a spatial normal (Gaussian) random variable with a specific mean and covariance structure.
sprnorm( spcov_params, mean = 0, samples = 1, data, randcov_params, partition_factor, ... ) ## S3 method for class 'exponential' sprnorm( spcov_params, mean = 0, samples = 1, data, randcov_params, partition_factor, xcoord, ycoord, ... ) ## S3 method for class 'none' sprnorm( spcov_params, mean = 0, samples = 1, data, randcov_params, partition_factor, ... ) ## S3 method for class 'ie' sprnorm( spcov_params, mean = 0, samples = 1, data, randcov_params, partition_factor, ... ) ## S3 method for class 'car' sprnorm( spcov_params, mean = 0, samples = 1, data, randcov_params, partition_factor, W, row_st = TRUE, M, ... )
sprnorm( spcov_params, mean = 0, samples = 1, data, randcov_params, partition_factor, ... ) ## S3 method for class 'exponential' sprnorm( spcov_params, mean = 0, samples = 1, data, randcov_params, partition_factor, xcoord, ycoord, ... ) ## S3 method for class 'none' sprnorm( spcov_params, mean = 0, samples = 1, data, randcov_params, partition_factor, ... ) ## S3 method for class 'ie' sprnorm( spcov_params, mean = 0, samples = 1, data, randcov_params, partition_factor, ... ) ## S3 method for class 'car' sprnorm( spcov_params, mean = 0, samples = 1, data, randcov_params, partition_factor, W, row_st = TRUE, M, ... )
spcov_params |
An |
mean |
A numeric vector representing the mean. |
samples |
The number of independent samples to generate. The default
is |
data |
A data frame or |
randcov_params |
A |
partition_factor |
A formula indicating the partition factor. |
... |
Other arguments. Not used (needed for generic consistency). |
xcoord |
Name of the column in |
ycoord |
Name of the column in |
W |
Weight matrix specifying the neighboring structure used for car and
sar models. Not required if |
row_st |
A logical indicating whether row standardization be performed on
|
M |
M matrix satisfying the car symmetry condition. The car
symmetry condition states that |
Random variables are simulated via the product of the covariance matrix's
square (Cholesky) root and independent standard normal random variables
with mean 0 and variance 1. Computing the square root is a significant
computational burden and likely unfeasible for sample sizes much past 10,000.
Because this square root only needs to be computed once, however, it is
nearly the sample computational cost to call sprnorm()
for any value
of samples
.
Only methods for the exponential
, none
, and car
covariance functions are documented here,
but methods exist for all other spatial covariance functions defined in
spcov_initial()
. Syntax for the exponential
method is the same
as syntax for spherical
, gaussian
, triangular
,
circular
, cubic
, pentaspherical
, cosine
, wave
,
jbessel
, gravity
, rquad
, magnetic
, matern
,
cauchy
, and pexponential
methods. Syntax for
the car
method is the same as syntax for the sar
method. The
extra
parameter for car and sar models is ignored when all observations have
neighbors.
If samples
is 1, a vector of random variables for each row of data
is returned. If samples
is greater than one, a matrix of random variables
is returned, where the rows correspond to each row of data
and the columns
correspond to independent samples.
spcov_params_val <- spcov_params("exponential", de = 1, ie = 1, range = 1) sprnorm(spcov_params_val, data = caribou, xcoord = x, ycoord = y) sprnorm(spcov_params_val, mean = 1:30, samples = 5, data = caribou, xcoord = x, ycoord = y)
spcov_params_val <- spcov_params("exponential", de = 1, ie = 1, range = 1) sprnorm(spcov_params_val, data = caribou, xcoord = x, ycoord = y) sprnorm(spcov_params_val, mean = 1:30, samples = 5, data = caribou, xcoord = x, ycoord = y)
Simulate a spatial Poisson random variable with a specific mean and covariance structure.
sprpois( spcov_params, mean = 0, samples = 1, data, randcov_params, partition_factor, ... )
sprpois( spcov_params, mean = 0, samples = 1, data, randcov_params, partition_factor, ... )
spcov_params |
An |
mean |
A numeric vector representing the mean. |
samples |
The number of independent samples to generate. The default
is |
data |
A data frame or |
randcov_params |
A |
partition_factor |
A formula indicating the partition factor. |
... |
Additional arguments passed to |
The values of spcov_params
, mean
, and randcov_params
are assumed to be on the link scale. They are used to simulate a latent normal (Gaussian)
response variable using sprnorm()
. This latent variable is the
conditional mean used with dispersion
to simulate a Poisson random variable.
If samples
is 1, a vector of random variables for each row of data
is returned. If samples
is greater than one, a matrix of random variables
is returned, where the rows correspond to each row of data
and the columns
correspond to independent samples.
spcov_params_val <- spcov_params("exponential", de = 0.2, ie = 0.1, range = 1) sprpois(spcov_params_val, data = caribou, xcoord = x, ycoord = y) sprpois(spcov_params_val, samples = 5, data = caribou, xcoord = x, ycoord = y)
spcov_params_val <- spcov_params("exponential", de = 0.2, ie = 0.1, range = 1) sprpois(spcov_params_val, data = caribou, xcoord = x, ycoord = y) sprpois(spcov_params_val, samples = 5, data = caribou, xcoord = x, ycoord = y)
Sulfate atmospheric deposition in the conterminous USA.
sulfate
sulfate
An sf
object with 197 rows and 2 columns.
sulfate: Total wet deposition sulfate in kilograms per hectare.
geometry: POINT
geometry representing coordinates in a
Conus Albers projection (EPSG: 5070). Distances between points are in meters.
These data were used in the publication listed in References. Data were downloaded from the National Atmospheric Deposition Program National Trends Network.
Zimmerman, D.L. (1994). Statistical analysis of spatial data. Pages 375-402 in Statistical Methods for Physical Science, J. Stanford and S. Vardeman (eds.), Academic Press: New York.
Locations at which to predict sulfate atmospheric deposition in the conterminous USA.
sulfate_preds
sulfate_preds
An sf
object with 197 rows and 1 column.
geometry: POINT
geometry representing coordinates in a
Conus Albers projection (EPSG: 5070).
These data were used in the publication listed in References. Data were downloaded from the National Atmospheric Deposition Program National Trends Network.
Zimmerman, D.L. (1994). Statistical analysis of spatial data. Pages 375-402 in Statistical Methods for Physical Science, J. Stanford and S. Vardeman (eds.), Academic Press: New York.
Summarize a fitted model object.
## S3 method for class 'splm' summary(object, ...) ## S3 method for class 'spautor' summary(object, ...) ## S3 method for class 'spglm' summary(object, ...) ## S3 method for class 'spgautor' summary(object, ...)
## S3 method for class 'splm' summary(object, ...) ## S3 method for class 'spautor' summary(object, ...) ## S3 method for class 'spglm' summary(object, ...) ## S3 method for class 'spgautor' summary(object, ...)
object |
A fitted model object from |
... |
Other arguments. Not used (needed for generic consistency). |
summary()
creates a summary of a fitted model object
intended to be printed using print()
. This summary contains
useful information like the original function call, residuals,
a coefficients table, a pseudo r-squared, and estimated covariance
parameters.
A list with several fitted model quantities used to create informative summaries when printing.
spmod <- splm(z ~ water + tarp, data = caribou, spcov_type = "exponential", xcoord = x, ycoord = y ) summary(spmod)
spmod <- splm(z ~ water + tarp, data = caribou, spcov_type = "exponential", xcoord = x, ycoord = y ) summary(spmod)
Texas voter turnout data collected during the United States 1980 Presidential election.
texas
texas
An sf
object with 254 rows and 4 columns:
FIPS: Federal Information Processing System (FIPS) county codes.
turnout: Proportion of eligible voters who voted.
log_income: The natural logarithm of average per capita (in dollars) income.
geometry: POINT
geometry representing coordinates in a NAD83
projection (EPSG: 5070). Distances between points are in meters.
The data source is the elect80
data set in the spData
R package.
Tidy a fitted model object into a summarized tibble.
## S3 method for class 'splm' tidy(x, conf.int = FALSE, conf.level = 0.95, effects = "fixed", ...) ## S3 method for class 'spautor' tidy(x, conf.int = FALSE, conf.level = 0.95, effects = "fixed", ...) ## S3 method for class 'spglm' tidy(x, conf.int = FALSE, conf.level = 0.95, effects = "fixed", ...) ## S3 method for class 'spgautor' tidy(x, conf.int = FALSE, conf.level = 0.95, effects = "fixed", ...)
## S3 method for class 'splm' tidy(x, conf.int = FALSE, conf.level = 0.95, effects = "fixed", ...) ## S3 method for class 'spautor' tidy(x, conf.int = FALSE, conf.level = 0.95, effects = "fixed", ...) ## S3 method for class 'spglm' tidy(x, conf.int = FALSE, conf.level = 0.95, effects = "fixed", ...) ## S3 method for class 'spgautor' tidy(x, conf.int = FALSE, conf.level = 0.95, effects = "fixed", ...)
x |
A fitted model object from |
conf.int |
Logical indicating whether or not to include a confidence interval
in the tidied output. The default is |
conf.level |
The confidence level to use for the confidence interval if
|
effects |
The type of effects to tidy. Available options are |
... |
Other arguments. Not used (needed for generic consistency). |
A tidy tibble of summary information effects
.
glance.spmodel()
augment.spmodel()
spmod <- splm(z ~ water + tarp, data = caribou, spcov_type = "exponential", xcoord = x, ycoord = y ) tidy(spmod) tidy(spmod, effects = "spcov")
spmod <- splm(z ~ water + tarp, data = caribou, spcov_type = "exponential", xcoord = x, ycoord = y ) tidy(spmod) tidy(spmod, effects = "spcov")
Compare the proportion of total variability explained by the fixed effects and each variance parameter.
varcomp(object, ...) ## S3 method for class 'splm' varcomp(object, ...) ## S3 method for class 'spautor' varcomp(object, ...) ## S3 method for class 'spglm' varcomp(object, ...) ## S3 method for class 'spgautor' varcomp(object, ...)
varcomp(object, ...) ## S3 method for class 'splm' varcomp(object, ...) ## S3 method for class 'spautor' varcomp(object, ...) ## S3 method for class 'spglm' varcomp(object, ...) ## S3 method for class 'spgautor' varcomp(object, ...)
object |
A fitted model object (e.g., from |
... |
Other arguments. Not used (needed for generic consistency). |
A tibble that partitions the the total variability by the fixed effects
and each variance parameter. The proportion of variability explained by the
fixed effects is the pseudo R-squared obtained by psuedoR2()
. The
remaining proportion is spread accordingly among each variance parameter:
"de"
, "ie"
, and if random effects are used, each named random effect.
If spautor()
objects have unconnected sites, a list is returned with three elements:
"connected"
for a variability comparison among the connected sites;
"unconnected"
for a variability comparison among the unconnected
sites; and "ratio"
for the ratio of the variance of the connected
sites relative to the variance of the unconnected sites.
spmod <- splm(z ~ water + tarp, data = caribou, spcov_type = "exponential", xcoord = x, ycoord = y ) varcomp(spmod)
spmod <- splm(z ~ water + tarp, data = caribou, spcov_type = "exponential", xcoord = x, ycoord = y ) varcomp(spmod)
Calculate variance-covariance matrix for a fitted model object.
## S3 method for class 'splm' vcov(object, ...) ## S3 method for class 'spautor' vcov(object, ...) ## S3 method for class 'spglm' vcov(object, var_correct = TRUE, ...) ## S3 method for class 'spgautor' vcov(object, var_correct = TRUE, ...)
## S3 method for class 'splm' vcov(object, ...) ## S3 method for class 'spautor' vcov(object, ...) ## S3 method for class 'spglm' vcov(object, var_correct = TRUE, ...) ## S3 method for class 'spgautor' vcov(object, var_correct = TRUE, ...)
object |
A fitted model object from |
... |
Other arguments. Not used (needed for generic consistency). |
var_correct |
A logical indicating whether to return the corrected variance-covariance
matrix for models fit using |
The variance-covariance matrix of coefficients obtained via coef()
.
Currently, only the variance-covariance matrix of the fixed effects is supported.
spmod <- splm(z ~ water + tarp, data = caribou, spcov_type = "exponential", xcoord = x, ycoord = y ) vcov(spmod)
spmod <- splm(z ~ water + tarp, data = caribou, spcov_type = "exponential", xcoord = x, ycoord = y ) vcov(spmod)