CRAN Task View: Computational Econometrics

Maintainer: Achim Zeileis
Contact: Achim.Zeileis at R-project.org
Version: 2014-02-28

Base R ships with a lot of functionality useful for computational econometrics, in particular in the stats package. This functionality is complemented by many packages on CRAN; a brief overview is given below. There is also considerable overlap between the tools for econometrics in this view and those for finance in the Finance view. Furthermore, the Finance SIG is a suitable mailing list for obtaining help and discussing questions about both computational finance and econometrics. Finally, there is also some overlap with the SocialSciences view, which covers a broad variety of tools for the social sciences, including, e.g., political science.

The packages in this view can be roughly structured into the following topics. If you think that some package is missing from the list, please let me know.

Linear regression models

  • Linear models can be fitted (via OLS) with lm() (from stats) and standard tests for model comparisons are available in various methods such as summary() and anova().
  • Analogous functions that also support asymptotic tests (z instead of t tests, and Chi-squared instead of F tests) and plug-in of other covariance matrices are coeftest() and waldtest() in lmtest.
  • Tests of more general linear hypotheses are implemented in linearHypothesis() in car.
  • HC and HAC covariance matrices that can be plugged into these functions are available in sandwich.
  • Diagnostic checking: The packages car and lmtest provide a large collection of regression diagnostics and diagnostic tests.
  • Instrumental variables regression (two-stage least squares) is provided by ivreg() in AER; another implementation is tsls() in package sem.
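
As a minimal sketch of the workflow described above, the following fits two nested models by OLS and compares them. The choice of the built-in mtcars data and of the regressors is purely illustrative. The last part is an assumption about the local setup: it only runs if lmtest and sandwich are installed.

```r
# Nested linear models fitted by OLS on the built-in mtcars data
fit_small <- lm(mpg ~ wt, data = mtcars)
fit_large <- lm(mpg ~ wt + hp, data = mtcars)

# Coefficient table with the usual t tests
print(summary(fit_large))

# F test comparing the two nested models
model_cmp <- anova(fit_small, fit_large)
print(model_cmp)

# Optional: z tests based on a heteroskedasticity-consistent (HC)
# covariance matrix; skipped if lmtest or sandwich is not installed
if (requireNamespace("lmtest", quietly = TRUE) &&
    requireNamespace("sandwich", quietly = TRUE)) {
  hc_cov <- sandwich::vcovHC(fit_large)
  print(lmtest::coeftest(fit_large, vcov. = hc_cov, df = Inf))
}
```

Setting df = Inf asks coeftest() for the normal approximation, i.e., z rather than t tests.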

Microeconometrics

  • Many standard microeconometric models belong to the family of generalized linear models (GLM) and can be fitted by glm() from package stats. This includes in particular logit and probit models for modeling choice data and Poisson models for count data. Effects for typical values of regressors in these models can be obtained and visualized using effects. Marginal effects tables for certain GLMs can be obtained using the mfx package.
  • Negative binomial GLMs are available via glm.nb() in package MASS. Another implementation of negative binomial models is provided by aod, which also contains other models for overdispersed data.
  • Zero-inflated and hurdle count models are provided in package pscl.
  • Multinomial responses: Multinomial models with individual-specific covariates only are available in multinom() from package nnet. Implementations with both individual- and choice-specific variables are mlogit and mnlogit. Generalized additive models (GAMs) for multinomial responses can be fitted with the VGAM package. A Bayesian approach to multinomial probit models is provided by MNP. Various Bayesian multinomial models (including logit and probit) are available in bayesm. Furthermore, the package RSGHB fits various hierarchical Bayesian specifications based on direct specification of the likelihood function.
  • Ordered responses: Proportional-odds regression for ordered responses is implemented in polr() from package MASS. The package ordinal provides cumulative link models for ordered data, which encompass proportional odds models but also include more general specifications. Bayesian ordered probit models are provided by bayesm.
  • Censored responses: Basic censored regression models (e.g., tobit models) can be fitted by survreg() in survival; a convenience interface tobit() is in package AER. Further censored regression models, including models for panel data, are provided in censReg. Interval regression models are in intReg. Censored regression models with conditional heteroskedasticity are in crch. Furthermore, hurdle models for left-censored data at zero can be estimated with mhurdle. Models for sample selection are available in sampleSelection and semiparametric extensions of these are provided by SemiParSampleSel.
  • Instrumental variables for binary responses: The LARF package estimates local average response functions for binary treatments and binary instruments.
  • Multivariate probit models: Estimation and marginal effect computations can be carried out with mvProbit.
  • Miscellaneous: Further, more refined tools for microeconometrics are provided in the micEcon family of packages: Analysis with Cobb-Douglas, translog, and quadratic functions is in micEcon; the constant elasticity of substitution (CES) function is in micEconCES; the symmetric normalized quadratic profit (SNQP) function is in micEconSNQP. The almost ideal demand system (AIDS) is in micEconAids. Stochastic frontier analysis is in frontier. The package bayesm implements a Bayesian approach to microeconometrics and marketing. Inference for relative distributions is contained in package reldist.
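
The GLM-based models mentioned at the top of this list can be sketched with glm() alone. The example below is only an illustration: the built-in data sets mtcars and warpbreaks, the chosen regressors, and the evaluation points in new_cars are assumptions made for the sake of a runnable example.

```r
# Binary response: logit and probit models for transmission type (am)
logit_fit  <- glm(am ~ wt, family = binomial(link = "logit"),  data = mtcars)
probit_fit <- glm(am ~ wt, family = binomial(link = "probit"), data = mtcars)

# Predicted probabilities at two illustrative values of the regressor
new_cars <- data.frame(wt = c(2.5, 3.5))
p_logit  <- predict(logit_fit,  newdata = new_cars, type = "response")
p_probit <- predict(probit_fit, newdata = new_cars, type = "response")
print(cbind(new_cars, logit = p_logit, probit = p_probit))

# Count response: Poisson regression for the number of warp breaks
pois_fit <- glm(breaks ~ wool + tension, family = poisson, data = warpbreaks)
print(summary(pois_fit))
```

Logit and probit coefficients are on different scales, so comparing predicted probabilities is usually more informative than comparing the raw estimates.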

Further regression models

  • Nonlinear least squares modeling is available in nls() in package stats.
  • Quantile regression: quantreg (including linear, nonlinear, censored, locally polynomial and additive quantile regressions).
  • Linear models for panel data: plm, providing a wide range of within, between, and random-effect methods (among others) along with corrected standard errors, tests, etc. For panel-corrected standard errors in OLS and GEE models, see geepack and pcse. Estimation of linear models with multiple group fixed effects is contained in lfe.
  • Generalized method of moments (GMM) and generalized empirical likelihood (GEL): gmm.
  • Spatial econometric models: The Spatial view gives details about handling spatial data, along with information about (regression) modeling. In particular, spatial regression models can be fitted using spdep and sphet (the latter using a GMM approach). splm is a package for spatial panel models. Spatial probit models are available in spatialprobit.
  • Linear structural equation models: sem (including two-stage least squares).
  • Simultaneous equation estimation: systemfit.
  • Nonparametric kernel methods: np.
  • Beta regression: betareg and gamlss.
  • Truncated (Gaussian) regression: truncreg.
  • Nonlinear mixed-effect models: nlme and lme4.
  • Generalized additive models (GAMs): mgcv, gam, gamlss and VGAM.
  • Mixed data sampling regression: midasr.
  • Miscellaneous: The packages VGAM, rms and Hmisc provide several tools for extended handling of (generalized) linear regression models. Zelig is a unified easy-to-use interface to a wide range of regression models.
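
For nonlinear least squares, a short sketch with nls() from stats is given below. It follows the logistic growth example from the nls() help page; the built-in DNase data and the concentrations used for prediction are illustrative choices, not part of this view.

```r
# One run of the built-in DNase assay data
DNase1 <- subset(DNase, Run == 1)

# Logistic growth curve; the self-starting model SSlogis() computes
# starting values, so none have to be supplied explicitly
nls_fit <- nls(density ~ SSlogis(log(conc), Asym, xmid, scal), data = DNase1)
print(summary(nls_fit))

# Fitted curve evaluated at a few new concentrations
nls_pred <- predict(nls_fit, newdata = data.frame(conc = c(1, 5, 10)))
print(nls_pred)
```

For models without a self-starting function, starting values have to be passed via the start argument, and convergence can depend on how well they are chosen.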

Basic time series infrastructure

  • The TimeSeries task view provides much more detailed information. Here, only the most important aspects are briefly mentioned.
  • The class "ts" in package stats is R’s standard class for regularly spaced time series (especially annual, quarterly, and monthly data).
  • Time series in "ts" format can be coerced back and forth without loss of information to "zooreg" from package zoo. zoo provides infrastructure for both regularly and irregularly spaced time series (the latter via the class "zoo") where the time information can be of arbitrary class. This includes daily series (typically with "Date" time index) or intra-day series (e.g., with "POSIXct" time index).
  • Several other implementations of irregular time series building on the "POSIXct" time-date class are available in its, tseries and timeSeries (previously: fSeries), which are all aimed particularly at finance applications. See the Finance task view for more information.
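
The following sketch illustrates the "ts" class and, if zoo is installed, the round trip to "zooreg" and an irregular daily series. All values and dates are made up for illustration; the zoo part is skipped when the package is not available.

```r
# Quarterly series starting in 2010 Q1 (simulated values, illustrative only)
set.seed(1)
y <- ts(cumsum(rnorm(20)), start = c(2010, 1), frequency = 4)

print(frequency(y))   # observations per year
print(time(y))        # regular time index

# Subset by time: the four quarters of 2012
y_2012 <- window(y, start = c(2012, 1), end = c(2012, 4))
print(y_2012)

# Optional: coercion to "zooreg" and back, plus an irregular daily series
if (requireNamespace("zoo", quietly = TRUE)) {
  z <- zoo::as.zoo(y)    # regular series of class "zooreg"
  y_back <- as.ts(z)     # back to class "ts"
  print(all.equal(as.numeric(y), as.numeric(y_back)))

  # Irregular series with a "Date" index (weekend gap between observations)
  days <- as.Date(c("2014-01-02", "2014-01-03", "2014-01-06"))
  print(zoo::zoo(c(1.2, 1.5, 1.1), order.by = days))
}
```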

Time series modeling

  • The TimeSeries task view contains detailed information about time series analysis in R. Time series models for financial econometrics (e.g., GARCH, stochastic volatility models, or stochastic differential equations) are described in the Finance task view. Here, only a brief overview of the most important methods for econometrics is given.
  • Classical time series modeling tools are contained in the stats package and include arima() for ARIMA modeling and Box-Jenkins-type analysis.
  • Fitting linear regression models with AR error terms via GLS is possible using gls() from nlme.
  • Structural time series models are provided by StructTS() in stats.
  • Filtering and decomposition for time series are available in decompose() and HoltWinters() in stats.
  • Extensions to these methods, in particular for forecasting and model selection, are provided in the forecast package.
  • Miscellaneous time series filters are available in mFilter.
  • For estimating VAR models, several methods are available: simple models can be fitted by ar() in stats, more elaborate models are provided in package vars and by estVARXls() in dse, and a Bayesian approach is available in MSBVAR. A convenient interface for fitting dynamic regression models via OLS is available in dynlm; a different approach that also works with other regression functions is implemented in dyn.
  • More advanced dynamic system equations can be fitted using dse.
  • Periodic autoregressive models are provided by partsm.
  • Gaussian linear state space models can be fitted using dlm (via maximum likelihood, Kalman filtering/smoothing and Bayesian methods).
  • Unit root and cointegration techniques are available in urca, tseries, and CADFtest.
  • Time series factor analysis is available in tsfa.
  • Asymmetric price transmission modeling is available in apt.
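
The classical tools from the stats package listed above can be tried on built-in series. The sketch below is illustrative only: the data sets AirPassengers and lh, the seasonal ARIMA orders, and the forecast horizon are assumptions chosen to give a short runnable example.

```r
# Seasonal ARIMA ("airline") model for the log of AirPassengers
air_fit <- arima(log(AirPassengers), order = c(0, 1, 1),
                 seasonal = list(order = c(0, 1, 1), period = 12))
print(air_fit)

# Forecast 12 months ahead on the log scale, then back-transform
air_fc <- predict(air_fit, n.ahead = 12)
print(exp(air_fc$pred))

# Classical decomposition into trend, seasonal and remainder components
air_parts <- decompose(AirPassengers, type = "multiplicative")
print(round(air_parts$figure, 3))   # estimated seasonal factors

# Autoregression for the lh series with the order chosen by AIC
lh_ar <- ar(lh)
print(lh_ar)
```

The forecast package mentioned above builds on these functions, for example with automated order selection.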

Data sets

  • Packages AER and Ecdat contain comprehensive collections of data sets from various standard econometric textbooks as well as several data sets from the Journal of Applied Econometrics and the Journal of Business & Economic Statistics data archives.
  • AER additionally provides an extensive set of examples reproducing analyses from the textbooks/papers, illustrating various econometric methods.
  • FinTS is the R companion to Tsay’s ‘Analysis of Financial Time Series’ (2nd ed., 2005, Wiley) containing data sets, functions and script files required to work some of the examples.
  • CDNmoney provides Canadian monetary aggregates.
  • pwt provides the Penn World Table from versions 5.6, 6.x, 7.x. The version 8.x data are available in pwt8.
  • The packages expsmooth, fma, and Mcomp are data packages with time series data from the books ‘Forecasting with Exponential Smoothing: The State Space Approach’ (Hyndman, Koehler, Ord, Snyder, 2008, Springer) and ‘Forecasting: Methods and Applications’ (Makridakis, Wheelwright, Hyndman, 3rd ed., 1998, Wiley) and the M-competitions, respectively.
  • Package erer contains functions and datasets for the book ‘Empirical Research in Economics: Growing up with R’ (Sun, forthcoming).
  • The package psidR available from GitHub can build panel data sets from the Panel Study of Income Dynamics (PSID).

Miscellaneous

  • Matrix manipulations: As a vector- and matrix-based language, base R ships with many powerful tools for doing matrix manipulations, which are complemented by the packages Matrix and SparseM.
  • Optimization and mathematical programming: R and many of its contributed packages provide many specialized functions for solving particular optimization problems, e.g., in regression as discussed above. Further functionality for solving more general optimization problems, e.g., likelihood maximization, is discussed in the Optimization task view.
  • Bootstrap: In addition to the recommended boot package, there are some other general bootstrapping techniques available in bootstrap or simpleboot as well as some bootstrap techniques designed for time-series data, such as the maximum entropy bootstrap in meboot or tsbootstrap() from tseries.
  • Inequality: For measuring inequality, concentration and poverty the package ineq provides some basic tools such as Lorenz curves, Pen’s parade, the Gini coefficient and many more.
  • Structural change: R is particularly strong when dealing with structural changes and changepoints in parametric models; see strucchange and segmented.
  • Exchange rate regimes: Methods for inference about exchange rate regimes, in particular in a structural change setting, are provided by fxregime.
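
To make the inequality and bootstrap items more concrete, here is a base-R sketch that computes a Gini coefficient and a percentile bootstrap interval for it. The helper gini() and the simulated incomes are hypothetical and written only for this illustration; the packages ineq and boot listed above provide more complete tools.

```r
# Hypothetical helper: Gini coefficient of a non-negative vector
# (plain sample version, without finite-sample correction)
gini <- function(x) {
  x <- sort(x)
  n <- length(x)
  2 * sum(seq_len(n) * x) / (n * sum(x)) - (n + 1) / n
}

# Made-up incomes, simulated from a log-normal distribution
set.seed(123)
income <- rlnorm(200, meanlog = 10, sdlog = 0.8)
g_hat <- gini(income)
print(g_hat)

# Nonparametric bootstrap: resample incomes with replacement
boot_gini <- replicate(999, gini(sample(income, replace = TRUE)))

# Percentile interval for the Gini coefficient
print(quantile(boot_gini, c(0.025, 0.975)))
```

A perfectly equal distribution gives gini() equal to 0, and a distribution where one of n units holds everything gives (n - 1) / n.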