Capybara now offers several variance-covariance estimators that do
not require calling summary() (this differs from
Alpaca). I added a vignette replicating Cameron and Miller (2014) to
show how to use 1-way, 2-way, and dyadic clustering.
Improved numerical robustness of convergence criteria across all
fitting functions (fepoisson_asymmetric(),
feglm_fit(), fenegbin_fit()) to handle FMA
(Fused Multiply-Add) compiler optimizations on macOS and ARM.
Convergence checks now use hybrid absolute + relative tolerance
thresholds that scale appropriately, and bias correction now uses
stronger ridge regularization for numerical stability. This eliminates
intermittent convergence failures across different platforms and
compiler configurations.
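As an illustration of the hybrid absolute + relative check, here is a minimal C++ sketch. The function name has_converged and the default tolerances are hypothetical and are not taken from the package source. The idea is that the absolute term covers objective values near zero, while the relative term scales with the magnitude of the values, so last-bit differences such as those introduced by FMA do not decide the outcome.

```cpp
#include <algorithm>
#include <cmath>

// Hypothetical helper (not the package's actual code): returns true when two
// successive objective values, e.g. deviances, are close enough to stop.
// abs_tol handles values near zero; rel_tol scales with the larger magnitude.
inline bool has_converged(double previous, double current,
                          double abs_tol = 1e-10, double rel_tol = 1e-8) {
  const double diff = std::fabs(current - previous);
  const double scale = std::max(std::fabs(previous), std::fabs(current));
  return diff <= abs_tol + rel_tol * scale;
}
```

With a purely relative rule, an objective that approaches zero would never pass the check; with a purely absolute rule, a large objective would need unreasonably many iterations.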
Besides vcov estimation, capybara now allows updating formulas,
which is explained in the vcov vignette. TL;DR: you can use
fml <- mpg ~ wt | am followed by
felm(update(fml, . ~ . | cyl), data = mtcars) and variations
of it (the same applies to feglm() and to updating the cluster).
I rewrote the Rank-Revealing Cholesky general method to
contribute it back to Armadillo.
I found an error when using
summary(lm/glm, type = "clustered") that substantially
underestimated the standard errors. This is now fixed, and I merged the
“clustered” and “sandwich” types into a single “sandwich” type for
clarity and consistency, as both use a bread-meat-bread approach.
The InferenceGLM struct now stores the VCOV matrix and
the standard errors to reuse computation and
streamline the summary creation. The same applies to the InferenceLM
struct.
All inference (standard errors, R-squared, etc.) was moved to the C++ side,
and the print() and summary() functions only
format the output.
Allows formulas without fixed effects, such as
y ~ x1 + x2 and y ~ x1 + x2 | 0 | cluster and
formulas without slopes such as y ~ 0 | fe1 + fe2.
Allows offsets in GLMs.
All the computation is done on the C++ side, and model formulas now
explicitly fail if there are functions inside them.
The default is now
predict(glm_object, type = "response"), unlike base R,
where predict.glm() defaults to type = "link".
Most of the R and C++ code was refactored to use memory
efficiently.
Follows fixest-based normalization for fixed effects to match Stata
results.
Provides the option to use
control = list(centering = "berge") or
list(centering = "stammann"). Both methods are equivalent
but differ in their internal logic. Berge’s fixed-point approach
is usually faster.
Supports Probit and Logit regression.
Adds parallelization over columns for efficient centering
regardless of the method used.
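For readers unfamiliar with centering, the C++ sketch below demeans one column by two fixed effects using plain alternating projections. This is a simplification written for illustration: it is neither the "berge" nor the "stammann" implementation, and the function names, tolerance, and iteration cap are hypothetical. Because each column is centered independently of the others, the work can be split across columns.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Subtracts the group mean from every element of v, in place.
// Group ids are assumed to be 0, 1, ..., n_groups - 1, all present in the data.
void sweep_group_means(std::vector<double>& v, const std::vector<int>& group,
                       int n_groups) {
  std::vector<double> sum(n_groups, 0.0);
  std::vector<int> count(n_groups, 0);
  for (std::size_t i = 0; i < v.size(); ++i) {
    sum[group[i]] += v[i];
    ++count[group[i]];
  }
  for (std::size_t i = 0; i < v.size(); ++i) {
    v[i] -= sum[group[i]] / count[group[i]];
  }
}

// Alternates the two sweeps until a full pass changes no element by more
// than tol. Returns the number of passes performed.
int center_two_way(std::vector<double>& v,
                   const std::vector<int>& g1, int n1,
                   const std::vector<int>& g2, int n2,
                   double tol = 1e-10, int max_iter = 10000) {
  for (int it = 0; it < max_iter; ++it) {
    const std::vector<double> before = v;
    sweep_group_means(v, g1, n1);
    sweep_group_means(v, g2, n2);
    double max_change = 0.0;
    for (std::size_t i = 0; i < v.size(); ++i) {
      max_change = std::max(max_change, std::fabs(v[i] - before[i]));
    }
    if (max_change <= tol) return it + 1;
  }
  return max_iter;
}
```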
Adds a new function fepoisson_asymmetric() to compare
coefficients across expectiles (e.g., 10%, 50%, 90%) by weighting
positive and negative residuals asymmetrically. This is based on “The Tails of Gravity”
(10.1016/j.jinteco.2025.104145). The argument
expectile_glm_iter_max in fit_control()
controls the number of inner GLM iterations per APPML step. Setting it
to 1L updates asymmetric weights at every Newton step
instead of only after the inner GLM converges, which typically reduces
total iterations needed.
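The asymmetric weighting can be illustrated without a regression model. The C++ sketch below computes the expectile of a plain sample by iteratively reweighted means, using the standard asymmetric least squares rule (weight tau for non-negative residuals, 1 - tau otherwise). The function names are hypothetical, and the weights used inside fepoisson_asymmetric() may be defined differently.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Standard asymmetric least squares weight (an assumption about the rule,
// not a copy of the package's code).
inline double asymmetric_weight(double residual, double tau) {
  return residual >= 0.0 ? tau : 1.0 - tau;
}

// Expectile of a non-empty sample: repeat a weighted mean, recomputing the
// weights from the current residuals, until the estimate stops moving.
double sample_expectile(const std::vector<double>& y, double tau,
                        int max_iter = 100, double tol = 1e-12) {
  double m = 0.0;
  for (double v : y) m += v;
  m /= static_cast<double>(y.size());  // start from the ordinary mean

  for (int it = 0; it < max_iter; ++it) {
    double num = 0.0, den = 0.0;
    for (double v : y) {
      const double w = asymmetric_weight(v - m, tau);
      num += w * v;
      den += w;
    }
    const double next = num / den;
    if (std::fabs(next - m) <= tol * (1.0 + std::fabs(m))) return next;
    m = next;
  }
  return m;
}
```

At tau = 0.5 all weights are equal and the expectile is the ordinary mean; larger values of tau pull the estimate towards the upper tail.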
The summary_table() function now accepts positioning
arguments for LaTeX.
capybara 1.8.1
Adds a link to the published article and citation info.
capybara 1.8.0
Drops conjugate gradient acceleration and uses Irons-Tuck
acceleration instead. It is slightly faster.
The benchmarks show a small overhead compared to fixest, with a
much smaller memory footprint.
capybara 1.7.0
All the computation is done on the C++ side. R only does the data
cleaning/wrangling.
Implements a rank-revealing Cholesky factorisation like fixest.
Returns estimated fixed effects by default (with an option not
to).
capybara 1.6.0
Handles collinearities in the model matrix by using a QR
decomposition when Cholesky fails.
It can return NA coefficients when there is collinearity to match
base R outputs.
capybara 1.4.0
Adds an extended battery of optional tests for the Poisson
model.
Modular code for easier maintenance.
capybara 1.3.0
Explicitly avoids Intel MKL and falls back to OpenBLAS to avoid
issues with non-reproducible results.
Uses OpenMP to parallelize the demeaning functions, which can lead
to significant speedups in large datasets.
Uses Irons-Tuck acceleration for fast convergence in the demeaning
functions.
capybara 1.2.0
Changes to fit and summary functions to report perfectly classified
observations.
Dropped linear dependence checks, leaving it to the Cholesky
decomposition to handle it.
capybara 1.1.0
The workhorse demeaning functions were rewritten for a more
efficient implementation. This is based on ppmlhdfe and fixest
code.
Loops were avoided and replaced with efficient matrix
operations.
capybara 1.0.3
Implements some ideas from reghdfe/ppmlhdfe to improve the
centering/demeaning functions.
capybara 1.0.2
Small refactors for speed.
capybara 1.0.1
The examples now use smaller datasets to avoid CRAN timeouts with
Clang-ASAN.
capybara 1.0.0
Implements a new approach to obtain the rank with a QR decomposition
without loss of stability.
Adds different refactors to:
Streamline the code
Pass all large objects by reference
Use BLAS/LAPACK instead of iteration for some operations
Uses a new configure file that works nicely with Intel MKL (i.e. the
user does not need to export environment variables for the package to
detect MKL).
capybara 0.9.6
Calculates the rank of matrix X based on singular value
decomposition instead of QR decomposition. This is more efficient and
numerically stable.
capybara 0.9.5
Fixes and expands the ‘weights’ argument in the fe*()
functions to allow for different types of weights. The default is still
NULL (i.e., all weights equal to 1). The argument now
admits weights passed as weights = ~cyl,
weights = mtcars$cyl, or
w <- mtcars$cyl; weights = w.
capybara 0.9.4
Allows estimating models without fixed effects.
capybara 0.9.3
Fixes the tidy() method for linear models
(felm class). It no longer requires loading the
tibble package to work.
Adds a wrapper to present multiple models in a single table with
the option to export to LaTeX.
capybara 0.9.2
Implements Irons and Tuck acceleration for fast convergence.
capybara 0.9.1
Fixes a minor uninitialized variable in the C++ code used for a
conditional check.
capybara 0.9
First CRAN version
Refactored functions to avoid data copies:
center variables
crossprod
GLM and LM fit
get alpha
group sums
mu eta
variance
iter_center_max and iter_inner_max can now
be modified in feglm_control().
capybara 0.8.0
Dedicated functions for linear models to avoid the overhead of
running the GLM function with a Gaussian link.
capybara 0.7.0
The predict method now allows passing new data to predict the
outcome.
Fully documented code and tests according to rOpenSci
standards.
capybara 0.6.0
Moves all the heavy computation to C++ using Armadillo and it
exports the results to R. Previously, there were multiple data copies
between R and C++ that added overhead to the computations.
Previous versions returned MX by default; now it has to be
requested explicitly.
Adds code to extract the fixed effects with felm
objects.
capybara 0.5.2
Uses an O(n log(n)) algorithm to compute the Kendall correlation for
the pseudo-R2 in the Poisson model.
capybara 0.5.1
Uses arma::field consistently instead of
std::vector<std::vector<>> for indices.
Linear algebra changes, such as using arma::inv instead
of solving arma::qr for the inverse.
Replaces multiple for loops with dedicated Armadillo functions.
capybara 0.5.0
Avoids for loops in the C++ code, and instead uses Armadillo’s
functions.
O(n) computations in C++ access data directly by using
pointers.
capybara 0.4.6
Fixes notes from tidyselect regarding the use of
all_of().
The C++ code follows a more consistent style.
The GH-Actions do not test gcc 4.8 anymore.
capybara 0.4.5
Ungroups the data to avoid issues with the model matrix
capybara 0.4
Uses R’s C API efficiently to add a few more memory
optimizations
capybara 0.3.5
Uses Mat consistently for all matrix operations (avoids
vectors)
capybara 0.3
Reduces memory footprint ~45% by moving some computation to
Armadillo’s side
capybara 0.2
Includes pseudo R2 (same as Stata) for Poisson models