factor256

The goal of factor256 is to minimize the memory footprint of data analysis that uses categorical variables with fewer than 256 unique values.
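To see where the saving comes from, here is a minimal base R sketch of the underlying idea. It does not show how factor256 is implemented: `encode_raw` and `decode_raw` are hypothetical helpers written only for this illustration. The sketch relies on the fact that a raw vector stores one byte per element, while an integer vector (and therefore a base factor) stores four.

```r
# Illustration only: hypothetical helpers, not factor256's internals.
states <- c("WA", "SA", "NSW", "NT", "TAS", "VIC", "QLD")
x <- rep_len(states, 1e6)

# Map each value to a one-byte code in 0..255 (no NA handling in this sketch).
encode_raw <- function(x, levels = unique(x)) {
  stopifnot(length(levels) <= 256L)
  as.raw(match(x, levels) - 1L)
}

# Recover the original values from the codes and the level table.
decode_raw <- function(codes, levels) {
  levels[as.integer(codes) + 1L]
}

codes <- encode_raw(x, states)
f <- factor(x, levels = states)

typeof(codes)        # "raw"
object.size(f)       # four bytes per element, plus attributes
object.size(codes)   # one byte per element
identical(decode_raw(codes, states), x)
```

The level table has to be kept alongside the codes for decoding to work; in this sketch it is passed explicitly.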

Installation

You can install the development version of factor256 from GitHub with:

# install.packages("devtools")
devtools::install_github("HughParsonage/factor256")

Example

This example encodes a character vector as raw, then converts every column of a large data.table, shrinking it from about 3.36 GB to about 630 MB:

library(factor256)
x <- factor256(LETTERS)
typeof(x)
#> [1] "raw"
identical(recompose256(x), LETTERS)
#> [1] TRUE
library(data.table)
DT <-
  CJ(Year = 2000:2020,
     State = rep_len(c("WA", "SA", "NSW", "NT", "TAS", "VIC", "QLD"), 1000),
     Age = rep_len(0:100, 10000))
# pryr::object_size(DT)
# 3.36 GB
# Replace each column in place with its raw (one-byte) encoding.
for (j in seq_along(DT)) {
  set(DT, j = j, value = factor256(.subset2(DT, j)))
}
# pryr::object_size(DT)
# 630 MB
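The two sizes quoted in the comments can be checked with some back-of-the-envelope arithmetic. This is a sketch under stated assumptions: a 64-bit R build (8 bytes per character element, 4 per integer), `CJ()` keeping duplicate values as it does by default, and per-column overhead and the shared string cache being small enough to ignore.

```r
# Rows produced by the cross join: 21 years x 1000 states x 10000 ages.
n_rows <- 21 * 1000 * 10000

# Before: integer Year (4 bytes), character State (8-byte pointer), integer Age (4 bytes).
bytes_before <- n_rows * (4 + 8 + 4)

# After: three raw columns at 1 byte per element.
bytes_after <- n_rows * 3

bytes_before / 1e9   # size in GB before conversion
bytes_after / 1e6    # size in MB after conversion
bytes_before / bytes_after
```

Under these assumptions each row drops from 16 bytes to 3, which matches the reported 3.36 GB and 630 MB.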
