NEWS
DataExplorer 0.9.0
New Features
#141 :
Added plotly argument to all plot functions. When
plotly = TRUE, plots are converted to interactive plotly
objects via plotly::ggplotly() (requires the plotly
package).
Enhancements
#181 :
Added by argument to plot_histogram and
plot_density to break down distributions by a discrete or
continuous feature.
Bug Fixes
#169 :
Charts no longer run off the edge of PDF pages. Report template now uses
smaller figure dimensions (6” x 6”) for PDF output and keeps larger
dimensions (14” x 10”) for HTML.
#172 :
Fixed plot_qq(..., by = ...) error “Faceting variables must
have at least one value”.
#185 :
Fixed warnings from deprecated aes_string.
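The 0.9.0 entries above can be tried with a short sketch like the one below. It assumes DataExplorer 0.9.0 or later is installed and that by accepts a column name as a string, as it does in plot_boxplot; the built-in iris dataset is an arbitrary choice for illustration.

```r
library(DataExplorer)

# Both arguments used below were introduced in 0.9.0, so skip on older installs.
if (packageVersion("DataExplorer") >= "0.9.0") {
  # Break distributions down by another feature (#181).
  # Passing the column name as a string is an assumption.
  plot_histogram(iris, by = "Species")
  plot_density(iris, by = "Species")

  # Interactive output (#141); only attempted if the optional plotly
  # package is available.
  if (requireNamespace("plotly", quietly = TRUE)) {
    plot_histogram(iris, plotly = TRUE)
  }
}
```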
DataExplorer 0.8.4
Bug Fixes
Fixed Rd cross-reference issues for CRAN submission by adding proper
package anchors to external function links.
Updated all .r file extensions to .R for
consistency with R naming conventions.
Updated GitHub Actions workflows to use latest action versions
(actions/checkout@v4, actions/upload-artifact@v4).
DataExplorer 0.8.3
Enhancements
#154 (PR) :
Added YAML option to allow HTML elements when choosing PDF
report.
#165 :
Added geom_jitter option to plot_boxplot and
plot_scatterplot.
#176 (PR) :
Improved legend ordering in plot_missing.
#177 (PR) :
Added group color customization in
plot_missing.
DataExplorer 0.8.2
Enhancements
#139 :
Added by argument to plot_bar.
Bug Fixes
#148 :
Addressed CRAN removal due to vignette build failure.
DataExplorer 0.8.1
Enhancements
#111 :
Continuous distributions can now be plotted with different scales, i.e.,
histogram, density, boxplot, scatterplot.
#126 :
Cleaned up labels in legend guide.
#127 (PR) :
Added option to plot only columns with missing values in
plot_missing.
Cleaned up code for create_report.
Bug Fixes
#109 :
Fixed a bug causing unordered bar charts.
#114 :
Removed redundant message in dummify.
#116 :
Fixed pandoc document conversion error 99.
#120 :
Fixed type logical being parsed as symbol in
configure_report.
#121 :
Fixed missing value bug when
split_columns(..., binary_as_factor = TRUE).
#130 (PR) :
plot_prcomp now drops columns with zero
variance.
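As a rough illustration of two of the 0.8.1 entries above (#127 and #121), the sketch below uses a made-up data frame; the column names and values are hypothetical. The missing_only argument name is not given in the entry itself and is taken from the plot_missing signature.

```r
library(DataExplorer)

# Toy data (hypothetical): `flag` is a numeric column with two distinct values.
df <- data.frame(
  a = c(1, NA, 3, 4),
  b = c("x", "y", NA, "x"),
  flag = c(0, 1, 1, 0),
  full = 1:4
)

# Only plot columns that actually contain missing values (#127).
plot_missing(df, missing_only = TRUE)

# Treat binary columns as discrete when splitting by type (#121).
parts <- split_columns(df, binary_as_factor = TRUE)
names(parts$discrete)
names(parts$continuous)
```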
DataExplorer 0.8.0
New Features
#92 :
Added update_columns to transform any selected
columns.
Enhancements
#87 :
Added configure_report function to customize report
content.
#89 :
Added option to customize geom_text and
geom_label arguments.
#91 :
create_report now displays full report directory after
completion.
#95 :
Added better exception handling for plot_bar.
#98 :
Added band customization to plot_missing.
#100 :
Switched geom_text to geom_label.
#103 :
Report title can now be customized in create_report.
#108 :
Added option to treat binary features as discrete in
plot_bar, plot_histogram,
plot_density and plot_boxplot.
Updated d3.min.js to v5.9.2.
Bug Fixes
#88 :
Added plot_intro to report config.
#90 :
Added first plot in plot_prcomp to output and
page_0.
#94 :
Fixed typo for PCA.
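A minimal sketch of update_columns (#92) together with configure_report (#87) and the report title option (#103) might look like this. The data frame, the title, and the choice to switch off the PCA section are illustrative assumptions. The report itself is only rendered in an interactive session with pandoc available, because rendering writes a file and can take a while.

```r
library(DataExplorer)

# Toy data (hypothetical column names).
df <- data.frame(a = 1:3, b = c(0.5, 1.5, 2.5), g = c("x", "y", "x"))

# Apply a function to selected columns (#92).
df <- update_columns(df, "g", as.factor)
str(df)

# Build a report configuration (#87); here the PCA section is turned off.
cfg <- configure_report(add_plot_prcomp = FALSE)

# Render only when interactive and pandoc is present; output goes to a temp dir.
if (interactive() && rmarkdown::pandoc_available()) {
  create_report(
    iris,
    config = cfg,
    report_title = "Iris profile",  # custom title (#103)
    output_dir = tempdir()
  )
}
```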
DataExplorer 0.7.1
Enhancements
#86 :
Replaced gridExtra::grid.arrange with facets.
Added seeds to vignette and README for reproducible examples.
Hid all internal functions.
DataExplorer 0.7.0
New Features
#72 :
Added plot_qq for QQ plot.
#76 :
Added plot_intro to visualize results of
introduce.
Enhancements
#42 :
Applied S3 methods for plotting functions.
#77 :
dummify now works on selected columns.
#78 :
All ggplot objects from plot_* are now invisibly returned. As a
result, profile_missing was extracted from
plot_missing to provide missing value profiles.
#83 :
Removed all deprecated functions.
#85 :
Users can now specify number of rows/columns for plot page layout.
plot_prcomp now passes scale. = TRUE to
prcomp by default.
Added sampled_rows argument to
plot_scatterplot.
Added option to parallelize plot object construction.
Updated default config for create_report.
Bug Fixes
#74 :
Fixed a bug causing create_report failure due to zero
complete rows.
#75 :
Fixed a bug in plot_str when plotting data.frame with more
than 100 columns.
#82 :
Removed hard-coded scales from all plot functions.
Fixed a bug causing wrong column indices in
split_columns.
Fixed a bug using standard deviation instead of variance in
plot_prcomp.
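The 0.7.0 changes to profile_missing, invisible plot returns (#78) and column selection in dummify (#77) can be sketched as follows. The toy data frame is hypothetical, and iris is used for dummify only because it has no missing values.

```r
library(DataExplorer)

# Toy data (hypothetical) with some missing values.
df <- data.frame(
  x = c(1, NA, 3, 4),
  g = c("a", "b", "a", "b"),
  h = c("u", "u", "v", NA)
)

# Missing value profile, one row per column (#78).
profile <- profile_missing(df)
print(profile)

# The plot object is returned invisibly, so it can be captured (#78).
p <- plot_missing(df)

# One-hot encode only the selected discrete column (#77).
encoded <- dummify(iris, select = "Species")
names(encoded)
```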
DataExplorer 0.6.1
Enhancements
Updated vignette for better clarity.
#71 :
Added better error handler for plot_prcomp.
Bug Fixes
#69 :
Fixed bug causing create_report failure (specifically from
plot_prcomp) when y is specified.
Added more unit tests for create_report and
plot_prcomp.
DataExplorer 0.6.0
New Features
#15 :
Added plot_prcomp to visualize principal component
analysis.
#54 :
Extracted dummify from plot_correlation as a
new function.
#59 :
Added introduce for basic metadata.
Enhancements
#41 :
create_report can now be customized.
#53 :
Added page number for plots that span multiple pages.
#56 :
Added support for theme and customization for individual
components.
#62 :
plot_bar now supports optional measures (in addition to
categorical frequency) using argument with.
#66 :
Feature engineering functions now work on other classes in addition to
data.table.
plot_missing:
Percentage text labels in the output plot now have 2 decimal places to
prevent small percentages from being truncated to 0%.
Added example to quickly drop columns with too many missing
values.
Added .ignoreCat and .getAllMissing to
helper.
Bug Fixes
#55 :
Fixed bugs and updated vignette with latest functions.
#57 :
Fixed plot_str not supporting S4 objects.
#63 :
Fixed plot_histogram and plot_density not
working with column names containing spaces.
DataExplorer 0.5.0
New Features
#48 :
Added plot_scatterplot to visualize the relationship of one
feature against all others.
#50 :
Added plot_boxplot to visualize continuous distributions
broken down by another feature.
Enhancements
#44 :
Added option to exclude categories in group_category.
#45 :
Added title option for all plots.
#46 :
Added option to exclude columns in set_missing.
#49 [Breaking Change] :
Switched package to tidyverse style. All old
functions are in .Deprecated mode. List of name changes in
alphabetical order:
BarDiscrete -> plot_bar
CollapseCategory -> group_category
CorrelationContinuous -> plot_correlation(..., type = "continuous")
CorrelationDiscrete -> plot_correlation(..., type = "discrete")
DensityContinuous -> plot_density
DropVar -> drop_columns
GenerateReport -> create_report
HistogramContinuous -> plot_histogram
PlotMissing -> plot_missing
PlotStr -> plot_str
SetNaTo -> set_missing
SplitColType -> split_columns
#52 :
Combined CorrelationContinuous and
CorrelationDiscrete into one function, and added option to
view correlation of all features at once.
Optimized layout for multiple plots.
Bug Fixes
#47 :
Fixed color scale for correlation heatmap.
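For code written against the pre-0.5.0 names, migration is mostly a matter of swapping function names according to the list above. A small sketch on a made-up data frame, using only the new names (the old functions were later removed in 0.7.0):

```r
library(DataExplorer)

# Toy data (hypothetical column names).
df <- data.frame(
  id = 1:4,
  x = c(1, NA, 3, NA),
  g = c("a", "b", "a", "c")
)

df <- set_missing(df, 0)       # formerly SetNaTo: numeric value fills continuous columns
df <- drop_columns(df, "id")   # formerly DropVar
parts <- split_columns(df)     # formerly SplitColType
plot_bar(df)                   # formerly BarDiscrete
```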
DataExplorer 0.4.0
New Features
#33 :
Added PlotStr to visualize data structure.
#40 :
Added network graph to GenerateReport.
Bug Fixes
#32 :
Fixed pandoc requirement error in unit test on CRAN.
#34 :
Fixed error message when quiet is not supplied. In
addition, the report directory is now printed through message()
instead of cat().
#35 :
Fixed rprojroot not found error.
Enhancements
#12 :
Added vignette: dataexplorer-intro.
#36 :
Fixed warnings from data.table in DropVar.
#37 :
Changed all cat() to message().
#38 :
Added option to order bars in BarDiscrete.
#39 :
Extended SetNaTo to discrete features.
Added more examples to README.md.
DataExplorer 0.3.0
New Features
#25 :
Added SetNaTo to quickly reset missing numerical
values.
#29 :
Added DropVar to quickly drop variables by either name or
column position.
Bug Fixes
#24 :
CorrelationDiscrete now displays all factor levels instead
of full rank matrix from model.matrix.
Enhancements
#11 :
Functions with return values now convert the output back to the class
of the input.
#22 :
Added documentation for num_all_missing in
SplitColType.
#23 :
Added additional measures (in addition to frequency) to
CollapseCategory.
#26 :
Removed density estimation section from report template.
#31 :
Added flexibility to name the new category in
CollapseCategory.
Other notes
#30 :
In CollapseCategory, update = TRUE only works
when the input data is a data.table. However, it is still
possible to view the frequency distribution with any input data class,
as long as update = FALSE.
DataExplorer 0.2.6
Bug Fixes
#20 :
Fixed permission denied bug due to intermediates_dir argument in
rmarkdown::render.
Enhancements
#16 :
Improved handling of missing values.
DataExplorer 0.2.5
Bug Fixes
#18 :
GenerateReport now handles data without discrete or
continuous features.
Enhancements
#14 :
Updated rmarkdown template for GenerateReport.
#1 :
Features with all NA values will be ignored in
BarDiscrete.
DataExplorer 0.2.4
Bug Fixes
Fixed a major bug in GenerateReport function due to
package renaming.
Enhancements
GenerateReport will now print the directory of the
report to console.
DataExplorer 0.2.3
New Features
Added function CollapseCategory to collapse sparse
categories for discrete features.
Added correlation heatmap for both continuous and discrete
features.
Added density plot for continuous features.
Bug Fixes
Fixed a bug in BarDiscrete and
CorrelationDiscrete that prevented plotting of non-factor classes.
Minor changes for CRAN re-submission.
Enhancements
Changed grid layout for BarDiscrete and
HistogramContinuous.
Features with all missing values will be ignored.
Switched position between continuous and discrete features in report
template.
Renamed package to DataExplorer.
Added NEWS.md.
Removed BoxplotContinuous.