tidyclust (development version)

Added butcher support for cluster_fit objects. axe_data() removes the training data stored in the fit, and axe_env() clears the environment reference from the preprocessing terms. (#126)
extract_cluster_assignment(), extract_centroids(), and predict() now accept a labels argument, a character vector of cluster labels that overrides the auto-generated prefix-based labels. (#148)
hier_clust() gains a dist_fun argument for specifying a custom distance function. (#70)
The dist_fun argument accepted by cluster metrics is now documented, including how to use {philentropy} to supply custom distance methods. See vignette("tuning_and_metrics", package = "tidyclust") for examples. (#185)
Added a "Getting started with tidyclust" vignette (vignette("tidyclust")). (#232)
contr_one_hot is now exported, fixing the indicators = "one_hot" code path in .convert_form_to_x_fit() and .convert_form_to_x_new(). (#218)
finalize_model_tidyclust() and finalize_workflow_tidyclust() are deprecated. Use tune::finalize_model() and tune::finalize_workflow() instead, which now support cluster_spec objects natively. (#223)
tune_cluster() now warns when passed an apparent() resample. Metrics from apparent resamples are excluded by collect_metrics(summarize = TRUE) (the default) since tune 1.2.0, which caused unexpected NA values. Use collect_metrics(summarize = FALSE) to see per-resample metrics. (#193)
hier_clust() documentation now clarifies that predict() may not match extract_cluster_assignment() on training data. This is expected behavior: predict() uses a distance-based heuristic while extract_cluster_assignment() uses cutree() based on the dendrogram structure. (#208)
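
The mismatch described in the last entry can be reproduced without tidyclust. The following is a minimal base R sketch, not the package's implementation: it cuts a single-linkage dendrogram with cutree() and compares the result with a nearest-centroid rule, which stands in here for a distance-based heuristic. The toy data and the nearest-centroid rule are assumptions chosen for illustration; the heuristic that predict() actually applies may differ, for example by linkage method.

```r
# Toy 1-D data: a chain of points 0..10 plus two points further out.
x <- c(0:10, 12, 13)

# Dendrogram-based assignment: single linkage keeps the whole chain together.
tree <- hclust(dist(x), method = "single")
tree_cluster <- cutree(tree, k = 2)

# Distance-based stand-in (an assumption, not tidyclust's rule):
# assign each point to the nearest cluster centroid.
centroids <- tapply(x, tree_cluster, mean)
nearest_cluster <- unname(vapply(
  x,
  function(xi) which.min(abs(xi - centroids)),
  integer(1)
))

# Points where the two rules disagree: the far end of the chain.
mismatch <- x[tree_cluster != nearest_cluster]
mismatch
# 9 10
```

The points 9 and 10 belong to the long chain in the dendrogram, yet they lie closer to the centroid of the small cluster, so the two rules assign them differently even on training data.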

New Clustering Specifications

The db_clust() clustering specification has been added. This specification allows for the use of the DBSCAN algorithm using the dbscan engine. (#209)
The gm_clust() clustering specification has been added. This specification allows for the fitting of Gaussian mixture models using the mclust engine. (#209)
The mean_shift() clustering specification has been added. This specification fits clusters by iteratively shifting observations toward regions of high density, with the number of clusters determined automatically. The LPCM engine is used. (#240)
mean_shift() gains a new meanShiftR engine. (#244)
The .config column produced by tune_cluster() has changed from the Preprocessor{num}_Model{num} pattern to pre{num}_mod{num}_post{num} to align with updates in the tune package. (#220)
The foreach package is no longer supported for parallel processing in tune_cluster(). Use the future or mirai packages instead. See ?tune::parallelism for details. (#220)
tune_cluster() now supports parallel processing via the mirai package in addition to future. (#220)
The .notes column returned by tune_cluster() now includes a trace column containing backtraces for errors and warnings, making it easier to debug failures. (#220)
Fixed bug when trying to tune the linkage_method argument. (#206, @lgaborini)
sse_within_total() now correctly applies a custom dist_fun when new_data is NULL by using training data stored in the model. (#184)
silhouette_avg() now has direction = "maximize" instead of direction = "zero", so that show_best() and select_best() correctly return models with the highest silhouette values. (#212, @dnldelarosa)
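
For the move away from foreach noted above, a parallel backend is registered before tuning. A minimal sketch using the future package follows. The worker count of 2 is an arbitrary choice for illustration, and the tune_cluster() call itself is left as a comment because its arguments depend on the workflow and resamples at hand. See ?tune::parallelism for the supported setups.

```r
library(future)

# Register a multisession backend with two background R workers.
plan(multisession, workers = 2)
workers_in_use <- nbrOfWorkers()

# ... call tune_cluster() on a workflow and resamples here ...

# Return to sequential processing once tuning is done.
plan(sequential)
```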

tidyclust 0.2.4

The philentropy package is now used to calculate distances rather than Rfast. (#199)

tidyclust 0.2.3

Update to fix revdep issue for clustMixType. (#190)

tidyclust 0.2.2

Update to fix revdep issue for ClusterR. (#186)

tidyclust 0.2.1

Small change to ease the CRAN release of the tune package. (#178)

tidyclust 0.2.0

New Engines

The clustMixType engine has been added to k_means(). This engine allows fitting of k-prototype models. (#63)
The klaR engine has been added to k_means(). This engine allows fitting of k-modes models. (#63)

Improvements

Engine-specific documentation has been added for all models and engines. (#159)

Bug Fixes

Fixed bug where engine-specific arguments were passed along for k_means() when using the ClusterR engine. (#142)
Fixed bug where the prefix argument wasn't correctly passed through extract_cluster_assignment(), extract_centroids(), and predict(). (#145)
Metric functions now error informatively if used with unfit cluster specifications. (#146)
Fixed bug that caused incorrect cluster ordering in extract_fit_summary(). (#136)
Using extract_cluster_assignment(), extract_centroids(), and predict() on a fitted hier_clust() model without specifying num_clusters or cut_height now gives a more informative error message. (#147)
k_means() now errors informatively if fit() is called without num_clusters specified. (#134)
Fixed bug where factor levels didn't match the number of clusters when predicting on a smaller number of observations. (#158)
Fixed bug where tune_cluster() would error if used with a recipe that contained non-predictor variables such as id variables. (#124)
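
As a hedged illustration of the hier_clust() entry above, the sketch below fits a model without fixing the number of clusters and then supplies num_clusters or cut_height at extraction time. The mtcars columns, the linkage method, and the values 3 and 5 are arbitrary choices for illustration, and the calls assume a current tidyclust installation.

```r
library(tidyclust)

# Fit without fixing the number of clusters or a cut height.
hc_spec <- hier_clust(linkage_method = "average")
hc_fit <- fit(hc_spec, ~ mpg + wt, data = mtcars)

# Supply one of the two at extraction time instead.
by_count <- extract_cluster_assignment(hc_fit, num_clusters = 3)
by_height <- extract_cluster_assignment(hc_fit, cut_height = 5)
```

Calling extract_cluster_assignment(hc_fit) with neither argument is the case that now produces the more informative error.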

Breaking Changes

Exported internal functions ClusterR_kmeans_fit(), stats_kmeans_fit(), and hclust_fit() have been renamed to .k_means_fit_ClusterR(), .k_means_fit_stats(), and .hier_clust_fit_stats() to reduce visibility for users.
Cluster reordering is now done at the fitting time, not the extraction and prediction time. (#154)

tidyclust 0.1.2

The cluster specification methods for generics::tune_args() and generics::tunable() are now registered unconditionally. (#115)

tidyclust 0.1.1

Fixed bug where extract_cluster_assignment() and predict() sometimes disagreed on cluster assignments. (#94)
silhouette() and silhouette_avg() now return NAs instead of erroring when applied to a clustering object with 1 cluster. (#104)
Fixed bug where extract_cluster_assignment() didn't work for hier_clust() models in workflows when num_clusters was specified in extract_cluster_assignment().

tidyclust 0.1.0

Added a NEWS.md file to track changes to the package.
