- Added `butcher` support for `cluster_fit` objects. `axe_data()` removes the training data stored in the fit, and `axe_env()` clears the environment reference from the preprocessing terms. (#126)
- `extract_cluster_assignment()`, `extract_centroids()`, and `predict()` now accept a `labels` argument, a character vector of cluster labels that overrides the auto-generated `prefix`-based labels. (#148)
- `hier_clust()` gains a `dist_fun` argument for specifying a custom distance function. (#70)
- The `dist_fun` argument accepted by cluster metrics is now documented, including how to use {philentropy} to supply custom distance methods. See `vignette("tuning_and_metrics", package = "tidyclust")` for examples. (#185)
- Added a "Getting started with tidyclust" vignette (`vignette("tidyclust")`). (#232)
- `contr_one_hot` is now exported, fixing the `indicators = "one_hot"` code path in `.convert_form_to_x_fit()` and `.convert_form_to_x_new()`. (#218)
- `finalize_model_tidyclust()` and `finalize_workflow_tidyclust()` are deprecated. Use `tune::finalize_model()` and `tune::finalize_workflow()` instead, which now support `cluster_spec` objects natively. (#223)
- `tune_cluster()` now warns when passed an `apparent()` resample. Metrics from apparent resamples are excluded by `collect_metrics(summarize = TRUE)` (the default) since tune 1.2.0, which caused unexpected `NA` values. Use `collect_metrics(summarize = FALSE)` to see per-resample metrics. (#193)
- `hier_clust()` documentation now clarifies that `predict()` may not match `extract_cluster_assignment()` on training data. This is expected behavior: `predict()` uses a distance-based heuristic while `extract_cluster_assignment()` uses `cutree()` based on the dendrogram structure. (#208)
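
As a rough illustration of the `hier_clust()` note above, the sketch below compares the two assignment routes on the training data. It assumes the default stats engine and uses `mtcars` only as stand-in data; whether the two label vectors agree depends on the data and the linkage method.

```r
library(tidyclust)

# Fit a hierarchical clustering model (stats engine assumed as the default)
hc_fit <- hier_clust(num_clusters = 3, linkage_method = "average") |>
  fit(~ ., data = mtcars)

# Assignments derived from cutting the dendrogram
assigned <- extract_cluster_assignment(hc_fit)$.cluster

# Assignments derived from the distance-based prediction heuristic
predicted <- predict(hc_fit, new_data = mtcars)$.pred_cluster

# Off-diagonal counts, if any, are observations where the two disagree
table(assigned, predicted)
```

Any disagreement in the cross-tabulation is expected behavior rather than a bug, per the entry above.
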
- The `db_clust()` clustering specification has been added. This specification allows for the use of the DBSCAN algorithm using the dbscan engine. (#209)
- The `gm_clust()` clustering specification has been added. This specification allows for the fitting of Gaussian mixture models using the mclust engine. (#209)
- The `mean_shift()` clustering specification has been added. This specification fits clusters by iteratively shifting observations toward regions of high density, with the number of clusters determined automatically. The LPCM engine is used. (#240)
- `mean_shift()` gains a new engine with `meanShiftR`. (#244)
- The `.config` column produced by `tune_cluster()` has changed from the `Preprocessor{num}_Model{num}` pattern to `pre{num}_mod{num}_post{num}` to align with updates in the tune package. (#220)
- The `foreach` package is no longer supported for parallel processing in `tune_cluster()`. Use the `future` or `mirai` packages instead. See `?tune::parallelism` for details. (#220)
- `tune_cluster()` now supports parallel processing via the `mirai` package in addition to `future`. (#220)
- The `.notes` column returned by `tune_cluster()` now includes a `trace` column containing backtraces for errors and warnings, making it easier to debug failures. (#220)
- Fixed bug when trying to tune the `linkage_method` argument. (#206, @lgaborini)
- `sse_within_total()` now correctly applies a custom `dist_fun` when `new_data` is `NULL` by using training data stored in the model. (#184)
- `silhouette_avg()` now has `direction = "maximize"` instead of `direction = "zero"`, so that `show_best()` and `select_best()` correctly return models with the highest silhouette values. (#212, @dnldelarosa)
- The philentropy package is now used to calculate distances rather than Rfast. (#199)
- Update to fix revdep issue for clustMixType. (#190)
- Update to fix revdep issue for ClusterR. (#186)
- Small change to let tune package have easy CRAN release. (#178)
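
The tuning-related entries above (`tune_cluster()`, `silhouette_avg()`, `show_best()`) can be sketched together as follows. This is a minimal example under stated assumptions: `mtcars`, the three-fold resampling, and the grid of 2 to 4 clusters are illustrative choices, and the exact `.config` labels in the output depend on the installed tune version.

```r
library(tidyclust)
library(tune)

set.seed(123)

# Illustrative resamples, preprocessing, and model (assumed, not from the changelog)
folds <- rsample::vfold_cv(mtcars, v = 3)
rec <- recipes::recipe(~ ., data = mtcars)
spec <- k_means(num_clusters = tune())
wf <- workflows::workflow(rec, spec)

# Optional: enable parallel processing first, e.g.
# future::plan(future::multisession, workers = 2)

res <- tune_cluster(
  wf,
  resamples = folds,
  grid = tibble::tibble(num_clusters = 2:4),
  metrics = cluster_metric_set(silhouette_avg)
)

# With direction = "maximize", the highest average silhouette should rank first
best <- show_best(res, metric = "silhouette_avg")
best
```
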
- The clustMixType engine has been added to `k_means()`. This engine allows fitting of k-prototype models. (#63)
- The klaR engine has been added to `k_means()`. This engine allows fitting of k-modes models. (#63)
- Engine-specific documentation has been added for all models and engines. (#159)
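
For orientation, selecting the two engines mentioned above might look like the sketch below. It only builds the specifications; fitting them additionally requires the clustMixType or klaR package to be installed and data of a suitable type (mixed or categorical), which is not shown here.

```r
library(tidyclust)

# k-prototypes via the clustMixType engine (for mixed numeric/categorical data)
kproto_spec <- k_means(num_clusters = 3) |>
  set_engine("clustMixType")

# k-modes via the klaR engine (for categorical data)
kmodes_spec <- k_means(num_clusters = 3) |>
  set_engine("klaR")

kproto_spec
kmodes_spec
```
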
- Fixed bug where engine-specific arguments were passed along for `k_means()` when using the ClusterR engine. (#142)
- Fixed bug where the `prefix` argument wouldn't be correctly passed through `extract_cluster_assignment()`, `extract_centroids()`, and `predict()`. (#145)
- Metric functions now error informatively if used with unfit cluster specifications. (#146)
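
A short sketch of the `prefix` argument referred to above, assuming a k-means fit on `mtcars` with the default stats engine. The prefix `"C_"` is an arbitrary illustrative value.

```r
library(tidyclust)

set.seed(123)
km_fit <- k_means(num_clusters = 3) |>
  fit(~ ., data = mtcars)

# The same prefix is passed to all three functions
assignments <- extract_cluster_assignment(km_fit, prefix = "C_")
centroids <- extract_centroids(km_fit, prefix = "C_")
preds <- predict(km_fit, new_data = mtcars, prefix = "C_")

# Inspect the resulting cluster labels
levels(assignments$.cluster)
levels(preds$.pred_cluster)
```
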
- Fixed a bug that caused incorrect cluster ordering in `extract_fit_summary()`. (#136)
- Using `extract_cluster_assignment()`, `extract_centroids()`, and `predict()` on a fitted `hier_clust()` model without specifying `num_clusters` or `cut_height` now gives a more informative error message. (#147)
- `k_means()` now errors informatively if `fit()` is called without `num_clusters` specified. (#134)
- Fixed bug where factor levels didn't match the number of clusters when predicting on a smaller number of observations. (#158)
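
To see the error for a missing number of clusters described above, one can capture the condition message as in this sketch. It assumes `fit()` signals an ordinary R error in this situation; the exact wording of the message is version-dependent and is not reproduced here.

```r
library(tidyclust)

# A k-means specification with no number of clusters set
spec <- k_means()

# Capture the error message instead of stopping the script
msg <- tryCatch(
  {
    fit(spec, ~ ., data = mtcars)
    NA_character_ # reached only if no error is raised
  },
  error = function(e) conditionMessage(e)
)

msg
```
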
- Fixed bug where `tune_cluster()` would error if used with a recipe that contained non-predictor variables such as id variables. (#124)
- Exported internal functions `ClusterR_kmeans_fit()`, `stats_kmeans_fit()`, and `hclust_fit()` have been renamed to `.k_means_fit_ClusterR()`, `.k_means_fit_stats()`, and `.hier_clust_fit_stats()` to reduce visibility for users.
- Cluster reordering is now done at the fitting time, not the extraction and prediction time. (#154)
- The cluster specification methods for `generics::tune_args()` and `generics::tunable()` are now registered unconditionally (#115).
- Fixed bug where `extract_cluster_assignment()` and `predict()` sometimes didn't have agreement of clusters. (#94)
- `silhouette()` and `silhouette_avg()` now return NAs instead of erroring when applied to a clustering object with 1 cluster. (#104)
- Fixed bug where `extract_cluster_assignment()` didn't work for `hier_clust()` models in workflows where `num_clusters` is specified in `extract_cluster_assignment()`.
- Added a `NEWS.md` file to track changes to the package.
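
The single-cluster silhouette behavior noted above can be checked with a sketch like the following. It assumes a k-means fit with one cluster on `mtcars` and passes the data explicitly through `new_data`.

```r
library(tidyclust)

# A degenerate clustering with a single cluster
one_fit <- k_means(num_clusters = 1) |>
  fit(~ ., data = mtcars)

# Per the entry above, this should return NA rather than raise an error
sil <- silhouette_avg(one_fit, new_data = mtcars)
sil
```
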