Optimal, Fast, and Reproducible Univariate Clustering

Fast, optimal, and reproducible weighted univariate clustering by dynamic programming. Four types of problems are solved with guaranteed optimality and reproducibility: univariate k-means, k-median, k-segments, and multi-channel weighted k-means. The core algorithm minimizes the sum of (weighted) within-cluster distances under the metric of each problem. Its advantage over heuristic clustering in efficiency and accuracy is most pronounced when the number of clusters k is large. Weighted k-means can also process time series to perform peak calling. Multi-channel weighted k-means groups multiple univariate signals into k clusters. An auxiliary function generates histograms that adapt to patterns in the data. This package provides a powerful set of tools for univariate data analysis with guaranteed optimality, efficiency, and reproducibility.
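
The idea of optimal univariate k-means by dynamic programming can be illustrated with a minimal Python sketch. It is not the package's implementation or API: the function name kmeans_1d_dp is hypothetical, the data are unweighted, and the recurrence is the plain quadratic-time version, assuming only that an optimal clustering of sorted univariate data consists of contiguous runs.

```python
from typing import List, Tuple


def kmeans_1d_dp(values: List[float], k: int) -> Tuple[float, List[int]]:
    """Illustrative sketch (hypothetical name, not the package API).

    Returns the minimal within-cluster sum of squares and the cluster
    sizes, listed in order of increasing value.
    """
    xs = sorted(values)
    n = len(xs)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and len(values)")

    # Prefix sums of x and x^2 give the cost of any contiguous run in O(1).
    s1 = [0.0] * (n + 1)
    s2 = [0.0] * (n + 1)
    for i, x in enumerate(xs):
        s1[i + 1] = s1[i] + x
        s2[i + 1] = s2[i] + x * x

    def sse(i: int, j: int) -> float:
        # Sum of squared deviations from the mean for xs[i:j], with j > i.
        total = s1[j] - s1[i]
        return max(0.0, (s2[j] - s2[i]) - total * total / (j - i))

    inf = float("inf")
    # cost[q][j]: best cost of splitting the first j points into q clusters.
    cost = [[inf] * (n + 1) for _ in range(k + 1)]
    split = [[0] * (n + 1) for _ in range(k + 1)]
    cost[0][0] = 0.0
    for q in range(1, k + 1):
        for j in range(q, n + 1):
            # The last cluster is xs[i:j]; the first i points form q - 1 clusters.
            for i in range(q - 1, j):
                c = cost[q - 1][i] + sse(i, j)
                if c < cost[q][j]:
                    cost[q][j] = c
                    split[q][j] = i

    # Walk the stored split points backwards to recover the cluster sizes.
    sizes: List[int] = []
    j = n
    for q in range(k, 0, -1):
        i = split[q][j]
        sizes.append(j - i)
        j = i
    sizes.reverse()
    return cost[k][n], sizes
```

Because every candidate split point is examined, the result is the same on every run and does not depend on a random initialization, which is the sense in which such a method is both optimal and reproducible. This sketch takes time proportional to k times n squared.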


Available Snapshots

This version of Ckmeans.1d.dp can be found in the following snapshots:


Imports/Depends/LinkingTo/Enhances (3)
  • Rcpp
  • Rdpack >= 0.6-1
  • Rcpp

Suggests (4)
  • testthat
  • knitr
  • rmarkdown
  • RColorBrewer

Version History