
A wrapper function that encapsulates various algorithms for detecting changepoint sets in univariate time series.

Usage

segment(x, method = "null", ...)

# S3 method for class 'tbl_ts'
segment(x, method = "null", ...)

# S3 method for class 'xts'
segment(x, method = "null", ...)

# S3 method for class 'numeric'
segment(x, method = "null", ...)

# S3 method for class 'ts'
segment(x, method = "null", ...)

Arguments

x

a numeric vector, or another object coercible into a stats::ts object (methods exist for tbl_ts, xts, numeric, and ts)

method

a character string indicating the algorithm to use. See Details.

...

arguments passed to methods

Value

An object of class tidycpt.

Details

Currently, segment() can use the following algorithms, depending on the value of the method argument:

  • pelt: Uses the PELT algorithm as implemented in segment_pelt(), which wraps either changepoint::cpt.mean() or changepoint::cpt.meanvar(). The segmenter is of class cpt.

  • binseg: Uses the Binary Segmentation algorithm as implemented by changepoint::cpt.meanvar(). The segmenter is of class cpt.

  • segneigh: Uses the Segmented Neighborhood algorithm as implemented by changepoint::cpt.meanvar(). The segmenter is of class cpt.

  • single-best: Uses the AMOC (At Most One Change) criterion as implemented by changepoint::cpt.meanvar(). The segmenter is of class cpt.

  • wbs: Uses the Wild Binary Segmentation algorithm as implemented by wbs::wbs(). The segmenter is of class wbs.

  • ga: Uses the genetic algorithm implemented by segment_ga(), which wraps GA::ga(). The segmenter is of class tidyga.

  • ga-shi: Uses the genetic algorithm implemented by segment_ga_shi(), which wraps segment_ga(). The segmenter is of class tidyga.

  • ga-coen: Uses Coen's heuristic as implemented by segment_ga_coen(). The segmenter is of class tidyga. This implementation supersedes the coen method listed next.

  • coen: Uses Coen's heuristic as implemented by segment_coen(). The segmenter is of class seg_basket. Note that this function is deprecated.

  • random: Uses a random basket of changepoints as implemented by segment_ga_random(). The segmenter is of class tidyga.

  • manual: Uses the vector of changepoints in the tau argument. The segmenter is of class seg_cpt.

  • null: The default. Uses no changepoints. The segmenter is of class seg_cpt.
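
The list above describes a single entry point that picks an algorithm from a string. The base R sketch below illustrates that dispatch pattern only; it is not the tidychangepoint implementation. The function segment_sketch() and the "one-split" method are hypothetical names introduced here, and "one-split" is a plain least-squares search for one mean shift, standing in for a data-driven algorithm (unlike AMOC, it applies no penalty and always reports one changepoint).

```r
# Toy dispatcher written for illustration; NOT how segment() is implemented.
segment_sketch <- function(x, method = c("null", "manual", "one-split"), tau = NULL) {
  # match.arg() picks the first choice ("null") by default and
  # raises an error for any unsupported method name.
  method <- match.arg(method)
  n <- length(x)
  switch(method,
    # No changepoints at all
    "null" = integer(0),
    # Use the caller-supplied changepoints, sorted
    "manual" = as.integer(sort(as.numeric(tau))),
    # Hypothetical stand-in for a data-driven algorithm: choose the single
    # split that minimises the within-region sum of squared errors.
    "one-split" = {
      candidates <- 2:n  # k is the first index of the second region
      sse <- vapply(candidates, function(k) {
        left <- x[seq_len(k - 1)]
        right <- x[k:n]
        sum((left - mean(left))^2) + sum((right - mean(right))^2)
      }, numeric(1))
      candidates[which.min(sse)]
    }
  )
}

x <- c(rep(0, 5), rep(10, 5))
segment_sketch(x)                            # default "null" method
segment_sketch(x, "manual", tau = c(8, 3))   # caller-supplied changepoints
segment_sketch(x, "one-split")               # data-driven single split
```

Validating the method string up front means a misspelled method fails immediately instead of silently falling back to the default.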

Examples

# Segment a time series using PELT
segment(DataCPSim, method = "pelt")
#> A tidycpt object
#> Class 'cpt' : Changepoint Object
#>        ~~   : S4 class containing 12 slots with names
#>               cpttype date version data.set method test.stat pen.type pen.value minseglen cpts ncpts.max param.est 
#> 
#> Created on  : Mon Jan 20 19:10:13 2025 
#> 
#> summary(.)  :
#> ----------
#> Created Using changepoint version 2.3 
#> Changepoint type      : Change in mean and variance 
#> Method of analysis    : PELT 
#> Test Statistic  : Normal 
#> Type of penalty       : MBIC with value, 27.99769 
#> Minimum Segment Length : 2 
#> Maximum no. of cpts   : Inf 
#> Changepoint Locations : 547 822 972 
#> List of 6
#>  $ data         : Time-Series [1:1096] from 1 to 1096: 35.5 29 35.6 33 29.5 ...
#>  $ tau          : int [1:3] 547 822 972
#>  $ region_params: tibble [4 × 3] (S3: tbl_df/tbl/data.frame)
#>   ..$ region           : chr [1:4] "[1,547)" "[547,822)" "[822,972)" "[972,1.1e+03)"
#>   ..$ param_mu         : num [1:4] 35.3 58.1 96.7 155.9
#>   ..$ param_sigma_hatsq: Named num [1:4] 127 372 924 2442
#>   .. ..- attr(*, "names")= chr [1:4] "[1,547)" "[547,822)" "[822,972)" "[972,1.1e+03)"
#>  $ model_params : NULL
#>  $ fitted_values: num [1:1096] 35.3 35.3 35.3 35.3 35.3 ...
#>  $ model_name   : chr "meanvar"
#>  - attr(*, "class")= chr "mod_cpt"

# Segment a time series using PELT and the BIC penalty
segment(DataCPSim, method = "pelt", penalty = "BIC")
#> A tidycpt object
#> Class 'cpt' : Changepoint Object
#>        ~~   : S4 class containing 12 slots with names
#>               cpttype date version data.set method test.stat pen.type pen.value minseglen cpts ncpts.max param.est 
#> 
#> Created on  : Mon Jan 20 19:10:13 2025 
#> 
#> summary(.)  :
#> ----------
#> Created Using changepoint version 2.3 
#> Changepoint type      : Change in mean and variance 
#> Method of analysis    : PELT 
#> Test Statistic  : Normal 
#> Type of penalty       : BIC with value, 20.99827 
#> Minimum Segment Length : 2 
#> Maximum no. of cpts   : Inf 
#> Changepoint Locations : 547 822 972 
#> List of 6
#>  $ data         : Time-Series [1:1096] from 1 to 1096: 35.5 29 35.6 33 29.5 ...
#>  $ tau          : int [1:3] 547 822 972
#>  $ region_params: tibble [4 × 3] (S3: tbl_df/tbl/data.frame)
#>   ..$ region           : chr [1:4] "[1,547)" "[547,822)" "[822,972)" "[972,1.1e+03)"
#>   ..$ param_mu         : num [1:4] 35.3 58.1 96.7 155.9
#>   ..$ param_sigma_hatsq: Named num [1:4] 127 372 924 2442
#>   .. ..- attr(*, "names")= chr [1:4] "[1,547)" "[547,822)" "[822,972)" "[972,1.1e+03)"
#>  $ model_params : NULL
#>  $ fitted_values: num [1:1096] 35.3 35.3 35.3 35.3 35.3 ...
#>  $ model_name   : chr "meanvar"
#>  - attr(*, "class")= chr "mod_cpt"

# Segment a time series using Binary Segmentation
segment(DataCPSim, method = "binseg", penalty = "BIC")
#> A tidycpt object
#> Class 'cpt' : Changepoint Object
#>        ~~   : S4 class containing 14 slots with names
#>               cpts.full pen.value.full data.set cpttype method test.stat pen.type pen.value minseglen cpts ncpts.max param.est date version 
#> 
#> Created on  : Mon Jan 20 19:10:13 2025 
#> 
#> summary(.)  :
#> ----------
#> Created Using changepoint version 2.3 
#> Changepoint type      : Change in mean and variance 
#> Method of analysis    : BinSeg 
#> Test Statistic  : Normal 
#> Type of penalty       : BIC with value, 20.99827 
#> Minimum Segment Length : 2 
#> Maximum no. of cpts   : 5 
#> Changepoint Locations : 547 809 972 
#> Range of segmentations:
#>      [,1] [,2] [,3] [,4] [,5]
#> [1,]  809   NA   NA   NA   NA
#> [2,]  809  547   NA   NA   NA
#> [3,]  809  547  972   NA   NA
#> [4,]  809  547  972  822   NA
#> [5,]  809  547  972  822  813
#> 
#>  For penalty values: 1485.679 462.0479 160.3649 15.04514 15.04514 
#> List of 6
#>  $ data         : Time-Series [1:1096] from 1 to 1096: 35.5 29 35.6 33 29.5 ...
#>  $ tau          : int [1:3] 547 809 972
#>  $ region_params: tibble [4 × 3] (S3: tbl_df/tbl/data.frame)
#>   ..$ region           : chr [1:4] "[1,547)" "[547,809)" "[809,972)" "[972,1.1e+03)"
#>   ..$ param_mu         : num [1:4] 35.3 57.9 94 155.9
#>   ..$ param_sigma_hatsq: Named num [1:4] 127 341 1015 2442
#>   .. ..- attr(*, "names")= chr [1:4] "[1,547)" "[547,809)" "[809,972)" "[972,1.1e+03)"
#>  $ model_params : NULL
#>  $ fitted_values: num [1:1096] 35.3 35.3 35.3 35.3 35.3 ...
#>  $ model_name   : chr "meanvar"
#>  - attr(*, "class")= chr "mod_cpt"

# Segment a time series using a random changepoint set
segment(DataCPSim, method = "random")
#> Seeding initial population with probability: 0.0063863343681642
#> A tidycpt object
#> An object of class "ga"
#> 
#> Call:
#> GA::ga(type = "binary", fitness = obj_fun, nBits = n, population = ..1,     maxiter = 1)
#> 
#> Available slots:
#>  [1] "data"          "model_fn_args" "call"          "type"         
#>  [5] "lower"         "upper"         "nBits"         "names"        
#>  [9] "popSize"       "iter"          "run"           "maxiter"      
#> [13] "suggestions"   "population"    "elitism"       "pcrossover"   
#> [17] "pmutation"     "optim"         "fitness"       "summary"      
#> [21] "bestSol"       "fitnessValue"  "solution"     
#> List of 6
#>  $ data         : Time-Series [1:1096] from 1 to 1096: 35.5 29 35.6 33 29.5 ...
#>  $ tau          : int [1:7] 31 104 366 550 837 872 930
#>  $ region_params: tibble [8 × 2] (S3: tbl_df/tbl/data.frame)
#>   ..$ region  : chr [1:8] "[1,31)" "[31,104)" "[104,366)" "[366,550)" ...
#>   ..$ param_mu: num [1:8] 35.9 37.5 35.2 34.7 59.9 ...
#>  $ model_params : Named num 644
#>   ..- attr(*, "names")= chr "sigma_hatsq"
#>  $ fitted_values: num [1:1096] 35.9 35.9 35.9 35.9 35.9 ...
#>  $ model_name   : chr "meanshift_norm"
#>  - attr(*, "class")= chr "mod_cpt"

# Segment a time series using a manually-specified changepoint set
segment(DataCPSim, method = "manual", tau = c(826))
#> A tidycpt object
#> List of 8
#>  $ data        : Time-Series [1:1096] from 1 to 1096: 35.5 29 35.6 33 29.5 ...
#>  $ pkg         : chr "tidychangepoint"
#>  $ algorithm   : chr "manual"
#>  $ changepoints: num 826
#>  $ fitness     : Named num 10571
#>   ..- attr(*, "names")= chr "BIC"
#>  $ seg_params  : list()
#>  $ model_name  : chr "meanshift_norm"
#>  $ penalty     : chr "BIC"
#>  - attr(*, "class")= chr "seg_cpt"
#> List of 6
#>  $ data         : Time-Series [1:1096] from 1 to 1096: 35.5 29 35.6 33 29.5 ...
#>  $ tau          : int 826
#>  $ region_params: tibble [2 × 2] (S3: tbl_df/tbl/data.frame)
#>   ..$ region  : chr [1:2] "[1,826)" "[826,1.1e+03)"
#>   ..$ param_mu: num [1:2] 43.2 123.8
#>  $ model_params : Named num 882
#>   ..- attr(*, "names")= chr "sigma_hatsq"
#>  $ fitted_values: num [1:1096] 43.2 43.2 43.2 43.2 43.2 ...
#>  $ model_name   : chr "meanshift_norm"
#>  - attr(*, "class")= chr "mod_cpt"

# Segment a time series using a null changepoint set
segment(DataCPSim)
#> A tidycpt object
#> List of 8
#>  $ data        : Time-Series [1:1096] from 1 to 1096: 35.5 29 35.6 33 29.5 ...
#>  $ pkg         : chr "tidychangepoint"
#>  $ algorithm   : chr "manual"
#>  $ changepoints: NULL
#>  $ fitness     : Named num 11503
#>   ..- attr(*, "names")= chr "BIC"
#>  $ seg_params  : list()
#>  $ model_name  : chr "meanshift_norm"
#>  $ penalty     : chr "BIC"
#>  - attr(*, "class")= chr "seg_cpt"
#> List of 6
#>  $ data         : Time-Series [1:1096] from 1 to 1096: 35.5 29 35.6 33 29.5 ...
#>  $ tau          : int(0) 
#>  $ region_params: tibble [1 × 2] (S3: tbl_df/tbl/data.frame)
#>   ..$ region  : chr "[1,1.1e+03)"
#>   ..$ param_mu: num 63.2
#>  $ model_params : Named num 2089
#>   ..- attr(*, "names")= chr "sigma_hatsq"
#>  $ fitted_values: num [1:1096] 63.2 63.2 63.2 63.2 63.2 ...
#>  $ model_name   : chr "meanshift_norm"
#>  - attr(*, "class")= chr "mod_cpt"
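
The printed models above list a region label, a per-region mean (param_mu), and fitted_values for each changepoint set tau. The base R sketch below shows one way such quantities could be derived from a changepoint vector for a mean-shift model. The helper region_means() is hypothetical and not part of tidychangepoint. It assumes half-open regions in which each changepoint is the first index of a new region and the last region ends at n + 1.

```r
# Hypothetical helper, written for illustration only.
region_means <- function(x, tau) {
  n <- length(x)
  # Region boundaries: 1, the sorted changepoints, and n + 1 (assumed upper end)
  breaks <- c(1, sort(as.numeric(tau)), n + 1)
  # findInterval() puts index i in region k when breaks[k] <= i < breaks[k + 1]
  region_id <- findInterval(seq_len(n), breaks)
  # Mean of the observations in each region
  mu <- as.numeric(tapply(x, region_id, mean))
  list(
    region = paste0("[", breaks[-length(breaks)], ",", breaks[-1], ")"),
    param_mu = mu,
    # Every observation is fitted by the mean of its own region
    fitted_values = mu[region_id]
  )
}

x <- c(1, 1, 1, 5, 5, 5, 5, 9, 9, 9)
fit <- region_means(x, tau = c(4, 8))
fit$region
fit$param_mu
fit$fitted_values
```

With an empty tau the helper returns a single region covering the whole series, which mirrors the null changepoint set in the last example.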