A tabu-search algorithm to solve the max-p-region problem
Source: R/spatial_cluster.R
st_maxp_tabu.Rd
A wrapper function for rgeoda::maxp_tabu().
The max-p-region problem is a special case
of constrained clustering in which a finite number of geographical areas are aggregated into
the maximum number of regions (max-p-regions), such that each region is geographically
connected and internal homogeneity within the regions is maximized.
Arguments
- sfj
An sf (simple feature) object.
- varcol
The names of the variables used for clustering, given as a character vector.
- wt
(optional) The spatial weights object, which can be constructed with st_weights(). By default it is constructed by st_weights(sfj,'contiguity').
- boundvar
A numeric vector giving the selected bounding variable.
- min_bound
A minimum value that the sum of the bounding variable in each cluster should exceed.
- tabu_length
(optional) The length of the tabu list used by the tabu search heuristic. Defaults to 10.
- conv_tabu
(optional) The number of non-improving moves. Defaults to 10.
- iterations
(optional) The number of iterations of the greedy algorithm. Defaults to 99.
- initial_regions
(optional) The initial regions that the local search starts with. Defaults to empty, which means the local search starts with a random process to "grow" clusters.
- scale_method
(optional) One of the scaling methods 'raw', 'standardize', 'demean', 'mad', 'range_standardize', 'range_adjust' to apply to the input data. Defaults to 'standardize' (Z-score normalization).
- distance_method
(optional) The distance method used to compute the distance between observations i and j. Options are "euclidean" and "manhattan". Defaults to "euclidean".
- seed
(optional) The seed for random number generator. Defaults to 123456789.
- cpu_threads
(optional) The number of CPU threads used for parallel computation. Defaults to 6.
- rdist
(optional) The distance matrix (lower triangular matrix, column wise storage).
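Because this function is described as a wrapper for rgeoda::maxp_tabu(), the arguments above can be read against a direct rgeoda call. The following is a minimal sketch, not the wrapper's actual source: it assumes that the columns named in varcol and boundvar are passed through as subsets of the sf object, that seed corresponds to rgeoda's random_seed, and that a queen contiguity matrix from rgeoda::queen_weights() can stand in for the default st_weights(sfj,'contiguity'). The last lines check that each region's sum of the bounding variable reaches min_bound.

```r
library(sf)
library(rgeoda)

guerry <- read_sf(system.file("extdata", "Guerry.shp", package = "rgeoda"))

# Assumption: queen contiguity stands in for the wrapper's default weights
w <- queen_weights(guerry)

vars <- c("Crm_prs", "Crm_prp", "Litercy", "Donatns", "Infants", "Suicids")
min_bound <- 3236.67

res <- maxp_tabu(
  w,
  guerry[vars],        # clustering variables (varcol in the wrapper)
  guerry["Pop1831"],   # bounding variable (boundvar in the wrapper)
  min_bound,
  tabu_length = 10,
  conv_tabu = 10,
  iterations = 99,
  scale_method = "standardize",
  distance_method = "euclidean",
  random_seed = 123456789
)

# Sum of the bounding variable within each region, compared with min_bound
region_pop <- tapply(guerry$Pop1831, res$Clusters, sum)
all(region_pop >= min_bound)
```

Checking the per-region sums this way is a quick sanity test of the bound constraint after changing min_bound or the weights.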
Value
A named list with names "Clusters", "Total sum of squares", "Within-cluster sum of squares", "Total within-cluster sum of squares", and "The ratio of between to total sum of squares".
Author
Wenbo Lv lyu.geosocial@gmail.com
Examples
library(sf)
guerry = read_sf(system.file("extdata", "Guerry.shp", package = "rgeoda"))
guerry_clusters = st_maxp_tabu(guerry,c('Crm_prs','Crm_prp','Litercy','Donatns',
'Infants','Suicids'),boundvar = 'Pop1831',min_bound = 3236.67)
guerry_clusters
#> $Clusters
#> [1] 2 5 1 3 3 1 5 1 4 1 1 3 6 1 2 7 2 7 5 8 2 7 4 3 6 6 8 3 1 1 7 3 8 2 7 3 5 7
#> [39] 2 2 1 8 6 1 7 3 2 2 4 5 8 4 4 8 4 2 5 5 2 6 1 1 1 1 4 4 3 4 2 2 6 6 5 6 2 6
#> [77] 1 1 3 3 8 7 7 4 5
#>
#> $`Total sum of squares`
#> [1] 504
#>
#> $`Within-cluster sum of squares`
#> [1] 37.71292 53.00122 24.16448 61.51004 53.23544 33.45476
#>
#> $`Total within-cluster sum of squares`
#> [1] 240.9211
#>
#> $`The ratio of between to total sum of squares`
#> [1] 0.4780181
#>
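The figures in the printed summary can be checked by hand. The sketch below copies the numbers from the output above into plain R and makes no assumption about the package itself. Two things follow from the arithmetic alone: the total sum of squares equals 6 × (85 − 1) = 504, which is what six z-scored variables over 85 areas would give, and the value printed under "Total within-cluster sum of squares" equals the total minus the sum of the six listed components, with the printed ratio being that value divided by the total.

```r
# Values copied from the printed output above
total_ss   <- 504
components <- c(37.71292, 53.00122, 24.16448, 61.51004, 53.23544, 33.45476)
reported   <- 240.9211
ratio      <- 0.4780181

n_areas <- 85  # length of the Clusters vector in the output
n_vars  <- 6   # number of clustering variables in the example

# Six standardized variables each contribute (n - 1) to the total
n_vars * (n_areas - 1)                     # 504

# Total minus the listed components reproduces the reported value
remainder <- total_ss - sum(components)
remainder                                  # about 240.9211

# Dividing by the total reproduces the reported ratio
remainder / total_ss                       # about 0.4780181
```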