This function identifies transitive clusters (i.e. connected components) as well as the number of members in each cluster, and adds this information to the linelist data.

get_clusters(
  x,
  output = c("epicontacts", "data.frame"),
  member_col = "cluster_member",
  size_col = "cluster_size",
  override = FALSE
)

Arguments

x

An epicontacts object.

output

A character string indicating the type of output: either an epicontacts object (default) or a data.frame containing the cluster membership of each member of the epicontacts linelist.

member_col

Name of the linelist column to which cluster membership is assigned. Default name is 'cluster_member'.

size_col

Name of the linelist column to which cluster sizes are assigned. Default name is 'cluster_size'.

override

Logical value indicating whether cluster member and size columns should be overwritten if they already exist in the linelist. Default is 'FALSE'.

Value

An epicontacts object whose 'linelist' dataframe contains new columns corresponding to cluster membership and size, or a data.frame containing member ids, cluster memberships as factors, and associated cluster sizes. All ids that were originally in the 'contacts' dataframe but not in the linelist will also be added to the linelist.

Author

Nistara Randhawa (nrandhawa@ucdavis.edu)

Examples

if (require(outbreaks)) {
  ## build data
  x <- make_epicontacts(ebola_sim$linelist, ebola_sim$contacts,
                        id = "case_id", to = "case_id", from = "infector",
                        directed = TRUE)

  ## add cluster membership and sizes to epicontacts 'linelist'
  y <- get_clusters(x, output = "epicontacts")
  y

  ## return a data.frame with linelist member ids and cluster memberships as
  ## factors
  z <- get_clusters(x, output = "data.frame")
  head(z)
}
#>   cluster_member     id cluster_size
#> 1              1 d1fafd            2
#> 2              1 53371b            2
#> 3              2 f5c3d8            6
#> 4              2 900021            6
#> 5              2 0f58c4            6
#> 6              2 d58402            6
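
To illustrate what cluster membership and cluster size mean, the following base R sketch labels connected components on a small toy contact list. It is not the package's implementation: the function name find_components, the toy ids, and the choice to treat contacts as undirected are all assumptions made for illustration. As described under Value, ids that appear only in the contacts are added before clusters are assigned.

```r
## Hypothetical toy data (not taken from ebola_sim)
linelist_ids <- c("a", "b", "d", "g")
contacts <- data.frame(
  from = c("a", "b", "d"),
  to   = c("b", "c", "e"),
  stringsAsFactors = FALSE
)

## Illustrative helper (assumption): label connected components,
## ignoring the direction of each contact
find_components <- function(linelist_ids, from, to) {
  ## include ids that occur only in the contacts
  ids <- union(linelist_ids, c(from, to))

  ## start with every id in its own cluster
  label <- setNames(seq_along(ids), ids)

  ## for each contact, merge the two clusters it links by
  ## relabelling the whole cluster with the larger label
  for (i in seq_along(from)) {
    a <- label[[from[i]]]
    b <- label[[to[i]]]
    if (a != b) {
      label[label == max(a, b)] <- min(a, b)
    }
  }

  ## renumber clusters as 1..k and count the members of each
  member <- match(label, unique(label))
  size <- tabulate(member)[member]

  data.frame(
    id = ids,
    cluster_member = factor(member),
    cluster_size = size,
    stringsAsFactors = FALSE
  )
}

res <- find_components(linelist_ids, contacts$from, contacts$to)
res
```

In this toy example, "a", "b" and "c" end up in one cluster of size 3, "d" and "e" in a cluster of size 2, and "g", which has no contacts, forms a cluster of size 1.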