---
title: "Benchmarking"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Benchmarking}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
readRDS("benchmark_res.rds") -> res
knitr::opts_chunk$set(
  collapse = TRUE,
  eval = FALSE,
  comment = "#>"
)
```

The `akc` package classifies keywords automatically through community detection, and it can use different algorithms for the clustering. This vignette benchmarks those algorithms on networks of several sizes.

First, we'll load the needed packages.

```{r setup}
library(akc)
library(dplyr)
library(stringr)  # provides str_extract(), used below
```

Then, we prepare the needed data. The built-in data table `bibli_data_table` is used here.

```{r}
bibli_data_table %>%
  keyword_clean() %>%
  keyword_merge() -> clean_data
```

Next, we define the combinations to test: network sizes from the top 100 to the top 300 keywords, and every `group_*` community detection function exported by `akc` except `group_biconnected_component`, `group_components` and `group_optimal`.

```{r}
# network sizes: keep the top 100, 101, ..., 300 keywords
100:300 -> topn_sample

# names of the community detection functions to compare
ls("package:akc") %>%
  str_extract("^group.+") %>%
  na.omit() %>%
  setdiff(c("group_biconnected_component",
            "group_components",
            "group_optimal")) -> com_detect_fun_list
```

Finally, we run the benchmark. For every algorithm and network size, we time `keyword_group()` and record the number of nodes, the number of groups, and the mean and standard deviation of the group sizes.

```{r,eval=FALSE}
all = tibble()

for(i in com_detect_fun_list){
  for(j in topn_sample){
    # time one run of keyword_group() with algorithm i on the top j keywords
    system.time({
      clean_data %>%
        keyword_group(top = j, com_detect_fun = get(i)) %>%
        as_tibble -> grouped_network_table
    }) %>% na.omit -> time_info

    # size of the network and of the detected groups
    grouped_network_table %>% nrow -> node_no
    grouped_network_table %>% distinct(group) %>% nrow -> group_no
    grouped_network_table %>% count(group) %>%
      summarise(mean(n)) %>% .[[1]] -> group_avg_node_no
    grouped_network_table %>% count(group) %>%
      summarise(sd(n)) %>% .[[1]] -> group_sd_node_no

    # append one row per (algorithm, network size) combination
    c(com_detect_fun = i, topn = j, node_no = node_no, group_no = group_no,
      avg = group_avg_node_no, sd = group_sd_node_no, time_info[1:3]) %>%
      bind_rows(all, .) -> all
  }
}

# keep one row per algorithm and node count; the only timing column kept
# is the elapsed time, in seconds
res = all %>%
  mutate_at(2:9, function(x) as.numeric(x) %>% round(2)) %>%
  distinct(com_detect_fun, node_no, .keep_all = TRUE) %>%
  select(-topn, -contains("self")) %>%
  setNames(c("com_detect_fun", "No. of total nodes", "No. of total groups",
             "Average node number in each group",
             "Standard deviation of node number",
             "Computer running time for keyword_group function"))
```

The results are displayed in the following table.

```{r,eval=TRUE}
knitr::kable(res)
```

The session information is displayed below:

```{r,eval=TRUE}
sessionInfo()
```
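
As a side note on the timing column produced by the benchmark code above, the following is a minimal base-R sketch of what `system.time()` returns and why removing the columns whose names contain "self" leaves only the elapsed time. The timed expression (a loop over `sqrt()`) is an arbitrary placeholder rather than part of the benchmark, and the object names `timing`, `first_three` and `kept` are introduced here only for illustration.

```{r}
# Time an arbitrary placeholder computation (not part of the benchmark).
timing <- system.time(for (k in 1:1e5) sqrt(k))

# The result is a numeric vector of class "proc_time" with five named
# entries: user.self, sys.self, elapsed, user.child, sys.child.
names(timing)

# The first three entries correspond to `time_info[1:3]` in the benchmark.
first_three <- timing[1:3]

# Dropping every entry whose name contains "self" mirrors
# `select(-contains("self"))` and leaves only the elapsed time.
kept <- first_three[!grepl("self", names(first_three))]
names(kept)
```

According to the base R documentation for `proc.time()`, the two child entries are reported as `NA` on Windows, which is presumably why the benchmark passes the timing through `na.omit` before taking the first three values.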