Introduction to `poem`
Siyuan Luo
Institute for Molecular Life Sciences, University of Zurich, Zurich, Switzerland; Department of Health Sciences and Technology, ETH Zurich, Zurich, Switzerland. roseluosy@gmail.com
Pierre-Luc Germain
Institute for Molecular Life Sciences, University of Zurich, Zurich, Switzerland; Department of Health Sciences and Technology, ETH Zurich, Zurich, Switzerland.
2024-12-02
Source: vignettes/poem.Rmd
Installation & loading
if (!requireNamespace("devtools", quietly = TRUE))
    install.packages("devtools")
devtools::install_github("RoseYuan/poem")
Introduction
What is this package for?
This package provides multiple approaches for comparing two partitions of the same dataset, and for evaluating the alignment between a dataset’s embedding or graph representation and its partition.
In addition, the package offers methods for comparing two fuzzy partitions, as well as for comparing a hard partition with a fuzzy one. This allows evaluating a fuzzy partitioning result by assessing its agreement with a fuzzy or hard ground-truth partition.
Finally, the package implements visualization and evaluation metrics tailored for domain detection in spatially-resolved -omics data. These are chiefly external evaluation metrics (i.e. based on a comparison to ground-truth labels), but also include internal metrics.
Main functions
The package `poem` includes many metrics for performing different kinds of evaluations, and these metrics can be retrieved via 6 main wrapper functions. Unless otherwise specified, “partition” means a hard partition. They are:

- `getEmbeddingMetrics()`: metrics to compare an embedding of data points to a partition of these data points.
- `getGraphMetrics()`: metrics to compare a graph (e.g. kNN/sNN) to a partition, where the nodes in the graph are the data points in the partition.
- `getPartitionMetrics()`: metrics to compare two partitions of the same dataset.
- `getFuzzyPartitionMetrics()`: metrics to compare two fuzzy partitions, or to compare a fuzzy and a hard partition of the same dataset.
- `getSpatialExternalMetrics()`: external metrics for evaluating spatial clustering results in a spatial-aware fashion. For non-spatial-aware evaluation, one can directly use `getPartitionMetrics()`.
- `getSpatialInternalMetrics()`: internal metrics for evaluating spatial clustering results in a spatial-aware fashion.
There are 3 different levels at which one can perform the above-mentioned evaluations: element-level, class-level, and dataset-level. Element-level evaluation reports metric values for each data point; class-level evaluation reports metrics for each class or cluster; and dataset-level evaluation returns a single metric value for the whole dataset.
The following table illustrates available metrics at different evaluation levels, and the main functions used to retrieve them.
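As a minimal sketch of the three levels (assuming `poem` and its bundled `toyExamples` data are installed), the `level` argument selects the granularity of the output:

```r
library(poem)

# load the bundled toy dataset and keep the first example embedding
data(toyExamples)
g1 <- toyExamples[toyExamples$graph=="graph1",]

# element-level: one value per data point
head(getEmbeddingMetrics(x=g1[,c("x","y")], labels=g1$class,
                         metrics=c("SW"), level="element"))

# class-level: one row per class
getEmbeddingMetrics(x=g1[,c("x","y")], labels=g1$class,
                    metrics=c("meanSW"), level="class")

# dataset-level: a single row for the whole dataset
getEmbeddingMetrics(x=g1[,c("x","y")], labels=g1$class,
                    metrics=c("meanSW"), level="dataset")
```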
Getting started
Example data
To showcase the main functions, we will use some simulated datasets as examples in this vignette.
The two datasets, `g1` and `g2`, both contain 80 data points with `x` and `y` coordinates, belonging to 4 different classes.
data(toyExamples)
g1 <- toyExamples[toyExamples$graph=="graph1",]
g2 <- toyExamples[toyExamples$graph=="graph2",]
head(g1)
## graph x y class
## 641 graph1 -0.6290416 -0.487293 class1
## 642 graph1 -2.5646982 -1.742079 class1
## 643 graph1 -1.6368716 -1.911560 class1
## 644 graph1 -1.3671374 -2.120897 class1
## 645 graph1 -1.5957317 -3.194329 class1
## 646 graph1 -2.1061245 -1.388003 class1
If we plot them out:
ggplot(rbind(g1,g2), aes(x,y,color=class, shape=class)) +
geom_point() +
facet_wrap(~graph) +
theme_bw()
Embedding evaluation
Let’s assume `g1` and `g2` contain two different embeddings of the same set of objects. A “good” embedding should place objects of the same class close together, and objects of different classes apart. Since we know the ground-truth class of each object, one can evaluate such “goodness” of an embedding by calculating embedding evaluation metrics. One can calculate such metrics element-wise, for each class/cluster, or for the whole dataset.
Element-level evaluation
For example, at the element level, one can calculate the Silhouette Width by specifying `level="element"` and `metrics=c("SW")`:
sw <- getEmbeddingMetrics(x=g1[,c("x","y")], labels=g1$class, metrics=c("SW"),
level="element")
head(sw)
## class SW
## 641 class1 0.2986628
## 642 class1 0.5818507
## 643 class1 0.6299871
## 644 class1 0.5867285
## 645 class1 0.5191290
## 646 class1 0.5679847
The output will be a `data.frame` containing the metric values for the specified level.
g1$sw <- getEmbeddingMetrics(x=g1[,c("x","y")], labels=g1$class,
metrics=c("SW"), level="element")$SW
g2$sw <- getEmbeddingMetrics(x=g2[,c("x","y")], labels=g2$class,
metrics=c("SW"), level="element")$SW
ggplot(rbind(g1,g2), aes(x, y, color=sw, shape=class)) +
geom_point() +
facet_wrap(~graph) +
theme_bw()
Class-level evaluation
One can also evaluate at the class level, by specifying `level="class"`. Check `?getEmbeddingMetrics` to see which metrics are allowed at the class level. For example:
cl <- getEmbeddingMetrics(x=g1[,c("x","y")], labels=g1$class,
metrics=c("dbcv", "meanSW"), level="class")
head(cl)
## class meanSW dbcv
## 1 class1 0.4240817 -0.37367780
## 2 class2 0.4897828 -0.34617982
## 3 class3 0.5021555 0.07752233
## 4 class4 0.5957709 0.26757842
res1 <- getEmbeddingMetrics(x=g1[,c("x","y")], labels=g1$class,
metrics=c("dbcv", "meanSW"), level="class")
res2 <- getEmbeddingMetrics(x=g2[,c("x","y")], labels=g2$class,
metrics=c("dbcv", "meanSW"), level="class")
bind_rows(list(graph1=res1, graph2=res2), .id="graph") %>%
pivot_longer(cols=c("meanSW","dbcv"), names_to="metric",values_to="value") %>%
ggplot(aes(class, value, fill=graph, group=graph)) +
geom_bar(position = "dodge", stat = "identity") +
facet_wrap(~metric) +
theme_bw()
Dataset-level evaluation
Similarly, one can evaluate at the dataset level by specifying `level="dataset"`. For example:
getEmbeddingMetrics(x=g1[,c("x","y")], labels=g1$class, level="dataset",
metrics=c("meanSW", "meanClassSW", "pnSW", "minClassSW",
"cdbw", "cohesion", "compactness", "sep", "dbcv"))
## meanSW meanClassSW pnSW minClassSW cdbw cohesion compactness
## 1 0.5029477 0.5029477 0.0375 0.4240817 0.0553208 0.2732925 0.2800803
## sep dbcv
## 1 0.7227335 -0.09368922
Graph evaluation
Instead of directly using the distances or densities in the embedding
space for evaluation, one may want to evaluate from a connectivity stand
point by looking at the graph structure constructed from the above
datasets. getGraphMetrics()
can perform k nearest neighbor
(KNN) graph or shared nearest neighbor graph (SNN) construction from an
embedding and then apply graph-based evaluation metrics.
# Some functions for plotting
plotGraphs <- function(d, k=7){
gn <- dplyr::bind_rows(lapply(split(d[,-1],d$graph), FUN=function(d1){
nn <- emb2knn(as.matrix(d1[,c("x","y")]), k=k)
g <- poem:::.nn2graph(nn, labels=d1$class)
ggnetwork(g, layout=as.matrix(d1[,1:2]), scale=FALSE)
}), .id="graph")
ggplot(gn, aes(x = x, y = y, xend = xend, yend = yend)) + theme_blank() +
theme(legend.position = "right") + geom_edges(alpha=0.5, colour="grey") +
geom_nodes(aes(colour=class, shape=class), size=2) + facet_wrap(~graph, nrow=1)
}
For our examples `g1` and `g2`, the constructed graphs will look like:
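The figure can be reproduced with the `plotGraphs()` helper defined above (this assumes `g1` and `g2` from the earlier chunks are still in the session):

```r
# plot the kNN graphs of both toy embeddings side by side
plotGraphs(rbind(g1, g2), k=7)
```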
Use `?getGraphMetrics()` to check the optional arguments for kNN/sNN graph construction. Similarly, `level` can be `"element"`, `"class"` or `"dataset"`.
getGraphMetrics(x=g1[,c("x","y")], labels=g1$class, metrics=c("PWC","ISI"),
level="class", directed=FALSE, k=7, shared=FALSE)
## class PWC ISI
## class1 class1 0.05 1.186272
## class2 class2 0.10 1.224188
## class3 class3 0.05 1.149098
## class4 class4 0.05 1.251146
res1 <- getGraphMetrics(x=g1[,c("x","y")], labels=g1$class,metrics=c("PWC","ISI"), level="class", directed=FALSE, k=7, shared=FALSE)
res2 <- getGraphMetrics(x=g2[,c("x","y")], labels=g2$class, metrics=c("PWC","ISI"), level="class", directed=FALSE, k=7, shared=FALSE)
bind_rows(list(graph1=res1, graph2=res2), .id="graph") %>%
pivot_longer(cols=c("PWC","ISI"), names_to="metric",values_to="value") %>%
ggplot(aes(class, value, fill=graph, group=graph)) +
geom_bar(position = "dodge", stat = "identity") +
facet_wrap(~metric) +
theme_bw()
Alternatively, `getGraphMetrics()` can take an igraph object as `x`, which enables applying the evaluation metrics to a general graph, or a list of nearest neighbors as `x`, to accelerate the computation for large datasets.
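For instance, one could build the neighbor structure once and pass it directly (a sketch; see `?getGraphMetrics` for the exact accepted input formats):

```r
library(poem)
data(toyExamples)
g1 <- toyExamples[toyExamples$graph=="graph1",]

# precompute the kNN structure from the embedding
nn <- emb2knn(as.matrix(g1[,c("x","y")]), k=7)

# pass the nearest-neighbor results directly, skipping graph construction
# from coordinates on each call
getGraphMetrics(x=nn, labels=g1$class, metrics=c("PWC","ISI"),
                level="class")
```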
Partition evaluation
We construct sNN graphs from the `g1` and `g2` embeddings, and then apply the Louvain algorithm to obtain partitions from them.
k <- 7
r <- 0.5
snn1 <- emb2snn(as.matrix(g1[,c("x","y")]), k=k)
snn2 <- emb2snn(as.matrix(g2[,c("x","y")]), k=k)
g1$cluster <- factor(igraph::cluster_louvain(snn1, resolution = r)$membership)
g2$cluster <- factor(igraph::cluster_louvain(snn2, resolution = r)$membership)
ggplot(rbind(g1,g2), aes(x,y,color=cluster, shape=class)) +
geom_point() +
facet_wrap(~graph) +
theme_bw()
We then compare the predictions with the known labels using the partition metrics:
# for g1
getPartitionMetrics(true=g1$class, pred=g1$cluster, level="dataset",
metrics = c("RI", "WC", "WH", "ARI", "AWC", "AWH",
"FM", "AMI"))
## RI WC WH ARI AWC AWH FM AMI
## 1 0.9636076 0.925 0.9237845 0.9004285 0.9012088 0.8996496 0.9624922 0.8872892
# for g2
getPartitionMetrics(true=g2$class, pred=g2$cluster, level="dataset",
metrics = c("RI", "WC", "WH", "ARI", "AWC", "AWH",
"FM", "AMI"))
## RI WC WH ARI AWC AWH FM AMI
## 1 0.721519 0.95 0.4616368 0.4400954 0.9010025 0.2911552 0.6501669 0.4193846
Note that for class-level metrics, some are reported per class, while some (specifically, “WH” and “AWH”) are reported per cluster.
getPartitionMetrics(true=g1$class, pred=g2$cluster, level="class")
## WC AWC FM class WH AWH cluster
## 1 0.9 0.802005 0.6551724 class1 NA NA <NA>
## 2 0.9 0.802005 0.6551724 class2 NA NA <NA>
## 3 1.0 1.000000 0.6451613 class3 NA NA <NA>
## 4 1.0 1.000000 0.6451613 class4 NA NA <NA>
## 5 NA NA NA <NA> 0.4864865 0.3238739 1
## 6 NA NA NA <NA> 0.4413473 0.2644406 2
Fuzzy partition evaluation
For comparing two fuzzy partitions or comparing a fuzzy partition to
a hard patition, one can use
getFuzzyPartitionMetrics()
.
The fuzzy representation of a partition should look like the following, where each row is a data point and the values are its class memberships. Each row sums to 1.
fuzzyTrue <- matrix(c(
0.95, 0.025, 0.025,
0.98, 0.01, 0.01,
0.96, 0.02, 0.02,
0.95, 0.04, 0.01,
0.95, 0.01, 0.04,
0.99, 0.005, 0.005,
0.025, 0.95, 0.025,
0.97, 0.02, 0.01,
0.025, 0.025, 0.95),
ncol = 3, byrow=TRUE)
# a hard truth:
hardTrue <- apply(fuzzyTrue,1,FUN=which.max)
# some predicted labels:
hardPred <- c(1,1,1,1,1,1,2,2,2)
getFuzzyPartitionMetrics(hardPred=hardPred, hardTrue=hardTrue,
fuzzyTrue=fuzzyTrue, nperms=3, level="class")
## fuzzyWC fuzzyAWC class fuzzyWH fuzzyAWH cluster
## 1 0.7195238 0.3542847 1 NA NA NA
## 2 1.0000000 NaN 2 NA NA NA
## 3 1.0000000 NaN 3 NA NA NA
## 4 NA NA NA 1.00000000 1.0000000 1
## 5 NA NA NA 0.06166667 -0.8064171 2
Through the inputs `hardPred`, `hardTrue`, `fuzzyPred` and `fuzzyTrue`, one can control whether the fuzzy or hard version of each of the two partitions is used in the comparison. For example, when `fuzzyTrue` and `fuzzyPred` are not `NULL`, metrics for comparing two fuzzy partitions will be used.
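For example, to compare two fuzzy partitions, one can pass both `fuzzyTrue` and `fuzzyPred`. The matrices below are purely illustrative (9 points, with a 3-class truth and a 2-cluster prediction); each row sums to 1:

```r
library(poem)

# illustrative fuzzy ground truth: 9 points, 3 classes
fuzzyTrue <- matrix(c(
  0.95, 0.025, 0.025,
  0.98, 0.01, 0.01,
  0.96, 0.02, 0.02,
  0.95, 0.04, 0.01,
  0.95, 0.01, 0.04,
  0.99, 0.005, 0.005,
  0.025, 0.95, 0.025,
  0.97, 0.02, 0.01,
  0.025, 0.025, 0.95), ncol=3, byrow=TRUE)

# illustrative fuzzy prediction: same 9 points, 2 clusters
fuzzyPred <- matrix(c(
  0.9, 0.1,
  0.9, 0.1,
  0.8, 0.2,
  0.95, 0.05,
  0.85, 0.15,
  0.9, 0.1,
  0.1, 0.9,
  0.8, 0.2,
  0.1, 0.9), ncol=2, byrow=TRUE)

# both fuzzy inputs provided, so fuzzy-vs-fuzzy metrics are used
getFuzzyPartitionMetrics(fuzzyPred=fuzzyPred, fuzzyTrue=fuzzyTrue,
                         nperms=3, level="dataset")
```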
Spatial clustering evaluation
Example data
We use another toy example dataset in the package,
sp_toys
, to illustrate spatial clustering evaluation.
data(sp_toys)
s <- 3
st <- 1
p1 <- ggplot(sp_toys, aes(x, y,
color=label)) +
geom_point(size=s, alpha=0.5) + scale_y_reverse() + theme_bw() +
geom_point(shape = 1, size = s, stroke = st, aes(color=p1)) +
labs(x="",y="", title="P1")
p0 <- ggplot(sp_toys, aes(x, y,
color=label)) +
geom_point(size=s, alpha=0.5) + scale_y_reverse() + theme_bw() +
geom_point(shape = 1, size = s, stroke = st, aes(color=label)) +
labs(x="",y="", title="C")
p2 <- ggplot(sp_toys, aes(x, y,
color=label)) +
geom_point(size=s, alpha=0.5) + scale_y_reverse() + theme_bw() +
geom_point(shape = 1, size = s, stroke = st, aes(color=p2)) +
labs(x="",y="", title="P2")
plot_grid(p0 + theme(legend.position = "none",
plot.title = element_text(hjust = 0.5)),
p1 + theme(legend.position = "none",
plot.title = element_text(hjust = 0.5)),
p2 + theme(legend.position = "none",
plot.title = element_text(hjust = 0.5)), ncol = 3)
Here in C, the spots are colored by the ground-truth class. In P1 and P2, the fill of each spot reflects the ground-truth class, while the border color reflects the clustering prediction. P1 and P2 misclassify the same number of red spots into the blue cluster.
External metrics
Let’s quantify this by calculating external spatial metrics:
getSpatialExternalMetrics(true=sp_toys$label, pred=sp_toys$p1,
location=sp_toys[,c("x","y")], level="dataset",
metrics=c("SpatialARI","SpatialAccuracy"),
fuzzy_true = TRUE, fuzzy_pred = FALSE)
## SpatialARI SpatialAccuracy
## 1 0.7871135 0.956746
By specifying `fuzzy_true` and `fuzzy_pred`, one can control whether the fuzzy or hard version of `true` and `pred` is used in the comparison. If `fuzzy_true` or `fuzzy_pred` is `TRUE`, the spatial neighborhood information will be used to construct the fuzzy representation of the class/cluster memberships.
getSpatialExternalMetrics(true=sp_toys$label, pred=sp_toys$p1,
location=sp_toys[,c("x","y")], level="class")
## SpatialWH SpatialAWH SpatialWC SpatialAWC class cluster
## 1 NA NA 0.8078698 0.5929504 1 NA
## 2 NA NA 1.0000000 1.0000000 2 NA
## 3 1.0000000 1.0000000 NA NA NA 1
## 4 0.8323893 0.6493279 NA NA NA 2
res1.1 <- getSpatialExternalMetrics(true=sp_toys$label, pred=sp_toys$p1,
location=sp_toys[,c("x","y")], level="dataset",
metrics=c("SpatialARI","SpatialAccuracy"),
fuzzy_true = TRUE, fuzzy_pred = FALSE)
res2.1 <- getSpatialExternalMetrics(true=sp_toys$label, pred=sp_toys$p2,
location=sp_toys[,c("x","y")], level="dataset",
metrics=c("SpatialARI","SpatialAccuracy"),
fuzzy_true = TRUE, fuzzy_pred = FALSE)
res1.2 <- getPartitionMetrics(true=sp_toys$label, pred=sp_toys$p1,
level="dataset", metrics=c("ARI"))
res2.2 <- getPartitionMetrics(true=sp_toys$label, pred=sp_toys$p2,
level="dataset", metrics=c("ARI"))
cbind(bind_rows(list(res1.1, res2.1), .id="P"),
bind_rows(list(res1.2, res2.2), .id="P")) %>%
pivot_longer(cols=c("SpatialARI", "SpatialAccuracy", "ARI"),
names_to="metric", values_to="value") %>%
ggplot(aes(x=P, y=value, group=metric)) +
geom_point(size=3, aes(color=P)) +
facet_wrap(~metric) +
theme_bw() + labs(x="Prediction")
When the evaluation is non-spatial-aware, P1 and P2 get the same ARI score. However, with spatial-aware metrics like SpatialARI and SpatialAccuracy, P2 gets higher scores than P1.
Internal metrics
Last but not least, there are internal metrics for spatial clustering evaluation:
sp_toys$c_elsa <- getSpatialInternalMetrics(label=sp_toys$label,
location=sp_toys[,c("x","y")], level="element",
metrics=c("ELSA"))$ELSA
## the specified variable is considered as categorical...
sp_toys$p1_elsa <- getSpatialInternalMetrics(label=sp_toys$p1,
location=sp_toys[,c("x","y")], level="element",
metrics=c("ELSA"))$ELSA
## the specified variable is considered as categorical...
sp_toys$p2_elsa <- getSpatialInternalMetrics(label=sp_toys$p2,
location=sp_toys[,c("x","y")], level="element",
metrics=c("ELSA"))$ELSA
## the specified variable is considered as categorical...
s <- 3
st <- 1
p1 <- ggplot(sp_toys, aes(x, y,
color=p1_elsa)) +
geom_point(size=s, alpha=0.5) + scale_y_reverse() + theme_bw() +
labs(x="",y="", title="P1", color="ELSA") +
scico::scale_color_scico(palette = "roma", limits = c(0, 1), direction=-1)
p0 <- ggplot(sp_toys, aes(x, y,
color=c_elsa)) +
geom_point(size=s, alpha=0.5) + scale_y_reverse() + theme_bw() +
labs(x="",y="", title="C", color="ELSA") +
scico::scale_color_scico(palette = "roma", limits = c(0, 1), direction=-1)
p2 <- ggplot(sp_toys, aes(x, y,
color=p2_elsa)) +
geom_point(size=s, alpha=0.5) + scale_y_reverse() + theme_bw() +
labs(x="",y="", title="P2", color="ELSA") +
scico::scale_color_scico(palette = "roma", limits = c(0, 1), direction=-1)
plot_grid(p0 + theme(plot.title = element_text(hjust = 0.5)),
p1 + theme(plot.title = element_text(hjust = 0.5)),
p2 + theme(plot.title = element_text(hjust = 0.5)),
nrow=1, rel_widths=c(1,1,1))
Session info
sessionInfo()
## R version 4.4.2 (2024-10-31)
## Platform: x86_64-pc-linux-gnu
## Running under: Ubuntu 22.04.5 LTS
##
## Matrix products: default
## BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.10.0
## LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.10.0
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## time zone: Europe/Zurich
## tzcode source: system (glibc)
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] cowplot_1.1.3 igraph_2.1.1 ggnetwork_0.5.13 tidyr_1.3.1
## [5] dplyr_1.1.4 ggplot2_3.5.1 poem_0.99.2 BiocStyle_2.32.1
##
## loaded via a namespace (and not attached):
## [1] tidyselect_1.2.1 farver_2.1.2 fastmap_1.2.0
## [4] bluster_1.14.0 digest_0.6.36 lifecycle_1.0.4
## [7] sf_1.0-6 cluster_2.1.6 terra_1.5-21
## [10] dbscan_1.2-0 magrittr_2.0.3 compiler_4.4.2
## [13] rlang_1.1.4 sass_0.4.9 tools_4.4.2
## [16] mclustcomp_0.3.3 utf8_1.2.4 yaml_2.3.10
## [19] knitr_1.48 labeling_0.4.3 htmlwidgets_1.6.4
## [22] sp_2.1-4 classInt_0.4-10 scico_1.5.0
## [25] BiocParallel_1.38.0 KernSmooth_2.23-24 fclust_2.1.1.1
## [28] elsa_1.1-28 withr_3.0.1 purrr_1.0.2
## [31] BiocGenerics_0.50.0 desc_1.4.3 grid_4.4.2
## [34] stats4_4.4.2 fansi_1.0.6 e1071_1.7-9
## [37] colorspace_2.1-1 aricode_1.0.3 scales_1.3.0
## [40] MASS_7.3-61 cli_3.6.3 rmarkdown_2.27
## [43] ragg_1.3.2 generics_0.1.3 rstudioapi_0.16.0
## [46] spdep_1.3-6 DBI_1.2.3 cachem_1.1.0
## [49] proxy_0.4-27 parallel_4.4.2 BiocManager_1.30.23
## [52] s2_1.1.7 vctrs_0.6.5 boot_1.3-30
## [55] Matrix_1.7-0 jsonlite_1.8.8 spData_2.3.3
## [58] bookdown_0.40 clevr_0.1.2 S4Vectors_0.42.1
## [61] BiocNeighbors_1.22.0 clue_0.3-65 crosstalk_1.2.1
## [64] systemfonts_1.1.0 jquerylib_0.1.4 units_0.8-0
## [67] glue_1.8.0 pkgdown_2.1.1 codetools_0.2-20
## [70] DT_0.33 gtable_0.3.5 deldir_2.0-4
## [73] raster_3.5-15 munsell_0.5.1 tibble_3.2.1
## [76] pillar_1.9.0 htmltools_0.5.8.1 R6_2.5.1
## [79] wk_0.9.4 textshaping_0.3.6 Rdpack_2.6.1
## [82] evaluate_0.24.0 lattice_0.22-6 highr_0.11
## [85] rbibutils_2.3 bslib_0.8.0 class_7.3-22
## [88] Rcpp_1.0.13 xfun_0.46 fs_1.6.4
## [91] pkgconfig_2.0.3