Computes a selection of external fuzzy clustering evaluation metrics.

Usage

getFuzzyPartitionMetrics(
  hardTrue = NULL,
  fuzzyTrue = NULL,
  hardPred = NULL,
  fuzzyPred = NULL,
  metrics = c("fuzzyWH", "fuzzyAWH", "fuzzyWC", "fuzzyAWC"),
  level = "class",
  nperms = NULL,
  verbose = TRUE,
  returnElementPairAccuracy = FALSE,
  BPPARAM = BiocParallel::SerialParam(),
  useNegatives = TRUE,
  usePairs = NULL,
  ...
)

Arguments

hardTrue

An atomic vector coercible to a factor or integer vector containing the true hard labels.

fuzzyTrue

An object coercible to a numeric matrix containing the true membership probabilities of elements (rows) in clusters (columns).

hardPred

An atomic vector coercible to a factor or integer vector containing the predicted hard labels.

fuzzyPred

An object coercible to a numeric matrix containing the predicted membership probabilities of elements (rows) in clusters (columns).

metrics

The metrics to compute. See details.

level

The level at which to calculate the metrics. Options are "element", "class" and "dataset".

nperms

The number of permutations (for correction for chance). If NULL (default), a first set of 10 permutations will be run to estimate whether the variation across permutations is above 0.0025, in which case more (max 1000) permutations will be run.
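The adaptive scheme described above can be sketched in base R. This is a minimal illustration, not poem's implementation: the toy statistic `permStat`, the helper names `sem` and `adaptivePerms`, the use of the standard error of the mean as the measure of variation, and the batch size of 100 are all assumptions made for the example.

```r
set.seed(42)

# Toy statistic (assumption, for illustration only): agreement between a
# fixed labelling and a randomly permuted copy of another labelling.
truth <- rep(1:3, each = 30)
pred  <- rep(1:3, times = 30)
permStat <- function() mean(truth == sample(pred))

# Standard error of the mean of a numeric vector.
sem <- function(x) sd(x) / sqrt(length(x))

# Hypothetical helper: start with a small set of permutations and add
# batches while the estimate is too imprecise and the budget allows it.
adaptivePerms <- function(stat, nFirst = 10, threshold = 0.0025,
                          nExtra = 100, nMax = 1000) {
  vals <- replicate(nFirst, stat())
  while (sem(vals) > threshold && length(vals) < nMax) {
    n <- min(nExtra, nMax - length(vals))
    vals <- c(vals, replicate(n, stat()))
  }
  list(mean = mean(vals), sem = sem(vals), nperms = length(vals))
}

res <- adaptivePerms(permStat)
str(res)
```

Setting nperms explicitly skips this kind of adaptive estimation and fixes the cost of the call, at the risk of a less precise chance correction when the number is small.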

verbose

Logical; whether to print info and warnings, including the standard error of the mean across permutations (giving an idea of the precision of the adjusted metrics).

returnElementPairAccuracy

Logical. If TRUE, returns the per-element pair accuracy instead of the various partition-level and dataset-level metrics. Default FALSE.

BPPARAM

BiocParallel parameters for parallel execution. Default BiocParallel::SerialParam(), i.e. no parallelization.

useNegatives

Logical; whether to include negative pairs in the concordance score (tends to result in a larger overall concordance and lower dynamic range of the score). Default TRUE.
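The effect of negative pairs can be illustrated with a simplified hard-label analogue. The sketch below is not the fuzzy concordance computed by poem. It counts agreement over element pairs for the hard labels used in the second example of this page, once over all pairs and once after dropping pairs that are separated in both partitions (the definition of "without negatives" used here is an assumption).

```r
# Hard labels from the second example on this page.
hardTrue <- c(1, 1, 1, 1, 1, 1, 2, 1, 3)
hardPred <- c(1, 1, 1, 1, 1, 1, 2, 2, 2)

# Enumerate all element pairs (one pair per column).
pairs <- combn(length(hardTrue), 2)
sameTrue <- hardTrue[pairs[1, ]] == hardTrue[pairs[2, ]]
samePred <- hardPred[pairs[1, ]] == hardPred[pairs[2, ]]
agree <- sameTrue == samePred

# With negatives: fraction of all pairs on which both partitions agree
# (the classical Rand index).
withNeg <- mean(agree)

# Without negatives: ignore pairs that are separated in both partitions.
keep <- sameTrue | samePred
withoutNeg <- mean(agree[keep])

c(withNeg = withNeg, withoutNeg = withoutNeg)
#>    withNeg withoutNeg
#>      0.750      0.625
```

Of the 36 pairs, 12 are separated in both partitions. Counting them as agreements raises the score from 0.625 to 0.75, which matches the tendency described for useNegatives = TRUE.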

usePairs

Logical; whether to compute over pairs instead of elements. Recommended and TRUE by default.

...

Optional arguments for poem::FuzzyPartitionMetrics(): tnorm. Only useful when fuzzy_true=TRUE and fuzzy_pred=TRUE.

Value

A data.frame of metric results.

Details

The allowed values for metrics depend on the value of level:

  • If level = "element", the allowed metrics are: "fuzzySPC".

  • If level = "class", the allowed metrics are: "fuzzyWH", "fuzzyAWH", "fuzzyWC", "fuzzyAWC".

  • If level = "dataset", the allowed metrics are: "fuzzyRI", "fuzzyARI", "fuzzyWH", "fuzzyAWH", "fuzzyWC", "fuzzyAWC".
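The list above can be encoded as a lookup table to check a metrics/level combination before calling the function. This is a sketch only: `allowedMetrics` and `checkMetrics` are hypothetical names that are not part of poem, and the package performs its own validation.

```r
# Allowed metrics per level, transcribed from the list above.
allowedMetrics <- list(
  element = c("fuzzySPC"),
  class   = c("fuzzyWH", "fuzzyAWH", "fuzzyWC", "fuzzyAWC"),
  dataset = c("fuzzyRI", "fuzzyARI", "fuzzyWH", "fuzzyAWH",
              "fuzzyWC", "fuzzyAWC")
)

# Hypothetical helper: stops on an unsupported combination, otherwise
# returns the requested metrics invisibly.
checkMetrics <- function(metrics, level = c("class", "element", "dataset")) {
  level <- match.arg(level)
  bad <- setdiff(metrics, allowedMetrics[[level]])
  if (length(bad) > 0) {
    stop("Metrics not available at level '", level, "': ",
         paste(bad, collapse = ", "))
  }
  invisible(metrics)
}

# A valid combination passes silently.
checkMetrics(c("fuzzyWH", "fuzzyAWC"), level = "class")

# "fuzzyARI" is only listed for level = "dataset", so this raises an error,
# which is captured here as a message.
msg <- tryCatch(checkMetrics("fuzzyARI", level = "class"),
                error = function(e) conditionMessage(e))
print(msg)
#> [1] "Metrics not available at level 'class': fuzzyARI"
```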

Examples

# generate fuzzy partitions:
m1 <- matrix(c(0.95, 0.025, 0.025, 
               0.98, 0.01, 0.01, 
               0.96, 0.02, 0.02, 
               0.95, 0.04, 0.01, 
               0.95, 0.01, 0.04, 
               0.99, 0.005, 0.005, 
               0.025, 0.95, 0.025, 
               0.97, 0.02, 0.01, 
               0.025, 0.025, 0.95), 
               ncol = 3, byrow=TRUE)
m2 <- matrix(c(0.95, 0.025, 0.025,  
               0.98, 0.01, 0.01, 
               0.96, 0.02, 0.02, 
               0.025, 0.95, 0.025, 
               0.02, 0.96, 0.02, 
               0.01, 0.98, 0.01, 
               0.05, 0.05, 0.95, 
               0.02, 0.02, 0.96, 
               0.01, 0.01, 0.98), 
               ncol = 3, byrow=TRUE)
colnames(m1) <- colnames(m2) <- LETTERS[1:3]
getFuzzyPartitionMetrics(fuzzyTrue=m1,fuzzyPred=m2, level="class")
#> Comparing between a fuzzy truth and a fuzzy prediction...
#> Running 100 extra permutations.
#> Standard error of the mean NDC across permutations:0.00233
#>     fuzzyWC    fuzzyAWC class   fuzzyWH   fuzzyAWH cluster
#> 1 0.3445840  0.04734015     1        NA         NA      NA
#> 2 0.7242508 -0.08356488     2        NA         NA      NA
#> 3 0.7520319  0.03308862     3        NA         NA      NA
#> 4        NA          NA    NA 0.9359492  0.8172083       1
#> 5        NA          NA    NA 0.9214151  0.8093812       2
#> 6        NA          NA    NA 0.1588990 -0.8914500       3

# generate a fuzzy truth:
fuzzyTrue <- matrix(c(
  0.95, 0.025, 0.025, 
  0.98, 0.01, 0.01, 
  0.96, 0.02, 0.02, 
  0.95, 0.04, 0.01, 
  0.95, 0.01, 0.04, 
  0.99, 0.005, 0.005, 
  0.025, 0.95, 0.025, 
  0.97, 0.02, 0.01, 
  0.025, 0.025, 0.95), 
  ncol = 3, byrow=TRUE)
# a hard truth:
hardTrue <- apply(fuzzyTrue,1,FUN=which.max)
# some predicted labels:
hardPred <- c(1,1,1,1,1,1,2,2,2)
getFuzzyPartitionMetrics(hardPred=hardPred, hardTrue=hardTrue, fuzzyTrue=fuzzyTrue, nperms=3, level="class")
#> Comparing between a fuzzy truth and a hard prediction...
#> Standard error of the mean NDC across permutations:0.106
#> You might want to increase the number of permutations to increase the robustness of the adjusted metrics.
#>     fuzzyWC  fuzzyAWC class    fuzzyWH  fuzzyAWH cluster
#> 1 0.7195238 0.3972369     1         NA        NA      NA
#> 2 1.0000000       NaN     2         NA        NA      NA
#> 3 1.0000000       NaN     3         NA        NA      NA
#> 4        NA        NA    NA 1.00000000  1.000000       1
#> 5        NA        NA    NA 0.06166667 -1.978836       2