`from_signatures_to_upset_df()` converts a data frame like that produced by binding many data frames read in by `read_signature()` into an UpSet-compatible data frame. The minhashes (`mins`) are grouped by sample, and the `name` column is used for sample names. If the `name` column does not exist, sample names are derived from the base name of `filename`. If an individual name is blank ("") or NA, that name is likewise derived from the base name of `filename`. The input data frame must contain only one value for each of `ksize`, `scaled`/`num`, `hash_function`, `molecule`, and `seed`, because signatures calculated with different values for these parameters are not comparable. The returned data frame has the minhashes as row names and the sample names (or shortened filenames) as column names, and can be plotted with `UpSetR::upset()` or `ComplexUpset::upset()`. If plotting with ComplexUpset, additional metadata can be added to the data frame (and therefore the plot) by joining on the values of the minhash row names.

Usage

from_signatures_to_upset_df(signatures_df)

Arguments

signatures_df

A data frame of multiple sourmash signatures created by combining many signatures read in by `read_signature()`.

Value

An UpSet plot compliant data frame.

Examples

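if (FALSE) {
# signatures_df: signatures read in by read_signature() and bound into one data frame
upset_df <- from_signatures_to_upset_df(signatures_df)
}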
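
To illustrate the shape of the data frame described above (minhashes as row names, samples as column names), here is a minimal base R sketch on toy data. It does not call this package and is not its implementation: the toy `signatures_df`, its hash values, and the 0/1 presence/absence encoding are assumptions made for illustration only.

```r
# Toy stand-in for a combined signatures data frame (hypothetical values).
signatures_df <- data.frame(
  name = c("sample_a", "sample_a", "sample_a", "sample_b", "sample_b"),
  mins = c(11, 22, 33, 22, 44),
  stringsAsFactors = FALSE
)

# Split the minhashes into one vector per sample.
mins_by_sample <- split(signatures_df$mins, signatures_df$name)

# All distinct minhashes observed across samples, in sorted order.
all_mins <- sort(unique(signatures_df$mins))

# Build a 0/1 presence/absence table:
# one row per minhash, one column per sample.
upset_df <- as.data.frame(
  lapply(mins_by_sample, function(m) as.integer(all_mins %in% m))
)
rownames(upset_df) <- as.character(all_mins)

print(upset_df)
```

A table of this shape, with one 0/1 column per set, is the kind of input that `UpSetR::upset()` is designed to plot. The real function additionally validates that the signatures are comparable before building the table.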