fct_lump {forcats} | R Documentation |
Lump together least/most common factor levels into "other"
fct_lump(f, n, prop, w = NULL, other_level = "Other",
  ties.method = c("min", "average", "first", "last", "random", "max"))

fct_lump_min(f, min, w = NULL, other_level = "Other")
f |
A factor (or character vector). |
n, prop |
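If both n and prop are missing, fct_lump lumps together the least frequent levels into "other", while ensuring that "other" is still the smallest level. Positive n preserves the most common n values; negative n preserves the least common -n values. If there are ties, you will get at least abs(n) values. Positive prop preserves values that appear at least prop of the time; negative prop preserves values that appear at most -prop of the time. |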
w |
An optional numeric vector giving weights for frequency of each value (not level) in f. |
other_level |
Value of level used for "other" values. Always placed at end of levels. |
ties.method |
A character string specifying how ties are treated. See rank() for details. |
min |
Preserves values that appear at least min number of times. |
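As a rough illustration of how n, prop, and min pick the levels to keep, the sketch below reuses the deterministic counts from the documented examples (A = 40, B = 10, C = 5, D = 27, E to I = 1 each). It assumes forcats is installed; the variable names by_n, by_prop, by_min, and by_neg_n are made up for this sketch.

```r
library(forcats)

# 87 values in total: A = 40, B = 10, C = 5, D = 27, E..I = 1 each
x <- factor(rep(LETTERS[1:9], times = c(40, 10, 5, 27, 1, 1, 1, 1, 1)))

# Positive n: keep the 3 most common levels (A, D, B), lump the rest
by_n <- fct_lump(x, n = 3)
table(by_n)

# Positive prop: keep levels above 10% of the values (B is 10/87, about 11.5%)
by_prop <- fct_lump(x, prop = 0.1)
table(by_prop)

# fct_lump_min: keep levels that appear at least 5 times (A, B, C, D)
by_min <- fct_lump_min(x, min = 5)
table(by_min)

# Negative n: keep the rarest levels and lump the most common ones instead
by_neg_n <- fct_lump(x, n = -3)
levels(by_neg_n)
```

Because E to I are tied at one value each, the negative-n call keeps all five of them under the default ties.method, which is more than abs(n) levels.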
fct_other() to convert specified levels to other.
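For comparison, a minimal sketch of fct_other(), which lumps levels by name rather than by frequency. It assumes forcats is installed and uses the keep and drop arguments of fct_other(); the variable names kept and dropped are made up for this sketch.

```r
library(forcats)

x <- factor(rep(LETTERS[1:9], times = c(40, 10, 5, 27, 1, 1, 1, 1, 1)))

# Keep only the named levels; every other level is replaced by "Other"
kept <- fct_other(x, keep = c("A", "B"))
levels(kept)

# Or name the levels to replace, leaving the rest untouched
dropped <- fct_other(x, drop = c("A", "B"))
levels(dropped)
```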
library(forcats)

x <- factor(rep(LETTERS[1:9], times = c(40, 10, 5, 27, 1, 1, 1, 1, 1)))
x %>% table()
x %>% fct_lump() %>% table()
x %>% fct_lump() %>% fct_inorder() %>% table()

x <- factor(letters[rpois(100, 5)])
x
table(x)
table(fct_lump(x))

# Use positive values to collapse the rarest
fct_lump(x, n = 3)
fct_lump(x, prop = 0.1)

# Use negative values to collapse the most common
fct_lump(x, n = -3)
fct_lump(x, prop = -0.1)

# Use weighted frequencies
# (rpois() can return 0, which letters[] drops, so size w from length(x))
w <- c(rep(2, 50), rep(1, length(x) - 50))
fct_lump(x, n = 5, w = w)

# Use ties.method to control how tied factors are collapsed
fct_lump(x, n = 6)
fct_lump(x, n = 6, ties.method = "max")

x <- factor(letters[rpois(100, 5)])
fct_lump_min(x, min = 10)
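The examples above use random data, so the effect of ties.method and w varies from run to run. The sketch below is one possible deterministic variant with a small hand-built factor. It assumes forcats is installed; the data and the weights are invented purely for illustration.

```r
library(forcats)

# Level counts: a = 3, b = 2, c = 2, d = 1, e = 1 (b and c are tied)
f <- factor(c("a", "a", "a", "b", "b", "c", "c", "d", "e"))

# Default "min": both tied levels share rank 2, so both survive n = 2
lump_min <- fct_lump(f, n = 2)
levels(lump_min)

# "first": the tie is broken by position, so only b joins a
lump_first <- fct_lump(f, n = 2, ties.method = "first")
levels(lump_first)

# "max": both tied levels get rank 3, so neither makes the top 2
lump_max <- fct_lump(f, n = 2, ties.method = "max")
levels(lump_max)

# Weights apply per value (not per level): each "d" or "e" value counts 10 times,
# so the weighted totals become a = 3, b = 2, c = 2, d = 10, e = 10
w <- ifelse(f %in% c("d", "e"), 10, 1)
lump_weighted <- fct_lump(f, n = 2, w = w)
levels(lump_weighted)
```

The ranks follow base R's rank() applied to the negated level counts, which is why "min" is the most inclusive choice for ties and "max" the least.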