Question
In R, mean() and median() are standard functions which do what you'd
expect. mode() tells you the internal storage mode of the object, not the
value that occurs the most in its argument. But is there a standard library
function that implements the statistical mode for a vector (or list)?
Answer
One more solution, which works for both numeric & character/factor data:
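Mode <- function(x) {
  ux <- unique(x)  # distinct values, in order of first appearance
  # count how often each distinct value occurs, then take the first maximum
  ux[which.max(tabulate(match(x, ux)))]
}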
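To see why this works, it can help to run the three steps separately on a small vector. The following is only an illustrative trace; the sample vector and the intermediate names idx, counts and result are made up for this example and are not part of the function itself.

```r
x <- c("b", "a", "b", "c", "a", "b")   # made-up sample data

ux     <- unique(x)        # distinct values in order of first appearance
idx    <- match(x, ux)     # position of each element of x within ux
counts <- tabulate(idx)    # how many times each position occurs
result <- ux[which.max(counts)]  # value at the first position with the highest count

print(ux)      # "b" "a" "c"
print(idx)     # 1 2 1 3 2 1
print(counts)  # 3 2 1
print(result)  # "b"
```

Because match() turns arbitrary values into small integer positions, tabulate() can count them directly, which is why the same code handles numeric, character and factor input.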
On my dinky little machine, generating a 10M-integer vector and finding its mode with this function takes about half a second.
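If you want to check the timing on your own hardware, a rough sketch along these lines should do. The seed, the value range 1..1000 and the variable names are assumptions chosen for the example, and the elapsed time will differ from machine to machine. Mode is repeated here only so the snippet runs on its own.

```r
Mode <- function(x) {
  ux <- unique(x)
  ux[which.max(tabulate(match(x, ux)))]
}

set.seed(42)   # arbitrary seed, only for reproducibility
n <- 1e7       # 10M elements

timing <- system.time({
  x <- sample.int(1000L, n, replace = TRUE)  # assumed range of values: 1..1000
  m <- Mode(x)
})

print(timing[["elapsed"]])  # elapsed seconds for generating the vector and finding its mode
print(m)
```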
If your data set might have multiple modes, the above solution takes the same
approach as which.max, and returns the first-appearing value of the set of
modes. To return all modes, use this variant (from @digEmAll in the
comments):
Modes <- function(x) {
  ux <- unique(x)
  tab <- tabulate(match(x, ux))
  ux[tab == max(tab)]
}
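A quick comparison on tied data shows the difference between the two. The sample vectors are made up for illustration, and both functions are repeated so the snippet runs on its own.

```r
Mode <- function(x) {
  ux <- unique(x)
  ux[which.max(tabulate(match(x, ux)))]
}

Modes <- function(x) {
  ux <- unique(x)
  tab <- tabulate(match(x, ux))
  ux[tab == max(tab)]
}

x <- c(1, 2, 2, 3, 3)          # 2 and 3 both occur twice
print(Mode(x))                 # 2   (the first-appearing mode only)
print(Modes(x))                # 2 3 (all tied modes)

y <- c("b", "a", "a", "b")     # "b" and "a" both occur twice
print(Mode(y))                 # "b"
print(Modes(y))                # "b" "a" (in order of first appearance)
```

Note that the result of Modes follows the order in which values first appear in the input, not sorted order.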