Mike's Technology and Finance Blog: The Sample Mode

Monday, June 1, 2015

The Sample Mode

Part of Mike's Big Data, Data Mining, and Analytics Tutorial

The sample mode is the value that occurs most frequently in the sample. It is a suitable measure of center for nominal data, and it can also be used on higher-level data (ordinal and continuous).
Given the following 20 values generated between 1 and 3:

#Get 20 random integer values between 1 and 3
#(rounding runif() output makes 2 about twice as likely as 1 or 3)
x<-round(runif(20,1,3))
#sort and display the values
x<-sort(x)
x

##  [1] 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2 2 2 3 3 3
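Rounding a continuous uniform draw on [1, 3] sends every value in [1.5, 2.5) to 2, so 2 is about twice as likely as 1 or 3, which is why it dominates the output above. If equally likely integers are wanted instead, a minimal sketch could use sample(); the seed and the name y are illustrative assumptions, not part of the original example.

```r
set.seed(42)  # arbitrary seed, only so the draw is reproducible

# Draw 20 integers where 1, 2 and 3 are equally likely
y <- sample(1:3, size = 20, replace = TRUE)

sort(y)   # sorted values
table(y)  # frequency of each value
```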

These values can be summarized as frequencies of individual values (frequency referring to the number [count] of times each individual value appears in the set):

x_freq<-table(x)
x_freq

## x
##  1  2  3 
##  4 13  3

The mode can be determined by finding which table entries have the highest count:

names(x_freq)[which(x_freq == max(x_freq))]

## [1] "2"

Assessed graphically, the mode is the tallest bar:
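A bar chart along these lines can be drawn with base R's barplot(). This is only a sketch: the vector is rebuilt from the counts shown above (4 ones, 13 twos, 3 threes) rather than regenerated randomly, and the colours, labels, and the names bar_cols and mids are illustrative choices.

```r
# Rebuild the sorted sample from the counts shown above
x <- rep(c(1, 2, 3), times = c(4, 13, 3))
x_freq <- table(x)

# Highlight the tallest bar(s); the colours are an illustrative choice
bar_cols <- ifelse(as.vector(x_freq) == max(x_freq), "steelblue", "grey80")

# barplot() draws the chart and returns the x-axis midpoints of the bars
mids <- barplot(x_freq, col = bar_cols,
                xlab = "Value", ylab = "Frequency",
                main = "Frequency of each value in x")
```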

R does not have a built-in function to find the statistical mode (the base function mode() reports an object's storage type instead); however, it is easy to compute with a combination of table and which:

names(x_freq)[which(x_freq == max(x_freq))]

## [1] "2"
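The same idea can be wrapped in a small reusable function. The helper below is a hypothetical sketch (sample_mode is not a base R function); it returns every value tied for the highest count and converts the result back to numbers when the input is numeric, since names() always yields character strings. The colour vector is made-up nominal data used only to show a tie.

```r
# Hypothetical helper (not part of base R): returns every value tied for
# the highest frequency
sample_mode <- function(v) {
  freq <- table(v)                         # counts per distinct value
  modes <- names(freq)[freq == max(freq)]  # names() are always character
  # Convert back to numbers when the input was numeric
  if (is.numeric(v)) as.numeric(modes) else modes
}

# Counts taken from the example above: 4 ones, 13 twos, 3 threes
sample_mode(rep(c(1, 2, 3), times = c(4, 13, 3)))
## [1] 2

# Made-up nominal data with a two-way tie
sample_mode(c("red", "blue", "blue", "red", "green"))
## [1] "blue" "red"
```

Returning all tied values rather than just the first avoids silently hiding a multimodal sample.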
