8. How to draw a histogram


Often, the simplest way to visualize your dataset or sample will be through a histogram, also called frequency histogram. The histogram will eventually represent the distribution of the data and the shape of the chart will most certainly give you information on whether this distribution is symetrical, bimodal, skewed…

This type of chart requires that you bin the range of values of the dataset to create a frequency table. This means that you “create” a decent number of intervals (let’s say 10 or 20 to begin with) and check how many values fall into each of these ranges. In our example, the data range from 0,029652413 (the minimum, see table) to 0,995863502 (maximum, see table). Let’s round it up and say that it ranges from 0 to 1. We know “create” 10 intervals (0-0.1; 0.1-0.2; 0.2-0.3; …; 0.9-1). When done, we draw a histogram (chart with vertical bars) where each bar represents an interval. The height of the bar is defined by the number of values contained in the range.

 

parameterTo do this in MS Excel, we’ll use again the Data Analysis icon of the Analysis section installed by the Analysis ToolPak (see here). Prior to clicking on the icon, create the list of intervals on your worksheet: enter “0.1” in B1, then “0.2” in B2, … , and finally 1 in B10. Note that the intervals must be placed in ascending order. Now, click on the Data Analysis icon. This time, choose Histogram and click OK. When the Histogram dialog box shows up, fill in the necessary fields.

In Input Range, indicate the range of cells containing the dataset (A1:A100). In Bin Range, indicate the range of cells describing the intervals (B1:B10). In Output Range, choose where you want the table and chart to be located (see here). The last 3 boxes define what will be created by the tool.

 

 

paretoIf you choose pareto, you will get a 4-column table indicating the bins (intervals) and the number of data falling into those intervals in the first two columns, while the last two columns contain the same information, but presented in decreasing order of frequency.

 

 

 

 

 

 

cumulativeIf you choose Cumulative Percentage, you’ll get a table with the same intervals and frequencies, as well as a third column indicating the cumulative percentage.

 

 

 

 

 

 

 
 

 

final histogramIf you choose Chart Output, you finally get the histogram that we were trying to build since a few minutes. Note that the corresponding table of frequencies still shows up next to the chart.

Note that if you choose to tick Chart Output AND one of the two options above, your histogram will be replaced by the chart corresponding to the table that you chose (Pareto or Cumulative Percentage).

  Fant du det du lette etter? Did you find this helpful?
[Average: 0]