I. Overview
a) A box plot is a statistical chart that shows the distribution of a continuous dataset. Use it in data analysis when you want to understand how the data is distributed or to spot outliers.
b) A box plot gives an intuitive view of the distribution of a continuous dataset and makes outliers quick to identify. However, it cannot show how the data changes over time.
c) Box Plot provides two data forms, namely, Result and Detailed.
Result: The dataset already stores the statistics, and the box plot uses them directly.
Detailed: The dataset stores the individual data points, and FR calculates the box plot statistics from them.
d) Suggested Reading: Chart Data, Chart Style, Chart Special Effects
II. Basic concepts
A box plot involves the following statistics:
Maximum: The upper limit used to identify outliers, not the largest value in the data. Maximum = Q3 + 1.5 * IQR
Third quartile (Q3 / 75th percentile): The value at the 75% position when the sample is sorted in ascending order, i.e., 75% of the values fall at or below it.
Median (Q2 / 50th percentile): The value at the 50% position when the sample is sorted in ascending order, i.e., the middle value.
First quartile (Q1 / 25th percentile): The value at the 25% position when the sample is sorted in ascending order, i.e., 25% of the values fall at or below it.
Minimum: The lower limit used to identify outliers, not the smallest value in the data. Minimum = Q1 - 1.5 * IQR
Interquartile range (IQR): The difference between Q3 and Q1. IQR = Q3 - Q1
Outlier: A data point that falls above the maximum or below the minimum.
There is a line in the middle of the box, which represents the median value (Q2).
The upper and lower edges of the box represent the third quartile (Q3) and the first quartile (Q1) respectively, so the box contains the middle 50% of the data. The box height therefore reflects how concentrated the data is: the flatter the box, the more concentrated the data. Likewise, the shorter the end lines, the more concentrated the data.
The upper and lower end lines represent the maximum and minimum values used to define outliers. Data points that fall above the maximum or below the minimum are treated as outliers.
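The relationship between these statistics can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `quantile` helper uses linear interpolation between neighboring values, which may differ from the method FR actually uses, and the function names and sample values are hypothetical.

```python
def quantile(sorted_values, p):
    """Return the p-quantile (0 <= p <= 1) of an ascending list.

    Assumption: linear interpolation between the two closest values.
    FR may use a different quartile method.
    """
    if not sorted_values:
        raise ValueError("need at least one value")
    position = (len(sorted_values) - 1) * p
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] + (sorted_values[upper] - sorted_values[lower]) * fraction


def box_plot_stats(values):
    """Compute the box plot statistics described in this section."""
    data = sorted(values)
    q1 = quantile(data, 0.25)
    median = quantile(data, 0.50)
    q3 = quantile(data, 0.75)
    iqr = q3 - q1
    minimum = q1 - 1.5 * iqr  # lower limit for outliers, not the smallest value
    maximum = q3 + 1.5 * iqr  # upper limit for outliers, not the largest value
    outliers = [v for v in data if v < minimum or v > maximum]
    return {
        "maximum": maximum,
        "q3": q3,
        "median": median,
        "q1": q1,
        "minimum": minimum,
        "iqr": iqr,
        "outliers": outliers,
    }


# Hypothetical sample: nine closely spaced values and one extreme value.
sample = [1, 2, 3, 4, 5, 6, 7, 8, 9, 30]
stats = box_plot_stats(sample)
print(stats)
```

With this sample, Q1 = 3.25 and Q3 = 7.75, so IQR = 4.5 and the limits are -3.5 and 14.5. The value 30 lies above the maximum and is the only outlier, even though it is the largest value in the data.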
III. Steps
1. Data preparation
1. When Detailed is used
Create a file dataset named ProductInfo and select the following Excel file:
Data preview: The preview shows that the dataset contains detailed data.
2. When Result is used
Create a file dataset named ProductInfo2 and select the following Excel file:
Data preview: The preview shows that the dataset contains result data.
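To make the difference between the two data forms concrete, the sketch below converts a few detailed rows into result rows. It is an illustration only: the sample values and the result column names (`Max`, `Q3`, `Median`, `Q1`, `Min`) are hypothetical and not taken from the Excel files above, and the quartiles come from Python's `statistics.quantiles` with the `inclusive` method, which may not match FR's own calculation.

```python
from collections import defaultdict
from statistics import quantiles

# Hypothetical detailed data: one row per observation, one column per product.
detailed_rows = [
    {"Feature": "Battery", "Product1": 5, "Product2": 2, "Product3": 10},
    {"Feature": "Battery", "Product1": 6, "Product2": 4, "Product3": 20},
    {"Feature": "Battery", "Product1": 7, "Product2": 6, "Product3": 30},
    {"Feature": "Battery", "Product1": 8, "Product2": 8, "Product3": 40},
    {"Feature": "Battery", "Product1": 9, "Product2": 10, "Product3": 50},
]


def to_result_rows(rows, series_fields):
    """Turn detailed rows into one row of statistics per category and series."""
    grouped = defaultdict(list)
    for row in rows:
        for field in series_fields:
            grouped[(row["Feature"], field)].append(row[field])

    result = []
    for (feature, product), values in grouped.items():
        # Assumption: "inclusive" quartiles; FR may use another method.
        q1, median, q3 = quantiles(values, n=4, method="inclusive")
        iqr = q3 - q1
        result.append({
            "Feature": feature,
            "Product": product,       # the field name becomes the series name
            "Max": q3 + 1.5 * iqr,    # hypothetical column names
            "Q3": q3,
            "Median": median,
            "Q1": q1,
            "Min": q1 - 1.5 * iqr,
        })
    return result


result_rows = to_result_rows(detailed_rows, ["Product1", "Product2", "Product3"])
for row in result_rows:
    print(row)
```

In the detailed form, each product is a separate field and the statistics still have to be calculated. In the result form, the product is a value in a single series field and every statistic has its own column, so no individual data points remain.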
2. Insert a chart
This example uses a floating chart. For how to insert one, see Insert Floating Chart:
Click [Box Plot] on the left of the [Chart Type] panel that pops up.
Click [Box Plot].
Click [OK]. The box plot is inserted.
3. Data Binding
1. Detailed
Bind detailed data with the following settings:
Double-click the chart.
Click [Cell Element] on the right panel. Click [Data].
Select [Dataset data] for [Data source] and [ProductInfo] for [Dataset].
Set [Detailed] for [Data from].
Select [Feature] for [Category].
Use field names as series names, and add [Product1], [Product2] and [Product3].
2. Result
Bind result data with the following settings:
Double-click the chart.
Click [Cell Element] on the right panel. Click [Data].
Select [Dataset data] for [Data source] and [ProductInfo2] for [Dataset].
Select [Result] for [Data from].
Select [Feature] for [Category], and [Product] for [Series].
For each of the remaining statistics, select the corresponding field from the dataset.
4. Style Setting
1. General style
Select the chart and click [Style]. Most style settings are the same for all chart types; see Chart Style for how to set them. This example makes the following settings:
Change the chart title:
Click [Title]
Change [Text] of the title to Box Chart.
2. Special style
Some settings under [Style] > [Series] are specific to the box plot and differ from the general style settings. They are described below:
Border
Line width: Here you can set the line width of the border of the box chart.
Color: Here you can choose to use the color of each corresponding series, or use the same color.
Normal value: For detailed data, which contains individual data points, you can set the style of normal value points (points that are not outliers).
Rule or Custom: You can choose whether normal value points use a preset style or a custom image.
Preset style: If a preset style is used, you can set a shape for normal value points. Select [None] to display no normal value points in preview.
Color: If a preset style is used, normal value points can each take the color of their series, or all use the same color.
Radius: If a preset style is used, you can set the radius of normal value data points.
Outliers: For detailed data, which contains individual data points, you can set the style of outliers.
Rule or Custom: You can choose whether outliers use a preset style or a custom image.
Preset style: If a preset style is used, you can set a shape for outliers. Select [None] to display no outliers in preview.
Color: If a preset style is used, outliers can each take the color of their series, or all use the same color.
Radius: If a preset style is used, you can set the radius of outliers.
Note: The Normal Value and Outliers settings are not available for result data, because it contains no individual data points.
5. Set special effects
For details on special effects settings, see Setting of General Special Effects in Chart Special Effects.
6. Preview
1. Detailed: The prompt for each series shows the number of data points and all the calculated statistics. You can also see data point information for outliers.
2. Result: The series prompt shows all the statistics recorded in the dataset. However, you cannot see data point information.