Tuesday, January 20, 2015

Boxplots grouped by categories

Simulating self-organizing systems often requires a compact representation of numerical data. One way to achieve this is with boxplots, which summarize the statistical distribution of a data series through its quartiles. Usually, the box shows the median and the lower and upper quartiles of the series. The whiskers mark the lowest datum still within 1.5 IQR (interquartile range) of the lower quartile and the highest datum still within 1.5 IQR of the upper quartile. A boxplot therefore packs a good deal of information for the statistical interpretation of data into a small space.

Most tools for statistical computing and graphics can build boxplots easily, e.g., the boxplot function in R, the boxplot function in MATLAB, and the boxplot function in Python. So there are plenty of accessible tools for drawing boxplots, but things get tricky when the boxes need to be grouped by category. To achieve this, Sergii Zhevzhyk wrote a Python program using the matplotlib library, which supports customization and adaptation of graphs. The data are loaded from the given CSV files. A sample boxplot is shown below. The source code of our implementation can be found on GitHub.
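To make the whisker rule described above concrete, here is a minimal sketch that computes the box and whisker values for one data series using only the Python standard library. The function name `box_stats` and the sample numbers are made up for illustration, and the quartile convention used (linear interpolation, `method="inclusive"`) is an assumption: other tools may compute quartiles slightly differently.

```python
from statistics import quantiles


def box_stats(values, whis=1.5):
    """Return the numbers a boxplot draws for one data series."""
    data = sorted(values)
    # Quartiles by linear interpolation between data points (assumed convention).
    q1, median, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - whis * iqr
    high_fence = q3 + whis * iqr
    # Whiskers end at the most extreme data points still inside the fences.
    inside = [v for v in data if low_fence <= v <= high_fence]
    outliers = [v for v in data if v < low_fence or v > high_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": inside[0],
        "whisker_high": inside[-1],
        "outliers": outliers,
    }


# Illustrative values only: nine evenly spread points plus one extreme value.
stats = box_stats([1, 2, 3, 4, 5, 6, 7, 8, 9, 30])
print(stats)
```

For this series the IQR is 4.5, so the upper fence lies at 7.75 + 1.5 × 4.5 = 14.5. The upper whisker therefore stops at 9, and the value 30 would be drawn as a separate outlier point.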
The image above shows the results of two measurements for different types of candy. The two measurements are easy to compare because their boxes are placed close to each other and have different colors. Two files (first file, second file) contain the data for this boxplot.
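The grouping idea itself can be sketched in a few lines of matplotlib. This is not the program mentioned above, only a minimal illustration of one possible approach: each measurement gets its own `boxplot` call with shifted x positions and its own fill color. The helper names `box_positions` and `grouped_boxplot`, the candy names, and all numbers are hypothetical, and the data are defined inline rather than read from CSV files.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line to open a window
import matplotlib.pyplot as plt


def box_positions(n_categories, n_series):
    """One list of x positions per series; category c is centred on c * (n_series + 1)."""
    step = n_series + 1  # leaves a one-box gap between neighbouring groups
    offset = (n_series - 1) / 2
    return [[c * step + s - offset for c in range(n_categories)]
            for s in range(n_series)]


def grouped_boxplot(data, colors):
    """data maps series name -> {category: list of values}."""
    series = list(data)
    categories = list(data[series[0]])
    positions = box_positions(len(categories), len(series))
    fig, ax = plt.subplots()
    handles = []
    for name, pos, color in zip(series, positions, colors):
        values = [data[name][c] for c in categories]
        # patch_artist=True draws filled boxes so each series can be colored.
        bp = ax.boxplot(values, positions=pos, widths=0.8,
                        patch_artist=True, manage_ticks=False)
        for box in bp["boxes"]:
            box.set_facecolor(color)
        handles.append(bp["boxes"][0])
    # One tick per category, centred under its group of boxes.
    ax.set_xticks([c * (len(series) + 1) for c in range(len(categories))])
    ax.set_xticklabels(categories)
    ax.legend(handles, series)
    return fig, ax


# Made-up example data: two measurements for three candy types.
SAMPLE = {
    "measurement 1": {"caramel": [4, 5, 6, 7, 9], "toffee": [2, 3, 3, 4, 6],
                      "fudge": [5, 6, 8, 8, 10]},
    "measurement 2": {"caramel": [5, 6, 6, 8, 9], "toffee": [3, 4, 5, 5, 7],
                      "fudge": [4, 5, 6, 7, 8]},
}
fig, ax = grouped_boxplot(SAMPLE, ["tab:blue", "tab:orange"])
# fig.savefig("grouped_boxplot.png") would write the figure to disk.
```

Spacing the group centres `n_series + 1` units apart keeps the boxes of one category adjacent while leaving a visible gap between categories, which is what makes the side-by-side comparison easy to read.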
