The improvement of the information visualization techniques in business intelligence - design part

For my MSc thesis, I redesigned the information visualization techniques (graphs) used in the tool “SAOP Poročevalec” (SAOP Reporter), which is part of SAOP’s BI solution, iCenter. I chose the four graphs that communicate quantitative information best: bar graph, horizontal bar graph, line graph and scatter plot. I redesigned them according to a synthesis of information visualization theory and good practice. The criteria considered in the design phase were data structure, scale and tick marks, legends, title and notifications, background, grid lines, and aspect ratio. The improvement recommendations were evaluated by interviewing SAOP’s clients. The design recommendations are now under consideration by SAOP’s management.
  • Job: Graph Design
  • Mentor: dr. Jurij Jaklič
  • Keywords: Information Visualization, Business Intelligence, Interviews
  • MSc in PDF (Slovenian)

BAR GRAPH

The bar graph was generated with default settings and random data (from a test database), as were the other three graphs. Only the labels of the category groups on the horizontal (categorical) axis were set manually (months). The result is shown in the picture below. All graphs on this web page are shown at the same size as generated by the tool. For the redesign, I used only SAOP Poročevalec’s own tools and options; no Photoshop or other software was used. The bars represent the data structure of the graph.

In the design phase, the bars were changed: the “dark blue areas” with labels at the top of the bars and the black borderlines of the bars were removed. The width of the bars and the distances between them were balanced according to the recommended ratio. Both axes were thinned and lightened, tick marks were reduced (on the quantitative scale) and removed (on the categorical scale), and labels were lightened. Since there were no other categories, the legend was unnecessary and was removed; a title was added. The white area with the borderline at the bottom of the graph was removed. The grey background was changed to white to highlight the data, and the grid lines were redesigned into continuous, light lines. The aspect ratio was also considered (around 1 : 1.5). After the design phase, interviewees had to pick the one of three versions of the bar graph that they thought communicated best: the default graph (as seen above), the new graph with the recommended aspect ratio, or the new graph without it. The last one was chosen as the best (see image below).
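The aspect ratio arithmetic can be sketched in a few lines. This is only an illustration, not something SAOP Poročevalec exposes: the function name `height_for_width` is hypothetical, and it assumes that 1 : 1.5 is read as height : width (a graph wider than it is tall), which the text above does not state explicitly.

```python
def height_for_width(width_px: float,
                     ratio_height: float = 1.0,
                     ratio_width: float = 1.5) -> float:
    """Return the plot height for a given width.

    Assumption: the ratio is interpreted as height : width,
    so the default 1 : 1.5 yields a graph wider than it is tall.
    """
    if width_px <= 0 or ratio_height <= 0 or ratio_width <= 0:
        raise ValueError("width and ratio parts must be positive")
    return width_px * ratio_height / ratio_width


# Hypothetical example: a plot area 600 px wide
print(height_for_width(600))  # 400.0
```

If the ratio is meant the other way round (width : height), swapping the two ratio arguments gives the corresponding size.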

HORIZONTAL BAR GRAPH

When generating the horizontal bar graph, I manually changed the labels of the category groups on the vertical (categorical) axis to make the graph more meaningful: names are used instead of the default numbers. The result can be seen in the picture below. The bars represent the data structure of the graph.

In the design phase, the “dark areas” with labels at the end of the bars were removed. These dark areas appeared because the “3D” option was enabled; when I disabled it, they disappeared. The width of the bars and the distances between them were set according to the recommended ratio. Both axes were thinned and lightened, and tick marks were reduced (on the quantitative scale) and removed (on the categorical scale). The tick mark labels were lightened. The additional axes (at the top and on the right) were removed, as in all the graphs considered. The legend was removed and a title was added. The white area with the borderline at the bottom of the graph was removed, and the background of the graph was changed to white. Grid lines were removed completely, since the point of a horizontal bar graph is to order values from high to low, or low to high, for comparative analysis. Once again, two examples were made, and clients had to choose between the default graph, the new graph with the recommended aspect ratio and the new graph without it. The second one (new, with the recommended aspect ratio) was chosen and is shown below.

LINE GRAPH

A line graph is the best option for visualizing quantitative data over time (days, months, years etc.). The line represents the data structure. The data were not manipulated manually in any way. You can see the graph below. Let’s say that the numbers on the horizontal axis represent days and the numbers on the vertical axis represent the profit of a small business.

The first problem with the default version of the line graph is that there is no zero value on the vertical (quantitative) axis. As a result, the quantitative scale has no true baseline (the actual baseline here is around 820 units) and the shape of the line is not proportional to the actual values. Since the course of the line is important for comparative analysis, such a graph can lead to misleading interpretations. The line was thickened and, as a result of correcting the scales, moved away from both vertical axes. Tick marks were reduced on the quantitative scale and removed from the categorical scale. The white area at the bottom was removed, as were the upper and right axes of the graph. I excluded the legend from the graph, added a title, and set a white background to make the line stand out. The dotted grid lines were removed, and horizontal (discreet) grid lines were used instead. Two versions of the redesigned graph came out: one with an unchanged aspect ratio and one with the recommended aspect ratio. All interviewees chose the latter (shown in the image below).
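The effect of a missing zero baseline can be illustrated with a little arithmetic. The sketch below is not taken from the thesis: the two profit values are hypothetical, and only the baseline of about 820 units comes from the default graph described above. It compares how large one value looks relative to another when heights are measured from the truncated baseline and from zero.

```python
def apparent_ratio(a: float, b: float, baseline: float) -> float:
    """Ratio of the drawn heights of b and a when the axis starts at `baseline`."""
    if a <= baseline:
        raise ValueError("value a must lie above the baseline")
    return (b - baseline) / (a - baseline)


# Hypothetical daily profits; 820 is the approximate baseline of the default graph
low, high = 850.0, 900.0

truncated = apparent_ratio(low, high, baseline=820.0)
zero_based = apparent_ratio(low, high, baseline=0.0)

print(round(truncated, 2))   # 2.67
print(round(zero_based, 2))  # 1.06
```

With the truncated scale, the higher value appears to be more than two and a half times the lower one, although the real difference is about 6 %.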

SCATTER PLOT

The scatter plot was probably the most difficult graph to redesign, mainly for two reasons: (1) the design options for scatter plots were limited in SAOP Poročevalec (e.g. no “contour only” option for dots); (2) random data could not be generated in such a way that one dot would cover another (similar values). The first problem was solved by applying partial transparency to the whole dot, the second by manipulating the data in the data tables. I wanted to see how data structures with similar values would behave, since this happens in “real-life” scatter plots all the time. This was the only one of the four graphs where manipulation of the data tables was necessary. A scatter plot uses two quantitative scales, as can be seen in the picture below.

The rectangular dots in the graph area were too imprecise for reading values; in some cases (values 857.5 and 822.5) the rectangles completely failed to communicate the true values. Of the four graphs that I covered in my MSc thesis, the scatter plot is probably the only one where particular values are almost as important as the patterns. For this reason, the dots in the data area should be as accurate as possible while using the smallest amount of ink. I chose black contours instead of vertical rectangles.

If we take a closer look at the scales, we can see that neither the vertical nor the horizontal scale starts at the value 0; there is no baseline. In the case of a scatter plot, where particular values must be clearly visible within the pattern, we can make an exception and use the part of the scale that shows our values best. As Stephen Few suggested, in such a case there should be a notice close to the graph warning the user that the scale does not start at zero. Since data structures (dots) should never touch the axes, as was also the case with the line graph, I used the starting value 0 only on the horizontal quantitative scale. For the vertical scale, I decided to use a scale that places the values in the middle of the data area.

As in the previous cases, I decreased the number of tick marks (including the minor ones) on both scales. The tick mark labels were lightened, the legend and the additional axes were removed, and a title was inserted into the graph. To improve the visibility of the dots, the background was changed to white. The grid lines were redesigned into continuous, discreet lines. Since we are (also) interested in particular values, I kept both vertical and horizontal grid lines. In the first example of the newly designed graph, I used black contours that are not transparent. This graph is also not resized to the recommended aspect ratio. Despite that, it was chosen by the majority of users. You can see it below.

Then I decided to make a version with transparent circles, where only the black contours would be visible. In that case, all underlying values could also be seen. But as mentioned before, transparency could be set only for a dot as a whole, so my dots “turned grey” when transparency was added (image below).
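Why the dots turn grey, and why overlapping dots look darker, can be sketched with the usual alpha compositing arithmetic. This is only an illustration under the assumption that black dots are blended over a white background with standard “source over” compositing; how SAOP Poročevalec actually renders transparency is not documented here, and the function name `blended_gray` is hypothetical.

```python
def blended_gray(alpha: float, layers: int = 1) -> int:
    """Gray level (0 = black, 255 = white) of `layers` black dots,
    each with opacity `alpha`, stacked over a white background.

    Assumption: standard "source over" compositing, where each black
    layer scales the brightness beneath it by (1 - alpha).
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    if layers < 0:
        raise ValueError("layers must not be negative")
    return round(255 * (1.0 - alpha) ** layers)


# One, two and three overlapping dots at a hypothetical 50 % opacity
for n in (1, 2, 3):
    print(n, blended_gray(0.5, n))
```

A single semi-transparent dot is rendered as mid-grey, and each additional overlapping dot darkens the spot further, which is what makes covered values visible.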

In other versions, I used bigger circles to compensate for the “lightening of the dots” and to make the values easier to see (though less accurately), but those versions were rejected by all interviewees. One of these versions, which also uses the recommended aspect ratio, is shown below.

CONCLUSION

I see my recommendations mainly as a design basis for further software development. At the end of the interviews, users pointed out that these improvements should be built into the tool, since designing graphs demands time, special knowledge and skills. In the case of SAOP Poročevalec, I leave that decision to SAOP’s management. But as a researcher and designer, I believe that users should always have a choice: to design their own graph with appropriate tools and functions, or simply to select a properly prepared template for visualizing quantitative information. In any case, communication must be effective and easy for the user.