In this article, Michael Phillips, the author of the book TIBCO Spotfire: A Comprehensive Primer, discusses how human beings are fundamentally visual in the way they process information. The invention of writing was as much about visually representing our thoughts to others as it was about record keeping and accountancy. In the modern world, we are bombarded with formalized visual representations of information, from the ubiquitous opinion poll pie chart to clever and sophisticated infographics. The website http://data-art.net/resources/history_of_vis.php provides an informative and entertaining quick history of data visualization. If you want a truly breathtaking demonstration of the power of data visualization, seek out Hans Rosling’s The best stats you’ve ever seen at http://ted.com.
We will spend time getting to know some of Spotfire’s data capabilities. It’s important that you continue to think about data; how it’s structured, how it’s related, and where it comes from. Building good visualizations requires visual imagination, but it also requires data literacy.
This article is all about getting you to think about the visualization of information and empowering you to use Spotfire to do so. Apart from learning the basic features and properties of the various Spotfire visualization types, there is much more to learn about the seamless interactivity that Spotfire allows you to build in to your analyses.
We will take a close look at 7 of the 16 visualization types provided by Spotfire; these 7 are the most commonly used.
We will cover the following topics:
Now let’s have some fun!
While working through the data examples, we used the Spotfire Table visualization, but now we’re going to take a closer look. People will nearly always want to see the “underlying data”, the details behind any visualization you create. The Table visualization meets this need.
It’s very important not to confuse a table in the general data sense with the Spotfire Table visualization; the underlying data table remains immutable and complete in the background. The Table visualization is a highly manipulatable view of the underlying data table and should be treated as a visualization, not a data table.
The data used here is BaseballPlayerData.xls.
There is always more than one way to do the same thing in Spotfire, and this is particularly true for the manipulation of visualizations. Let’s start with some very quick manipulations:
These and other properties of the Table visualization are also accessed via visualization properties. As you work through the various Spotfire visualizations, you’ll notice that some types have more options than others, but there are common trends and an overall consistency in conventions.
Visualization properties can be opened in a number of ways:
It’s beyond the scope of this book to explore every property and option. The context-sensitive help provided by Spotfire is excellent and explains all the options in glorious detail.
I’d like to highlight four important properties of the Table visualization:
Color is a strong feature in Spotfire and an important visualization tool, often underestimated by report creators. It can be seen as merely a nice-to-have customization, but paying attention to color can be the difference between creating a stimulating and intuitive data visualization rather than an uninspiring and even confusing corporate report. Take some pride and care in the visual aesthetics of your analytics creations!
Let’s take a look at the color properties of the Table visualization.
We saw how the Table visualization is perfect for showing and ordering detailed information. It’s quite similar to a spreadsheet. The Bar Chart visualization is very good for visualizing categorical information, that is, where you have categories with supporting hard numbers—sales by region, for example. The region is the category, whereas the sales is the hard number or fact.
Bar charts are typically used to show a distribution. Depending on your data or your analytic requirement, the bars can be ordered by value, placed side by side, stacked on top of each other, or arranged vertically or horizontally.
There is a special case of the category and value combination and that is where you want to plot the frequencies of a set of numerical values. This type of bar chart is referred to as a histogram, and although it is number against number, it is still, in essence, a distribution plot. It is very common in fact to transform the continuous number range in such cases into a set of discrete bins or categories for the plot. For example, you could take some demographic data and plot age as the category and the number of people at that age as the value (the frequency) on a bar chart. The result, for a general population, would approach a bell-shaped curve.
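The binning step described above can be sketched in plain Python. This is only an illustration of the idea, not Spotfire functionality: the function names `bin_label` and `histogram`, the bin width of 10, and the list of ages are all assumptions made up for the example.

```python
from collections import Counter

def bin_label(value, width):
    """Return a label such as '20-29' for the bin that contains value."""
    low = (value // width) * width
    return f"{low}-{low + width - 1}"

def histogram(values, width):
    """Count how many values fall into each fixed-width bin."""
    counts = Counter(bin_label(v, width) for v in values)
    # Sort bins by their numeric lower bound so they plot in order.
    return sorted(counts.items(), key=lambda item: int(item[0].split("-")[0]))

# Made-up ages, for illustration only (non-negative integers assumed).
ages = [23, 27, 31, 35, 36, 38, 41, 44, 52, 67]

# Print a crude text bar chart: one '#' per person in each age bin.
for label, count in histogram(ages, 10):
    print(f"{label:>6} {'#' * count}")
```

Each bin becomes a category and its count becomes the bar height, which is exactly the category and value pairing a bar chart expects.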
Let’s create a bar chart using the baseball data. The data we will use is BaseballPlayerData.xls, which you can download from http://www.insidespotfire.com.
The vertical, or value, axis must be an aggregation because there is more than one home run value for each category. You must decide if you want a sum, an average, a minimum, and so on.
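To see why an aggregation is required, here is a minimal Python sketch of what happens conceptually when several rows share one category. The `aggregate` helper, the team names, and the home run values are hypothetical and exist only for this illustration; Spotfire performs the equivalent step for you when you pick an aggregation on the value axis.

```python
from collections import defaultdict
from statistics import mean

# Made-up rows of (team, home runs), for illustration only.
rows = [("Team A", 10), ("Team A", 30), ("Team B", 5), ("Team B", 15), ("Team B", 40)]

def aggregate(rows, func):
    """Group values by category and reduce each group to a single number."""
    groups = defaultdict(list)
    for category, value in rows:
        groups[category].append(value)
    return {category: func(values) for category, values in groups.items()}

# The same data gives a different bar height per team for each aggregation.
print(aggregate(rows, sum))
print(aggregate(rows, mean))
print(aggregate(rows, min))
```

In this toy data the two teams have the same average but different sums and minimums, so the choice of aggregation changes the story the chart tells.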
To copy an existing visualization, simply right-click on it and select Duplicate Visualization.
We can now compare the distribution of home run average and salary average across all the baseball teams, but there’s a better way to do this in a single visualization using color.
You will still see the distribution of home runs across the baseball teams, but now you will have a superimposed salary heat map. Texas and Cleveland appear to be getting much more bang for their buck than the NY Yankees.
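A heat map of this kind maps each bar's aggregated value to a position on a color gradient. The sketch below shows one simple way to do that in Python, assuming linear interpolation between three anchor colors at the minimum, median, and maximum. The RGB values, the function names, and the salary figures are assumptions for illustration; Spotfire's own interpolation may differ.

```python
from statistics import median

def lerp_color(c1, c2, t):
    """Linearly interpolate between two RGB tuples; t runs from 0 to 1."""
    return tuple(round(a + (b - a) * t) for a, b in zip(c1, c2))

def gradient_color(value, values, low, mid, high):
    """Map value to a color on a min -> median -> max gradient.

    value is assumed to lie between min(values) and max(values).
    """
    lo, md, hi = min(values), median(values), max(values)
    if value <= md:
        t = 0.0 if md == lo else (value - lo) / (md - lo)
        return lerp_color(low, mid, t)
    t = (value - md) / (hi - md)
    return lerp_color(mid, high, t)

# Assumed anchor colors (RGB).
BLUE = (0, 0, 255)
PALE_YELLOW = (255, 255, 204)
RED = (255, 0, 0)

# Made-up average salaries per team, in millions, for illustration only.
salaries = [1.0, 2.0, 3.0, 4.0, 9.0]
for s in salaries:
    print(s, gradient_color(s, salaries, BLUE, PALE_YELLOW, RED))
```

Anchoring the middle color at the median rather than the midpoint of the range keeps a single very high salary from pushing every other team into the same shade.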
Trellising, whereby you divide a series of visualizations into individual panels, is a useful technique when you want to subdivide your analysis. In the example we’ve been working with, we might, for instance, want to split the visualization by league.
Spotfire allows you to build layers of information with even basic visualizations such as the bar chart. In one chart, we see the home run distribution by team, salary distribution by team, and breakdown by league.
It’s time to introduce one of the most important Spotfire concepts, called marking, which is central to the interactivity that makes Spotfire such a powerful analysis tool. Marking refers to the action of selecting data in a visualization. Every element you see is selectable, or markable: for example, a single row or multiple rows in a table, or a single bar or multiple bars in a bar chart.
You need to understand two aspects to marking. First, there is the visual effect, or color(s) you see, when you mark (select) visualization elements. Second, there is the behavior that follows marking: what happens to data and the display of data when you mark something.
From Spotfire v5.5 onward, you can choose, on a visualization-by-visualization basis, two distinct visual effects for marking:
The second option is not available in versions older than v5.5 but is the default from v5.5 onward.
The setting is made in the visualization’s Appearance property by checking or unchecking the option Use separate color for marked items. The default color when using a separate color for marked items is dark green, but this can be changed by going to Edit|Document Properties|Markings|Edit. The new option has the advantage of retaining any underlying coloring you defined, but you might not like how the rest of the chart is washed out. Which approach you choose depends on what information you think is critical for your particular situation.
When you create a new analysis, a default marking is created and automatically applied to every visualization you create. You can change the color of the marking in Document Properties, which is found in the Edit menu. Just open Document Properties, click on the Markings tab, select the marking, click on the Edit button, and change the color.
You can also create as many markings as you need, giving them convenient names for reference purposes, but we’ll just focus on using one for now.
Marking behavior depends fundamentally on data relationships. The data within a single data table is intrinsically related; the data in separate data tables must be explicitly related before you configure marking behavior for visualizations based on separate datasets.
When you mark something in a visualization, five things can happen depending on the data involved and how you configured your visualizations:
Conditions | Behavior |
Two visualizations with the same underlying data table (they can be on different pages in the analysis file) and the same marking scheme applied. | Marking data on one visualization will automatically mark the same data on the other. |
Two visualizations with related underlying data tables and the same marking scheme applied. | The same as the previous condition’s behavior, but subject to differences in data granularity. For example, marking a baseball team in one visualization will mark all the team’s players in another visualization that is based on a more detailed table related by team. |
Two visualizations with the same or related data tables where one has been configured with data dependency on the marking in the other. | Nothing will display in the marking-dependent visualization other than what is marked in the reference visualization. |
Visualizations with unrelated underlying data tables. | No marking interaction will occur, and the visualizations will mark completely independently of one another. |
Two visualizations with the same underlying data table or related data tables and with different marking schemes applied. | Marking data on one visualization will not show on the other because the marking schemes are different. |
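The granularity effect described for related data tables can be modeled in a few lines of Python. This is a conceptual sketch only: the `propagate` function, the two tiny tables, and their column names are made up for the example and are not part of Spotfire.

```python
# Made-up data, for illustration only: a team-level table and a more
# detailed player-level table, related by the "team" column.
teams = [{"team": "Team A"}, {"team": "Team B"}]
players = [
    {"player": "P1", "team": "Team A"},
    {"player": "P2", "team": "Team A"},
    {"player": "P3", "team": "Team B"},
]

def propagate(marked_rows, detail_rows, key="team"):
    """Return the detail rows that become marked when rows in a related table are marked."""
    marked_keys = {row[key] for row in marked_rows}
    return [row for row in detail_rows if row[key] in marked_keys]

# Marking one team marks every player on that team.
marked = propagate([teams[0]], players)
print([row["player"] for row in marked])  # ['P1', 'P2']
```

Without a shared key column there is nothing for `propagate` to match on, which mirrors the case of unrelated data tables where no marking interaction occurs.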
Here’s how we set these behaviors:
Avoid selecting the same marking under both Marking and Limit data using markings. If you are using the limit data setting, either select no marking under Marking or create a second marking and select that instead.
You’re possibly a bit confused by now. Fortunately, marking is much harder to describe than to use! Let’s build a tangible example.
Save your analysis file at this point and at regular intervals. It’s good practice to save regularly as you build an analysis. It will save you a lot of grief if your PC fails in any way. There is no autosave function in Spotfire.
Section | Property | Value |
General | Title | Home Runs |
Data | Marking | Marking |
Data | Limit data using markings | Nothing checked |
Appearance | Orientation | Vertical bars |
Appearance | Sort bars by value | Check |
Category Axis | Columns | Team |
Value Axis | Columns | Avg(Home Runs) |
Colors | Columns | Avg(Salary) |
Colors | Color mode | Gradient; add a point for the median; Max = strong red; Median = pale yellow; Min = strong blue |
Labels | Show labels for | Marked Rows |
Labels | Types of labels | Complete bar: Check |
Section | Property | Value |
General | Title | Roster |
Data | Marking | Marking |
Data | Limit data using markings | Nothing checked |
Appearance | Orientation | Horizontal bars |
Appearance | Sort bars by value | Check |
Category Axis | Columns | Team |
Value Axis | Columns | Count(Player Name) |
Colors | Columns | Position |
Colors | Color mode | Categorical |
Section | Property | Value |
General | Title | Details |
Data | Marking | (None) |
Data | Limit data using markings | Check Marking |
Columns | Team, Player Name, Games Played, Home Runs, Salary, Position |
Now start selecting visualization elements with your mouse. You can click on elements such as bars or segments of bars, or you can click and drag a rectangular block around multiple elements.
When you select a bar on the Home Runs bar chart, the corresponding team bar is automatically selected on the Roster bar chart, and details for all the players in that team display in the Details table. When you select a bar segment on the Roster bar chart, the corresponding team bar is automatically selected on the Home Runs bar chart, and only players in the selected position for the selected team appear in the Details table.
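If it helps to think of this behavior in code, the sketch below models a marking as a shared set of row indices, and a visualization that limits its data using that marking as a filter over the same rows. The `Marking` class, the `limited_rows` function, and the sample rows are hypothetical and only illustrate the concept; they are not Spotfire's API.

```python
# Made-up rows, for illustration only.
rows = [
    {"team": "Team A", "player": "P1", "position": "Pitcher"},
    {"team": "Team A", "player": "P2", "position": "Catcher"},
    {"team": "Team B", "player": "P3", "position": "Pitcher"},
]

class Marking:
    """A named selection of row indices shared by several visualizations."""

    def __init__(self, name):
        self.name = name
        self.marked = set()

    def mark(self, rows, predicate):
        """Replace the selection with the rows that satisfy predicate."""
        self.marked = {i for i, row in enumerate(rows) if predicate(row)}

def limited_rows(rows, marking):
    """Rows shown by a visualization that limits its data using the marking."""
    return [row for i, row in enumerate(rows) if i in marking.marked]

marking = Marking("Marking")

# Selecting the Team A bar on the Home Runs chart marks all Team A rows,
# so the Details table shows every Team A player.
marking.mark(rows, lambda r: r["team"] == "Team A")
print([r["player"] for r in limited_rows(rows, marking)])  # ['P1', 'P2']

# Selecting the Pitcher segment of the Team A bar on the Roster chart
# narrows the marking, so the Details table shows only that position.
marking.mark(rows, lambda r: r["team"] == "Team A" and r["position"] == "Pitcher")
print([r["player"] for r in limited_rows(rows, marking)])  # ['P1']
```

Because both bar charts share the one `Marking` object, a selection made in either chart is visible to the other, while the Details table only ever reads from it.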
There are some very useful additional functions associated with marking, and you can access these by right-clicking on a marked item. They are Unmark, Invert, Delete, Filter To, and Filter Out. You can also unmark by left-clicking on any blank space in the visualization.
Play with this analysis file until you are comfortable with the marking concept and functionality.
This article is a small taste of the book TIBCO Spotfire: A Comprehensive Primer. You’ve seen how the Table visualization is an easy and traditional way to display detailed information in tabular form and how the Bar Chart visualization is excellent for visualizing categorical information, such as distributions.
You’ve learned how to enrich visualizations with color categorization and how to divide a visualization across a trellis grid. You’ve also been introduced to the key Spotfire concept of marking.
Apart from gaining a functional understanding of these Spotfire concepts and techniques, you should have gained some insight into the science and art of data visualization.