Archive for the ‘BI’ Category

Baltimore Can Stack ‘Em Up - Prequel

Monday, April 21st, 2008

A special thank you goes to Lou Spirito, Graphics Director at the Baltimore Sun, for providing the graphs below and some valuable insight.

Recently, I wrote a blog post about the homicide rate in Baltimore and included the second graph below, which can be found here.

Lou was gracious enough to share that the graph originally designed for the article was the second one below, a bar chart with two overlaid series. The chart was then redesigned as the line chart shown below. Prior to release, it was reverted to the original bar chart. I have said before that I prefer line charts to stacked bar charts, but this may be an exception. Here is a quote from Lou that explains part of the design choice.

“I agree with your observation that in this particular case the bars are superior to the line chart for the simple reason that the relationship of the “annual” and “first quarter” data remains unified — this relationship gets lost to some degree in the line chart.” [Lou Spirito]

Because the two series differ so much in scale, the bar chart is helpful: you can add value labels to the smaller series, which varies less from year to year than the larger one. Adding value labels to the line chart would make it unreadable.

Granted, both would work well and serve the purpose, but I really like the bar chart with the two overlaid series. Also, the use of color is clean, neat and effective without being gaudy. I might have changed the labels on the x-axis to appear every five or ten years, but otherwise it is well designed.

Content is king and sometimes that means going unconventional. See Lou’s comment below.

“We take great care to design graphics with content driving the way. Sometimes it means breaking standard convention at the risk of fielding criticism…” [Lou Spirito]

Intermediate design before reverting back to the original:

Original and Final Design:

Which one do you think is more effective and why?

Win an Apple iPhone or S. Few Book - Excel Dashboard Competition

Thursday, April 17th, 2008

During the months of April and May 2008, BonaVista Systems is running an Excel Dashboard competition. I have said before that their MicroCharts product is exceptional for dashboard design and improving presentations or Excel analytics. The new version (3) has a lot of great new features, which can be found here.

MicroCharts is the first software, to my knowledge, that has a solution for creating Sparklines in Excel without having to finagle using the old camera feature. MicroCharts is easy to use and truly a great product. For under $200, you can get this add-on for Excel. You can download MicroCharts and try it for free for 30 days.

For the last few days, I have seen many of the blogs that I read promote this competition. I have been hesitant to follow suit because of redundancy, but also believe that it would be a shame for my readers to miss out on this event.

Winners of the competition will receive:

Some other exceptional blog/web sites that you should check out are:

Baltimore Can Stack ‘Em Up!

Tuesday, April 8th, 2008

The bar chart below made the cover of the Baltimore Sun newspaper on April 7, 2008. It definitely caught my eye because of its design and my passion for data/info visualization. In the past, I have been a bit unkind to stacked bar charts, as a quick glance through the previous blog posts could quickly confirm. My first impression was that this was going to be another example of what not to do using stacked bar charts.

After taking a closer look, this turns out to be a good example of visualizing data. Sure, it’s no line graph, but this chart is very effective for one fundamental reason. Both series, Annual Homicide Total and First Quarter numbers, have the same baseline: zero. By having both series start at zero, the chart becomes a simple bar chart with two overlaid series rather than a stacked bar chart.

The values for the first quarter numbers are labeled, probably because they would otherwise be hard to read on a scale tall enough to accommodate both series. My guess would be that this chart was embellished with a graphics program like Adobe Illustrator. However, the same results could be achieved in Excel with a little finagling. The callout boxes are strategically placed to point out the significant figures like lowest, highest, starting and prior year.

Crime 23 yr Low

Source: Baltimore Sun & Baltimore Police Dept.

The Google map below accompanied the bar chart from above on the Baltimore Sun’s website. Having a vested interest in the Baltimore area, I quickly looked to see where the most homicides were and if there seemed to be a pattern. I love the interactivity of Google maps driven by a set of data!

Map

Source: Data compiled by Sun reporter Gus Sentementes using information from the City of Baltimore. Baltimoresun.com designer Stephen Mekosh produced the Google Map mashup.

For my fellow The Wire (HBO Series) enthusiasts, you can see if the West side has more homicides than the East side thanks to the Marlo and Barksdale crews. I’m sure that Pryzbylewski (a.k.a. “Prez”) and Detective Freamon would love to see these stats pinned to their board.

Capital One - The Analytics Superpower, Really?

Monday, March 24th, 2008

Prior to Support Analytics’ move to Maryland last summer, we were located in Richmond, Virginia. Everyone in Richmond and probably the country knows the credit card company Capital One (a.k.a. Cap One), but may not know that their headquarters is located in McLean, Virginia. They also have a large facility in Richmond. For the purpose of this post, I am only focusing on their credit card business.

If you have read Tom Davenport’s book, Competing on Analytics: The New Science of Winning, and Ian Ayres’ book, Super Crunchers: Why Thinking-by-Numbers Is the New Way to Be Smart, you would get the sense that Capital One is an analytics superpower. Or, to use Ian’s term, a “super cruncher”. Now, all things considered, Capital One has an enormous amount of data that they need to understand, analyze and create algorithms from to efficiently and effectively run their business. No easy task.

Here’s my question. If they are such an analytics superpower, why do I get business credit card applications when I already own one of their business credit cards? This hasn’t happened just once by accident. I get one about every two weeks. If they can’t differentiate between who is a current customer versus who is a prospective customer, then I can’t continue to consider them in the elite sector of analytics. Sorry.
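To make the point concrete, here is a minimal sketch of the kind of check I have in mind: removing current customers from a prospect mailing list before anything gets printed. This is purely my own illustration, not anything Capital One actually runs. The record fields, the sample names and the crude matching rule (normalized name plus postal code) are all assumptions; real matching would need to be far more careful.

```python
# Hypothetical records: the field names and matching rule are illustrative assumptions.

def normalize(name, postal_code):
    """Build a crude match key: lowercased name with collapsed spaces, plus postal code."""
    return (" ".join(name.lower().split()), postal_code.strip())


def suppress_existing(prospects, customers):
    """Return only the prospects that do not match any current customer."""
    customer_keys = {normalize(c["name"], c["postal_code"]) for c in customers}
    return [
        p for p in prospects
        if normalize(p["name"], p["postal_code"]) not in customer_keys
    ]


customers = [{"name": "Acme Consulting", "postal_code": "21201"}]
prospects = [
    {"name": "ACME  Consulting", "postal_code": "21201"},  # already a customer
    {"name": "Example Widgets", "postal_code": "23219"},
]

mailing_list = suppress_existing(prospects, customers)
print([p["name"] for p in mailing_list])  # ['Example Widgets']
```

Even a simple pass like this would catch the obvious repeats; the hard part in practice is presumably matching names and addresses that are spelled differently across systems.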

Think of the dollars wasted through this process. Not only is there the postage fee, but there is also the working cost to: store data, print the applications and ship them out. I won’t even get into the whole notion of saving trees or the annoyance factor.

Shift Happens!

Friday, March 21st, 2008

This video is a little dated (been out about a year), but is well worth the 6 minutes it takes to watch.

The presentation is crisp and very well designed with a great shock factor. People using PowerPoint could do well to mimic the design of this presentation and put away the fifteen bullets per slide.

My suggestion: embrace change, evolve while taking responsibility for your own development and remove the word “can’t” from your vocabulary.

Some quotes:

  • “The top 10 jobs in 2010 didn’t exist in 2004 and we are preparing students for jobs that don’t even exist in order to solve problems we don’t even know are problems yet.”
  • “If MySpace subscribers were a country, it would be the 11th largest in the world.”
  • “Who answered all of our questions before Google (B.G.)?”
  • “By 2049 a $1,000 computer will exceed the computational capabilities of the human race.”
  • “About 1.5 exabytes of unique information will be generated this year, which is more than in the previous 5,000 years.”

Here is a table that defines byte sizes:

Bytes

(Source: Wikipedia)

Microsoft: Office 14 and Windows 7

Monday, March 17th, 2008

Microsoft Office

It’s been just over a year since the release of Microsoft Office™ 2007 (code named Office 12). The next version, code named Office 14, is rumored to be released in the first half of 2009. With only two years between the releases, how much can really change? Office 2007 had quite a few changes, especially in the look and feel of the user interface. The beta release is expected soon (first half of 2008).

According to Paul Thurrott’s site, Microsoft is estimating a 20% increase in dollars spent on research and development compared to Office 2007. I doubt they will increase the price from $400 to $480, which would be a 20% increase. In fact, I think you will find a less than stellar revenue stream from the next release. We all know the corporate world drives the sales for Office. In my opinion, without the vast corporate entrenchment, other free products would prevail.

What I found interesting is that they are going to try to bring online access to some of the Office products, like Excel, in the Office 14 version. This enhancement would seem logical when compared to a competitor like Google Spreadsheets, which already provides this feature. They need this enhancement just to stay competitive. It will be interesting to see if they enhance any of the charting tools that Excel has consistently lacked.

Some free alternatives to Microsoft are: Google Docs & Spreadsheets, Zoho and Open Office.

My personal opinion is that Microsoft should spend a little more time on research and development and deliver a superior product instead of pushing out a new version every few years. Bugs like in-cell charting should never have made it to market. Here at Support Analytics, we use the Office 2003 suite every day (mostly Excel, Word, Project, Access, PowerPoint and Outlook).

Windows™

Windows™ 7, the next version of Microsoft’s operating system (codename: Vienna), has been rumored to be out sometime between 2009 and 2011. At first, it was leaked that Windows™ 7 could be out as early as next year. The next rumor was, don’t hold your breath until sometime in the 2011 vicinity. It’s no secret that Windows™ Vista was the second coming of the disastrous Windows™ ME, which I owned a while back. We all know that Windows™ XP will be around until the next operating system is released. I am no OS expert, but I don’t see companies throwing XP by the wayside and loading up Vista.

What’s pretty entertaining is that there is a downgrade (a.k.a. revert to an older version) to go from Vista to XP.

I have never owned a Mac in the many years I have been a computer user, dating back to the ’80s when my parents bought a Commodore 64. Ah, 64 KB of memory and the old dot-matrix printer, those were the days.

I think we are starting to see Apple really capitalize on Vista’s low adoption, which brings me to my dilemma. Is now the time to go with Mac OS? If it wasn’t for all of the software I own being PC only, I just might. Maybe a new MacBook Air laptop would suppress my appetite…

Commodore 64

Horizontal Stack!

Monday, March 3rd, 2008

For some reason, I can’t seem to pass up the opportunity to show why stacked bar charts are usually a poor option when comparing more than one series over time.

Below is another example, taken from the last BusinessWeek edition of 2007.

If you are going to show similar data and are heart-set on a horizontal stacked bar chart, here is one tip I would recommend using the example below:

Don’t fill the space between the amount invested series (aqua) and the percent return value (on the right) with black. Filling that space in black gives the illusion that there is a third series and that the total of all three amounts is exactly the same from ‘03 to ‘07.

vcreturn

I challenge you, my devoted readers, to submit an example where you think the best chart design is a stacked bar chart.

As a bonus, here is another great example where sparklines can be used to show multiple data series over time without taking up much “real-estate”. The amount of space taken up could have been a lot smaller, but you get the point; start adding sparklines to your toolkit!

vcinvest
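If you want to play with the idea outside of Excel, here is a tiny text-only sketch of a sparkline. It is my own illustration and has nothing to do with how MicroCharts or BusinessWeek produce theirs: it simply maps each value to one of eight Unicode block characters, scaled between the series minimum and maximum. The function name and the sample numbers are made up.

```python
# Eight Unicode block characters, from lowest to tallest.
BARS = "▁▂▃▄▅▆▇█"


def sparkline(values):
    """Render a sequence of numbers as a one-line text sparkline."""
    if not values:
        return ""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A flat series has no range to scale, so draw the lowest bar throughout.
        return BARS[0] * len(values)
    scale = (len(BARS) - 1) / (hi - lo)
    return "".join(BARS[round((v - lo) * scale)] for v in values)


# Made-up sample values, for illustration only.
print(sparkline([1, 2, 3, 4, 5, 6, 7, 8]))  # ▁▂▃▄▅▆▇█
```

Each series takes up a single line of text, which is the whole appeal: several of them stacked in a table show the shape of each trend at a glance.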

Some things just do not stack well

Monday, February 11th, 2008

The old stack ‘em up trick…

losing ground

This graph renders beautifully, from the shortened x-axis labels to the lack of chartjunk in the form of gridlines and excessive tick marks. The colors definitely make the chart stand out. I don’t think many would argue that point.

If you take a look at an earlier post, I featured another visualization taken from the same page of BusinessWeek’s January 28th issue. As you will see, the color scheme for the whole page was green and yellow.

This graph shows two points very well!

  1. The first point is the trend of “The Detroit Three” shown in green below. I can quickly and accurately see the initial jump, plateau and then gradual decline.
  2. The second point that is definitive is the total for both the Foreign series and The Detroit Three series. The total sharply rises until 2000 and then starts a gradual decline over the next seven years.

What is almost impossible to determine is the change in Foreign sales from year to year and over the entire sixteen years. Intuitively, I can see that after about year 2000 the yellow portion is larger because the total is about the same while the green section gets smaller. Simply put, the reason a stacked bar chart is a poor choice when comparing more than one series over time is due to the baseline not being the same for the second (yellow) series.
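The baseline problem is easy to see with a few numbers. The sketch below is my own illustration with made-up values (not the BusinessWeek data): it computes where each bar segment starts and ends when a second series is stacked on the first, versus when it is drawn from zero.

```python
def stacked_segments(bottom_series, top_series):
    """(start, end) of each top-series segment when stacked on the bottom series."""
    return [(b, b + t) for b, t in zip(bottom_series, top_series)]


def overlaid_segments(series):
    """(start, end) of each segment when the series is drawn from a zero baseline."""
    return [(0, v) for v in series]


# Made-up values, for illustration only.
detroit = [10, 12, 9]   # bottom (green) series
foreign = [4, 4, 6]     # top (yellow) series

stacked = stacked_segments(detroit, foreign)
print(stacked)                     # [(10, 14), (12, 16), (9, 15)]
print(overlaid_segments(foreign))  # [(0, 4), (0, 4), (0, 6)]
```

In the stacked version the top series is flat for the first two periods, yet its segment shifts from 10–14 up to 12–16, so the eye has to subtract a moving baseline to recover each value. Drawn from zero, the same values read directly off the axis.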

For this example, the point isn’t what the total sales were over sixteen years. The point is how one compares to the other. I know this from the title of the page: Detroit is still behind, despite hard-won gains.

Point: When showing more than one series over time (time being the key here) the most logical choice should be a line graph. I’ve seen some horrid stacked bar charts with many more segments. On a rare occasion, I may use a stacked bar chart only when time or trending is not a factor.

ESPN Data Visualization

Tuesday, February 5th, 2008

Below you will find some of the best data visualizations on the web via ESPN’s website. On the bottom right of ESPN’s homepage, there is a poll of the day, usually involving breaking news or a hot sports topic.

Below is one example of the daily poll.

Poll

Once you click the vote button, the results are posted in the form of a horizontal bar chart with values listed at the end (shown below). Also, they show the total number of votes cast so you can determine how statistically significant the results are. Finally, you can click on the View Map button to see the results displayed on a map.

Results

Below is one example where there were only two options. I imagine the voting system uses the location of the IP address to determine where you are voting from. The states in blue voted ‘Yes’ and the ones in red voted ‘No’. The states shown in gray split the votes and didn’t fall one way or the other.

Perfect Season

The image below has an example of another poll where Tiger Woods ran away with the votes. I point out this example because if you look at Montana, you will see that the state is blue. However, when you look closer (it’s hard to see here) you can see that there was only one vote cast. Not statistically significant if you ask me. Also, notice that Alaska is in black due to no votes being cast.

1-29-08 Poll
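To put a rough number on why one vote tells you nothing, here is a small sketch using the standard worst-case margin of error for a proportion at roughly 95% confidence, 1.96 × √(0.25 / n). This is my own back-of-the-envelope illustration, not anything ESPN publishes, and because online poll voters are self-selected rather than randomly sampled, the result is only a loose indicator.

```python
import math


def worst_case_margin(votes):
    """Approximate 95% margin of error for a proportion, assuming a 50/50 split.

    Assumes a simple random sample, which an online poll is not.
    """
    if votes <= 0:
        raise ValueError("need at least one vote")
    return 1.96 * 0.5 / math.sqrt(votes)


for n in (1, 100, 10000):
    print(n, round(worst_case_margin(n), 4))
```

With a single vote the margin is about ±98 percentage points, which is to say the state’s color means nothing; it takes on the order of ten thousand votes to get down to about one point.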

Below you will see yet another example. The international results appear in the globe on the right, showing how the votes tally outside of the United States.

ESPN Map 1

Finally, the last example shows that most of the U.S. thought that the Dallas Cowboys were going to the Super Bowl…

ESPN Map

I like these data visualizations because they are interactive, intuitive and valuable. Also, ESPN uses bar charts instead of pie charts, which always helps with the graphs. I like to see how my home state votes on different topics and often find it interesting how biased states are toward local sports figures or teams. Shown above, you can see that 81% of voters in Texas thought the Cowboys were going to the Super Bowl compared to 60% of all voters.

Spending InfoVis!

Tuesday, January 15th, 2008

In a recent issue of BusinessWeek, I found the picture shown below. A simple pie chart would show this data, but this adds a certain flair that caught my eye. Maybe it only caught my eye because I live and breathe data analysis and visualization… Regardless, it’s similar to a horizontal stacked bar chart, but with more pizzazz.

I have stated before that I don’t like stacked bar charts, and I don’t. Whenever you compare more than one component over time, it becomes very ineffective. Once you get beyond the first series of data, the baseline is not the same, making a comparison difficult. However, when you only compare a few pieces of data without time on an axis, a horizontal stacked bar chart can be effective. You can visit an earlier post to get more information on stacked bar charts.

In the graphic below, there are some things I might have done differently. For example, the last 3% is a little hard to distinguish because of how small it is compared to the rest. The first and last colors are very hard to differentiate in the print version. In this picture, the colors are more defined, making the comparison easier.

I share this with you to promote more abstract thinking when it comes to presenting data without losing effectiveness. I can tell you this: if I were an executive and someone brought me this visualization instead of a pie chart, I would be impressed!

Visualize this: you’re watching a presentation and the slides are gliding by with every imaginable abuse of PowerPoint: fifteen bullets per slide, clip-art images, data-packed charts that aren’t even visible, goofy transitions, then along comes this slide. The only thing on the slide is this picture below. All of a sudden ears perk up and slouching turns to posture only a second grade teacher could be proud of. Dare to be different, yet effective!

03mac7

Without regard to any minute details or scale, I replicated this visualization using Excel, which is shown below. I literally spent about seven minutes creating this in Excel. Granted it looks a little better when the image isn’t modified to fit this blog, but you get the idea. In a future post, I may show how this is done in a screencast, which I can guarantee won’t take longer than a minute.

DSA Dollar

Stay tuned!