Archive for the ‘Predictive Analytics’ Category

Data Analysis – Do You Really Mean Average?

Thursday, December 17th, 2009

In the corporate world I see this issue quite frequently.  Specifically, I will hear a request where the wording doesn’t align with what the requestor is ultimately looking for.  To illustrate, I have included an example below that shows ten different customers within a territory.  For each customer the total revenue year-to-date is listed.  To make the illustration relevant, I listed Customer 5 with revenue that is far higher than the rest.

Now here’s the question I typically hear:

"What is the average customer size (revenue) for Territory A?"

Here is what that really means most of the time:

"What is a typical customer size (revenue) for Territory A?"

You may think it’s semantics, but it’s really not.  I don’t want to turn this into a statistics lesson, but average (mean) doesn’t always translate into typical.  Because Customer 5 is such an outlier, the average (sum of all customer revenue divided by count of customers) will be higher than if that customer fell into the typical range like the rest.

I have included the median revenue amount for the ten customers, which I think is probably a better indicator of the typical customer (in general) than the mean or average.  The median is simply the number in the middle when the values are sorted (with an even count such as ten, it is the midpoint of the two middle values).  In reality, Customer 5’s revenue could be 875 zillion dollars and the median amount wouldn’t change.  When there are thousands of records and you need to know what the typical amount is, it’s often safer to choose the median unless you want to take the time to calculate the min, max, median, standard deviation and mean and compare them.
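To make the difference concrete, here is a minimal Python sketch.  The revenue figures are made up for illustration (they are not the values from the example table), with Customer 5 set far above the rest; the summarize helper is a hypothetical name.

```python
from statistics import mean, median

# Hypothetical year-to-date revenue per customer (made-up figures,
# not the values from the post's table). Customer 5 is the outlier.
revenue = {
    "Customer 1": 52_000,
    "Customer 2": 48_000,
    "Customer 3": 61_000,
    "Customer 4": 55_000,
    "Customer 5": 800_000,
    "Customer 6": 59_000,
    "Customer 7": 57_000,
    "Customer 8": 50_000,
    "Customer 9": 63_000,
    "Customer 10": 56_000,
}


def summarize(amounts):
    """Return (mean, median) for a collection of revenue amounts."""
    values = list(amounts)
    return mean(values), median(values)


average, typical = summarize(revenue.values())
print(f"Mean:   {average:,.0f}")
print(f"Median: {typical:,.0f}")

# Inflate the outlier: the mean moves, the median stays where it was.
inflated = {**revenue, "Customer 5": 800_000_000}
inflated_average, inflated_typical = summarize(inflated.values())
print(f"Mean with larger outlier:   {inflated_average:,.0f}")
print(f"Median with larger outlier: {inflated_typical:,.0f}")
```

With these made-up numbers, the single large customer pulls the mean well above every other customer's revenue, while the median sits in the middle of the pack no matter how large Customer 5 gets.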

"In probability theory and statistics, a median is described as the numeric value separating the higher half of a sample, a population, or a probability distribution, from the lower half." [source]

Now the real question that would need to be answered is, can a typical territory have one very large customer or is this a unique situation and should not be considered normal?  Answering the preceding question will make all the difference in what calculation to use.  Most often I will include both.

Median vs. Average Example

It’s my belief that most people are simply familiar with the term average because it’s so commonly used.  The underlying reason that average is more prevalent in analysis is probably that it’s very easy to calculate.  Before spreadsheet software automated the median calculation, it was much more difficult to get a median amount, even with a calculator.

As a data analyst, it’s prudent to know the difference between mean and median and when each is applicable.  Telling the CEO/CFO that the typical customer is roughly $131,000 when one customer is atypical and the true amount is more like $57,000 can be a career changer.

U.S. Infrastructure Dashboard

Wednesday, February 25th, 2009

Smart bridge technology – it’s unfortunate this wasn’t in place before the 2007 bridge collapse in Minneapolis.  When you think of data analysis, information and data visualization, and dashboards, it doesn’t have to be limited to business and corporations.  With nearly $500 billion potentially allocated to infrastructure in President Obama’s stimulus bill, the Department of Transportation and technology companies could be teaming up to bring us "smart" bridges that will alert us to potentially hazardous conditions.

Smart Bridge

[source]

Some of the data provided by the sensors in the bridge (per the article) that could be presented in a dashboard-type display:

  • Concrete temperature
  • Vibrations
  • Corrosion
  • Ice buildup
  • Traffic monitoring

Actionable analytics is what helps set corporations ahead of the pack.  There’s no reason we can’t bridge (pun intended) the gap to bring better technology and analytics to infrastructure and other non-business sectors.

One of the enhancements to current bridges is a sensor that activates sprinkler heads in the pavement that would spread an anti-icing (potassium acetate) solution to prevent the bridge surface from forming ice.  This technology could make the signs reading "bridges freeze first" obsolete.

Fun fact: $500 Billion is more than the amount spent to build the entire Interstate Highway System in today’s dollars.

Predictive Analytics Example

Wednesday, October 22nd, 2008

As a follow up to some recent posts, this one will hopefully give you a better idea of predictive analytics in action. 

Recurring Orders

Just recently, I received an order from a dietary supplement supplier for multi-vitamins.  The ones that I buy, made by AST Sports Science, are much better than your typical One-A-Day vitamins and are geared towards athletes.  I’ve been buying them for about ten years now through a few different suppliers.

MultiPro_32X

Enough background, here is where the predictive analytics comes into play.  I take one vitamin a day and typically buy two bottles at a time.  There are exactly 100 vitamins in each bottle.  Now, I am disciplined and take one every day.  It doesn’t take a statistician to tell you that in roughly 200 days, I am going to need more.  Let’s assume that I miss 10% (20 days) due to forgetfulness.  Let’s also assume that I want my next shipment before I actually run out so there are no missed days (10 days). 

Days supply lasts             +200
Missed days                    +20
Days for early notification    -10
Reminder in                    210 days
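The arithmetic above can be sketched in a few lines of Python.  This is only an illustration under the post's assumptions (one vitamin a day, 10% missed days, a 10-day early notice); the function name, parameter names and the delivery date are hypothetical.

```python
from datetime import date, timedelta


def reminder_offset_days(bottles, pills_per_bottle, pills_per_day=1,
                         missed_percent=10, lead_days=10):
    """Days after delivery at which to send a reorder reminder."""
    # How long the supply lasts if no day is ever missed.
    supply_days = bottles * pills_per_bottle // pills_per_day
    # Missed days stretch the supply, so they are added.
    missed_days = supply_days * missed_percent // 100
    # Remind early so the next shipment arrives before the bottles run out.
    return supply_days + missed_days - lead_days


offset = reminder_offset_days(bottles=2, pills_per_bottle=100)

# Hypothetical delivery date, used only to show the date arithmetic.
delivered_on = date(2008, 10, 22)
remind_on = delivered_on + timedelta(days=offset)
print(f"Send reminder {offset} days after delivery, on {remind_on}")
```

A real system would presumably estimate the missed-day rate from each customer's actual reorder history instead of assuming a flat 10%.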

Predictive Analytics at Work

The supplier should be able to predict, with confidence, that I will need another shipment of two bottles in roughly 210 days.  My supplier, whom I’ve been using for many years now, should set up an alert on my account.  That alert could simply send me a reminder e-mail in 210 days about a reorder.  Or, the supplier’s system could set up the same alert and then gear offers or discounts toward what I’ve bought historically.  Again, this is not rocket science; the key, or value added, is to make it as easy as possible for the customer (me) to make repeat purchases without overdoing it or becoming annoying.  By doing these simple things, the company can probably reduce customer turnover and increase sales.  Heck, the company can pool customer data and determine which product is most often bought in orders that include vitamins.  It really is scary how powerful predictive analytics can be when utilized effectively.
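As a rough illustration of pooling order data, the sketch below counts which products most often appear in the same order as a given product.  The orders and product names are invented, and bought_with is a hypothetical helper; a real supplier would pull this from its order history.

```python
from collections import Counter

# Invented sample orders; each order is the set of products it contained.
orders = [
    {"multivitamin", "protein powder"},
    {"multivitamin", "protein powder", "shaker bottle"},
    {"multivitamin", "fish oil"},
    {"protein powder", "creatine"},
    {"multivitamin", "protein powder"},
]


def bought_with(product, orders):
    """Count how often each other product shares an order with `product`."""
    counts = Counter()
    for order in orders:
        if product in order:
            # Tally everything else in the same order.
            counts.update(order - {product})
    return counts


companions = bought_with("multivitamin", orders)
top_product, top_count = companions.most_common(1)[0]
print(f"Most often bought with multivitamins: {top_product} ({top_count} orders)")
```

Simple co-occurrence counts like this are only a starting point, but they are enough to decide which offer to attach to a reorder reminder.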

Want to see some basic and advanced predictive analytics in action?  Just check out Amazon:

  • What do customers buy after viewing this item
  • Latest from authors you have previously purchased
  • Personalized Recommendations
  • Books frequently bought together
  • One-click ordering

Start today and think about how you can add value and improve your customers’ experience.  If you start thinking and talking about predictive analytics, I guarantee you will impress some people and, most importantly, your boss.

Have any good examples or stories to share?  Post a comment or send them to me.

Related Posts:

Predictive Analytics – Hiring

Analytics for Everyone

Analytics for Everyone

Tuesday, October 14th, 2008

There was a great article in DM Review in September titled Analytics for Everyone by Swayne Hill.  What I took from the article is the idea that end users need more information and less data, especially historical data.  Being a true analyst, I want to have all the data to look for patterns, trends, seasonality and stories.  However, it’s not data that the end users need.  They need results and enabling information.  IT skills shouldn’t be needed to make sense of data, as they typically are today.  An excellent example of this appears in the article and is quoted below.

“This point became clear to me recently when I visited my new wireless carrier’s local retail outlet. I was about to take a trip abroad and needed to upgrade to a new handset that would work overseas as well as at home. As the salesperson handling the changeover entered the data for my new account, I noticed two small rows of dots at the bottom of the screen labeled “churn” and “revenue,” and I immediately realized that this was a perfect example of the value of analytics embedded into the middle of a business workflow”. (DM Review)

Instead of the retail salesperson having to look at all of the customer’s data to understand how valuable the customer is to the company on the spot, it is presented immediately.  Granted it probably should have been less visible to the customer, but that’s irrelevant.  It’s not that difficult to program logic to determine the lifetime value of a customer.  It’s a little bit more difficult to define how likely the customer is to stay with the company.  Regardless, this type of predictive analytics is exactly what businesses need today.
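The article does not say how the carrier calculated its scores, so the following is only a sketch of one common simplification: lifetime value as monthly margin divided by monthly churn rate (the expected customer lifetime in months is roughly one over the churn rate).  The function name and the sample figures are assumptions for illustration.

```python
def lifetime_value(monthly_revenue, gross_margin, monthly_churn):
    """Simplified lifetime value: monthly margin times expected lifetime.

    Assumes a constant churn rate, so the expected lifetime in months
    is approximately 1 / monthly_churn. No discounting is applied.
    """
    if not 0 < monthly_churn <= 1:
        raise ValueError("monthly_churn must be in the range (0, 1]")
    monthly_margin = monthly_revenue * gross_margin
    return monthly_margin / monthly_churn


# Hypothetical wireless customer: $80 a month, 50% margin, 2% monthly churn.
value = lifetime_value(monthly_revenue=80, gross_margin=0.5, monthly_churn=0.02)
print(f"Estimated lifetime value: ${value:,.0f}")
```

The harder part, as noted above, is estimating the churn rate for an individual customer; this sketch simply takes it as an input.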

In an environment where costs have been cut, cut more and cut to shreds, there needs to be more emphasis on looking ahead and predicting outcomes and value.  Knowing what a customer did yesterday, last month or last year will only tell you just that.  If that historical data is used with specific logic, then it becomes very valuable, as in the example Swayne gave above.

Predictive analytics

Stay tuned for more posts on predictive analytics, including examples.