Category Archives: Data Visualization

Data Visualization Ideas and Unclear Graphs

In my last post on data visualization, I had a couple of tools recommended to me to try out. One had a limited trial period that I didn’t take advantage of in time, and the other, a tool by VisualizeFree, was too buggy to work. I uploaded my data easily enough (I still had to clean it up first, as with some of the other tools), but then I couldn’t view the actual visualization. Lame. I could email support, but I’m too lazy to do that again.

I think I got more out of reading this article on what is (or should be) the point of data visualization. It’s based on a talk given by Manuel Lima of VisualComplexity.com, who curates that collection of data visualization examples and resources. The main point from Lima was this:

“We need to make a transition from tools of curiosity to tools of functionality.”

Which is true: there are many tools that provide interactivity but not much substance. On the other hand, tools built only for fun or aesthetics can also drive innovation. I’m torn. It seemed like a lot of people were disagreeing over separating “information art” from data visualization, but this article states it more eloquently than I can. Manuel Lima also listed some key principles for data visualization that I did not know and should probably keep in mind:

  • form follows function
  • start with a question
  • interactivity is key
  • cite your source
  • the power of narrative

I’ve become much more judgmental toward charts and graphs thanks to blogs like Junk Charts, Flowing Data, and Simple Complexity.  So when I tried out one of the many Twitter measurement tools, Graph Edge, and received my first report, I was confused by the charts.

Followers and "Legitimate" Followers

This graph shows my Twitter followers and “legitimate” followers (they have a definition for it). Because the y-axis starts at my number of legitimate followers instead of at 0, it gives the false impression that I have very few legitimate followers. Why not start at 0?
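
To put a rough number on that distortion, here is a minimal Python sketch. The follower counts are made up for illustration (they are not my actual Graph Edge numbers), and `visual_ratio` is just a hypothetical helper: it compares how tall two values look once the axis starts at a given baseline.

```python
def visual_ratio(smaller, larger, baseline=0.0):
    """Return how tall `smaller` appears relative to `larger`
    when the y-axis starts at `baseline` instead of 0."""
    if baseline >= larger:
        raise ValueError("baseline must be below the larger value")
    # Plotted height is the distance from the axis floor, not the raw value.
    return (smaller - baseline) / (larger - baseline)


# Hypothetical counts, for illustration only.
followers = 500
legit_followers = 400

honest = visual_ratio(legit_followers, followers, baseline=0)
truncated = visual_ratio(legit_followers, followers, baseline=390)

print(f"axis from 0:   legit looks {honest:.0%} as tall as total")
print(f"axis from 390: legit looks {truncated:.0%} as tall as total")
```

With these made-up numbers, the legitimate count looks 80% as tall as the total on a zero-based axis, but only about 9% as tall when the axis starts at 390.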

Net Twitter follower change

This line graph shows the follows, unfollows, and net follower change over time for my Twitter account. But I thought it was strange to include negative numbers, because it looked like I had negative 1 unfollows. Anyway, there’s room for improvement here.
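
For what it’s worth, here is how I would expect the three series to relate, as a small Python sketch. The daily counts are invented and `net_change` is a hypothetical helper, not anything from Graph Edge: follows and unfollows stay as non-negative counts, and only the net line is allowed to dip below zero.

```python
def net_change(follows, unfollows):
    """Net follower change per period: follows minus unfollows."""
    if len(follows) != len(unfollows):
        raise ValueError("series must be the same length")
    if any(n < 0 for n in follows + unfollows):
        raise ValueError("follow and unfollow counts cannot be negative")
    return [f - u for f, u in zip(follows, unfollows)]


# Invented daily counts, for illustration only.
follows = [3, 0, 2, 1]
unfollows = [1, 1, 0, 4]

net = net_change(follows, unfollows)
print(net)
```

Plotted this way, a negative value can only ever show up on the net line, which is the one place it makes sense.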

Misc.

Today I created a Twitter list for #measure and web analytics people (like @ABTests did) on TLists. I think the value is that it’s curated, so you can see recommended people who actually tweet about web analytics on a regular basis, and you can follow many people at once and discover new ones. It may not be worthwhile once Twitter implements lists, though. And I successfully integrated my GWO data with GA thanks to this post from the GWO Tricks blog!

The Latest Data Visualization Toys

Lately I’ve been seeing a ton of new data visualization tools/experiments that I thought were worth sharing. One is not really applicable to web analytics or reporting, but others are.

Trendly

I saw a post about Trendly on the Google Analytics Blog. It’s an application that takes GA data and displays it in a way that is supposed to help you understand the data better. From what I can tell, they use statistical significance (upper and lower control limits) to show a trend line that is less jagged and crazy. I think that is helpful for noting the big picture of change over a longer time span, but less so if you want to do a deep dive. Which might be beside the point. And then their other graph shows time vertically (I’ve always had issues with that metaphor), and the thickness corresponds to the percentage of traffic from each source. It’s easier to show than tell, so here it is:

Trendly time graph
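
I don’t know exactly how Trendly does its math, so treat this as a rough Python sketch of the general control-limit idea rather than their method: compute a center line and upper/lower limits from past data (here the mean plus or minus 3 standard deviations, a common convention that I’m assuming), then flag only the points that fall outside those limits. The visit numbers and the `control_limits` and `out_of_control` helper names are made up for illustration.

```python
from statistics import mean, pstdev


def control_limits(values, sigmas=3.0):
    """Return (lower, center, upper) limits for a baseline series."""
    center = mean(values)
    spread = pstdev(values)  # population standard deviation of the baseline
    return center - sigmas * spread, center, center + sigmas * spread


def out_of_control(values, lower, upper):
    """Return (index, value) pairs that fall outside the control limits."""
    return [(i, v) for i, v in enumerate(values) if v < lower or v > upper]


# Invented daily visits, for illustration only.
baseline = [100, 104, 98, 101, 97, 103, 99, 102]
lower, center, upper = control_limits(baseline)

new_days = [101, 96, 130, 100]
flagged = out_of_control(new_days, lower, upper)

print(f"limits: {lower:.1f} to {upper:.1f} around {center:.1f}")
print("days worth a closer look:", flagged)
```

Everything inside the limits gets treated as ordinary noise, which is why a trend line built this way looks calmer than the raw daily numbers.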

Darwin’s Origin of Species

This is the English major in me geeking out, but I think this changing textual visualization by Ben Fry (go to the site to see it in action) is a great way to show how different editions of an important book like this can change. I’d like to see it for Lyrical Ballads. This is from the information aesthetics blog again, but I think it’s cool.

Darwin visualization

California Stimulus Map

I should probably find some more data visualization blogs, but I like the information aesthetics finds so much. This interactive map by Stamen Design shows where stimulus money is being spent in California, and what kinds of projects it goes to. The subject matter is worthwhile, but mainly I liked the vertical graph on the right side. I find that visualization easier to duplicate (in Excel!) and translate to different data sets than the one in Trendly. I will try to sneak it into a report at some point, and I’m sure I will get a WTF reaction, but it would be more interesting than the standard Excel graphs for me and the reader.

CA stimulus map

Otherwise I am having fun with my new G1 phone, new running shoes, and I’m going to Arizona this weekend–good times! Hoping to get some hiking in if it is not over 100° outside.

Results of the Poll: 2-way tie and vote tampering

Thanks to those who participated in the poll! I took a 2-week hiatus from writing in the blog because I kind of just got lazy. So there is a tie between writing a post about optimization and one about data visualization. When my sister found out that her puppy was on the ballot, the puppy faction mysteriously got a lot of votes within 1 hour. She pretty much openly admitted that she voted a few times for Bentley, but on the off chance that someone else voted for Bentley, I’ll dedicate a picture or 2 to him. First up is a comparison of 2 data visualization tools: Tableau and Swivel.

Data Visualization Tools: Swivel and Tableau

I decided to try Swivel mainly because I could access it without having to download anything, it’s free, and it seemed like it had an easy-to-use UI. Those are also the main benefits of using this tool. Overall, I didn’t think it was a huge step above using Excel. Because it was in a web environment, I guess I was expecting more use of Flash or interesting visualizations instead of standard charts and graphs. Basically you just upload the data, “clean it up” so that Swivel can work with it, and then choose colors and font/pixel sizes. I tried to take out some of the words in my chart so that I could use a readable font size and not have overlap, but now they are overlapping again. They did have a lot of options for embedding and sharing, so that was a plus. Overall it was a “meh” experience. Here’s the final chart that’s supposed to be of Browsers and OS:

Tableau is a software application you can download (I’m using a 2-week free trial), and it does cost money. For personal use it’s a bit pricey, but maybe if all you did for fun was manipulate data into different graphs and charts it would be worth it. I thought they had some interesting visualization options, like heatmaps and text graphs with corresponding color and size variations. For me there was a steep learning curve, and a lot of options for customization that I wasn’t sure would be useful to me. It’s a fairly simple drag-and-drop action to set up the axes, but I found myself performing actions by clicking on things and not really being sure of what I had just done to change the graph. It would have been nice to have a history of actions for people like me who just click on stuff. Still, I wasn’t blown away by the graphical representations in Tableau either. I guess it’s hard to be impressed by these visualizations when I read about innovations like this.

Tableau Heatmap

Torn Between Topics

Forgive me for the continued alliterative titles.  So I have a bunch of things I’d like to write about at the moment but not much overlap.  I’ve had Tableau recommended as a tool to use for data visualization, Avinash posted on trying out other web analytics tools on your own blog, and I just want to try out optimization for fun. My sister’s puppy is adorable, but not totally related to most topics on this blog.

For those of you kind enough to read and/or participate, what would be more interesting:

Which Metrics Matter Most?

Yay for alliterative titles. As I watch the new season of the Real Housewives of Atlanta, I also have to think about a web analytics reporting challenge.

Imagine you have a large, complex website with data pulls for monthly scorecards/dashboards, but the stakeholders viewing the scorecard have diverse, possibly mutually exclusive interests. For instance, one stakeholder might want to look at visitors to one section of the site, and another only wants to see the number of downloads for a different section. And you have to create one scorecard (for some reason I was just reminded of LOTR “one ring to rule them all”) to make everyone moderately happy.

Do I argue for creating more specialized scorecards or try to make a one-size-fits-all report? If I do choose the 2nd option, do I try to include basic traffic reports along with a few more specialized metrics? I could also make the executive decision that a metric like average time on site will not lead to actionable insights, and replace it with one of the stakeholders’ ideal metrics. Probably no one will be 100% happy, but we’ll see how it goes. I’m open to suggestions. :)

Here’s the latest cool data visualization found on information aesthetics: a tool that aggregates media and overlays it on a map of New Orleans. Pretty cool, but kind of confusing. I’m also looking for data visualization tools (preferably free and accessible online) to try out with my own data sources. Any recommendations?

Citymurmur New Orleans

Seattle finally cooled down, which is nice because I have a gross bruise on my leg from soccer that no one should be able to see.  Also I’m excited for Web Analytics Wednesday this week!