Monday, 31 October 2011

Herman Cain's 9-9-9 Plan

The name of Herman Cain came up a few times in my reading materials recently. A former business executive, he is in the news as the leading Republican presidential candidate in the primaries, and for his 9-9-9 Tax Plan.

Kaiser of Junk Charts has made a couple of attempts to visualize the 9-9-9 Tax Plan and its possible impact on people in different cash income quintiles. One can read about his first attempt here. In all fairness, the chart appears somewhat mangled and hence lacks the clarity and utility needed for the purpose.

At first glance, the lines looked far too thick and the labels were wrongly positioned. But the most critical omission is the 6.2% of the lowest-quintile population who remain undisturbed by this tax shake-up.

The data table is available here. I decided to keep the vertical and horizontal axes similar. The horizontal axis labels were moved below the chart, and the major gridlines were formatted to create quintile panels every 20 percentage points. The chart lines, which overlapped each other in the original chart, were plotted within their respective quintile panels for legibility. Finally, labels were added to state the increase or decrease in tax for the percentage of people in each quintile.

The new chart clearly answers the question "How Tax hike will affect different income groups": the higher the income, the greater the tax relief for a larger number of people.


Friday, 28 October 2011

Using Box Plots Differently


For those of us into performance consulting, the box plot (know more about it here) is a wonderfully simple and efficient tool to help us go about our job. As every coin has two faces, every data-set has two aspects. One is the "feel-good" part, which tells us about the successes and the top performers. The other is the "ouch" bit, which gets us to notice the outliers.


Usually, the Box-and-Whisker diagram is used to summarize data by its smallest and largest values, the median, and the lower and upper quartiles. But its shape and style can be applied to depict other sets of data as well. Here, I've used the B&W template to depict the monthly temperatures of my home city, Calcutta.
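For readers who want the conventional version in numbers, the five values a standard Box-and-Whisker diagram draws can be computed as in the minimal sketch below. The function name and the sample values are made up for illustration, and the "inclusive" quartile method is only one of several conventions a charting tool might use.

```python
from statistics import quantiles


def five_number_summary(values):
    """Return (minimum, lower quartile, median, upper quartile, maximum)."""
    ordered = sorted(values)
    if len(ordered) < 2:
        raise ValueError("need at least two values")
    # n=4 splits the data into quarters and returns the three cut points.
    q1, median, q3 = quantiles(ordered, n=4, method="inclusive")
    return ordered[0], q1, median, q3, ordered[-1]


# Made-up illustrative values, not data from any chart in this post.
sample = [1, 2, 3, 4, 5, 6, 7, 8, 9]
summary = five_number_summary(sample)
print(summary)
```

The whiskers run from the minimum to the maximum, the box spans the two quartiles, and the line inside the box marks the median.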


The necessary data was gathered from here and there. It included the record monthly highs and lows, which are depicted in the chart by the whiskers; the average monthly range of normal temperature, which is represented by the box; and the average monthly temperature, which is shown by the line. The maximum and minimum temperatures for each month of a specific year, taken here as 2010, are represented as a range by the violet column to show its position vis-a-vis the Box-and-Whisker graph.
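The mapping from temperature figures to chart elements described above can be written down as a small record per month. This is only a sketch: the class name, field names and numbers are hypothetical placeholders, not the actual Calcutta data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MonthSummary:
    record_low: float   # end of the lower whisker
    normal_low: float   # bottom of the box
    average: float      # line inside the box
    normal_high: float  # top of the box
    record_high: float  # end of the upper whisker
    year_low: float     # bottom of the single-year column
    year_high: float    # top of the single-year column


def is_consistent(m):
    """Check that the values are ordered the way the chart expects."""
    box_ok = (m.record_low <= m.normal_low <= m.average
              <= m.normal_high <= m.record_high)
    # A single year's range cannot fall outside the all-time records.
    year_ok = m.record_low <= m.year_low <= m.year_high <= m.record_high
    return box_ok and year_ok


# Placeholder numbers for illustration only.
example = MonthSummary(5.0, 12.0, 19.0, 26.0, 33.0, 9.0, 30.0)
print(is_consistent(example))
```

A check like this is a cheap way to catch a mistyped value before it distorts a whisker or a column.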


The final look:

Friday, 21 October 2011

US Open Tennis

Charting a sporting event is always great fun, more so when it is the vibrant US Open Tennis Championships, which I came across in this piece. But to me, the piece appeared somewhat wayward and lacking in purpose. Most of its charts could have been represented more efficiently as tables, and the only one that made some sense or added some value was the one depicting the US women’s singles champions by country over time (1900 - 2011).

I decided to approach this from a different perspective. First, I separated the amateur years from the Open Era, which began in 1968. From 1884 to 1911, the men's championship used a challenge system whereby the defending champion automatically qualified for the next year's final, which helped produce some unbelievable feats, like Richard Sears remaining undefeated in the tournament and winning the inaugural seven editions of the Championships. Furthermore, the difficulty of travelling to and from the USA in the earlier years also helps explain why US ladies won 26 of the first 28 editions, with only Mabel Cahill from Britain managing to break the stranglehold. The men topped that by winning every edition but one between 1881 and 1925; the exception came in 1903, when Lawrence Doherty triumphed.

It also threw up some quirky facts:
  • Players from 5 nations won the US Open Men's & Ladies' Championship before the Open Era.
  • Players from 11 nations have won the US Open Men's & Ladies' Championship in the Open Era.
  • In 44 years of the US Open since 1968, a total of 44 players have won either the Men's or the Ladies' Championship.
My chart looks as follows.

Pick a Country:

Pick a Year:

and finally, Pick a Champion:

Wednesday, 12 October 2011

Views of Parties' Ideologies

US politics has always intrigued me: firstly, because it is the oldest democracy in the world and one of the strongest; secondly, because of its two-party system. As someone in India, where new political parties are formed on a daily basis without any rhyme or reason, I never cease to be amazed at how the system, and more importantly the two parties, have been able to adapt, transform and evolve over time to appeal to the populace and cater to their needs and aspirations.

The dataset is taken from the Pew Research Center for the People & the Press article titled "More Now See GOP as Very Conservative". I chose to present the snapshot based on the results of the survey conducted between Aug 17-21, 2011.

I chose to keep it simple: 3 charts for 3 tables, with space for an information synopsis for those seeking to read more into the numbers.



Once finished, I had a wicked idea in mind. What if I were to combine my three separate charts into one? It looks good, but doesn't improve on anything other than saving some space.

Friday, 7 October 2011

Graphing Obesity Trends

The five days of Durga Puja were stupendous as usual. With lots of great times with friends, all my favorite dishes for meals, and no restrictions, it was a fantastic experience yet again. The countdown to October 20th, 2012 for next year's festival looks like a long wait indeed.

This week's visualization was posted as a challenge on flowingdata.com on Apr 29th, 2010. It received several novel replies, none more appealing than the heatmaps. One of the problems of approaching a long-expired challenge is that all the regular forms of visualization have already been considered and posted.

When I looked at the data, it was apparent that the distribution could be viewed with respect to either the Age-Group or the Year of Birth. I therefore sought to plot the years of birth along the vertical axis and the obesity percentage along the horizontal axis, with one panel per age group.
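The panel layout described above amounts to grouping the records by age group and ordering each group by year of birth. A minimal sketch follows, assuming each record is an (age group, year of birth, obesity percentage) tuple; the function name and the sample values are hypothetical and are not the challenge data.

```python
from collections import defaultdict


def build_panels(records):
    """Group records into one panel per age group.

    Each panel is a list of (obesity_pct, birth_year) points, that is
    (horizontal, vertical) coordinates, ordered by year of birth.
    """
    panels = defaultdict(list)
    for age_group, birth_year, obesity_pct in records:
        panels[age_group].append((obesity_pct, birth_year))
    for points in panels.values():
        points.sort(key=lambda point: point[1])
    return dict(panels)


# Made-up values for illustration only.
records = [
    ("20-29", 1970, 10.0),
    ("20-29", 1960, 8.0),
    ("30-39", 1960, 12.0),
]
panels = build_panels(records)
for age_group, points in panels.items():
    print(age_group, points)
```

Each key then maps to one panel of the chart, and the points within it can be drawn bottom to top in order of birth year.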

My visualization for the Challenge is as follows:


A second version of the visualization: