The large majority of information online and within organizations is unstructured text, not cleansed data neatly organized in databases. This poses a unique challenge to analysts and database architects alike. Add the abundance of data feeds available online, and that challenge starts to look like a large opportunity.

As a result, there are many open source projects and software tools that gather this data and process it into useful summaries. One such tool is the “word cloud”, which provides a visual representation of the words and phrases used most often in a given site or data feed. On my free analytics software page, I provide more info about available software. Below is an example of a word cloud that I created on 2/25/2012 using Jason Davies’ word cloud tool, which searched Twitter for comments about Ron Paul. (Be sure to visit Jason’s site for this tool and others, as well as his critical assessment of using word clouds: there are shortcomings.)

Word Cloud: Ron Paul

So, how can we use these word clouds for marketing strategy? Here are two ideas for competitor research. Suppose that competing websites currently outrank you on several keywords. Word clouds can offer a quick glimpse into the content of these websites and into what stakeholders are currently saying about them.

Website content of competitors: Using Google Spreadsheets’ importXML function, I scraped the site of a Pittsburgh-based website designer that has high organic ranks for several keywords related to internet marketing strategy. I only scraped text content, but I could just as easily have pulled links, meta information, and so on. After scraping the pages, I copied and pasted the text into Wordle’s cloud creator and created the following:

Content Cloud of Competitor

This image could tip me off to certain keywords that I should consider emphasizing or building content around.
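For readers who prefer a script to a spreadsheet, the scrape-and-count step can be roughly sketched in Python. This is not the importXML/Wordle workflow I used, only an approximation of the same idea: strip the markup from a page, drop common filler words, and count what is left. The sample HTML, the short stopword list, and the names `TextExtractor` and `word_frequencies` are all made up for illustration.

```python
import re
from collections import Counter
from html.parser import HTMLParser

# Hypothetical, deliberately short stopword list; a real one would be longer.
STOPWORDS = {"the", "and", "for", "with", "our", "your", "that", "this"}


class TextExtractor(HTMLParser):
    """Collects visible text, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.chunks.append(data)


def word_frequencies(html, stopwords=STOPWORDS):
    """Return a Counter of words in the page text, ignoring short and stop words."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    text = " ".join(parser.chunks).lower()
    words = re.findall(r"[a-z']+", text)
    return Counter(w for w in words if len(w) > 2 and w not in stopwords)


# Made-up page standing in for a competitor's site.
sample_html = """
<html><head><style>body { color: red; }</style></head>
<body><h1>Web Design in Pittsburgh</h1>
<p>We offer web design, internet marketing, and marketing strategy.</p>
<script>var tracking = "marketing";</script></body></html>
"""

print(word_frequencies(sample_html).most_common(3))
```

The resulting counts are the raw material a word cloud tool turns into font sizes, so the same list can be read directly as a ranked keyword table.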

What stakeholders are saying: For another web competitor, I simply typed the name of their Twitter account into Jason Davies’ tool. I then selected the log scale for the word counts so that a few words didn’t dominate the entire image. Here’s what I got:

Twitter Competitor Cloud

The above keywords are from posts containing the competitor’s name. These may have been made by customers, the company itself, or anyone else with a Twitter account. This may tip me off to new offers, content, and ideas that my competitors are trying, as well as how they are being perceived by outsiders.
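To see why the log scale keeps a few words from dominating, here is a small Python sketch that maps word counts to font sizes both ways. I don't know how Jason Davies’ tool computes sizes internally, so treat this as an illustration of the general idea only; the function `font_sizes`, the size range, and the counts are all invented for the example.

```python
import math


def font_sizes(counts, min_size=10, max_size=60, log_scale=True):
    """Map word counts (all >= 1) to font sizes, linearly or on a log scale."""
    transform = math.log if log_scale else float
    values = {word: transform(count) for word, count in counts.items()}
    lo, hi = min(values.values()), max(values.values())
    span = (hi - lo) or 1.0  # avoid dividing by zero when all counts match
    return {
        word: round(min_size + (value - lo) / span * (max_size - min_size), 1)
        for word, value in values.items()
    }


# Made-up counts: one word appears ten times as often as the next.
counts = {"design": 400, "marketing": 40, "pittsburgh": 4}

print(font_sizes(counts, log_scale=False))
print(font_sizes(counts, log_scale=True))
```

On the linear scale the middle word ends up barely larger than the rarest one, while on the log scale it lands about halfway between the smallest and largest sizes, which is why the log option makes the less frequent words readable.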

Long story short: word clouds are easy to use and pleasing to the eye, but they are a very blunt tool for beginning to analyze text. Some insights may be gleaned, but more technical analytics should be pursued to identify clearer and more certain business actions.