I’ve been getting to grips with Processing lately. Apart from creating a couple of small games and a handful of generative images, I’ve also been experimenting with data feeds. One of the ideas I had was to create a visual card for WordPress blogs based on their content. I wanted each card to give a snapshot of the blog at that specific time, and also to reflect the content of the data feed visually.
Each card also displays the title of the blog and its URL, which it takes from whatever WordPress RSS feed you give it.
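For anyone curious how the title and URL might be read, here is a minimal sketch in plain Java (which Processing is built on). It is not the code from my program: the class name `FeedHeader`, the method `channelField` and the sample feed are all made up for illustration, and it assumes a standard RSS 2.0 layout where `<title>` and `<link>` sit directly inside `<channel>`.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

// Hypothetical helper: reads feed-level fields from an RSS 2.0 document.
class FeedHeader {

    // Returns the text of the first direct child of <channel> named `tag`,
    // or an empty string if the channel or the tag is missing.
    // Only direct children are checked, so an <item>'s own <title> is ignored.
    static String channelField(String rssXml, String tag) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(rssXml.getBytes(StandardCharsets.UTF_8)));
            Element channel = (Element) doc.getElementsByTagName("channel").item(0);
            if (channel == null) {
                return "";
            }
            for (Node n = channel.getFirstChild(); n != null; n = n.getNextSibling()) {
                if (n.getNodeType() == Node.ELEMENT_NODE && n.getNodeName().equals(tag)) {
                    return n.getTextContent().trim();
                }
            }
            return "";
        } catch (Exception e) {
            throw new RuntimeException("Could not parse feed", e);
        }
    }

    public static void main(String[] args) {
        // Made-up sample feed, standing in for a real WordPress RSS document.
        String sample = "<rss version=\"2.0\"><channel>"
                + "<title>Example Blog</title><link>https://example.com</link>"
                + "<item><title>First post</title><description>Hello</description></item>"
                + "</channel></rss>";
        System.out.println(channelField(sample, "title"));
        System.out.println(channelField(sample, "link"));
    }
}
```

Checking only the direct children of `<channel>` matters because every `<item>` in the feed has its own `<title>` and `<link>` too.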
Automated keyword tagging has been an interest of mine for some time (as some of my Yahoo Pipes experiments have shown). In this program I pull out all of the words in the description field of the feed, filter out unwanted words with a stopword list, and then rank the remaining words by how often they are mentioned. It’s interesting to see the words that crop up the most, although as a second stage I’m considering stemming them. For example, both “library” and “libraries” appear in the top 10 words for my blog, and stemming would reduce this problem of closely related words taking up separate places.
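The counting step can be sketched roughly as follows. Again this is an illustration in plain Java rather than my actual code: `KeywordRanker`, `topWords` and the tiny stopword list are hypothetical, and splitting on anything that isn’t a letter a–z is a simplifying assumption (it will break up words containing apostrophes or accented characters).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Hypothetical helper: ranks the most frequent non-stopwords in a piece of text.
class KeywordRanker {

    // A deliberately tiny stopword list for illustration; a real one would be much longer.
    static final Set<String> STOPWORDS = new HashSet<>(Arrays.asList(
            "a", "an", "and", "the", "of", "to", "in", "is", "it", "for", "on", "with"));

    // Returns up to `limit` (word, count) pairs, most frequent first.
    // Ties are broken alphabetically so the order is predictable.
    static List<Map.Entry<String, Integer>> topWords(String text, int limit) {
        Map<String, Integer> counts = new HashMap<>();
        for (String word : text.toLowerCase(Locale.ROOT).split("[^a-z]+")) {
            if (word.isEmpty() || STOPWORDS.contains(word)) {
                continue; // skip split artefacts and unwanted words
            }
            counts.merge(word, 1, Integer::sum);
        }
        List<Map.Entry<String, Integer>> ranked = new ArrayList<>(counts.entrySet());
        ranked.sort((x, y) -> {
            int byCount = Integer.compare(y.getValue(), x.getValue());
            return byCount != 0 ? byCount : x.getKey().compareTo(y.getKey());
        });
        return new ArrayList<>(ranked.subList(0, Math.min(limit, ranked.size())));
    }

    public static void main(String[] args) {
        String descriptions = "The library and the libraries: a library of books";
        for (Map.Entry<String, Integer> entry : topWords(descriptions, 10)) {
            System.out.println(entry.getKey() + " " + entry.getValue());
        }
    }
}
```

Note that this sketch still counts “library” and “libraries” as two different words, which is exactly the issue stemming would address.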
When you run the program, each top keyword is displayed on its own for a few seconds, starting with the most popular, before the program moves on to the next one.
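One simple way to get this kind of timed rotation is to work out which keyword to show from the elapsed time. The sketch below is an assumption about how it could be done, not a copy of my program: `KeywordCycler` and `indexAt` are hypothetical names, and wrapping back to the first keyword after the last one is my own choice for the example. In a Processing sketch the elapsed time would typically come from `millis()` inside `draw()`.

```java
// Hypothetical helper: decides which keyword to show at a given moment.
class KeywordCycler {

    // Returns the index of the keyword to display after `elapsedMs` milliseconds,
    // when each keyword stays on screen for `displayMs` milliseconds.
    // Index 0 is the most popular keyword; the sequence wraps around at the end.
    static int indexAt(long elapsedMs, long displayMs, int keywordCount) {
        if (displayMs <= 0 || keywordCount <= 0 || elapsedMs < 0) {
            throw new IllegalArgumentException("times and keyword count must be positive");
        }
        return (int) ((elapsedMs / displayMs) % keywordCount);
    }

    public static void main(String[] args) {
        String[] keywords = {"library", "books", "data"};
        // Simulate checking once a second, with each keyword shown for three seconds.
        for (long t = 0; t <= 9000; t += 1000) {
            System.out.println(t + " ms: " + keywords[indexAt(t, 3000, keywords.length)]);
        }
    }
}
```

Because the index is derived from the clock rather than from a frame counter, the timing stays the same whatever frame rate the sketch runs at.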
So, here are a couple of examples from this blog and also the Voices for the Library site.
I’d like to develop the idea further: include more detail, possibly with quotes and images from the blogs, and use more data from the feed to generate the background. I’m also thinking that if I focused just on library and information service blogs, it might be a good idea to create a dictionary of terms to compare against, alongside the top 10 words.
Anyway, I like the way they’ve turned out so far.