I’ve been wondering how I can pull out popular links about libraries from Twitter, for current awareness purposes. I’m talking about the sort of links that people find so interesting they get retweeted. I suppose I could just create a Twitter search and look at which links have been retweeted the most, but it’s a pain in the bum to perform the same search all the time and trawl through a load of search results. Plus, I thought it would be interesting to try and do something a bit more techy.
I decided to make use of packrati.us, which is a bookmarking service used to automatically save links you tweet to your delicious.com account. It can also be used for saving links in Historious, Instapaper, Pinboard.in and Diigo accounts, but I just use it for Twitter. Loads of other people use it too, so I thought I could make use of the links that everyone has saved via this method.
By default, if a link is saved in delicious.com using packrati.us, it is saved with the tag “via:packrati.us”. This gave me a starting point to create relevant RSS feeds to pull into Yahoo Pipes. I then built on it to pull in tags such as “library”, “libraries” and “librarians”.
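As a rough sketch of how those feed addresses could be put together, here is a small Python helper. It assumes that delicious.com tag feeds follow the pattern `feeds.delicious.com/v2/rss/tag/<tag>+<tag>`; that pattern, and the names `FEED_BASE` and `feed_url`, are assumptions for illustration and should be checked against the site before relying on them.

```python
from urllib.parse import quote

# ASSUMPTION: URL pattern for delicious.com tag feeds; verify before use.
FEED_BASE = "http://feeds.delicious.com/v2/rss/tag/"
PACKRATI_TAG = "via:packrati.us"
LIBRARY_TAGS = ["library", "libraries", "librarians"]


def feed_url(*tags):
    """Build a feed URL for bookmarks that carry all of the given tags."""
    # Keep ':' unescaped so the packrati tag stays readable in the URL.
    return FEED_BASE + "+".join(quote(tag, safe=":") for tag in tags)


# One feed per library-related tag, each combined with the packrati tag.
feed_urls = [feed_url(PACKRATI_TAG, tag) for tag in LIBRARY_TAGS]

for url in feed_urls:
    print(url)
```

Each of the resulting addresses would be one input feed for the pipe.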
Delicious is a bit of a nuisance: it does rank bookmarks, but not by the number of times a link has been bookmarked. Its popular bookmarks are picked using some kind of relevance ranking, so they aren’t necessarily the links that have been saved the most. Strangely enough, even though delicious.com users have been asking for a while for ranking by the number of times an item has been saved, this feature hasn’t appeared.
Here’s what I did:
- Put the RSS feeds into Yahoo Pipes
- Combined the feeds into one feed
- Filtered them (so that each link only appeared once in the list)
- Sorted them by number of times the link appeared in the original RSS feed & date (to get most recent at the top of the list when it’s refreshed)
- Pulled out keywords from the original Tweet and delicious bookmarks (I just wanted it to give me an idea of the focus of the link, e.g. literacy, reader development, etc.)
- Deleted any irrelevant words (such as ‘quot’, which appears in the text if a quotation mark is used)
- Mapped the keywords to the description field.
This is the resulting pipe.
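For anyone who wants to try the same idea outside Yahoo Pipes, the combine, filter and sort steps above could be sketched in Python roughly like this. It is not how the pipe itself works internally. It assumes the feed items have already been fetched and parsed into dictionaries with `link`, `title` and `date` keys; those key names, the function name `rank_links` and the choice to keep the most recent copy of each link are all assumptions made for the example.

```python
from collections import Counter
from datetime import datetime


def rank_links(items):
    """Deduplicate feed items and rank them by save count, then by date.

    `items` is the combined list from all feeds (hypothetical structure):
    dicts with 'link', 'title' and 'date' (a datetime) keys.
    """
    # Count how many times each link appears across the combined feeds.
    counts = Counter(item["link"] for item in items)

    # Keep one item per link; here the most recent copy wins (an assumption).
    newest = {}
    for item in items:
        link = item["link"]
        if link not in newest or item["date"] > newest[link]["date"]:
            newest[link] = item

    # Most-saved first; ties are broken by the most recent date.
    unique = sorted(
        newest.values(),
        key=lambda item: (counts[item["link"]], item["date"]),
        reverse=True,
    )
    return [(item["link"], counts[item["link"]]) for item in unique]


# Made-up sample data standing in for the combined feeds.
sample = [
    {"link": "http://example.org/a", "title": "A", "date": datetime(2011, 3, 1)},
    {"link": "http://example.org/b", "title": "B", "date": datetime(2011, 3, 3)},
    {"link": "http://example.org/a", "title": "A again", "date": datetime(2011, 3, 2)},
]

for link, count in rank_links(sample):
    print(count, link)
```

With the sample data, the link that was saved twice comes out above the one saved once, even though the single save is more recent.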
It does what I want it to do, but it would be better if:
- All packrati links could be pulled out. At the moment I’m relying on people tagging anything they save via packrati with a tag reference to libraries too, so I may be missing out on library links that are popular, but haven’t had an extra tag added. There’s no other way of getting an RSS feed for a search on any keywords. RSS feeds in delicious.com are limited to tag searches.
- Odd keywords like “RT” and “amp” didn’t appear in the description field of the results. My regex skills aren’t great, so I couldn’t get rid of them.
- False hits could be filtered out. The term ‘library’ or ‘libraries’ can also refer to collections of programming code, so I might end up with the odd false hit in the results.
As far as I’m concerned they’re not massive issues, but I’d like to get them ironed out if I can.
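On the stray keywords, one possible approach is a regex that matches the unwanted terms only when they stand alone as whole words. The Python sketch below shows the idea; the name `strip_unwanted` and the list of terms are illustrative, and it has not been tried inside Yahoo Pipes.

```python
import re

# Match "RT", "amp" or "quot" only as whole words, in any letter case.
# The \b word boundaries stop it from touching words like "smart" or "ramp".
UNWANTED = re.compile(r"\b(?:RT|amp|quot)\b", flags=re.IGNORECASE)


def strip_unwanted(keywords):
    """Remove the unwanted terms and tidy up any leftover whitespace."""
    cleaned = UNWANTED.sub(" ", keywords)
    return " ".join(cleaned.split())


print(strip_unwanted("RT literacy amp reader quot development"))
```

The pattern is standard regex syntax, so something similar may work in a regex step in the pipe, though that is untested here.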
Anyway, now I don’t have to perform loads of searches every day to find the most popular library links.