Mashups At CPD25

Along with Chris Keene and Paul Stainthorp, I was recently asked by Craigie-Lee Paterson to present to around 20 people at a CPD25 training event at Goldsmiths (University of London) on the theme of mashups and Web 2.0 tools. The audience was mainly made up of academic and health library staff, so as a public librarian it was an opportunity for me to see things from another angle.

The morning was based around presentations from all three of us. Chris opened with a discussion about mashups, giving examples of what people have created with mashup tools and how mashups have developed. I followed up with a look at the tools and resources you need to create mashups, such as RSS, Yahoo Pipes and library catalogues. Paul then went into detail about a catalogue project called Jerome at the University of Lincoln, as an example of what can be done when you get to the stage where you are able to program and tinker with data. We finished the morning with questions and answers and a few more mashup examples.

"Simplest)l(" by Chrstphre

In the afternoon Paul and I ran a practical session to create a mashup with Yahoo Pipes. All in attendance sat at their own computer and followed an example, which took an original RSS feed from the Guardian newspaper, filtered out unwanted news articles and tweaked the information so that it was presented in a particular way. After this Yahoo Pipes tutorial we gave people the opportunity to build their own.
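For anyone who prefers code to pipes, the same filter-and-tweak idea can be sketched in a few lines of Python. This is only an illustration, not the pipe we built on the day: the sample feed, the blocked word and the title prefix are all made up, and a real version would fetch the feed rather than use an inline string.

```python
import xml.etree.ElementTree as ET

# Invented sample feed standing in for a real newspaper RSS feed.
SAMPLE_FEED = """<rss version="2.0"><channel>
<title>Example news</title>
<item><title>Library funding announced</title><link>http://example.com/1</link></item>
<item><title>Football results</title><link>http://example.com/2</link></item>
<item><title>New library opens</title><link>http://example.com/3</link></item>
</channel></rss>"""


def filter_feed(feed_xml, blocked_words):
    """Return (title, link) pairs for items whose title has no blocked word."""
    root = ET.fromstring(feed_xml)
    kept = []
    for item in root.iter("item"):
        title = item.findtext("title", default="").strip()
        link = item.findtext("link", default="").strip()
        if not any(word.lower() in title.lower() for word in blocked_words):
            kept.append((title, link))
    return kept


def reformat(entries, prefix="[News] "):
    """Tweak how each kept item is presented, here by prefixing the title."""
    return [(prefix + title, link) for title, link in entries]


filtered = reformat(filter_feed(SAMPLE_FEED, ["football"]))
for title, link in filtered:
    print(title, link)
```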

I really enjoyed the event. Being involved in a training session outside of my public libraries role provides me with another perspective on how mashups can be used and also what is going on in libraries in the broader arena. It’s always great for me to be involved in these events, as I also get the chance to learn from other presenters – I think Paul and Chris would agree that we all have our own specialist areas and (as with Mashed Libraries events) sessions like this help me fill in gaps in my knowledge.

I think the day worked well and the feedback from Craigie-Lee and those in attendance was positive. Hopefully it will have inspired some of those people in attendance to make use of these tools and get mashing.

Leeds Libraries use Mapped in Yahoo Pipes

Recently, Ian Clark blogged about proposed closures in Leeds Libraries on the Voices For The Library site. Following the Freedom of Information request this blog post was based on, we thought it might be useful to map some of the data, as a simple way of comparing libraries in Leeds. (NB: These figures were only a starting point for the findings.)

The FoI request included details of issues, visits and PC bookings. After tracking down unemployment figures for electoral districts, I mapped them to postcodes so they related to the correct libraries. The data was then combined in a Google spreadsheet and the spreadsheet was mapped in Yahoo Pipes.

Each library appears as a marker on the map, and each marker contains information such as “Middleton Library. Change in issues: 20185. Change in visits: 28409. Change in PC Bookings: 945 / Unemployment 2010: 13.6%”.
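As a rough sketch of how one spreadsheet row becomes the text inside a marker, here is the same step in Python. The column names are my own invention for the example (the real spreadsheet headings may differ), and the single row reuses the Middleton figures quoted above.

```python
import csv
import io

# Hypothetical column names; the one data row repeats the Middleton example.
SAMPLE_CSV = """library,change_issues,change_visits,change_pc,unemployment_2010
Middleton Library,20185,28409,945,13.6
"""


def marker_text(row):
    """Build the description shown in a map marker from one spreadsheet row."""
    return (
        f"{row['library']}. Change in issues: {row['change_issues']}. "
        f"Change in visits: {row['change_visits']}. "
        f"Change in PC Bookings: {row['change_pc']} / "
        f"Unemployment 2010: {row['unemployment_2010']}%"
    )


markers = [marker_text(row) for row in csv.DictReader(io.StringIO(SAMPLE_CSV))]
print(markers[0])
```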

As I say, it was just a simple way of comparing usage figures of libraries situated close to each other alongside unemployment figures. It beats scanning a list of figures on a spreadsheet. I also just wondered if I could put Yahoo Pipes to practical use. My pipe tinkerings have previously been aimed at seeing what I could do with pipes, whereas this was more to do with putting it to good use and maybe building on it in some way.

Finding Packrati Popular Library Links

I’ve been wondering about how I can pull out popular links about libraries from Twitter, for current awareness purposes. I’m talking about the sort of links that people find so interesting they get retweeted. I suppose I could just create a Twitter search and look at which links have been retweeted the most, but it’s a pain in the bum to perform the same search all the time and trawl through a load of search results. Plus, I thought it would be interesting to try and do something a bit more techy.

I decided to make use of packrati.us, which is a bookmarking service used to automatically save links you tweet to your delicious.com account. It can also be used for saving links in Historious, Instapaper, Pinboard.in and Diigo accounts too, but I just use it for Twitter. Loads of other people use it too, so I thought I could make use of links that everyone has saved via this method.

(c) National Media Museum/Flickr

By default if a link is saved in delicious.com using packrati.us it saves it with the tag “via:packrati.us“. This gave me a starting point to create relevant RSS feeds to pull into Yahoo Pipes. I then built on it to pull in tags such as “library”, “libraries” and “librarians”.

Delicious is a bit of a nuisance, because it does rank bookmarks, but it doesn’t do it by the number of times a link has been bookmarked. It provides links to popular bookmarks (using some kind of relevance ranking), not necessarily links that have been saved the most. Strangely enough, even though delicious.com users have been asking for ranking by the number of times an item has been saved for a while, this feature hasn’t appeared.

I then:

  • Put the RSS feeds into Yahoo pipes
  • Combined the feeds into one feed
  • Filtered them (so that each link only appeared once in the list)
  • Sorted them by number of times the link appeared in the original RSS feed & date (to get most recent at the top of the list when it’s refreshed)
  • Pulled out keywords from the original Tweet and delicious bookmarks (I just wanted it to give me an idea of the focus of the link. eg literacy; reader development, etc.)
  • Deleted any irrelevant words (‘quot’, which appears in the text if ” is used)
  • Mapped the keywords to the description field.
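The combine, de-duplicate and sort steps above could be sketched in Python along these lines. It is an approximation of the idea rather than a copy of the pipe: the two feeds are invented lists of (link, date saved) pairs, and I am assuming "popularity" simply means how many times a link turns up across the feeds.

```python
from collections import Counter
from datetime import date

# Invented stand-ins for two tag feeds: (link, date saved) pairs.
feed_a = [("http://example.com/a", date(2011, 3, 1)),
          ("http://example.com/b", date(2011, 3, 2))]
feed_b = [("http://example.com/a", date(2011, 3, 3)),
          ("http://example.com/c", date(2011, 3, 4))]


def rank_links(*feeds):
    """Merge feeds, keep each link once, sort by times seen and then by date."""
    combined = [entry for feed in feeds for entry in feed]
    counts = Counter(link for link, _ in combined)
    latest = {}
    for link, saved in combined:
        # Remember the most recent date each link was saved.
        if link not in latest or saved > latest[link]:
            latest[link] = saved
    return sorted(latest, key=lambda link: (counts[link], latest[link]), reverse=True)


ranked = rank_links(feed_a, feed_b)
print(ranked)
```

Sorting on a (count, date) pair means the date only breaks ties between links that have been saved the same number of times.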

This is the resulting pipe.

It does what I want it to do, but it would be better if:

  • All packrati links could be pulled out. At the moment I’m relying on people tagging anything they save via packrati with a tag reference to libraries too, so I may be missing out on library links that are popular, but haven’t had an extra tag added. There’s no other way of getting an RSS feed for a search on any keywords. RSS feeds in delicious.com are limited to tag searches.
  • The description field was cleaner. My regex skills aren’t great, so some odd keywords like “RT” and “amp” appear in the description field of the results and I couldn’t get rid of them.
  • False hits were filtered out. The term ‘library’ or ‘libraries’ can also refer to programming code collections, so I might end up with the odd false hit in the results.

As far as I’m concerned they’re not massive issues, but I’d like to get them ironed out if I can.

Anyway, now I don’t have to perform loads of searches every day to find the most popular library links.

Generating Blog Keyword Tags 2

I had another go at automating the tagging process for my blog using Yahoo Pipes, as I wanted to improve on my original idea, which was a bit scrappy.

So, I’ve reworked the pipe to pull out all of the keywords from all of the posts (using Category RSS feeds). The original pipe listed the blog post title and keywords associated with that blog post. The new pipe lists the most frequently used keywords in all of the blog posts. When the keyword is clicked on (in the RSS feed) it runs a search on that keyword and returns any blog post mentioning the keyword.

In the pipe I’ve manually filtered out certain irrelevant words eg ‘blog’, ‘amp’ and ‘doc’. As time goes on I’ll have to manually add more words.

The only problem at the moment is that, even though the pipe returns an unlimited number of keywords, WordPress.com is limited to showing the first twenty items. I decided to compromise and call the feed ‘Top automatic Tags’. Unsurprisingly the most common phrase is ‘Yahoo pipes’.
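The counting side of this is easy to sketch outside Yahoo Pipes. The Python below is only an illustration: the keyword lists are invented, the stop list holds just the three words mentioned above, and the limit of twenty mirrors what WordPress.com will display.

```python
from collections import Counter

# Words filtered out by hand; more would be added as time goes on.
STOPWORDS = {"blog", "amp", "doc"}


def top_tags(keywords_per_post, limit=20):
    """Count keywords across all posts and return the most frequent ones."""
    counts = Counter()
    for keywords in keywords_per_post:
        for keyword in keywords:
            cleaned = keyword.strip().lower()
            if cleaned and cleaned not in STOPWORDS:
                counts[cleaned] += 1
    return [word for word, _ in counts.most_common(limit)]


# Invented keyword lists, one per blog post.
posts = [
    ["Yahoo Pipes", "blog", "maps"],
    ["yahoo pipes", "amp", "Twitter"],
    ["maps", "Yahoo Pipes"],
]
tags = top_tags(posts)
print(tags)
```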

You can see it on the right hand side of this blog (if Yahoo pipes is working, of course ;-) )

PS. I’ve not abandoned the original Tagxedo idea, but I need a bit more time to tinker with it.

Twitter follower/friend map

I seem to be getting into the swing of things with Yahoo Pipes at the moment and I seem to be creating lots of maps. Every time I use it, something else clicks in my head and puts a smile on my face. Yesterday, Aaron Tay asked me if I knew how to create a Twitter followers or friends map. I didn’t, but I thought it would be a good way to see if I could get to grips with some of Twitter’s APIs and also if they’d play more nicely with Yahoo pipes than previously. It was also nice to be asked by someone else to do something like this – my own projects seem to be a bit self-centred, so being able to do something useful for someone else made a nice change.

The Twitter API lets you pull out details of a user’s friends/followers. It returns them as Twitter id numbers only, but by creating a URL with those ids added to it, you can pull out full details. You can use a programming language to do this too, but if it goes into Yahoo pipes I’d rather do it there. Once you’ve got this, you can narrow the info down to the various bits you need. In my case I wanted biography details, location, photo and a link to Twitter profile.

In summary, I had to:

(1) Create user input boxes for ‘username’ and to identify if the map was for ‘followers’ or ‘friends’. This meant anyone can enter their user details, rather than just myself.

(2) I then had to build a url to point to the Twitter API and include the detail in (1).

(3) This url then fetched the details of the user’s followers or friends, i.e. their id numbers only.

(4) I then built another url using the ids, to fetch full details of every follower or friend of the user.

(5) Each user’s profile contains a location field and if you put this into the ‘location builder’ module it extracts very detailed geographic location. Pretty impressive, considering some users only give the vaguest of details. It’s not perfect though, as, for example @therealwikiman is mapped to the USA, even though his location info is detailed. As he’s really based in England, I imagine the commute in the morning is a bit of a nightmare. ;-)

(6) From various fields in each profile I then built a description that contained Twitter image, biography and location in text.

(7) I also added a link to each of their Twitter home pages.

(8) Finally I mapped all of the data to standard RSS/map data fields (title, link, description, y:location). When Yahoo pipes works with data it changes field names to reflect what it’s done to the data, so you need to change them to a format that is recognised.

(9) I connected it to the pipe output.
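For anyone who would rather see the steps above as code, here is a hedged Python sketch of the URL building and field mapping. It makes no network calls. The base address is a placeholder and not a real Twitter endpoint, the path and parameter names are assumptions, the batch size is just a guess at a sensible limit, and the profile keys (name, screen_name, image, bio, location) are hypothetical.

```python
from urllib.parse import urlencode

# Placeholder base address for illustration only; not a real Twitter endpoint.
API_BASE = "https://api.example.com/1"


def ids_url(username, relation):
    """Steps 1-3: URL that would list the ids of a user's followers or friends."""
    if relation not in ("followers", "friends"):
        raise ValueError("relation must be 'followers' or 'friends'")
    return f"{API_BASE}/{relation}/ids.json?" + urlencode({"screen_name": username})


def lookup_urls(ids, batch_size=100):
    """Step 4: URLs that would fetch full profiles, a batch of ids at a time."""
    urls = []
    for start in range(0, len(ids), batch_size):
        batch = ids[start:start + batch_size]
        query = urlencode({"user_id": ",".join(str(i) for i in batch)})
        urls.append(f"{API_BASE}/users/lookup.json?{query}")
    return urls


def to_item(profile):
    """Steps 6-8: map a profile (hypothetical keys) to RSS/map style fields."""
    description = '<img src="{image}"/> {bio} ({location})'.format(**profile)
    return {
        "title": profile["name"],
        "link": "https://twitter.com/" + profile["screen_name"],
        "description": description,
        # The pipe geocodes this with the location builder; here it stays as text.
        "y:location": profile["location"],
    }


print(ids_url("alice", "followers"))
print(lookup_urls([1, 2, 3], batch_size=2))
```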

Twitter follower and friends map

When it ran, because it saw the field ‘item.y:location’ in there, it automatically displayed the information as a map, which you can see here. You can also add your own user info into the search box and create your own map. (NB: Sometimes Yahoo pipes & Twitter don’t play nicely together. If you have a problem with this pipe and have a Yahoo account, try copying the pipe and adding your own information into the search boxes.)

One thing I would like to get to grips with in Yahoo pipes is to be able to embed the output of a pipe into a web page and also allow users to add their own input on the same page, but I’ve not cracked that yet. So, if anyone else can help me with that side of things it would be appreciated. Thanks.

A Travellers Map in Yahoo Pipes


Before putting together the Surrey Fiction Book Map for work I was considering the possibility of creating a map of the world that would link from markers to Surrey Libraries’ catalogue. I didn’t fancy creating it manually and I was sure it could be done via Yahoo Pipes. However, at the time I hadn’t used Yahoo Pipes in this way before, so I didn’t follow through with the idea.

Now I’ve had a bit more time to think about it, I’ve managed to put something together using a spreadsheet version of our Subject index and Yahoo Pipes.

Firstly, the spreadsheet contains all of the information I need – text description of the location, plus the sub topic (eg Travel; history; etc). It also contains the Dewey number and our Reader Interest Categories (RIC). In Surrey the RIC is used to shelve our stock by subject area – helping to bring together related stock that would otherwise be separated.

Section from Subject Index spreadsheet

I created a Yahoo pipe that pulled in the spreadsheet information.

It then filtered the subject headings based on the ‘NewRIC’ column, removing any subject headings that weren’t location-based. In the above example you can see some subject headings in the original source file that it excluded eg Aramaic Language; Arboretums; Archery.

The pipe combined the Heading/Subheading fields (so they appeared in the title) and the RIC and Dewey number (so they appeared in the description). It’s a librarian thing I do to scare off the public ;-)
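The filter and combine steps could look something like this in Python. It is a sketch under assumptions: apart from ‘NewRIC’, the column names are guesses at the spreadsheet headings, the rows are invented, and the set of RIC values treated as location-based is hypothetical.

```python
import csv
import io

# Invented rows; only the 'NewRIC' column name comes from the real spreadsheet.
SAMPLE = """Heading,Subheading,NewRIC,Dewey
Argentina,Travel,TRAVEL,918.2
Archery,,SPORT,799.32
Austria,History,HISTORY,943.6
"""

# Hypothetical set of RIC values that count as location-based.
LOCATION_RICS = {"TRAVEL", "HISTORY"}


def build_items(csv_text):
    """Keep location-based headings and build a title and description for each."""
    items = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        if row["NewRIC"] not in LOCATION_RICS:
            continue
        title = " / ".join(part for part in (row["Heading"], row["Subheading"]) if part)
        description = f"RIC: {row['NewRIC']}; Dewey: {row['Dewey']}"
        items.append({"title": title, "description": description, "dewey": row["Dewey"]})
    return items


items = build_items(SAMPLE)
for item in items:
    print(item["title"], "|", item["description"])
```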

I also fed the title field into the ‘Location builder’ module and it did a pretty good job of identifying the map locations mentioned. It did have some problems, as you can see from the fact that “War of the Roses” has been mapped to just off the Australian coast! This was due to the fact that some of the text wasn’t precise. I’m correcting these issues gradually, as there are over 800 items to check.

War of the Roses, just off the coast of Australia!

Finally I created a link from each marker pin back to the library catalogue. As the subject index contained Dewey numbers I could add this information to each link via the String builder module. The link basically acts as a catalogue search.
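What the String builder module does here amounts to gluing a Dewey number onto a search URL. A minimal Python equivalent is below; the catalogue address and the `classmark` parameter name are placeholders, as the real catalogue’s search URL will look different.

```python
from urllib.parse import urlencode

# Placeholder search address; the real catalogue URL and parameter will differ.
CATALOGUE_SEARCH = "https://catalogue.example.org/search"


def catalogue_link(dewey):
    """Build a catalogue search link for one Dewey number."""
    return CATALOGUE_SEARCH + "?" + urlencode({"classmark": dewey})


link = catalogue_link("918.2")
print(link)
```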

If you’re interested you can take a look at it here.

As a next stage I need to tidy up the subject index, so it maps more accurately and removes subject headings that I can’t map correctly.

It would also be useful to be able to present the map so it is less tightly packed and maybe add a location search too. Maybe with some location images, as well.

Also, if you do want to know what each part of the pipe does in detail, feel free to ask.

Generating Blog Keyword Tags and Tagxedo Clouds with Yahoo Pipes

As I’ve been adding blog posts here I’ve noticed that the keyword tags are getting into a mess. So I’ve been thinking about what I could do to sort them out, either by getting the computer to do the tagging for me, or provide another way of presenting relevant keywords about specific blog posts to anyone who visits my blog.

As a first attempt (1), I decided to use Yahoo pipes and simply feed in the RSS feed of my blog, pull out keywords from each blog post and then create another RSS feed to be used anywhere. Visitors can view keywords for the last 10 blog posts, as I couldn’t get Yahoo pipes to go beyond 10. They can also click on a link to the blog post. The words/phrases pulled out aren’t perfect (as with any automated word extraction), but I think you get a good feel for the blog posts from them.

(NB: Click on ‘List’ to see the keywords and use the link back to the blog.)

At the same time, I was thinking about whether I could use something like a tag cloud generator to do what I wanted a bit more creatively. Having a look at Tagxedo, I realised that if you use the URL function on the home page to create a cloud you could actually build the url yourself. So, I created a second pipe (2) that presented the same keywords, but also provides a clickable link that feeds through to Tagxedo and creates individual tag clouds for each entry.
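Building that kind of link yourself mostly comes down to URL-encoding the address you want the cloud made from. The Python below shows the idea only: the base address and the `url` parameter name are placeholders I have made up for the example, not Tagxedo’s actual URL format.

```python
from urllib.parse import urlencode

# Placeholder address and parameter name; check the real format before using.
CLOUD_BASE = "https://cloud.example.org/app"


def cloud_link(source_url):
    """Build a link asking the cloud generator to use source_url as its text."""
    return CLOUD_BASE + "?" + urlencode({"url": source_url})


link = cloud_link("http://example.com/feed?post=1")
print(link)
```

The encoding matters because the feed address has its own `?` and `=` characters, which would otherwise be read as part of the outer link.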

(NB: Click on ‘List’ to see the keywords and use the link to generate the cloud.)

Ultimately, I did want to combine the two pipes, but I couldn’t get Yahoo pipes to create valid links to 2 places in the same RSS item.

I would have also liked the Tagxedo cloud to display in the RSS feed, but at the moment the link just creates a cloud from the RSS.

Hopefully there is a way to achieve both of these things, but as a first attempt I think they both work quite well, even if the RSS feeds/presentation do need a bit of tidying up.

The results of the embedded pipes can be found on my test site here. Links to the source code of the pipe can be found in (1) and (2) above. I’ve also added the RSS to the blog on the right hand side to see how people get on with it. The feed is labelled ‘Term Tags’.

This Made Me 2

In a previous blog post I talked about a biographical project I wanted to attempt – I called it “This Made Me”. I wanted to put together a visual representation of things I consider influenced me throughout my life and made me the person I am today, just as an experiment to see what I could come up with.

I’ve actually managed to turn that idea into something concrete using Yahoo Pipes to pull through information I added to a Google spreadsheet. Yahoo Pipes then automatically created the map with markers and details of influences in those markers. Here it is. The markers contain images pulled through from various websites and also link to relevant web sites too. The map only contains about 13 influences so far, but I’ll add others as I go along.

This Made Me Yahoo Pipes map

I’m pleased with the fact that I’ve managed to create this without having to manually add the information to the map, as I have done with other maps I’ve created previously. It’s also helped me understand how aspects of Google docs and Yahoo pipes work and is definitely something I can build on. Both @psychemedia and @ostephens gave me plenty of tips on how to achieve this. So, big thanks to them.

Putting this information on a map is only one way of doing things and I’d like it to be more visual (without a map), so I’ll see where I can go with it next. The data is there, it just needs to be fiddled with.

Edit: Part of the challenge of doing this was seeing if I could provide something that could be used by others too, if they wanted to. That’s why I didn’t create the map manually (for one reason anyway). Maybe a biography of a famous person could be created in the same way, detailing their life based around locations around the world. How about a great explorer like Christopher Columbus or James Cook? All you’d need to do is copy the Yahoo pipe and pull in the data from a different spreadsheet.

Literary Twist Project and Run Basic

After tinkering with my Literary Twist Yahoo pipe (put in a book synopsis and it turns it into a synopsis for a horror novel) I’ve decided that it doesn’t work. Well, it sort of works when it finds any relevant words. The problem is, it relies on the common words appearing in the text fields that are entered into the synopsis text boxes and after testing it for a bit I’ve decided that this method isn’t good enough. Even though I methodically chose the words that would occur frequently enough, it seems that synopsis writers don’t like to write using common words ;-)

I’m going to try a different method now. Yes, I know that this project has no practical use in the world, apart from amusing myself, but it’s a challenge to see if I can get it working in the way I wanted.

(c) Tomas Rotger (Flickr)

I’ve now worked out that what it needs to do is, basically identify the most common words in any text that is entered into the text box (rather than common words in general) and twist or replace them in a way that makes sense, but also gives the horror aspect to the new words.

I realise I can’t do this with Yahoo pipes – it’s just too complicated to do it that way for my brain. I find Yahoo pipes is fine as long as I don’t make it really complicated and sometimes Yahoo pipes just stalls and sputters into lifelessness if I make anything too complicated.

So, I’m currently using Run Basic to try and achieve this. As the name suggests it’s a language based around Basic – no sniggering! Basic is embedded in my brain and I will champion this favourite language of 80s schoolboys until you mock me so much that I curl up into a ball and cry. The good thing is that it’s server based, so you can create dynamic web pages from it. I’ve used PHP to create dynamic web pages in the past, but if I don’t use it for a while I forget the syntax, methods, etc. PHP also tended to go wonky on me when I upgraded browsers, as my programming was less than standard. Whereas, as I spent years programming in Basic, Run Basic was so easy to pick up. Run Basic also allows you to parse XML, manipulate files, and use HTTPGET and HTTPPUT functions, as well as other useful things.

So the first thing I’m doing on this new plan is to put together a term extractor and word count. It’s not quite there, but I’ve more confidence cracking it with Run Basic than anything else. I won’t let it beat me, no matter how useless the result is!

Literary Twist Update

I mentioned in my Tinkering Day post that I’d made some progress on the Literary Twist project. I thought it might be interesting for others to see what I’d done/how I’d done it.

Well, I’ve sort of done what I wanted on tweaking the words, but at the same time it’s obvious that what I wanted to do wasn’t enough to make it as entertaining as I wanted. :-(

I basically got a list of commonly used words – I looked for a few sites that covered this to get an aggregated group of words… just to make sure I was replacing the best set of words. Then, using Google docs I pulled data from tables in websites into a spreadsheet, rather than retyping the info (Tony Hirst wrote a blog post about doing this). Sometimes, because the words weren’t in a table, I had to copy/paste the data into the spreadsheet. The data was a bit scrappy, as it came through to the spreadsheet in a variety of formats. Google spreadsheets doesn’t have a regex function and I didn’t want to do hundreds of manual find/replace operations, so I fed it into a Yahoo pipe to clean it up, using regex.
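To give a flavour of the sort of regex clean-up involved, here is a small Python sketch. The messy input lines are invented and the three patterns are just guesses at typical junk (HTML entities, leading rank numbers, stray punctuation), not the expressions used in the pipe.

```python
import re


def clean_words(raw_lines):
    """Strip entities, rank numbers and stray characters from scraped word lists."""
    words = []
    for line in raw_lines:
        text = re.sub(r"&\w+;", " ", line)           # HTML entities such as &nbsp;
        text = re.sub(r"^\s*\d+[.)]?\s*", "", text)  # leading rank numbers: "1." or "2)"
        text = re.sub(r"[^A-Za-z' -]", " ", text)    # anything that isn't word-like
        words.extend(text.lower().split())
    return words


# Invented examples of the mixed formats a scraped list might arrive in.
cleaned = clean_words(["1. The", "2) of&nbsp;", "  AND "])
print(cleaned)
```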

I output the clean file as csv and imported it into Excel, so I could get a count on the number of times specific words appeared. This helped me decide which words I’d do the find/replace on later on. I also needed to look at a few dictionary sites to make sure I replaced words that could only be used as one class of word, rather than more than one (ie adjective, noun, verb) – more than one messes up the syntax/form of the sentence.

Then I created a new Yahoo pipe, which had 2 text input boxes for title & synopsis. I added find/replace modules & manually entered words that needed to be replaced, along with text that replaced it.
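The find/replace stage is simple to mimic in Python. This is a sketch of the general technique rather than the pipe itself: the replacement table is invented, and matching on whole words only (so ‘day’ doesn’t change ‘days’) is my own assumption about what you would want.

```python
import re

# Invented replacement table; the real pipe has its own manually entered words.
REPLACEMENTS = {"house": "crypt", "friend": "ghoul", "day": "night"}

# One pattern matching any listed word as a whole word, ignoring case.
PATTERN = re.compile(
    r"\b(" + "|".join(map(re.escape, REPLACEMENTS)) + r")\b",
    re.IGNORECASE,
)


def twist(text):
    """Swap each listed word for its horror equivalent, keeping capitalisation."""
    def swap(match):
        word = match.group(0)
        replacement = REPLACEMENTS[word.lower()]
        return replacement.capitalize() if word[0].isupper() else replacement
    return PATTERN.sub(swap, text)


twisted = twist("One day a friend visits the house. Friendly days follow.")
print(twisted)
```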

Werewolf by Schnaars (Flickr)


Still, at this stage, some words didn’t work. Some of the replacement words didn’t work either. This is partly because I hadn’t thought too much about the type of text that would work with the replacement. For example, synopses seem to talk more about ‘he’, ‘she’, ‘it’, rather than ‘I’, ‘me’ and this affects the way that you need to deal with the whole word replacement style.

I also worked out that, although replacing common words is a good idea in principle (you’ve got a better chance of hitting words that can be replaced out of the 171,476 words in common use in the English language, according to the Oxford English Dictionary), most synopses actually try to avoid cliched or common words.

It still needs tweaking and its presentation still needs prettifying (or horrifying ;-) ), but here’s the pipe for people to have a look at. All you need to do is enter a book title and synopsis into the boxes. I’d be interested in the output from anything you paste into the pipe, as I’d like to see how the pipe works on a wide variety of synopses. Maybe anyone who uses it could cut/paste the output of the pipe as a comment to this post. I know it needs work.