MeaningCloud – Extracting Meaning from Content

I received an email the other day to inform me that Textalytics had changed its name to MeaningCloud. This was really handy, as it reminded me that it existed – I’d signed up for it during the summer, but hadn’t had much of a chance to use it, as I was working on the Library A to Z.

Anyway, it’s something I want to explore more. It’s an online service that analyses the content of text, documents or web pages you supply it with and highlights key subjects, people, places, things and other entities and concepts for you. As a librarian (specifically with my classification head on) I’ve been interested in the idea of automated classification for some time and have tried various experiments, including using Yahoo Pipes and my recent WordPress snapshot cards, to extract meaning from text.

I’ve tested a few other online services like MeaningCloud, but this was the one that seemed the most straightforward and easy to use. The documentation is clear enough for me to understand all I need to and, as I have only really got my head around working with XML output, having this as one of the 2 output options is important to me. It also helps that it’s free up to a certain amount of use.

The way it works is you submit a URL containing all the key parameters to the online service:

  • The text, document file, or URL of the web page you want to analyse.
  • The type of results you want returned to you (eg sentiment – positive/negative/neutral; text classification – very broad categories such as “libraries and museums”; topic extraction – more detailed subjects and concepts).
  • The output format (eg JSON or XML).

You can specify more than this and you can define topic dictionaries that are used.
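
As a rough illustration of what "a URL containing all the key parameters" looks like, here is a minimal Python sketch. The endpoint address and the parameter names (key, txt, type, of) are placeholders made up for this example, not MeaningCloud's actual API, so check the service documentation for the real ones. It only builds the URL and doesn't send anything.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical endpoint and parameter names, for illustration only.
BASE_URL = "https://api.example.com/analyse"


def build_request_url(api_key, text, analysis="topics", output="xml"):
    """Return a request URL carrying the key parameters as a query string."""
    params = {
        "key": api_key,    # account key (assumed parameter name)
        "txt": text,       # the text to analyse (assumed parameter name)
        "type": analysis,  # eg sentiment, classification or topics
        "of": output,      # output format: xml or json
    }
    # urlencode escapes spaces, ampersands and so on for us.
    return BASE_URL + "?" + urlencode(params)


url = build_request_url("MY_KEY", "Libraries & museums in Leeds")
print(url)

# Decoding the query string again shows nothing was lost in the escaping.
print(parse_qs(urlsplit(url).query)["txt"])
```

Letting urlencode do the escaping matters here, because blog text is full of characters (spaces, ampersands, quotes) that would otherwise break the URL.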

It then returns the information you requested to the service you sent the request from. So, in my case, it would most likely be sent via a program written in Processing. You can then do whatever you want with that response. So, in theory I can develop my WordPress snapshot cards idea to include the subjects, concepts, people, places etc that it returns.
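
To give an idea of what "doing whatever you want with that response" might involve, here is a small Python sketch that pulls entities and concepts out of an XML reply. The sample XML and its element names (response, entity, concept) are invented for the example. The real output will be structured differently, so treat this as the general shape of the job rather than working integration code.

```python
import xml.etree.ElementTree as ET

# Invented sample response; the real element names will differ.
SAMPLE_XML = """
<response>
  <entity type="Person">Andy Walsh</entity>
  <entity type="Organization">Library of Congress</entity>
  <concept>classification</concept>
  <concept>information literacy</concept>
</response>
"""


def extract_terms(xml_text):
    """Group the returned entities by type and list the concepts."""
    root = ET.fromstring(xml_text.strip())
    entities = {}
    for node in root.iter("entity"):
        # Collect entity names under their type, eg Person or Organization.
        entities.setdefault(node.get("type"), []).append(node.text)
    concepts = [node.text for node in root.iter("concept")]
    return entities, concepts


entities, concepts = extract_terms(SAMPLE_XML)
print(entities)
print(concepts)
```

The two results could then be dropped straight onto something like a snapshot card as lists of people, places and subjects.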

Even though I recognise that analysis tools don’t always pick up on the finer points of text and lack human understanding that is sometimes needed to make complete sense of a piece of text, I like what they can do, and I hope I can do something useful with MeaningCloud.

If you want to try it out, take a look at the demos and enter your own text into the box. The image below shows some of the results it gave me when I entered the text from this blog post. It threw up a couple of odd things, “begging” (request) and “boss” (head), but as I say, if you are using it properly you can take the time to set up a dictionary to overcome these sorts of issues.

Screenshot of MeaningCloud analysis of blog post

Making Games for Libraries Workshop

A few weeks ago I attended a workshop focused on making games for libraries run by Andy Walsh. The idea behind the workshop is to produce games used for information literacy instruction. As I am running a session at PI and Mash in August to introduce people to the Pocket Code programming environment and give them ideas for producing a library game, I thought it would be a good opportunity to get a bit more focused.

The workshop gave us a brief introduction to the different types of games we could make; the tools you can use; game design concepts, mechanics, goals/aims and rules; and how to progress through the design stages logically. Andy provided a range of materials (eg blank cards, dice, spinners, pens and blank boards) and we were split into small groups to actually prototype a game. I thought this was a tall order for a 4-hour session, but all of the groups managed it. It seemed that if you kept the end focus in sight it was a lot easier than I expected. As an aside, this is something I’ll be keeping in mind when I’m creating my little computer games, as they tend to go off in random directions.

My group created a prototype for a classification-based card game, the end goal being something that would improve people’s understanding of classification. It was called “Dewey or Die” and was based around the idea of collecting a set or run of similar Dewey classification playing cards.

This video explains the rules.

Here is the prototype set of lovingly created hand-drawn cards (possibly a collector’s edition in future).

Prototype cards for Dewey or Die classification game

There were 5 games prototyped during the workshop and it was interesting to see the areas the other teams focused on and how they put their games together. They can all be found on Andy’s Making Games in Libraries blog.

I found the workshop enjoyable and fun and the ideas behind it are something I’ll be using in future.

Dewey Invaders Project

A while ago I thought it would be a ridiculous idea to create a game called Dewey Invaders. In this game the player would be presented with a subject heading and also a series of Dewey numbers, one of which is related to the subject heading. The player would then shoot down the Dewey number that the subject heading referred to. If they shot down the wrong Dewey number they would be sternly corrected by the father Dewey ship. It is a “What’s the point?” idea, but I also think it would be a great training tool for game-minded classifiers. In fact, with cataloguing and classification being dropped from so many library courses it might be the most cost-effective way to train classifiers. I could sell it as an app (even though I hate the word app). What’s the going rate for an app these days? 59p! Oh, that’s the rate for a good app! You can have this for 7.5p then.
It’s not as dull as it sounds, you know! I’d put the numbers into the shape of aliens & have explosions in 3 vibrant colours. 😉 What do you mean- it still isn’t enticing?
It might sound like a daft idea, but I actually think it would keep me up-to-date with my Dewey. I’m partly a cataloguer/classifier, but most of the time I don’t need to add Dewey to records. I just need to know that a number that’s been added to a record is okay and because I enjoy the challenge and style of retro arcade games this would be a way of learning something useful while playing.
I don’t mind what I zap- It’s just pixels on a screen, so it might as well be pixels in the shape of Dewey numbers!
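
For anyone curious how little logic the core of the game would need, here is a rough Python sketch of a single round: pick a subject heading, line up some Dewey numbers as invaders, and check whether the one that gets shot is the right one. The function names and the tiny table of headings are made up for this illustration, and there are no aliens or explosions yet.

```python
import random

# A tiny illustrative table of subject headings and Dewey numbers.
DEWEY = {
    "Library and information sciences": "020",
    "Mathematics": "510",
    "Cooking": "641.5",
    "Music": "780",
}


def new_round(rng, choices=3):
    """Pick a heading and a shuffled row of invaders containing its number."""
    heading = rng.choice(list(DEWEY))
    correct = DEWEY[heading]
    decoys = [number for number in DEWEY.values() if number != correct]
    invaders = rng.sample(decoys, choices - 1) + [correct]
    rng.shuffle(invaders)
    return heading, invaders


def shoot(heading, number):
    """Return True if the number shot down matches the subject heading."""
    return DEWEY[heading] == number


rng = random.Random(1)  # seeded so a test game is repeatable
heading, invaders = new_round(rng)
print(heading, invaders)
print("Hit!" if shoot(heading, invaders[0]) else "Corrected by the father Dewey ship")
```

Keeping the round logic separate from the drawing code would make it easy to swap in a bigger table of headings later.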

Catalogue and Classify Those Tweets

The recent announcement about The Library of Congress acquiring the back archive of tweets got me thinking again about meaning in Twitter hashtags. I jokingly suggested that L.O.C. might like to classify them all. Thinking about it properly, classifying them might be useful. I’m not thinking about all tweets, but hashtags mentioned in the tweets.

If you are a library holding references to resources on your catalogue, a catalogue record that describes and links to a hashtag URL for an event (eg #jisc10) or a subject (eg #rda) might be useful. Twitter provides useful & concise information, links to resources, discussions, etc, so why not make use of that? I know you have to read some waffle too on Twitter, but you often have to read through more waffle in a book.

You could index the hashtag in the same way as your normal stock, ie with subject headings and classification codes. I know tweets are lost into the ether after a few days, but for more permanent links you could use a URL pointing to a Twitter archiving service like Twapperkeeper, if tweets about the hashtag are being stored. These services are more likely to hold onto tweets for much longer.

I wonder if this is the intention of the Library of Congress?

Knobbling the Winter Olympic Catalogue Results

In my role of ‘Keeper of the Keys to the Catalogue (once removed)’ for a public library service and ‘Man with Access to Official Twitter Account’, I thought it would be a good idea to promote some of our books around the Winter Olympics. This included trying to get a few more loans out of the curling books we bought after Team GB did so well some time ago. 🙂

I wanted to point our Twitter followers to a few handpicked books on our library catalogue, rather than a huge wodge of titles and I wanted to do it as simply and quickly as possible. However, as I tried to pull out a few relevant skiing books I knew it wasn’t going to work using any of the search methods available, despite working out different combinations of words.

In the end I realised I was trying to make the search methods work for me, when the catalogue records should be doing the work instead. As a cataloguer/classifier I’d always been taught that cataloguing/classification should be consistent. The sacred laws of UKMARC should be obeyed. I can’t argue with this as a general principle, but in some cases if you want to achieve something different, you need to do something different to make it work. As long as it doesn’t affect the end user, as far as I’m concerned it’s fine to do it. In fact, in this case, it was for the benefit of the end user that I decided to take a different angle with this.

I decided to hashtag the catalogue entries I thought would be of interest. I know cataloguers and classifiers commonly tag records anyway, but the difference in this case was that I only tagged a handful of records, rather than tagging the entire stock with these new hashtags. Using the hashtag format would indicate that these tags had a unique purpose. It’s the same idea as giving a Twitter message a hashtag only if it’s related to a particular event ( eg ‘#van2010‘ for the Winter Olympics). You don’t need to tag all of your Twitter messages and, in the same way, you don’t always need to tag all of the records on your catalogue.

Winter Olympic Catalogue Search Results

I suppose it’s like partial/filtered indexing, where you limit the results to a subset of items, based on rules you define, rather than retrieving the full set of records. If I’d just searched for ‘skiing’ for example it would have given me 208 records. I didn’t want our users to have to trawl through all of these records. Using my method I limited the results to a single page of 7 items. Anyone searching the catalogue could still retrieve the 208 skiing records if they wanted to, but my tags pointed our Twitter followers to this limited set, as a sort of mini promotion. In fact, as I only tagged about 35 titles out of the thousands of titles on our catalogue you could say it was almost micro-indexing.

I basically pre-weighted the catalogue records so that they gave me exactly what I wanted. If it was an Olympic event it might call for a stewards’ enquiry for knobbling the competitors!

The tags didn’t need to make any sense to anyone, as they’d just be used to query the online catalogue. They just needed to be unique, so the more obscure the tag the better – I didn’t want any unrelated items in the search results. In the end I created tags such as ‘#woski10‘ (skiing), ‘#woiho10‘ (ice-hockey), ‘#wotd10‘ (Torvill and Dean). There were about seven hashtags in the end.
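
The idea can be sketched in a few lines of Python. The records and titles below are invented for the example, and a real catalogue would be queried through its own search interface rather than a list. The point is only the difference between a broad subject search and a search on a tag carried by a handpicked few.

```python
# Invented records; only a handful carry a promotional hashtag.
RECORDS = [
    {"title": "Learn to Ski", "subjects": ["skiing"], "tags": ["#woski10"]},
    {"title": "Skiing Holidays in Europe", "subjects": ["skiing"], "tags": []},
    {"title": "Ice Hockey Basics", "subjects": ["ice hockey"], "tags": ["#woiho10"]},
]


def search_subject(records, word):
    """Broad search: every record indexed under the subject."""
    return [record for record in records if word in record["subjects"]]


def search_tag(records, tag):
    """Narrow search: only the records handpicked with the hashtag."""
    return [record for record in records if tag in record["tags"]]


print(len(search_subject(RECORDS, "skiing")))  # all the skiing records
print([record["title"] for record in search_tag(RECORDS, "#woski10")])
```

Because the tag is obscure and unique, the narrow search can only ever return the records it was deliberately added to.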

After running each hashtag search, I saved it as a bit.ly link (bit.ly shortens long URLs). The links were added to appropriate Twitter messages, which were scheduled to run at various times over the Winter Olympic period.

Twitter Olympic Tweets

I’ll be checking the items a few weeks after the Olympics are over to see if this has increased their use.

I’m also wondering if I could have made extra use of these hashtags via a Yahoo Pipes mashup, but I’ve no firm ideas at the moment about what would be useful. Maybe a link between books and related Team GB/Winter Olympic web pages, Flickr photos and YouTube videos would have been a good idea.