Extrapolating classification from work tags

Talk: Build the Open Shelves Classification

1 defaults
Edited: Jan 25, 2009, 8:52 am

I think it could be feasible to produce tentative work classifications algorithmically from the tags people have given them. It would avoid a whole lot of redundant effort, notwithstanding the occasional "false positive". E.g., if 100 people have tagged a work, 90 of them have tagged it "fiction", and no other top-level category comes close to that popularity, then it could be assumed to be fiction.
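A minimal sketch of that rule, purely for illustration. The tag-to-category table, the function name and both thresholds are assumptions made up for this example; they are not part of OSC or of any LibraryThing API.

```python
from collections import Counter

# Hypothetical mapping from free-form tags to top-level categories.
# A real table would be much larger and is an assumption here.
TAG_TO_CATEGORY = {
    "fiction": "Fiction",
    "novel": "Fiction",
    "non-fiction": "Non-Fiction",
    "history": "History",
    "poetry": "Poetry",
}


def tentative_category(taggers, min_share=0.9, max_runner_up=0.5):
    """Guess a top-level category from the tags people gave one work.

    taggers: list of sets of tags, one set per person who tagged the work.
    min_share: fraction of taggers the winning category must reach (assumed).
    max_runner_up: largest allowed ratio of second place to first (assumed).
    Returns the category name, or None if there is no clear winner.
    """
    if not taggers:
        return None

    votes = Counter()
    for tags in taggers:
        # Each person counts at most once per category.
        categories = {
            TAG_TO_CATEGORY[tag.lower()]
            for tag in tags
            if tag.lower() in TAG_TO_CATEGORY
        }
        votes.update(categories)

    if not votes:
        return None

    ranked = votes.most_common(2)
    top_category, top_count = ranked[0]
    runner_up_count = ranked[1][1] if len(ranked) > 1 else 0

    clear_majority = top_count / len(taggers) >= min_share
    no_close_rival = runner_up_count <= max_runner_up * top_count
    return top_category if clear_majority and no_close_rival else None


# The example from the post: 100 taggers, 90 of whom say "fiction".
example = [{"fiction"}] * 90 + [{"history"}] * 10
print(tentative_category(example))
```

A work that many people tag under two top-level categories at once gets no guess at all (the function returns None), so it would still need a human decision.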

2 jjwilson61
Jan 25, 2009, 10:31 am

But this whole feature of categorizing your books isn't really about categorizing books. It's about testing the tentative OSC classifications and whether they make sense to humans, so having the computer do the categorizing doesn't make sense.

3 tcarter
Jan 25, 2009, 10:48 am

It does make sense if all the computer is doing is aggregating the tags, which tells us how a whole load of humans actually categorise them.

4 jjwilson61
Jan 25, 2009, 10:50 am

I believe that has been done as part of coming up with the current categories.

5 tcarter
Jan 25, 2009, 1:57 pm

Well, there is this:

http://www.librarything.com/wiki/index.php/Map_of_OSC_Top_Levels_%26_Librarythin...

but I'm fairly sure that this was done manually, not algorithmically. I think that at this stage, using the info already in LT to reduce the work required would speed up the testing of the proposed top-level categories.

6 klarusu
Jan 25, 2009, 2:04 pm

I think there are a few issues with that. First, I think it is the process that is important, so the effort is far from redundant. How we place these books in the pre-defined categories (and the minefield of issues it brings up) is as important as where they go. The process of cataloguing is highlighting real issues with the system that just using tag assignments would not. Also, tags are personal and unreliable as a testing measure. Certainly my tags have nothing to do with how I would expect a book to be categorised in a real-life shelf order and were never applied with that in mind. Plus, this is for shelf-order cataloguing, and a book can only physically exist in one place, whereas tags enable various different attributes to be highlighted for each book. Even Fiction vs Non-Fiction is not that simple. Take, for example, Norton Critical Editions, which I catalogue as both Fiction (or Poetry) and Non-Fiction (for the critical element). That's just my perspective on it.

7 laena
Feb 2, 2009, 1:50 pm

Greetings! David and I have been busy compiling and analyzing all your comments, and a post with new top levels is forthcoming!

In the interim, take a look on Thingology (http://www.librarything.com/thingology) at the summary of the OSC meeting we had in Denver last weekend.