Ten Years On...

Talk: Too Obscure

1 johnandlisa
Mar 29, 2016, 10:32 am

Long preliminary throat clearing:

Things have been dormant on this group for quite a while. I hope you'll be indulgent if I "spam" your talk page for the next day or two.

Tomorrow is the tenth anniversary of our joining Library Thing. I'm looking forward to getting the shiny "X" badge. Almost immediately upon joining, I noticed the information on "obscurity" that has now been incorporated into "stats/memes" and paid close attention to how it was affected by my entries. Just three days after joining, I started making observations about where we stood in terms of those measures of obscurity (along with how the size of our library compared with others in Library Thing and which other libraries had the most books in common with us). And I kept on reporting this information as our entries passed various milestones. So when this group came into existence a few months later, I knew I would be in the company of kindred spirits.

After the first year of entries, I had reached a plateau of entering aspects of our collection and decided that I would revisit how the numbers had changed near the anniversary of our joining Library Thing. Periodically, I would add, in bunches, corners of our collection that I hadn't got to in the first year, reaching new plateaus. As the tenth anniversary of our joining approached, I made a concerted effort to add one last stubborn collection so I could do a comprehensive overview. (Technically, we've got at least one more stubborn collection that could be entered -- piano sheet music from my parents and grandparents -- which would likely drive down my median and mean obscurity a bit more. I'm not sure if I'll get to it or not.)

In retrospect, I baked a couple of naive assumptions into my original observations which it would have been nice to separate out. I was mostly interested in how my numbers would change as different subjects in my collection were entered (e.g. would my numbers go up or down when I got around to entering my baseball books?). But the growth of Library Thing itself clearly swamped any changes that came from different characteristics of our collection. I also treated my own collection strictly as "books we own now." If we get rid of a book (not a common occurrence, but it does happen), I remove it from our collection on LT. But I realize that lots of people use LT as a wish list or list of books read but not owned and there's no way of knowing if a "shared" book is of the former or latter variety. Also, the development of Legacy Libraries turned some of my very obscure books into shared books. Still, I think I made some smart choices too, that give a good indication of how both our library and LT have been changing, such as, for instance, noting how many copies of a "popular" early entry there were.

So, spam... I'm going to post sequentially in this topic the observations I made at our own page about our obscurity numbers and other things over the years. I hope that some of you find the evolution interesting. Gasp at how much the addition of Harry Potter to our collection affected our mean!

Tomorrow, I plan to do a comprehensive tenth year overview.

2 johnandlisa
Mar 29, 2016, 10:33 am

On April 2, 2006, I started my observations on obscurity with the following comment:

So we've just broken the 100 mark in cataloging.

I've started with my working collection. It allows us to have some fun with the various widgets of Library Thing. My strategy of cataloging is to make entries directly from my bookshelves, but with an eye to keeping my library obscurity number as low as possible for as long as possible (it's at 1 for the first 100 books). So far, that hasn't required any special finagling because I know that my research specialty is an obscure one. My most widely held book so far is held by six other users. And no one holds more than one book in common among my first 100. From a quick scan of other people's lists, I think I can keep my number down to one or two until I start cataloging my British stuff. And by then I should have built up a pretty big cushion.

I'll save my Harry Potters for last.

John

3 johnandlisa
Edited: Mar 29, 2016, 10:42 am

On April 15, I got a welcome message from Tim Spalding which prompted the following thoughts.

My thanks to Tim for stopping by and leaving a comment. Perhaps he gets a ping when a library reaches the 500 mark. That's what's happened with us. Time to take stock of where we are so far...

The library obscurity widget changed a little while I've been adding things. Now it distinguishes between mean and median obscurity. My median is holding steady at 1 and should stay there for a while. My mean shot up to 2 when I added Bainton's Here I Stand and Diarmaid MacCulloch's Reformation in rapid succession. I don't imagine I'll ever get it back to 1 again. It seems to be holding fairly steady at 2 for now, even though I've added a few books that are about as popular as MacCulloch's. It's been interesting watching Here I Stand grow since I first entered it. At first it was shared by 55 libraries; it's now in 60.
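For anyone unfamiliar with the distinction, here is a minimal sketch of why a couple of widely held books move the mean but leave the median alone. The holder counts are made up for illustration, and treating obscurity as a simple mean or median of the number of members holding each book is my assumption, not LibraryThing's documented formula.

```python
from statistics import mean, median

# Hypothetical holder counts: 98 books held by one member each.
# These numbers are invented for illustration only.
obscure_books = [1] * 98

# Add two widely held titles (counts loosely modeled on the 55-60 range above).
with_popular = obscure_books + [55, 60]


def summarize(holders):
    """Return (median, mean) of the per-book holder counts."""
    return median(holders), mean(holders)


before = summarize(obscure_books)
after = summarize(with_popular)

print("before:", before)
print("after: ", after)
```

In this toy library the median stays at 1 while the mean roughly doubles, which is the same pattern the widget showed.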

I have no idea how many books we have total in our collection. I now feel pretty confident we'll make it onto the largest libraries list if we put everything in, even though that's a receding horizon. I have yet to finish all of my books in early modern Europe, much less my modern history and non-European history.

I've enjoyed playing with the various lists of shared books. At the moment, the person who shares the most books in common is bibliophiles (with 28 right now). The person with the most shared books has changed every so often (for a while it was pobanion, who describes himself as an early modern historian, so the overlap was no surprise), but bibliophiles has a pretty solid lead now. I think the top spot is only likely to change when I start moving in a different direction in the collection -- and perhaps not even then.

I thought that the large number of foreign language books in my collection would give me an especially large cushion of unique titles. But I have discovered that Library Thing has an international user base. Several of my German titles are shared by users in Europe.

I've been using what I call the "Crooked Timber Gang" as my touchstone for comparisons on things like library obscurity. The gang is users I can identify as academics from blogs like CT, many at one or two degrees of separation. I notice a number of them amongst the shared books list already. I also think I've found at least one user I know in real life.

So it's been fun so far playing with the widgets. I'll update periodically when new fun things happen.

Next on my list to add are modern German history, general early modern European history, intellectual history, and military history.

4 johnandlisa
Mar 29, 2016, 10:41 am

May 30, 2006: (Note how uncertain I was about how many books we actually owned! And how far off I was!)

So I have now passed the 1000 mark. My library obscurity numbers have not moved much. Goethe's Faust pushed the mean up to 3. The median remains at 1. Bibliophiles retains the title of the most titles in common, now with 84. I still have a little ways to go with our working library in the study. The remaining titles are in U.S. and British history, so the median may make it up to 2 by the time I'm done here. The mean may also move up a notch, but most of my standard primary texts are downstairs, so I won't have large leaps from works like The Prince or Candide.

I think I've got more than 3000 books in all, but the horizon is receding fast enough that I'm not sure I'll make it to the top 50 largest libraries list.

5 johnandlisa
Mar 29, 2016, 10:45 am

June 6, 2006:

We're at the 1400 mark and it's time to take stock again. We've now finished with our "working collection" in the study. We could go many directions from here. We have not yet cataloged the working collection in Lisa's office at work. Nor have we tackled the lion's share of the classic texts that we often assign in intro classes (e.g. Hobbes, Locke, Machiavelli, Rousseau). I'm sure when we tackle those our mean obscurity number will shoot up. Another direction we could go is to start on our art history collection, which is strongly oriented towards our research specialties.

Our obscurity ratings are still at median 1, mean 4. The most popular book we've cataloged so far is Erik Larson's Devil in the White City. At the top of our shared books list is bibliophiles, who has just crossed the three digit mark at 101. Second is Ellenandjim at 79 and third is Meburste at 59. Academics with an inclination for history clearly dominate the list.

I'll get back to entering books in a month or so. For now, real world work is interfering.

6 johnandlisa
Mar 29, 2016, 10:46 am

July 29, 2006:

We've just passed the 2000 mark. Here are some observations:

We've now got a median obscurity of 2 and a mean of 19. The mean has grown substantially in the last couple of days as I have started adding classic fiction we have in hardbound. I expect the mean to keep going up for a while. I could probably drive the median back down to one for a while, because we still have several pockets of obscure books, but I can't see that I'll be able to keep it there. On the other hand, I'll be surprised if our median obscurity goes up to 3.

Roland Bainton's Here I Stand has grown to 100 users (from 60 in April).

Bibliophiles remains the user with the most books in common at 124. ellenandjim and meburste remain second and third. The stability of their positions is pretty remarkable. There are now fifteen users with at least 50 books in common with us. Of those, only 5 (bibliophiles, pomonomo2003, lycanthropist, jgarrig, and pobanion) have us as one of the 100 users with the most books in common with them. I find that also pretty remarkable and wonder if that kind of pattern is common. Library size cannot be the only explanation for the imbalance, since two of the libraries closest in size do not include us amongst their top 100, while one that does is quite a bit bigger than us.

I've identified a group of about twenty libraries owned by academics in the humanities and social sciences which I am going to start following in more depth going forward.

7 johnandlisa
Mar 29, 2016, 10:48 am

September 7, 2006:

So, we are now past 3000 books. When we first started with LT, that would have been enough to get us on the list of top 50 libraries. Now it is enough to get us into the top 100, but we would need to break 4000 to be in the top 50. I still think there is a chance we can make it. I'm sure we have at least 1000 more books to go, especially if we add the books in Lisa's office.

To pick up some earlier themes: ellenandjim now has the most volumes in common at 231. Bibliophiles is second at 193. Our median and mean obscurity are now 2 and 38 respectively. The mean puts us near the norm for general academic libraries. The median is still comparatively low. I am reasonably confident we'll keep it at 2 until all of our books are entered. The 100 most similar libraries now go down to 54 books in common.

The books we have yet to enter fall into a few big categories. In addition to Lisa's office and the books in our children's rooms we have an extensive collection of travel guides, poetry and drama, and paperback fiction.

When I first started observing similarity scores I thought that academic libraries would dominate the most similar list by now. But in fact, the apparent academicness of the most similar libraries has declined since the first 500 or so entries. I have identified twelve libraries from academics at one or two degrees of remove from the Crooked Timber blog. Only three, meburste, chrisbrooke, and TimothyBurke are now in my top 100 list. There are twelve other libraries I can identify with certainty as academics. Seven of them remain in my top 100 list, including the two grad students whose libraries came out as most similar in the earliest stages of entry: pobanion and AlextheHunn. Nine of the 24 libraries belong to historians. I note that the strongest clustering effects seem to be around intellectual history/political theory/philosophy. Someone like jcherniss has an unusually large concentration of the 24 libraries.

8 johnandlisa
Mar 29, 2016, 10:50 am

October 21, 2006:

Another milestone reached. We're now at 4002 books. At this point my uncertainty about reaching the top fifty libraries seems kind of silly, because we must have at least another 1500 books to go. I'm in the midst of entering our children's books currently in the family room and basement. There are still the books in our kids' rooms plus the books in Lisa's office, an abundance of travel guides, cookbooks, and a few other random books. And then I can decide whether to include the sheet music, which we also have in quantity.

I've deliberately held off on entering Harry Potters until now. I'm going to enter them next to check the impact on my library obscurity.

In the meantime, here's an update on stats I've been following:

Current median and mean: 3/65. I underestimated the effects of new users on the median obscurity. It's not just that a higher proportion of my recent entries have had multiple users, but some of the old entries also got cancelled out by newer entries of others.

The averages seem fairly consistent for similar types of libraries. Ellenandjim, for instance, is 5/78; meburste is 4/74.

I'd say about 35% of our books are not shared by anyone.

The top three shared libraries are now ellenandjim (333), eromsted (275), and debweiss (275). Former leader bibliophiles is now fifth at 230 and meburste is now ninth at 208. Our top 100 matches conveniently go down to 100 books shared. Several standbys from earlier top 100s have dropped out. No more pobanion, jgarrig, TimothyBurke, or chrisbrooke. Lycanthropist and pomonomo2003 have drifted down to the middle of the pack.

Of course, now the profile page lists the weighted similar libraries as default rather than the raw number. The top three weighted are bibliophiles, alex19, and ellenandjim. I don't quite understand how the weighted scores are determined. Some things about it are curious. For example, alex19 is our second closest match, but we are not one of his top 100 matches. We're in the middle of the pack of matches for bibliophiles and ellenandjim.

Less striking, but also notable is how far down the list we are on raw score for our top matches. We are about 50th on ellenandjim's list, about 70th on eromsted's and about 85th on debweiss's.

So virtually all of the peer group I identified before has fallen out of my top list of shared raw numbers and many are not in my list of weighted. The weighted list nevertheless more closely resembles the original list of similar libraries.

Oh, and Bainton's Here I Stand now has 144 entries in Library Thing.

Another number in fun statistics that has started to interest me is the average date of publication of the books. Ours is 1971, which seems to be in a common band for academicish libraries with a lot of older books. Meburste is 1975, lycanthropist 1970, ellenandjim 1971, bibliophiles 1977. The oldest library I've noted is languagehat, 1963, though I think that is because the books are entered by original date of publication rather than the given edition's date of publication.

I'll be back shortly after I've added Harry Potter, Lord of the Rings, and a couple of other very popular books to note their impact.

So, after entering Harry Potters and Lord of the Rings, our library's mean obscurity went from 65 to 79.

9 johnandlisa
Mar 29, 2016, 10:55 am

Happy New Year (2007)!

We've come to another moment to take stock. We're now past the 5000 mark -- 5228 to be exact -- which finally gets us into the top fifty libraries. We're also in the top fifty taggers. We're not done yet, but the areas we have left may take a while to enter.

In the meantime, here is where we stand with our stats.

Our current median and mean are 4 and 105. The average date of publication is now 1975.

The top three users with our books by weighted average are bibliophiles, RonKaplanNJ, and amboles. Our top three in raw total are ellenandjim, debweiss, and eromsted with 377, 336, and 329 books in common, respectively. The 100th library in raw total now has 136 books in common.

The weighted average libraries are an interesting mix. Bibliophiles has been one of our closest matches from the very beginning. RonKaplanNJ and amboles make it to the list because of very heavy concentrations on one aspect of our collection. RonKaplanNJ has an abundance of baseball books which connect with our collection. Amboles has a bunch of Tintins and Asterixes. The next few entries also range widely. They include ellenandjim, but also aprille, which rises so high on the basis of lots of children's books. It's an interesting assortment.

I'll get back to relationships between user libraries in a subsequent post.

And to return to another running theme, Bainton's Here I Stand has grown to 209 copies in Library Thing. Here's the development:

April: 55
May: 60
July: 100
October: 144
January: 209

That's it for now.

10 johnandlisa
Mar 29, 2016, 10:56 am

Jan 25, 2007: (Around the time I joined Too Obscure, I think)

A landmark has been passed for Library Thing as a whole. The top fifty libraries now all have more than 5000 books. We're still in, but I don't know how much longer it will last. But even if we bow out temporarily, we've got enough left to enter that we'll get back in eventually.

11 johnandlisa
Mar 29, 2016, 10:58 am

And then the shift to anniversary observations:

Tomorrow marks the first anniversary of our joining Library Thing. It's been lots of fun.

Alas, we've been bumped out of the top 50 libraries for now. We've got 5255 books, which puts us in 58th place according to Zeitgeist.

Our obscurity scores are now median of 5 and mean of 144. I blame the rapid growth of the mean on our Harry Potters. Roland Bainton's Here I Stand now has 293 Library Thing Owners.

Since we're now at our anniversary, I thought I would be a bit more explicit about the other libraries I've been following in depth. I joined the Too Obscure group and should probably post this there -- but since I've started all this ruminating here, I'll continue for a while.

I first learned about Library Thing from an entry at the blog Crooked Timber. I noticed other users who I could identify as associated with blogs and began to note which ones tended to cluster as most similar libraries with others. I began to note the total number of books in the library, the median and mean obscurity score, the average date of publication, and the raw totals of the first and last entry of the users with their books. In addition, I kept a roster of which of the selected users turned up in the top 100 of weighted books in common. Unfortunately, these things change frequently and take a fair amount of mindless effort to keep up. But I've put some of the information on a spreadsheet as a snapshot at specific moments.

My research group has three subgroups: 12 academic bloggers I recognize from the blogosphere: chrisbertram, chrisbrooke, kieranhealey, jtlevy, TimothyBurke, mcmoran, joshcherniss, sdarwall, cshalizi, williamdorr, meburste, and ranaverde. 12 self-identified academics who I do not recognize from blogging (though one or two turn out to have blogs): ellenandjim, markell, sylphette, jgarrig, michaelbancroft, geoffmiles, pobanion, alexthehunn, abvr, jaybernstein, fledgist, and alex19. And 5 big libraries that have lots of books in common with us and seem very academic, but do not identify themselves as academics: bibliophiles, pomonomo2003, lycanthropist, debweiss, and jfclark.

Ten of these libraries were on my list of the weighted users with my books: bibliophiles, ellenandjim, pomonomo, lycanthropist, debweiss, jfclark, michaelbancroft, alex19, jgarrig, and mcmoran. jgarrig and mcmoran have fewer than 1000 books, but the rest are all pretty large.

Of the twenty, the one that stands out most for obscurity is lycanthropist. In January, his stats were:
Total Books 7152, Median obscurity 2, Mean obscurity 24, average publication date 1970, Top user in common 390, Bottom of top 100 users in common 78.

By contrast, the one that is least obscure of the twenty is sylphette. Her stats were:
1847, 64, 347, 1997, 505, and 188.

In this peer group our total number is fairly high, our median low, our mean near the middle, our average date of publication low, and the top and bottom of our top 100 users near the middle.

Among the interesting tidbits I noticed in researching this is that ellenandjim and meburste now have more than 1000 books in common. Pretty impressive!

With that, I'll relaxedly celebrate our anniversary here at LT. And eventually we'll get back to the 1000 or so books we still need to enter to get us back to the 50 largest libraries.

12 johnandlisa
Mar 29, 2016, 11:00 am

We've just passed our second anniversary on Library Thing.

The past year was a bit slower than the previous. Here's where the stats I've been tracking are now in comparison to last year:

                            Last Year                   This Year
# of books                  5255                        5447
Median                      5                           12
Mean                        144                         310
Top User in Common          390                         413
100th User in Common        78                          197 (157 is last user's raw score listed)
Publication Date            1975                        1975

% Unshared                  21.9                        19
Book 10% from bottom        1                           1
Book 10% from top           333                         494
Title                       Trout Fishing in America    Cultural Literacy

# Users with Here I Stand   293                         565

Most of the changes in rates come from the addition of new users to Library Thing. The 200 or so books we've added in the past year probably did little to change the overall rates on their own.

13 johnandlisa
Mar 29, 2016, 11:01 am

Today is our third anniversary at Library Thing.

As we did at the first two anniversaries, I will review our stats and how they have evolved. I rather naively thought that I would be able to see patterns in how our library's stats changed as we shifted from one aspect of our collection to another, but it's now clear that the effects of the growth of Library Thing as a whole swamp the changes caused by our adding new books, especially once we entered our Harry Potters.

Here's where we stand today (And I'll redo these stats again in comparison to our first two anniversaries below):

Library Size: 6714 (Zeitgeist rank 103)
Median: 21
Mean: 419
Top User in Common: 563 (bibliophiles)
Last User in Common Listed raw: 318/255 (antimuzak/bfrank) (the second number is the last listed when you click "more")
Average Date of Publication: 1976
Percent Unique Titles: 14%
LT Copies of 10th Percentile Book: 1
LT Copies of 90th Percentile Book: 697 Bully for Brontosaurus
Copies of Here I Stand: 824

On our first anniversary I started tracking 25 libraries of interest to see where they turned up on our "members with our books." At the time, ten of those libraries were on the weighted list. Seven of those ten are still there (bibliophiles, ellenandjim, jgarrig, pomonomo, debweiss, alex19, and meburste). One that I was tracking back then has now emerged on the list (pobanion). Two others dropped below the fold (they appear if you click "more": jfclark and mcmoran). Two that I was tracking have turned up below the fold (Markell and AlextheHunn). And two have disappeared from my list (lycanthropist and michaelbancroft, the latter of whom went private).

Here are comparative stats for our first three anniversaries at LT:

Anniversary:            1st     2nd       3rd
Library Size:           5255    5447      6714
Zeitgeist Rank:         58      unknown   103
Median:                 5       12        21
Mean:                   144     310       419
Top Match:              390     413       563
Last Match:             78      197       318
Last Match ("more"):    na      157       255
Pub Date:               1975    1975      1976
% Unique:               22      19        14
10th Percentile:        1       1         1
90th Percentile:        333     494       697
Here I Stand:           293     565       824

As you can see, our library didn't grow very much between our first and second anniversaries, but it has grown substantially in the past year. But a couple of indicators suggest that most of the change in the comparative stats is caused by the growth of LT rather than our adding more common books in the past year. The first indicator is the growth of a representative book, Here I Stand, in comparison to the other figures. Also, the positioning of our 90th percentile book has only changed a little. This year it is Bully for Brontosaurus, ranked 671 with 697 copies. Last year's 90th percentile book Cultural Literacy is now ranked 681 with 678 copies, while our first year's 90th percentile book Trout Fishing in America is now ranked 700 with 648 copies.
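To make the comparison concrete, here is a rough sketch that computes how much each figure grew between the first and third anniversaries. The numbers are copied from the comparison for the first three anniversaries; the dictionary layout and function name are just my own illustration.

```python
# First-, second-, and third-anniversary figures from the three-year comparison.
stats = {
    "Library Size": [5255, 5447, 6714],
    "Median": [5, 12, 21],
    "Mean": [144, 310, 419],
    "Here I Stand": [293, 565, 824],
}


def overall_growth(values):
    """Ratio of the last recorded value to the first."""
    return values[-1] / values[0]


growth = {name: round(overall_growth(v), 2) for name, v in stats.items()}

for name, ratio in growth.items():
    print(f"{name}: x{ratio}")
```

The library itself grew by well under a third, while the mean and the copies of Here I Stand both nearly tripled, which fits the idea that LT's growth is doing most of the work.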

That's it for this year. I'll be interested in what the next year brings.

14 johnandlisa
Mar 29, 2016, 11:05 am

Today is our fourth anniversary at Library Thing! Time flies when you're having fun.

I will follow the same pattern as last year for presenting the changes in my key stats. First, I'll give the information just for this year; later on I'll give year by year stats for all four years. I'll put my general comments on trends in the space between the first and second lists.

The key stats for right now:

Library Size: 6785
Zeitgeist Rank: 176
Median: 28
Mean: 543
Top User in Common (raw): 672 aquaticus
Last User in Common (raw): 346 pomonomo2003
Last User in Common (more raw): 280 rundlettmiddle
Publication Date: 1976
Percentage Unique: 12
10th Percentile Book: 1 (many different books)
90th Percentile Book: 881 (Henry Chadwick, The Early Church - ranked 678)
Here I Stand Copies: 1051 (ranked 611)

As you can see, our library grew only a little bit this year. We gave away about 100 books and removed them from the list, so the actual gain in new books is greater than the gain in library size.

For those of you who don't want to scroll down to find out why I list Bainton's Here I Stand, it's because it was the first book I entered that had "many" users (55 when I started), so I can use it as a marker of the general growth of Library Thing in comparison to the early days.

Every year, I've tracked 25 "libraries of interest" to see if their status on our Users in Common list changes from year to year. All things considered, the list has remained very stable. Although aquaticus is a new entry as the top user in common in terms of raw number, bibliophiles remains top user in common by weighted number, as they have been off and on since our first month at Library Thing. Bibliophiles is now third on the raw total list, with ten more in common than they had last year when they topped both the weighted and raw total lists. All 8 of the libraries of interest that turned up on the front page of our weighted users in common (bibliophiles, ellenandjim, jgarrig, alex19, pobanion, debweiss, meburste, and pomonomo2003) are still on the list this year. The two that emerged on the list if you click "more" are there too (markell and AlextheHunn). Two that were sinking last year (jfclark and mcmoran) and one that dropped off the year before (lycanthropist) appear to have dropped off permanently.

Our library's mean is very influenced by our having most of the most popular books according to Zeitgeist. We have 9 of the top 10 books, missing only The DaVinci Code. And we have two copies of the most popular book, Harry Potter and the Sorcerer's Stone. But we also seem to be acquiring more books near the top of the Zeitgeist list. This year's 90th percentile book (ranked 678), The Early Church, has 881 copies in Library Thing. Last year's, Bully for Brontosaurus, has 848 copies and is ranked 696 in our library this year; the year before's, Cultural Literacy, has 806 copies and is ranked 716; the first year's, Trout Fishing in America, has 772 copies and is ranked 737. The trend line seems obvious.

This year, I also decided to create a couple of other manufactured stats to see if they are worth following. First of all, I thought some quick numbers from the overall Zeitgeist page would be useful for comparison. When I looked, there were 1,086,837 members of LT; 50,012,841 books had been entered; 5,240,699 different works had been entered. By my calculations, that means that the average library size is 46 books (though lots of members have no books at all) and the average number of copies of an individual work in LT is 9.5.
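The arithmetic behind those two averages can be sketched in a few lines. The three totals are the Zeitgeist figures quoted in this post; the variable names are my own.

```python
# Zeitgeist totals as quoted in this post.
members = 1_086_837
books = 50_012_841
works = 5_240_699

# Average library size: books entered per member (members with no books included).
avg_library_size = books / members

# Average copies per work: books entered per distinct work.
avg_copies_per_work = books / works

print(round(avg_library_size))        # about 46 books per member
print(round(avg_copies_per_work, 1))  # about 9.5 copies per work
```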

Also, though I'm not sure what use it is going to be, I decided to find out how many books in our library met certain thresholds within the LT collection. We have 73 books that have at least 10,000 copies per book in LT; we have 633 books that have at least 1,000 copies; we have 2132 books that have at least 100 copies; and we have 4421 books that have at least 10 copies.

Here's a table that shows the change from year to year in the key categories:

Anniversary                         1st     2nd       3rd     4th
Library Size                        5255    5447      6714    6785
Zeitgeist Rank                      58      unknown   103     176
Median                              5       12        21      28
Mean                                144     310       419     543
Top User in Common (raw)            390     413       563     672
Bottom User in Common (raw)         78      197       318     346
Bottom User in Common (more raw)    na      157       255     280
Publication Date                    1975    1975      1976    1976
Percent Unique                      22      19        14      12
10th percentile                     1       1         1       1
90th percentile                     333     494       697     881
Copies of Here I Stand              293     565       824     1051

That's it for now. Thanks for reading. See you again next year.

15 johnandlisa
Mar 29, 2016, 11:06 am

Happy 5th Anniversary of our library on LibraryThing!

As has been my tradition on our anniversaries here, I've once again checked the various stats to see how much has changed over the year.

Here they are for this year:

Library Size: 6882
Zeitgeist Rank: 228
Median: 34
Mean: 643
Top User in Common (raw): 752 aquaticus
Bottom User in Common (raw): 374 aguenin
Bottom User in Common (more raw): 303 markell
Publication date: 1976
Percentage Unique: 11.2
10th percentile book: 1 (many)
90th percentile book: 1051 (J.S. Mill, Utilitarianism -- ranked 688)
Here I Stand Copies: 1283 (ranked 610)

Overall, we added about 100 books during the year, most of them in the past couple of days. So almost all changes in our statistics are caused by changes in LibraryThing's total population, not shifts in our own library. And basically, things have stayed pretty much the same as last year. The positions of the 25 "libraries of interest" in weighted Users with Our Books haven't changed, except that meburste is now the first to pop up if you press "more" instead of being one of the last that appeared in the front group. The most remarkable stability remains bibliophiles, who came in first (with 28 books in common) when we first bothered to note it back in April, 2006 and remains in first in weighted average today.

The other semi-processed stats from last time didn't really add anything to the discussion. But to help comparisons with last year I will note that the total number of users on LibraryThing is now 1,310,954, the number of books is 61,097,862, and the number of titles is 5,990,111.

Here's a table of the stats for all five years:

Anniversary                   1st     2nd     3rd     4th     5th
Library Size                  5255    5447    6714    6785    6882
Zeitgeist Rank                58      ukn     103     176     228
Median                        5       12      21      28      34
Mean                          144     310     419     543     643
Top User In Common            390     413     563     672     752
Bottom User in Common         78      197     318     346     374
Bottom User in Common More    ukn     157     255     280     303
Publication Date              1975    1975    1976    1976    1976
Percentage Unique             22      19      14      12      11.2
10th Percentile               1       1       1       1       1
90th Percentile               333     494     697     881     1051
Here I Stand Copies           293     565     824     1051    1283

Here's the list of the current number of copies and rank in our library for all five years' 90th percentile books:

Title                       Copies   Rank
Trout Fishing in America    902      750
Cultural Literacy           926      737
Bully for Brontosaurus      946      729
The Early Church            1036     694
Utilitarianism              1051     688

As you can see, there is a slight drift downward, presumably because we have added a handful of very popular books in and among the mass of ordinary books each year. According to Zeitgeist, the 1000th most popular book in all of LibraryThing is Goethe's Faust. Faust is the 285th most popular book in our library.

That's it for our fifth anniversary. I'll try to be back again next year.

16johnandlisa
mar 29, 2016, 11:08 am

I posted this on April 21, 2011, but never followed up on these stats as promised:

Oh Good. LibraryThing has just rolled out a whole new set of stats for libraries: Physical dimensions http://www.librarything.com/profile/johnandlisa/stats/physical

It looks as if the metadata allow them to identify page length, weight, size, etc. for about 2/3 of the books in our collection. Since I just learned about them, I'll put down the current data for the most conspicuous categories. Next year when I do our stats, I'll also check on how much these have changed.

First, how many of our books are actually included in the dataset?

The most inclusive category is page length. 4635 (of 6882) are listed at the "book level," 26 at the ISBN level, and 2221 have no data.
The least inclusive category is book thickness. 3737 are listed at the book level, 157 at the ISBN level, and 2988 have no data.

Overall, our books would make a stack 570.5 feet high, between the height of the Washington Monument and the Gateway Arch.

The books with usable data have a total of 1,435,334 pages. The mean number of pages per book is 307.95. The median number of pages per book is 279. Extrapolating to all 6882 books in the collection produces an estimate of 2,119,281 pages total.
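The extrapolation is plain proportional scaling, so it can be re-derived as a quick check. Here is a minimal sketch in Python. It assumes the mean was taken over the books with page data at either the book level or the ISBN level (4635 + 26), and that the books without data are about average. The variable names are mine, not anything LibraryThing provides.

```python
# Figures quoted in the post; the variable names are illustrative only.
pages_with_data = 1_435_334   # total pages across books that have page data
books_with_data = 4635 + 26   # book-level plus ISBN-level entries (assumed denominator)
library_size = 6882           # every book in the collection

# Mean pages per book among the books that have data.
mean_pages = pages_with_data / books_with_data

# Scale up to the whole library, assuming books without data are average.
estimated_total = mean_pages * library_size

print(f"mean pages per book: {mean_pages:.2f}")
print(f"estimated pages for the whole library: {estimated_total:,.0f}")
```

The estimate is only as good as the assumption that the roughly one third of books with no page data look like the rest.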

I don't imagine the core numbers will change much. I'm curious how they compare to our list of comparable libraries.

17johnandlisa
Edited: mar 29, 2016, 11:12 am

Family events interfered with my posting my annual state of the stats on the precise sixth anniversary of our joining LibraryThing. I don't think six days late will have too great an effect, though.

Here are the most recent stats. To compare with previous years, scroll down within the post. To figure out the rationale for some comparisons scroll through posts from earlier years.

Library Size: 6976
Zeitgeist Rank: 290
Median: 40
Mean: 718
Top User in Common (raw): 823 Michaelg16
Bottom User in Common (raw): 393 EmScape
Bottom User in Common (more raw): 335 Hobus
Publication Date: 1976
% Unique: 10.2
10th Percentile Book: 1 (many)
90th Percentile Book: 1173 (Katz und Maus ranked 698)
Here I Stand Copies: 1456 (ranked 617)

Library Thing Users: 1,513,521
Books Catalogued: 71,426,286
Different Works: 6,765,349

Observations: We added a net of 94 books this year. The growth of obscure books in Library Thing means that by next year, fewer than 10% of our books will be unique. Our weighted users in common that we have been following since the first year have not changed much in the past year, except that AlextheHunn no longer appears if you press "more." Six appear on the main weighted list, three others when you press more.

The current number of copies and rank of the 90th percentile books from all six years:

Trout Fishing in America 985 778
Cultural Literacy 1020 755
Bully for Brontosaurus 1042 746
The Early Church 1174 703
Utilitarianism 1181 696
Katz und Maus 1173 698

Comparisons for all six years:

Size: 5255 5447 6714 6785 6882 6976
Zeitgeist Rank: 53 ukn 103 176 228 290
Median: 5 12 21 28 34 40
Mean: 144 310 419 543 643 718
Top User in Common: 390 413 563 672 752 823
Bottom User in Common: 78 197 318 346 374 393
Bottom User in Common (more): ukn 157 255 280 303 335
% Unique: 22 19 14 12 11.2 10.2
90th percentile: 333 494 697 881 1051 1173
Here I Stand: 293 565 864 1051 1283 1456

See you next year!

A curious additional factoid.

Last year, the 1000th most popular book in Library Thing according to Zeitgeist was Goethe's Faust, which was the 285th most popular book in our collection. With the addition of popular new books, Faust has drifted below the top 1000. The new 1000th most popular book is an interesting one: The King James Bible. And it is currently the 286th most popular book in our collection. It's interesting 1) that we have had the 1000th most popular book in all of Library Thing each of the past two years, and 2) that it has ranked in nearly the same spot in our own collection each time.

18johnandlisa
mar 29, 2016, 11:14 am

It's our seventh anniversary on Library Thing. Time for the traditional reflections on our stats.

First, the numbers. Observations, thoughts, elaborations will follow in a separate post.

This year:

Size: 7081
Zeitgeist Rank: 346
Median: 45
Mean: 794
Top User in Common: 846 (Michaelg16)
Bottom User in Common: 414 (Clio12)
Bottom User in Common (more): 356 (nhemme)
Publication Date: 1976
% Unique: 9.8
10th percentile: 2 copies (many) rank 6373
90th percentile: 1312 copies (How to Lie with Statistics) rank 708
Here I Stand Copies: 1652 rank 612

Library Thing Users: 1,656,349
Books Catalogued: 79,895,042
Different Works: 7,457,254

Past 7 years' 90th percentile books copies and rank this year:
Trout Fishing in America 1049 802
Cultural Literacy 1086 790
Bully for Brontosaurus 1085 791
The Early Church 1299 712
Utilitarianism 1295 714
Katz und Maus 1254 732
How to Lie with Statistics 1312 708

All 7 Years in One Table

Size 5255 5447 6714 6785 6882 6976 7081
Zeitgeist 53 ukn 103 176 228 290 346
Median 5 12 21 28 34 40 45
Mean 144 310 419 543 643 718 794
TUIC 390 413 563 672 752 823 846
BUIC 78 197 318 346 374 393 414
BUIC more ukn 157 255 280 303 335 356
% Unique 22 19 14 12 11.2 10.2 9.8
90th % 333 494 697 881 1051 1173 1312
HIStand 293 565 864 1051 1283 1456 1652

Seventh anniversary observations and elaborations:

In previous years, I've added comments (along with some additional numbers) in the course of presenting the regular numbers that get summarized in a table. I'll try to pick up on some of the themes of those comments in this post to go along with the previous one.

We added a net of 115 books last year.

Just two weeks after joining Library Thing I started paying attention to two things: Who was at the top of the list of "Members with your books" and a group of mostly academic interesting libraries to track. The first time I noted who was at the top of the list, it was bibliophiles, with 28 books in common. I continued to be amazed that over all this time, bibliophiles is still at the top of our weighted list of members with our books. They're also eighth on our raw list of members with our books.

At one point I created a grid to chart relationships between the 25 libraries I started to track, but that became a bit overwhelming. Instead, I've continued to see how many of the original 25 turn up on our list of members with my books. Six of them remain among the top group of weighted members with our books: bibliophiles (1), pobanion (10), ellenandjim (14), jgarrig (20), alex19 (35), and pomonomo2003 (39). Three more turn up when you click on "more": debweiss (46), meburste (56), and markell (69). That's unchanged from last year (though I'm not sure how much their ranks have changed from previous years, since I just started tracking that).

A feature that always interested me was how these lists related to one another. Although I don't have a systematic grid any more, it's clear that someone like markell's list of members with his books includes many more of the libraries being tracked than ours does. Bibliophiles ranks in the upper ranks of several of the tracked libraries. Also, there's almost no symmetry between who is at the top of our list and where we stand on other people's lists (reciprocals). For example, we rank 37th on bibliophiles' list. Of the other top 5 comparable libraries, we are "below the more" for aquaticus, ranked 17th for bwogilvie, below the more for chuck_ralston, and ranked 30th for drsabs. I can only find two libraries where we rank higher on their list than they rank on ours: Illiniguy71 ranks 28th on ours, we rank 9th on his, and RonKaplanNJ ranks 23rd on ours, we rank 21st on his.

I also have noted frequently how stable the list of most similar libraries is, even though Library Thing has been growing rapidly. I checked on the date of creation for our five most similar libraries. The oldest dates from October, 2005, the newest from August, 2008. Of the 49 libraries that appear when you click on members with your book recent, the most recent to join (Darblibrary) is from January 12, 2013. The oldest (booksbobbi) dates from August 25, 2011. Only three also appear on our weighted list, homeless, JonDaniels, fsuflorencelibrary. JonDaniels ranks highest of those at 18th.

In 2007, I tried to find the libraries among the 25 I tracked that had the smallest and largest averages for books. They were lycanthropist with 7152 books, 2 median, and 24 mean, and sylphette with 1847 books, 64 median, 747 mean. This year, their comparable numbers are lycanthropist 7528, 10, 157, sylphette 1635, 298, 2347.

There are a couple of disruptions from previous years. As predicted last time, this is the first survey where fewer than 10% of the books in our collection are unique. We still rank near the top of ultb in Library Thing among those who note them. Also, the progression of 90th percentile books didn't follow the usual linear growth. Last year's 90th percentile book was leapfrogged in rank by two earlier works and a pretty substantial gap emerged between the first three years' and the more recent years'.

Unlike the last two years, we don't have the 1000th most common book in Library Thing, Dance with Dragons, this year.

In 2010, I checked how many books in our collection surpassed thresholds in Library Thing. I checked again this year. How many books in our collection have at least this many copies in Library Thing?

10,000 137
1,000 831
100 2659
10 5075
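I did these counts by hand, but the same threshold tally could be sketched in a few lines of Python if one had a list of copy counts for every book in a library. The function name and the six-book sample below are hypothetical, not our real data and not an export format LibraryThing is known to offer.

```python
from bisect import bisect_left

def count_at_least(copy_counts, thresholds):
    """Return {threshold: number of books with at least that many copies}."""
    ordered = sorted(copy_counts)
    # bisect_left finds the first position whose value is >= the threshold,
    # so everything from there to the end of the list meets the threshold.
    return {t: len(ordered) - bisect_left(ordered, t) for t in thresholds}

# Hypothetical copy counts for a tiny six-book library.
sample = [1, 9, 10, 150, 1_000, 12_345]
print(count_at_least(sample, [10_000, 1_000, 100, 10]))
```

Because the thresholds are "at least" counts, each row includes the rows above it, which is why the numbers in the list grow as the threshold drops.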

I'm sure I've forgotten some comparisons, but this is all I can think of for now. Check in again next year.

19johnandlisa
mar 29, 2016, 11:15 am

Our eighth anniversary on LibraryThing. Here are our annual stats, streamlined version.

The traditionals:

Size: 7242
Zeitgeist Rank: 424
Median: 49
Mean: 867
Top User in Common (raw): 870 (Michaelg16)
Bottom User in Common: 452 (j.a.lesen)
Bottom User in Common (more): 379 (StanleyBalsky)
Publication Date: 1976
% Unique: 9.4
10th percentile: 2
90th percentile: 1449 copies (LeCarre, Our Game) rank 724
Here I Stand Copies: 1816 ranked 634

More esoteric categories:

Net addition of 162 books

LibraryThing Users: 1,799,027
Books Cataloged: 89,098,829
Different Works: 8,152,866

10000 owner books: 151
1000 owner books: 895
100 owner books: 2831
10 owner books: 5284
1000th Book in Zeitgeist (Lewis, That Hideous Strength 4689 copies) not in our library. Would rank 327 if we had it.

Last 7 years' 90th percentile books (#/rank)

How to Lie with Statistics 1407/734
Katz und Maus 1357/757
Utilitarianism 1437/726
The Early Church 1396/746
Bully for Brontosaurus 1139/831
Cultural Literacy 1143/830
Trout Fishing in America 1159/821

No changes to tracked "Users with Your Books (weighted)." The only new user who appears in "Users with Your Books (weighted)" is IraSandperlLibrary.

If you need any of those categories explained, check previous years' entries.

That's it. Hope you enjoy my musings. I'll be here again next year.

20johnandlisa
mar 29, 2016, 11:16 am

Our Ninth Anniversary on LibraryThing. Time for the annual checking of the statistics.

Size: 7358 (net gain of 116)
Zeitgeist Rank: 451
Median: 52
Mean: 899
Top User in Common (raw): Michaelg16 877
Bottom User in Common: AmanteLibros 458
Bottom User in Common (more): StanleyBalsky 391
Publication Date: 1974
% Unique: 692 9.4%
10th percentile: 2
90th percentile: 1507 copies/736 rank (Sense and Sensibility and Sea Monsters)
Here I Stand Copies and Rank: 1941 copies/629 rank
LibraryThing Users: 1,928,510
Books Cataloged: 96,237,372
Different Works: 8,778,235
Books with 10000 copies in our library: 151
Books with 1000 copies in our library: 945
Books with 100 copies in our library: 2922
Books with 10 copies in our library: 5421
1000th most popular book in LibraryThing: Where the Heart Is. 4920 copies, would rank 327

Last nine years' 90th percentile books with number of copies and rank:

Trout Fishing in America: 1215/832
Cultural Literacy: 1189/840
Bully for Brontosaurus: 1172/854
The Early Church: 1471/748
Utilitarianism: 1505/737
Katz und Maus: 1426/764
How to Lie with Statistics: 1468/751
Our Game: 1503/739
Sense and Sensibility and Sea Monsters: 1507/736

There have been no new additions to our weighted members with your books since IraSandperlLibrary last year. Of the 43 libraries I singled out as Interesting Libraries over the years, 16 turn up in weighted members with your books. 9 of those reciprocally list us in their weighted members with your books. As a rule, other people's libraries are more like ours than ours are like theirs. For example, the number one library of weighted members with your books for us is bibliophiles (as they have been since the first year I started paying attention) whereas we are number 48 on bibliophiles' list of weighted members with their books. I know of only one library that reverses that and ranks our library higher on weighted members with their books than we list theirs on ours. Illiniguy71 ranks 26 on our list, while we rank 9 on his.

The 16 and their rank:
bibliophiles 1
drsabs 3
bwogilvie 4
wrbucla 6
pobanion 11
ellenandjim 13
annamorphic 17
jgarrig 19
Illiniguy71 26
igallupd 29
alex19 35
pomonomo2003 45
debweiss 50
meburste 67
markell 69
MMcM 93

In an earlier year I noted the extremes of highest and lowest medians/means among the 43 libraries. This year, their numbers are:
sylphette: 1629 books, median 335 mean 2620
lycanthropist: 7640 books, median 11 mean 156

Finally, a new stat. I quickly checked to see how many of the 43 libraries were still active.
Added a book in the last year: 27
Last added a book more than a year ago, but less than 2 years ago: 6
Have not added a book in at least 2 years: 10
(of those, last year of entry: 2006 1, 2007 2, 2008 1, 2009 1, 2010 3, 2011 0, 2012 2.)

A couple of other random statistical observations about LibraryThing related to the most popular books.

The top seven spots in the list of books owned are the Harry Potter series. Number 1 is Harry Potter and the Sorcerer's Stone, with 76,819 copies. LibraryThing lists 1,799,027 users, so around 4.2% of all users have HPSS in their library (though some libraries -- like ours -- have duplicates).

I was curious how many of the weighted members with our books also had Harry Potter. Of the 41 that turn up before you click on "more," 10 had at least one Harry Potter book. Most of those had the whole series. With two exceptions, all of the others that didn't have Harry Potter had one of the other top 50 books in Zeitgeist in common with us, usually one of the familiar classics: 1984, Pride and Prejudice, Jane Eyre, the Odyssey. The two exceptions share a large number of baseball books with us. One of those has Erik Larson's Devil in the White City, in common with us, which is ranked 188 on Zeitgeist. The other's most popular book in common was Michael Lewis's Moneyball, which has 3522 copies in LibraryThing, beyond the 1000th book on Zeitgeist.

21johnandlisa
mar 30, 2016, 1:49 pm

I promised some 10th anniversary observations here yesterday, so here they come. I think there will probably be two more posts later. One on "members with your books" and one on "what have I learned?" Like these, I'll post them at our own page first and then post here.

First 10th Anniversary Post. The Usual Stats:

Size: 8136 (net gain of 778)
Zeitgeist Rank: 423
Median: 40
Mean: 862
Top User in Common (raw): Michaelg16 895
Bottom User in Common: katbook 473
Bottom User in Common (more): featherbear 401
Publication Date: 1970
% Unique: 1219 15%
10th percentile: 1
90th percentile: 1360 copies/814 rank (Washington Irving, The Alhambra)
Here I Stand Copies and Rank: 2053 copies/628 rank
LibraryThing Users: 2,043,415
Books Cataloged: 104,545,987
Different Works: 9,738,061
Books with 10000 copies in our library: 159
Books with 1000 copies in our library: 983
Books with 100 copies in our library: 3008
Books with 10 copies in our library: 5566
1000th most popular book in LibraryThing: Asimov, Caves of Steel. 5200 copies, would rank 330

All ten years' 90th percentile books with current number of copies and rank:

Trout Fishing in America: 1268/852
Cultural Literacy: 1253/857
Bully for Brontosaurus: 1217/869
The Early Church: 1555/763
Utilitarianism: 1579/753
Katz und Maus: 1480/776
How to Lie with Statistics: 1561/760
Our Game: 1581/752
Sense and Sensibility and Sea Monsters: 1601/745
The Alhambra: 1360/814

22johnandlisa
mar 30, 2016, 1:50 pm

Second 10th Anniversary Post. Ten Year Running Totals:

For some reason, I stopped doing running totals the last couple of years. I can't even remember why. Here's what the key statistics look like in a table (leaving out a few stats that don't change much) so you can see overall evolution.

Year 2007 2008 2009 2010 2011 2012 2013 2014 2015 2016
Size 5255 5447 6714 6785 6882 6976 7081 7242 7358 8136
Zeitgeist Rank 53 ukn 103 176 228 290 346 424 451 423
Median 5 12 21 28 34 40 45 49 52 40
Mean 144 310 419 543 643 718 794 867 899 862
TUIC 390 413 563 672 752 823 846 870 877 895
BUIC 78 197 318 346 374 393 414 452 458 473
BUIC+ ukn 157 255 280 303 335 356 379 391 401
% Unique 22 19 14 12 11.2 10.2 9.8 9.4 9.4 15
90th % 333 494 697 881 1051 1173 1312 1449 1507 1360
HereIStand 293 565 864 1051 1283 1456 1652 1816 1941 2053

I only started keeping track of Library Thing overall numbers in my 4th year. Here are 7 years of stats:

Users: 1086837 1310954 1513521 1656349 1799027 1928510 2043415
Books: 50012841 61097862 71426286 79895042 89089829 96237372 104545987
Works: 5240699 5990111 6765349 7457254 8152866 8778235 9738061

I was even less consistent in keeping running totals on a couple of other stats. But here's what I did accumulate.

Number of Books in my collection that reach thresholds for all LT users for all ten years:

10000 users ukn ukn ukn 73 ukn ukn 137 151 151 159
1000 users ukn ukn ukn 633 ukn ukn 831 895 945 983
100 users ukn ukn ukn 2132 ukn ukn 2659 2832 2922 3008
10 users ukn ukn ukn 4421 ukn ukn 5075 5284 5421 5566

And here's where the 1000th ranked book in LibraryThing would rank if it were in our library for all ten years:

ukn ukn ukn ukn 285 286 ukn 327 327 330

And here's a final "new" statistic which I didn't calculate at the time, but which is easy to calculate in retrospect from existing information. It's one I mentioned early on at the Too Obscure Group as an alternative measure of obscurity: What % of your books does the library with the most books in common with your library have? Measure that for the library that comes out on top for "raw" total and for the one that comes out at the bottom after you click "more."

% overlap top 7.4 7.6 8.4 10 10.9 11.8 12 12 12 11
% overlap bottom 1.4 2.8 3.8 4.1 4.4 4.8 5 5.2 5.3 4.9
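For anyone who wants to redo the overlap calculation, it is just books in common divided by library size. A minimal sketch in Python, using the Size and TUIC rows from the running totals; the function and variable names are mine. Rounding to one decimal place can differ slightly from the figures I typed in by hand, so I haven't listed an expected output.

```python
# Size and "top user in common" rows from the ten-year table, 2007 through 2016.
sizes = [5255, 5447, 6714, 6785, 6882, 6976, 7081, 7242, 7358, 8136]
top_shared = [390, 413, 563, 672, 752, 823, 846, 870, 877, 895]

def overlap_percent(shared, size):
    """Percentage of our library that the other library also holds."""
    return 100 * shared / size

top_overlap = [round(overlap_percent(s, n), 1) for s, n in zip(top_shared, sizes)]
print(top_overlap)

# The same function works for the bottom user, e.g. the 2016 figure of 401 books.
print(round(overlap_percent(401, 8136), 1))
```

The measure is sensitive to library size: adding 778 books in one year lowers the percentage even when the number of shared books keeps rising.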

23johnandlisa
mar 30, 2016, 1:50 pm

Addendum to 10 year running totals. The 90th percentile books.

Each year, I noted what book was the 90th percentile book in our collection and how many copies of the book were shared by other Library Thing libraries. The rank of the 90th percentile book within our own collection in any given year is obvious (total number of books x .1, counting down from the most popular; equivalently, x .9 counting up from the most obscure), but after that first year, it would migrate in relation to other books depending on how many books I added to our collection, whether they tended to be more or less popular than the 90th percentile book, and how much their popularity changed with the influx of new libraries. It seemed interesting to chart those changes, so I kept track of them (except alas for the first change, where I didn't check on the first 90th percentile work's new stats). Unfortunately, this is not something that can be done retrospectively. What rank was this year's 90th percentile book in 2007? I don't know how you could find out.
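The rank arithmetic can be sketched as follows. Ranks in these posts count down from the most popular book (rank 1), so the 90th percentile book sits about one tenth of the way down the list. This is a rough sketch in Python with a function name of my own. I read the ranks off the site by eye, so the recorded rank can differ from the formula by one.

```python
def expected_rank(library_size, percentile=90):
    """Approximate rank (1 = most popular) of the book at the given percentile."""
    return round(library_size * (100 - percentile) / 100)

# (library size, rank I recorded) for three of the anniversary posts.
observed = {2007: (5255, 526), 2011: (6882, 688), 2016: (8136, 814)}

for year, (size, recorded) in observed.items():
    print(year, size, recorded, expected_rank(size))
```

Once a year has passed, the formula no longer applies to that year's book: its rank then depends on what was added to the collection and on how its copy count grew relative to its neighbors.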

In reverse chronological order, here are the number of copies and rank within our library of each of the books that was the 90th percentile book, and how the number of copies and rank changed in subsequent years (the numbers are inversely correlated. The higher the number of copies, the more popular, the lower the rank, the more popular):

The Alhambra
2016: 1360 copies/814 rank

Sense and Sensibility and Sea Monsters
2015: 1507/736
2016: 1601/745

Our Game
2014: 1449/724
2015: 1503/739
2016: 1581/752

How to Lie With Statistics
2013: 1312/708
2014: 1407/734
2015: 1468/751
2016: 1561/760

Katz und Maus
2012: 1173/698
2013: 1254/732
2014: 1357/757
2015: 1426/764
2016: 1480/776

Utilitarianism
2011: 1051/688
2012: 1181/696
2013: 1295/714
2014: 1437/726
2015: 1505/737
2016: 1579/753

The Early Church
2010: 881/678
2011: 1036/694
2012: 1174/703
2013: 1299/712
2014: 1396/746
2015: 1471/748
2016: 1555/763

Bully for Brontosaurus
2009: 697/671
2010: 848/696
2011: 946/729
2012: 1042/746
2013: 1085/791
2014: 1139/831
2015: 1172/854
2016: 1217/869

Cultural Literacy
2008: 565/544
2009: 678/681
2010: 806/716
2011: 926/737
2012: 1020/755
2013: 1086/790
2014: 1143/830
2015: 1189/840
2016: 1253/857

Trout Fishing in America
2007: 333/526
2008: ukn
2009: 648/700
2010: 772/737
2011: 902/750
2012: 985/778
2013: 1049/802
2014: 1159/821
2015: 1215/832
2016: 1268/852

24johnandlisa
mar 30, 2016, 9:05 pm

More tenth anniversary information. The other enthusiasm, Libraries with "Your Books".

From the beginning of our time on Library Thing I've been interested in which libraries are most similar to ours as indicated by the list of "weighted" "Members with Your Books." What has been most fascinating is how stable the list has been over the past ten years. Bibliophiles emerged as the top library with books in common within a week of our entering books and it remains at the top today. For the last three years, I've noted if new libraries emerged on our list and there has been almost no movement. Last year, I listed where on our list of members 16 libraries stood. Here is a list of where they ranked last year and where they rank this year.

Bibliophiles 1 1
drsabs 3 3
bwogilvie 4 4
wrbucla 6 6
pobanion 11 11
ellenandjim 13 14
annamorphic 17 17
jgarrig 19 19
igallupd 29 27
IlliniGuy71 26 31
alex19 35 36
pomonomo 45 47
debweiss 50 52
markell 69 71
meburste 67 73
MMcM 93 86

One of the things that I've found fascinating is that sometimes there is little symmetry or reciprocity between how libraries rank on each other's lists. I've noted before that while bibliophiles is at the top of our list, we rank about 50th on their list. Bibliophiles is also at the top of alex19's list. I did a quick check of the lists for all of our "interesting libraries" and found that some, like us, have many more libraries appearing on their own list than lists on which they appear: 16 vs. 9. Others, such as bibliophiles, have fewer libraries on their list than libraries on whose lists they appear: 12 vs. 22. Still others, like sdarwall, are balanced: 10 vs. 10.

Most out of balance libraries in our focus group:
Fledgist 15 vs 2
jgarrig 10 vs 5
mcmoran 15 vs 7
annamorphic 6 vs 3
markell 11 vs 19
pomonomo 10 vs 18
wrbucla 13 vs 17

I thought it would also be interesting to see how similar a most similar library to us would actually be. So I looked at the stats for bibliophiles to compare with ours. They are listed here, with our numbers first and bibliophiles second:

Library size: 8136 5554
Zeitgeist Rank: 423 1000
Median: 40 119
Mean: 862 729
Pub Date: 1970 1980
Top User: 895 (michaelg16) 1501 (michaelg16)
Bottom user: 473 (katbook) 619 (IraSandperlLibrary)
Last User: 401 (featherbear) 459 (EnriqueFreeque)
% Unique: 15% (1219) 1% (60)
90th percentile: 1360 copies (The Alhambra) 1285 copies (Marcovaldo)

We don't have Calvino's Marcovaldo in our collection. The closest book we have in common to it is Montaillou, with 1279 copies, which is the 841st most popular book in our collection.

Of the 20 most popular books in bibliophiles' library, we have 13.

25johnandlisa
mar 31, 2016, 10:39 pm

Random final thoughts and factoids.

Almost all of the analyses I've done over the last few posts were "by hand," just clicking on libraries and counting. But I've often thought that LibraryThing would lend itself very well to a digital humanities analysis. I wonder if anyone has been doing that?

Many of the features that I've been relying on are actually a bit of a mystery to me.

I'm intrigued by the libraries that appear on our list of "weighted" members with our books, but I don't know how the weighting works. I've just sort of assumed that the more books we share that comparatively few other people have the more that "weighs" for comparison. But how that is calculated is unclear.

I'm also intrigued by the rankings of books in libraries based on the numbers of copies entered into LibraryThing. But there's a bit of a gap in how that can be used. The 1000 most widely held books are listed on the Zeitgeist page. And every book has a number for its "popularity," which you would think aligns with rank, but doesn't. First of all, the two lists are slightly out of sync, with "popularity" being slightly ahead. (HP and the Sorcerer's Stone, for instance, has 84,454 copies according to Zeitgeist rank and 84,469 according to popularity.) But more confusing is that HP and the Sorcerer's Stone is ranked first on the Zeitgeist list, but has a popularity of 2, along with all the other Harry Potter books.

If you click on the popularity number of a work, it gives you several interesting breakdowns worthy of deep analysis to understand book reception: popularity by year, by quarter, and by month, and cumulative popularity month by month from the beginning of LibraryThing. (HP and the Sorcerer's Stone was only the 3rd most popular work the first month of LibraryThing and only definitively became number 1 in September, 2007.) I would have assumed that a cumulative popularity of 1 would mean that the popularity number would be 1, but it's 2. Is there some averaging of cumulative numbers and month by month numbers that produces the discrepancy?

The gap becomes even more conspicuous when you get to the bottom of the Zeitgeist rank. The #1000 ranked book, Isaac Asimov's Caves of Steel, has a popularity of 860. If you use popularity to try to judge how a work ranks in comparison to others, you will get thrown off. This becomes a big problem when you get to the works with few users. As far as I can tell, all of the ULTB's have the same popularity: 3,689,591. It would be nice to think that that meant that there were 3,689,591 works in LibraryThing that had more than 1 copy, but that's pretty clearly not true. As a result, I can't just subtract the number of works with more than one copy from the total number of works in LibraryThing to determine how many works with just one copy are in LibraryThing (if we did the subtraction, it would lead to the conclusion that there are more than 6 million ULTB's in LibraryThing).

In fact, if one relies on the Zeitgeist numbers for LibraryThing as a whole, it produces some questionable results. The implied average library size (dividing books by users) is 51.2, which is plausible. But the implied mean number of copies of books (dividing books by works) is 10.7, which seems awfully low. And the implied median book is 1. There must be some way to check this against the averages of all libraries in LibraryThing.
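The back-of-the-envelope arithmetic here, including the subtraction involving the ULTB popularity number, can be reproduced in a few lines. A minimal sketch in Python using this year's site-wide totals from the stats post; the variable names are mine.

```python
# LibraryThing totals from the tenth anniversary stats post.
users = 2_043_415
books = 104_545_987
works = 9_738_061

# The popularity number shared by all of our ULTB's.
ultb_popularity = 3_689_591

books_per_user = books / users    # implied average library size
copies_per_work = books / works   # implied mean number of copies per work

# If popularity were a true rank, this would be the number of single-copy works.
implied_single_copy_works = works - ultb_popularity

print(f"books per user: {books_per_user:.1f}")
print(f"copies per work: {copies_per_work:.1f}")
print(f"implied single-copy works: {implied_single_copy_works:,}")
```

None of this says which of the site's numbers is off; it only shows that the figures quoted above follow from the totals.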

There seem to be at least 3 broad groups of books with different strategies of analysis. For books in the top 1000 of Zeitgeist there is a direct ranking and detailed monthly analyses of popularity that can be compared. For books outside of the top 1000 up to about 100,000, there are detailed monthly analyses of popularity, but measurements of rank between them are only easily feasible within individual libraries. Right now, the dividing line between those in and out of the 100,000 barrier is about 120 copies. Beyond that, the only number we have to go on is popularity, which is almost certainly off by a significant degree as a ranking.

The clever folks at LibraryThing have been adding new information that I'm sure I would have started following from the beginning had it been available. I'll just note a couple of factoids. First, the feature vous et nul aultre identifies which libraries share copies of the rarest books in your library, up to books that are held by five people. I looked at the list of books shared by just one other library to see which libraries had the most such books. There were two that shared 4 books with us and no one else: Hookom and RonKaplanNJ.

I also quickly scanned the physical dimensions information. 5002 of our 8136 books had physical information, which means that all the information for the library as a whole is very speculative. Still, it's interesting to learn that our library weighs about 9000 lbs and would stack up 570 feet and that our average book has about 312 pages.

There were a couple of other stats that I've noted occasionally in previous years, that deserve follow up here. I have noted the changes in mean and median for what appeared to be the most obscure and least obscure of the libraries I've been following over the years. Here's where they stand this year: lycanthropist 12/164 sylphette 360/2817. Unsurprisingly, these two do not turn up on the lists of libraries with your books for each other, though they can be linked by two degrees of separation.

In an early Too Obscure post, I also mentioned % works in your library owned by the top library with your books raw number as a potential measure of obscurity. For this year, those numbers are 11% for the Top user in common Michaelg16 and 4.9% for the bottom user in common featherbear.

And finally, I was intrigued by one other thought. If my median book is 40 and my mean book is 862, what kind of title would one find for each? There were 28 median books in our collection. Representative titles are Bernd Moeller's Imperial Cities and the Reformation, Pam Smith's The Body of the Artisan, and Stephen Dunn's Local Visitations. There were 4 books at our mean. 2 copies of Peter Weiss's Marat/Sade, Joseph Krumgold's Onion John, and John Keegan's Mask of Command. Those seem like pretty good comps for our "average" works. I also checked what kinds of works were characteristic of the thresholds I was following, 10000, 1000, 100, and 10 books. Here are characteristic works at each threshold:
10000 Mortenson, Three Cups of Tea
1000 Bel Kaufman, Up the Down Staircase
100 Anthony Grafton, New Worlds, Ancient Texts
10 Lisa Rosner, The Most Beautiful Man in Existence (it's good to get one of our own in there, though we'd prefer that it be at a higher level!)

You may have noticed that our collection grew pretty substantially over the last year (net gain of 778) and that has had a noticeable impact on our statistics. I have to admit that I did not have a lot of time to clean up works that need combining, so I'm sure the numbers I have produced are flawed for that reason. The work of combiners has undoubtedly affected the statistics to some extent over the years, but I cannot think of any way that one could measure that impact.

I'll close with one cute observation about the growth of LibraryThing itself. Two huge milestones were reached in just the last few months. LT had its 2,000,000th member and its 100,000,000th book. And before I get to our eleventh anniversary analysis, it will pass the third milestone, its 10,000,000th work. All good round numbers within about a year of each other.

I wonder if anyone had the energy/interest to read all these posts. Let me know!

26bluepiano
Edited: apr 1, 2016, 5:19 am

You've had at least one reader, though I must say that I sometimes skimmed a bit. I can only imagine the time and pains you've taken over the years to keep track in this way, and I can't even begin to imagine how long it must have taken to get the stats on reciprocity between others' most similar libraries. Cheers.

(It's quite interesting that bibliophiles' library has been consistently most similar to yours given that the member is actually two people--I suppose it's because they've both apparently an interest in history.)

27bnielsen
Edited: apr 1, 2016, 5:43 am

>26 bluepiano: I'll make that two :-)

And FYI: I've kept track of the number of books in LT here:

http://www.librarything.com/wiki/index.php/User:Bnielsen/MillionMarks#Million_ma...

28hailelib
apr 1, 2016, 7:16 am

Interesting posts.