This reminds me of MusicBrainz, whose database stores "release groups", e.g. the album Nevermind by Nirvana is one, which can have hundreds of "releases", as different media (tape, CD, LP, promo, ...), different countries, later re-issues, etc. [0]
Sometimes these have different catalogue numbers or barcodes to distinguish them, sometimes they don't but they're still different. I've seen releases where the only difference is the label in the centre of the LP, or the back of the CD case has a two-column tracklisting vs a one-column tracklisting. Music publisher uses the same code and says it's identical and yet it's clearly not.
Then there's the "recordings" on an album, which even if they're never re-recorded can still end up chopped up, bleeped or remastered. They're not the same sound. MusicBrainz likes to track when they are exactly the same recording (e.g. the LP recording of a song appearing on a compilation album verbatim) and when they're not (e.g. radio edits of the LP recording). And if we're going beyond recordings by one artist of "their" song, i.e. cover versions, or just plain standards, those are "works", with composers, lyricists, and can be recorded thousands of times by different artists...
I greatly appreciate the pedantry and flexibility for noting down when creative works are the same versus where they differ, in relational database form.
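For anyone who thinks better in code, here's a rough sketch of the distinctions described above. To be clear, this is not MusicBrainz's actual schema; the class names, field names and sample data are all my own placeholders, and the real thing is far richer.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass(frozen=True)
class Work:
    """The abstract composition: has composers and lyricists, no sound."""
    title: str


@dataclass(frozen=True)
class Recording:
    """One specific piece of audio. A radio edit is a different Recording."""
    work: Work
    artist: str
    note: str = ""


@dataclass
class Release:
    """One concrete issue: a given medium, country, and (maybe) barcode."""
    medium: str
    country: str
    barcode: Optional[str] = None  # may be missing or shared between releases
    tracks: List[Recording] = field(default_factory=list)


@dataclass
class ReleaseGroup:
    """The 'album' as most people think of it, spanning all its releases."""
    title: str
    artist: str
    releases: List[Release] = field(default_factory=list)


def common_recordings(a: Release, b: Release) -> Set[Recording]:
    """Recordings that appear verbatim on both releases."""
    return set(a.tracks) & set(b.tracks)
```

The point of the split is that "same song" can be asked at three levels: same work, same recording, or same release.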
I haven't looked into what their schema is like, but if it's anything like MusicBrainz it will be pretty comprehensive and easy to pull the data you want out of!
That's the post I made on r/plex a decade ago that pissed off a dumbass moderator and got me banned from there! I guess he hated books.
I've recently been doing data entry on Open Library... sometimes even WorldCat doesn't have an OCLC number for an edition, and Open Library is my fallback. Maybe I should be doing it on BookBrainz instead.
My favorite example of this sort of thing has been In My Tribe by 10,000 Maniacs. The UPC/catalog number remained the same between the 1987 release and the removal of Peace Train (track 7) in 1989. I have this memory of sifting through the stock at a large used CD store in the mid-90s hoping to find the pre-removal version.
I know that for a book I've published via Kindle Press (the real ones, not digital) that there are at least 3 official revisions, and many many minor ones that as far as I know are only differentiated by the minor typos fixed, and MAYBE one of the numbers buried in the front matter. The ISBN has remained the same.
Sometimes we definitely want 'items' though. For example, I'm in a physical bookstore and see a book I might be interested in, so I buy it, only to find out back home that I already own the very same book, in the very same edition. It's a scenario that anyone with a decent number of books has surely run into more than once; I know I have a few times. :)
Being able to search my collection by ISBN would have helped me in this case, and scanning a barcode is an easy enough task.
And even if I had a different edition, the resulting title from searching for a different edition would be enough to help me figure out that I should not buy a book I already own.
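A minimal sketch of that barcode check, assuming the collection is just a dict keyed by ISBN-13 (the `already_owned` helper and the collection contents are made up for illustration). The one real wrinkle is that older books carry an ISBN-10, so it helps to normalise everything to ISBN-13 first; the check-digit arithmetic is the standard one.

```python
from typing import Dict, Optional


def isbn13_check_digit(first12: str) -> str:
    """Standard ISBN-13 check digit: weights alternate 1, 3, 1, 3, ..."""
    total = sum(int(d) * (1 if i % 2 == 0 else 3) for i, d in enumerate(first12))
    return str((10 - total % 10) % 10)


def to_isbn13(raw: str) -> str:
    """Normalise a scanned or typed ISBN-10/ISBN-13 to a bare ISBN-13."""
    s = raw.replace("-", "").replace(" ", "").upper()
    if len(s) == 13 and s.isdigit():
        return s
    if len(s) == 10 and s[:9].isdigit():
        # ISBN-10 maps onto the 978 prefix; the check digit is recomputed.
        core = "978" + s[:9]
        return core + isbn13_check_digit(core)
    raise ValueError(f"not an ISBN-10 or ISBN-13: {raw!r}")


def already_owned(scanned: str, collection: Dict[str, str]) -> Optional[str]:
    """Return the title I already own for this ISBN, or None."""
    return collection.get(to_isbn13(scanned))


# Hypothetical collection: ISBN-13 -> title as I recorded it.
collection = {"9780306406157": "Some book I already own"}
print(already_owned("0-306-40615-2", collection))
```

This only catches the exact same edition. Catching a different edition of the same book needs some ISBN-to-work mapping on top, which is the harder problem this whole thread is about.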
If anyone in the comments is in a similar predicament to the author and would like a book logging app, I will say that I disagree with their judgement of StoryGraph. I've found it a pretty decent interface, the search function is very good, and the (anti)features mentioned in the footnote are incredibly easy to not use, as the creators seem to understand that many of their users have a very strong preference to avoid AI bloat.
https://hardcover.app is another choice. It's the one I've been using since right after the second Trump inauguration when I decided to "de-oligarch" as much as possible.
I don't know that work, but I agree with you in general because of forewords etc. Or even appendices. And translations by different translators.
I "grew up with" a specific translation of Lord of the Rings into Norwegian, for example. There are two. They are very different. But the editions also differ in whether they include the appendices, whose illustrations are used, and more.
When you delve into real domain specific knowledge, surprises often surface and it turns out that what you might think is a simple thing is actually rather complicated.
I'm mildly surprised at exactly how successful ISBNs are. I worked in a book wholesaler's warehouse 35 odd years ago and the ISBN was used as the product code by the "system". I'd get a series of picking lists for pallets on good old green "staved" fan fold. I'd whizz around the warehouse with my trolley and pick from paper packets of books. The product lines had the rack and bay, last four from the SBN, quantity, title and full SBN. The packets of books had the rack/bay/last four from the SBN printed large on a label, with the other details in small print. I got very good at optimising my course around the warehouse and could pick at a right old rate, whilst listening to my mini cassette player. It's pretty boring work so you might as well game it!
Sometimes an individual book might fall off my trolley and be dumped in the big cardboard "skip" for rejects. For some reason casualties around me generally involved subjects like maths, material sciences, geology, surveying, hydrology. Oh and fractals!
I graduated in civil engineering.
Anyway. Surely all of us here know that really getting to grips with defining what it is that you are cataloguing/indexing/numbering/whatever and why can be quite tricky.
Both Dewey and SBNs catalogue "books" but for very different reasons. Both systems are extremely successful. You might think that in our world of LLMs n that, that books, Dewey and SBNs will go the way of the dodo.
Perhaps, but I doubt it.
Right, bugger all this old school nonsense. I've got a C64 (it rocks an SD card interface and an HDMI out (via SCART - must sort that out)) blinking away on my telly in the sitting room and some mutant camels need a bloody good kicking.
This also fails to take into account that ISBNs contain the publisher ID. So identical copies of a book could have different ISBNs depending on which markets they are sold in.
They don't contain the publisher name, but ISBNs are usually purchased in blocks of 10 or 100 or 1000 or whatever by a single entity, which is often a single publisher or corporation.
However, within the block publishers can assign ISBNs to different imprints.
For ISBNs from the big 5, the number really does indicate the publisher. I think the 5th digit (second after 978) can indicate at least some of the big publishers. Smaller ranges are available for purchase from the brokers. In Canada, the national library will even issue you one for free, if you self-publish.
The ISBN always indicates the country it's from, the United States getting the biggest block, other European nations and Japan getting their own, with Africa, the Middle East, and so forth all getting a block in common.
ISBN prefixes do not always indicate a country. Some are indeed countries, but others are language areas (e.g. 0/1=English) or "regions" (groups of countries) or even other subjects.
I'm not sure this is the case. I got my ISBN range through my government's national library service. I could be wrong, but when you let them know what book you are publishing they ask for the publisher name, though I am guessing, as the service is free and it only applies to New Zealand books and publications.
My state had a reading competition that listed books by ISBN, which was a real challenge for students to track down. Each library had different editions and even different cover art, so if you “found” the book you might not recognise it on the shelf, etc…
I worked on the library systems and one of my innovations was to use the ISBN mapping database of WorldCat to find books with identical content but different ISBNs to help kids find the books on the list.
Over ten years that one SQL join in the code made the kids read an extra million books they wouldn’t have otherwise.
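For the curious, the general shape of such a join might look something like the sketch below. This is a guess at the idea, not the actual query from that system, and not WorldCat's real data format: the `isbn_work` and `holdings` tables, their columns, and the placeholder ISBNs are all assumptions made up for illustration.

```python
import sqlite3
from typing import List, Tuple


def demo_db() -> sqlite3.Connection:
    """Build a tiny in-memory database with hypothetical tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        -- Maps each ISBN to an id shared by all editions with the same content.
        CREATE TABLE isbn_work (isbn TEXT PRIMARY KEY, work_id INTEGER NOT NULL);
        -- Which ISBNs each library actually has on the shelf.
        CREATE TABLE holdings (isbn TEXT NOT NULL, library TEXT NOT NULL);

        INSERT INTO isbn_work VALUES ('ISBN-A', 1), ('ISBN-B', 1), ('ISBN-C', 2);
        INSERT INTO holdings VALUES ('ISBN-B', 'North Branch'),
                                    ('ISBN-C', 'North Branch');
        """
    )
    return conn


def find_equivalent_holdings(
    conn: sqlite3.Connection, listed_isbn: str
) -> List[Tuple[str, str]]:
    """Find held copies of any edition sharing a work with the listed ISBN."""
    return conn.execute(
        """
        SELECT h.isbn, h.library
        FROM isbn_work AS listed
        JOIN isbn_work AS sibling ON sibling.work_id = listed.work_id
        JOIN holdings  AS h       ON h.isbn = sibling.isbn
        WHERE listed.isbn = ?
        ORDER BY h.library, h.isbn
        """,
        (listed_isbn,),
    ).fetchall()


# The list names ISBN-A; the library only holds ISBN-B, a sibling edition.
print(find_equivalent_holdings(demo_db(), "ISBN-A"))
```

The self-join on `work_id` is what turns "we don't have that ISBN" into "we have this other edition of the same book".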
If the author sees this comment, https://news.ycombinator.com/item?id=43168838 might be relevant as it relates to catalogue completeness. OpenLibrary is very good, but Anna's Archive is potentially more complete.
I used Letterboxd a lot before kids. I used Goodreads until the Trump inauguration when I de-Amazon'd myself as much as possible (Amazon owns Goodreads). I switched to Hardcover, which is a much better interface. There are ways to improve, but overall I prefer it over Goodreads.
>Uh-oh. Why do we have so many distinct versions of The Last Unicorn? Well, each distinct format of a work has its own ISBN (so a hardcover, paperback, and eBook all have different ISBNs),
This isn't even the half of it. On some digital books, I'll find a dozen ISBNs in the front matter. Of course there's the hardback, the clothbound (not always the same as the hardback), the alk. paper variant, paperback, trade paperback, epub, pdf, "Adobe digital", and "master digital e-book" (no idea what that even is myself). And that's all just issued together. If they reprint, it won't get a new ISBN, but if the rights convey to another publisher, that one will get a whole 'nother set again. Some popular titles likely have low hundreds of ISBNs, and keep in mind that these have only been a thing since the late 1960s (9 digit ISBNs, technically just SBNs back then). Then with the now dead paperback trade, you could go through a dozen different covers for the most popular books (King, etc) but they'd all use the same ISBN.
Then, and this one bites me the most... if archive.org scans in a hardback with its ISBN, what do I use for the scanned pdf? I've decided that for lack of a better alternative I have to use it, but if the publisher made their own pdf (even just scanning the hardback), then it is supposed to issue a new ISBN to it.
Cataloging my own library, I've had to use a hodgepodge of unique ids. ASINs, ISBNs, Worldcat's OCLC numbers, Open Library's, and a few others besides. And it still comes up short. The number of oddball publishers and pamphlets and so forth that have never been cataloged anywhere is enormous.
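On the 9-digit SBNs mentioned above: if you're cataloguing older books that only carry an SBN, the usual convention is to prefix a zero to get a valid ISBN-10, since the mod-11 check digit is unaffected by a leading zero. A small sketch (the function names are mine, not from any library):

```python
def isbn10_check_digit(first9: str) -> str:
    """ISBN-10 check digit: weights 10 down to 2, modulo 11, 'X' means 10."""
    total = sum(int(d) * w for d, w in zip(first9, range(10, 1, -1)))
    value = (11 - total % 11) % 11
    return "X" if value == 10 else str(value)


def sbn_to_isbn10(sbn: str) -> str:
    """Convert a 9-character SBN to ISBN-10 by prefixing '0', then verify it."""
    digits = sbn.replace("-", "").replace(" ", "").upper()
    if len(digits) != 9 or not digits[:8].isdigit():
        raise ValueError(f"not a 9-character SBN: {sbn!r}")
    isbn10 = "0" + digits
    if isbn10_check_digit(isbn10[:9]) != isbn10[9]:
        raise ValueError(f"check digit mismatch for SBN: {sbn!r}")
    return isbn10


print(sbn_to_isbn10("340-01381-8"))
```

From there the ISBN-10 can be mapped to ISBN-13 in the usual way, so everything in the catalogue ends up keyed on one identifier format, for the items that have an ISBN at all.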
tl;dr; - The ISBN is intended to be a physical Part Number, within the book business. Where "hardcover, or paperback, or trade paperback, or large print, or revised edition, or ..." very much matters.
I've stumbled across 3 or 4 magazines that printed the wrong ISSN in more than one issue. One from the 80s did so in every single issue of its 20-some-issue run. It must be true that some books have done so as well, but I don't even check that those are correct.
[0] https://musicbrainz.org/release-group/1b022e01-4da6-387b-865...
https://bookbrainz.org/about
For example, compare the most recent edition of 'Straight and crooked thinking' with the one published in 1930.
Are we talking material plot or characterisation changes?
[0] Before anyone says it, I'm sure some bible nerd has numbered them, it's hyperbole.
See https://en.wikipedia.org/wiki/List_of_ISBN_registration_grou...
My biggest “bang for buck” in my career!
There is. https://hardcover.app