The Rise of Pirate Libraries (2016) (atlasobscura.com)
143 points by yitchelle on July 22, 2017 | 64 comments


Although descriptive of the legal situation, I object to the term "pirate library". There is no practical difference between a pirate library and a brick and mortar library, except the larger stock of books online. My local library even has a photocopier.

A library is a place where you can choose a book, and read it for free. That's how it's been for thousands of years. Copyright is a modern intruder and has no right to brand bona fide libraries as "pirate".


> There is no practical difference between a pirate library and a brick and mortar library, except the larger stock of books online.

Physical libraries purchase the books they own, own them legally, and can legally lend them out. The practical difference between the two couldn't be more stark - physical libraries operate within the law, pirates operate in spite of it.

>My local library even has a photocopier.

They don't intend for you to use it to copy the books they lend you, though. There's a reason that libraries don't run printing presses in the back and only carry limited copies of a book - because they depend on copyright law for their survival.


> Physical libraries purchase the books they own, own them legally, and can legally lend them out. The practical difference between the two couldn't be more stark - physical libraries operate within the law, pirates operate in spite of it.

The law is broken. There is no legal mechanism to lend out a digital file. That's the law's fault, not the lender's fault.

> libraries don't run printing presses in the back and only carry limited copies of a book

Now that is an important difference. But no one bothers trying to make a lend-only digital library because it's more difficult and no less illegal.


> The law is broken. There is no legal mechanism to lend out a digital file. That's the law's fault, not the lender's fault.

Oh yes, there are laws covering lending out digital files. They might not have been written specifically for the purpose but they're used anyway. Lending out digital files is, as anyone with insight into the industry knows, fraught with all sorts of silly artificial restrictions. In Sweden it is actually much more expensive for a library to lend out a digital publication than a physical one, especially when the publication is recent and popular. Depending on the contract the library has with the publisher, they'll have to buy a one-time license any time a publication is lent out; other contracts stipulate a fee per loan plus an artificial 'wear' limit on digital publications. In contrast, a physical publication is bought once - at a higher price than that charged in a book shop - after which it can be lent out until it falls apart.

I do not borrow digital publications from the library for this reason. Where digital technology could be used to further the mission of libraries it is now used to counter them. How... very much expected. Fortunately there are alternatives, as mentioned in the article.


Maybe I'm not following the conversation correctly, but plenty of US libraries lend digital copies of books, audiobooks, movies, tv shows, etc.


But they have to get special licenses, at the discretion of the copyright holder. They can't rely on the first sale doctrine.


Imagine you go to Ancient Greece and tell them: "in the future there will be a library of over 1 million books and papers containing the knowledge of the world, that any person rich or poor can access from almost anywhere in the world"

Do you think their reaction will be: A) wonder that such a library can exist, or B) to tell you that it is not a library because of the flagrant copyright violations.

Libgen and scihub are among the most amazing products of humanity, that will receive a prominent place in human history. That they are illegal is an almost boring item of trivia.


You know authors get payments from loans that libraries make, right? (At least, in UK and Ireland)

https://www.plr.uk.com/


Not in the US. US libraries are protected under what is referred to as the "first sale" doctrine, meaning that once a physical object is purchased, it's the purchaser's prerogative to do whatever they like with it, including loan it. Publishers and libraries have a bit of a love/hate relationship. Even though publishers would like to prevent loaning, libraries buy a lot of material and also purchase quite a bit from the back catalog of titles that would otherwise not sell much any more.


Also not in Sweden, but the government uses a small trick where tax money is diverted to Swedish (exclusively) authors that have a book loaned. This is technically not part of copyright law, since the Berne Convention prohibits countries from making a distinction between authors of different nations.


Are there a lot of writers publishing Swedish only? I'm learning Swedish very slowly, and would love to know how the Nordic languages are faring amidst English's cultural domination.


It depends on what you would consider a lot. The Swedish writers' organisation has about 3000 members [1] and 1720 Swedish books of fiction were published in 2016 [2].

[1] https://sv.wikipedia.org/wiki/Sveriges_F%C3%B6rfattarf%C3%B6... [2] http://www.forlaggare.se/den-totala-bokutgivningen-i-sverige


If Elsevier or IEEE had a subscription model priced like Netflix, I'd be glad to give them my money. But as it is, access to academic and research material at low rates, to me is a question of survival. I'm glad and extremely grateful for the work of scholars like Elbakyan. They are basically doing God's work as far as I am concerned.


When you sign up for IEEE they SPAM you relentlessly for years even after you close your account. I'm not a fan of the IEEE for that reason or for their highly conservative political stance.


Yes, they use call centers with pressure tactics to get you to keep it up. Almost all EE research is done via an IEEE whitepaper though, so not much of a choice.


I've never heard of such in the US. I'm now reading your link, but I'd like to hear if anyone knows of similar arrangements in the US.


In the US libraries pay for their physical books, which pay royalties per usual.

For digital media they usually have a negotiated rate where they pay royalties and limit the number of copies "out on loan".


Negotiated might be a strong term. Most libraries work through middlemen who do that negotiation and are then given a rate that they can take or leave. There's no "shopping around" for digital media since publishers control the pricing and availability. This is proving to be a budgeting challenge since digital media is more expensive than physical. There are no discounts from wholesalers for bulk purchasing. Many libraries get physical media at up to a 40% discount from retail because they buy so much.

Of course, digital media does not need to be maintained. It does not need to be cataloged and reshelved; it does not wear out or get damaged. But libraries are still adapting to this shift since there still is a great deal of physical media checking out and that staff is still needed.


The library landscape is surprisingly varied in the United States. Big systems such as New York, Chicago and Los Angeles exert a huge amount of control on the "middle men" that service them (and a 40% discount is where all systems start and always have). This is of course not surprising. Like all parts of the book industry, the ecosystem surrounding libraries is changing very fast.

The digital media side is not as clear as you are making it out to be. Publishers do not in fact set the pricing and availability, because the publishers don't particularly want to be in the business of servicing libraries (just like they don't want to be in the business of library binding and cataloging), so they have to allow third parties the ability to do some negotiations. In many ways it's just like physical books (the cost of physical books is largely not the act of creating the physical copy).

The difference is that the classic rift between desires of libraries and publishers is more stark with electronic books. Libraries want to provide access as cheaply as possible usually as a governmental agency and publishers want to have a profitable business.

That doesn't even begin to talk about the existential crisis libraries are going through. It's a fairly interesting thing to watch as an outside observer.

I don't work in this space but my wife does, and I've had drinks with enough publishers, jobbers and librarians to see it as fascinating.


I am a librarian who works with digital media, so I see this industry close up as part of my job. So a few things:

The vast majority of libraries are not NYPL, LAPL, King County PL, etc. Most are medium to small and servicing almost every town in the United States. They are arranged in a dizzying array of geographic, bureaucratic and budgetary configurations. I once worked with a library that was a unified system with shelf level access to items, but every municipality funded its own local branch, so money went into a central system and had to be accounted for when purchasing items. They handled 15 different budgets. It was staggering.

This makes negotiation impossible. We rely on vendors like Overdrive, Baker and Taylor, Midwest Tape, Recorded Books, etc, to provide access to digital materials. And though there are sales, digital materials are unquestionably more expensive. I work in a system with a service population of about 500,000 (it's a statewide consortium of local libraries), and hold lists can run into the 100s for a popular title. If the title is from Penguin Random House, it will likely cost more than $50 per copy, leading to thousands of dollars just to keep hold times down to a few months. If the publisher is Harper Collins or Simon and Schuster the price will be more reasonable, but we lose copies as we check them out. For example, say we buy 15 copies that have 52 checkouts each. As soon as one of those copies has been checked out 52 times, we're down to 14 copies. It is very challenging and if we exerted influence, it would not be like this. And our most popular device, the Kindle, is controlled by a vendor that is fanatical about its control over the service and was dragged kicking and screaming into working with libraries.
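To make the metered-license arithmetic concrete, here is a rough Python sketch using the numbers from the example above (15 copies, 52 checkouts each). The function names are made up for illustration, and the model assumes loans are counted against one copy at a time until its allowance runs out; real vendor terms may well differ.

```python
def copies_remaining(copies_bought: int, checkouts_per_copy: int, total_checkouts: int) -> int:
    """Copies still licensed after a number of loans.

    Assumption (not a vendor's actual rule): loans are charged to one copy
    at a time, so a copy expires every `checkouts_per_copy` loans.
    """
    expired = min(copies_bought, total_checkouts // checkouts_per_copy)
    return copies_bought - expired


def total_loans_covered(copies_bought: int, checkouts_per_copy: int) -> int:
    """Total number of loans the whole purchase can ever serve."""
    return copies_bought * checkouts_per_copy


if __name__ == "__main__":
    print(copies_remaining(15, 52, 52))   # one copy used up
    print(total_loans_covered(15, 52))    # lifetime loans for the purchase
```

Under these assumptions the purchase covers 780 loans in total, after which the title has to be bought again, whereas a physical copy keeps circulating until it falls apart.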

But physical materials still remain our most popular items. E-book sales have stagnated at around 35% (not including Amazon's nebulous self-publishing numbers) and we've seen the same in libraries. That makes it difficult to shift staff. As a colleague of mine once said, "In government I can't lay everyone off and rehire people with the right skills". Over time it will work out, but in the short term the budgetary challenges are limiting access.

All that said, we are healthy. The library as a physical space and American institution will be OK because people have a strong attachment to the idea and the use case.


> This makes negotiation impossible. We rely on vendors like Overdrive, Baker and Taylor, Midwest Tape, Recorded Books, etc, to provide access to digital materials.

But how is that different than your previous interactions with the wholesale/jobber market? You relied on them for rebinding, catalog record import, fulfillment, etc. The only difference I can see is that it's harder to become competent in the delivery of e-books than it is in the delivery of physical books because it's newer.

> and hold lists can run into the 100s for a popular title

How much of that is just that demand is easier to generate with digital books? I don't need to go to the library to get the book, there is no cost to be on a hold list, and it is delivered as soon as it is available.

> It is very challenging and if we exerted influence, it would not be like this

Sure, but if you could exert perfect influence you'd get the books for free ;)

Look, I'm not saying we've reached an optimal system for ebooks and libraries, but it's fairly easy to understand the publisher's position. Too many people get caught up in the physical costs of books, which are not what the publisher is worried about. They are worried about recouping all the pre-production IP costs and marketing dollars they put into the things that don't sell. That they've fallen back onto a model that poorly mimics physical books is probably unfortunate, but completely unsurprising.

> All that said, we are healthy. The library as a physical space and American institution will be OK because people have a strong attachment to the idea and the use case.

Completely and totally agree and can think of few groups of people more likely to adapt to the new information dense world than librarians. I'm much more worried about the publishers...


> there is no cost to be on a hold list

What do you mean by this? At least at my library, there is no monetary cost to be on the hold list for any resource, but you can only put a hold on a limited number of resources; and Overdrive checkouts work just the same.


> But how is that different than your previous interactions with the wholesale/jobber market? You relied on them for rebinding, catalog record import, fulfillment, etc. The only difference I can see is that it's harder to become competent in the delivery of e-books than it is in the delivery of physical books because it's newer.

Because working with wholesalers is different than working with agency pricing, especially once you bring DRM into the fold. Previously, publishers controlled the creation of materials; now they control the pricing as well. Additionally, only a few vendors have the capability of distributing ebooks, and those vendors also control the devices for consumption as well. So if a user comes in with a Nook, an iPad, a Kindle, etc, we also need to come up with mechanisms to get that content to their devices. This means we're working with vendors who either have contracts with Amazon (of which there is only one), the ability to operate Adobe DRM licensing servers, or the ability to make a good mobile app. We control almost none of the process. NYPL is currently working on an app to unify ebooks across vendors but because of DRM this requires vendors to play ball as well, and as a result it's a less than optimal experience for the user. The barriers to entry in this market are huge. (Also, most public libraries do not rebind, and often they do their own processing and do copy cataloging through a coop (OCLC).)

Once physical materials leave the publisher, they have lost control of the process, pricing, distribution, etc. Now a small number of players control everything from the creation, sale, distribution and even reading experience for that item. A few big players can hope to crack this market, but most libraries can't rely on their budget from year to year, making even unified pressure difficult.

> How much of that is just that demand is easier to generate with digital books? I don't need to go to the library to get the book, there is no cost to be on a hold list, and it is delivered as soon as it is available.

I would argue that e-books are harder to get than physical ones, especially if you are already a regular library user. You have to own a device and be tech savvy enough to get through the setup process. But setting that aside, this doesn't change the fact that it's more expensive to meet digital demand than physical demand.

I predate ebooks in this industry, so I have seen it evolve a great deal. The prices have come down, the user experience has improved ten-fold, and cooperation has improved among libraries to the point that even the smallest library can offer e-books. But until libraries can make their own devices, build their own software, break the agency pricing model, and get rid of DRM it will never reach the point physical materials have.

But as I said, we focus on the bad, and not the good. Usage of audiobooks has increased 20% every year for the past several years, driven almost entirely by electronic titles (this is our only format where electronic circulation is higher than physical). E-books are far more accessible for users with disabilities or simply older readers who struggle with seeing or even holding books. E-books never get stolen or damaged, nor do they wear out (indeed, this is the argument Random House made when they increased their pricing). Publishers have discovered they want to work with libraries because libraries are big spenders and buy materials that would otherwise not sell (especially backlist titles).

The larger "threat", if you want to call it that, of e-books is not major publishers, but Amazon, YouTube, Spotify, etc. The industry is changing and increasingly, creators do not need to work with a publisher to be successful. A large chunk of the market for materials is being hoovered away into silos. Personally I feel the library's future is in its physical space.


The thing about these 'pirate libraries' that makes them seem so vital to me is they're made up of people far more motivated to preserve and keep available the documents than the actual copyright holders ever will be.

I draw comparisons to the vast value that was lost when Oink and What.cd were shut down.


Sadly, I never even knew about What.cd when it was alive, not that I'd have had the time and wherewithal to join. But this is the article that made me lament its passing: https://qz.com/840661/what-cd-is-gone-a-eulogy-for-the-great...


Thanks for the link - the comparison to the burned Library of Alexandria was one I hadn't thought of before.


Wow, likewise never knew of it until now! Sounds incredible although I also would never have had the time to take part.

That is one thing that makes me a bit sad about Spotify (and there's not much, I think it's a great product and I'd happily pay more for it if I could be sure a reasonable amount went to the artists I listen to) - I worry that people will get so used to streaming from large, but limited catalogs, that obscure recordings (e.g. 90s dance tracks only released on vinyl and the label is now gone) will be forgotten.

Hopefully there's enough hardcore fans to prevent that happening - but in an ideal world Spotify would somehow have access to all this kind of stuff you currently only find on YouTube!


I am already wondering whether a fundraiser to download KG would make sense. If that would disappear... there is absolutely no collection like that, anywhere. 100 000 non mainstream movies!

If all you want is a cold copy rather than keeping it online, the costs are not enormous, perhaps 5K EUR or so.


Indeed, I was briefly involved with the "bookz" scene as it was known at the time, over a decade ago, and the amount of --- completely voluntary --- effort that its members expended, especially considering the risks, was nothing short of amazing. People would go to their local libraries and borrow dozens of books to scan and upload, while also discussing on the forums ways to make the process go more quickly, and in general collaborating very efficiently.

When a few of them turned out to be from the same region, they would even organise "scanparties" where the goal would be to digitise whole sections of a particular library within a short time. I'm sure they raised some suspicion among librarians, and others would've noticed when entire shelves suddenly became empty, but to my knowledge no one was actually caught... and there were plenty of excuses to go around.

A memorable result from one of those is that there is now an extensive collection of car service manuals floating around on the Internet in various torrents and also as individual PDFs.


I have an irrational love for Gen1 Series 2 Subaru Liberty RS Turbos. An Australian variant of the Japanese Legacy with a number of mechanical differences.

Through some unknown hero (to me) going out and scanning the service manual held at a technical school in a country town 4 hours from here I have a copy as a PDF.

You never know when what you're 'pirating' is providing a real and wonderful value to another person.


Those two should really be mentioned in an article about Pirate Libraries. They are exactly what the author describes, minus the 'rise' part.


It's true: besides entertainment data (movies, TV shows), other libraries were built by deeply passionate people who kept exhaustive collections of everything. Also, pirate networks have different and maybe better uptime than official ones [1]. I often see PhDs thanking Sci-Hub because some bug took down Springer or Nature.

[1] surely the resources and money goes in different directions in official operations so not comparing 100%


Anyone who is a student should know that most of the big publishers e.g. Springer, Cambridge Uni. Press, etc have extensive online libraries that your university probably subscribes to. You can go there and download complete books, many of which would be extremely expensive to buy otherwise. A lot of it is academic and dry, but occasionally there are some gems. It was fun to sift through random topics that I had no idea about (like forensic pathology) and build up a small eBook library. Best of all it's within your right as a student to do it (your uni pays thousands for these subscriptions), so exploit it while you can.

Most of the time you have to download chapters separately, but I did write a little script that would authenticate with Shibboleth, pull the whole book and combine the PDFs. There was no DRM either, just a watermark with your IP or university details to discourage sharing.


> You can go there and download complete books, many of which would be extremely expensive to buy otherwise.

As much as I hate to say something nice about one of the big publishers, Springer even has a nice little fillip: if you already have the eBook, you can buy a physical copy for very cheap ($25, I think). I've never tried this, so I don't know if having access through your university counts.

> Most of the time you have to download chapters separately, but I did write a little script that would authenticate with Shibboleth, pull the whole book and combine the PDFs. There was no DRM either, just a watermark with your IP or university details to discourage sharing.

Be careful with this, though; Cambridge, for example, bans you for a while if you download too many files. (Obviously "too many" isn't rigorously defined, but I hit it while downloading two many-chapter books.) I can imagine triggering a more aggressive ban if they think you look like a robot downloader.


I made a list of books I liked the look of, saved it to a text file and then set up the script to run every hour ± a bit. Wasn't going to be any faster than I could read them! I don't recall ever being rate limited so I guess the strategy worked.
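The "every hour ± a bit" part could look something like this minimal Python sketch. It is not the original script: the function name, the one-hour base interval, and the five-minute jitter are all illustrative assumptions.

```python
import random


def next_delay_seconds(base_seconds: float = 3600.0,
                       jitter_seconds: float = 300.0,
                       rng=random) -> float:
    """Return the wait before the next run: the base interval plus or
    minus a uniformly random jitter (assumed values, not the author's)."""
    return base_seconds + rng.uniform(-jitter_seconds, jitter_seconds)


if __name__ == "__main__":
    # Print a few sample delays, in minutes, instead of actually sleeping.
    for _ in range(3):
        print(f"{next_delay_seconds() / 60:.1f} minutes")
```

A real loop would call `time.sleep(next_delay_seconds())` between runs; the jitter just keeps the requests from landing on an exact hourly beat.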


Would you be willing to share that script? I'm heading to grad school in a month, this would be very useful.


Sure, I'll drop you an email.


I really hate to be that person, and content-wise I find the article very interesting, but:

    “The text collections were far too valuable to simply delete,” he writes, 
    and instead migrated to “closed, membership-only FTP servers.”
    More recently, though, those collections have moved online

So an FTP server is not "online"? I know, I know, it is a minuscule nitpick. But for some reason this kind of thing takes my attention hostage.

And let me repeat, content-wise this is a fine and fascinating article.


> So an FTP server is not "online"?

I always have to stop and consider whether I'm talking to someone who equates the web with the internet. If your first and only exposure to the internet was on the web, I suspect it's a very hard notion to shake.

For those of us who routinely use non-http protocols (even if that's only ssh), or are old enough to remember using gopher, telnet, UUCP, NNTP, SMTP, IMAP, POP3, etc., then ~"moved online and off ftp" is just weird...


It seems that the "moved online" part is less about FTP and more about the "closed, membership-only" part.

Colloquially, "online" pretty much means "available online", and data that is on systems technically connected to internet but accessible only to a few people isn't considered available online.


I get that. The wording was not misleading or ambiguous.

In my job, I have to deal with network problems of various kinds from time to time, and at the network layer, "online" has a specific meaning (Okay, actually several, depending on context), and that was what my brain came up with first.

Like I said, I was nitpicking.


The US government is not allowed to copyright anything, and is surely a funder of many of these works. Contractors for the US government should similarly not be able to copyright work done under contract. What would happen to these copyright arguments if this became the case?


I tend to agree with you. Universities/Colleges that receive public money also should not be able to copyright or patent findings. If public money funds it, it should be public domain.


What about some authority like a professor who writes a book about his subject? Suppose the guy's been paid or heavily subsidised through his career.

Seems a bit unobvious to me. On the one hand, the guy's gotta bother to sit and write a book. OTOH, his book would be worthless if he didn't have the skills paid for by other people.

I guess what's normal is that we disregard how his expertise was funded and just let him keep whatever he makes. After all, he doesn't have to write a textbook.


The difference is writing as part of one's career (at the direction of supervisors) and writing at one's own discretion.

If your school says "yo, prof! Write a book on $subject," then it's a work-for-hire and, if public money is involved, should be public domain. If the professor takes it upon himself to write a book, presumably on his own time, he then holds the copyright.


No different than programmers not owning the rights to what they develop while at work.


Alexandra Elbakyan is a modern day hero. Her work has put more knowledge in the hands of those who could otherwise not afford it than Sallie Mae. Cheers to you Alex!


As long as textbooks in North America keep being so ridiculously expensive[0] I won't take "pirate" as a valid moral argument.

[0]https://www.amazon.com/Calculus-Early-Transcendentals-James-...


This copyright-versus-morality stuff is really complicated. Remember that Google has, somewhere, hundreds of thousands of books ready to be shipped that probably never will be.

https://www.theatlantic.com/technology/archive/2017/04/the-t...

Previously discussed here

https://news.ycombinator.com/item?id=14172791


Reminds me of Gigapedia (later named library.nu), which started as early as 2007 and by 2012 had a huge number of books of all kinds. Nowadays Library Genesis has a good collection, but it is mostly technical. Gigapedia was amazing in the diversity of the topics it covered. It definitely was a loss.


What is the current state of such book libraries (in the sense of, "these are some that are alive")? I can never keep track of which among LibGen, BookFi, etc., are still alive, which are alive but not loading, which are the same under different names, and so on.


I have a large physical book library.

My 3 year old has a large pirated physical book library.

I also have a huge pirated ebook library.

In the future, my child will have a copy.

Times, they are a-changin'.


> My 3 year old has a large pirated physical book library.

Could you explain this? Did you steal these from someone?


They are books produced by pirate printing factories whose major business is making duplicates of famous children's books. Quality is great, they are basically indistinguishable from the real thing.


I don't know if I realized such a thing existed. I've seen a few print-on-demand machines in libraries, so I guess I should have guessed, but I have never seen such a thing before.


He read his father's without buying a second "license" himself for each book.


Nice article on the history of Shadow Libraries. They've come a long way since their Russian roots in the 90s.

We just released a study titled "Sci-Hub provides access to nearly all scholarly literature". Preprint at https://doi.org/10.7287/peerj.preprints.3100. There's an accompanying interactive browser at https://greenelab.github.io/scihub.

Some highlights:

> As of March 2017, we find that Sci-Hub's database contains 68.9% of all 81.6 million scholarly articles, which rises to 85.2% for those published in closed access journals.

> Coverage also varies by publisher, with the coverage of the largest publisher, Elsevier, at 97.3%.

> we estimate that over a six-month period in 2015–2016, Sci-Hub provided access for 99.3% of valid incoming requests.
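For a sense of scale, the first figure can be turned into an absolute count with a bit of arithmetic. This is only a back-of-the-envelope sketch using the two numbers quoted above (68.9% of 81.6 million); the function name is made up for illustration.

```python
def covered_articles(total_articles: float, coverage_fraction: float) -> float:
    """Number of articles covered, given a total and a coverage fraction."""
    return total_articles * coverage_fraction


if __name__ == "__main__":
    total = 81.6e6       # all scholarly articles, as quoted
    coverage = 0.689     # share in Sci-Hub's database, as quoted
    covered = covered_articles(total, coverage)
    print(f"covered: {covered / 1e6:.1f} million")
    print(f"missing: {(total - covered) / 1e6:.1f} million")
```

So the quoted coverage works out to roughly 56 million articles, with about 25 million not in the database.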


Extant libraries I am aware of, please add your own in replies: https://www.cgpeers.to/ Computer Graphics software, plugins, models and tutorials.


It's only a matter of time before somebody delivers a too-cheap-to-meter DOI-to-PDF service.


That is what https://oadoi.org/ and http://doai.io/ do. Sci-Hub can also ingest DOIs.


scihub.cc?


For those times when you click a link to a paywalled article, http://unpaywall.org is a pretty handy Firefox extension that will redirect to a (legal) unpaywalled alternative site, if it exists.


Is there some reason why such libraries have not established themselves on the dark web?




