A Brief History of Project Gutenberg

Katie Moench

Sometime between 1440 and 1450, Johannes Gutenberg began using his printing press to make published works available to the masses, forever changing the landscape of literacy and reading. Over 500 years later, Michael S. Hart, a student at the University of Illinois, uploaded the text of the Declaration of Independence to the nascent internet, signaling the beginning of the ebook age.

In 1971, a friend gave Hart, then a student in machine-human interfaces at Illinois, virtually unlimited time on a mainframe in the school’s Materials Research Lab. That time was immensely valuable in the days when computers were far from the personal and easily accessible models used today. Hart wanted to use it to give back to the larger public as a way of showing his gratitude for the access, and on July 4, 1971, he was inspired by a free copy of the Declaration of Independence he had been given. Hart typed the document into the computer but was told he could not email it to as many people as he wanted because of the risk of overwhelming the system. So instead, he published the digital document on the ARPANET, on which his computer was one of 15 nodes. ARPANET would go on to form the skeleton for the internet that we use today, and by publishing this document, Hart had created the first e-version of a printed document shared on the network, paving the way for the e-books of the future.

When Hart started his project, he manually entered all text himself, with a vision of promoting access to texts in the public domain. As scanning books became more feasible in the mid-1990s, the project was able to speed up its digitization work, assisted by a network of volunteers across the world who did everything from entering documents into the project database to proofreading the e-texts and creating a website for Project Gutenberg. After Hart graduated from the University of Illinois, Project Gutenberg was first hosted by Illinois Benedictine College and eventually moved to ibiblio, an “internet librarianship project” run through the University of North Carolina at Chapel Hill, where it continues to be hosted.

Though today readers might be just as likely to pick up an ebook as a print one, when Hart began his work, that was not the case. At the time, computers and the limited networks that connected them were mainly used by academics, governments, militaries, and a sprinkling of dedicated hobbyists, who were shaping the ever-online world as we know it today. When Hart decided to upload the Declaration of Independence to ARPANET, he was working in a world where accessing such documents was not as simple as googling for the text. There was no Google. While it might seem almost unfathomable to those of us today who enjoy internet access at the click of a mouse or the touchscreen of a phone, Hart was working in a world in which texts were primarily still on paper and their information could only be read by either owning them or viewing them at a library. It’s incredibly fitting that Hart christened the project after Johannes Gutenberg, as it represented a leap in knowledge sharing similar to the first mass-printed works.

When Hart was asked in 2004 about the purpose of Project Gutenberg, he replied, “The mission of Project Gutenberg is simple: ‘To encourage the creation and distribution of ebooks,’” and “to provide as many ebooks in as many formats as possible for the entire world to read in as many languages as possible.” Furthermore, Hart and the volunteers of Project Gutenberg saw it as a means of combating illiteracy and democratizing knowledge in much the same way as the libraries of the early 19th century. For members of Project Gutenberg, their undertaking is not just about digitizing public domain works from multiple nations, but also about honoring a tradition of progress in giving readers access to texts.

Today, Hart and Project Gutenberg are seen as the inventors of the ebook as a format. Though ebook sales figures are hard to track, due to programs like Kindle Unlimited and the number of self-publishing authors, it’s believed that ebooks make up over 20% of books sold, even by conservative estimates. The ebook industry has become a huge segment of how people read and has led to a fundamental shift in how text content is obtained and consumed. Many of these ebooks are bought directly by consumers, making the industry a multibillion-dollar business and drawing companies like Amazon and Apple into conflicts over copyright and corporate trust. Though Project Gutenberg’s e-texts were, and continue to be, in the public domain and free for all to view, the foundations of the project also made it possible for authors, publishers, and booksellers to use ebooks as a commercially viable means of bookselling.

Hart’s foresight in putting texts online was not just about showing what computers could do, but also about the power of computing technology and networks to foster intellectual exchange. Unlike other sites, including Google Books, texts on Project Gutenberg are not simply scanned and uploaded onto the site. Instead, a global network of volunteers proofreads texts before they are made available. Distributed Proofreaders, a group of such volunteers, works closely with Project Gutenberg to have multiple volunteers work on an e-text at the same time, thereby increasing the speed with which texts can be uploaded.

Texts on Project Gutenberg are either in the public domain or used with special permission of the copyright holder, though the site has run into issues with documents that are public domain in the U.S. but not yet in other countries. Additionally, those working on the project face the challenge of uploading texts that might have been altered throughout history or translated from their original language into multiple versions. What makes Project Gutenberg unique is that its network of volunteers puts actual human thought behind the archive, including decisions about which version of a text to use. While the site has faced criticism for how it documents these decisions, the beauty of Project Gutenberg is that its texts are not merely scanned images, but e-texts that are meant to be read.

Perhaps the most amazing thing about Project Gutenberg is not the sheer number of e-texts it has made available online, but the ways in which it continues to uphold the early promises of the internet in what has become a commercialized landscape focused more on images and opinions than on sharing information. When Michael Hart chose to use his computer time to begin the work that would go on to become Project Gutenberg, he was fulfilling the pure aims of early computer networks: that such access would make existing information available to a greater number of people. While many users might simply pop into the site when they need a copy of something in the public domain, it’s worth spending time there, marveling at what the vision of Project Gutenberg has achieved.