The Whole Internet Archive In A Box

Wayback Machine – Internet Archive

Browse through 85 billion web pages archived from 1996 to a few months ago. To start surfing the Wayback, type in the web address of a site or page where you would like to start, and press enter. Then select from the archived dates available. The resulting pages point to other archived pages at as close a date as possible.

The Wayback Machine lets you look at the history of any website that has been around for about 100 days or more.
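For readers who would rather jump straight to a date than click through the calendar, here is a minimal Python sketch that builds a Wayback Machine address. It assumes the commonly seen `https://web.archive.org/web/<timestamp>/<url>` address pattern with a 14-digit timestamp; the function name `wayback_url` and the example site are made up for illustration, and the snippet only builds a string without contacting the archive.

```python
from datetime import datetime

# Assumption: archived pages are addressed as <prefix>/<YYYYMMDDhhmmss>/<original URL>.
WAYBACK_PREFIX = "https://web.archive.org/web"


def wayback_url(page_url: str, when: datetime) -> str:
    """Build an address asking for the capture of page_url closest to `when`."""
    stamp = when.strftime("%Y%m%d%H%M%S")  # 14-digit timestamp
    return f"{WAYBACK_PREFIX}/{stamp}/{page_url}"


if __name__ == "__main__":
    # Hypothetical example: a page as it looked around the end of 1996.
    print(wayback_url("http://example.com/", datetime(1996, 12, 31)))
```

If no capture exists for that exact moment, the Wayback Machine generally serves the nearest one it has, which matches the "as close a date as possible" behavior described above.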

The Wayback Machine is a product of the Internet Archive (“IA”), which currently houses about two petabytes of compressed data and is growing by about one petabyte a year (see IA PDF from Sun). The Wayback Machine contains a growing library of 150 billion web pages and is accessed about 500 times a second. A “mirror” site, which stores an identical copy of the Wayback Machine’s archives, is kept at the New Library of Alexandria in Egypt. Since history often repeats, it’s good to have more than one copy.

In addition to the Web, the IA keeps backups of 169,320 movies, 63,341 live concerts, 327,753 audio recordings and 1,327,481 texts, and these collections continue to grow.

The impressive man behind the IA, Brewster Kahle, spoke at TED in 2007 and expressed his vision of making all of the world’s knowledge reliably and widely available to all of the world’s citizens at no charge. To that end, the IA has migrated from a large array of common desktop computers to the Sun Modular Datacenter (SunMD) platform. The SunMD employed by the IA is an “Internet in a box” because the entire archiving apparatus fits in one standard-sized shipping container.

eWeek offers some great pictures of the unveiling ceremony and information on the IA’s SunMD.

The Internet Archive, one of the fastest-growing digital libraries in the world, has migrated its massive amount of content into a new Sun Microsystems-built portable data center loaded with 60 Sun X4500 Thumper arrays that each have 48TB of storage capacity. Sun staged a launch event at its Santa Clara, Calif., headquarters on March 25.

“It’s amazing to think that the whole Web collection, which is about 2PB compressed and from 4PB to 5PB uncompressed, can live in a 20-foot-by-8-foot-by-8-foot shipping container, which, from our standpoint, is a computer,” Brewster Kahle, digital librarian and founder of the Internet Archive, told eWEEK.

Because of the modular, self-contained nature of the SunMD, additional SunMDs can be added as the Internet grows.
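To put rough numbers on that, here is a back-of-the-envelope Python sketch using only the figures quoted in this post: 60 Thumper arrays of 48TB each per container, about 2PB of compressed data today, and growth of about 1PB a year. It assumes 1PB = 1,000TB, counts raw capacity only (no allowance for redundancy or spare disks), and assumes the growth rate stays flat; the function names are made up for illustration.

```python
import math

THUMPERS_PER_CONTAINER = 60   # from the eWeek excerpt
TB_PER_THUMPER = 48           # from the eWeek excerpt
TB_PER_PB = 1000              # assumption: decimal petabytes


def container_capacity_pb() -> float:
    """Raw capacity of one container in PB (ignores redundancy overhead)."""
    return THUMPERS_PER_CONTAINER * TB_PER_THUMPER / TB_PER_PB


def projected_size_pb(current_pb: float, growth_pb_per_year: float, years: int) -> float:
    """Archive size after `years`, assuming constant linear growth."""
    return current_pb + growth_pb_per_year * years


def containers_needed(archive_pb: float) -> int:
    """Smallest whole number of containers whose raw capacity holds the archive."""
    return math.ceil(archive_pb / container_capacity_pb())


if __name__ == "__main__":
    print(f"One container holds about {container_capacity_pb():.2f} PB raw")
    for years in (0, 1, 5):
        size = projected_size_pb(2.0, 1.0, years)
        print(f"Year {years}: {size:.0f} PB -> {containers_needed(size)} container(s)")
```

Under those assumptions a single container holds about 2.88PB raw, so today's 2PB fits with some room to spare, but a second container would be needed after roughly a year of growth.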

The Internet Archive’s database is the ultimate zeitgeist. The storage and free distribution of all network-based content is, perhaps, one of the most important endeavors ever undertaken by humans. But unlike the ancient Library of Alexandria, which housed only a relative handful of works by the most learned intellectuals, the IA captures and stores almost everything on the Web and much of the Net, from highly sophisticated papers written by the most respected scientists to mundane 140-character Tweets spouted by regular folks.

It is somewhat embarrassing to admit, but a 1996 version of a website I authored still lives. Let this be a warning that what one offers via the Web will most certainly outlive oneself!

Now watch what you say
They’ll be calling you a radical, liberal,
Fanatical, criminal.

These lyrics from Supertramp’s “The Logical Song” of 1979 are more applicable today than they ever were before.
