archive.today: On the trail of the mysterious guerrilla archivist of the Internet
gyrovague.com
Do you like reading articles in publications like Bloomberg, the Wall Street Journal or the Economist, but can’t afford to pay what can be hundreds of dollars a year in subscriptions? If so, odds are you’ve already stumbled on archive.today, which provides easy access to these and much more: just paste in the article link, and you’ll get back a snapshot of the page, full content included.
For a long time, I assumed that this was some kind of third-party skin on top of the venerable Internet Archive, whose Wayback Machine provides a very similar service at the very similar address of archive.org. However, the Wayback Machine is slow, clunky, frequently errors out, and most importantly, it’s very easy for websites to opt out, retroactively erasing all their content forever. In contrast, archive.today has no opt-outs or erase buttons: like it or not, they store everything and it’s not going anywhere, with some limited exceptions for law enforcement, child porn, etc.
The Internet Archive is a legitimate 501(c)(3) non-profit with a budget of $37 million and 169 full-time employees in 2019. archive.today, by contrast, is an opaque mystery. So who runs this and where did they come from?
The origins and owners of archive.today
The first historical record we have of the site dates from May 16, 2012, when a “Denis Petrov” from Prague, Czech Republic registered the domain archive.is, the original name of the site. archive.today followed in 2014, and the site has since registered countless variations: archive.li, archive.ec, archive.vn, archive.ph, archive.fo, etc. Denis Petrov is a common Russian name, with pages and pages of matches on LinkedIn, but it may well be an alias: informer.com notes that the same contact information was used to register a series of very sketchy domains, ranging from “carding forum” verified.lu to piracy sites btdlg.com and moviesave.us (all long since gone), many seeded with German keywords (spiel, gewinnt, online).
Domains aside, “Denis Petrov” has little presence on the web, and three seemingly connected domains proved dead ends. The obvious denispetrov.com was an entertaining rabbit hole: its author is an accomplished programmer with an interest in Web automation, but the blog is clearly the work of a New Yorker at the tail end of a 25-year career, and it dries up entirely in 2011, so it matches neither the place nor the time. denis.biz (2001) and petrov.net (1998!) contain nothing. The one intriguing bit of evidence we have is this series of screenshots (archive) where Brave’s tech support addresses webmaster@archive.is as “Denis”, but odds are that’s just from the same DNS record.
We can glean a few more clues from archive.today’s web presence. The FAQ, unchanged since 2013 (!), states that they are located in Europe and asks for PayPal donations in euros. Looking through the voluminous Tumblr blog, featuring tons of questions but very terse answers, the author’s English is excellent but not quite native, with occasional Noun Capitalization also hinting at a German background. Yet they answer questions in Russian, and the site uses a Russian analytics engine.
The most interesting detective work to date comes from Stack Exchange, where Ciro Santilli managed to link the profile picture of an account archive.today once used to archive LinkedIn content to a “Masha Rabinovich” in Berlin. Even more intriguingly, in a 2012 F-Secure forum post, a “masharabinovich” complains about “my website http://archive.is/” being blacklisted. They pop up on Wikipedia as well, getting told off for adding too many links to archive.is, including a mention that they’re using the Czech ISP fiber.cz, and their early edit history includes many updates to the pages “Russian passport” and “Belarusian passport”. “Masha” (Маша) is a common Russian diminutive of Maria, although it can also be a Hebrew form of Moses (מַשה), and Rabinovich is an Ashkenazi Jewish surname.
Early GitHub captures on archive.today are linked to a now completely disappeared account called “volth” (copy archived by archive.today itself), who was a fluent speaker of Russian, contributed extensively to NixOS (which archive.today uses) and has a profile picture not dissimilar to Masha’s. The linked volth.com domain is now only an empty husk, but it dates back to 2004, with early versions first doing some kind of sketchy search engine network marketing thing (2005), then promising “Total Success in Internet” (2008) and eventually being put up for sale (2010), making it likely that its original owners, the Espinosas, are unrelated to whoever owns the domain today.
While we may not have a face and a name, at this point we have a pretty good idea of how the site is run: it’s a one-person labor of love, operated by a Russian of considerable talent and access to Europe. Let’s move on to the nitty-gritty.