Redactions

The year of the disappearing websites

by Alex Wellerstein, published December 27th, 2013

I’m a big fan of digital historical research. Which is to say, I’ve benefited a lot from the fact that there are a lot of great online resources for primary source work in nuclear history. These aren’t overly-curated, no-surprises resources. The paper I gave at the last History of Science Society meeting, on US interest in 50-100 megaton weapons, was surprising to pretty much everyone I told about it, yet was based almost exclusively on documents I found in online databases. You can do serious research with these, above and beyond merely “augmenting” traditional archival practices.1

One of the most interesting documents I found in an online database — an estimate for the ease of developing a 100 megaton weapon in a letter from Glenn Seaborg to Robert McNamara. Knowing the estimated yield and weight of the bombs in question allows one to divine a lot of information about their comparative sophistication.

Like all things, digital history comes with its pitfalls. The completely obvious one is that not everything is digitized. No surprise there. That doesn’t really change the digital archival experience from the physical one, of course, since even physical archives are always missing huge chunks of the documentation. As with “regular” archives, the researcher compensates for this by looking at many such databases, and by looking closely at the materials for references to missing documents (e.g. “In response to your letter of March 5” indicates there ought to be a letter from March 5th somewhere). This doesn’t make digital archives less useful, it just means their role cannot usually be absolute. Being able to quickly search said databases usually more than compensates for this problem, of course, since the volume of material that can be looked at quickly is so much higher than with physical paper. And I might note that one of the best parts about many of the digital archives for nuclear sources is that the documents often indicate their originating archive, which can point you to sources you might not have considered (like off-the-beaten-trail National Archives facilities).

But perhaps the biggest problem with digital sources is that, like so many things in the digital world, they somehow have the ability to vanish completely when you really want or need them. (As opposed to the normal online trend of things sticking around forever when you wish they would go away.) The fall of 2013 was, among other things, the season of the disappearing websites. At least three major web databases of nuclear history resources that I used on a regular basis silently disappeared.

Fallout from the 1952 "Ivy Mike" shot of the first hydrogen bomb. Note that this is actually the "back" of the fallout plume (the wind was blowing it north over open sea), and they didn't have any kind of radiological monitoring set up to see how far it went. As a result, this makes it look far more local than it was in reality. This is from a report I had originally found in the Marshall Islands database.

The first of these, I believe (it is hard to know exactly when things vanished as opposed to when I became aware of them — in this case, September 2013) was the DOE’s Marshall Islands Document Collection. This was an impressive collection of military and civilian reports and correspondence relating primarily to US nuclear testing in the Pacific. Its provenance isn’t completely clear, but it probably came out of the work done in the mid-1990s to compensate victims of US atmospheric testing.

I found this database incredibly useful for my creation of NUKEMAP’s fallout coding. It also had lots of information on high yield testing in general, and lots of miscellaneous documents that touched on all manner of US nuclear developments through the 1960s. It used to be at this URL, which now re-directs you to a generic DOE page. I e-mailed the webmaster and was told that it isn’t really gone per se, it’s just that “Access to the HSS website has been disabled for individuals trying to access our website from the public facing side of the internet. We are working to put mitigation in place that will allow us to enable public access to our web site.” That was several months ago, right before the government shutdown. What I fear, here, is that a temporary technical disabling of the site (because they are reshuffling things around on their web domains, as government agencies often do) will lead to nobody ever getting it back up again.

A photograph of an early Hanford reactor that used to be in the Hanford DDRS — one of my favorites, both because of its impressive communication of activity and scale.

Next was the Hanford Declassified Document Retrieval System, which went offline in November 2013 (or so). It used to be here, which now gives a generic “not found” message. It used to have thousands of documents and photographs relating to the Hanford Site spanning the entire history of its operation. In my research, I used it extensively for its collection of Manhattan Project security records, as well as its amazing photographs. Again, I suspect it was a creation of the mid-to-late 1990s, when “Openness” was still a thing at DOE.

I’d be the first to admit that its technical setup seemed a little shaky. It required a clunky Java applet to view the files, and its search capabilities left a little to be desired. Still, it worked, and could be actively used for research. I got in contact with someone over there, who said it had to be taken down because it had security vulnerabilities, and that eventually they planned to get it back up again, but that “we don’t have a timeline for accomplishing that right now.” They offered to search the database for me, through queries sent via e-mail, but obviously that doesn’t quite cut it in terms of accessibility (especially since my database process involves many, many queries and glancing at many, many documents, most of which are irrelevant to what I’m looking for).

Will it get back up? The guy I talked to at Hanford said they were trying to resurrect it. But I have to admit, I’m a little skeptical. It’s not at the top of their agenda, and clearly hasn’t been for over a decade. If they do get it up, I’ll be thrilled.

Exploratory tunnel dug by a 25-foot-diameter tunnel boring machine at the proposed Yucca Mountain, Nevada, repository for spent nuclear fuel. From the DOE Digital Photo Archive.

Lastly, there is the DOE Digital Photo Archive, which was a publicly-accessible database of DOE photographs, from the Manhattan Project through the present. Some of these were quite stunning, and quite rare. One of my all-time favorite photographs of the nuclear age came from this database. The archive used to be here; it now redirects to a generic page about e-mailing the DOE for photographs. Not the same thing. I got in touch with someone who worked there, who said that the database site “has been closed down,” and that instead I could trawl through their Flickr feed. They, too, offered to help me find anything I couldn’t, but that doesn’t actually help me too much, given how large a role serendipity and judgment play in archival practice.

As an extra “bonus” lost website, Los Alamos’ pretty-good-but-not-perfect history website was also taken down very recently, and replaced with a single, corporate-ish page that skips from World War II to the present in one impressive leap and gives nothing but a feel-good account of the first atomic bombs. The site it replaced was more nuanced, had a reasonably good collection of documents and photographs, and covered Los Alamos’ history through the Cold War pretty well. It had its issues, to be sure, including some technical bugs. But even a buggy site is better than a dead one, in my opinion. A new site is supposedly in the works, but it seems to not be a high priority and no short-term changes are expected.

None of these sites were taken down because of anything objectionable about their content, so far as I know. The issues cited have been a mixture of technical and financial (which are, of course, intertwined). Websites require maintenance. They require upkeep. They require keeping technically-inclined people on staff, with part of their day devoted to putting out the little fires that inevitably come up over the years with a long-lasting website. Databases and interactive sites in particular require considerable effort to put together, and a lot of time over the years to keep up to date in terms of security practices.

I work on web development, so I get all that. Still, it’s a terrible thing when these things just vanish. Aside from the fact that some people (I imagine more than just myself) find them useful, there is the amount of resources essentially wasted when such a long-term investment (think of the man-hours that went into populating those databases!) is simply turned off.

What should scholars do about it? We can complain, and sometimes that works. A better solution, perhaps, is to keep better mirrors of the sites in question. This is particularly true of sites with any potential “national security implications.” When Los Alamos took their declassified reports offline after 9/11, the Federation of American Scientists managed to cobble together a fairly complete mirror. (The Los Alamos reports have since been quietly reinstated for public access through the Los Alamos library site.)

Los Alamos Technical Reports

I wish, in retrospect, that in the past I had considered the possibility that the Hanford and Marshall Islands databases might go down. Making a mirror of a database is harder than making a mirror of a static website, but it’s not impossible. (Archive.org does not do it, before you offer that possibility up.) For the specific reports, documents, and photographs that I actually use in my work, I always have a local copy saved. But there is so much out there that was yet to be found. I might try filing a FOIA request for the underlying data (it would be trivial for me to turn them into a useful database hosted on my own servers), but I’m not sure how well that will work out (it seems to go a bit beyond a normal FOIA).

After the Hanford database went down, I thought, what are the other public databases that my work depends on? The most important is DOE’s OpenNet database, which contains an incredibly rich (if somewhat idiosyncratic) collection of documents related to nuclear weapons development. Huge chunks of my dissertation were based on records found through it, as are most of the talks I give. If it went down tomorrow, I’d be pretty sunk. For that reason, while the government was going through its shutdown last October (I figured no one would be around to object), I made a reasonably complete duplicate of everything in OpenNet using what is known as a “scraper” script.2 Obviously as OpenNet gets updated, my database will fall out of sync, but it’s a start, and it’s better than nothing if it gets unplugged tomorrow.
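For readers wondering what such a scraper amounts to, here is a minimal Python sketch. To be clear, it is not the script I ran against OpenNet, which was written in PHP (see note 2). The URL pattern, the numeric record IDs, and the function names are all hypothetical placeholders; any real database will have its own layout, and you would need to study it first. The idea is simply to walk through a list of record identifiers, save each file locally, skip anything already saved, and pause between requests so the server is not hammered.

```python
import time
import urllib.request
from pathlib import Path
from typing import Callable, Iterable, List

# Hypothetical URL pattern, for illustration only.
RECORD_URL = "https://example.org/records/{record_id}.pdf"


def fetch_over_http(url: str) -> bytes:
    """Download one URL and return its raw bytes."""
    with urllib.request.urlopen(url, timeout=30) as response:
        return response.read()


def mirror_records(
    record_ids: Iterable[int],
    out_dir: Path,
    fetch: Callable[[str], bytes] = fetch_over_http,
    delay_seconds: float = 1.0,
) -> List[int]:
    """Save each record locally; return the IDs that could not be fetched."""
    out_dir.mkdir(parents=True, exist_ok=True)
    failed: List[int] = []
    for record_id in record_ids:
        target = out_dir / f"{record_id}.pdf"
        if target.exists():
            continue  # already mirrored on an earlier run
        try:
            data = fetch(RECORD_URL.format(record_id=record_id))
        except OSError:
            # urllib's network errors are OSError subclasses:
            # note the gap and keep going.
            failed.append(record_id)
            continue
        target.write_bytes(data)
        time.sleep(delay_seconds)  # be polite to the server
    return failed
```

Calling something like `mirror_records(range(1, 1001), Path("mirror"))` would attempt the first thousand hypothetical records and hand back the IDs that failed, which can be retried later. Because existing files are skipped, the same call can be re-run after an interruption without downloading everything again.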

The amazing thing about digital databases is that they take the archive everywhere at once, instantly. The terrible thing about them is that it only takes the pull of one plug to shut them down everywhere at once, instantly. Anyone who does research on nuclear history issues should be deeply disturbed by this rash of site closures, and should start thinking seriously about how to make copies of government databases they rely on. (Private databases are more complicated, for copyright reasons.) The government gave, and the government has taken away.

Notes
  1. Which databases, you ask? 1. The CIA’s online FOIA database; 2. Gale’s DDRS database; 3. the DOD’s online FOIA database; 4. DTIC; 5. ProQuest’s Congressional hearing database; 6. the JFK Library’s online files; 7. the National Security Archive’s online database; 8. the Nuclear Testing Archive (DOE OpenNet); 9. the OHP Marshall Islands Database; 10. the ProQuest Historical Newspaper database; 11. the UN’s website; 12. the searchable Foreign Relations of the United States. The only other significant non-online archival sources were the Hansen papers at the National Security Archive and some files from the JFK Library that they provided me over e-mail.
  2. I whipped something together using Snoopy for PHP, which allows you to do all sorts of clever database queries very easily.


15 Responses to “The year of the disappearing websites”

  1. Drako says:

    “With the deep, unconscious sigh which not even the
    nearness of the telescreen could prevent him from
    uttering when his day’s work started, Winston pulled the
    speakwrite towards him, blew the dust from its mouthpiece,
    and put on his spectacles. Then he unrolled and clipped
    together four small cylinders of paper which had already
    flopped out of the pneumatic tube on the right-hand side
    of his desk.

    In the walls of the cubicle there were three orifices. To
    the right of the speakwrite, a small pneumatic tube for written
    messages, to the left, a larger one for newspapers; and in
    the side wall, within easy reach of Winston’s arm, a large
    oblong slit protected by a wire grating. This last was for the
    disposal of waste paper. Similar slits existed in thousands or
    tens of thousands throughout the building, not only in every
    room but at short intervals in every corridor. For some
    reason they were nicknamed memory holes. When one
    knew that any document was due for destruction, or even
    when one saw a scrap of waste paper lying about, it was an
    automatic action to lift the flap of the nearest memory hole
    and drop it in, whereupon it would be whirled away on a
    current of warm air to the enormous furnaces which were
    hidden somewhere in the recesses of the building.

    Winston examined the four slips of paper which he had
    unrolled. Each contained a message of only one or two lines,
    in the abbreviated jargon—not actually Newspeak, but consisting
    largely of Newspeak words—which was used in the
    Ministry for internal purposes. They ran:

    times 17.3.84 bb speech malreported africa rectify

    times 19.12.83 forecasts 3 yp 4th quarter 83 misprints verify
    current issue

    times 14.2.84 miniplenty malquoted chocolate rectify

    times 3.12.83 reporting bb dayorder doubleplusungood refs
    unpersons rewrite fullwise upsub antefiling

    With a faint feeling of satisfaction Winston laid the
    fourth message aside. It was an intricate and responsible job
    and had better be dealt with last. The other three were routine
    matters, though the second one would probably mean
    some tedious wading through lists of figures.

    Winston dialled ‘back numbers’ on the telescreen and
    called for the appropriate issues of ‘The Times’, which slid
    out of the pneumatic tube after only a few minutes’ delay.
    The messages he had received referred to articles or news
    items which for one reason or another it was thought necessary
    to alter, or, as the official phrase had it, to rectify.
    For example, it appeared from ‘The Times’ of the seventeenth
    of March that Big Brother, in his speech of the previous
    day, had predicted that the South Indian front would remain
    quiet but that a Eurasian offensive would shortly be
    launched in North Africa. As it happened, the Eurasian
    Higher Command had launched its offensive in South India
    and left North Africa alone. It was therefore necessary
    to rewrite a paragraph of Big Brother’s speech, in such a
    way as to make him predict the thing that had actually happened.
    Or again, ‘The Times’ of the nineteenth of December
    had published the official forecasts of the output of various
    classes of consumption goods in the fourth quarter of 1983,
    which was also the sixth quarter of the Ninth Three-Year
    Plan. Today’s issue contained a statement of the actual output,
    from which it appeared that the forecasts were in every
    instance grossly wrong. Winston’s job was to rectify the
    original figures by making them agree with the later ones.
    As for the third message, it referred to a very simple error
    which could be set right in a couple of minutes. As short a
    time ago as February, the Ministry of Plenty had issued a
    promise (a ‘categorical pledge’ were the official words) that
    there would be no reduction of the chocolate ration during
    1984. Actually, as Winston was aware, the chocolate ration
    was to be reduced from thirty grammes to twenty at the end
    of the present week. All that was needed was to substitute
    for the original promise a warning that it would probably
    be necessary to reduce the ration at some time in April.

    As soon as Winston had dealt with each of the messages,
    he clipped his speakwritten corrections to the appropriate
    copy of ‘The Times’ and pushed them into the pneumatic
    tube. Then, with a movement which was as nearly as possible
    unconscious, he crumpled up the original message and
    any notes that he himself had made, and dropped them into
    the memory hole to be devoured by the flames.”

    – George Orwell: 1984, chapter four

    Wow… What a terrible boy is Big Brother!

    • Jim-Bob says:

      Sadly, as Edward Snowden pointed out in his UK Christmas message, 1984 is rather quaint and outdated compared to what governments can do today.

      With regards to the actual article in question, I think it serves little good to retroactively censor this sort of material. The mechanisms necessary to build a functional nuclear weapon are widely known by physicists, fast food workers and everyone in between. The actual amounts of each element used have usually been more carefully guarded though since this is the science necessary to get a decent yield. However, even if all of this were published, it still wouldn’t necessarily mean that nuclear weapons would suddenly proliferate without prior knowledge by the world’s larger economic/military powers. The mechanism for production of these materials is difficult to obtain for anyone but a state since it requires large scale breeder reactors and centrifuges. That has always been what kept nuclear proliferation at bay, not a lack of knowledge.

  2. Bill Higgins says:

    Another disappearing database in 2013 (though perhaps not relevant to your own work) was the NASA Technical Reports Server (NTRS). In March the database was abruptly withdrawn after Rep. Frank Wolf publicized an alleged security breach involving a NASA contractor.

    Wolf urged: “NASA should immediately take down all publicly available technical data sources until all documents that have not been subjected to export control review have received such a review and all controlled documents are removed from the system.”

    Of the millions of documents formerly available from NTRS, the majority were restored by May, but last I heard, hundreds of thousands were still unavailable. Aerospace historians I know are none too happy.

  3. Bradley Laing says:

    Update on someone else’s posting. Someone else wrote that the Salt Lake City newspaper, the “Deseret News” (spell?) had run an article quoting Joseph Goebbels threatening New York City with a “uranium torpedo.”

    “The Rocket U-boat was an abandoned military project to create the first ballistic missile submarine. It was conceived of by Nazi Germany during the Second World War. Plans for the rocket U-boat involved an attack on New York City with newly invented V-2 rockets.”

    A nautical device to attack New York City? Could the “uranium torpedo” be a garbled version of the Rocket U-boat, rather than a nuclear weapon built by Nazi Germany?

  4. phuzz says:

    It might be worth getting in contact with Jason Scott. He works at the Internet Archive but he’s also the de facto honcho of the Archive Team, who specialise in backing up websites that are going to be deleted.
    If you hear of a site that is going to be taken offline then get in contact with them and they’ll find a way to make a mirror of it. This is exactly the sort of thing that they exist to prevent.

  5. Daniel Olive says:

    Can’t you FOI many of the documents? I know US FOI sometimes has charges, but you could make requests below the de minimis floor.

    My experience is in UK FOI law, but I think you might be able to FOI “[all the files in the collection known as] X Database”. It’s identifiable, and electronic delivery is essentially free.

    • In theory, yes, but I worry that it would become a prohibitively large request. What I am considering doing is first making a FOIA request for the metadata — that is, the database data itself — which would then allow at least future requests for specific documents.

  6. Lori Reed says:

    I have been following your blog for some time now, and I really appreciate the way you stick to the facts and avoid excessive conjecture. You maintain a consistently readable style without boring the reader. Just wanted to tell you that if yours were a book, I would buy it.

  7. nukeman says:

    Thank you for bringing up this important topic of discussion. It really is sad that various organizations are taking down their documents and an important part of history is being lost. The removal of the NTRS database and then its reappearance after removal of “sensitive” documents is worthy of discussion. One document that was removed is Volume III of the Nuclear Pulse Space Vehicle Study that is part of Project Orion. It is interesting that many people have commented about the first section of this volume that deals with the design of the pulse unit but there has been no open discussion of section 7 which deals with Development Program Planning. A careful review of that section starting on page 133 reveals a discussion of the Low-energy nuclear source (LENS) which is a very-low-yield gun-type plutonium assembly. This is the only time that I know of that the characteristics of a plutonium gun-type assembly have been described in the declassified literature.

    I still go back and look through my collection of material (books, CDs, articles, etc.) that was written by Chuck Hansen. Just this week I learned something from his CDs that I hadn’t known about. He was an amazing person and is missed for his contributions.

  8. anon says:

    If it’s any comfort, the Marshall Islands DB is still available within the DOE network. So it really is still there–just not publicly accessible.

  9. Julie Staggers says:

    Finishing a book on Hanford/Richland and I am utterly deranged at the moment. Was away from the book project from Thanksgiving-Spring Break because I was directing our national conference. Popped on just now to grab an image to use in my grad class this evening, and discovered the DDRS was gone. Hoping to have a more astute, political/theoretical response, but at the moment I am just gobsmacked. Thank you so much for the cogent overview of the sites that have been “disappeared.”

  10. […] gave them sortable filenames. Why? Because I want people — you — to be able to use these (and I do not trust the government to keep this kind of thing online). They’ve still got loads of deletions, especially in the Los Alamos and diffusion sections, […]

    • Don’t be deceived by appearances — try to access any of the documents from their search page. They’re not archived, as the Internet Archive doesn’t record “deep web” sorts of things (like big databases dependent on searching). There were over 10,000 PDFs in the original database.
