I’m a big fan of digital historical research. Which is to say, I’ve benefited a lot from the fact that there are a lot of great online resources for primary source work in nuclear history. These aren’t overly-curated, no-surprises resources. The paper I gave at the last History of Science Society meeting, on US interest in 50-100 megaton weapons, was surprising to pretty much everyone I told about it, yet was based almost exclusively on documents I found in online databases. You can do serious research with these, above and beyond merely “augmenting” traditional archival practices.[1]
Like all things, digital history comes with its pitfalls. The completely obvious one is that not everything is digitized. No surprise there. That doesn’t really change the digital archival experience from the physical one, of course, since even physical archives are always missing huge chunks of the documentation. As with “regular” archives, the researcher compensates for this by looking at many such databases, and by looking closely at the materials for references to missing documents (e.g. “In response to your letter of March 5” indicates there ought to be a letter from March 5th somewhere). This doesn’t make digital archives less useful; it just means their role cannot usually be absolute. Being able to quickly search said databases usually more than compensates for this problem, of course, since the volume of material that can be looked at quickly is so much higher than with physical paper. And I might note that one of the best parts about many of the digital archives for nuclear sources is that the documents often indicate their originating archive, which can point you to sources you might not have considered (like off-the-beaten-trail National Archives facilities).
But perhaps the biggest problem with digital sources is that, like so many things in the digital world, they somehow have the ability to vanish completely when you really want or need them. (As opposed to the normal online trend of things sticking around forever when you wish they would go away.) The fall of 2013 was, among other things, the season of the disappearing websites. At least three major web databases of nuclear history resources that I used on a regular basis silently disappeared.
The first of these, I believe (it is hard to know exactly when things vanished as opposed to when I became aware of them — in this case, September 2013) was the DOE’s Marshall Islands Document Collection. This was an impressive collection of military and civilian reports and correspondence relating primarily to US nuclear testing in the Pacific. Its provenance isn’t completely clear, but it probably came out of the work done in the mid-1990s to compensate victims of US atmospheric testing.
I found this database incredibly useful for my creation of NUKEMAP’s fallout coding. It also had lots of information on high yield testing in general, and lots of miscellaneous documents that touched on all manner of US nuclear developments through the 1960s. It used to be at this URL, which now re-directs you to a generic DOE page. I e-mailed the webmaster and was told that it isn’t really gone per se, it’s just that “Access to the HSS website has been disabled for individuals trying to access our website from the public facing side of the internet. We are working to put mitigation in place that will allow us to enable public access to our web site.” That was several months ago, right before the government shutdown. What I fear, here, is that a temporary technical disabling of the site (because they are re-shuffling around things on their web domains, as government agencies often do) will lead to nobody ever getting it back up again.
Next was the Hanford Declassified Document Retrieval System, which went offline in November 2013 (or so). It used to be here, which now gives a generic “not found” message. It used to have thousands of documents and photographs relating to the Hanford Site spanning the entire history of its operation. In my research, I used it extensively for its collection of Manhattan Project security records, as well as its amazing photographs. Again, I suspect it was a creation of the mid-to-late 1990s, when “Openness” was still a thing at DOE.
I’d be the first to admit that its technical setup seemed a little shaky. It required a clunky Java applet to view the files, and its search capabilities left a little to be desired. Still, it worked, and could be actively used for research. I got in contact with someone over there, who said it had to be taken down because it had security vulnerabilities, and that eventually they planned to get it back up again, but that “we don’t have a timeline for accomplishing that right now.” They offered to search the database for me, through queries sent via e-mail, but obviously that doesn’t quite cut it in terms of accessibility (especially since my database process involves many, many queries and glancing at many, many documents, most of which are irrelevant to what I’m looking for).
Will it get back up? The guy I talked to at Hanford said they were trying to resurrect it. But I have to admit, I’m a little skeptical. It’s not at the top of their agenda, and clearly hasn’t been for over a decade. If they do get it up, I’ll be thrilled.
Lastly, there is the DOE Digital Photo Archive, which was a publicly-accessible database of DOE photographs, from the Manhattan Project through the present. Some of these were quite stunning, and quite rare. One of my all-time favorite photographs of the nuclear age came from this database. The archive used to be here; it now redirects to a generic page about e-mailing the DOE for photographs. Not the same thing. I got in touch with someone who worked there, who said that the database site “has been closed down,” and that instead I could trawl through their Flickr feed. They, too, offered to help me find anything I couldn’t, but that doesn’t actually help me too much, given how much serendipity and judgment play in archival practice.
As an extra “bonus” lost website, Los Alamos’ pretty-good-but-not-perfect history website was also taken down very recently, and replaced with a single, corporate-ish page that skips from World War II to the present in one impressive leap and gives nothing but a feel-good account of the first atomic bombs. The site it replaced was more nuanced, had a reasonably good collection of documents and photographs, and covered Los Alamos’ history through the Cold War pretty well. It had its issues, to be sure, including some technical bugs. But even a buggy site is better than a dead one, in my opinion. A new site is supposedly in the works, but it seems to not be a high priority and no short-term changes are expected.
None of these sites were taken down because of anything objectionable about their content, so far as I know. The issues cited have been a mixture of technical and financial (which are, of course, intertwined). Websites require maintenance. They require upkeep. They require keeping technically-inclined people on staff, with part of their day devoted to putting out the little fires that inevitably come up over the years with a long-lasting website. Databases and interactive sites in particular require considerable effort to put together, and a lot of time over the years to keep up to date in terms of security practices.
I work on web development, so I get all that. Still, it’s a terrible thing when these things just vanish. Aside from the fact that some people (I imagine more than just myself) find them useful, there is the sheer amount of resources essentially wasted when such a long-term investment (think of the man-hours that went into populating those databases!) is simply turned off.
What should scholars do about it? We can complain, and sometimes that works. A better solution, perhaps, is to keep better mirrors of the sites in question. This is particularly true of sites with any potential “national security implications.” When Los Alamos took their declassified reports offline after 9/11, the Federation of American Scientists managed to cobble together a fairly complete mirror. (The Los Alamos reports have since been quietly reinstated for public access through the Los Alamos library site.)
I wish, in retrospect, that in the past I had considered the possibility that the Hanford and Marshall Islands databases might go down. Making a mirror of a database is harder than making a mirror of a static website, but it’s not impossible. (Archive.org does not do it, before you offer that possibility up.) For the specific reports, documents, and photographs that I actually use in my work, I always have a local copy saved. But there is so much out there that was yet to be found. I might try filing a FOIA request for the underlying data (it would be trivial for me to turn them into a useful database hosted on my own servers), but I’m not sure how well that will work out (it seems to go a bit beyond a normal FOIA).
After the Hanford database went down, I thought, what are the other public databases that my work depends on? The most important is DOE’s OpenNet database, which contains an incredibly rich (if somewhat idiosyncratic) collection of documents related to nuclear weapons development. Huge chunks of my dissertation were based on records found through it, as are most of the talks I give. If it went down tomorrow, I’d be pretty sunk. For that reason, while the government was going through its shutdown last October (I figured no one would be around to object), I made a reasonably complete duplicate of everything in OpenNet using what is known as a “scraper” script.[2] Obviously as OpenNet gets updated, my database will fall out of sync, but it’s a start, and it’s better than nothing if it gets unplugged tomorrow.
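For the curious, a scraper of this kind can be sketched in a few lines. The Python below is an illustration only, not the script described above: the URL pattern, the `mirror_records` function, and the assumption that records sit at sequential numeric IDs are all hypothetical, and a real database would probably need its search forms queried instead.

```python
import time
import urllib.request
from pathlib import Path
from typing import Callable, Iterable

# Hypothetical URL pattern; a real database's addresses will differ.
URL_TEMPLATE = "https://example.gov/records/{record_id}"


def fetch_url(url: str) -> bytes:
    """Download one page and return its raw bytes."""
    request = urllib.request.Request(
        url, headers={"User-Agent": "archive-mirror-sketch/0.1"}
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return response.read()


def mirror_records(
    record_ids: Iterable[int],
    out_dir: Path,
    fetch: Callable[[str], bytes] = fetch_url,
    delay_seconds: float = 1.0,
) -> list[int]:
    """Save each record into out_dir; return the IDs that could not be fetched."""
    out_dir.mkdir(parents=True, exist_ok=True)
    failed = []
    for record_id in record_ids:
        target = out_dir / f"{record_id}.html"
        if target.exists():
            continue  # already mirrored on an earlier run
        try:
            target.write_bytes(fetch(URL_TEMPLATE.format(record_id=record_id)))
        except OSError:
            failed.append(record_id)  # note the gap and keep going
        time.sleep(delay_seconds)  # be gentle with the server
    return failed
```

Calling `mirror_records(range(1, 101), Path("mirror"))` would attempt the first hundred hypothetical records, pausing a second between requests and skipping anything already saved, so an interrupted run can simply be started again.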
The amazing thing about digital databases is that they take the archive everywhere at once, instantly. The terrible thing about them is that it only takes the pull of one plug to shut it down everywhere at once, instantly. Anyone who does research on nuclear history issues should be deeply disturbed by this rash of site closures, and should start thinking seriously about how to make copies of government databases they rely on. (Private databases are more complicated, for copyright reasons.) The government gave, and the government has taken away.
- Which databases, you ask? 1. The CIA’s online FOIA database; 2. Gale’s DDRS database; 3. the DOD’s online FOIA database; 4. DTIC; 5. ProQuest’s Congressional hearing database; 6. the JFK Library’s online files; 7. the National Security Archive’s online database; 8. the Nuclear Testing Archive (DOE OpenNet); 9. the OHP Marshall Islands Database; 10. the ProQuest Historical Newspaper database; 11. the UN’s website; 12. the searchable Foreign Relations of the United States. The only other significant non-online archival sources were the Hansen papers at the National Security Archive and some files from the JFK Library that they provided me over e-mail.
- I whipped something together using Snoopy for PHP, which allows you to do all sorts of clever database queries very easily.