How the Librarians Saved History: Harvesting Government History, One Web Page at a Time

My work is rewarding whether it gets recognition or not, but I have to admit, it was nice to get an honorable mention in this NYTimes article.

There is so much I love about what I do, and where I do it and who I do it with, and this project brought it all together. My fellow harvesters and I were all connected by at most 1.5 degrees of separation. There was even someone who was in a class I taught a zillion years ago when I was an adjunct at Queens College!

It is so great to go to conferences and events and run into so many Pratt graduates. Many were my students and took my information policy or government information classes. I remember their term papers and their presentations, and it is great to see them involved in information activism.

But now that my five minutes of fame are over, it’s time to get back to work for access to information. I am looking for suggestions. And don’t forget to #GovDocs2Trump


Double feature: #GovDocs2Trump Tweetathon and End of Term Harvest

This came to me through library channels and may have originated from @noftalee. The idea is to tweet Trump some of the documents that tell the story of our country.

The Tweetathon announcement says:

#GovDocs2Trump Tweetathon

America deserves a president who is well versed in the history of this nation and the documents upon which that history was built. Let’s present those documents to the President-Elect through his favorite medium–Twitter.

Tweetathon will begin at 9am (central) on December 1, 2016. You are welcome to join at any time.

Feel free to use whatever government-related document (Supreme Court decisions, inaugural addresses, speeches, early American papers, etc.) strikes your fancy.

Tag each tweet with the hashtag #GovDocs2Trump and please send them to @realdonaldtrump. This way we can fill his feed.

Finally, please make your first tweet “Dear @realDonaldTrump, We the people demand an informed President.”

So yes, of course I plan to join the Tweetathon. In fact, I started making a list of documents I will send. These include CONAN (The US Constitution Annotated), the Nixon grand jury records, and many more.

For those who would like to join the conversation but need suggestions on where to find government documents, here are some places to start:

Our Documents has a list of 100 milestone documents from American history, such as the Emancipation Proclamation. A much larger collection is available from Govinfo and the Government Publishing Office’s database. Browse their index for Executive orders, Presidential papers and more.

Are your interests in history, diplomacy, or foreign affairs? Try FRUS, Foreign Relations of the United States. There you will find all the correspondence, cables, letters, etc. between presidents and other officials. The collection is arranged by president and by topic. For example, under John F. Kennedy you will find Kennedy-Khrushchev Exchanges, Volume VI (it’s basically a retrospective, edited WikiLeaks).

For those who are more into numbers, there are reports from the Census Bureau on topics such as poverty, as well as infographics.

And the Double Feature? The start of the Tweetathon happens to coincide with the End of Term Harvest event I am facilitating tomorrow at the New York Academy of Medicine.

Grey Literature End of Term Harvest. 10am-1pm, The New York Academy of Medicine, 1216 Fifth Avenue at 103rd Street, New York, NY 10029.

The change of government administration brings the potential to eliminate websites, remove information and limit access to past administration content. On this day we will identify such websites, particularly in areas such as the Affordable Care Act, climate change and more, focusing on government social media and information not on .gov domains.

The plan is to double dip. No matter where you are, #GovDocs2Trump


Archiving the Obama administration

On Thursday, Dec. 1, I will be facilitating an event in which participants help archive websites of the Obama administration that are in danger of disappearing with the change of administration. The event is hosted by the New York Academy of Medicine. Details are on the NYAM website. Archiving instructions are available here.



Capturing government websites during the 2013 shutdown

The shutdown of the U.S. government from Oct. 1 to Oct. 16 left government websites in varying stages of disarray. Some agencies shut their websites completely, others remained accessible although no longer maintained, and others still seemed unaffected.
Libraries and commercial vendors stepped in to fill the gap. Librarians created LibGuides with updates on the status of government websites and sources, and commercial vendors provided temporary complimentary access to their databases that are based on government information. In the few days that have passed since the reopening of the government, these LibGuides have ceased to exist and the commercial access has been removed (in fact, access to Social Explorer ceased two days before the shutdown ended). This should not be surprising given the immediacy of websites, where here-today-gone-tomorrow is the prevailing approach and the historic value of documenting transitions is overlooked.

I thought it worthwhile to document government websites during the shutdown and was looking for a way to do so quickly, with the limited technology tools and skills available to me. The immediate solution was to capture government websites with Zotero and create a library of websites as they appeared at the time of the shutdown.

Zotero is an open source bibliographic citation manager from George Mason University. It can be integrated into a web browser, and when you click ‘add’, it captures the website displayed in the browser, saving a screenshot of the website as well as the bibliographic citation. It is quick and easy to use and answered the needs of this project.

My first priority was to capture all official government websites. Using the A-Z list from USA.gov, I captured 405 websites from the legislative, executive and judicial branches as well as quasi-governmental websites. All the websites are available from this Zotero library.

Next, I decided to capture the official social media websites used by the U.S. government. This part of the project was done by my colleague Anthony Cocciolo. We based the capture of social media sites on work previously done for the End of Term Harvest. Anthony wrote a script for a program that would crawl all the social media websites, import them into Zotero and capture their screenshots. The result is a library of 1356 government social media websites.
The final step of this project is still in progress, and includes adding tags to all snapshots. These tags will allow future researchers to search the collection by applying different filters, such as shutdown status (Completely shut down, Available but not updated, No apparent change), branch of government (Executive, Executive Office of the President, etc.) and agency (Dept. of Labor, Interior, etc.). These tags must be added manually and I will continue to do so over the next few weeks.
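
To make the tagging idea concrete, here is a minimal sketch in Python of how tag-based filtering could work once the snapshots are tagged. It is not how Zotero stores or searches items. The Snapshot class, the filter_by_tags function, the "status:/branch:/agency:" tag prefixes, the placeholder titles and the .example URLs are all assumptions made up for illustration, and the tag assignments below do not reflect the actual status of any agency.

```python
from dataclasses import dataclass, field


@dataclass
class Snapshot:
    """One captured website (hypothetical structure, not Zotero's schema)."""
    title: str
    url: str
    tags: set = field(default_factory=set)


def filter_by_tags(snapshots, *required):
    """Return the snapshots that carry every tag listed in `required`."""
    wanted = set(required)
    return [s for s in snapshots if wanted <= s.tags]


# Placeholder records: the tag values come from the categories described
# above, but the pairing of tags with sites is invented for illustration.
snapshots = [
    Snapshot("Placeholder site A", "https://site-a.example",
             {"status:Completely shut down", "branch:Executive",
              "agency:Dept. of Labor"}),
    Snapshot("Placeholder site B", "https://site-b.example",
             {"status:Available but not updated", "branch:Executive",
              "agency:Interior"}),
    Snapshot("Placeholder site C", "https://site-c.example",
             {"status:No apparent change",
              "branch:Executive Office of the President"}),
]

# Example: executive-branch sites that were completely shut down.
for snap in filter_by_tags(snapshots,
                           "status:Completely shut down",
                           "branch:Executive"):
    print(snap.title, snap.url)
```

Combining filters this way (every requested tag must be present) is what would let a researcher ask, for instance, for all executive-branch sites that went completely dark.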

While we recognize this is not a true archive, we hope this capture will help those who are interested in learning more about the status of government websites during the shutdown.


Suggestion Box: The Full Catastrophe: NYPL – Please review your circulation policy

The off-site storage policy of New York Public Library has stirred much public debate, and rightly so. I feel honored to live in a city that cares about its public library. I won’t repeat the debate and will only say briefly that I personally have no problem with off-site storage. From what I tested, the turn-around time is pretty good and I feel the books are accessible. What I do take issue with is that none of the books stored off-site circulate. To invoke the cliché, what does that have to do with the price of tea in China? In other words, why can’t off-site books circulate, and why are so many books in-library use only? Once a book is delivered, why can’t I check it out? For example, Zygmunt Bauman is a contemporary sociologist who has published widely on themes of post-modernism, modernity, and liquid societies. He has published 57 books and countless articles. Many of his popular books are available on Amazon, but at NYPL only two of the books circulate, and the others are all either off-site or in-library use only. Why don’t these books circulate? These are not books one can read at the library, and these are not reference books; these are books you have to have by your side as you read them.
Another example: The Full Catastrophe, a book by David Carkeet, is a funny academic mystery novel with a linguistic twist. It’s fiction, it’s a novel, it’s a mystery, it’s a summer, rainy-day-upstate kind of book. Why is it stored off-site, and why does it not circulate? NYPL, are you listening?