The Pulse of Ocean Data Archiving

Every morning, I wake up and check my email. Overnight, I usually get a few pings from mailing lists, people in other time zones, spammers – the usual. Lately, though, I’ve been waking up to hundreds of new emails, and it’s inspiring. It’s cheering the dark bit of worry in my soul. It’s making me bask in the warm glowing warming glow of true science heroes.

It’s the issue tracker for the GitHub repo dedicated to saving the US Government’s public science data sets.

Given recent actions taken by the administration, as well as the painful memories of the conservative Harper government literally throwing out reams of data into dumpsters (for a fairly full timeline of Harper’s assault on science and removal of data – particularly at the Department of Fisheries and Oceans, near and dear to our hearts – see here as well as here), scientists have realized that the precious data we rely on from the government might not be there tomorrow. The EPA is under assault, with promises to review individual data sets and (fortunately now held at bay) plans to remove its information on climate change. The research wing of the Department of Agriculture was briefly gagged. There are discussions in the air of NASA’s division of Earth Science receiving some heavy regulation*. We’ve yet to hear much about changes and possible assaults on science at NOAA, the USGS, or other agencies that are the lifeblood of many scientific enterprises. I hope they do not come to pass.

But given what we saw with Harper in Canada and the swirling rhetoric of today, not to mention people like Pruitt becoming the EPA head, it’s concerning. Will we have the data we need to make decisions about the health and future of our oceans? Will we be able to do science with the long-term records these agencies have been able to create over the past few decades? I don’t know about you, but if those data streams disappeared tomorrow, some of my science would be well and truly farked.

Enter Data Refuge, Climate Mirror, The Environmental Data Governance Initiative, and scientists and coders dedicated to the preservation of this precious resource. Coders and non-coders alike have been gathering at events around the country (for a list of upcoming events, see here) to preserve these vital resources.

But how can *you* participate if you’re just someone sitting on the side with few resources to work at one of these events (besides watching things unfold on Twitter)? How can you see what’s going on, if only to lower your blood pressure if you’re a user of these data sources?

A lot of the coordination for Climate Mirror is happening on GitHub. For those of you unfamiliar, GitHub is a cloud-hosted service for version control of computer code. It’s part of many people’s daily workflow (WOO!), but what makes it super powerful is the ease of collaboration and issue tracking. Clever folk have used GitHub’s issue tracking system to build an open-source tasklist of which data sets need to be archived and where they are found. This comes with detailed instructions about dataset reporting.
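If you would rather watch progress from a script than from an inbox, here is a minimal Python sketch. It assumes the tasklist lives in a public GitHub repository and uses GitHub’s REST endpoint for listing a repository’s issues. The owner and repository names are placeholders, not the project’s actual location, and the function names `fetch_issues` and `summarize_issues` are made up for illustration. It also only reads the first page of results (up to 100 issues), and unauthenticated requests are rate limited.

```python
import json
import urllib.request
from collections import Counter


def fetch_issues(owner, repo, state="all", per_page=100):
    """Fetch the first page of issues from a public GitHub repository."""
    url = (
        f"https://api.github.com/repos/{owner}/{repo}/issues"
        f"?state={state}&per_page={per_page}"
    )
    request = urllib.request.Request(
        url, headers={"Accept": "application/vnd.github+json"}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def summarize_issues(issues):
    """Count issues by state ("open" or "closed"), skipping pull requests."""
    counts = Counter()
    for issue in issues:
        # The issues endpoint also returns pull requests;
        # those entries carry a "pull_request" key.
        if "pull_request" in issue:
            continue
        counts[issue["state"]] += 1
    return dict(counts)


# Example usage (placeholder names; needs network access):
# issues = fetch_issues("example-owner", "example-repo")
# print(summarize_issues(issues))
```

Whether a closed issue really means “this data set is safely archived” depends on how the project uses its tracker, so treat the tally as a rough pulse rather than an audit.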

I use a lot of wave hindcast data from NOAA for models of global climate change effects. I didn’t see any listed, so I filed an issue as directed, and within a day or so some intrepid archiver picked it up. So, if something happens to this source, in a month, in a year, we’ll have the data we need to ask questions about how waves have changed over the past few decades. I relax, just a little bit. I hope others do, too.

And so, sometimes in drips, sometimes in firehose-like floods, my inbox fills with issue tracking reports from this repo. Every one makes me breathe a bit easier. Every one lets me know that we, as a society, value information and what it can do for us. Every one makes me think about the Library of Alexandria, and that we will not let that happen – not to the data from our oceans, our planet. We need the data to tell us its stories. And now we can rest a little easier with each new data set flying up into the cloud.

*N.B. Some of my lab’s funding currently comes from this division at NASA, and I also have funding from NOAA Sea Grant.