Storage, Part I of Many

While on a break from blogging, I’ve been churning about all things In Perpetuum. By churning, I mean creating lots of digital audio and video files, making a short movie, and sipping a campari and worrying about formats and storage. In these dog days, then, when it doesn’t hurt to prop the laptop on icepacks, I am taking steps to Do Something.

Lo these many months ago when I conducted an inventory of my materials, I found that I had different priorities and strategies for preserving music, video, captured sound, and text files. After exploring how best to optimize my devices to capture the best quality audio and video (the better to preserve), I floundered on how to store and manage the files. None of the options seemed like The One Solution.

Since I’m somewhat recently come from an academic institution, I assumed that having The One Solution (supported by central IT, of course) was the only solution. Silly me, forgetting the purpose of this exercise: Save my stuff sans the infrastructure and resources that public and private institutions afford. I’m taking one for the team. In which case, there isn’t The One Solution maintained by central IT. There are many solutions, and it’s complicated enough that I’ll have to write a plan just to remember what to do. The basic parts involve:

  • two 1-terabyte drives that rotate between my domicile and a safe deposit box at my bank
  • 1 terabyte of space in the ether (see comments below)
  • a new laptop to handle video and data processing
  • two old laptops that manage music (the oldest one) and travel (the second oldest one) – of course, backed up to the drives and the ether.

What – you don’t save your old computers?

I’m almost ready to blow my tax return on digital and analog storage, and I’ll provide the numbers in a subsequent post. But I’ve been reading and thinking, especially about online storage, and here’s what I’ve learned:

There’s a language problem: Web hosting. Cloud storage. Network storage. Everybody says they has it cheap, though their grammars aren’t always rights, and theys seem to have many company for one services. (note to self: look at DNS Registration). Everybody says they gives you tools to manages your stuffs. And it encrypted. And it unlimited. Caveat emptor. The web hosting peeps don’t do data storage, but they’re keen on SSL and will let you FTP an unlimited amount of stuff to their site. They do want to throttle traffic and maintain service for everyone who’s hosting a site on their servers, and they do promise much uptime. The cloud/network storage peeps don’t trust you with FTP and want you to use/download their synching tool. They have good thoughts about security mostly, but storage space is at a premium, and they don’t cop to how long it actually takes to send 1 terabyte of information into the cloud for storage. (Note to self: check bandwidth of ISP.) Several sites review web hosting and cloud/network storage options, though it seems that there is a “reward” for reviewers. Mad props to the peeps on the MacRumors forum who suss out language and storage options.
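For what it’s worth, the arithmetic behind that note to self is easy to sketch. The upload speeds below are made-up placeholders, not anything an ISP or storage service has quoted, and the sum ignores throttling, protocol overhead, and dropped connections, so treat the result as a best case.

```python
def upload_days(size_bytes: float, upload_mbps: float) -> float:
    """Best-case days needed to send size_bytes at upload_mbps (megabits per second)."""
    bits = size_bytes * 8
    seconds = bits / (upload_mbps * 1_000_000)
    return seconds / 86_400  # seconds in a day


ONE_TB = 10**12  # a decimal terabyte, as drive makers count it

# Hypothetical upload speeds; swap in whatever a speed test reports.
for mbps in (1, 5, 20):
    print(f"{mbps:>2} Mbps up: {upload_days(ONE_TB, mbps):.1f} days")
```

At a 1 Mbps upload rate, a full terabyte takes roughly three months of nonstop sending, which may be why nobody advertises the number.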

There are choices to be made: Just because you’ve created an array of online and offline storage options doesn’t mean you can FTP and overwrite willy nilly. Should there be an Ur machine that contains everything? What if you have a video file from, say, a Canon FS200 that saves a .MOD and a .MOI for each video clip? These files are virtually unplayable (thank goodness for VLC) and un-editable unless converted to another format. What do you save in your three 1-terabyte locations: the .MOD/.MOI files; the converted .DV files; the edited and marked .DV clip that you plan to use for a movie? What about the original file on the SD chip? Safe deposit box?

There are other lessons, but this post is long enough, and the WordPress servers are limitless(!), so I can post more thoughts on storage. The lessons learned, for the moment: I have further confirmation that it’s best to make a storage and preservation plan for each type of item that’s dependent on its whole life cycle. Somehow, finding a way to print my high school thesis in WordPerfect 3.x seems like a piece of cake.

In Perpetuum Week 2 Report

The primary Week 2 activity was to take the “Born Digital Blog” AIMS survey and modify it with supplementary questions as needed. The goal was to have a more structured survey to provide more context for my informal Materials Inventory. Unfortunately, taking the AIMS Survey was not as bounded an exercise as I’d hoped. After struggling with the broad scope of the survey, I found it was easier to begin modifying the survey with respect to images, while trying to remain media-neutral, than to answer the whole survey for all of my materials.

Ultimately, I modified the two-part AIMS survey by:

  1. combining “digital environment” (e.g., hardware, software, back-up) questions from Parts I and II for capture in a spreadsheet;
  2. answering relevant “digital creation, use, and management” questions from Part I while interjecting my own; and,
  3. creating a list of “personal context” questions that reflect my priorities and process activities specifically for images.

The final result yielded answers that were structured in such a way that I could better act upon them (the difficult part!), which will be chronicled in subsequent entries.

Brief Analysis of the AIMS Survey for Personal Preservation

The survey is split into two sections. Part I begins with the note: “This part of the survey is designed to be a prompt sheet for phone / face-to-face interview with donors by curators / digital archivists.” Since I am playing both roles, the asking/answering is not such a fraught event as a one-time donor/curator conversation might be, though the survey asks for follow-up contact info. However, one of the lessons learned from acquiring and cataloging digital materials for an institutional repository is that at least one or two exchanges are required about the materials, the metadata, the policies. For a donor with even a minimal amount of hardware and history of creating digital files, I imagine this would be a prolonged conversation. In fact, I’m curious about the phases of an inventory / acquisition cycle before there is a complete hand-off to the institution, a very human-intensive process. Certainly schlepping materials off a few hard drives makes for quicker acquisition, but time invested at the beginning of the inventory and during acquisition might result in easier organization and less uninformed forensic work in the lab. Part I was very useful as a prompt to consider the range of digital environments where a donor’s content might reside and the informal policies or practices a donor might have regarding use.

Part II begins with the note: “This part of the survey is designed to be filled out by digital archivists regarding technical details of the tools used to create digital material.” This section was also useful as a prompt to consider hardware, software, networking, internet access, and security issues. However, this information begged to be organized in a spreadsheet, and there was some information from Part I that would make more sense when combined with Part II. Also, there was information from my informal inventory that I wanted to capture, hence the new spreadsheet.

The Born Digital Blog mentioned in a late 2010 post that the AIMS survey was to be put online with a database backend, but I can’t find the exact post at the moment.

I don’t mean to sound like I’m hating on the AIMS survey. It was developed and modified by two very thoughtful groups of people who are working in a different context from me and who have other pressures as well (e.g., library directors; institutional missions; project partners; funding requirements). Thanks to their hard work and willingness to share, I can adapt the survey for the home user or the lone preservationist.

Per the document, “This work is based on the Paradigm records survey published by the Bodleian Library, Oxford University.” Further, “This work is licensed under a Creative Commons Attribution-Share Alike 3.0 License. Revision: July 16, 2010. Born Digital Collections: An Inter-Institutional Model for Stewardship (AIMS).”

Next posts will be the actual artifacts (spreadsheet, modified questions with answers, personal context questions) and some decisions about images.

Materials Inventory

As noted in the In Perpetuum project plan, this week’s activity was to take an inventory of my materials that I want to, shall we say, save. “Preserve” is such a loaded term and doesn’t fully express what I mean. Rather than give a blow-by-blow of this watching-paint-dry exercise, I’ll report some of the highlights, observations, and next steps.

Compulsive, but deductive, organizer that I am, I began by listing the information about the materials to capture. Whoa there, it’s not metadata yet. The result was a more thorough list of my materials (digital and analog) to save, how I’d like to sort the eventual metadata (by Type, Level of Access, and Priority), and other bits of context to capture in the inventory (e.g., dates, type of use, end goal for material, current storage, file type, OS, priority for saving, and unit(s) of measure (total can of worms)).
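For anyone who would rather start in code than in a spreadsheet, here is a minimal sketch of one inventory row and the Type / Level of Access / Priority sort. The field names are loosely taken from the list above; the sample rows, the access labels, and the priority numbers are invented purely for illustration and are not the actual inventory.

```python
from dataclasses import dataclass


@dataclass
class InventoryItem:
    name: str
    type: str             # e.g. audio, image, video, text
    access: str           # level of access (labels here are hypothetical)
    priority: int         # 1 = save first (numbering is an assumption)
    current_storage: str
    file_type: str = ""   # blank for analog items
    unit: str = ""        # unit of measure: files, tapes, inches of shelf...


def sorted_inventory(items):
    """Sort by Type, then Level of Access, then Priority."""
    return sorted(items, key=lambda i: (i.type, i.access, i.priority))


# Illustrative rows only.
items = [
    InventoryItem("home VHS tapes", "video", "private", 2, "shelf", unit="tapes"),
    InventoryItem("interview with great-grandmother", "audio", "private", 1,
                  "micro-cassette", unit="cassettes"),
    InventoryItem("digital camera images", "image", "public", 3,
                  "external hard drive", "JPEG", "files"),
]

for item in sorted_inventory(items):
    print(item.type, item.access, item.priority, item.name, sep=" | ")
```

Keeping the unit of measure as free text sidesteps the can of worms for now: analog and digital items can sit in the same list without pretending their sizes are comparable.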

I was mostly successful in capturing what was outlined. The inventory itself took about 2 hours. Useful items: pen, paper, tape measure, lots of floor space, dust-free area, dust cloth, plastic bag and tape for batteries (tape nodes, save for recycle). I was remiss in not taking photos but will do a better documentary job as this progresses. In fairness, I spent several days over the past two months consolidating files on my external drive and computer, so 2 hours is the culmination of a week of work.

Observation 1: In another post, I’ll summarize the amount of storage on all my devices and how much space I’m using. Don’t yawn. The units of measure don’t permit an apples-to-apples comparison between digital and analog when planning for saving or storage, but there is overlap between the two: a 250GB external hard drive in its box also takes up a 10×5 inch space on a shelf.

Observation 2: There is A LOT of redundancy in what’s been saved to date, at least for images. I don’t delete and re-use the image cards from my digital camera, and the images are on a computer, an external hard drive, and Flickr. What to do?

Observation 3: However, there are other files/folders for which I have only one copy. The worst case: At some point I put files from 1987-1999 that were on 3.25 floppies onto a PC and have migrated that folder through at least four computers (Mac and PC). It lives on the external hard drive only.
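Observations 2 and 3 are two sides of one question: how many copies of each file exist? A rough way to answer it for locally mounted drives is to group files by a checksum of their contents. The sketch below is an assumption about how one might do that, not a description of how this inventory was made; the folder paths are whatever you pass on the command line, and copies that live only on a service like Flickr won’t be counted.

```python
import hashlib
import sys
from collections import defaultdict
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """SHA-256 of a file's contents, read in chunks so big videos fit in memory."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def copies_by_content(roots):
    """Map each content digest to every path under the given folders holding it."""
    found = defaultdict(list)
    for root in roots:
        for path in Path(root).rglob("*"):
            if path.is_file():
                found[file_digest(path)].append(path)
    return found


def split_report(found):
    """Return (files with only one copy, groups of identical files)."""
    singles = sorted(paths[0] for paths in found.values() if len(paths) == 1)
    duplicates = [sorted(paths) for paths in found.values() if len(paths) > 1]
    return singles, duplicates


if __name__ == "__main__":
    # Usage (script name is hypothetical): python copies.py /path/one /path/two
    singles, duplicates = split_report(copies_by_content(sys.argv[1:]))
    print(f"{len(singles)} files with a single copy")
    print(f"{len(duplicates)} groups of identical files")
```

Matching on contents rather than file names means a renamed photo still counts as a copy, while two different files that happen to share a name do not.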

Observation 4: After the inventory, I was a bit overwhelmed. It’s a royal pain to manage my materials, and I like doing this. What is everyone else doing, and what’s being lost? I made the “fire list” of priorities in case I run out of steam, and go figure, I want to save all the materials on tape and film. I’m less concerned about the digital materials (except for my high school senior paper in WordPerfect). In twenty years, will today’s kids have to worry about their term papers being inaccessible in Google Docs? For the record, I would save: an interview with my great-grandmother on micro-cassette; a folder of “vital” documents; all pictures; all video; mix tapes (audio cassette); home VHS tapes; vinyl; my iTunes library; master file of work notes; email; work documents.

Observation 5: I’ve spent a lot of time burning email to discs and exporting it from one machine to another, but it’s a really low priority to save. In the interest of “saving” (my time, trees, energy), should I print that special email from my mom, put it in a folder, and delete the million other mundane messages? Also, saving things that I’ve published is dead last on the list. Sorry, open access folks, but right now I’m counting on publishers (open or not) to perpetuate the academic record of my work.

Observation 6: Take out your batteries! Especially the alkaline ones if you haven’t used the device for more than a year. I understand some people (not me) like to lick the white stuff that flakes off corroded batteries. Save them, and you and your materials, a trip to the emergency room, and remove the batteries from devices you don’t use but won’t throw away, ahem, recycle.

This post is way too long, so I’ll put next steps in another segment and will post the spreadsheet with inventory. Scintillating reading.

Personal Preservation

I’m beginning a new project to explore the options for preserving my digital (and maybe analog) stuff: images, videos, and texts. This three-month, proof-of-concept endeavor is a classic plan-to-plan activity, but at the conclusion, I should have a solid idea of what I can do, as a layperson, about digital preservation and a plan for doing it. I’ll post the project plan shortly.