Re: Digital documents



As a professional in the data storage business and an amateur to this 
group, I'd like to comment on the discussion of data retention.

The problem is widespread and far more insidious than it appears at
first glance.  For example, my Social Security records are on digital
tapes that go back to the mid-1950s.  Most of those tapes were rated
with a maximum life of 25 years.  Many are suffering the degradation
that comes with time on celluloid, Mylar, etc.  (Consider the sorry case
of early cinema, where 70-80% of all early reels are now gone.)

The problem is more than the changing standards.  The actual media
don't have the history, development and maturity of paper.  They are
centuries away from that.

The changing formats are rightfully an issue.  Most of the Social
Security, IRS and DMV records are on 7-track tape.  It has been almost
20 years since 7-track units were readily available.  These agencies
spend tens of thousands of dollars per custom-built unit just to read
the old tapes, let alone move the data to more modern media.

If the government won't budget enough to upgrade the records behind its
own tax base, educational and scientific needs probably won't even be
considered.

Several of my customers are facing the problem head on.  But there have
been no great success stories to my knowledge.  The saddest one I've
heard recently is the fate of the data from many of the early weather
satellites over the last 35 years.  The budget would not allow even
proper storage.  No hard copy was ever made.  Most of this data was on
magnetic tape that is now in a powdered form, gone forever.

One of the positive aspects of the changing formats is that they hold
vastly more data every year.  The state of the art capacity for the
3 1/2 inch floppy is 1.44 MB vs 76 KB for the first 8 inch floppy, a
nearly 20-fold increase in just a few years.  Hard drives are currently
at 5-10 GB
and doubling every 18 months.  Not fast enough for the rate of data
generation, but still impressive.
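
To put numbers on that growth, here is a rough sketch of the arithmetic.
It assumes decimal units (1 KB = 1,000 bytes) and that the 18-month
doubling simply continues, which is an extrapolation and not a promise.
The function names are mine, made up for illustration.

```python
def fold_increase(old_bytes: float, new_bytes: float) -> float:
    """How many times larger the new capacity is than the old one."""
    return new_bytes / old_bytes


def projected_capacity(current_gb: float, years: float,
                       doubling_months: float = 18.0) -> float:
    """Capacity after `years`, assuming it doubles every `doubling_months`."""
    doublings = years * 12.0 / doubling_months
    return current_gb * 2 ** doublings


# 76 KB (first 8 inch floppy) vs 1.44 MB (3 1/2 inch floppy), decimal units.
floppy_ratio = fold_increase(76e3, 1.44e6)

# A 10 GB drive after three years of 18-month doublings (two doublings).
drive_in_3_years = projected_capacity(10, 3)

print(f"floppy: about {floppy_ratio:.1f}x")
print(f"10 GB drive in 3 years: about {drive_in_3_years:.0f} GB")
```

The floppy ratio comes out just under 19, so "20 fold" is a fair round
figure.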

How to preserve precious data in this rapidly changing environment is a
challenge that is being addressed by several small firms here in Silicon
Valley.  Many offer services to upgrade records on an ongoing basis, old
media to new.  Their customer base plans to be in a constant mode of
upgrading, 7-track to 9-track to 18-track to CD-ROM to DVD, etc.  The
task seems to be smaller than preserving the old media and maintaining
the hardware to read the old format.  It also saves physical space and
speeds up access.  And cost is significantly better.  In my career,
the devices have dropped from about $100/MB to $0.10/MB, a thousandfold
drop despite inflation.

Embracing the changes is not pleasant.  But it does keep all your data
available.  It puts more data into smaller packages.  It lowers the cost
of an archive.  And it makes the data more readily available.

One of the projects I worked on was to convert paper files to CD-ROMs
via a scanner.  Once on the CD-ROM, copies were around 80 cents to make.
Each CD held more than a full file cabinet.  Another project I know is
in progress is to convert 25,000 reels of 7-track tape to hard drives.
Each reel holds less than 20 MB of data, so the total database comes to
at most half a terabyte, or about 50 drives at current capacities.  The
user goes from minutes or hours for access to milliseconds.  It's
automobile registration data, so cost isn't the first factor.
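
The sizing for that tape project is simple enough to sketch.  This
assumes the upper bound of 20 MB per reel, decimal units (1 GB = 1,000
MB), and a 10 GB drive from the top of the current 5-10 GB range; the
function name is only for illustration.

```python
import math


def migration_size(reels: int, mb_per_reel: float, drive_gb: float):
    """Return (total data in GB, number of drives needed to hold it)."""
    total_gb = reels * mb_per_reel / 1000.0
    drives = math.ceil(total_gb / drive_gb)
    return total_gb, drives


# 25,000 reels at no more than 20 MB each, onto 10 GB drives.
total_gb, drives = migration_size(25_000, 20, 10)
print(f"{total_gb:.0f} GB total, {drives} drives")

# The same data on 5 GB drives takes twice as many.
_, drives_small = migration_size(25_000, 20, 5)
print(f"{drives_small} drives at 5 GB each")
```

Since each reel holds less than 20 MB, these figures are ceilings and
the real totals would be somewhat lower.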

I'm not a technology bigot.  I love my books, their feel, their
appearance.  But as a book collector, I'm still searching for many books
less than 100 years old.  Books don't solve the problem either.  And how
many students can afford the journals or small print run books?  In
paleo, how many have even seen an AMNH Novitate or some of the scarcer
Smithsonian pubs?

What all this means is that the large amount of scientific data can be
preserved, but it would take an active, continuous program.  It could be
done more cheaply than with paper and books, but the result would not be
nearly as secure or permanent.  Copies could be widely distributed to
prevent tragedies through war, acts of God, etc.  (Look what war did to
vast archaeological collections in Europe.)

On the down side, the format changes are always present; once you start
down this path you can't get off.  Second, the data can be manipulated.
It's hard to change a lot of books, but data can be changed by one
person during an upgrade to new technology.  Third, it requires
technical expertise from individuals who already spend a lifetime just
learning their own field.
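
One partial guard against changes slipping in during an upgrade is to
compare a checksum of each record before and after it is copied.  The
sketch below is only an illustration, not something any of the firms I
mentioned necessarily do, and the function names are made up.  It uses
SHA-256 from Python's standard hashlib module.  It catches accidental
corruption, but it only catches deliberate changes if the original
checksums are recorded somewhere the person doing the upgrade can't
alter.

```python
import hashlib


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Hex SHA-256 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # iter() with a sentinel stops when read() returns empty bytes.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_is_intact(source_path: str, copy_path: str) -> bool:
    """True if the migrated copy is byte-for-byte identical to the source."""
    return sha256_of(source_path) == sha256_of(copy_path)
```

In practice you would compute the checksum when the data is first
archived, keep it with the catalog, and recheck it at every later
migration.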

Many of my compatriots and I have been looking for a solution to this
problem for many years.  If we could find one, we would certainly be
retired in luxury because the market is huge: every government agency,
every library, every business in the world.

Wish I had more to offer this group.  My focus for the last 25 years has
been on this issue.  But I haven't found anything better than what one
gentleman described: update to the new and throw away the old.