Backup tips and strategies for Artists
Summary: an introduction to taking proper backups with emphasis
on the data commonly used by artists.
A backup, in computer lingo, refers to making a copy of
important data for the purpose of data recovery. Should the
important data get damaged or lost, a properly made backup will
restore it all. The word "data" refers to anything stored on a
computer system: images, programs, documents, videos, etc. Taking
backups of important data can prevent loss of valuable work and
the time needed to recreate it.
In this article we'll take a look at common backup types and
strategies, data compression, and common backup media types. A
real-life backup scenario will illustrate my own backup
procedures. The article will end with general backup tips.
COMMON BACKUP TYPES
The best backup methods rely on simple and time-proven
concepts. New or unnecessary technologies are best avoided until
proven reliable and necessary. The simpler the procedure, the
more likely it is to work correctly.
A full backup consists of making a copy of all important data.
When you copy a folder of important files from, say, a hard
drive to a floppy, you are actually making a full backup of
those files. Thanks to its simplicity, this approach is the most
reliable of all backup types. Its main advantage is ease of
backup creation and restoration. The main disadvantage is that
the backup will use as much space as the important data. If the
data is large, the backup process can be very resource intensive
in terms of the time and processing power needed to carry it
out. Imagine the time needed to take a full backup of a digital
library consisting of millions of books. Such an operation can
take days.
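The folder copy described above can be sketched in a few lines
of Python. This is only a minimal illustration, not a complete
backup tool. The function name full_backup, the label argument
and the folder names are made up for the example.

```python
import shutil
import tempfile
from pathlib import Path


def full_backup(source: Path, backup_root: Path, label: str) -> Path:
    """Copy the whole source folder to backup_root/label and return that path."""
    target = backup_root / label
    # copytree copies every file and subfolder. It refuses to overwrite an
    # existing target, so each full backup needs its own label.
    shutil.copytree(source, target)
    return target


if __name__ == "__main__":
    # Demonstration in a throwaway folder (hypothetical file names).
    with tempfile.TemporaryDirectory() as tmp:
        work = Path(tmp) / "artwork"
        work.mkdir()
        (work / "sketch.txt").write_text("rough idea")
        copy = full_backup(work, Path(tmp) / "backups", "full-001")
        print(sorted(p.name for p in copy.iterdir()))
```

Restoring is the same copy in the opposite direction, which is
why this backup type is so easy to trust.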
An incremental backup works differently in that it backs up only
the files modified since the last backup. When using this
method, a full backup is created first and then incremental
backups are run on a regular basis. For large amounts of data
this method is often the only practical way to back up. It takes
up less space than a full backup and is less resource intensive
to run. On the other hand, unlike full backups, incremental
backups need dedicated backup software to keep track of which
files to back up.
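To give a feel for what such software does, here is a simplified
sketch that copies only the files changed after a given time. It
assumes that the file modification time is a good enough signal
and it ignores deleted or renamed files, which real backup
programs also have to track. The name incremental_backup and its
arguments are hypothetical.

```python
import os
import shutil
import tempfile
from pathlib import Path


def incremental_backup(source: Path, target: Path, last_backup_time: float) -> list[Path]:
    """Copy files modified after last_backup_time (seconds since the epoch).

    Returns the relative paths of the files that were copied.
    """
    copied = []
    for path in sorted(source.rglob("*")):
        if path.is_file() and path.stat().st_mtime > last_backup_time:
            relative = path.relative_to(source)
            destination = target / relative
            destination.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(path, destination)  # copy2 also keeps the timestamps
            copied.append(relative)
    return copied


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        work = Path(tmp) / "work"
        work.mkdir()
        (work / "old.txt").write_text("finished long ago")
        (work / "new.txt").write_text("edited recently")
        # Fake the modification times so the example is predictable.
        os.utime(work / "old.txt", (1000, 1000))
        os.utime(work / "new.txt", (2000, 2000))
        print(incremental_backup(work, Path(tmp) / "incremental", 1500.0))
```

Only new.txt is newer than the assumed last backup time of 1500,
so it is the only file copied.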
Compressing the backup data is a popular option. This practice
lowers the amount of space needed on the backup media. Although
compression adds a layer of complexity, it can be a
good (if relied on wisely) and sometimes necessary solution.
ESSENTIAL BACKUP STRATEGIES
Regardless of the backup type and data, the following backup
strategies should always be followed:
- backup should be taken on a regular basis
- backup should be automatic and need as little human
supervision as possible
- backup should be stored in a safe remote location
- backup should rely on well established hardware and software
technologies
Backup should be taken on a regular basis. The more frequently
the data changes, the more often it should be backed up. For
example, some of my most frequently updated files (website
files, source code, notes, etc.) are backed up daily. Files that
are less frequently updated are backed up monthly.
Backup should be automatic. Except for the initial configuration
of the backup program and the occasional supervision, the whole
backup process should be automatic and completely transparent.
That is, the backup should run by itself without calling for
attention unless necessary.
Backup should be stored in a safe remote location. Should the
location of the important data get damaged, destroyed, or
exposed to theft - a remotely stored backup becomes invaluable.
How remote? Disasters like fire, flood, tornado, earthquake,
etc., can cause widespread damage. Ideally a backup should be
stored in a location that is far enough away and at minimal risk.
Backup should rely on well established hardware and software
technologies. Such technologies are typically in widespread use
and thus cheaper, easier to troubleshoot, and easier to get help
with in the event of failure. As established technologies are
gradually replaced by new and better ones, so should the backup
media and hardware and, if used, the software to store and
restore the data. There
is no guarantee that the common backup media of today, like CD
or DVD, will be usable in ten years. The same is true for
software. A good data preservation strategy should include
continual migration of the backup data to mature and well
established technologies of the time.
A BIT ABOUT DATA COMPRESSION
Compression makes data smaller and thus is a popular backup
option. Its main advantage is lower backup cost due to lower
space use. The downside is the time needed to compress the data
and later to uncompress it for restoration.
Many compression formats exist. Each format uses some sort of
compression method called an algorithm. There are two types of
data compression algorithms: "lossy" and "lossless". Lossless
compression reduces the data size without modifying its content.
Lossy compression modifies the data content to make it even
smaller than lossless compression can.
Some compression formats, like MP3 or JPG, are highly
specialized. They use lossy algorithms and produce very small
file sizes but can only compress a particular type of data.
Other formats, like ZIP or BZIP2, are general purpose. They
rely on lossless compression algorithms and can work on any
data. However, on images or music they will never produce files
as small as special purpose formats like MP3 or JPG. PNG and
TIFF are popular image file formats which support lossless
compression.
Unfortunately, due to the nature of lossy compression, JPG, MP3
or any other lossy format degrades the original data to some
extent. In other words, saving an image or music in a lossy file
format will make it different from the original. Usually the
differences, called compression artifacts, are so small that
most of us don't see or hear them.
For the above reasons, lossy compression should never be used
when saving important data. Only lossless compression is
suitable for that. PNG and TIFF are examples of image file
formats that support lossless compression. Such formats are
ideal for storing high-resolution master images.
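The difference between the two kinds of compression can be shown
with a small Python sketch. The lossless part uses the standard
zlib module. The "lossy" part is only a toy: it throws away the
low bits of each value, which is an assumption made for
illustration and not how JPG or MP3 actually work.

```python
import zlib

# Lossless: the restored bytes are identical to the original.
original = b"brush stroke, " * 1000
compressed = zlib.compress(original)
restored = zlib.decompress(compressed)
print("lossless copy identical:", restored == original)
print("size before and after:", len(original), len(compressed))

# Toy lossy step: drop the low 4 bits of every sample value.
samples = bytes(range(256))  # stand-in for pixel or audio values
quantized = bytes(value & 0xF0 for value in samples)
largest_error = max(abs(a - b) for a, b in zip(samples, quantized))
print("lossy copy identical:", quantized == samples)
print("largest error per sample:", largest_error)

# The degraded data compresses better, but the detail is gone for good.
print("compressed sizes:", len(zlib.compress(samples)), len(zlib.compress(quantized)))
```

The lossless copy comes back bit for bit. The toy lossy copy is
smaller once compressed, but nothing can bring back the values
that were thrown away.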
Finally, compression takes time and normally uses all available
processing power. Generally, the better the compression the
slower it is. Some compression algorithms are extremely good at
compressing but also extremely slow. For backup purposes, one
should evaluate common compression formats and settle on the
most suitable one.
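One way to do such an evaluation is to time a few compressors on
a sample of your own data. The sketch below uses three modules
from the Python standard library: zlib (DEFLATE, the usual
method inside ZIP files), bz2 (the bzip2 algorithm) and lzma
(the algorithm family behind the 7z format). The function name
compare_codecs and the sample data are made up, and the numbers
will differ from machine to machine and from file to file.

```python
import bz2
import lzma
import time
import zlib


def compare_codecs(data: bytes) -> dict[str, tuple[int, float]]:
    """Return {codec name: (compressed size in bytes, seconds taken)}."""
    codecs = {
        "zlib": lambda d: zlib.compress(d, 9),  # 9 = strongest zlib level
        "bz2": lambda d: bz2.compress(d, 9),    # 9 = strongest bzip2 level
        "lzma": lambda d: lzma.compress(d),     # default preset
    }
    results = {}
    for name, compress in codecs.items():
        start = time.perf_counter()
        size = len(compress(data))
        results[name] = (size, time.perf_counter() - start)
    return results


if __name__ == "__main__":
    # Hypothetical text-like sample. Replace it with bytes read from real files.
    sample = b"".join(
        f"layer {i}: opacity {i % 100}\n".encode() for i in range(20000)
    )
    print("original size:", len(sample))
    for name, (size, seconds) in compare_codecs(sample).items():
        print(f"{name}: {size} bytes in {seconds:.3f} s")
```

Run it on a representative file of each kind you plan to back
up before deciding.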
CONSIDER YOUR NEEDS
Some additional issues need to be considered when designing the
most suitable backup strategy for your own use:
- the type of backup files
- if compression is desired, what compression to use and how
- backup storage media
As noted earlier the best backups are simply copies of important
data. Such an approach works especially well for artists who rely
on compressed image formats like PNG or TIFF.
Note the difference between "built-in" image compression, done
every time you save an image in a format that supports it, and
compressing the backup data - applied to all backup data
regardless of what it is.
Which backup compression to use, and whether to use it at all,
depends on the type of backup data. Generally, text files (TXT,
HTML, XML, etc.) can be compressed the most of all file types.
Images that have already been compressed with their own
algorithms (PNG, JPG, TIFF, etc.) can't be compressed much
further, if at all. Images which don't have their own
compression (BMP, TGA, etc.) can often be compressed quite a
bit, though this depends on the actual image data.
Thus if most of your important art data consists of images that
are already compressed, there is no need to compress the backup.
Text files, on the other hand, can be compressed a lot and save
a significant amount of space.
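A quick experiment makes the point. The sketch below compresses
a block of text-like data and a block of random bytes of the
same size. The random bytes are only a stand-in for an already
compressed image, on the assumption that well compressed data
looks close to random. The helper name compression_ratio is made
up for the example.

```python
import os
import zlib


def compression_ratio(data: bytes) -> float:
    """Compressed size divided by original size. Lower means more space saved."""
    return len(zlib.compress(data, 9)) / len(data)


# Text-like data with a lot of repetition, similar to HTML or XML.
text_like = b"".join(
    f"<p>Sketch {i}: notes on colour and light.</p>\n".encode() for i in range(5000)
)
# Stand-in for an already compressed file (assumption: it behaves like random data).
already_compressed_like = os.urandom(len(text_like))

print(f"text-like data:          {compression_ratio(text_like):.2f}")
print(f"already compressed data: {compression_ratio(already_compressed_like):.2f}")
```

The text-like block shrinks to a fraction of its size, while the
random block stays at about its original size, so compressing it
again only costs time.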
There are a few other things to consider when compressing backup
data: which compression program to use and how to compress the
files.
ZIP is the most commonly used compression format today - it's
fast and compresses well. It's been around for a long time and
is universally available. But there are other, lesser-known, good
alternatives. For example, 7ZIP, RAR, and BZIP2 typically
compress better than ZIP, though they are usually somewhat
slower.
Finally, how to compress backups. Basically one can either
create a compressed archive of many files, or compress each file
individually. The main disadvantage of creating a compressed
archive is the possibility of losing all files in the archive
if the archive gets corrupted and cannot be recovered. On the
other hand, if files are compressed individually, one loses only
one file should it get corrupted and be unrecoverable.
Additionally, since a compressed file uses less space than an
uncompressed one, it's less likely to get corrupted. Thus it's
safer to compress files individually.
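Compressing each file on its own could look roughly like the
sketch below, which uses the bz2 module from the Python standard
library. The function names and folder layout are hypothetical.
It assumes the target folder lies outside the source folder and
that each file fits in memory; very large files would be better
processed in chunks.

```python
import bz2
import tempfile
from pathlib import Path


def compress_individually(source: Path, target: Path) -> list[Path]:
    """Write one .bz2 file per source file, keeping the folder layout."""
    written = []
    for path in sorted(source.rglob("*")):
        if path.is_file():
            relative = path.relative_to(source)
            out = target / relative.parent / (relative.name + ".bz2")
            out.parent.mkdir(parents=True, exist_ok=True)
            out.write_bytes(bz2.compress(path.read_bytes(), 9))  # 9 = maximum
            written.append(out)
    return written


def restore_one(compressed_file: Path) -> bytes:
    """Return the original content of a single compressed file."""
    return bz2.decompress(compressed_file.read_bytes())


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        notes = Path(tmp) / "notes"
        notes.mkdir()
        (notes / "ideas.txt").write_text("palette ideas\n" * 200)
        for out in compress_individually(notes, Path(tmp) / "backup"):
            print(out.name, out.stat().st_size, "bytes")
```

Because every file is a separate .bz2 file, damage to one of
them leaves the others readable.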
WHICH BACKUP MEDIA TO USE
The commonly used backup media today are hard drives, tapes and
CDs/DVDs. Hard drives are the fastest and often the best option
for large amounts of data. They are also the most expensive and
not very durable. Tapes are slow but can store a lot of data and
can last decades. CDs/DVDs are probably the most common backup
media used today due to their very low cost. Unfortunately, just
like hard drives, most have a relatively short expected life
span of two to five years. Internet backup solutions are
also becoming a popular backup option.
Reliability is important to consider when choosing the backup
media. How robust is the media and for how long can it retain
the data? The quality of the media plays a significant role
here. All media degrade over time, but some degrade faster than
others. Most low cost burnable CDs have a life span of
around two years. Higher quality CDs can last up to five. Very
high quality CDs with a gold layer are expected to last decades.
Generally, if the handling and storage conditions are good,
quality media should last at least a few years without data loss.
However, unless the best quality media is used, an annual full
backup is probably the safest prevention against data loss due
to media degradation.
A combination of different media may often be the ideal
solution. For example, some of my own backup practices include
using an external hard drive to mirror (update) certain parts of
my computer hard drives. Twice a year I burn all important data
on several DVDs.
I recommend spending some time investigating the most suitable
media and the hardware to operate it. High quality products will
minimize the possibility of backup failure.
THE NECESSITY OF VERIFYING BACKUPS
The most important aspect of taking backups is making sure they
are error free. The backup data may prove useless if corrupted
by a media fault or some other error. It's good practice to test
the backup for validity immediately after taking it. That way
errors are detected and a new backup can be taken right away.
Any respectable backup
program provides an option for data verification. What good is a
backup if its data is corrupted?
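If your backup is a plain copy of files, a basic check can be
sketched with checksums: compute a SHA-256 digest of every
original file and of its copy, and report any that are missing
or differ. The function names below are made up for the example,
and a dedicated backup program may verify its data in a
different way.

```python
import hashlib
import tempfile
from pathlib import Path


def file_digest(path: Path) -> str:
    """SHA-256 digest of a file, read in 1 MiB chunks to limit memory use."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_backup(source: Path, backup: Path) -> list[Path]:
    """Return relative paths that are missing from the backup or differ."""
    problems = []
    for path in sorted(source.rglob("*")):
        if path.is_file():
            relative = path.relative_to(source)
            copy = backup / relative
            if not copy.is_file() or file_digest(copy) != file_digest(path):
                problems.append(relative)
    return problems


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        work, copy = Path(tmp) / "work", Path(tmp) / "copy"
        work.mkdir()
        copy.mkdir()
        (work / "poster.txt").write_text("final version")
        (copy / "poster.txt").write_text("final versio")  # simulated damage
        print(verify_backup(work, copy))
```

An empty list means every file in the source has an identical
copy in the backup.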
A REAL LIFE BACKUP SCENARIO
My most valuable data is my art data, website files, source
code, and various docs. All my high-resolution work is stored in
either PNG or TIFF. Nearly all my reference images are JPGs.
Thus all my image data can be backed up without additional
compression, which saves huge amounts of backup time and space.
I do compress 3D files which don't use their own compression.
For that I use bzip2 with the maximum compression setting. All
the remaining data are basically text files and are compressed
individually using either bzip2 or 7zip. Images and 3D files,
even compressed, can be huge. Not surprisingly, over 90% of my
backup space is used by art data.
I back up daily, monthly and twice a year. Once a day, the files
which are frequently updated (notes, work in progress images,
source code, website files, email, etc.) are backed up to
another hard drive. This happens during the boot process and
takes a few minutes. Once a month I back up to a CD, which also
includes less frequently updated files. A copy of that CD is
stored in a remote location. Twice a year I take a full backup
and store it on several DVDs at a friend's house. If I work on
something especially important, I store it daily on a CD/DVD or
a USB memory stick. My most critical data is also regularly
encrypted and stored on a very remote internet host. I wrote a
script to run all these backups automatically. With the
exception of CD/DVD storage, no manual work is involved.
As you can see, a custom backup solution can be quite
sophisticated yet simple to carry out. It can involve a
combination of different media and backup procedures to
optimally satisfy one's needs.
FINAL NOTES
Depending on your needs, dedicated backup software may be a
necessary investment. Make sure to research this carefully.
Usually, products from reputable companies that specialize in
certain solutions are best. There are also many good open source
or free software alternatives.
It's best to avoid products which rely on proprietary or closed
solutions. For example, a backup program may store the backup
data in an unknown format only supported by that particular
program. Avoid that. If the company goes out of business
and the backup software breaks, your backup data may be lost
forever. Look for products that rely on well known, mature, and
ideally open technologies. For example, PNG is an open format
for storing image data. What this means is that the
specification, or blueprint, for that format is publicly
available for anyone to use. This increases compatibility and
reduces reliance on any specific vendor or product.
Most artists' important data consists mainly of images and 3D
files. To save space, rely on PNG, TIFF or JPG for bitmap
images. Vector images and 3D files can be compressed
individually if needed. Basic backup software that simply
copies specified files or directories to the backup media may be
all that is needed. It's best to make two sets of the backup
data and store each at a different location: one close to home,
like a friend's place or a bank box, and the other far away.
Setting up a proper backup strategy may initially require a
significant amount of time and cost money. There is a lot to
research and consider. In the end however, a good backup
procedure will prove an exceptionally valuable investment. As
you read this, your screen could go blank due to a hard drive
crash. All your valuable data - years of work, reference images,
documents, photo albums, 3D files, email, etc. - could be lost
forever. Unless you were prepared and took a backup.