Re: [SAGE] Linux Backup Software -- HELP needed in finding a reasonable one



Mike: 
You didn't specify your data size or your budget. 
That impacts what you can run.

We have a dual backup standard, i.e., we use either Veritas NetBackup or amanda.

Amanda shines in a networked backup environment where multiple hosts can back up 
in parallel to one or more large holding disks. Backups are written to the 
holding disk, and at the same time the data already on disk is streamed to tape. 
The multi-host backup and asynchronous streaming are what really improve 
performance, i.e., reduce the backup window.
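
To make the data flow concrete, here is a toy model in Python. It is only a 
sketch of the idea (several dumpers filling a holding area while one taper 
drains it), not how amanda is implemented. The names run_backup, dumper and 
taper are made up for the illustration, and a queue stands in for the holding 
disk.

```python
import queue
import threading


def run_backup(hosts, dumpers=3):
    """Toy model: several dumpers fill a holding area while one taper drains it."""
    todo = queue.Queue()
    holding = queue.Queue()  # stands in for the holding disk (assumption)
    taped = []               # order in which images reach the "tape"

    for host in hosts:
        todo.put(host)

    def dumper():
        # Each dumper keeps taking hosts until none are left.
        while True:
            try:
                host = todo.get_nowait()
            except queue.Empty:
                return
            holding.put(f"{host}.dump")  # pretend the dump image is complete

    def taper():
        # The single taper writes images as soon as they appear.
        while True:
            image = holding.get()
            if image is None:  # sentinel: no more images are coming
                return
            taped.append(image)

    tape_thread = threading.Thread(target=taper)
    tape_thread.start()
    workers = [threading.Thread(target=dumper) for _ in range(dumpers)]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    holding.put(None)
    tape_thread.join()
    return taped


if __name__ == "__main__":
    print(run_backup(["ultra10", "blade2000", "linuxraid"]))
```

The order in which images reach the tape depends on thread timing, so only the 
set of taped images is predictable, not their sequence.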

Here is an example summary from an Amanda run using SDLT, with real-world 
results. The data is primarily hosted on a 2TB IDE RAID box attached to a Linux 
x86 host, but additional data is also backed up from the local drives of 
engineering workstations. The latter are mostly SPARC architectures, ranging 
from Ultra 10 to Blade 2000.

STATISTICS:
                          Total       Full      Daily
                        --------   --------   --------
Estimate Time (hrs:min)    0:39
Run Time (hrs:min)         8:50
Dump Time (hrs:min)       18:50      18:12       0:38
Output Size (meg)      148693.6   147454.5     1239.1
Original Size (meg)    325030.9   321042.6     3988.3
Avg Compressed Size (%)    45.7       45.9       31.1   (level:#disks ...)
Filesystems Dumped           75         43         32   (1:31 2:1)
Avg Dump Rate (k/s)      2245.4     2305.0      550.8

Tape Time (hrs:min)        3:34       3:31       0:03
Tape Size (meg)        129788.2   128549.0     1239.2
Tape Used (%)              81.5       80.7        0.8   (level:#disks ...)
Filesystems Taped            74         42         32   (1:31 2:1)
Avg Tp Write Rate (k/s) 10333.7    10390.1     6611.4

No special tuning was done for the above. 
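
If you want to sanity-check the numbers, the short Python sketch below 
recomputes the compression percentages from the Output Size and Original Size 
rows, and compares Dump Time with Run Time. It assumes Dump Time is the sum 
over all dumps, which is my reading of why 18:50 of dumping fits into an 8:50 
run. The function names are made up for the illustration.

```python
def hhmm_to_seconds(text):
    """Convert an 'hrs:min' string from the report into seconds."""
    hours, minutes = text.split(":")
    return int(hours) * 3600 + int(minutes) * 60


def compressed_percent(output_meg, original_meg):
    """Compressed size as a percentage of the original size."""
    return 100.0 * output_meg / original_meg


def overlap_factor(dump_time, run_time):
    """How many hours of dumping were packed into each hour of wall clock."""
    return hhmm_to_seconds(dump_time) / hhmm_to_seconds(run_time)


total = compressed_percent(148693.6, 325030.9)
full = compressed_percent(147454.5, 321042.6)
daily = compressed_percent(1239.1, 3988.3)
overlap = overlap_factor("18:50", "8:50")

print(f"compressed: total {total:.1f}%, full {full:.1f}%, daily {daily:.1f}%")
print(f"dump time / run time: {overlap:.2f}")
```

The percentages come out at 45.7, 45.9 and 31.1, matching the report, and the 
dump time is a bit more than twice the run time.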

I disagree with the posters who claimed that amanda is hard to set up. On a 
Linux box, it comes installed via rpms, no compilation needed. IMHO, it's very 
easy to add systems to back up, and to restore data (especially using 
"amrecover"). 

The main Achilles' heel of amanda is that if you're using a file system backup 
method (e.g., dump), the resulting compressed file must fit onto a single 
backup tape. You can use a directory-based method (e.g., tar) to cut the data 
down to acceptably sized pieces. Practically, this means that a run can span 
multiple tapes, but the largest backup chunk, whether generated by tar or 
dump, is limited by single-tape capacity. 

From my interaction with folks at LISA, I believe the majority of amanda admins 
do not use multiple drives. It's not a big issue, it's just an empirical 
finding. People looking for inexpensive backup software tend to have less 
expensive backup hardware. Amanda is configurable to use all your drives at 
once, but that's where I believe you'll have to dedicate time to learning how 
it works with a multiple-drive library.

Long ago, before Curtis Preston published an O'Reilly book about backups, he 
gave a backup BoF at LISA (1997?). I was at that BoF. At one point Curtis was 
surprised at how many people kept asking about amanda, so he asked how many in 
the audience were using it. More than half the hands went up. I think that was a 
factor in why he dedicated a piece of the book to amanda.

Amanda does not have a GUI. If this is a must, then fuggedaboutit.

You said that you are using DLT. Assuming 40/80 capacity media, the limitation 
is that each backup chunk, i.e., each tar or dump piece, can hold at most 80GB 
of original data, assuming compression to 50% or better (i.e., smaller). I use 
gzip software compression and I do achieve average compression better than 50% 
(it's 45.7% in the example data above).
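
The arithmetic behind that limit can be sketched in a few lines of Python. 
This assumes software compression, so the tape only ever holds its native 
40GB, and it assumes every chunk compresses by the same fraction, which real 
data will not do. The function names and the sample sizes are made up for the 
illustration.

```python
def max_original_gb(tape_native_gb, compressed_fraction):
    """Largest uncompressed chunk whose compressed image fits on one tape."""
    return tape_native_gb / compressed_fraction


def oversized(entries_gb, tape_native_gb, compressed_fraction):
    """Names of chunks that would not fit on a single tape after compression."""
    limit = max_original_gb(tape_native_gb, compressed_fraction)
    return [name for name, size in entries_gb.items() if size > limit]


# Hypothetical chunk sizes in GB of original data.
chunks = {"/export/home": 120, "/usr/local": 10, "/export/proj": 75}

print(max_original_gb(40, 0.5))    # limit at 50% compression
print(max_original_gb(40, 0.457))  # limit at the 45.7% seen in the example run
print(oversized(chunks, 40, 0.5))  # candidates for splitting with tar
```

Anything this flags is a candidate for splitting into smaller directory-based 
pieces with tar.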

One nice feature of amanda is that you give it a set of rules (e.g., a full 
backup must occur at least once every 5 days) and it picks, independently for 
each backup element, the run in which that element gets its full backup. No 
more dedicating a run exclusively to full backups, unless you want to do that. 
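
To illustrate the effect, here is a toy Python sketch that spreads full 
backups across the days of a cycle so that no single run carries all of them. 
It is emphatically not amanda's planner, which is more sophisticated; it is 
just a greedy largest-first assignment, and spread_fulls and the sample sizes 
are made up for the illustration.

```python
def spread_fulls(sizes_gb, cycle_days):
    """Greedy sketch: give each element's full backup to the lightest day."""
    days = [[] for _ in range(cycle_days)]
    load = [0.0] * cycle_days
    # Place the largest elements first; break size ties by name.
    for name, size in sorted(sizes_gb.items(), key=lambda kv: (-kv[1], kv[0])):
        lightest = load.index(min(load))
        days[lightest].append(name)
        load[lightest] += size
    return days, load


# Hypothetical backup elements with their sizes in GB.
elements = {"raid:/data": 50, "ws1:/home": 30, "ws2:/home": 20, "ws3:/usr": 10}
days, load = spread_fulls(elements, 2)
for number, (names, total) in enumerate(zip(days, load), start=1):
    print(f"day {number}: {names} ({total:.0f} GB of fulls)")
```

On the other days of the cycle each element would get an incremental, which is 
why the Daily column in the report above is so small compared to the Full one.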

I can also tell you that it's nice to put a message out on the amanda list 
asking a question, and get back an answer from the developers within two 
hours. 

Good luck, tell us what you decide to do.


Mario Obejas
Engineering Automation & Computing
Raytheon
310-334-7201 (Voice)
310-366-4867 (Pager)