I just got my first iPhone/iPod touch application approved for the App Store today!
It is called “NIOSH Chemical Hazards” and is based on the NIOSH Pocket Guide to Chemical Hazards. It should be available in the App Store at midnight!
I had never developed any kind of application in the Apple world, and it was quite a shock. I’ve been developing in the MS world for over 10 years, so I didn’t expect it to be too difficult. It was VERY difficult. I did not find Objective-C or Interface Builder to be at all intuitive, although I did like Xcode. I certainly won’t say Apple has done a poor job with the tools, but it isn’t something that would drive me away from Visual Studio! I should probably write a blog post about my experience some time.
I blogged the other day about having set up a Xubuntu/Ubuntu machine to back up my FreeNAS-based NAS server. Probably my favorite new tool is RSnapshot, a script that uses rsync to create snapshot-based backups. With snapshot backups you first make a full backup of a data set. Each later backup starts from a copy of the previous one that is built entirely from hard links to the same files, and only the files that changed are replaced. A hard link is just an additional directory entry (name) for data that is already on disk, so the copy takes almost no extra space. If you haven’t worked with hard links this can be a bit difficult to follow, but the effect is very cool. I’m not sure I can explain it properly, but I’ll try.
First you do a full backup. Let’s assume we are only doing daily backups and we want 7 days available at all times. Once you have configured rsnapshot.conf with your sources and destinations, you kick it off with the command ‘rsnapshot daily’. The first backup will be ‘daily.0’. This backup is the one that takes hours, because every file is physically copied from your source to the backup location. Let’s say we start with 100gig of data; this initial backup might take several hours, depending on whether it runs over a local or a network connection.
Now the next ‘rsnapshot daily’ is executed to back up the same data. When it runs, it first makes a hard link copy of the entire ‘daily.0’ backup to ‘daily.1’. Here is the command that it executes on my machine:
/bin/cp -al /mnt/Backup1/daily.0 /mnt/Backup1/daily.1
Because it is a hard link copy, no file data is actually copied, but if you browse both directory trees you will find identical files in both locations. The truth is that they are literally the SAME files. Both directory trees point to the same disk locations, so almost no additional disk space is used. You have still only used ~100gig of disk space. Next the script executes rsync to do the next backup of the source data. This is where it gets cool. rsync runs against ‘daily.0’, whose files are still hard linked into ‘daily.1’, and it only copies changed files and removes deleted ones. When rsync deletes a file in ‘daily.0’, or replaces a changed file with a new copy, it only breaks the hard link, so the file still exists in its original form in ‘daily.1’. New files are added as well, but there are no existing links to break for those.
The effect of this is that only new or changed files are actually copied and occupy additional disk space. Deleted files are removed from ‘daily.0’ but still exist in the previous backups, so no disk space is freed immediately. If you had 100meg of new files and 100meg of deleted files, your backup space would still grow by about 100meg.
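The hard link behavior described above can be tried on a small scale. This is only a sketch: the file name `report.txt` and the helper `demo_hard_link_snapshot` are made up for illustration, and the "replace" step imitates rsync's default of writing a new file and renaming it over the old one rather than calling rsync itself.

```python
import os
import tempfile


def demo_hard_link_snapshot():
    """Mimic 'cp -al' followed by an rsync-style replacement of one file."""
    with tempfile.TemporaryDirectory() as root:
        daily0 = os.path.join(root, "daily.0")
        daily1 = os.path.join(root, "daily.1")
        os.mkdir(daily0)
        os.mkdir(daily1)

        # The "full backup": one real file in daily.0.
        current = os.path.join(daily0, "report.txt")
        with open(current, "w") as f:
            f.write("version 1")

        # What 'cp -al' does for each file: add a second name for the same data.
        previous = os.path.join(daily1, "report.txt")
        os.link(current, previous)
        same_inode = os.stat(current).st_ino == os.stat(previous).st_ino
        link_count = os.stat(current).st_nlink

        # The file changed at the source: write a new copy and rename it into
        # place. Only the daily.0 name moves to the new data.
        replacement = current + ".tmp"
        with open(replacement, "w") as f:
            f.write("version 2")
        os.replace(replacement, current)

        with open(current) as f:
            newest = f.read()
        with open(previous) as f:
            older = f.read()
        return same_inode, link_count, newest, older


same_inode, link_count, newest, older = demo_hard_link_snapshot()
print(same_inode, link_count, newest, older)
```

Before the replacement both names share one inode with a link count of 2. Afterwards the name under daily.0 reads "version 2" while the name under daily.1 still reads "version 1".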
So the next day we execute ‘rsnapshot daily’ again, and the process repeats in roughly the same manner each day until you reach the configured number of snapshots to retain. Let’s say 7. Once seven snapshots exist, each new run first deletes the oldest one, ‘daily.6’. Only when a full snapshot is deleted do you have the potential to decrease disk usage. As an example, say you back up a 100meg file on day one and delete it before the next day’s backup. That 100meg file will continue to exist until 7 days later, when the last snapshot containing a reference to it gets deleted.
To quantify some of the benefits of this strategy, let’s look at a log file and run a disk usage command.
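The rotation can be modeled in a few lines to check the "7 days later" claim. This is a simplified model, not rsnapshot's own code: each snapshot is represented only by the day whose data it holds, and the function names are hypothetical.

```python
def run_backup(snapshots, day, retain=7):
    """Return the snapshot list after one 'rsnapshot daily' run.

    snapshots[i] is the day whose data 'daily.i' holds (a simplified model).
    """
    if len(snapshots) == retain:
        snapshots = snapshots[:-1]  # the oldest snapshot is deleted
    return [day] + snapshots        # the rest shift up; today becomes daily.0


def first_run_without(day_backed_up, retain=7):
    """First run after which no snapshot from 'day_backed_up' remains."""
    snapshots = []
    day = 0
    while True:
        day += 1
        snapshots = run_backup(snapshots, day, retain)
        if day > day_backed_up and day_backed_up not in snapshots:
            return day


print(first_run_without(1))
```

With 7 snapshots retained, a file that only ever appeared in the day 1 backup is last referenced by ‘daily.6’ after the 7th run and is gone after the 8th run, which is when its disk space can actually be released.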
Here is the log from executing my 4th rsnapshot backup:
[07/Jun/2009:03:30:02] /usr/bin/rsnapshot daily: started
[07/Jun/2009:03:30:02] echo 11794 > /var/run/rsnapshot.pid
[07/Jun/2009:03:30:10] mv /mnt/Backup1/daily.2/ /mnt/Backup1/daily.3/
[07/Jun/2009:03:30:10] mv /mnt/Backup1/daily.1/ /mnt/Backup1/daily.2/
[07/Jun/2009:03:30:10] /bin/cp -al /mnt/Backup1/daily.0 /mnt/Backup1/daily.1
[07/Jun/2009:03:30:43] /usr/bin/rsync -av --delete --numeric-ids --relative --delete-excluded /mnt/FreeNAS /mnt/Backup1/daily.0/FreeNAS/
[07/Jun/2009:03:34:18] touch /mnt/Backup1/daily.0/
[07/Jun/2009:03:34:18] rm -f /var/run/rsnapshot.pid
[07/Jun/2009:03:34:18] /usr/bin/logger -i -p user.info -t rsnapshot /usr/bin/rsnapshot daily: completed successfully
You can see that the entire backup executed in only 4 minutes and 16 seconds, even though daily.0/FreeNAS holds what looks like a full backup. Browsing daily.1/FreeNAS shows what looks like a full backup from the previous day. If you wanted to recover a file that you had deleted, you could browse to the appropriate daily.X directory and copy it back. It is that easy.
Now, if you execute the command ‘rsnapshot du’, it will list all your backups and how much actual disk space each one uses. Because du counts a hard-linked file only once, at the first directory where it sees it, daily.0 will always appear to be the one that contains the full backup, and the rest of the directories will show only the differences. Here is the output from that command on my system:
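The run time can be checked directly from the first and last timestamps in the log. A small sketch, assuming the bracketed timestamp format shown above (the `elapsed` helper is just for illustration):

```python
from datetime import datetime

# Format of the bracketed timestamps in the rsnapshot log above.
LOG_TIME_FORMAT = "[%d/%b/%Y:%H:%M:%S]"


def elapsed(start_stamp, end_stamp):
    """Return the time between two log timestamps as a timedelta."""
    start = datetime.strptime(start_stamp, LOG_TIME_FORMAT)
    end = datetime.strptime(end_stamp, LOG_TIME_FORMAT)
    return end - start


run_time = elapsed("[07/Jun/2009:03:30:02]", "[07/Jun/2009:03:34:18]")
minutes, seconds = divmod(int(run_time.total_seconds()), 60)
print(f"{minutes} min {seconds} s")
```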
rsnapshot du
require Lchown
Lchown module loaded successfully
/usr/bin/du -csh /mnt/Backup1/daily.0/ /mnt/Backup1/daily.1/ \
/mnt/Backup1/daily.2/ /mnt/Backup1/daily.3/ /mnt/Backup1/daily.4/ \
/mnt/Backup1/daily.5/ /mnt/Backup1/daily.6/
47G /mnt/Backup1/daily.0/
117M /mnt/Backup1/daily.1/
143M /mnt/Backup1/daily.2/
146M /mnt/Backup1/daily.3/
97M /mnt/Backup1/daily.4/
321M /mnt/Backup1/daily.5/
140M /mnt/Backup1/daily.6/
48G total
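To put a number on it, the six older snapshots can be added up from the output above. This sketch assumes du's human-readable units are powers of 1024, and the sizes are already rounded by du, so the result is approximate.

```python
# Sizes copied from the 'rsnapshot du' output above.
DU_OUTPUT = """\
47G /mnt/Backup1/daily.0/
117M /mnt/Backup1/daily.1/
143M /mnt/Backup1/daily.2/
146M /mnt/Backup1/daily.3/
97M /mnt/Backup1/daily.4/
321M /mnt/Backup1/daily.5/
140M /mnt/Backup1/daily.6/
"""

# Multipliers to convert du's suffixes to megabytes (assumed 1024-based).
MB_PER_UNIT = {"M": 1, "G": 1024}


def parse_du(text):
    """Map each snapshot path to its size in megabytes."""
    sizes = {}
    for line in text.splitlines():
        size, path = line.split(maxsplit=1)
        sizes[path] = float(size[:-1]) * MB_PER_UNIT[size[-1]]
    return sizes


sizes = parse_du(DU_OUTPUT)
incremental_mb = sum(mb for path, mb in sizes.items()
                     if not path.endswith("daily.0/"))
total_gb = sum(sizes.values()) / 1024
print(f"older snapshots: {incremental_mb:.0f}M, total: {total_gb:.2f}G")
```

Six extra days of history cost a bit under 1 gig on top of the 47 gig full copy, which lines up with the 48G total that du reports.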
Very cool. I love this Linux stuff; there is so much to learn and explore. I won’t leave my Windows desktop, but Linux is certainly a powerful tool for those willing and able to learn.
I didn’t even get into it, but rsnapshot can also keep hourly, weekly, and monthly snapshots in addition to the daily ones. It can also back up multiple sources in each execution. Weekly and monthly snapshots are not copied from the source again. As I understand it, each one is rotated in from the oldest snapshot of the next smaller interval, so the oldest daily becomes the newest weekly. There are many configuration options. The only drawback for someone like me from the Windows world is that there is no GUI; everything is configured in a text configuration file. This isn’t bad, but it can be intimidating.
I had an interesting experience with Linux filesystem performance today. I had a USB drive that I was using to back up about 45 gig of data. The drive had previously been formatted NTFS, and since Linux recognized it I tried using it as the backup drive. It worked just fine. I know USB isn’t the perfect interface for this kind of test, but it is what I have. Later in the day I decided to reformat the drive as EXT3 and rerun the same backup with no other changes. Here is the result. I was pretty surprised at the difference!
NTFS: 2-3MBps
EXT3: 6-7MBps
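To get a feel for what those rates mean for a 45 gig backup, here is a rough calculation. It assumes "MBps" means megabytes per second and that 1 gig is 1024 megabytes; the `backup_hours` helper is only for illustration.

```python
def backup_hours(size_gb, rate_mb_per_s):
    """Hours needed to copy size_gb at a steady rate (1 GB taken as 1024 MB)."""
    seconds = size_gb * 1024 / rate_mb_per_s
    return seconds / 3600


# Rate ranges taken from the two measurements above.
for label, slow, fast in (("NTFS", 2, 3), ("EXT3", 6, 7)):
    print(f"{label}: {backup_hours(45, fast):.1f} to "
          f"{backup_hours(45, slow):.1f} hours")
```

Under those assumptions the NTFS run works out to roughly 4.3 to 6.4 hours and the EXT3 run to roughly 1.8 to 2.1 hours.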
I’ve had a recent goal of getting a NAS up and running to properly back up all the accumulating data our family generates. Pictures, video, and Windows PC backups are the primary things I want backed up properly. Linux is the obvious solution for my implementation, and I have been trying a FreeNAS (BSD based) install running on some old hardware. I’ve had a lot of struggles with FreeNAS, but in hindsight I think almost all my problems can be traced back to hardware incompatibilities. My FreeNAS server appears to be VERY stable now that I have given up on my external eSATA and USB drives. With the external devices I had lots of kernel panics.
My new goal, now that I have FreeNAS stable, is to try the Ubuntu solution and roll my own NAS. It seems pretty reasonable, and the more I get into it the more excited (yeah, I know, sad) I get about the possibilities of a Linux server. At this point I really like Linux as a server OS, but I wouldn’t consider it for my everyday desktop (commence flame war). Anyway, the pieces that I have been installing include:
- Xubuntu (a minimal GUI distribution of Ubuntu)
- RSync (for backup)
- RSnapshot (for rsync snapshot integration and automation)
- Webmin (very cool web based Linux server management tool)
- SSH (remote command line access)
- GParted (GUI based partitioning and formatting tool)
- NFS (for access to my NFS share on the FreeNAS server)
- SAMBA (to provide Windows file shares)
So what I have done is set up my FreeNAS server with an NFS share of the root of its data drive. I then set up a persistent mount point on the Ubuntu server. Then I configured rsnapshot to do daily, weekly, and monthly snapshot backups of the FreeNAS data. Installation of everything was done through the GUI or the Webmin interface, so it has been pretty easy. The only piece that I’ve had to configure through the command line was rsnapshot, but that was pretty easy and well worth it for what it does.
Why set up a separate server? Why not a second FreeNAS server? No good reason. I could, so I did. I really like the flexibility of the Ubuntu server, but the FreeNAS setup and configuration is hard to beat and pretty feature packed. Both servers choked on the eSATA card that I bought at Fry’s, so I can’t say either one did better with hardware.
The other day I shut down (properly) my FreeNAS box to replace a drive. The next day I noticed that it was rebooting itself frequently. Once in a while is a bad frequency; every 15 minutes is unusable! I removed all my external drives in an attempt to narrow down the problem. Nothing changed. I finally plugged in a monitor and watched. As I expected, it was rebooting as a result of a kernel panic. Unfortunately I’ve had lots of problems like this with FreeNAS (I still want to love it!) and I’ve passed them all off as hardware incompatibilities with BSD. This time, it wasn’t. What I found was that the panic was for ‘ffs_blkfree’. I’d love to copy the whole panic message, but it only displays for about 15 seconds, so I just look for key phrases I can google. After rebooting several more times I noticed that the drive was thrashing quite a bit and that the fsck process was running. Once I saw that, I disabled an option that I remembered setting previously, which was to perform a background fsck on every boot.
After disabling the background fsck the system stabilized, but I was sure I had some kind of data problem so I dropped to a command prompt through an SSH session and manually executed the following command:
freenas:~# fsck -t ufs /dev/ad0s2
** /dev/ad0s2
** Last Mounted on /mnt/Data1
** Phase 1 - Check Blocks and Sizes
1 DUP I=4
UNEXPECTED SOFT UPDATE INCONSISTENCY
INTERNAL ERROR: dups with -p
UNEXPECTED SOFT UPDATE INCONSISTENCY
** Phase 1b - Rescan For More DUPS
1 DUP I=4
UNEXPECTED SOFT UPDATE INCONSISTENCY
** Phase 2 - Check Pathnames
** Phase 3 - Check Connectivity
** Phase 4 - Check Reference Counts
BAD/DUP DIR I=4 OWNER=root MODE=40700
SIZE=2048 MTIME=May 21 20:57 2009
CLEAR? [yn] y
** Phase 5 - Check Cyl groups
FREE BLK COUNT(S) WRONG IN SUPERBLK
SALVAGE? [yn] y
SUMMARY INFORMATION BAD
SALVAGE? [yn] y
ALLOCATED FRAGS 1-8 MARKED FREE
BLK(S) MISSING IN BIT MAPS
SALVAGE? [yn] y
87964 files, 23572718 used, 212474857 free (6921 frags, 26558492 blocks, 0.0% fragmentation)
Better, but let’s try again to make sure:
freenas:~# fsck -t ufs /dev/ad0s2
** /dev/ad0s2
** Last Mounted on /mnt/Data1
** Phase 1 - Check Blocks and Sizes
** Phase 2 - Check Pathnames
** Phase 3 - Check Connectivity
** Phase 4 - Check Reference Counts
LINK COUNT DIR I=2 OWNER=jackmang MODE=40777
SIZE=512 MTIME=Apr 4 17:02 2009 COUNT 9 SHOULD BE 8
ADJUST? [yn] y
UNREF FILE I=5 OWNER=root MODE=100400
SIZE=499132436784 MTIME=May 20 19:27 2009
RECONNECT? [yn] y
NO lost+found DIRECTORY
CREATE? [yn] y
UNREF FILE I=8 OWNER=root MODE=100400
SIZE=499132416000 MTIME=May 20 19:55 2009
RECONNECT? [yn] y
UNREF FILE I=11 OWNER=root MODE=100400
SIZE=0 MTIME=May 20 23:19 2009
RECONNECT? [yn] y
** Phase 5 - Check Cyl groups
FREE BLK COUNT(S) WRONG IN SUPERBLK
SALVAGE? [yn] y
SUMMARY INFORMATION BAD
SALVAGE? [yn] y
87965 files, 23572719 used, 212474856 free (6920 frags, 26558492 blocks, 0.0% fragmentation)
***** FILE SYSTEM WAS MODIFIED *****
And one more time to be sure:
freenas:~# fsck -t ufs /dev/ad0s2
** /dev/ad0s2
** Last Mounted on /mnt/Data1
** Phase 1 - Check Blocks and Sizes
** Phase 2 - Check Pathnames
** Phase 3 - Check Connectivity
** Phase 4 - Check Reference Counts
** Phase 5 - Check Cyl groups
87965 files, 23572719 used, 212474856 free (6920 frags, 26558492 blocks, 0.0% fragmentation)
All fixed! I don’t know why the background fsck was causing a panic, but this worked. It is possible that after disabling the background fsck and rebooting I could have used the GUI fsck command successfully, but hopefully I’ll never know. That was not fun.
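As a quick sanity check on that final summary line, the free count can be compared against the frags and blocks figures in the parentheses. This sketch assumes the UFS default of 8 fragments per block, which I have not confirmed for this particular filesystem; the `parse_summary` helper is only for illustration.

```python
import re

# Final summary line from the last fsck run above.
SUMMARY = ("87965 files, 23572719 used, 212474856 free "
           "(6920 frags, 26558492 blocks, 0.0% fragmentation)")

# Assumption: 8 fragments per block (the usual UFS default).
FRAGS_PER_BLOCK = 8


def parse_summary(line):
    """Pull the integer counts out of an fsck summary line."""
    pattern = r"(\d+) files, (\d+) used, (\d+) free \((\d+) frags, (\d+) blocks"
    names = ("files", "used", "free", "frags", "blocks")
    return dict(zip(names, map(int, re.search(pattern, line).groups())))


counts = parse_summary(SUMMARY)
free_from_parts = counts["frags"] + FRAGS_PER_BLOCK * counts["blocks"]
print(free_from_parts == counts["free"])
```

Under that assumption the loose fragments plus the whole free blocks add up exactly to the reported free count, so the numbers in the clean run are at least internally consistent.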