Neko

Automatically Backing Up Your Google Docs

I love Google.  Google has so many wonderful products and services that really add value to my life and my business.  One of these products is Google Docs.

Google Docs is really neat, but anyone keeping all of their data in “the cloud” should be concerned.  Anything could happen to the cloud… and with a little rain, all of our data could be washed away.

I’ve tried a few solutions to manually back up my Google documents.  But relying on myself to remember to back up my files regularly is not a realistic option.

So after some digging around, I found some tools that will allow you to set up regular backups of your Google Docs files.

What you’ll need:

Python is probably already installed on your Linux distribution.  If not, you will need to install it.  The Google Data API is available in several programming languages; I chose Python for reasons unknown.  gdatacopier is a Python script that uses the Google Data API to talk to Google.

I decided that I would just install this for my user account on my server, and not for the whole server.  We will first download the Google Data API and gdatacopier bundles from their respective websites, and extract them. I’ve not captured the output of the following commands.  I encourage you to review the readmes with each package before starting.

Download two archives with wget:

$ wget http://gdatacopier.googlecode.com/files/gdatacopier-1.0.2.tar.gz

$ wget http://gdata-python-client.googlecode.com/files/gdata.py-1.2.4.tar.gz

Uncompress them:

$ gzip -d gdatacopier-1.0.2.tar.gz

$ gzip -d gdata.py-1.2.4.tar.gz

Un-tar them:

$ tar xf gdatacopier-1.0.2.tar

$ tar xf gdata.py-1.2.4.tar
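The separate gzip and tar steps above can also be collapsed into one.  Here is a minimal sketch, assuming your tar supports the z flag (GNU tar on Linux does).  The archive it builds is a throwaway stand-in with made-up names, so you can run it anywhere without touching the real downloads:

```shell
#!/bin/bash
# Build a small throwaway .tar.gz in a scratch directory (hypothetical names).
workdir=$(mktemp -d)
mkdir "$workdir/demo-1.0"
echo "hello" > "$workdir/demo-1.0/README"
tar czf "$workdir/demo-1.0.tar.gz" -C "$workdir" demo-1.0
rm -r "$workdir/demo-1.0"

# x = extract, z = gunzip on the fly, f = read from the named file.
# -C extracts into the given directory instead of the current one.
tar xzf "$workdir/demo-1.0.tar.gz" -C "$workdir"

cat "$workdir/demo-1.0/README"
```

With the real archives, that would be tar xzf gdatacopier-1.0.2.tar.gz and tar xzf gdata.py-1.2.4.tar.gz, run before the gzip -d step rather than after it.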

Install the Google Data API:

$ cd gdata.py-1.2.4

$ python setup.py install --home=~

$ cd ..

Copy the gdatacopier library to $HOME/lib/python/ and create a symbolic link.

$ cp -r gdatacopier-1.0.2 ~/lib/python/

$ cd ~/lib/python

$ ln -s gdatacopier-1.0.2 gdatacopier

Next, you’ll want to add a line in your .profile or .bash_profile.  You’ll want to edit an existing PYTHONPATH line, or create a new one:

PYTHONPATH=~/lib/python

export PYTHONPATH
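If you already have a PYTHONPATH, you probably want to add to it rather than overwrite it.  The following is only a sketch of one way to do that: add_to_pythonpath is a helper name I made up, and /opt/example/python is a pretend existing value used for illustration.

```shell
#!/bin/bash
# Pretend there is already a value set (hypothetical path, for illustration).
PYTHONPATH=/opt/example/python

# add_to_pythonpath is a hypothetical helper, not part of any package.
add_to_pythonpath() {
    local dir=$1
    case ":${PYTHONPATH}:" in
        *":${dir}:"*)
            # Already listed, so leave PYTHONPATH alone.
            ;;
        *)
            # Prepend the directory and keep whatever was there before.
            PYTHONPATH="${dir}${PYTHONPATH:+:${PYTHONPATH}}"
            ;;
    esac
    export PYTHONPATH
}

add_to_pythonpath "$HOME/lib/python"
add_to_pythonpath "$HOME/lib/python"   # second call changes nothing
echo "$PYTHONPATH"
```

Dropping the function and the single add_to_pythonpath "$HOME/lib/python" call into your profile should have the same effect as the two lines above, without clobbering anything another tool has set.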

Source the profile so the change takes effect in your current shell:

$ . ~/.profile

or

$ . ~/.bash_profile

Finally, create a directory to store your Google Docs.

$ mkdir ~/GoogleDocsBackups

At this point, you should have the Google Data API and the gdatacopier library installed and ready to go.  I created a simple script to actually do the backups.  My script looks like this, and lives in a directory called “bin” in my home directory.

$ cat ~/bin/backup_googledocs
#!/bin/bash

# Source in the profile for good
# measure, since we'll be
# running this from cron.

. "$HOME/.bash_profile"

~/lib/python/gdatacopier/gdoc-cp.py \
--username 'YourUserNameHere@gmail.com' \
--password 'YourPasswordHere' \
--export default \
--google-id all \
--local ~/GoogleDocsBackups/
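Since this script holds your account password in plain text, it is worth making the file readable by its owner only.  Here is a small sketch, run against a scratch file standing in for ~/bin/backup_googledocs so that nothing real is touched:

```shell
#!/bin/bash
# Scratch file standing in for ~/bin/backup_googledocs.
script=$(mktemp)
printf '#!/bin/bash\necho "backup would run here"\n' > "$script"

# 700 = read, write, and execute for the owner; nothing for group or others.
chmod 700 "$script"

# Capture the permission bits as ls reports them.
perms=$(ls -l "$script" | cut -c1-10)
echo "$perms"
```

On the real file, that boils down to chmod 700 ~/bin/backup_googledocs.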

The script above exports ALL of your Google Docs spreadsheets and documents in the default export format, which is the OpenOffice format.  You can read the documentation for the libraries on their websites and learn about other things they can do.  The Google Data API, for example, lets you access all kinds of Google content.

Once you verify that your script works, and backs up your files successfully, you’ll want to schedule this in cron.  Otherwise, we still have a manual process that we’ll probably forget to do.

$ crontab -e

This will open your crontab in your editor.  You will want to add a line like the bottom line here:

SHELL=/bin/bash

# Backup my Google Docs each Day
10 6 * * * ~/bin/backup_googledocs

This will execute your script each day at 6:10am.  I hope this provides some comfort for you.  It does for me.  I no longer have to worry about having a backup copy of my Google Docs.
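If the five leading fields of a crontab line are new to you, this little sketch pulls apart the entry used above and labels each piece.  The variable names are my own labels for illustration, not anything cron itself defines:

```shell
#!/bin/bash
# The crontab entry from above, held in a variable for illustration.
entry='10 6 * * * ~/bin/backup_googledocs'

# read splits on whitespace and does not glob-expand the * fields;
# the last variable receives the rest of the line (the command).
read -r minute hour day_of_month month day_of_week command <<EOF
$entry
EOF

echo "minute=$minute hour=$hour day_of_month=$day_of_month"
echo "month=$month day_of_week=$day_of_week"
echo "command=$command"
```

A * in a field means “every”, so minute 10 of hour 6 on every day of every month gives you 6:10am daily.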

Bash Shell Version 4.0 released

The latest version of the Bash shell, version 4.0, was recently released.  Bash is pretty much the default shell in most Linux distributions, and seems to be preferred by most sysadmins on other flavors of UNIX as well.

I just downloaded and compiled the source on my Ubuntu 8.10 workstation as I didn’t see it in the update stream yet.

The manual is available online as well.

Using command-line Perl to search and replace

One of the things I always seem to be doing is searching for strings in files and changing them.  I’ve gotten used to using sed and vi for this.  But Perl also makes it easy to do from the command line.

Consider the following text file:

$ cat resolv.conf
search theunixzone.com
nameserver 10.1.1.1
nameserver 10.1.1.2
nameserver 10.1.1.3

By doing a command similar to the following, you can easily change strings from the command line.

$ perl -pi -w -e 's/10\.1\.1/10.5.5/g;' resolv.conf

$ cat resolv.conf
search theunixzone.com
nameserver 10.5.5.1
nameserver 10.5.5.2
nameserver 10.5.5.3

Often, the benefit of using Perl to do this is the fact that you can specify multiple filenames, or use a wildcard, and edit the files in place, without having to manage a temporary file yourself.  I use this often for making mass changes to DNS zone files, etc.

Let’s look at the command-line arguments we used.  The “-p” option makes Perl loop over every line of the input and print it after running your code, and the “-i” option edits the file in place; here they are combined as “-pi”.  The “-w” enables many useful warnings.  And, finally, the “-e” says that the next string in quotes is the Perl code you want to execute.  In this case, we are changing all occurrences of the string “10.1.1” to “10.5.5”.  The search pattern is a regular expression, in which an unescaped “.” matches any character, so it is safest to write literal dots as “\.”.  We’ll cover more of regular expressions in the future.

350,000+ Virtual Desktop Installation in Brazil

eWeek writes that Brazil now has the largest single desktop virtualization and Linux deployment: over 350,000 workstations in schools.

Personally, I think getting only 10 virtual workstations to a single server is pretty weak.  Memory is cheap, and I doubt the CPU requirements are going to be all that demanding.  I’d probably be pushing to get at least 20 to 25 per server.  To me, that would seem to “really” be worth it.  Still, 10 to 1 does provide significant savings, both in money and in being green for our environment, so it is worth doing.

Virtualization is very cool.

HP Certification for Ubuntu

According to Canonical, plans are in the works to get Ubuntu fully certified on HP hardware.

I’ve always been a fan of Hewlett-Packard and have used their Intel, PA-RISC, Alpha, and Itanium servers running UNIX and Linux for years.  On the Intel side, this has been primarily SUSE and Red Hat Enterprise Linux due to official vendor support.

Working towards hardware certification with a vendor such as HP will be a great step for Ubuntu.  Often, vendor certification is the main reason why one Linux distribution is chosen over another.

This is great news!

COTD: wc - print newline, word, and byte counts for each file

The wc command is very useful.  It’s a simple little command, but I use it all the time.  The wc command counts the number of bytes or characters, lines, or words in its input.  Some versions can also show the length of the longest line.

Using it is simple.  Pass the output of another command to it, or, give it the name of a file to read.  For example:

$ cat /etc/services | wc -l
562
$

Here we use the cat command on the /etc/services file, and then pipe the output (“|”) to the wc command with the -l option (for lines).  The output?  562 lines.

Now, let’s do the same thing, only with characters.

$ cat /etc/services | wc -c
18449
$

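Here we see that the /etc/services file has 18,449 characters in it.  (Strictly speaking, -c counts bytes; for a plain ASCII file like this one, that is the same as the number of characters.)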

Finally, we will do the same for words.  Maybe a better example would be for a book report, or paper that you need to write that needs to be at least 2000 words:

$ cat ~/homework/book_report_on_unix.txt | wc -w
2544
$

Plenty of words.  The uses for wc are endless, and I will point out its use in examples that I show here moving forward.
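You don’t strictly need cat in front of wc.  Here is a small sketch that redirects a file straight into wc and stores the counts in variables, which is handy in scripts.  The sample file is a scratch file with known contents (3 lines, 6 words, 28 bytes) so the numbers are easy to check:

```shell
#!/bin/bash
# Scratch file with known contents: 3 lines, 6 words, 28 bytes.
sample=$(mktemp)
printf 'one two\nthree four\nfive six\n' > "$sample"

# Redirecting the file into wc prints just the number, with no file name.
lines=$(wc -l < "$sample")
words=$(wc -w < "$sample")
bytes=$(wc -c < "$sample")

# $(( )) strips any padding that some wc implementations add.
echo "lines=$((lines)) words=$((words)) bytes=$((bytes))"
```

Giving wc the file name as an argument (wc -l /etc/services) works too, but then the file name is printed next to the count.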

Questions? Feel free to ask.