So cygwin allows you to port all the great *nix utilities to a Windows environment, and that’s useful because 1) there are so many handy tools and 2) you can write scripts or tasks once and use them everywhere (Unix, Linux, OS X, Windows).

Today, let’s discover how you can get two of the most useful tools (automated tasks and mail notification) installed in less than 5 min.

Since the typical Windows box was not designed as a server, scheduled services/tasks and mail transfer agents are not commonly implemented/installed, but we can quickly fix that.

First, install the exim and cron cygwin packages and then run exim-config. You can accept most of the defaults. The only change that you might want to make is to set up a primary hostname. I use “mail.local”. Be sure to add that to your hosts file (%WINDIR%\system32\drivers\etc\hosts), too:

127.0.0.1 mail.local
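
If you would rather script the hosts entry, the following is a minimal sketch of one way to do it from a cygwin shell that has write access to the file. The helper name add_hosts_entry is made up for this example, and the path in the usage comment assumes Windows is installed under C:\Windows.

```shell
#!/bin/bash
# Append "IP NAME" to a hosts file unless that exact line is already present.
# add_hosts_entry is a hypothetical helper, not part of cygwin or exim.
add_hosts_entry() {
  local hosts_file=$1 ip=$2 name=$3
  # -q: quiet, -x: the whole line must match, -F: fixed string (no regex)
  if ! grep -qxF "$ip $name" "$hosts_file" 2>/dev/null; then
    printf '%s %s\n' "$ip" "$name" >> "$hosts_file"
  fi
}

# Usage (the path is an assumption; adjust it to your %WINDIR%):
#   add_hosts_entry /cygdrive/c/Windows/system32/drivers/etc/hosts 127.0.0.1 mail.local
```

Running it twice leaves a single entry, so it is safe to keep in a setup script.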

Next, since we want mail to be deliverable to users without accounts, we need to disable the check_local_user option under the localuser router in /etc/exim.conf (comment it out or delete the line).
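
The edit is easy to make by hand, but here is a hedged sketch of doing it with sed. It assumes the router begins with a line reading "localuser:" and that its block ends at the next blank line, which is how the stock exim configuration is laid out; check your own file before trusting it. The function name is hypothetical.

```shell
#!/bin/bash
# Comment out check_local_user, but only inside the localuser router.
# Assumption: the block starts at "localuser:" and ends at the next blank line.
disable_check_local_user() {
  local conf=$1
  cp "$conf" "$conf.bak"   # keep a backup of the original
  # The address range limits the substitution to the localuser block, so
  # other routers that also use check_local_user are left alone.
  sed '/^localuser:/,/^[[:space:]]*$/ s/^\([[:space:]]*\)check_local_user/\1# check_local_user/' \
    "$conf.bak" > "$conf"
}

# Usage:
#   disable_check_local_user /etc/exim.conf
```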

Verify the setup with

$ exim -bt test@mail.local

You should see something like …

test@mail.local
   router = localuser, transport = local_delivery

Once cron is running, any output from your cron jobs will be mailed to you and land in the mail spool under /var/spool/mail.

Also note that you can quickly send mail from the command line now without any other tools. Try this:

$ exim -v -odf test@mail.local
This is a test message.

Just end the message with a newline and then type Ctrl+D. Now check /var/spool/mail/test.

Now for cron. Run cron-config accepting the defaults as needed. You can quickly test the setup afterward by editing your crontab.

echo '* * * * * echo hi' | crontab -

In less than a minute (and every minute thereafter until you change it), you should see the output in your mail spool.
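
For anything beyond a one-line test, it can help to write the crontab to a file first and review it before installing. The sketch below assumes cygwin's cron honors the usual MAILTO variable and step values such as */15; the job, the schedule, and the file name are placeholders.

```shell
#!/bin/bash
# Write a small crontab to a file so it can be reviewed before installing.
# write_crontab is a hypothetical helper; the job below is only an example.
write_crontab() {
  local out=$1
  cat > "$out" <<'EOF'
# Send job output to this address (it lands in /var/spool/mail/test)
MAILTO=test@mail.local
# minute hour day-of-month month day-of-week command
*/15 * * * * echo "still alive at $(date)"
EOF
}

# Usage (note that "crontab FILE" replaces your existing crontab):
#   write_crontab ~/my.crontab && crontab ~/my.crontab
```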

And we’re done. Now just set up whatever cron jobs you want on whatever schedule you want. That wasn’t so bad, was it?

I’m on a simplification kick, and I’ve been dramatically reducing the clutter in my life and enjoying the results.

I recently noticed a good deal on a drive at Microcenter and thought “Why keep 5 external drives (1.3 TB total, 5 usb cables, 5 power cords) when a single drive (1.5 TB, 1 usb cable, 1 power cord) will do?”

So began the long process of consolidating, organizing, and erasing years’ worth of files, and if you’re doing the same, I have a few tips to share.

After consolidating everything to one disk, I wanted to find duplicate files that may have proliferated over the years. Fortunately, there’s a little utility called fdupes that does the trick nicely. On a Windows system, you can use it through cygwin.

fdupes -r /cygdrive/VOLUME_HERE

The first pass was preliminary to see how bad it was. I found hundreds of GB of data that I had forgotten about. There were also several .svn directories cluttering up the results. That metadata is useless for a backup, so I removed it with

IFS=$(echo -en "\n\b")
for i in $(find -regex ".*\.svn$"); do rm -rf $i; done

The first line sets IFS (the internal field separator, which the shell uses to split the output of the find) to a newline instead of the default space, tab, and newline. The trailing \b is only there because command substitution strips trailing newlines. Now the loop treats a filename with spaces as a single path.
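
An alternative that avoids touching IFS at all is to let find hand the paths to rm directly. This is a sketch rather than what I ran at the time; remove_svn_dirs is a made-up name.

```shell
#!/bin/bash
# Remove every .svn directory below a root without parsing find's output,
# so paths containing spaces are handled safely.
remove_svn_dirs() {
  local root=$1
  # -prune stops find from descending into a directory it is about to delete;
  # -exec ... {} + passes the matches to rm as separate arguments.
  find "$root" -type d -name .svn -prune -exec rm -rf {} +
}

# Usage:
#   remove_svn_dirs /cygdrive/VOLUME_HERE
```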

Now I could recursively delete all the duplicates without being prompted, keeping the first file in each set (-d deletes, -N suppresses the prompt):

fdupes -rdN /cygdrive/VOLUME_HERE

I also wanted to remove all those Thumbs.db files that Windows loves to create, as well as all the empty files and (now) empty directories:

find . -name Thumbs.db -type f -delete
find . -size 0 -type f -delete
find . -type d -empty -delete

Note: You may want to run these commands without the -delete option first as a sanity check.
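
A dry run along those lines might look like the sketch below, which only prints what the three cleanup commands would match. The function name is hypothetical. One caveat: directories that become empty only after their files are deleted will not show up in the dry run.

```shell
#!/bin/bash
# Print what the cleanup would remove, without deleting anything.
list_cleanup_candidates() {
  local root=$1
  find "$root" -type f -name 'Thumbs.db' -print   # Windows thumbnail caches
  find "$root" -type f -size 0 -print             # empty files
  find "$root" -type d -empty -print              # directories that are already empty
}

# Usage:
#   list_cleanup_candidates /cygdrive/VOLUME_HERE | less
```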

Now prep the old drives for donation …

Considering that you may have very sensitive data (e.g. passwords, bank/credit card statements, etc.) on your old drives, be sure to clean the drives before you donate them.

I simply formatted each drive and then used the built-in Windows cipher utility to wipe it after copying over the contents.

cipher /w:VOLUME_HERE:

This wipes all the empty space on a disk with a 3 pass system – writing all 0’s, writing all 1’s, and then writing random data. In all that’s 4 passes (including the original format).

I could go on to discuss encryption of your data on your new drive and offsite backup, but that’ll have to wait for another day …

The fastest download is no download, so on a recent project I was trying to prevent even the 304s (the Not Modified responses to conditional If-Modified-Since/If-None-Match requests). It had been a while since I had to configure Apache for this, but here was the start of a .htaccess file:


<FilesMatch "\.(ico|pdf|flv|jpg|jpeg|png|gif|js|css|swf)$">
ExpiresActive On
ExpiresDefault "access plus 10 years"
</FilesMatch>

# compress text, html, javascript, css, xml:
AddOutputFilterByType DEFLATE text/plain
AddOutputFilterByType DEFLATE text/html
AddOutputFilterByType DEFLATE text/xml
AddOutputFilterByType DEFLATE text/css
AddOutputFilterByType DEFLATE application/xml
AddOutputFilterByType DEFLATE application/xhtml+xml
AddOutputFilterByType DEFLATE application/rss+xml
AddOutputFilterByType DEFLATE application/javascript
AddOutputFilterByType DEFLATE application/x-javascript
Header unset ETag
FileETag None
Header unset Last-Modified

I set up a versioning system on the backend so that whenever resources are updated, new version numbers are automatically generated to create new request URLs. Thus, I could set far-future Expires headers. However, it didn’t seem to be working. I kept seeing the requests in Fiddler even though the Expires header was set and the Last-Modified header removed. The problem? I was hitting refresh, which was causing the clients (FF & Chrome) to request the resources regardless of the headers. Navigating away and then back generated the expected behavior. :p
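
The backend versioning itself is not shown here, but the idea can be illustrated with a small shell sketch: derive a version token from the file's contents so the URL changes whenever the file does. The function name, the ?v= parameter, and the paths are all assumptions for illustration, not the scheme the project actually used.

```shell
#!/bin/bash
# Build a cache-busting URL whose version token changes with the file contents.
# versioned_url and the "v" query parameter are hypothetical.
versioned_url() {
  local url_path=$1 file=$2 version
  # cksum prints "<crc> <size>"; keep the CRC as the version token.
  version=$(cksum < "$file" | cut -d ' ' -f 1)
  printf '%s?v=%s\n' "$url_path" "$version"
}

# Usage:
#   versioned_url /css/site.css htdocs/css/site.css
```

Because the token only changes when the content changes, unchanged files keep their URL and stay cached under the far-future Expires header.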

wget can be a very handy tool during development. Combine wget with other *nix utilities (like time and cron), and you can write some very useful scripts/snippets. For example,

wget -p -H -nv http://www.my-domain.com/cart/ 2>&1

uses -p/--page-requisites to download all the files necessary to render the page, while -H causes it to span hosts if necessary. If you pipe this to grep (after redirecting wget’s output, which goes to STDERR by default, to STDOUT), you can look for 403/404s or any other undesirable response code. The -nv/--no-verbose option simply prevents the deluge of information that wget spits out by default.
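
The grep step could be sketched as a small filter like this. It assumes wget's -nv log reports failures on lines containing "ERROR" followed by the status code; the function name is made up, and the pattern may need adjusting for your wget version.

```shell
#!/bin/bash
# Keep only the lines of a "wget -nv" log that report a 4xx or 5xx response.
# Assumption: failures appear as "... ERROR <code>: <reason>."
http_errors() {
  grep -E 'ERROR [45][0-9]{2}'
}

# Usage:
#   wget -p -H -nv http://www.my-domain.com/cart/ 2>&1 | http_errors
```

Since grep exits non-zero when nothing matches, the pipeline's exit status can double as a quick pass/fail check in a cron job.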

Here’s part of a shell script, run from my crontab, that I set up to time downloads during the development cycle. It takes the base URL of one of my dev/test servers (passed in $1), requests a list of paths on it, and lets me monitor the performance over time.

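for i in $(
 # list of urls here
 echo '/product/index.jsp?productId=3633213'
 echo '/cart/index.jsp'
 # ....
); do
 # call GNU time by path (usually /usr/bin/time): the bash keyword "time" does not accept -f, -a, or -o
 /usr/bin/time -f "$(date) | %E | $1$i\n" -a -o time.log wget -a wget.log -O /dev/null -p "$1$i"
done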

BOMs away!

11 Mar 2010

So I was having problems backporting some changes to a fork of our platform today. WinMerge kept reporting files as different in the summary view, but displayed no differences in the file comparison view. I looked closer and noticed the character sets were “different”.

A developer on the team had started saving files in the repository with BOM (Byte Order Mark) information. I don’t know if it was intentional or an effect of their IDE. Either way, it had to go, so … perl to the rescue.

find . -name '*.jsp' -exec perl -pi.old -e 'BEGIN { $/ = undef; } s/^\xEF\xBB\xBF//' {} +
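
Before rewriting anything, it can be useful to see which files actually start with a BOM. Here is a minimal sketch that checks the first three bytes of each .jsp file for EF BB BF; both function names are hypothetical.

```shell
#!/bin/bash
# Succeed if the file starts with the UTF-8 byte order mark (EF BB BF).
has_bom() {
  # Dump the first three bytes as hex and strip the spacing od adds.
  [ "$(head -c 3 "$1" | od -An -tx1 | tr -d ' \n')" = "efbbbf" ]
}

# Print every .jsp file under a root that starts with a BOM.
list_bom_files() {
  local root=$1 f
  # -print0 with read -d '' keeps paths containing spaces intact.
  find "$root" -type f -name '*.jsp' -print0 |
    while IFS= read -r -d '' f; do
      if has_bom "$f"; then
        printf '%s\n' "$f"
      fi
    done
}

# Usage:
#   list_bom_files .
```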
