
Checking a Web Page for Updates

elections

I really like http://watchthatpage.com/ for tracking changes to web pages, but it's mostly useful for changes on the order of a day.

Today, due to the DC Digital Vote By Mail pilot project, I find myself wanting to monitor changes on the order of 5 minutes or so. (DC has a limited number of testing credentials that it will issue, and I don't know when that link will go live!)

So, this is a perfect job for a simple shell script.

First, it would be useful to have a generic script that takes a web address (a URI) and compares the current version of the page to the previously saved one. So, I put this in a file called checkpagechange, made it executable and stuck it in /usr/bin:

    #!/bin/bash
    # Fetch the page quietly, show what changed since the last run, then keep the new copy.
    curl -s "$@" > /tmp/new.html
    diff -u /tmp/old.html /tmp/new.html
    mv /tmp/new.html /tmp/old.html

This uses curl to grab the page and save it to /tmp/new.html, then uses diff to compare this version to an old version and then moves the new version to the old version's location.

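Then one can run checkpagechange http://foo.bar and it will print any changes to the screen. Of course, on the first run old.html either doesn't exist, in which case diff complains about a missing file, or is left over from another page, in which case the diff is meaningless.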
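One way around that first-run wrinkle is to keep a separate snapshot per address. What follows is only a rough sketch, not a replacement for the script above: the names CACHE_DIR, snapshot_path, compare_and_store and check_page are made up for illustration, and the snapshot directory under /tmp is an assumption.

```shell
#!/bin/bash
# Sketch of a per-URL variant of checkpagechange (all names are illustrative).

# Directory for snapshots; CACHE_DIR is an assumed, overridable variable.
CACHE_DIR="${CACHE_DIR:-/tmp/checkpagechange}"

# Map a URL to a snapshot file by replacing non-alphanumerics with "_".
snapshot_path() {
    printf '%s/%s.html\n' "$CACHE_DIR" "$(printf '%s' "$1" | tr -c 'A-Za-z0-9' '_')"
}

# Compare a freshly fetched file against the stored snapshot, then store it.
# Prints a unified diff if the page changed, or a note on the first run.
compare_and_store() {
    local url="$1" new="$2" old
    old="$(snapshot_path "$url")"
    mkdir -p "$CACHE_DIR"
    if [ -f "$old" ]; then
        # diff exits non-zero when the files differ; that is not an error here.
        diff -u "$old" "$new" || true
    else
        echo "first snapshot of $url saved"
    fi
    mv "$new" "$old"
}

# Fetch the page into a temporary file and report changes.
check_page() {
    local url="$1" new
    new="$(mktemp)" || return 1
    if curl -s -f "$url" > "$new"; then
        compare_and_store "$url" "$new"
    else
        echo "fetch failed for $url" >&2
        rm -f "$new"
        return 1
    fi
}

# Only fetch when a URL is given on the command line.
if [ "$#" -gt 0 ]; then
    check_page "$1"
fi
```

With this, checking http://foo.bar and then some other page no longer mixes up their snapshots, and a failed download (curl's -f flag makes HTTP errors count as failures) leaves the old snapshot alone.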

To accomplish my goal for today, I can just put this last command in a loop for the URI I'm interested in. That is, I create a small, specific shell script that calls it repeatedly. Save the following to something like checkdc.sh and make it executable:

    #!/bin/bash
    # Check the page every 5 minutes, printing a timestamp before each check.
    while true
    do
        date
        checkpagechange http://www.dcboee.us/DVM/
        sleep 300
    done

This is an infinite loop that first outputs the date and time to the screen, then uses the previous script to output any changes in the web page, and then goes dormant for 5 minutes (300 seconds).
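If the stream of timestamps gets noisy, the loop could instead stay silent until something actually changes. Here is a hedged sketch of that idea: watch_page, CHECK_CMD and the optional check limit are my own illustrative names, and it assumes the checker prints nothing when the page is unchanged.

```shell
#!/bin/bash
# Sketch: poll a page and print a timestamp only when the check prints something.
# CHECK_CMD, watch_page and the check limit are illustrative additions.

# Command used to check the page; defaults to the script from earlier.
CHECK_CMD="${CHECK_CMD:-checkpagechange}"

# watch_page URL [INTERVAL_SECONDS] [MAX_CHECKS]
# A MAX_CHECKS of 0 (the default) means loop forever.
watch_page() {
    local url="$1" interval="${2:-300}" max="${3:-0}" count=0 changes
    while [ "$max" -eq 0 ] || [ "$count" -lt "$max" ]; do
        # Capture both output and errors so that problems are visible too.
        changes="$("$CHECK_CMD" "$url" 2>&1)" || true
        if [ -n "$changes" ]; then
            date
            printf '%s\n' "$changes"
        fi
        count=$((count + 1))
        sleep "$interval"
    done
}

# Only start polling when arguments are given on the command line.
if [ "$#" -gt 0 ]; then
    watch_page "$@"
fi
```

Saved as, say, watchpage.sh, something like `./watchpage.sh http://www.dcboee.us/DVM/ 300` would behave like the loop above, except that an unchanged page produces no output at all.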

You can run this in a terminal window and place it off to the side so that just the first few letters of the date are visible... when that changes, voila!