Some information that might be of use to someone, at discount prices.

March 21, 2005

Photomosaics and a Google Image Grabber

Category: Photo,Software — badsegue @ 1:45 am

background

You’ve probably seen photomosaics before.


Mosaic (scaled way down)


Original image (full size)

These are images that are composed of other images. There are several free/cheap programs out there that can take a given picture and make a mosaic using a set of pictures of your choosing. The one I’ve used and had great results with is AndreaMosaic. It’s free and easy to use.

Here’s a sample mosaic I made. This is a scaled down version of the original, which is around 8MB. The original image is only 70×70, and the component images are thumbnail sized, around 100×100.

You don’t need high resolution images to make a mosaic, but the resulting image can have enough detail to produce poster sized prints. I’ve made 24×30 prints using nothing more than a low resolution source image and a bunch of thumbnails.

Once you figure out the basic approach and dimensions needed for the final images, you can be producing mosaics in a matter of minutes. The hardest part is coming up with enough feeder images to give the mosaic enough

approach

If you’ve got hundreds or thousands of images and you want to use those in the mosaic then you may not need to find any more feeder images. I like to use images related to the original’s subject matter, rather than just any image (although that can be interesting as well). So for the holiday dog picture I wanted holiday and dog pictures. The natural place to look was Google Images. You can search on anything and find any number of relevant images, already in thumbnail size, right on the results page. Since thumbnails are the perfect size for feeding into the mosaic there is no need to go to the host page and download the full-size version.

implementation

This Perl program takes a search term and a range, then fetches the matching images from Google Images. It saves them into a folder with the same name as the search term, in your current directory. The images are saved using the URL of the image, so if you re-run the search it won’t fetch an image it’s already stored.
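The naming scheme can be sketched in a few lines. This is a rough JavaScript rendering of what the tag handler in the Perl listing further down does, not a replacement for it. The sample thumbnail URL and the name thumbnailFilename are made up for illustration, and a Set stands in for the files already on disk.

```javascript
// Sketch only: mirrors the filename logic of the Perl tag handler.
// thumbnailFilename is a hypothetical name, not part of get.pl.
function thumbnailFilename(src) {
  // Only thumbnail links of the form /images?q=tbn:<id>:<image url>.jpg count.
  if (!/images\?q=tbn:.*\.jpg/i.test(src)) {
    return null;
  }
  // Drop everything up to the last colon, leaving the image's own URL.
  const imageUrl = src.replace(/\/images\?q=tbn:.*:/, "");
  // Escape it so the slashes can't be mistaken for directories.
  return encodeURIComponent(imageUrl);
}

// Assumed example of a thumbnail src attribute.
const sample = "/images?q=tbn:abc123:www.example.com/pics/dog.jpg";
const saved = new Set(); // stands in for the files already in the folder

for (const src of [sample, sample, "/nav_next.gif"]) {
  const name = thumbnailFilename(src);
  if (name === null) continue;        // not a thumbnail
  if (saved.has(name)) {
    console.log("Skipping " + name);  // already stored on an earlier pass
  } else {
    saved.add(name);
    console.log("Fetching " + name);
  }
}
```

Because the name is derived from the image's own URL, the same picture always maps to the same file, which is what makes the skip check work across runs.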


Usage: get.pl <search term> [start range] [end range]

search term: This is the query string passed to Google Images. It can be whatever you want, but if it is more than one word then you have to put the term in quotes. You can use the Google query language, like "flower AND rose", "rose -wine", etc.
start range: The starting index to retrieve. Google returns 20 images per page, so this will start retrieving the page that contains the start range image.
end range: The ending index to retrieve. The program will stop once it retrieves the page that contains the end range image.

Use start-end to control how many images to fetch. Usually you will just do something like

get.pl "flower" 0 100

If you later wanted to get more images of that type you can do

get.pl "flower" 100 500

This will avoid the images you’ve already retrieved and save you some time.
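To see how a range turns into page requests, here is a small sketch in JavaScript (not Perl, just to keep the examples on this page in one language). It mirrors the loop in the program: fetch one page, step forward by 20, stop once the offset reaches the end of the range. The name pageOffsets is hypothetical, and the sketch assumes Google always has another page to give.

```javascript
// Sketch only: which "start" offsets get requested for a given range.
// pageOffsets is a hypothetical helper, not part of get.pl.
function pageOffsets(start, stop, pageSize = 20) {
  const offsets = [];
  let offset = start;
  do {
    offsets.push(offset);   // the program always fetches at least one page
    offset += pageSize;     // advance to the next results page
  } while (offset < stop);  // stop once the end of the range is reached
  return offsets;
}

console.log(pageOffsets(0, 100));          // pages requested by the first command
console.log(pageOffsets(100, 500).length); // how many pages the second one needs
```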

Because you’re only downloading the thumbnails the program is usable even on dial-ups.

use HTML::Parser;
use HTTP::Request::Common;
use LWP;
use URI::Escape;

use strict;

$|=1;

my $client = LWP::UserAgent->new(agent=>'Mozilla', timeout=>'0', keep_alive=>1);
my $ua    = "Mozilla";
my $in    = "./";
my $query = shift; chomp($query);
my $start_idx = shift; chomp($start_idx);
my $end_idx = shift; chomp($end_idx);
my $url   = "http://images.google.com/images?q=$query+filetype:jpg\&safe=off";
my $start = $start_idx || "0";
my $stop = $end_idx || 0;
my $dest_dir = "$in/" . uri_escape ($query);

my $count = 1;

my $p = new HTML::Parser (
 api_version => 3,
 start_h     => [\&tag, "tagname, attr"],
);

print "Start = $start, Stop = $stop, Query = $query\n";
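# "||" binds tighter than mkdir's argument, so "or" is needed for die to fire;
# existing directories are left alone so that re-runs don't fail
-d $in or mkdir $in or die "Couldn't make $in ($!)\n";
-d $dest_dir or mkdir $dest_dir or die "Couldn't make $dest_dir ($!)\n";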


while (1) {
  my $test = $start;
 
  # Get the search results page
  my $request = HTTP::Request->new('GET', "${url}\&start=${start}");
  my $response = $client->request($request);
  
  $p->parse( $response->content );
  $p->eof;   # finish this page so the parser starts clean on the next one

  # See if we are out of images
  if ($test == $start || ($stop && ($start >= $stop))) {
    print "Done.\n";
    exit 0;
  }
}

sub tag {
  my ($tagname, $attr) = (@_);

  # Found the next page graphic, increment counter to continue grabbing
  if ($attr->{'src'} && ($attr->{'src'} eq "/nav_next.gif" )) {
        $start += 20;
  }

  return unless ($tagname eq 'img');
  return unless ($attr->{'src'} && $attr->{'src'} =~ /images\?q=tbn:.*\.jpg/i);
  my $filename = $attr->{'src'};
  $filename =~ s/\/images\?q=tbn:.*://;
  $filename = uri_escape($filename);

  if (-e "$dest_dir/$filename") {
    print "Skipping ";
  } else {
    my $request = HTTP::Request->new('GET', "http://images.google.com$attr->{'src'}");
    my $response = $client->request($request, "${dest_dir}/${filename}");
  }
  print "$filename (", $count++, ")\n";
}
• • •

March 20, 2005

Google Maps GPS GPX Waypoint Extractor

Category: GPS,Software — badsegue @ 1:17 am

background

I previously wrote about how to extract waypoints and create a GPX file from MSN Yellow Pages using a bookmarklet. This article explains how to do the same thing for Google Maps. I use mine to supplement the points of interest (POIs) in City Select North America v5 for my Garmin 76C.

POI coverage can be spotty, even in well established and stable cities. In newly developed or more remote areas there may be nothing at all. Online directories should have just about every business that would appear in the Yellow Pages, and are much more timely than the software releases. By tapping these online resources you can have the most accurate and complete set of business POIs possible.

approach

This can be done relatively easily because the search results are contained on a single page with the relevant data unencoded. Yahoo and most of the other online providers lack this ease of access. Even Google Local doesn’t put all the information on a single page; you’d have to drill down into each returned place to get the coordinates.

implementation

The code looks like this:

javascript:
(function(){
  var t=document.getElementById('vp').
        contentDocument.getElementsByTagName('SCRIPT').item(0).text;
  var pts=t.match(/<point .*?<\/title>/g);
  var doc=open().document;
  var bod=doc.body;
  doc.write('<textarea rows=%2250%22 cols=%22100%22>');
  doc.write('\n<gpx xmlns=%22http://www.topografix.com/GPX/1/1%22 ' +
            'creator=%22gpxextr%22 version=%221.1%22 ' +
            'xmlns:xsi=%22http://www.w3.org/2001/XMLSchema-instance%22>');
  for(var i=0;i<pts.length;i++){
    var latlon = pts[i].match(/(-?\d{2}\.\d{6}).*?(-?\d{2}\.\d{6}).*?title.*?>(.*?)<\/title>/);
    latlon[3] = latlon[3].replace(/<.*?>/g, '');
    doc.write('\n<wpt lat=%22', latlon[1], '%22 lon=%22', latlon[2],
    '%22>\n<name>', latlon[3], '</name>\n</wpt>');
  }
  doc.write('\n</gpx></textarea>');
  doc.close();
}
)()
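If you want to play with the matching outside the browser, here is a rough sketch that applies the same two regular expressions to a plain string. The sample markup is my guess at the general shape of the data, not a copy of what Google Maps actually sends, and extractWaypoints is a hypothetical name. Unlike the bookmarklet, it also copes with a page that has no matches.

```javascript
// Sketch only: the same two regular expressions as the bookmarklet, applied
// to a string instead of the live page. The sample markup is an assumption.
function extractWaypoints(text) {
  // Each result runs from its <point ...> element to the closing </title>.
  const points = text.match(/<point .*?<\/title>/g) || [];
  const waypoints = [];
  for (const point of points) {
    const m = point.match(
      /(-?\d{2}\.\d{6}).*?(-?\d{2}\.\d{6}).*?title.*?>(.*?)<\/title>/
    );
    if (m === null) continue; // fragment without two coordinates and a title
    waypoints.push({
      lat: m[1],
      lon: m[2],
      name: m[3].replace(/<.*?>/g, ""), // strip markup such as <b>...</b>
    });
  }
  return waypoints;
}

const sampleText =
  '<point lat="36.169700" lng="-75.755100"/>' +
  '<title xml:space="preserve"><b>Duck</b> Pizza</title>';
console.log(extractWaypoints(sampleText));
```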

The Google Maps GPX Waypoint Extractor link (Firefox only) will run a little Javascript bookmarklet that parses the interesting parts of the map link and writes them as a GPX file into a new browser window. (The MSN and Google extractors only work in Firefox/Mozilla for now. IE limits the length of a bookmark and right now it’s too long. There’s a way around this but it requires putting the code into a file and having the bookmarklet ‘inject’ it into the page. I haven’t got that working yet though.)

If you click on the link, the script won’t find anything on this page. Add the link (right-click and add it, or drag the link to your toolbar) then open Google Maps and do a search for “Pizza Duck,NC”. Now select the GPS GPX Extractor bookmark and you should get waypoints for all the results on the page. You should be able to import that file into most software that manages waypoints.

etc…

MSN Yellow Pages and Google Maps are the only sites I know of that have easily parsed coordinates. I think Google Local searches can be tapped as well, but it would require fetching each detail page. There are online GIS sources that can be used to get other types of waypoints, like parks and such. If you know of other sources of information that can be extracted like this, let me know.

• • •

March 14, 2005

MSN GPX GPS Waypoint Extractor

Category: GPS,Software — badsegue @ 1:09 am

background

If you have a mapping GPS you may have noticed that the map sets you have are missing lots of points of interest. While planning the annual trip to the beach I noticed that there were no POIs for the area in City Select North America v5. This is the map set I use on my Garmin 76C. The only update available is the next version, which I can get for $75, but is unlikely to be much better.

What I needed was a simple way to get accurate and up to date POIs for any area that I know I’m traveling to.

approach

Obviously if you’re looking for a type of business somewhere, you’re going to just search online. There are plenty of options that provide yellow pages, and can serve maps and driving directions for any given place.

I don’t need the maps or directions–the GPS handles that. I just need the latitude and longitude, preferably with multiple results on a single page. I don’t want to have to drill down to a different page for each returned result to get the coordinates.

A quick survey turned up MSN and Google as the most promising candidates. Google has their Local search, and the Maps beta. The Local search can return the largest number of results, but the coordinates in the page are for the center of the search region, not the results. The Maps search (dissected here) has the right data, but is limited to 10 results and I don’t see any way to get any more. That leaves MSN, which has the right data and also has other useful search options, like the ability to search within a radius of a specific address.

implementation

MSN Yellow Pages produces result details like this

Each matched business has a map link which has the latitude, longitude, and name.

I started with a Perl script that took a search term, made the HTTP connection, and parsed the results. I did something similar to download thumbnails from Google image searches, for feeding into a photo mosaic. This approach works ok, but it would be nicer to have it tied to the browser results page without having to run an external script. That way I can tweak the search interactively until I find what I want. MSN searches sometimes return a category page which you have to go beyond to get to the link lists, so running directly from the browser is useful.

A bookmarklet works well for this job. Bookmarklets are bits of Javascript that have been saved as a bookmark. When activated they run in the context of the current page in the browser, as if they were part of the page itself.
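As a rough illustration of the idea, this sketch turns an ordinary function into a javascript: URL that could be saved as a bookmark. The helper name toBookmarklet and the countLinks example are hypothetical. Note that encodeURIComponent turns double quotes into %22, the same encoding you can see in the listings on this page.

```javascript
// Sketch only: turn a function into a javascript: URL that can be saved as a
// bookmark. toBookmarklet is a hypothetical helper name.
function toBookmarklet(fn) {
  // Wrap the source in an immediately invoked function expression and
  // percent-encode it so quotes and spaces survive inside a URL.
  return "javascript:" + encodeURIComponent("(" + fn.toString() + ")()");
}

// Example payload: it only runs in a browser, once the bookmark is clicked.
function countLinks() {
  alert(document.getElementsByTagName("a").length + " links on this page");
}

const url = toBookmarklet(countLinks);
console.log(url);
console.log(url.length + " characters");
```

Printing the length is handy because some browsers cap how long a bookmark can be.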


This MSN GPX Waypoint Extractor link (Firefox only) will run a little Javascript bookmarklet that parses the interesting parts of the map link and writes them as a GPX file into a new browser window. (The MSN and Google extractors only work in Firefox/Mozilla for now. IE limits the length of a bookmark and right now it’s too long. There’s a way around this but it requires putting the code into a file and having the bookmarklet ‘inject’ it into the page. I haven’t got that working yet though.)

If you click on the link, the script will find the sample map link in this page, which isn’t that useful. Add the link (right-click and add it, or drag the link to your toolbar) then open the detailed search results of pizza places in Duck, NC. Now select the GPS GPX Extractor bookmark and you should get waypoints for all the results on the page.

The code looks like this:

javascript:
(function(){
  var i,x,h,n;
  var doc=open().document;
  var bod=doc.body;
  doc.write('<textarea rows=%2250%22 cols=%22100%22>');
  doc.write('\n<gpx xmlns=%22http://www.topografix.com/GPX/1/1%22 ' +
            'creator=%22gpxextr%22 version=%221.1%22 ' +
            'xmlns:xsi=%22http://www.w3.org/2001/XMLSchema-instance%22>');
  var links = document.getElementsByTagName('a');
  for(i=0;i < links.length; i++) {
    x=links[i];
    h=x.href;
    var latlon = h.match(/lat=([-\d]*)&POI1lng=([-\d]*)/);
    var nm = h.match(/POI1name=(.*?)&street/);
    if (latlon != null && !h.match(/^javascript/) && nm != null) {
      n = nm[1].replace(/\+/g, ' ');
      n = unescape(n);
      n = n.replace(/&/g, 'and');
      latlon[1] = latlon[1].replace(/0(\d\d)(\d*)/, '$1.$2');
      latlon[2] = latlon[2].replace(/0(\d\d)(\d*)/, '$1.$2');
      doc.write('\n<wpt lat=%22', latlon[1], '%22 lon=%22', latlon[2],
      '%22>\n<name>', n, '</name>\n</wpt>');
    }
  }
  doc.write('\n</gpx></textarea>');
  doc.close();
}
)()

There is a little bit of manipulation using regular expressions needed because the coordinates have a leading zero and no decimal.
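Here is that manipulation pulled out into a standalone sketch so you can try it on a single link. The href is an assumed example in the shape the regular expressions expect, not a real MSN URL, and parseMapLink is a hypothetical name. It uses decodeURIComponent in place of the older unescape(), and like the bookmarklet it assumes two-digit degrees behind the leading zero.

```javascript
// Sketch only: the link parsing from the bookmarklet as a standalone function.
// The href below is an assumed example of a map link, not a real one.
function parseMapLink(href) {
  const latlon = href.match(/lat=([-\d]*)&POI1lng=([-\d]*)/);
  const nm = href.match(/POI1name=(.*?)&street/);
  if (latlon === null || nm === null) return null;

  // "036163000" -> "36.163000": drop the leading zero, add the decimal point.
  const toDegrees = (s) => s.replace(/0(\d\d)(\d*)/, "$1.$2");

  // Plus signs are spaces; decodeURIComponent stands in for unescape().
  const decoded = decodeURIComponent(nm[1].replace(/\+/g, " "));
  const name = decoded.replace(/&/g, "and");

  return { lat: toDegrees(latlon[1]), lon: toDegrees(latlon[2]), name };
}

const href =
  "http://example.com/map?lat=036163000&POI1lng=-075755000" +
  "&POI1name=Duck+Pizza+%26+Subs&street=1+Main+St";
console.log(parseMapLink(href));
```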

Save the file and open it up in some program that understands GPX files, which should be most current versions of GPS map/waypoint software. I used G7toWin to make them all restaurants and send them to the 76C.
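For reference, the file the extractor produces is about as simple as GPX gets: a gpx root element with one wpt per place. The sketch below builds the same kind of document from a list of waypoints. buildGpx and escapeXml are hypothetical helper names, and the waypoint is made-up sample data. One difference from the bookmarklet: it escapes an ampersand as &amp;amp; instead of replacing it with "and".

```javascript
// Sketch only: build a minimal GPX 1.1 document from a list of waypoints.
// buildGpx and escapeXml are hypothetical helper names.
function escapeXml(s) {
  return s
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

function buildGpx(waypoints) {
  const lines = [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<gpx xmlns="http://www.topografix.com/GPX/1/1" creator="gpxextr" version="1.1">',
  ];
  for (const w of waypoints) {
    lines.push(`<wpt lat="${w.lat}" lon="${w.lon}">`);
    lines.push(`  <name>${escapeXml(w.name)}</name>`);
    lines.push("</wpt>");
  }
  lines.push("</gpx>");
  return lines.join("\n");
}

// Made-up waypoint for illustration.
const gpx = buildGpx([
  { lat: "36.163000", lon: "-75.755000", name: "Duck Pizza & Subs" },
]);
console.log(gpx);
```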


Here are the pizza places centered around Duck:

Here they are from the Find menu:

You can also open them up in USAPhotomaps and get the satellite views:
(Topo view)
(Sat view)

• • •