Creating sitemap files for GeoNetwork

Sitemaps are a valuable way to index your content for web crawlers.  GeoNetwork is a great tool for metadata management and a portal environment for discovery.  I wanted to push out all metadata resources out as a sitemap so that content can be found by web crawlers.  Python to the rescue:

#!/usr/bin/python
import MySQLdb
# connect to db
db=MySQLdb.connection(host='127.0.0.1', user='foo',passwd='foo',db='geonetwork')
# print out XML header
print """<?xml version="1.0" encoding="UTF-8"?>
<urlset
 xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
 xmlns:geo="http://www.google.com/geo/schemas/sitemap/1.0"
 xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
 xsi:schemaLocation="http://www.sitemaps.org/schemas/sitemap/0.9
 http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd">"""

# fetch all metadata
db.query("""select id, schemaId, changeDate from Metadata where isTemplate = 'n'""")
r = db.store_result()

for row in r.fetch_row(0): # write out a url element
    if row[1] == 'fgdc-std':
        url = 'http://devgeo.cciw.ca/geonetwork/srv/en/fgdc.xml'
    if row[1] == 'iso19139':
        url = 'http://devgeo.cciw.ca/geonetwork/srv/en/iso19139.xml'
    print """ <url>
  <loc>%s?id=%s</loc>
  <lastmod>%s</lastmod>
  <geo:geo>
   <geo:format>%s</geo:format>
  </geo:geo>
 </url>""" % (url, row[0], row[2], row[1])
print '</urlset>'

Done!  It would be great if this were an out-of-the-box feature of GeoNetwork.

1 Comment so far »

  1. links for 2009-04-17 « j03lar50n’s Blog said,

    Wrote on April 17, 2009 @ 19:31:45

    WordPress MU

    […] Creating sitemap files for GeoNetwork « tommy’s scratchpad (tags: web) […]

    Posted from United States United States
    WordPress MU

Comment RSS · TrackBack URI

Leave a Comment

Name: (Required)

E-mail: (Required)

Website:

Comment:

Modified: 15 April 2009 18:44:56 EST