{"id":309,"date":"2009-04-15T18:44:56","date_gmt":"2009-04-15T23:44:56","guid":{"rendered":"http:\/\/www.kralidis.ca\/blog\/?p=309"},"modified":"2009-04-15T18:44:56","modified_gmt":"2009-04-15T23:44:56","slug":"creating-sitemap-files-for-geonetwork","status":"publish","type":"post","link":"https:\/\/www.kralidis.ca\/blog\/2009\/04\/15\/creating-sitemap-files-for-geonetwork\/","title":{"rendered":"Creating sitemap files for GeoNetwork"},"content":{"rendered":"<p><a title=\"Sitemaps\" href=\"http:\/\/www.sitemaps.org\/\">Sitemaps<\/a> are a valuable way to index your content for web crawlers.\u00a0 <a title=\"GeoNetwork\" href=\"http:\/\/geonetwork-opensource.org\/\">GeoNetwork<\/a> is a great tool for metadata management and a portal environment for discovery.\u00a0 I wanted to push all metadata resources out as a sitemap so that content can be found by web crawlers.\u00a0 Python to the rescue:<\/p>\n<pre>#!\/usr\/bin\/python<\/pre>\n<pre>\r\nimport MySQLdb<\/pre>\n<pre>\r\n# connect to db<\/pre>\n<pre>db=MySQLdb.connection(host='127.0.0.1', user='foo',passwd='foo',db='geonetwork')<\/pre>\n<pre>\r\n# print out XML header<\/pre>\n<pre>print \"\"\"&lt;?xml version=\"1.0\" encoding=\"UTF-8\"?&gt;<\/pre>\n<pre>&lt;urlset<\/pre>\n<pre>\u00a0xmlns=\"http:\/\/www.sitemaps.org\/schemas\/sitemap\/0.9\"<\/pre>\n<pre>\u00a0xmlns:geo=\"http:\/\/www.google.com\/geo\/schemas\/sitemap\/1.0\"<\/pre>\n<pre>\u00a0xmlns:xsi=\"http:\/\/www.w3.org\/2001\/XMLSchema-instance\"<\/pre>\n<pre>\u00a0xsi:schemaLocation=\"http:\/\/www.sitemaps.org\/schemas\/sitemap\/0.9<\/pre>\n<pre>\u00a0http:\/\/www.sitemaps.org\/schemas\/sitemap\/0.9\/sitemap.xsd\"&gt;\"\"\"\r\n\r\n# fetch all metadata<\/pre>\n<pre>db.query(\"\"\"select id, schemaId, changeDate from Metadata where isTemplate = 'n'\"\"\")<\/pre>\n<pre>r = db.store_result()\r\n\r\nfor row in r.fetch_row(0): # write out a url element<\/pre>\n<pre>\u00a0\u00a0\u00a0 if row[1] == 
'fgdc-std':<\/pre>\n<pre>\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 url = 'http:\/\/devgeo.cciw.ca\/geonetwork\/srv\/en\/fgdc.xml'<\/pre>\n<pre>\u00a0\u00a0\u00a0 elif row[1] == 'iso19139':<\/pre>\n<pre>\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 url = 'http:\/\/devgeo.cciw.ca\/geonetwork\/srv\/en\/iso19139.xml'\r\n\u00a0\u00a0\u00a0 else:\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 continue  # skip schemas that have no export URL defined here\r\n\u00a0\u00a0\u00a0 print \"\"\" &lt;url&gt;<\/pre>\n<pre>\u00a0 &lt;loc&gt;%s?id=%s&lt;\/loc&gt;<\/pre>\n<pre>\u00a0 &lt;lastmod&gt;%s&lt;\/lastmod&gt;<\/pre>\n<pre>\u00a0 &lt;geo:geo&gt;<\/pre>\n<pre>\u00a0\u00a0 &lt;geo:format&gt;%s&lt;\/geo:format&gt;<\/pre>\n<pre>\u00a0 &lt;\/geo:geo&gt;<\/pre>\n<pre>\u00a0&lt;\/url&gt;\"\"\" % (url, row[0], row[2], row[1])<\/pre>\n<pre>print '&lt;\/urlset&gt;'<\/pre>\n<p>Done!\u00a0 It would be great if this were an out-of-the-box feature of GeoNetwork.<\/p>\n<link rel=\"stylesheet\" href=\"http:\/\/cdn.leafletjs.com\/leaflet-0.5\/leaflet.css\" \/>\n<!--[if lte IE 8]>\n  <link rel=\"stylesheet\" href=\"http:\/\/cdn.leafletjs.com\/leaflet-0.5\/leaflet.ie.css\" \/>\n<![endif]-->\n<script src=\"http:\/\/cdn.leafletjs.com\/leaflet-0.5\/leaflet.js\"><\/script>\n<style type=\"text\/css\">#map309 { width: 300px; height: 200px; }<\/style>\n\n<div id=\"map309\"><\/div>\n<script type=\"text\/javascript\">\n  var map309 = L.map('map309').setView([43.620495, -79.513198], 10);\n  L.tileLayer('http:\/\/{s}.tile.osm.org\/{z}\/{x}\/{y}.png', {\n      attribution: '&copy; <a href=\"http:\/\/osm.org\/copyright\">OpenStreetMap<\/a> contributors'\n  }).addTo(map309);\n<\/script>\n","protected":false},"excerpt":{"rendered":"<p>Sitemaps are a valuable way to index your content for web crawlers.\u00a0 GeoNetwork is a great tool for metadata management and a portal environment for discovery.\u00a0 I wanted to push all metadata resources out as a sitemap so that content can be found by web crawlers.\u00a0 Python to the rescue: #!\/usr\/bin\/python import MySQLdb # 
[&hellip;]<\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[5,7,3,11],"tags":[],"class_list":["post-309","post","type-post","status-publish","format-standard","hentry","category-geospatial","category-open-source","category-technology","category-web"],"_links":{"self":[{"href":"https:\/\/www.kralidis.ca\/blog\/wp-json\/wp\/v2\/posts\/309","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.kralidis.ca\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.kralidis.ca\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.kralidis.ca\/blog\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/www.kralidis.ca\/blog\/wp-json\/wp\/v2\/comments?post=309"}],"version-history":[{"count":3,"href":"https:\/\/www.kralidis.ca\/blog\/wp-json\/wp\/v2\/posts\/309\/revisions"}],"predecessor-version":[{"id":312,"href":"https:\/\/www.kralidis.ca\/blog\/wp-json\/wp\/v2\/posts\/309\/revisions\/312"}],"wp:attachment":[{"href":"https:\/\/www.kralidis.ca\/blog\/wp-json\/wp\/v2\/media?parent=309"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.kralidis.ca\/blog\/wp-json\/wp\/v2\/categories?post=309"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.kralidis.ca\/blog\/wp-json\/wp\/v2\/tags?post=309"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}