{"id":174,"date":"2008-10-13T08:36:36","date_gmt":"2008-10-13T13:36:36","guid":{"rendered":"http:\/\/www.kralidis.ca\/blog\/?p=174"},"modified":"2008-10-13T08:59:00","modified_gmt":"2008-10-13T13:59:00","slug":"clear-skies-with-python-and-tag-clouds","status":"publish","type":"post","link":"https:\/\/www.kralidis.ca\/blog\/2008\/10\/13\/clear-skies-with-python-and-tag-clouds\/","title":{"rendered":"Clear Skies with Python and Tag Clouds"},"content":{"rendered":"<p>I&#8217;ve been researching <a title=\"tag clouds\" href=\"http:\/\/en.wikipedia.org\/wiki\/Tag_cloud\">tag clouds<\/a> in the last few days.  I think tag clouds can help geospatial search front ends in giving the user a &#8220;weighted list&#8221;, to get them to what they want quickly and more efficiently.<\/p>\n<div id=\"attachment_180\" style=\"width: 302px\" class=\"wp-caption alignright\"><a href=\"http:\/\/www.kralidis.ca\/blog\/wp-content\/uploads\/2008\/10\/tag_cloud.jpg\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-180\" class=\"size-medium wp-image-180\" title=\"tag_cloud\" src=\"http:\/\/www.kralidis.ca\/blog\/wp-content\/uploads\/2008\/10\/tag_cloud.jpg\" alt=\"tag cloud example\" width=\"292\" height=\"109\" \/><\/a><p id=\"caption-attachment-180\" class=\"wp-caption-text\">tag cloud example<\/p><\/div>\n<p>The following Python script takes a list of terms as input.\u00a0 Such a list can be derived from many things, such as an existing taxonomy, analyzing an httpd log file for commonly used search terms, user votes, and so on.\u00a0 In this (simple) example, we use comma-separated input.<\/p>\n<p>Creating a term and count dictionary sets up the anatomy of a tag cloud.\u00a0 From here, you can pass this for output to the web (e.g. 
font sizes, colours, etc.).\u00a0 Here we output this to an <a title=\"APML\" href=\"http:\/\/www.apml.org\/\">APML<\/a> document, which is <a title=\"often used to represent tag clouds\" href=\"http:\/\/en.wikipedia.org\/wiki\/Tag_cloud#Visual_appearance\">often used to represent tag clouds<\/a>.\u00a0 You can then use tools such as <a title=\"cluztr\" href=\"http:\/\/www.cluztr.com\/\">cluztr<\/a> to generate tag clouds with ease.<\/p>\n<p>Considerations:<\/p>\n<ul>\n<li>The script assigns weights very simply (term count divided by 10), so values stay between 0.0 and 1.0 only for terms that occur at most 10 times<\/li>\n<li>It would be neat to apply these to searches against spatial identifiers (e.g. &#8220;Montreal&#8221;), and then map them accordingly<\/li>\n<li>It would be interesting to hear cartographers&#8217; thoughts on the tag cloud concept<\/li>\n<\/ul>\n<pre>#!\/usr\/bin\/python\r\n\r\nimport sys\r\nimport fileinput\r\nimport datetime\r\nfrom lxml import etree\r\n\r\n# term dictionary\r\ndTags = {}\r\ntn = datetime.datetime.now().isoformat()\r\n\r\nfor line in fileinput.input(sys.argv[1]):\r\n    aTags = line.strip().split(\",\")\r\n    for sTag in aTags:\r\n        # if term is not in list, add\r\n        if sTag not in dTags:\r\n            dTags[sTag] = 1\r\n        # else increment term count\r\n        else:\r\n            dTags[sTag] += 1\r\n\r\n# output as APML document\r\nnode = etree.Element('APML', nsmap={None: 'http:\/\/www.apml.org\/apml-0.6'})\r\nnode.attrib['version'] = '0.6'\r\nsubnode = etree.Element('Body')\r\nsubnode.attrib['defaultprofile'] = 'owscat'\r\nsubsubnode = etree.Element('Profile')\r\nsubsubnode.attrib['defaultprofile'] = 'Terms'\r\nsubsubsubnode = etree.Element('ImplicitData')\r\nsubsubsubsubnode = etree.Element('Concepts')\r\n\r\n# items() works on both Python 2 and Python 3\r\nfor term, count in sorted(dTags.items()):\r\n    termnode = etree.Element('Concept')\r\n    termnode.attrib['key']     = term\r\n    termnode.attrib['value']   = str(count \/ 10.0)\r\n    termnode.attrib['from']    = 'owscat'\r\n    
termnode.attrib['updated'] = str(tn)\r\n    subsubsubsubnode.append(termnode)\r\n\r\nsubsubsubnode.append(subsubsubsubnode)\r\nsubsubnode.append(subsubsubnode)\r\nsubnode.append(subsubnode)\r\nnode.append(subnode)\r\n\r\n# tostring() returns bytes when an encoding is given, so decode before printing\r\nprint(etree.tostring(node, xml_declaration=True, encoding='UTF-8', pretty_print=True).decode('UTF-8'))<\/pre>\n<link rel=\"stylesheet\" href=\"http:\/\/cdn.leafletjs.com\/leaflet-0.5\/leaflet.css\" \/>\n<!--[if lte IE 8]>\n  <link rel=\"stylesheet\" href=\"http:\/\/cdn.leafletjs.com\/leaflet-0.5\/leaflet.ie.css\" \/>\n<![endif]-->\n<script src=\"http:\/\/cdn.leafletjs.com\/leaflet-0.5\/leaflet.js\"><\/script>\n<style type=\"text\/css\">#map174 { width: 300px; height: 200px; }<\/style>\n\n<div id=\"map174\"><\/div>\n<script type=\"text\/javascript\">\n  var map174 = L.map('map174').setView([43.620495, -79.513198], 10);\n  L.tileLayer('http:\/\/{s}.tile.osm.org\/{z}\/{x}\/{y}.png', {\n      attribution: '&copy; <a href=\"http:\/\/osm.org\/copyright\">OpenStreetMap<\/a> contributors'\n  }).addTo(map174);\n<\/script>\n","protected":false},"excerpt":{"rendered":"<p>I&#8217;ve been researching tag clouds in the last few days. I think tag clouds can help geospatial search front ends in giving the user a &#8220;weighted list&#8221;, to get them to what they want quickly and more efficiently. 
The following Python script takes a list of terms as input.\u00a0 Such a list can be derived [&hellip;]<\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[5,7,3,11],"tags":[],"class_list":["post-174","post","type-post","status-publish","format-standard","hentry","category-geospatial","category-open-source","category-technology","category-web"],"_links":{"self":[{"href":"https:\/\/www.kralidis.ca\/blog\/wp-json\/wp\/v2\/posts\/174","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.kralidis.ca\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.kralidis.ca\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.kralidis.ca\/blog\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/www.kralidis.ca\/blog\/wp-json\/wp\/v2\/comments?post=174"}],"version-history":[{"count":16,"href":"https:\/\/www.kralidis.ca\/blog\/wp-json\/wp\/v2\/posts\/174\/revisions"}],"predecessor-version":[{"id":191,"href":"https:\/\/www.kralidis.ca\/blog\/wp-json\/wp\/v2\/posts\/174\/revisions\/191"}],"wp:attachment":[{"href":"https:\/\/www.kralidis.ca\/blog\/wp-json\/wp\/v2\/media?parent=174"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.kralidis.ca\/blog\/wp-json\/wp\/v2\/categories?post=174"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.kralidis.ca\/blog\/wp-json\/wp\/v2\/tags?post=174"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}