Archive for open source

Cheers to 2008

Since I did this last year, I thought I’d try this again for 2008. Here’s the lowdown for my 2008:

  • Geospatial Catalogues: the saga continues.  I have dug deeper into this area this year as part of my day-job, and find that interoperability is difficult to achieve in the OGC Catalogue space.  Clearly there is a balance between abstraction/flexibility and ease of integration.  And the two step approach to discovering, say, OGC WMS layers (invoke GetRecords, then chain to GetRecordById) is cumbersome, IMHO.  At the end of the day, the most common use cases (that I have seen) are publishing data and services, and being able to query for them (data, service endpoints, service resources [layers/feature types/coverages]) with spatial, temporal or aspatial predicates.  And have the content come back in some usable format for display or binding.  Seems easy, eh?
  • Publications: glad to see “Open Source Approaches in Spatial Data Handling” was finally published.  A lot of the well known folks in the foss4g community contributed to this.  At the same time, the release took so long (like many publishing processes) that some items ended up dated.  Overall, I think the book gives a good viewpoint into foss4g at this time, and makes me think about how far we’ve come.  It’s good for the community to be published in this format / manner.
  • JavaScript frameworks: they are everywhere.  Late this year I started delving back into the application space, and I find these challenging, at least compared to the days of doing things by hand.  2009 should shake off a lot of rust, I think.
  • MapServer:  We just launched a new website.  Beers for hobu!  Also, lots of OGC CITE fixes and improvements, work on the next generation of OGC standards, and updateSequence added to OWS support.
  • Python fun again: it’s been fun contributing to owslib for SOS, OWS and Filter support.  OWS Common presents a huge opportunity to abstract codebase when it comes to next generation OGC standards.  As well, I’ve been using Python for day-to-day scripts.  Not bad!
  • kralidis.ca turns 10: from humble beginnings, a lot less is done by hand now, and it is easier to manage (thanks WordPress!).

Other stuff:

  • Basement renovation: this took up most of my time this past year.  Frustrating, expensive (I should have been a plumber or electrician!), but gratifying.  Took a bit longer than expected, and still not 100% finished, but the major work is done.  I think this needed to happen for the property overall, even if it means I have more space than I could possibly need :) .  N.B. if you ever want to lose weight, do a home reno;  I shed 20lbs!
  • New job: I started a new job in the fall, which promises to be very exciting and satisfying, especially given the state of the geospatial web.  The new gig will give me more opportunity in the discovery and SensorWeb information management spaces, so I’m grateful for it.  I’ve also been spending 1/2 time with the GeoConnections program again, and it’s fun to work with some previous colleagues and get acquainted with new faces who are helping to shape and evolve our national infrastructure.  So thanks again to those who helped me along a tough road and got me here; I owe you big time :)

For 2009:

  • Work:  January 11 will mark 10 years of civil service for yours truly
  • Data dissemination: this is my key function in my day job for the months to come.  I look forward to evolving what started off as a very high level strategy into an architecture all the way to implementation.  This will be fun!
  • Standing up usable catalogues: you’ll see a few OGC Cat2.0 instances this year.
  • MapServer: more CITE fixes for SOS and O&M.  One thing I’d really like to see for 2009 is official compliance for OGC standards in MapServer
  • T.O. Code Sprint in March: this event is going to be fun.  What could be better than foss4g and beers, all in the centre of the universe :)
  • Renovations: I think that is it, for this place, for now.  Almost three years and it’s time for a rest in this space
  • Property: I think it will be a good time to buy in 2009.  The question (for me) is where.  Locally, or down south?

All the best for 2009 for you and your loved ones!

Written from home:

Shiny New MapServer Website

Check out the new MapServer website.  Based on Sphinx, the website now has a glossary, full PDF documentation, and a snazzy front page demo to boot.  I think this will result in a much more manageable and up-to-date website for the community.

Kudos to hobu for an amazing job!

Written from home:

Coding Fun in the T Dot

I’ve been helping Paul with setting up the OSGeo Toronto Code Sprint 2009 event here in Toronto in March.

This promises to be an effective and fun event, to have many developers from the various OSGeo projects in one place for a few days.

Check it out, and hope to see you there!

Written from home:

Revolution OS

Check out this video if you’re interested in learning more about the evolution of open source.  Pretty neat!

Written from home:

new Open Source Geospatial Book

Check out “Open Source Approaches in Spatial Data Handling” by Hall, Leahy et al. (disclosure: I did chapter 1).  An interesting read covering many facets and tools of open source geospatial.

Written from the Delta Ottawa:

Clear Skies with Python and Tag Clouds

I’ve been researching tag clouds in the last few days. I think tag clouds can help geospatial search front ends in giving the user a “weighted list”, to get them to what they want quickly and more efficiently.

tag cloud example

The following Python script takes a list of terms as input.  Such a list can be derived from many things, such as an existing taxonomy, analyzing an httpd log file for commonly used search terms, user votes, and so on.  In this (simple) example, we use comma separated input.

By creating a term and count dictionary, this sets up the anatomy of a tag cloud.  From here, you can pass this for output to the web (i.e. font sizes, colours, etc.).  Here we output this to an APML document, which is often used to represent tag clouds.  You can then use tools such as cluztr to generate tag clouds with ease.

Considerations:

  • the script weights terms very simply (count / 10.0), so weights only stay within 0.0 to 1.0 as long as no term occurs more than ten times
  • It would be neat to apply these to searches against spatial identifiers (i.e. “Montreal”), and then map them accordingly
  • It would be interesting to hear Cartographers’ thoughts on the tag cloud concept
#!/usr/bin/python

import sys
import fileinput
import datetime
from lxml import etree

# term dictionary
dTags = {}
tn = datetime.datetime.now().isoformat()

for line in fileinput.input(sys.argv[1]):
    aTags = line.strip().split(",")
    for sTag in aTags:
        # if term is not in list, add
        if sTag not in dTags:
            dTags[sTag] = 1
        # else increment term count
        else:
            dTags[sTag] += 1 

# output as APML document
node = etree.Element('APML', nsmap={None: 'http://www.apml.org/apml-0.6'})
node.attrib['version'] = '0.6'
subnode = etree.Element('Body')
subnode.attrib['defaultprofile'] = 'owscat'
subsubnode = etree.Element('Profile')
subsubnode.attrib['defaultprofile'] = 'Terms'
subsubsubnode = etree.Element('ImplicitData')
subsubsubsubnode = etree.Element('Concepts')

# items() works on both Python 2 and Python 3
for term, count in sorted(dTags.items()):
    termnode = etree.Element('Concept')
    termnode.attrib['key']     = term
    # simple weight: term count / 10.0
    termnode.attrib['value']   = str(count / 10.0)
    termnode.attrib['from']    = 'owscat'
    termnode.attrib['updated'] = str(tn)
    subsubsubsubnode.append(termnode)

subsubsubnode.append(subsubsubsubnode)
subsubnode.append(subsubsubnode)
subnode.append(subsubnode)
node.append(subnode)

# tostring() returns bytes when an encoding is given, so decode before printing
print(etree.tostring(node, xml_declaration=True, encoding='UTF-8', pretty_print=True).decode('utf-8'))
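
As a follow-on to the weighting consideration above, here is a minimal sketch (Python 3, standard library only) of one way to keep weights within 0.0 to 1.0 no matter how often a term occurs: divide each count by the largest count.  The names count_terms, weights and font_size, and the 10 to 32 pixel range, are my own assumptions for illustration; they are not part of the script above or of APML.

```python
from collections import Counter


def count_terms(lines):
    """Count comma-separated terms, ignoring blanks and surrounding spaces."""
    counts = Counter()
    for line in lines:
        for term in line.split(","):
            term = term.strip()
            if term:
                counts[term] += 1
    return counts


def weights(counts):
    """Scale counts so that the most frequent term gets a weight of 1.0."""
    if not counts:
        return {}
    top = max(counts.values())
    return {term: count / top for term, count in counts.items()}


def font_size(weight, smallest=10, largest=32):
    """Map a 0.0-1.0 weight linearly onto a font size in pixels (assumed range)."""
    return round(smallest + weight * (largest - smallest))


if __name__ == "__main__":
    # made-up input lines, in the same comma-separated form the script reads
    sample = ["wms,wfs,sos", "wms, sos", "wms"]
    for term, weight in sorted(weights(count_terms(sample)).items()):
        print(term, round(weight, 2), font_size(weight))
```

The resulting weights could be written into the Concept value attribute in place of count / 10.0, or fed straight to a font size when rendering the cloud.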

Written from home:

I heart this WMS

I’ve written enough catalogues, Capabilities parsers, map clients, and context import/export tools to know that having good example WMS instances is paramount in testing functionality and building features. I usually have a handy list of WMS servers which I constantly use when writing code.

Bird Studies Canada provides WMS access to their various bird distribution and abundance data. BSC has taken every effort to:

  • populate their Capabilities metadata exhaustively. Title, abstract, keywords, and even MetadataURL pointers to FGDC XML documents for all layers. And _full_ service provider metadata (including Attribution, which is great for displaying Logo images, etc.)
  • return GetFeatureInfo in both GML and HTML for prettier responses

This WMS is always at the top of my testing list, and it is my first answer when people ask to see a well-constructed WMS example. It serves catalogue and search demos very well indeed.
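
To show why exhaustive Capabilities metadata matters to a catalogue or search client, here is a minimal sketch (Python 3, standard library only) that harvests title, abstract, keywords and MetadataURL links per named layer.  The XML fragment is a made-up, heavily trimmed WMS 1.1.1 document, not BSC’s actual Capabilities, and named_layers is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# xlink:href attribute name as ElementTree expands it
XLINK_HREF = "{http://www.w3.org/1999/xlink}href"

# Hypothetical, heavily trimmed WMS 1.1.1 Capabilities fragment
SAMPLE = """<WMT_MS_Capabilities version="1.1.1">
  <Capability>
    <Layer>
      <Title>Example bird data</Title>
      <Layer queryable="1">
        <Name>example_layer</Name>
        <Title>Example species distribution</Title>
        <Abstract>Made-up abstract for illustration.</Abstract>
        <KeywordList>
          <Keyword>birds</Keyword>
          <Keyword>distribution</Keyword>
        </KeywordList>
        <MetadataURL type="FGDC">
          <Format>text/xml</Format>
          <OnlineResource xmlns:xlink="http://www.w3.org/1999/xlink"
                          xlink:type="simple"
                          xlink:href="http://example.org/metadata/example_layer.xml"/>
        </MetadataURL>
      </Layer>
    </Layer>
  </Capability>
</WMT_MS_Capabilities>"""


def named_layers(capabilities_xml):
    """Return one dict of discovery metadata per layer that has a Name."""
    root = ET.fromstring(capabilities_xml)
    layers = []
    for layer in root.iter("Layer"):
        name = layer.findtext("Name")
        if name is None:
            continue  # grouping layer only, cannot be requested via GetMap
        layers.append({
            "name": name,
            "title": layer.findtext("Title"),
            "abstract": layer.findtext("Abstract"),
            "keywords": [k.text for k in layer.findall("KeywordList/Keyword")],
            "metadata_urls": [r.get(XLINK_HREF)
                              for r in layer.findall("MetadataURL/OnlineResource")],
        })
    return layers


if __name__ == "__main__":
    for record in named_layers(SAMPLE):
        print(record)
```

When a server leaves these elements empty, every field in the harvested record is None or an empty list, and there is nothing for a search to match against.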

Kudos to BSC!

Written from home, but pointing to the BSC offices:

Making W*S suck less

I’m starting to work on contributing SOS and OWS Common support in OWSLib, a groovy and regimented little GIS Python project.

So far so good; some initial implementations are done (committing soon hopefully, writing tests around these).  I think this will add value to the project, seeing that SOS 1.0 has been around long enough to start seeing implementations.  And the OWS Common support will act as a baseline for all calling specs/code to leverage.

And it’s been a nice journey in Python for me so far.  Another thing I like about this project is the commitment to testing — awesome!

Written from home:

GDAL Saves the Day Again

A piece of work I help out with involves the visualization and access of hydrometric monitoring data over the Web. Part of this involves the data management and publishing of voluminous databases of monitoring information.

We use Chameleon for basic visualization and query of the data. Behind the scenes, we run a slew of complex processes (shell scripts via cron) to output the data in a format that can be understood by MapServer (which we use to publish WMS layers). The processes work across many disparate database connections, so outputting them to shapefiles and accessing them locally helps with performance in web mapping apps. ogr2ogr is used exclusively and extensively for the access and format translation.

Well, today I found out that an effort began to write a bunch of scripts to additionally output OGC KML. Thank goodness things didn’t get very far, because the following addition to our processes:

$ ogr2ogr -f KML foo.kml bar.ovf -dsco NameField=NAME -dsco DescriptionField=COMMENT

…worked like a charm, and put a big smile on people’s faces!

So now, OGC KML is also supported for visualization in Earth browsers. Just like that.
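
Our processes are shell scripts, but if you drive ogr2ogr from Python instead, the same call can be sketched as below.  This is only an illustration under assumptions: kml_command and export_kml are hypothetical helper names, the defaults mirror the field names used in the command above, and ogr2ogr must be on the PATH for export_kml to do anything.

```python
import subprocess


def kml_command(src, dst, name_field="NAME", description_field="COMMENT"):
    """Build the ogr2ogr argument list for a KML export."""
    return [
        "ogr2ogr", "-f", "KML", dst, src,
        # KML dataset creation options: which attributes feed <name> / <description>
        "-dsco", "NameField=" + name_field,
        "-dsco", "DescriptionField=" + description_field,
    ]


def export_kml(src, dst, **fields):
    """Run ogr2ogr; raises CalledProcessError if it exits non-zero."""
    subprocess.run(kml_command(src, dst, **fields), check=True)


if __name__ == "__main__":
    # only print the command here, so the sketch runs without GDAL installed
    print(" ".join(kml_command("bar.ovf", "foo.kml")))
```

Passing the arguments as a list avoids shell quoting problems when file or field names contain spaces.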

Output styles are relatively simple; I’m thinking a -dsco like:

-dsco LayerStyle=LayerName,styles.kml#mystyle

…would point to an existing (local or remote) KML style document style ID via XPointer, i.e.:

<styleUrl>somefile.kml#mystyle</styleUrl>

Of course the default behaviour would be in place if this -dsco is not defined. I’ll see what the GDAL KML gurus think about this.

At any rate, once again, thank you GDAL for being an uber-utility for day-to-day GIS tasks. Happy faces everywhere!

Written from home:

pivoting in Python

I needed to do some pre-processing of some data which involved transposing column names to values. The condition was that the value for each respective column (frequency count) had to be > 0, i.e. the phenomenon was measured at least once.

My input was a csv file, and my goal was an output csv file which would feed into a batch database import process.

ID,DA,NL,PHENOM1,PHENOM2,PHENOM3,PHENOM4
233,99,44,0.00,27.00,12.00,0.00

The other interesting bit was that only a range of columns applied to the condition; the other columns represented ancillary data.

Enter Python:

#!/usr/bin/python

import sys
import csv

# open file and read headers
fPhenomenon = open("phenomenon.txt","r")
sHeaders    = fPhenomenon.readline().replace(r'"','')
aHeaders    = sHeaders.split(",")

# feed the rest to csv
csvIn  = csv.reader(fPhenomenon)
csvOut = csv.writer(sys.stdout)

for sRowIn in csvIn:
    aRowOut = []
    aPhenomenon = []
    aRowOut.append(sRowIn[0]) # procedure ID
    aRowOut.append(sRowIn[1]) # major drainage area ID
    # columns from index 3 on are phenomena; note the -1 leaves out the last column
    for nIndexTupleVal, tupleVal in enumerate(sRowIn[3:-1]):
        if (float(tupleVal) > 0): # phenomenon measured at least once
            # add phenomenon name to list
            aPhenomenon.append(aHeaders[nIndexTupleVal+3])
    # add phenomenon list to record (once per row, after the loop)
    aRowOut.append(",".join(aPhenomenon))
    csvOut.writerow(aRowOut)

Notes

  • hooray for raw strings!
  • enumerate() is great and saves you the trouble of declaring your own counter
  • like any language, modules/libraries make things so easy to work with
  • I wish the header stuff was a bit cleaner (I should look further into the csv module w.r.t. headers)

That’s my hack for the day. Have a good weekend!

UPDATE: ah, the csv reader object has a .next() method (next(reader) in Python 3), which can be used instead of the shoemaker attempt I made above to regularize / split / store the header list.
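
Along the same lines, here is a minimal sketch (Python 3) of the cleaner header handling I was wishing for, using csv.DictReader so the header row is read and split for you.  It is a variation, not a drop-in replacement: it assumes every column from the fourth one on is a phenomenon (the script above leaves out the last column), and pivot and first_phenomenon are names made up for this example.

```python
import csv
import io
import sys

# the sample input from this post
SAMPLE = (
    "ID,DA,NL,PHENOM1,PHENOM2,PHENOM3,PHENOM4\n"
    "233,99,44,0.00,27.00,12.00,0.00\n"
)


def pivot(rows_in, first_phenomenon=3):
    """Yield [ID, DA, "comma-joined names of phenomena measured at least once"]."""
    reader = csv.DictReader(rows_in)
    # fieldnames is filled from the first line of the input
    phenomena = reader.fieldnames[first_phenomenon:]
    for row in reader:
        measured = [name for name in phenomena if float(row[name]) > 0]
        yield [row["ID"], row["DA"], ",".join(measured)]


if __name__ == "__main__":
    csv.writer(sys.stdout).writerows(pivot(io.StringIO(SAMPLE)))
```

For the sample row this gives 233, 99 and the joined list PHENOM2,PHENOM3, which the csv writer quotes because it contains a comma.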

Written from home:

Modified: 13 October 2008 12:13:08 EST