Saturday, August 16, 2014

Where clause on text.

Where clause.

A where clause in its basic form is used to filter features and is used with database formats.
The use of a where clause can deliver workspace related efficiency, by resulting in only the features necessary for the translation.
You can do much more in a database where clause (joins, sub-queries), but for the purpose of this post I will stick to its basic use (e.i filtering)


Inspired by Safe's blog post, I have downloaded some bird tracking data* from the Movebank Data Repository .

The data comes in a csv format, which is considered a database format in FME, but is actually plain text. The goal is to present the features on a map.

Cory's shearwater - going the distance.

Data content.

The csv file contains location information as lat/long coordinates among other types of sensor related information.
For more information about the data see the readme file provided.

Data transformation.

read data that cannot be used ?
To transform the location information into point features the VertexCreator transformer can be used.
However when doing so, disregarding the first law of FME (which is?), the transformation will halt because some features do not contain values in the location columns.
That can be easily solved by testing the data before creating the geometry. 
But by reading the entire dataset and then filtering unnecessary (or unusable) features, you are reading more than is necessary and it is not efficient.

Filtering while reading.

To my surprise, I have stumbled across a new functionality in the csv reader, that enables such filtering.
I say to my surprise, since I totally missed out on the announcement related to this addition.
This functionality is found at the csv reader parameters. First you have to enable it and then set it.

According to the documentation: "The filtering is done by a regular expression string that will be compared against the values of attribute fields specified."

This means that if you know your regular expressions, serious complex filtering can take place.

For this case it is a simple string that filters the lat/long attribute fields, returning only columns in which values are found.
Simple regular expression.


With and without filtering.
This new functionality offer new possibilities that did not exist before FME 2014. And even if it's not a where clause as in a database, the abilities to filter and sort are welcome useful additions.

* Gagliardo A, Bried J, Lambardi P, Luschi P, Wikelski M, Bonadonna F (2013) Oceanic navigation in Cory's shearwaters—evidence for a crucial role of olfactory cues for homing after displacement. Journal of Experimental Biology, v. 216, p. 2798-2805.