07.04.2016, 18:27, alex kirjoitti:
Hi Ari,

Sorry for responding to your message late. I do appreciate your initiative to 
facilitate map algebra in GDAL.

No worries, the initiative and RFC will be active for some time still and decisions on it are still in the future.


What exactly do you mean by Map Algebra? I understand it to cover the following:

1. Local operations: pixel-by-pixel operations on raster bands.
2. Focal operations: operations that calculate an indicator for each pixel 
based on the values in surrounding pixels
3. Zonal operations: operations that calculate indicators for groups (zones) of 
pixels.
4. Global operations: operations that calculate indicators over all pixels.

This is the traditional understanding and I share it. I've come to map algebra from hydrological analysis of digital elevation models, spatial simulation with rasters, analysis of land cover data, etc. I've dealt less with multi-channel RS data but I understand that analysis of such data, for example classification, is an important application domain.


1. Local operations: pixel-by-pixel operations on raster bands.

To support this well, I think you should allow the user to specify a function 
and a combination of raster bands and constants that feed into the function, 
e.g:

my_average = apply(average_function, band1, band2, band3);

but the function should also be able to handle a mix of raster bands and 
constants, e.g.:

my_average = apply(average_function, band1, band2, 7.25);

Also, I think it would make sense to have overloads for all or most of the 
operators, so you can write:

my_average = (band1 + band2 + band3) / 3.0;

And, I would like to combine these:

my_average = (2* apply(average_function, band1, band2) + band3) / 3.0;

Is that the kind of functionality you are trying to offer?

That is one aspect, yes. However, so far I've considered only operations on one or two bands at time. Operations on more than two can of course be dissolved into a series of operations with max two at a time.

I agree that supporting overloading is important and it is more than just syntactic sugar since it can delegate some tasks to the compiler, which may create temporaries as needed etc.


If you want to support this, it seems to be best practice to use expression 
templates (https://en.wikipedia.org/wiki/Expression_templates), because that 
would allow to create complex compositions of operations without having to 
create temporary datasets (e.g. for the sum of band1 and band2 in the example 
above).

Yes, I agree that avoiding temporaries is a worthy goal, and what I understand of expression templates, motivation for them has been that.

However, expression templates seem to be very C++ centric and do their work in compile time. The C API and high level language bindings are very important in GDAL. Therefore expression templates might not be the best solution in this case. However, perhaps the idea can be exploited somehow.


The key thing that you need in order to implement this is an iterator over the 
raster bands. Eric Niebler writes at length about the benefits of using Ranges 
as a means of composing functions (https://github.com/ericniebler/range-v3 ) 
.You seem to be applying an iterator that iterates over blocks, rather than 
over pixels. How can that work when my_average, band1, band2 and band3 have 
different block sizes? Wouldn't your life be a lot easier if you iterated over 
pixels in the same order for all bands?.

I agree that it takes a bit of haggling to be able to work with more than one band at the same time when one has to maintain a separate cache for each band and each band has different block sizes. The situation is even worse for focal methods since one focal region can be on four blocks of one band in the worst case. It is however, in my opinion, an efficient approach when it is not feasible to load all raster data into memory. The approach is also strongly supported by the basic design of GDAL and the fundamental algorithm is presented in GDAL documentation (I gave the link in earlier email).

The solution I already have seems to work well for the one and two band case even with 3x3 focal region and as far as I can see it could be extended to cases involving more bands. The macros and functions I've written make cell (pixel) access possible with only a few lines of code but it could be probably improved. That would probably be desirable if more than two bands are present.


If you are interested, I have a solution, albeit in C++11 
(https://github.com/ahhz/map_algebra ).

I remember your email and I've taken a look at your code. I have nothing against using modern tools but I think GDAL is a bit conservative on them and portability is a big issue. However, I believe the map algebra implementation I'm writing will be a add-on or plugin for GDAL, and not something that will be tightly integrated into it.

  The following is a program using it:

#include <blink/map_algebra/map_algebra.h>

namespace ma = blink::map_algebra;

int my_function(int w, int x, int y, int z)
{
   return w * x + y * z;
}

int main()
{
   auto a = ma::open_read_only<int>("input_1.tif");
   auto b = ma::open_read_only<int>("input_2.tif");
// Example 1: Using operators
   auto w = ma::create_from_model<double>("output_1.tif", a);
   w = a + 3 * b;

   // Example 2: Using assigning operators
   auto x = ma::create_from_model<double>("output_2.tif", a);  x = 1;
   x *= a;
   x += 3 * b;
// Example 3: Map algebra using cell-by-cell functions
   auto y = ma::create_from_model<double>("output_3.tif", a);
   y = ma::apply(my_function, 1, a, 3, b);

   // Example 4: Combination
   auto z = ma::create_from_model<double>("output_4.tif", a);
   z = b + 3 * ma::apply(my_function, 1, a, 2, b);

   return 0;
}

2. Focal operations: operations that calculate an indicator for each pixel 
based on the values in surrounding pixels

I think the key challenge here is to allow users to specify the function to be 
applied on the surrounding pixels. In my opinion a Map Algebra library should 
support all kinds of moving window analysis, and not just a number of built-in 
types of analysis. This can be implemented as yet another range that allows 
iteration over the pixels in the same order as the raster bands, but instead of 
return pixel values, it returns neighbourhood / window summaries.

The hydrological algorithm would benefit a lot from random access to extended neighborhood cells but so far I've written only code that uses linear, block by block and pixel by pixel within blocks, access.




I have a C++11 library for this as well, but it is not yet working smoothly 
with the map algebra library (but it will!)
https://github.com/ahhz/moving_window
http://www.sciencedirect.com/science/article/pii/S0303243415300337

3. Zonal operations: operations that calculate indicators for groups (zones) of 
pixels.
This seems more trivial than the other two components of map algebra and I 
haven't given it much attention, but I think should be supported.

One complication, which I just today worked on, seems to stem from one requirement that I have. I want the API be free of templates and thus there is a need for generic classes for, e.g., numbers (GDAL supports bands where data is various length integers or floats). Thus, for example a simple zonal method, which returns which zones share a boundary returns a rather complex data structure.


4. Global operations
I don't think these need to be explicitly supported, because there is no 
genericity to exploit.

I find that I often need methods, which return for example the histogram of values in a band.


In summary:

What I am really trying to say is that developing map algebra functionality on 
top of GDAL or as part of GDAL seems very worthwhile and feasible. However, I 
would aim for functionality that is more generically supporting map algebra, 
rather than implementing specific map algebra tools. The foundation of such 
functionality in my opinion must be a GDALRange class that facilitates 
iteration over pixels in a uniform order. If I were you, I would first push a 
RFC to create a GDALRange, and only once that is done and dusted, consider the 
map algebra on top of it.

The RFC procedure of GDAL is open to everybody and the deciding body is the GDAL PSC, to which I do not belong to. As I wrote, the API I'm suggesting is essentially an add-on and thus leaves, even if adopted, room for other solutions.

Best regards,

Ari


Kind regards, Alex

(sorry for the self-promotion)

-----Original Message-----
From: gdal-dev [mailto:gdal-dev-boun...@lists.osgeo.org] On Behalf Of Ari
Jolma
Sent: 01 April 2016 09:39
To: gdal-dev@lists.osgeo.org
Subject: Re: [gdal-dev] Map algebra

I've got this into a rather nice working shape. The code needs to be
reorganized (it is now, horror, all in header files) but that's another
thing.

I've so far built this around a basic DEM catchment analysis workflow.
It includes many kinds of steps so it is a pretty good benchmark. In
pseudo code the workflow is something like

dem = open_from_file
dem->fill_depressions
fd = dem->compute_flow_directions
fd->route_flat_cells
ua = fd->compute_upstream_area
c = new_raster // compute the catchment into this
c->set_border_cells // we're interested in the outlet cells
c *= ua
c *= c > 10000 // we're interested in the big catchments
outlets = c->get_cells // outlets is an array of cells, and a cell is an
object containing (x,y,value)
c = 0
c->mark_catchment(fd, outlets[0], 1) // assuming the outlet cell is now
in outlets[0]
ua2 = new_raster // for visualization log of upstream area is nice
ua2 += ua
ua2->log

The C code is of course not this nice but it is not very far from this.
All the basic four method types I mention below are implemented and
adding new methods is not very difficult. The code makes heavy use of
templates and macros in an attempt for versatility (GDAL rasters can
have many datatypes) and easiness to write methods in a few lines of
code. The whole set of code is now ~2500 lines but it creates rather
large executables due to templates. The code is C++ but the API is C
like, I've so far done no changes to existing GDAL code because of this.

As written below, the code works on two levels of x,y loops: the blocks
within a band, which is generic and cells within a block, which is
separate for each method. The methods work on line by line (no recursion
etc.) and thus for example the fill_depressions is not optimal but
instead very simple. Also, very large rasters can be processed.

I think this may be somewhat similar thing in relation to GDAL as the
network support, so it could be a thing to add to the library if wished.

I've added RFC 62 "raster algebra" to spark discussion.

Ari


20.03.2016, 10:21, Ari Jolma kirjoitti:
Folks,

I played around a bit with the map algebra idea and came up with
something that could work nicely as a plugin. The initial code is at
https://github.com/ajolma/gdal/tree/trunk/gdal/map_algebra

The idea is based on processing raster bands block by block and
calling a given function on each block. Using macros and templates the
code should be rather nice and manageable even when supporting
multiple data types for raster cells.

Further, the idea is to support three kinds of map algebra methods,
mostly working on a raster band in-place.

1) Simple ones, which do not take any arguments beyond the raster band
itself, examples are functions like log, sin, rand.

2) Those which take one non-spatial argument. The argument can be a
scalar like a number, or more complex like a reclassifier.

3) Those which operate on two rasters. Examples are summation of
rasters and computing flow directions on a DEM. The latter is a bit
more complex since it is a focal method and requires a 3 x 3 matrix of
blocks from the other operand raster.

Maybe a fourth kind of methods are those which compute something from
a raster.

This would lead to four functions in the C API and raster band methods
in the bindings.

Comments are welcome, the initial code base contains a small demo
program. Eventually, if this works, I'll make a RFC from this.

Ari

_______________________________________________
gdal-dev mailing list
gdal-dev@lists.osgeo.org
http://lists.osgeo.org/mailman/listinfo/gdal-dev

_______________________________________________
gdal-dev mailing list
gdal-dev@lists.osgeo.org
http://lists.osgeo.org/mailman/listinfo/gdal-dev

Reply via email to