Aggregation example

configuration-examples Aggregation

In modelling, data often need to be aggregated. This example presents how to aggregate data with the GeoDMS from the CBS neighborhood level towards the municipality level.

domain units and relation

An aggregation in GeoDMS terms means data is aggregated from a source domain-unit to a target domain unit (usually with less entries).

To make the aggregation, a relation is needed that relates the source domain unit to the target domain unit.

Example

container aggregation
{
   container units
   {
      unit<uint32>  nr_inh := BaseUnit('inhabitant', uint32);
      unit<float32> ratio  := float32(nr_inh) / float32(nr_inh);
      unit<float32> perc   := 100f * ratio;
   }
   unit<uint32> neighborhood
   : StorageName = "%SourceDataDir%/CBS/2017/neighborhood"
   , StorageType = "gdal.vect"
  {
     attribute<string>       code;
     attribute<units/nr_inh> nr_inhabitants
     attribute<units/perc>   perc_0_14_yr;

      // defined(_def) variants used to set missing values to zero
     attribute<units/nr_inh> nr_inhabitants_def := 
        nr_inhabitants >= 0i ? nr_inhabitants[units/nr_inh] : 0[units/nr_inh] / 0;
     attribute<units/perc>   perc_0_14_yr_def   :=  (perc_0_14_yr >= 0i && perc_0_14_yr <= 100i) 
           ? perc_0_14_yr[units/perc] 
           : 0[units/perc] / 0f;

     attribute<string>       municipality_code := substr(neighborhood/code,0,5);
     attribute<municipality> municipality_rel  := rlookup(municipality_code, municipality/values);
  }
  unit<uint32> municipality := unique(neighborhood/municipality_code)
  {
     attribute<units/nr_inh> sum_nr_inhabitants         := 
       sum(neighborhood/nr_inhabitants_def , neighborhood/municipality_rel);
     attribute<units/perc>   mean_perc_0_14_yr          := 
       mean(neighborhood/P_00_14_JR_DEF, neighborhood/municipality_rel);
     attribute<units/perc>   mean_weighted_perc_0_14_yr := 
        sum(neighborhood/P_00_14_JR_DEF * float32(buurt/AANT_INW_DEF) , neighborhood/municipality_rel) 
      / sum(float32(buurt/AANT_INW_DEF), neighborhood/municipality_rel);
  }
}

explanation

  • The attributes nr_inhabitants and perc_0_14_yr are aggregated from the neighborhood towards the municipality domain.
  • For that reason a municipality_rel relational attribute is configured with as domain unit : neighborhood and as values-unit : municipality. The data is calculated with the rlookup function.
  • For the nr_inhabitants attribute the actual aggregation is configured with the sum function, resulting in the sum_nr_inhabitants attribute.
  • For the perc_0_14_yr attribute two aggregations are configured:
  1. mean_perc_0_14_yr : the aggregation is configured with the mean function, resulting in the mean perc_0_14_yr attribute of all neighborhoods in a municipality.
  2. mean_weighted_perc_0_14_yr: the aggregation is configured with two sum functions, resulting in the weighted mean perc_0_14_yr attribute of all neighborhoods in a municipality. The nr_inhabitants is used as weight.

see also

In relational databases, SQL Group By statements are often used for aggregations. See the example [[Select … From … Group By … Select-…-From-…-Group-By-…]] for more information.