Proposal of some updates to read_habitatmap_xxx() #140

Merged on May 12, 2021 (13 commits)

@florisvdh (Member) commented May 11, 2021

  • Moved the version check for filter_hab to the front, so that a version mismatch raises the error immediately. Without this change, the habitatmap data source is read first, which can take a long time.
  • Documentation of read_habitatmap_stdized(): added the exception for "3130,rbbmr" etc., which applies since habitatmap_stdized_2020_v1.
  • A few minor updates.
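
As an illustration of the first bullet, here is a minimal base R sketch of the fail-fast idea. The function name read_with_early_check(), its arguments and the naming rule used to compare versions are assumptions made for this example; this is not the actual n2khab implementation.

```r
# Hypothetical helper: validate the cheap condition before the slow read.
read_with_early_check <- function(version, stdized_version, reader) {
    # Assumed naming rule for this sketch only:
    # "habitatmap_2018" should pair with "habitatmap_stdized_2018_*".
    expected <- sub("habitatmap", "habitatmap_stdized", version, fixed = TRUE)
    if (!startsWith(stdized_version, expected)) {
        # Stop here, so the expensive reader is never called on a mismatch.
        stop("Version mismatch between '", version,
             "' and '", stdized_version, "'.")
    }
    # Only now perform the (potentially slow) read.
    reader()
}
```

Here reader stands in for whatever function performs the slow read; passing it as an argument keeps the sketch self-contained.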

This also addresses some remaining parts of issue #117. Specifically, for the 2018 versions I propose reordering the xxx_types tibble columns for stdized/terr to match the column order of the 2020 data source version, which is more logical. Do you agree? Also pinging @ToonHub for this one.

Commit 9eff6ec on documenting the key goes back to inbo/n2khab-preprocessing#50 (comment):

Importantly, such new approach of the habitatmap_types table (the non-spatial table in the processed data source) will have to be documented both in the Zenodo metadata (for the new version at least) and read_habitatmap_stdized(). As far as I can tell from habitatmap_stdized metadata it was not (yet) documented what is the primary key for habitatmap_types, but I presume it was polygon_id, type. Now it will become polygon_id, type, certain.
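
To make the key change concrete, here is a small base R sketch that checks whether a set of columns uniquely identifies the rows of a table. The rows are made up for the example (they are not taken from the data source), and is_key() is a hypothetical helper, not part of n2khab.

```r
# Made-up rows: the same polygon holds one type both as certain and uncertain.
types <- data.frame(
    polygon_id = c("A", "A", "B"),
    type       = c("3130", "3130", "rbbmr"),
    certain    = c(TRUE, FALSE, TRUE),
    phab       = c(70L, 30L, 100L)
)

# Hypothetical helper: TRUE if no combination of `cols` occurs more than once.
is_key <- function(df, cols) {
    !anyDuplicated(df[, cols, drop = FALSE])
}

key_without_certain <- is_key(types, c("polygon_id", "type"))
key_with_certain    <- is_key(types, c("polygon_id", "type", "certain"))
```

In this toy table, polygon_id and type alone do not identify a row, while adding certain does.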

Results of testing the functions
library(n2khab)

filepaths_2018 <- 
    file.path(fileman_up("n2khab_data"),
              c("10_raw/habitatmap_2018",
                "20_processed/habitatmap_stdized_2018/habitatmap_stdized.gpkg",
                "20_processed/habitatmap_terr_2018/habitatmap_terr.gpkg"))

#############################################################
# 2020
#############################################################

read_habitatmap()
#> Simple feature collection with 646589 features and 30 fields
#> Geometry type: MULTIPOLYGON
#> Dimension:     XY
#> Bounding box:  xmin: 21991.38 ymin: 153058.3 xmax: 258871.8 ymax: 244027.3
#> Projected CRS: Belge 1972 / Belgian Lambert 72
#> # A tibble: 646,589 x 31
#>    polygon_id  eval  eenh1 eenh2 eenh3 eenh4 eenh5 eenh6 eenh7 eenh8 v1    v2   
#>  * <chr>       <fct> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
#>  1 000098_v20… m     b     <NA>  <NA>  <NA>  <NA>  <NA>  <NA>  <NA>  <NA>  <NA> 
#>  2 000132_v20… m     bl    <NA>  <NA>  <NA>  <NA>  <NA>  <NA>  <NA>  <NA>  <NA> 
#>  3 000135_v20… m     bl    <NA>  <NA>  <NA>  <NA>  <NA>  <NA>  <NA>  <NA>  <NA> 
#>  4 000136_v20… m     bl    <NA>  <NA>  <NA>  <NA>  <NA>  <NA>  <NA>  <NA>  <NA> 
#>  5 000142_v20… m     bl    <NA>  <NA>  <NA>  <NA>  <NA>  <NA>  <NA>  <NA>  <NA> 
#>  6 000150_v20… m     bl    <NA>  <NA>  <NA>  <NA>  <NA>  <NA>  <NA>  <NA>  <NA> 
#>  7 000297_v20… m     bl    <NA>  <NA>  <NA>  <NA>  <NA>  <NA>  <NA>  <NA>  <NA> 
#>  8 000991_v20… m     bl    <NA>  <NA>  <NA>  <NA>  <NA>  <NA>  <NA>  <NA>  <NA> 
#>  9 000999_v20… m     bl    <NA>  <NA>  <NA>  <NA>  <NA>  <NA>  <NA>  <NA>  <NA> 
#> 10 001000_v20… m     bl    <NA>  <NA>  <NA>  <NA>  <NA>  <NA>  <NA>  <NA>  <NA> 
#> # … with 646,579 more rows, and 19 more variables: v3 <chr>, source <chr>,
#> #   info <chr>, bwk_label <chr>, hab1 <chr>, phab1 <int>, hab2 <chr>,
#> #   phab2 <int>, hab3 <chr>, phab3 <int>, hab4 <chr>, phab4 <int>, hab5 <chr>,
#> #   phab5 <int>, source_hab <chr>, source_phab <chr>, hab_legend <fct>,
#> #   area_m2 <dbl>, geometry <MULTIPOLYGON [m]>

read_habitatmap(filter_hab = TRUE)
#> Simple feature collection with 87781 features and 30 fields
#> Geometry type: MULTIPOLYGON
#> Dimension:     XY
#> Bounding box:  xmin: 22003.2 ymin: 153084.4 xmax: 258871.8 ymax: 243446.1
#> Projected CRS: Belge 1972 / Belgian Lambert 72
#> # A tibble: 87,781 x 31
#>    polygon_id  eval  eenh1 eenh2 eenh3 eenh4 eenh5 eenh6 eenh7 eenh8 v1    v2   
#>  * <fct>       <fct> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
#>  1 130153_v20… mz    qb    pins  un    <NA>  <NA>  <NA>  <NA>  <NA>  <NA>  <NA> 
#>  2 130815_v20… mz    weg   kt(q… <NA>  <NA>  <NA>  <NA>  <NA>  <NA>  <NA>  <NA> 
#>  3 137826_v20… w     hrb   <NA>  <NA>  <NA>  <NA>  <NA>  <NA>  <NA>  <NA>  <NA> 
#>  4 170624_v20… w     mru   kbfr  kbcr  <NA>  <NA>  <NA>  <NA>  <NA>  <NA>  <NA> 
#>  5 203261_v20… wz    hr    mr    <NA>  <NA>  <NA>  <NA>  <NA>  <NA>  <NA>  <NA> 
#>  6 204352_v20… wz    kd    hr    sp    sz    <NA>  <NA>  <NA>  <NA>  <NA>  <NA> 
#>  7 204376_v20… wz    kd    kbp   sg    <NA>  <NA>  <NA>  <NA>  <NA>  <NA>  <NA> 
#>  8 205188_v20… wz    kt(s… khu   kbq   kt(h… <NA>  <NA>  <NA>  <NA>  <NA>  <NA> 
#>  9 205291_v20… wz    ku    sp    <NA>  <NA>  <NA>  <NA>  <NA>  <NA>  <NA>  <NA> 
#> 10 205756_v20… wz    lhi   sf    <NA>  <NA>  <NA>  <NA>  <NA>  <NA>  <NA>  <NA> 
#> # … with 87,771 more rows, and 19 more variables: v3 <chr>, source <chr>,
#> #   info <chr>, bwk_label <chr>, hab1 <chr>, phab1 <int>, hab2 <chr>,
#> #   phab2 <int>, hab3 <chr>, phab3 <int>, hab4 <chr>, phab4 <int>, hab5 <chr>,
#> #   phab5 <int>, source_hab <chr>, source_phab <chr>, hab_legend <fct>,
#> #   area_m2 <dbl>, geometry <MULTIPOLYGON [m]>

read_habitatmap_stdized()
#> $habitatmap_polygons
#> Simple feature collection with 87781 features and 2 fields
#> Geometry type: MULTIPOLYGON
#> Dimension:     XY
#> Bounding box:  xmin: 22003.2 ymin: 153084.4 xmax: 258871.8 ymax: 243446.1
#> Projected CRS: Belge 1972 / Belgian Lambert 72
#> # A tibble: 87,781 x 3
#>    polygon_id  description_orig                                             geom
#>  * <fct>       <chr>                                          <MULTIPOLYGON [m]>
#>  1 130153_v20… 70% 9120_qb; 30% … (((150669.4 227248.6, 150668.1 227242.9, 1505…
#>  2 130815_v20… 70% gh; 30% 9160   (((258338.4 158696.1, 258336.5 158693.5, 2583…
#>  3 137826_v20… 100% 6430,rbbhf    (((181587.2 234938, 181613.1 234933.1, 181646…
#>  4 170624_v20… 100% rbbmr         (((145876.8 229686.8, 145701.7 229680.6, 1456…
#>  5 203261_v20… 70% gh; 30% rbbmr  (((117137.2 210307.9, 117136.3 210288.3, 1171…
#>  6 204352_v20… 80% gh; 20% rbbsp  (((116357.3 159278.3, 116340.5 159259.8, 1163…
#>  7 204376_v20… 90% gh; 10% rbbsg  (((116110.1 210545.5, 116102.4 210541.1, 1160…
#>  8 205188_v20… 60% rbbsp; 40% gh  (((232114.7 161594.5, 232122.7 161590.5, 2321…
#>  9 205291_v20… 70% gh; 30% rbbsp  (((191253.5 160641.5, 191254.8 160636.1, 1912…
#> 10 205756_v20… 70% gh; 30% rbbsf  (((216258.8 156749, 216260.9 156750, 216287.7…
#> # … with 87,771 more rows
#> 
#> $habitatmap_types
#> # A tibble: 110,485 x 5
#>    polygon_id   type     certain code_orig  phab
#>    <fct>        <fct>    <lgl>   <chr>     <int>
#>  1 000038_v2016 91E0_va  TRUE    91E0_va     100
#>  2 000043_v2016 9130_end TRUE    9130_end    100
#>  3 000064_v2020 9130_end TRUE    9130_end    100
#>  4 000132_v2016 9130_end TRUE    9130_end    100
#>  5 000204_v2016 91E0_vn  TRUE    91E0_vn     100
#>  6 000255_v2016 91E0_vc  TRUE    91E0_vc     100
#>  7 000297_v2016 rbbsp    TRUE    rbbsp        10
#>  8 000311_v2016 9130_end TRUE    9130_end     70
#>  9 000311_v2016 rbbsp    TRUE    rbbsp        30
#> 10 000390_v2016 91E0_vn  TRUE    91E0_vn      70
#> # … with 110,475 more rows

read_habitatmap_terr()
#> $habitatmap_terr_polygons
#> Simple feature collection with 78602 features and 4 fields
#> Geometry type: MULTIPOLYGON
#> Dimension:     XY
#> Bounding box:  xmin: 22003.2 ymin: 153084.4 xmax: 258871.8 ymax: 243351.8
#> Projected CRS: Belge 1972 / Belgian Lambert 72
#> # A tibble: 78,602 x 5
#>    polygon_id  description_orig  description source                         geom
#>  * <fct>       <chr>             <chr>       <fct>            <MULTIPOLYGON [m]>
#>  1 130153_v20… 70% 9120_qb; 30%… 70% 9120_q… habita… (((150669.4 227248.6, 1506…
#>  2 130815_v20… 70% gh; 30% 9160  70% gh; 30… habita… (((258338.4 158696.1, 2583…
#>  3 137826_v20… 100% 6430,rbbhf   100% 6430_… habita… (((181587.2 234938, 181613…
#>  4 170624_v20… 100% rbbmr        100% rbbmr  habita… (((145876.8 229686.8, 1457…
#>  5 203261_v20… 70% gh; 30% rbbmr 70% gh; 30… habita… (((117137.2 210307.9, 1171…
#>  6 204352_v20… 80% gh; 20% rbbsp 80% gh; 20… habita… (((116357.3 159278.3, 1163…
#>  7 204376_v20… 90% gh; 10% rbbsg 90% gh; 10… habita… (((116110.1 210545.5, 1161…
#>  8 205188_v20… 60% rbbsp; 40% gh 60% rbbsp;… habita… (((232114.7 161594.5, 2321…
#>  9 205291_v20… 70% gh; 30% rbbsp 70% gh; 30… habita… (((191253.5 160641.5, 1912…
#> 10 205756_v20… 70% gh; 30% rbbsf 70% gh; 30… habita… (((216258.8 156749, 216260…
#> # … with 78,592 more rows
#> 
#> $habitatmap_terr_types
#> # A tibble: 99,784 x 6
#>    polygon_id   type     certain code_orig  phab source            
#>    <fct>        <fct>    <lgl>   <chr>     <int> <fct>             
#>  1 000038_v2016 91E0_va  TRUE    91E0_va     100 habitatmap_stdized
#>  2 000043_v2016 9130_end TRUE    9130_end    100 habitatmap_stdized
#>  3 000064_v2020 9130_end TRUE    9130_end    100 habitatmap_stdized
#>  4 000132_v2016 9130_end TRUE    9130_end    100 habitatmap_stdized
#>  5 000204_v2016 91E0_vn  TRUE    91E0_vn     100 habitatmap_stdized
#>  6 000255_v2016 91E0_vc  TRUE    91E0_vc     100 habitatmap_stdized
#>  7 000297_v2016 rbbsp    TRUE    rbbsp        10 habitatmap_stdized
#>  8 000311_v2016 9130_end TRUE    9130_end     70 habitatmap_stdized
#>  9 000311_v2016 rbbsp    TRUE    rbbsp        30 habitatmap_stdized
#> 10 000390_v2016 91E0_vn  TRUE    91E0_vn      70 habitatmap_stdized
#> # … with 99,774 more rows

read_habitatmap_terr(keep_aq_types = FALSE)
#> $habitatmap_terr_polygons
#> Simple feature collection with 78602 features and 4 fields
#> Geometry type: MULTIPOLYGON
#> Dimension:     XY
#> Bounding box:  xmin: 22003.2 ymin: 153084.4 xmax: 258871.8 ymax: 243351.8
#> Projected CRS: Belge 1972 / Belgian Lambert 72
#> # A tibble: 78,602 x 5
#>    polygon_id  description_orig  description source                         geom
#>  * <fct>       <chr>             <chr>       <fct>            <MULTIPOLYGON [m]>
#>  1 130153_v20… 70% 9120_qb; 30%… 70% 9120_q… habita… (((150669.4 227248.6, 1506…
#>  2 130815_v20… 70% gh; 30% 9160  70% gh; 30… habita… (((258338.4 158696.1, 2583…
#>  3 137826_v20… 100% 6430,rbbhf   100% 6430_… habita… (((181587.2 234938, 181613…
#>  4 170624_v20… 100% rbbmr        100% rbbmr  habita… (((145876.8 229686.8, 1457…
#>  5 203261_v20… 70% gh; 30% rbbmr 70% gh; 30… habita… (((117137.2 210307.9, 1171…
#>  6 204352_v20… 80% gh; 20% rbbsp 80% gh; 20… habita… (((116357.3 159278.3, 1163…
#>  7 204376_v20… 90% gh; 10% rbbsg 90% gh; 10… habita… (((116110.1 210545.5, 1161…
#>  8 205188_v20… 60% rbbsp; 40% gh 60% rbbsp;… habita… (((232114.7 161594.5, 2321…
#>  9 205291_v20… 70% gh; 30% rbbsp 70% gh; 30… habita… (((191253.5 160641.5, 1912…
#> 10 205756_v20… 70% gh; 30% rbbsf 70% gh; 30… habita… (((216258.8 156749, 216260…
#> # … with 78,592 more rows
#> 
#> $habitatmap_terr_types
#> # A tibble: 96,201 x 6
#>    polygon_id   type     certain code_orig  phab source            
#>    <fct>        <fct>    <lgl>   <chr>     <int> <fct>             
#>  1 000038_v2016 91E0_va  TRUE    91E0_va     100 habitatmap_stdized
#>  2 000043_v2016 9130_end TRUE    9130_end    100 habitatmap_stdized
#>  3 000064_v2020 9130_end TRUE    9130_end    100 habitatmap_stdized
#>  4 000132_v2016 9130_end TRUE    9130_end    100 habitatmap_stdized
#>  5 000204_v2016 91E0_vn  TRUE    91E0_vn     100 habitatmap_stdized
#>  6 000255_v2016 91E0_vc  TRUE    91E0_vc     100 habitatmap_stdized
#>  7 000297_v2016 rbbsp    TRUE    rbbsp        10 habitatmap_stdized
#>  8 000311_v2016 9130_end TRUE    9130_end     70 habitatmap_stdized
#>  9 000311_v2016 rbbsp    TRUE    rbbsp        30 habitatmap_stdized
#> 10 000390_v2016 91E0_vn  TRUE    91E0_vn      70 habitatmap_stdized
#> # … with 96,191 more rows

read_habitatmap_terr(keep_aq_types = FALSE, drop_7220 = FALSE)
#> $habitatmap_terr_polygons
#> Simple feature collection with 78602 features and 4 fields
#> Geometry type: MULTIPOLYGON
#> Dimension:     XY
#> Bounding box:  xmin: 22003.2 ymin: 153084.4 xmax: 258871.8 ymax: 243351.8
#> Projected CRS: Belge 1972 / Belgian Lambert 72
#> # A tibble: 78,602 x 5
#>    polygon_id  description_orig  description source                         geom
#>  * <fct>       <chr>             <chr>       <fct>            <MULTIPOLYGON [m]>
#>  1 130153_v20… 70% 9120_qb; 30%… 70% 9120_q… habita… (((150669.4 227248.6, 1506…
#>  2 130815_v20… 70% gh; 30% 9160  70% gh; 30… habita… (((258338.4 158696.1, 2583…
#>  3 137826_v20… 100% 6430,rbbhf   100% 6430_… habita… (((181587.2 234938, 181613…
#>  4 170624_v20… 100% rbbmr        100% rbbmr  habita… (((145876.8 229686.8, 1457…
#>  5 203261_v20… 70% gh; 30% rbbmr 70% gh; 30… habita… (((117137.2 210307.9, 1171…
#>  6 204352_v20… 80% gh; 20% rbbsp 80% gh; 20… habita… (((116357.3 159278.3, 1163…
#>  7 204376_v20… 90% gh; 10% rbbsg 90% gh; 10… habita… (((116110.1 210545.5, 1161…
#>  8 205188_v20… 60% rbbsp; 40% gh 60% rbbsp;… habita… (((232114.7 161594.5, 2321…
#>  9 205291_v20… 70% gh; 30% rbbsp 70% gh; 30… habita… (((191253.5 160641.5, 1912…
#> 10 205756_v20… 70% gh; 30% rbbsf 70% gh; 30… habita… (((216258.8 156749, 216260…
#> # … with 78,592 more rows
#> 
#> $habitatmap_terr_types
#> # A tibble: 96,269 x 6
#>    polygon_id   type     certain code_orig  phab source            
#>    <fct>        <fct>    <lgl>   <chr>     <int> <fct>             
#>  1 000038_v2016 91E0_va  TRUE    91E0_va     100 habitatmap_stdized
#>  2 000043_v2016 9130_end TRUE    9130_end    100 habitatmap_stdized
#>  3 000064_v2020 9130_end TRUE    9130_end    100 habitatmap_stdized
#>  4 000132_v2016 9130_end TRUE    9130_end    100 habitatmap_stdized
#>  5 000204_v2016 91E0_vn  TRUE    91E0_vn     100 habitatmap_stdized
#>  6 000255_v2016 91E0_vc  TRUE    91E0_vc     100 habitatmap_stdized
#>  7 000297_v2016 rbbsp    TRUE    rbbsp        10 habitatmap_stdized
#>  8 000311_v2016 9130_end TRUE    9130_end     70 habitatmap_stdized
#>  9 000311_v2016 rbbsp    TRUE    rbbsp        30 habitatmap_stdized
#> 10 000390_v2016 91E0_vn  TRUE    91E0_vn      70 habitatmap_stdized
#> # … with 96,259 more rows

#############################################################
# 2018
#############################################################

read_habitatmap(filepaths_2018[1], 
                version = "habitatmap_2018")
#> Simple feature collection with 618147 features and 30 fields
#> Geometry type: MULTIPOLYGON
#> Dimension:     XY
#> Bounding box:  xmin: 21991.38 ymin: 153058.3 xmax: 258871.8 ymax: 244027.3
#> Projected CRS: Belge 1972 / Belgian Lambert 72
#> # A tibble: 618,147 x 31
#>    polygon_id  eval  eenh1 eenh2 eenh3 eenh4 eenh5 eenh6 eenh7 eenh8 v1    v2   
#>  * <chr>       <fct> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
#>  1 000098_v20… m     b     <NA>  <NA>  <NA>  <NA>  <NA>  <NA>  <NA>  <NA>  <NA> 
#>  2 000132_v20… m     bl    <NA>  <NA>  <NA>  <NA>  <NA>  <NA>  <NA>  <NA>  <NA> 
#>  3 000135_v20… m     bl    <NA>  <NA>  <NA>  <NA>  <NA>  <NA>  <NA>  <NA>  <NA> 
#>  4 000136_v20… m     bl    <NA>  <NA>  <NA>  <NA>  <NA>  <NA>  <NA>  <NA>  <NA> 
#>  5 000142_v20… m     bl    <NA>  <NA>  <NA>  <NA>  <NA>  <NA>  <NA>  <NA>  <NA> 
#>  6 000150_v20… m     bl    <NA>  <NA>  <NA>  <NA>  <NA>  <NA>  <NA>  <NA>  <NA> 
#>  7 000152_v20… m     bl    <NA>  <NA>  <NA>  <NA>  <NA>  <NA>  <NA>  <NA>  <NA> 
#>  8 000154_v20… m     bl    <NA>  <NA>  <NA>  <NA>  <NA>  <NA>  <NA>  <NA>  <NA> 
#>  9 000297_v20… m     bl    <NA>  <NA>  <NA>  <NA>  <NA>  <NA>  <NA>  <NA>  <NA> 
#> 10 000805_v20… m     bl    <NA>  <NA>  <NA>  <NA>  <NA>  <NA>  <NA>  <NA>  <NA> 
#> # … with 618,137 more rows, and 19 more variables: v3 <chr>, source <chr>,
#> #   info <chr>, bwk_label <chr>, hab1 <chr>, phab1 <int>, hab2 <chr>,
#> #   phab2 <int>, hab3 <chr>, phab3 <int>, hab4 <chr>, phab4 <int>, hab5 <chr>,
#> #   phab5 <int>, source_hab <chr>, source_phab <chr>, hab_legend <fct>,
#> #   area_m2 <dbl>, geometry <MULTIPOLYGON [m]>

# error expected, since habitatmap_stdized 2018 is not in the default location
read_habitatmap(filepaths_2018[1], 
                filter_hab = TRUE, 
                version = "habitatmap_2018")
#> Error: You are trying to use habitatmap version 'habitatmap_2018' with another version of habitatmap_stdized. Specify the correct version as argument (version =) and add the corresponding files under 'n2khab_data/10_raw/habitatmap' and 'n2khab_data/20_processed/habitatmap_stdized'.

read_habitatmap_stdized(filepaths_2018[2], "habitatmap_stdized_2018_v2")
#> $habitatmap_polygons
#> Simple feature collection with 77048 features and 2 fields
#> Geometry type: MULTIPOLYGON
#> Dimension:     XY
#> Bounding box:  xmin: 22003.2 ymin: 153084.4 xmax: 258871.8 ymax: 243351.8
#> Projected CRS: Belge 1972 / Belgian Lambert 72
#> # A tibble: 77,048 x 3
#>    polygon_id  description_orig                                             geom
#>  * <fct>       <chr>                                          <MULTIPOLYGON [m]>
#>  1 129420_v20… 60% x; 40% 6430_hf (((242838.5 161337.3, 242842.8 161324.7, 2428…
#>  2 130153_v20… 70% 9120_qb; 30% … (((150669.4 227248.6, 150668.1 227242.9, 1505…
#>  3 130815_v20… 70% gh; 30% 9160   (((258338.4 158696.1, 258336.5 158693.5, 2583…
#>  4 137826_v20… 100% 6430,rbbhf    (((181587.2 234938, 181613.1 234933.1, 181646…
#>  5 170624_v20… 100% rbbmr         (((145876.8 229686.8, 145701.7 229680.6, 1456…
#>  6 203261_v20… 70% gh; 30% rbbmr  (((117137.2 210307.9, 117136.3 210288.3, 1171…
#>  7 204352_v20… 80% gh; 20% rbbsp  (((116357.3 159278.3, 116340.5 159259.8, 1163…
#>  8 204376_v20… 90% gh; 10% rbbsg  (((116110.1 210545.5, 116102.4 210541.1, 1160…
#>  9 205188_v20… 60% rbbsp; 40% gh  (((232114.7 161594.5, 232122.7 161590.5, 2321…
#> 10 205291_v20… 70% gh; 30% rbbsp  (((191253.5 160641.5, 191254.8 160636.1, 1912…
#> # … with 77,038 more rows
#> 
#> $habitatmap_types
#> # A tibble: 98,737 x 5
#>    polygon_id   type     certain code_orig  phab
#>    <fct>        <fct>    <lgl>   <chr>     <dbl>
#>  1 000038_v2016 91E0_va  TRUE    91E0_va     100
#>  2 000043_v2016 9130_end TRUE    9130_end    100
#>  3 000132_v2016 9130_end TRUE    9130_end    100
#>  4 000204_v2016 91E0_vn  TRUE    91E0_vn     100
#>  5 000255_v2016 91E0_vc  TRUE    91E0_vc     100
#>  6 000297_v2016 rbbsp    TRUE    rbbsp        10
#>  7 000311_v2016 9130_end TRUE    9130_end     70
#>  8 000311_v2016 rbbsp    TRUE    rbbsp        30
#>  9 000390_v2016 91E0_vn  TRUE    91E0_vn      70
#> 10 000390_v2016 91E0_va  TRUE    91E0_va      30
#> # … with 98,727 more rows

read_habitatmap_terr(filepaths_2018[3], 
                     version = "habitatmap_terr_2018_v2")
#> $habitatmap_terr_polygons
#> Simple feature collection with 68300 features and 4 fields
#> Geometry type: MULTIPOLYGON
#> Dimension:     XY
#> Bounding box:  xmin: 22003.2 ymin: 153084.4 xmax: 258871.8 ymax: 243351.8
#> Projected CRS: Belge 1972 / Belgian Lambert 72
#> # A tibble: 68,300 x 5
#>    polygon_id  description_orig  description source                         geom
#>  * <fct>       <chr>             <chr>       <fct>            <MULTIPOLYGON [m]>
#>  1 129420_v20… 60% x; 40% 6430_… 60% x; 40%… habita… (((242838.5 161337.3, 2428…
#>  2 130153_v20… 70% 9120_qb; 30%… 70% 9120_q… habita… (((150669.4 227248.6, 1506…
#>  3 130815_v20… 70% gh; 30% 9160  70% gh; 30… habita… (((258338.4 158696.1, 2583…
#>  4 137826_v20… 100% 6430,rbbhf   100% 6430_… habita… (((181587.2 234938, 181613…
#>  5 170624_v20… 100% rbbmr        100% rbbmr  habita… (((145876.8 229686.8, 1457…
#>  6 203261_v20… 70% gh; 30% rbbmr 70% gh; 30… habita… (((117137.2 210307.9, 1171…
#>  7 204352_v20… 80% gh; 20% rbbsp 80% gh; 20… habita… (((116357.3 159278.3, 1163…
#>  8 204376_v20… 90% gh; 10% rbbsg 90% gh; 10… habita… (((116110.1 210545.5, 1161…
#>  9 205188_v20… 60% rbbsp; 40% gh 60% rbbsp;… habita… (((232114.7 161594.5, 2321…
#> 10 205291_v20… 70% gh; 30% rbbsp 70% gh; 30… habita… (((191253.5 160641.5, 1912…
#> # … with 68,290 more rows
#> 
#> $habitatmap_terr_types
#> # A tibble: 87,473 x 6
#>    polygon_id   type     certain code_orig  phab source            
#>    <fct>        <fct>    <lgl>   <chr>     <dbl> <fct>             
#>  1 000038_v2016 91E0_va  TRUE    91E0_va     100 habitatmap_stdized
#>  2 000043_v2016 9130_end TRUE    9130_end    100 habitatmap_stdized
#>  3 000132_v2016 9130_end TRUE    9130_end    100 habitatmap_stdized
#>  4 000204_v2016 91E0_vn  TRUE    91E0_vn     100 habitatmap_stdized
#>  5 000255_v2016 91E0_vc  TRUE    91E0_vc     100 habitatmap_stdized
#>  6 000297_v2016 rbbsp    TRUE    rbbsp        10 habitatmap_stdized
#>  7 000311_v2016 9130_end TRUE    9130_end     70 habitatmap_stdized
#>  8 000311_v2016 rbbsp    TRUE    rbbsp        30 habitatmap_stdized
#>  9 000390_v2016 91E0_vn  TRUE    91E0_vn      70 habitatmap_stdized
#> 10 000390_v2016 91E0_va  TRUE    91E0_va      30 habitatmap_stdized
#> # … with 87,463 more rows

read_habitatmap_terr(filepaths_2018[3], 
                     version = "habitatmap_terr_2018_v2",
                     keep_aq_types = FALSE)
#> $habitatmap_terr_polygons
#> Simple feature collection with 68300 features and 4 fields
#> Geometry type: MULTIPOLYGON
#> Dimension:     XY
#> Bounding box:  xmin: 22003.2 ymin: 153084.4 xmax: 258871.8 ymax: 243351.8
#> Projected CRS: Belge 1972 / Belgian Lambert 72
#> # A tibble: 68,300 x 5
#>    polygon_id  description_orig  description source                         geom
#>  * <fct>       <chr>             <chr>       <fct>            <MULTIPOLYGON [m]>
#>  1 129420_v20… 60% x; 40% 6430_… 60% x; 40%… habita… (((242838.5 161337.3, 2428…
#>  2 130153_v20… 70% 9120_qb; 30%… 70% 9120_q… habita… (((150669.4 227248.6, 1506…
#>  3 130815_v20… 70% gh; 30% 9160  70% gh; 30… habita… (((258338.4 158696.1, 2583…
#>  4 137826_v20… 100% 6430,rbbhf   100% 6430_… habita… (((181587.2 234938, 181613…
#>  5 170624_v20… 100% rbbmr        100% rbbmr  habita… (((145876.8 229686.8, 1457…
#>  6 203261_v20… 70% gh; 30% rbbmr 70% gh; 30… habita… (((117137.2 210307.9, 1171…
#>  7 204352_v20… 80% gh; 20% rbbsp 80% gh; 20… habita… (((116357.3 159278.3, 1163…
#>  8 204376_v20… 90% gh; 10% rbbsg 90% gh; 10… habita… (((116110.1 210545.5, 1161…
#>  9 205188_v20… 60% rbbsp; 40% gh 60% rbbsp;… habita… (((232114.7 161594.5, 2321…
#> 10 205291_v20… 70% gh; 30% rbbsp 70% gh; 30… habita… (((191253.5 160641.5, 1912…
#> # … with 68,290 more rows
#> 
#> $habitatmap_terr_types
#> # A tibble: 85,051 x 6
#>    polygon_id   type     certain code_orig  phab source            
#>    <fct>        <fct>    <lgl>   <chr>     <dbl> <fct>             
#>  1 000038_v2016 91E0_va  TRUE    91E0_va     100 habitatmap_stdized
#>  2 000043_v2016 9130_end TRUE    9130_end    100 habitatmap_stdized
#>  3 000132_v2016 9130_end TRUE    9130_end    100 habitatmap_stdized
#>  4 000204_v2016 91E0_vn  TRUE    91E0_vn     100 habitatmap_stdized
#>  5 000255_v2016 91E0_vc  TRUE    91E0_vc     100 habitatmap_stdized
#>  6 000297_v2016 rbbsp    TRUE    rbbsp        10 habitatmap_stdized
#>  7 000311_v2016 9130_end TRUE    9130_end     70 habitatmap_stdized
#>  8 000311_v2016 rbbsp    TRUE    rbbsp        30 habitatmap_stdized
#>  9 000390_v2016 91E0_vn  TRUE    91E0_vn      70 habitatmap_stdized
#> 10 000390_v2016 91E0_va  TRUE    91E0_va      30 habitatmap_stdized
#> # … with 85,041 more rows

read_habitatmap_terr(filepaths_2018[3], 
                     version = "habitatmap_terr_2018_v2",
                     keep_aq_types = FALSE, drop_7220 = FALSE)
#> $habitatmap_terr_polygons
#> Simple feature collection with 68300 features and 4 fields
#> Geometry type: MULTIPOLYGON
#> Dimension:     XY
#> Bounding box:  xmin: 22003.2 ymin: 153084.4 xmax: 258871.8 ymax: 243351.8
#> Projected CRS: Belge 1972 / Belgian Lambert 72
#> # A tibble: 68,300 x 5
#>    polygon_id  description_orig  description source                         geom
#>  * <fct>       <chr>             <chr>       <fct>            <MULTIPOLYGON [m]>
#>  1 129420_v20… 60% x; 40% 6430_… 60% x; 40%… habita… (((242838.5 161337.3, 2428…
#>  2 130153_v20… 70% 9120_qb; 30%… 70% 9120_q… habita… (((150669.4 227248.6, 1506…
#>  3 130815_v20… 70% gh; 30% 9160  70% gh; 30… habita… (((258338.4 158696.1, 2583…
#>  4 137826_v20… 100% 6430,rbbhf   100% 6430_… habita… (((181587.2 234938, 181613…
#>  5 170624_v20… 100% rbbmr        100% rbbmr  habita… (((145876.8 229686.8, 1457…
#>  6 203261_v20… 70% gh; 30% rbbmr 70% gh; 30… habita… (((117137.2 210307.9, 1171…
#>  7 204352_v20… 80% gh; 20% rbbsp 80% gh; 20… habita… (((116357.3 159278.3, 1163…
#>  8 204376_v20… 90% gh; 10% rbbsg 90% gh; 10… habita… (((116110.1 210545.5, 1161…
#>  9 205188_v20… 60% rbbsp; 40% gh 60% rbbsp;… habita… (((232114.7 161594.5, 2321…
#> 10 205291_v20… 70% gh; 30% rbbsp 70% gh; 30… habita… (((191253.5 160641.5, 1912…
#> # … with 68,290 more rows
#> 
#> $habitatmap_terr_types
#> # A tibble: 85,091 x 6
#>    polygon_id   type     certain code_orig  phab source            
#>    <fct>        <fct>    <lgl>   <chr>     <dbl> <fct>             
#>  1 000038_v2016 91E0_va  TRUE    91E0_va     100 habitatmap_stdized
#>  2 000043_v2016 9130_end TRUE    9130_end    100 habitatmap_stdized
#>  3 000132_v2016 9130_end TRUE    9130_end    100 habitatmap_stdized
#>  4 000204_v2016 91E0_vn  TRUE    91E0_vn     100 habitatmap_stdized
#>  5 000255_v2016 91E0_vc  TRUE    91E0_vc     100 habitatmap_stdized
#>  6 000297_v2016 rbbsp    TRUE    rbbsp        10 habitatmap_stdized
#>  7 000311_v2016 9130_end TRUE    9130_end     70 habitatmap_stdized
#>  8 000311_v2016 rbbsp    TRUE    rbbsp        30 habitatmap_stdized
#>  9 000390_v2016 91E0_vn  TRUE    91E0_vn      70 habitatmap_stdized
#> 10 000390_v2016 91E0_va  TRUE    91E0_va      30 habitatmap_stdized
#> # … with 85,081 more rows

Created on 2021-05-11 by the reprex package (v2.0.0)

Session info
sessioninfo::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value                       
#>  version  R version 4.0.5 (2021-03-31)
#>  os       Linux Mint 20               
#>  system   x86_64, linux-gnu           
#>  ui       X11                         
#>  language nl_BE:nl                    
#>  collate  nl_BE.UTF-8                 
#>  ctype    nl_BE.UTF-8                 
#>  tz       Europe/Brussels             
#>  date     2021-05-11                  
#> 
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  package     * version    date       lib source        
#>  assertthat    0.2.1      2019-03-21 [1] CRAN (R 4.0.2)
#>  class         7.3-19     2021-05-03 [4] CRAN (R 4.0.5)
#>  classInt      0.4-3      2020-04-07 [1] CRAN (R 4.0.2)
#>  cli           2.4.0      2021-04-05 [1] CRAN (R 4.0.5)
#>  crayon        1.4.1      2021-02-08 [1] CRAN (R 4.0.3)
#>  DBI           1.1.1      2021-01-15 [1] CRAN (R 4.0.3)
#>  digest        0.6.27     2020-10-24 [1] CRAN (R 4.0.3)
#>  dplyr         1.0.5      2021-03-05 [1] CRAN (R 4.0.5)
#>  e1071         1.7-6      2021-03-18 [1] CRAN (R 4.0.5)
#>  ellipsis      0.3.1      2020-05-15 [1] CRAN (R 4.0.2)
#>  evaluate      0.14       2019-05-28 [1] CRAN (R 4.0.2)
#>  fansi         0.4.2      2021-01-15 [1] CRAN (R 4.0.3)
#>  forcats       0.5.1      2021-01-27 [1] CRAN (R 4.0.3)
#>  fs            1.5.0      2020-07-31 [1] CRAN (R 4.0.2)
#>  generics      0.1.0      2020-10-31 [1] CRAN (R 4.0.3)
#>  git2r         0.28.0     2021-01-10 [1] CRAN (R 4.0.3)
#>  git2rdata     0.3.1      2021-01-21 [1] CRAN (R 4.0.3)
#>  glue          1.4.2      2020-08-27 [1] CRAN (R 4.0.2)
#>  highr         0.8        2019-03-20 [1] CRAN (R 4.0.2)
#>  htmltools     0.5.1.1    2021-01-22 [1] CRAN (R 4.0.3)
#>  KernSmooth    2.23-18    2020-10-29 [4] CRAN (R 4.0.3)
#>  knitr         1.31       2021-01-27 [1] CRAN (R 4.0.3)
#>  lifecycle     1.0.0      2021-02-15 [1] CRAN (R 4.0.4)
#>  magrittr      2.0.1      2020-11-17 [1] CRAN (R 4.0.3)
#>  n2khab      * 0.4.0.9000 2021-05-11 [1] local         
#>  pillar        1.5.1      2021-03-05 [1] CRAN (R 4.0.5)
#>  pkgconfig     2.0.3      2019-09-22 [1] CRAN (R 4.0.2)
#>  plyr          1.8.6      2020-03-03 [1] CRAN (R 4.0.2)
#>  proxy         0.4-25     2021-03-05 [1] CRAN (R 4.0.4)
#>  purrr         0.3.4      2020-04-17 [1] CRAN (R 4.0.2)
#>  R6            2.5.0      2020-10-28 [1] CRAN (R 4.0.3)
#>  Rcpp          1.0.6      2021-01-15 [1] CRAN (R 4.0.3)
#>  reprex        2.0.0      2021-04-02 [1] CRAN (R 4.0.5)
#>  rlang         0.4.10     2020-12-30 [1] CRAN (R 4.0.3)
#>  rmarkdown     2.7        2021-02-19 [1] CRAN (R 4.0.4)
#>  rprojroot     2.0.2      2020-11-15 [1] CRAN (R 4.0.3)
#>  rstudioapi    0.13       2020-11-12 [1] CRAN (R 4.0.3)
#>  sessioninfo   1.1.1      2018-11-05 [1] CRAN (R 4.0.2)
#>  sf            0.9-8      2021-03-17 [1] CRAN (R 4.0.5)
#>  stringi       1.5.3      2020-09-09 [1] CRAN (R 4.0.2)
#>  stringr       1.4.0      2019-02-10 [1] CRAN (R 4.0.2)
#>  tibble        3.1.0      2021-02-25 [1] CRAN (R 4.0.5)
#>  tidyr         1.1.3      2021-03-03 [1] CRAN (R 4.0.5)
#>  tidyselect    1.1.0      2020-05-11 [1] CRAN (R 4.0.2)
#>  units         0.7-1      2021-03-16 [1] CRAN (R 4.0.5)
#>  utf8          1.2.1      2021-03-12 [1] CRAN (R 4.0.5)
#>  vctrs         0.3.7      2021-03-29 [1] CRAN (R 4.0.5)
#>  withr         2.4.1      2021-01-26 [1] CRAN (R 4.0.3)
#>  xfun          0.22       2021-03-11 [1] CRAN (R 4.0.4)
#>  yaml          2.2.1      2020-02-01 [1] CRAN (R 4.0.2)
#> 
#> [1] /home/floris/lib/R/library
#> [2] /usr/local/lib/R/site-library
#> [3] /usr/lib/R/site-library
#> [4] /usr/lib/R/library

@florisvdh florisvdh requested a review from cecileherr May 11, 2021 18:29
@florisvdh florisvdh changed the title from "Proposal of some updates to #139" to "Proposal of some updates to read_habitatmap_xxx()" May 12, 2021
Co-authored-by: Cécile Herr <31855012+cecileherr@users.noreply.github.com>
florisvdh and others added 3 commits May 12, 2021 10:41
Co-authored-by: Cécile Herr <31855012+cecileherr@users.noreply.github.com>
From https://github.com/inbo/n2khab-preprocessing/blob/b088c44/src/generate_habitatmap_stdized/10_generate_habmap_stdized.Rmd:

> An exception to this rule are following codes: `3130,rbbmr`, `3140,rbbmr`, `3150,rbbmr` and `3160,rbbmr`.

The R code in the above file also shows that only these codes are handled.
Comment on lines 51 to 52
#' created for each of them.
#' The variable \code{certain} in this case will be \code{TRUE} for both types.
Collaborator

About issue #117:

You may want to add a note on the interpretation of phab for code combinations that come from one column of the original habitatmap.

This should also be mentioned in the geospatial hab vignette, but I think it can be useful here too, so:

Suggested change
- #' created for each of them.
- #' The variable \code{certain} in this case will be \code{TRUE} for both types.
+ #' created for each of them, with \code{phab} for each new row simply
+ #' set to the original value of \code{phab}.
+ #' The variable \code{certain} in this case will be \code{TRUE} for both types.
+ #' \itemize{
+ #' \item Note that this implies that the a given polygon could contain the same
+ #' type with variable \code{certain} as \code{TRUE} several times, e.g. when:
+ #' \code{31xx_rbbmr} is present with \code{phab} = yy% and
+ #' \code{31xx} is present with \code{phab} = zz%.
+ #' In that case the rows with the same \code{polygon_id}, \code{type}
+ #' and \code{certain} were gathered into one row and the respective
+ #' \code{phab} were summed up

Member Author

Perfect, thanks! I'll re-add your suggestion as a new commit, as the lines it depends on are flagged as outdated (so the suggestion cannot be applied directly).

Member Author

I'm supposing this bullet is at a lower level. The {} of \itemize needed to be closed 😉

Collaborator

Yes, of course :-)

Member Author

@cecileherr Are you sure this is restricted to 31xx_rbbmr and it does not occur in general?

Anyway, the code executes this as a general rule. So I think it will be better to state it in general and add this bullet as a subnote under the first bullet (where you added 'with phab for each new row simply set to the original value of phab').

I leave this for you to consider; I'll postpone the initial plan of adding the above suggestion.

Member Author

Sidenote: it appears that '%' needs to be escaped as \%

Collaborator

> @cecileherr Are you sure this is restricted to 31xx_rbbmr and it does not occur in general?
>
> Anyway, the code executes this as a general rule. So I think it will be better to state it in general and add this bullet as a subnote under the first bullet (where you added 'with phab for each new row simply set to the original value of phab').

I thought it was a general rule in the code that in practice only applied to the 31xx_rbbmr case, but I was wrong (e.g. 605723_v2014 is an example for the uncertain case). I would be tempted to add this as a 3rd item (after the uncertain case and the 31xx_rbbmr case), instead of making a subitem out of it.

Something like (other items abbreviated):

#'     \itemize{

#'     \item For some polygons the vegetation type is uncertain (...)
#'     if only one vegetation type is provided.

#'     \item Some polygons contain both a standing water habitat type (...)
#'     created for each of them, with \code{phab} for each new row simply 
#'     set to the original value of \code{phab}. 
#'     The variable \code{certain} in this case will be \code{TRUE} for both types.

#'     \item After those first two steps, a given polygon could contain the same 
#'     type with the same value for \code{certain} several times, e.g. when:
#'     \code{31xx_rbbmr} is present with \code{phab} = yy\% and
#'     \code{31xx} is present with \code{phab} = zz\%.
#'     In that case the rows with the same \code{polygon_id}, \code{type} 
#'     and \code{certain} were gathered into one row and the respective 
#'     \code{phab} were summed up.

#'     \item For some polygons the original vegetation code in the (...)
#'     code was adjusted.
#'   }
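
The third item above (gathering rows that share `polygon_id`, `type` and `certain`, and adding up their `phab`) can be illustrated with a minimal base R sketch. The data frame `types` and its values are made up for illustration; this is not the code used in n2khab-preprocessing, only an assumption about what such an aggregation could look like.

```r
# Hypothetical rows as they might look after the first two steps:
# polygon "A" had the code "3130,rbbmr" with phab = 30 (split into two rows)
# and a separate "3130" entry with phab = 20.
types <- data.frame(
  polygon_id = c("A", "A", "A", "B"),
  type = c("3130", "rbbmr", "3130", "3130"),
  certain = c(TRUE, TRUE, TRUE, FALSE),
  phab = c(30, 30, 20, 100),
  stringsAsFactors = FALSE
)

# Gather rows with the same (polygon_id, type, certain) into one row
# and add up their phab values.
types_agg <- aggregate(
  phab ~ polygon_id + type + certain,
  data = types,
  FUN = sum
)

# After this step (polygon_id, type, certain) identifies each row uniquely.
key_is_unique <- anyDuplicated(
  types_agg[, c("polygon_id", "type", "certain")]
) == 0
```

In this sketch the two `3130` rows of polygon "A" collapse into a single row, while the `rbbmr` row and the uncertain `3130` row of polygon "B" are left unchanged.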

@florisvdh: your thoughts on this?

Member Author

Good idea @cecileherr. It better fits the idea of 'steps'.

Minor comments:

  • when: drop colon
  • summed up: means 'enumerated'; should become something like 'added up'.
  • with the same value for \code{certain} several times: I think '[...] repeated several times' is better

I suggest you add another suggestion in this PR, or else do it as a commit in your branch update_habitat2020_ch (including the Rd file update) after merging this PR. Merging this branch is up to you anyway, since its target branch is yours.

Collaborator

@cecileherr cecileherr left a comment

Many thanks for the update (and sorry, I had obviously missed issue #117).
Maybe just take a look at this suggestion:
#140 (comment)
and then I think we will have covered all open questions, so we should be able to merge.

@florisvdh
Member Author

Thank you for reviewing @cecileherr 🙏.

@florisvdh
Member Author

@cecileherr for me this PR is finished, you can merge

@cecileherr
Collaborator

> @cecileherr for me this PR is finished, you can merge

Many thanks! Let's go!

@cecileherr cecileherr merged commit 48b79f7 into update_habitat2020_ch May 12, 2021
@cecileherr cecileherr deleted the update_habitat2020_ch_fv branch May 12, 2021 13:28