Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Great to have a "how to create random sample points on maps using R" section in the GIS basics in the next round #83

Open
pbkeating opened this issue Dec 15, 2021 · 8 comments

Comments

@pbkeating
Copy link

As above

@pbkeating pbkeating changed the title Great to have a "how to create random sample points using R" in the GIS basics in the next round Great to have a "how to create random sample points using R" section in the GIS basics in the next round Dec 15, 2021
@pbkeating pbkeating changed the title Great to have a "how to create random sample points using R" section in the GIS basics in the next round Great to have a "how to create random sample points on maps using R" section in the GIS basics in the next round Dec 15, 2021
@pbkeating
Copy link
Author


bgd_adm3 <- sf::read_sf(here::here("BGD admin shapefile/BGD_adm3.shp")) %>% 
        janitor::clean_names()

p1 <- st_sample(bgd_adm3, size = 19, type = "random")
sample <-do.call(rbind.data.frame, p1) %>% 
        ## rename columns
        rename_at(1, ~ "lat") %>% 
        rename_at(2, ~ "lon") %>% 
        st_as_sf(coords = c("lat", "lon"))


st_write(sample, 
         here::here("8_output", 
                    "sample_random_points_gpxTEST3.gpx"),
         driver = "GPX")

@pbkeating
Copy link
Author

pbkeating commented Jan 12, 2022

Code to do PPS at block/sector level in a camp and to export the points as .gpx

## import shapefile
block_boundaries <- sf::read_sf(here::here("Camp shape file/block_boundaries_2021_w20ext.shp")) %>% 
        janitor::clean_names() %>% 
        ## change CRS (if necessary)
        st_transform(crs = 4326)

## import the clusters per block data
block_clusters <- rio::import(here::here("4_sampling_frame", 
                                         "block_clusters.xlsx")) %>% 
        janitor::clean_names()
##  merge cluster data with block shapefile
block_sample <- block_boundaries %>% 
        left_join(block_clusters, by = "block_code") 
## identify all the unique block code values
block_names <- unique(block_sample$block_code)
cluster_size <- block_sample$clusters

## Create a function to choose random samples PPS per block
lqas_sample <- function(block_names2, cluster_size2){
        st_sample(block_sample %>%  
                          filter(block_code == block_names2), 
                  size = cluster_size2, type = "random") %>% 
                ## convert the list into a dataframe
                do.call(rbind.data.frame,.) %>% 
                # rename columns
                rename_at(1, ~ "lat") %>% 
                rename_at(2, ~ "lon") %>% 
                ## convert the dataframe to an sf object with lat and lon
                st_as_sf(coords = c("lat", "lon")) %>% 
                ## write the sf object as a GPX file
                st_write(here::here("8_output", str_glue(
                    "block_sample_{block_names2}.gpx")),
         driver = "GPX"
         )
}

## Iterate over the blocks and cluster sizes and export to the output folder
my_samples <- map2(
        .x = block_names,
        .y = cluster_size,
        .f = ~lqas_sample(block_names2 = .x, cluster_size2 = .y))

@aspina7
Copy link
Contributor

aspina7 commented Jan 20, 2022

@AlexandreBlake this too!

@AlexandreBlake
Copy link
Contributor

I saw this one too. But I need to play with Pat's data to see exactly what it does. Because I see no point cleaning, I would assume that the sampling is just about the 1st stage of the sampling. If it is the case, I am not sure the GIS strategy is competitive enough compared to the "usual" approach. Again: I need to run it.

@pbkeating
Copy link
Author

Do you mean doing it in ArcGIS/QGIS? Can you elaborate more on what would be included in point cleaning? I don't think there is any manipulation of the selected points that can be done in R. I'll email you the files so you can have a look. There isn't anything confidential in them.

@AlexandreBlake
Copy link
Contributor

Hey Pat! I have a chunk doing that: it uses a while loop to show the points 1 by 1 on a tile with a radius around it. The trouble is that it used to be simpler to get tiles out of Google map. Now there needs to be an account to get an API and there is a limit to how many tiles a day you can download for free.

Here is an old chunk assisting in the cleaning. It probably needs some updating and polishing it draws the points and does the cleaning through R directly but it works better if the polygon has been massaged first to exclude as much empty space as possible (on Google Earth/QGIS).

geo_sampler <- function(polygon=pap,radius=10){
require(maptools)
require(rgdal)
require(rgeos)
require(ggplot2)
require(ggmap)

if(is.na(proj4string(polygon))){
	stop("Missing CRS")
}else{
	data <- data.frame()
	n_point <- nrow(data)

	polygon <- spTransform(polygon,CRS("+proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0"))
	spoint <- spsample(polygon,1,"random",iter=10)
	point <- data.frame(lon=coordinates(spoint)[,1],lat=coordinates(spoint)[,2])

	map <- get_map(location=c(point$lon,point$lat),zoom=19,maptype='satellite')
	map <- ggmap(map)+
			geom_polygon(data=buffer(spoints=spoint,radius=radius,crs=CRS_haiti,fortify=TRUE),aes(x=long,y=lat),fill=NA,color="red")+
			geom_point(data=point,aes(x=lon,y=lat),size=1.5,color="red")
	print(map)

	cat("\nType 'T' to keep the point and 'F' to draw another point.\nType 'q' to quit. The validated points are saved by quitting.\n")
	answer <- readline(prompt="\nKeep this point:")
	if(!(answer %in% c("T","F","q"))){
		cat("\nIncorrect answer, try again.\n")
		cat("\nType 'T' to keep the point and 'F' to draw another point.\nType 'q' to quit.\n")
		answer <- readline(prompt="\nKeep this point:")
	}

	if(answer=="T"){
		data <- rbind(data,coordinates(spoint))
		n <- nrow(data)
		cat(paste("\n",n," point(s) currently validated.\n"))
	}

	while(answer!="q"){
		spoint <- spsample(polygon,1,"random",iter=10)
		point <- data.frame(lon=coordinates(spoint)[,1],lat=coordinates(spoint)[,2])

		map <- get_map(location=c(point$lon,point$lat),zoom=19,maptype='satellite')
		map <- ggmap(map)+
				geom_polygon(data=buffer(spoints=spoint,radius=radius,crs=CRS_haiti,fortify=TRUE),aes(x=long,y=lat),fill=NA,color="red")+
				geom_point(data=point,aes(x=lon,y=lat),size=1.5,color="red")
		print(map)

		cat("\nType 'T' to keep the point and 'F' to draw another point.\nType 'q' to quit.\n")
		answer <- readline(prompt="\nKeep this point:")
		if(!(answer %in% c("T","F","q"))){
			cat("\nIncorrect answer, try again.\n")
			cat("\nType 'T' to keep the point and 'F' to draw another point.\nType 'q' to quit.\n")
			answer <- readline(prompt="\nKeep this point:")
		}

		if(answer=="T"){
			data <- rbind(data,coordinates(spoint))
			n <- nrow(data)
			cat(paste("\n",n," point(s) currently validated.\n"))
		}
	}
	if(answer=="q"){
		cat(paste("\n",n," point(s) currently validated.\n"))
		return(data)
	}
}

}

@denis-or
Copy link

Interesting. But how does this go to epidemiology? It looks very advanced.

@aspina7
Copy link
Contributor

aspina7 commented Jan 24, 2022

Interesting. But how does this go to epidemiology? It looks very advanced.

agree the code can (and should) be simplified but its important epidemiological concepts to include for implementing surveys with appropriate designs and sampling

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants