Geomatics with R

The fundamentals of R for geomatics

Author

Paul Passy

Contact: paul.passy@u-paris.fr

Course objectives

This course material is designed to familiarise you with handling geographic data using the R programming language. It includes exercises that will teach you how to use the main GIS features, both vector and raster, with R.

The first two sections cover vector manipulations (file import/export, selections, joins, etc.), classic geoprocessing operations (intersection, difference, etc.) and operations combining rasters and vectors (clipping, zonal statistics, vectorisation, etc.). The third section is devoted to producing thematic maps (location maps, choropleth maps, proportional circle maps) at different scales using rasters. The fourth section shows how to produce an interactive map for the web. Finally, the fifth section looks at some remote sensing processing with R.

Context and data

Our running example is the exploration of demographic data at the Division level in Pakistan. We will look at how the population is distributed across the Divisions of the country. We will use the following data:

  • PK_admin_L2.gpkg: the polygon vector layer of the Divisions of Pakistan
  • PK_census.xlsx: a spreadsheet giving the population of the Pakistani Divisions in 2017 and 2023
  • main_rivers_PK.gpkg: a line vector layer of the main rivers of Pakistan
  • ESA_WorldCover.tif: a land cover raster produced by the ESA (for a small part of Pakistan)
  • GTopo_PK.tif: a DEM covering the whole of Pakistan

All these data are provided in the Data folder available at this link. You can also download the Data_bonus directory containing some of the data produced at this link.

To start with

Before starting, it helps to have some basic knowledge of GIS, in particular:

  • projection systems
  • the differences between vector and raster data
  • the main raster and vector formats

We will use R through the RStudio development environment. RStudio Desktop is free and available for Windows, macOS and Linux. If you have not already done so, please install R and RStudio on your computer.

Basic knowledge of R or another programming language will also be necessary. We will specifically use the following R libraries:

  • terra: for manipulating vector and raster data
  • readxl: for importing Excel spreadsheets into R
  • mapsf: for thematic mapping
  • sf: for vector data management (required for mapsf)
  • dplyr: for more direct dataframe manipulation
  • plotly: for plotting dynamic graphs
  • leaflet: for generating interactive maps
  • RColorBrewer: for generating vivid colour palettes

If you haven’t already done so, start by installing these libraries from RStudio: go to Tools → Install Packages…, then search for each package to install in the search bar.
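The same installation can also be scripted from the R console. The following is only a minimal sketch, assuming the eight libraries listed above are available on CRAN under those names. The variable names pkgs and missing_pkgs are illustrative and not part of the course material.

```r
# Libraries used in this course
pkgs <- c("terra", "readxl", "mapsf", "sf", "dplyr",
          "plotly", "leaflet", "RColorBrewer")

# Keep only the ones that are not installed yet
missing_pkgs <- setdiff(pkgs, rownames(installed.packages()))

# Install them only in an interactive session (e.g. the RStudio console)
if (interactive() && length(missing_pkgs) > 0) {
  install.packages(missing_pkgs)
}
```

Running this a second time installs nothing, since missing_pkgs is then empty.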

Working directory structure

To follow this tutorial, create a dedicated directory on your computer, named R_and_geomatics (or any other name). In this directory, create a subdirectory called Data, where you will store the data provided (geopackages and others), and a subdirectory called Data_processed, where you will export your results. Next, create an RStudio project that points to your R_and_geomatics directory: in RStudio, go to File → New Project… → Existing Directory, select your R_and_geomatics directory and click Create Project. This lets you access the data in Data using relative paths, which are easier to handle. The examples below recall a Windows-specific point: inside an R string, a backslash in a path must be doubled (\\) or replaced with a forward slash (/).

# absolute path (under Windows)
absolute_path <- "C:\\MyFolder\\Documents\\Tutorial\\R_and_geomatics\\Data\\PK_admin.gpkg"
# or
absolute_path <- "C:/MyFolder/Documents/Tutorial/R_and_geomatics/Data/PK_admin.gpkg"

# relative path (thanks to the RStudio project)
relative_path <- ".\\Data\\PK_admin.gpkg"
# or
relative_path <- "./Data/PK_admin.gpkg"
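As a complement to the paths written by hand above, base R can build them with file.path(), which joins the parts with forward slashes that R accepts on every operating system. The sketch below is one possible approach, not a required step of the course. The helper create_project_dirs() is hypothetical, and the demonstration runs in a temporary folder so that nothing is written to your project.

```r
# Build a relative path without typing any separator by hand
relative_path <- file.path(".", "Data", "PK_admin.gpkg")
print(relative_path)  # "./Data/PK_admin.gpkg"

# Hypothetical helper: create the two subdirectories used in this tutorial
create_project_dirs <- function(root = ".") {
  sub_dirs <- file.path(root, c("Data", "Data_processed"))
  for (d in sub_dirs) {
    dir.create(d, recursive = TRUE, showWarnings = FALSE)
  }
  # Return TRUE/FALSE for each subdirectory, without printing
  invisible(dir.exists(sub_dirs))
}

# Demonstration in a temporary folder
demo_root <- file.path(tempdir(), "R_and_geomatics")
created <- create_project_dirs(demo_root)
```

From inside your RStudio project, calling create_project_dirs(".") would create Data and Data_processed next to your script.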

Once your RStudio project has been created, create a new R script by going to File → New File → R Script. Then go to File → Save As… and save your script in your project directory under a name of your choice (script_geomatics.R, for example). You can then work in this script.

Note

An R script is a plain text file. It takes up almost no disk space and can be opened in any basic text editor.

Your R_and_geomatics directory should contain the following files (Figure 1).

Figure 1: The structure of your folder.
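To check the folder from R, you can test whether the expected data files are in place. This is a minimal sketch: it assumes the five file names given in the data list earlier in this course, and that they sit directly in the Data subdirectory of the project. The names expected_files and data_status are illustrative.

```r
# File names taken from the data list of this course
expected_files <- c("PK_admin_L2.gpkg", "PK_census.xlsx",
                    "main_rivers_PK.gpkg", "ESA_WorldCover.tif",
                    "GTopo_PK.tif")

# One row per file, with TRUE when it is found in ./Data
data_status <- data.frame(
  file  = expected_files,
  found = file.exists(file.path("Data", expected_files))
)
print(data_status)
```

Any row showing FALSE points to a file that is missing, misnamed, or stored in another directory.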

Additional resources

There are countless resources available on R applied to geomatics, including courses, documentation, video tutorials, and more. Here is a selection of resources (not exhaustive):

Code is increasingly written with the support of AI. Conversational AI is a powerful aid to development. Nevertheless, solid prior knowledge of coding is needed to ask the right questions, interpret the answers correctly, and adapt them to your own data set. ChatGPT is the best-known assistant, but there are dozens of others; Claude.ai, for example, is very effective for code generation and data processing. Bear in mind, however, that all these platforms feed on our data (which they manage in a more or less opaque manner) and that AI in general is a significant source of greenhouse gas emissions.