2  Lab 2: Coordinate Reference Systems

The purpose of this lab is to help you understand why we need to pay attention to Coordinate Reference Systems (CRS) when working with spatial data. CRSs are what make data spatial - they tie the actual data to locations on the surface of the Earth (or other planets!). But there are thousands of CRSs in existence, each adapted for a specific world region and purpose. So quite often you will obtain spatial data in different coordinate systems, which can cause problems if the data are not brought into a common CRS before analysis.

2.1 Guided Exercise 1: Understanding Coordinate Reference Systems

In this exercise, you will learn how to use QGIS to identify the coordinate reference system (CRS, sometimes wrongly called just a “projection”) of spatial data you acquire, and how to manage the CRS of both your data and your project.

Stop and Think

Why are coordinate reference systems and ‘projections’ not the same thing?

A coordinate reference system combines three things: a datum, which defines a geometric representation of the Earth’s shape and how it ‘intersects’ with the real surface of the Earth; a coordinate system (for example latitude and longitude, or northings and eastings); and a map projection, which is a set of mathematical rules to project the 3D surface of the datum’s ellipsoid onto a flat plane (such as a screen or a map). A projection is therefore only one component of a CRS.
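
As a rough illustration of how these parts fit together, the sketch below models a CRS as a small Python record. The class name CRSDescription and its field names are made up for this example (they are not part of QGIS); the values are those of the two CRSs used in this lab.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CRSDescription:
    """Hypothetical record listing the parts of a CRS (not a QGIS class)."""
    code: str
    datum: str
    coordinate_system: str
    units: str
    projection: Optional[str]  # None means geographic ('unprojected')

    def is_projected(self) -> bool:
        return self.projection is not None


WGS84 = CRSDescription("EPSG:4326", "WGS 84", "latitude/longitude", "degrees", None)
LAEA_EUROPE = CRSDescription(
    "EPSG:3035", "ETRS89", "northing/easting", "metres",
    "Lambert Azimuthal Equal Area",
)

for crs in (WGS84, LAEA_EUROPE):
    kind = "projected" if crs.is_projected() else "geographic"
    print(f"{crs.code}: {kind}, datum {crs.datum}, units {crs.units}")
```

Two CRSs can differ in any of these parts, which is why the word ‘projection’ alone does not describe a dataset fully.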

2.1.1 Obtaining the required data

  1. For this exercise, we will use the 2024 country boundaries data in GeoJSON format, at the 1:20 million scale, which is available from the link below. GeoJSON is a more recent GIS file format, commonly used for web-based mapping. It is derived from the JavaScript Object Notation (JSON) data file format, widely used to exchange data among websites and web servers. More info here.

https://ec.europa.eu/eurostat/web/gisco/geodata/administrative-units/countries

We will need two files for this exercise:

  • The boundaries geometry type (BN) file in the EPSG:4326 coordinate reference system.
  • The boundaries geometry type (BN) file in the EPSG:3035 coordinate reference system.
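
Because GeoJSON is plain JSON, you can inspect it with any JSON reader. The sketch below parses a tiny, made-up feature collection with Python's standard json module; the coordinates and the "name" property are invented for illustration and are not taken from the GISCO files.

```python
import json

# A made-up two-point boundary line; real GISCO features carry more properties.
geojson_text = """
{
  "type": "FeatureCollection",
  "features": [
    {
      "type": "Feature",
      "properties": {"name": "example boundary"},
      "geometry": {
        "type": "LineString",
        "coordinates": [[-3.2, 55.9], [-4.3, 55.8]]
      }
    }
  ]
}
"""

collection = json.loads(geojson_text)
for feature in collection["features"]:
    geometry = feature["geometry"]
    print(geometry["type"], "with", len(geometry["coordinates"]), "vertices")
    for lon, lat in geometry["coordinates"]:
        # GeoJSON stores positions as [x, y], i.e. longitude first.
        print(f"  lon={lon}, lat={lat}")
```

Note that the coordinates are just numbers: nothing in them says whether they are degrees or metres, which is exactly why the CRS matters.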

Stop and Think
  1. What is the source of the data you are downloading? Does it seem reliable?

  2. What are the conditions (provisions) of use for the data?

  1. The page providing the data is managed by Eurostat, the statistical office of the European Union. Therefore, you would be inclined to trust in the quality and correctness of the data provided.

  2. The webpage presents a link to “rules” which describe the authorised uses of the data. This information can also be found in the data’s metadata file.

  1. Create a lab_2 folder in your GEOU9SP main folder. Then create a simple folder structure to organise the data.

  2. Open QGIS and start a new project. Save it as lab_2 in its proper folder. Then look at the contents of the folder holding the country data, using the QGIS Browser panel. If the panel is not available, you can enable it by going to the View > Panels menu and checking the box for Browser. You should see the two layers in the folder:

  1. Load the CNTR_BN_20M_2024_4326 file into your project. Pay attention to the file name as there are many files with similar names.
Stop and Think

Good file names are always informative of their content. Can you guess the contents of the different GeoJSON files you have downloaded based on their names?

All file names start with CNTR for ‘countries’, followed by a two-letter code. As you have seen, BN seems to stand for ‘boundary’ (vector lines), RG for ‘region’ (vector polygons), and LB for ‘labels’ (vector points). Then 20M indicates the 1:20 million mapping scale, and 2024 specifies the reference year. The final four-digit number is the EPSG code for the data CRS: 4326 (‘unprojected’ WGS 84), 3857 (WGS 84 with Pseudo-Mercator projection) or 3035 (ETRS89-extended / Lambert Azimuthal Equal Area for Europe).
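
The naming convention described above can be checked programmatically. The following is a minimal sketch that splits a file name into its parts with a regular expression; the pattern is inferred from the file names used in this lab (an assumption, not an official GISCO specification), and describe_file is a hypothetical helper.

```python
import re

# Pattern inferred from the file names used in this lab (an assumption,
# not an official GISCO specification).
NAME_PATTERN = re.compile(
    r"^(?P<theme>[A-Z]+)_(?P<geometry>BN|RG|LB)_(?P<scale>\d+M)"
    r"_(?P<year>\d{4})_(?P<epsg>\d{4})(?:\.geojson)?$"
)

GEOMETRY_NAMES = {"BN": "boundary lines", "RG": "region polygons", "LB": "label points"}


def describe_file(name: str) -> dict:
    """Split a file name such as CNTR_BN_20M_2024_4326.geojson into its parts."""
    match = NAME_PATTERN.match(name)
    if match is None:
        raise ValueError(f"Unrecognised file name: {name}")
    parts = match.groupdict()
    return {
        "theme": parts["theme"],
        "geometry": GEOMETRY_NAMES[parts["geometry"]],
        "scale": f"1:{parts['scale'][:-1]} million",
        "year": int(parts["year"]),
        "crs": f"EPSG:{parts['epsg']}",
    }


print(describe_file("CNTR_BN_20M_2024_4326.geojson"))
```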

2.1.2 Visualising data with different CRS

  1. Set the symbology for the outline and fill as you prefer. Try to manipulate more visual variables than just colour.
Stop and Think

Why can’t you set a fill colour for the countries?

Because the BN files are vector lines, not polygons. Lines only have one dimension, and thus the inner parts of the countries in this dataset are actually empty. If you load the RG dataset instead, you can set the fill, as vector polygons represent 2D areas.

  1. Right-click on this layer’s name and go to Properties > Information.
Stop and Think

What is the Coordinate Reference System (CRS) for this dataset?

The information tab will have a section called Coordinate Reference System, as below:

  • Name: EPSG:4326 - WGS 84
  • Units: Geographic (uses latitude and longitude for coordinates)
  • Type: Geographic (2D)
  • Method: Lat/long (Geodetic alias)
  • Celestial Body: Earth
  • Accuracy: Based on World Geodetic System 1984 ensemble (EPSG:6326), which has a limited accuracy of at best 2 meters.
  • Reference: Dynamic (relies on a datum which is not plate-fixed)

  1. Note, on the bottom QGIS status bar, that as you move your mouse pointer around, the coordinates for the mouse position are updated in real time. Also note what the map scale is and how it changes as you zoom in and out. You can also type the second part of a scale number to zoom to the desired map scale (for example 50000 if you want to see the map at a 1:50000 scale).

Stop and Think
  1. Why doesn’t the scale shown on the bar match the “advertised” scale for the dataset (1:20 million)?

  2. The box on the very bottom right of the QGIS status bar tells you what the current project CRS is. How is it different from a layer CRS?

  1. The 20M scale refers to the scale used when digitising the boundaries, i.e., the ‘closest’ you can view the dataset before its lack of detail becomes apparent. The scale on the status bar is simply the current zoom level of the map canvas.

  2. The project CRS defines the ‘viewing’ CRS for the map canvas. Any data that uses a different CRS than the project will be re-projected ‘on the fly’ to match the CRS of the project - but continue with the exercise to learn why that can be a problem.
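
To get a feel for what a scale number means, the short sketch below converts a length measured on the map into a ground distance. The function name ground_distance_m is made up for this example.

```python
def ground_distance_m(map_distance_cm: float, scale_denominator: float) -> float:
    """Ground distance in metres for a length measured on the map."""
    return map_distance_cm * scale_denominator / 100.0  # 100 cm per metre


for denominator in (50_000, 20_000_000):
    metres = ground_distance_m(1.0, denominator)
    print(f"1 cm at 1:{denominator:,} covers {metres:,.0f} m on the ground")
```

At 1:20 million a single centimetre spans 200 km, which explains why the boundaries look coarse once you zoom in far beyond that scale.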

  1. Zoom to the UK in the shown layer. Note how the scale at the bottom status bar changes with your zoom.
Stop and Think

Does the shape of the UK look “right” to you? If not, what is the issue and what is the cause?

The UK looks ‘squished’ vertically. That is because the data is being viewed in the EPSG 4326 (‘unprojected’ WGS 84) CRS. When drawn on screen, EPSG 4326 effectively uses the Plate Carrée or Equidistant Cylindrical projection, the simplest possible map projection - lat and long degrees are just linearly converted to x,y coordinates. This projection preserves neither area nor shape (it is not conformal) and increasingly stretches features in the east-west direction as you approach the poles.
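
The size of this distortion is easy to estimate. On the globe, a degree of longitude shrinks with the cosine of the latitude, but Plate Carrée draws it as wide as at the equator, so features are widened by roughly 1 / cos(latitude). The sketch below assumes a spherical Earth; east_west_stretch is a hypothetical helper name.

```python
import math


def east_west_stretch(latitude_deg: float) -> float:
    """How much Plate Carree widens features at a latitude, relative to the equator."""
    return 1.0 / math.cos(math.radians(latitude_deg))


for lat in (0, 30, 55, 70):
    print(f"latitude {lat:>2} deg: east-west stretch x{east_west_stretch(lat):.2f}")
```

Around 55 degrees north, a latitude that crosses Great Britain, features are drawn well over one and a half times too wide, which is what makes the UK look flattened.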

  1. Click on the project CRS box at the bottom right of the status bar (or go to Project > Properties... > CRS tab). In the Filter text box, search for EPSG:3035. Select this CRS for the project and click OK. A warning box will appear. Make sure you read it through before selecting OK again.
Stop and Think
  1. What is the name of the Coordinate Reference System specified by EPSG 3035?

  2. What did the warning window warn you about?

  1. EPSG 3035 is called ETRS89-extended / LAEA Europe and is the official projection for cartographic data from the European Union. It uses the European Terrestrial Reference System 1989 datum and the Lambert Azimuthal Equal Area projection, which preserves areas and keeps shape distortion small across Europe.

  2. It warned you that there is more than one option for the ‘on the fly’ reprojection of your layer from EPSG 4326 to EPSG 3035. It showed you the options, with the most accurate (1 m) selected by default. But this option is only valid for Europe.

It is important not to “freak out” when an unexpected warning or error appears. Take a breath and read through the window or error message. Most often the explanation is right there. You just have to dare to look.

If you did click through it without looking, here is a screen capture of it:

  1. Look at the shape of the UK again after changing the project CRS. Then right click on the layer name and select “Zoom to Layer(s).”
Stop and Think
  1. How does the rest of the world look now? Why?

  2. When you move your mouse, what unit are the coordinates in?

  1. As you move further away from the centre of the projection, shapes get progressively more distorted. This is because the LAEA projection only preserves shape at its centre. Areas, however, remain correct everywhere.

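  2. In metres. The status bar reports coordinates in the project CRS, which is now EPSG:3035, and that CRS uses metres. You can check this on the Project > Properties... > CRS tab.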
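
To see where such metre coordinates come from, here is a simplified sketch of the Lambert Azimuthal Equal Area forward projection. It uses the origin and false easting/northing of EPSG:3035 but, as a stated simplification, treats the Earth as a sphere, whereas the real CRS uses the GRS 1980 ellipsoid. Its results are therefore only approximate and should not replace the QGIS reprojection tools. The function name laea_sphere is made up for this example.

```python
import math

# Projection parameters of EPSG:3035. The sphere radius is an assumption made
# for simplicity: the real CRS uses the GRS 1980 ellipsoid.
LAT_0, LON_0 = 52.0, 10.0
FALSE_EASTING, FALSE_NORTHING = 4_321_000.0, 3_210_000.0
SPHERE_RADIUS_M = 6_371_000.0


def laea_sphere(lon_deg: float, lat_deg: float) -> tuple:
    """Approximate (easting, northing) in metres using spherical LAEA."""
    lam = math.radians(lon_deg - LON_0)
    phi, phi0 = math.radians(lat_deg), math.radians(LAT_0)
    k = math.sqrt(2.0 / (1.0 + math.sin(phi0) * math.sin(phi)
                         + math.cos(phi0) * math.cos(phi) * math.cos(lam)))
    x = SPHERE_RADIUS_M * k * math.cos(phi) * math.sin(lam)
    y = SPHERE_RADIUS_M * k * (math.cos(phi0) * math.sin(phi)
                               - math.sin(phi0) * math.cos(phi) * math.cos(lam))
    return FALSE_EASTING + x, FALSE_NORTHING + y


# The second point is an arbitrary location in central Scotland.
for name, lon, lat in [("projection centre", 10.0, 52.0), ("central Scotland", -3.9, 56.1)]:
    easting, northing = laea_sphere(lon, lat)
    print(f"{name}: E={easting:,.0f} m, N={northing:,.0f} m")
```

The projection centre (10°E, 52°N) maps exactly onto the false easting and northing, and every other location becomes a distance in metres from that origin.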

  1. Now add to the project the file called CNTR_BN_20M_2024_3035.geojson. Notice the different last four digits in the file name.
Stop and Think

What is the CRS for this new layer, and how well does it visually align with the previous layer?

This second layer uses the EPSG 3035 CRS, while the previous layer used EPSG 4326. Notice how each layer keeps its original CRS when added to the project, but if necessary they are reprojected ‘on the fly’ to match up visually.

  1. Change the project CRS back to EPSG:4326.

  2. Now go back to the project CRS properties and check the box that says No CRS at the top of the window. This disables the on-the-fly projection. Then click OK and go back to your map.

  3. Right click on the 4326 layer and select Zoom to Layer. Then select the zoom out tool (the loupe with a minus sign) at the top button row, and start clicking at the centre of the map. Keep clicking as it gets really small - you should need about 17 clicks until the second dataset is fully visible. Check the properties of each layer to make sure they still have the same CRS as when you loaded them.

Stop and Think

What has happened? Why are the two datasets suddenly very different in size?

Since you turned off on-the-fly reprojection, each dataset is now drawn at its original coordinates - but one is in metres and the other in degrees, so their x,y positions and sizes become very different.
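
A back-of-the-envelope calculation suggests why it takes about 17 clicks. One degree along the equator corresponds to roughly 111 km, so the layer stored in metres is about 100,000 times larger, in raw numbers, than the one stored in degrees. The sketch below assumes each zoom-out click doubles the visible extent (a zoom factor of 2, which can be changed in the QGIS settings), so treat the result as a rough estimate.

```python
import math

EQUATOR_RADIUS_M = 6_378_137.0  # WGS 84 semi-major axis
ZOOM_FACTOR_PER_CLICK = 2.0     # assumed QGIS zoom factor; adjustable in the settings

# Length of one degree of longitude along the equator, in metres.
metres_per_degree = 2 * math.pi * EQUATOR_RADIUS_M / 360.0

# Number of doublings needed to cover that size ratio.
clicks = math.ceil(math.log(metres_per_degree, ZOOM_FACTOR_PER_CLICK))

print(f"One degree at the equator is about {metres_per_degree:,.0f} m")
print(f"Zoom-out clicks needed to shrink the view by that ratio: {clicks}")
```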

2.1.3 Potential issues with using mismatched data

  1. Download this vector shapefile (link), unzip it and add it to your QGIS project. Check what the CRS of this layer is.

  2. Set the Project CRS back to EPSG:3035. Then go to the top menu bar and select Vector > Geoprocessing > Clip. Select the layer that has the EPSG 3035 projection as your Input Layer, and the new “clip_bounds” layer as your Overlay Layer. You can just leave the output as a temporary file. Click Run.

  3. Turn off the visibility of all layers except the new “Clipped” layer to see the result of the Clip operation.

  4. Rename the “Clipped” layer to “Clipped_3035” by right clicking on it and selecting Rename layer. Then repeat the Clip operation, this time selecting the 4326 world layer as Input, and “clip_bounds” as Overlay again. Rename the result to “Clipped_4326.”

Using what you learned in the previous lab activities, pick a different, contrasting colour for each of the two “Clipped_…” layers, and make the lines thicker. Zoom in on the lines of each clipped layer and check if they overlap perfectly.

Stop and Think

Why are the clipping results different even though the initial 3035 and 4326 layers looked perfectly aligned?

Because although they are reprojected ‘on-the-fly’ to visually match on screen, GIS operations will not take this into consideration when doing their calculations in the background. As the files still have different CRSs, this affects the lining up between the Input and Overlay layers. That is why it is so important to always permanently translate (reproject) all datasets to the same CRS at the start of a project.
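
The toy example below illustrates the underlying idea. It is not how QGIS implements Clip: it only shows that comparing raw coordinates from two different CRSs is meaningless. The clip box is hypothetical; the only real values are the origin of EPSG:3035, which lies at 10°E, 52°N and has a false easting of 4,321,000 m and a false northing of 3,210,000 m.

```python
def inside(bbox, x, y):
    """True if (x, y) lies within bbox = (xmin, ymin, xmax, ymax)."""
    xmin, ymin, xmax, ymax = bbox
    return xmin <= x <= xmax and ymin <= y <= ymax


# Hypothetical clip box in EPSG:3035 metres, 100 km either side of the
# projection's origin (false easting 4,321,000 m, false northing 3,210,000 m).
clip_box_3035 = (4_221_000, 3_110_000, 4_421_000, 3_310_000)

# Essentially the same location, the origin of EPSG:3035, written in each CRS.
origin_4326 = (10.0, 52.0)             # degrees (lon, lat)
origin_3035 = (4_321_000, 3_210_000)   # metres (easting, northing)

print("raw degrees vs metre box:", inside(clip_box_3035, *origin_4326))
print("metres vs metre box:     ", inside(clip_box_3035, *origin_3035))
```

The point sits in the middle of the box, yet the test only succeeds when both are expressed in the same CRS.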

  1. Now go to Vector > Data Management Tools > Reproject Layer. Select the 4326 world layer as your Input Layer, and EPSG:3035 as your Target CRS. Let the result be a temporary file and click Run. The new layer will be automatically named “Reprojected”. What is the CRS of this new layer (check in the layer properties window)?

  2. Now repeat the use of the Clip tool using “Reprojected” as the Input Layer and “clip_bounds” as the Overlay Layer. Rename the resulting layer to “Clipped_Reprojected”. Which of the two originally clipped layers (“Clipped_3035” or “Clipped_4326”) better matches the “Clipped_Reprojected” layer?

Stop and Think

What does the Reproject operation do?

It applies a permanent mathematical transformation to the coordinates of the input data, translating the data from one CRS to another and saving the result as a new layer. For any GIS project that involves multiple data layers with different CRSs, you should pick the CRS that makes the most sense for your project as the ‘project CRS’, and then reproject all layers to that CRS before anything else.
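
Conceptually, reprojecting means running every vertex of every geometry through a transformation function and storing the new numbers. The sketch below shows this with plain Python lists shaped like GeoJSON coordinates. The transformation used is a spherical Plate Carrée conversion from degrees to metres, chosen only because it is short. Both function names and the sphere radius are assumptions for illustration; this is not the algorithm QGIS runs.

```python
import math

SPHERE_RADIUS_M = 6_371_000.0  # assumed spherical Earth, for illustration only


def plate_carree_metres(lon_deg, lat_deg):
    """Equidistant cylindrical projection on a sphere: degrees to metres."""
    return SPHERE_RADIUS_M * math.radians(lon_deg), SPHERE_RADIUS_M * math.radians(lat_deg)


def reproject(coords, transform):
    """Return a new coordinate structure with transform applied to every position."""
    if isinstance(coords[0], (int, float)):      # a single [x, y] position
        return list(transform(coords[0], coords[1]))
    return [reproject(part, transform) for part in coords]


line_degrees = [[0.0, 0.0], [-3.9, 56.1]]
line_metres = reproject(line_degrees, plate_carree_metres)

print("original   :", line_degrees)   # unchanged: the result is a new layer
print("reprojected:", line_metres)
```

The input list is left untouched and a new one is returned, much like the “Reprojected” layer that appears next to the original in QGIS.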

This is the end of Lab 2! You should now understand why different datasets may have different Coordinate Reference Systems, what the problems are with working with data that has mismatched CRSs, and how to reproject data to match a given CRS. As this is your first week, there are no additional independent exercises - let’s take it easy!

If you still want to practice more, check the exercises from the QGIS Training Manual!