Overview

Create a new script for this workflow

In this session we will train a classifer to predict flooded land covers using Sentinel1 SAR scenes collected in both dry and wet seasons.

General Supervised Classification Workflow

Step 1. Load the image or images to be classified

Step 2. Gather the training data:

  • Collect training data to teach the classifier
  • Collect representative samples of backscatter for each landcover class of interest

Step 3. Create the training dataset:

  • Overlay the training areas over the images of interest
  • Extract backscatter for those areas

Step 4. Train the classifier and run the classification

Step 5. Validate your results

Identify Area of Interest

We will import our Barbados boundary file from our GEE Assets.

Copy this line of code to construct your aoi object.

var aoi = ee.FeatureCollection("path/to/Barbados").geometry().buffer(1e3);

Then, in the Assets tab at top-left in the Code Editor, find and open your Barbados FeatureCollection, copy its Asset path to your clipboard and replace “path/to/Barbados” in your script.

Prepare SAR Data

Load the Sentinel-1 C-band SAR Ground Range dataset, filtering it based on several metadata items and your AOI.

var collection = ee.ImageCollection('COPERNICUS/S1_GRD') 
.filter(ee.Filter.eq('instrumentMode', 'IW'))
.filter(ee.Filter.eq('orbitProperties_pass', 'ASCENDING'))
.filterMetadata('resolution_meters','equals',10)
.filterBounds(aoi)
.select('VV','VH');

We want to retrieve one S1 scene each from the wet season and the dry season. For each period, we will define a start and end date with which to filter the whole Sentinel S1 collection, select the first scene from the result and clip it to our AOI.

//collect one observation each of dry and wet period SAR data
var dry = collection.filterDate('2022-02-01','2022-02-28').first().clip(aoi);
var wet = collection.filterDate('2022-10-01','2022-10-30').first().clip(aoi);

// Display on map
var vis = {bands:['VV'],min:-20,max:0};
Map.centerObject(aoi,11);
Map.addLayer(dry,vis,'dry season',false);
Map.addLayer(wet,vis,'wet season ',false);

To reduce the salt-and-pepper effect on our eventual product, we apply a smoothing function across each scene with a focal window. Toggle between the original SAR scene and the speckle filtered result to see the effect.

// apply speckle filter
var smoothingRadius = 50;
var dryFiltered = dry.focal_mean(smoothingRadius,'circle','meters');
var wetFiltered = wet.focal_mean(smoothingRadius,'circle','meters');

// display filtered images
Map.addLayer(dryFiltered, vis,'dry season - speckle filter',false);
Map.addLayer(wetFiltered, vis,'wet season - speckle filter',false);

Creating Reference Data

To run a supervised classification, we must collect reference data to “train” the classifier. This involves collecting representative samples of SAR backscatter data for each map class of interest.

Using the Satellite basemap and your wet and dry season SAR layers in your map, we will draw representative polygons for several classes: Open Permanent Water, Flooded Vegetation, Non-Flooded Vegetation, and Urban/Built Area.

Below is the workflow for creating reference data directly in Earth Engine. We will use Open Permanent Water as the example.

  • In the geometry drawing toolbar (top-left of the Map panel), go to Geometry Imports and click ‘new layer’.

  • Click on the ‘Edit layer properties’ button (cog icon next to the geometry’s name) to configure the new geometry layer. Name the new layer ‘openWater’ and change its color. Under ‘Import as’, change it to ‘FeatureCollection’. Finally, click on the ‘+ Property’ button and enter a new property ‘landcover’ with a value of 0.

Each geometry layer, imported to your script now as a FeatureCollection will represent one class within your map product.

  • Draw representative polygons on the map. Click again on the geometry layer ‘openWater’ in the Geometry Imports panel and ensure it shows up in bold text. Then click on the polygon icon. Draw your open water polygons.

Repeat these steps for each of the remaining map classes: Flooded Vegetation, Non-Flooded Vegetation, and Urban/Built Area.

Remember: since each geometry layer represents its own map class, the value for the ‘landcover’ property must change when you are at the configuration step. For example, use landcover value of 1 for the next geometry layer that you create.

To get through this demonstration, shoot for 3 or 4 large polygons or 7 or 8 smaller ones per class. We can refine the quality of our reference data later if we have time.

Potential Flooded Natural Vegetation (Wetland Areas): Graeme Hall Swamp, Long Pond, Green Pond and Chancery Lane

Generate Training and Testing Data

Now that we have reference polygons for our four map classes, we will merge their FeatureCollections into one. Be mindful of the names that you gave to each polygon feature collection for this next line of code.

// Merge training FeatureCollections
var newFc = openWater.merge(floodedVegetation).merge(nonFloodedVegetation).merge(urban);
print(newFc.aggregate_histogram('landcover'));

Code Checkpoint: https://code.earthengine.google.com/fdce670033bb2304b77d5fdfc70991b1

Next, we will use the merged FeatureCollection of reference polygons to extract the SAR backscatter pixel values for each landcover. The polygons within the newFc FeatureCollection are overlaid on the image, and each pixel is converted to a point containing the image’s pixel values and the other properties inherited from the polygon (in our case ‘landcover’ property). After you run this, note in the Console the total size of reference points we now have to train and validate our map.

// define bands to be used to train the data
var bands = ['dryVV','dryVH','wetVV','wetVH'];

// combine dry and wet season composite SAR data
var final = ee.Image.cat(dryFiltered,wetFiltered)
.rename(['dryVV','dryVH','wetVV','wetVH']);
print('combined dry/wet season SAR data', final);

var samples = final.select(bands).sampleRegions({
  collection:newFc,
  properties:['landcover'],
  scale:30});

print('Sample size: ',samples.size());

Next, we want to set aside 80% of the reference points to train our classifier, and use the remaining 20% for validation. We do this simply by adding a new column to the reference point set called ‘random’, which contains random decimal numbers between 0 and 1, then filtering them into the two groups.

//split sample data into training and testing
var trainTest = samples.randomColumn();
print('First Train/Test Sample', trainTest.first());

var training = trainTest.filter(ee.Filter.lte('random',0.8));
var testing = trainTest.filter(ee.Filter.gt('random',0.8));
print('training size: ', training.size());
print('testing size: ', testing.size());

Create and Apply Classifier

We choose a classifier algorithm from ee.Classifier family of functions, and train it on the training FeatureCollection. We must specify the classProperty to be the property we want to map, or predict, with the model, and the inputProperties as the SAR backscatter values defined by the bands variable.

// train the classifier
var classifier = ee.Classifier.smileCart().train({
  features:training,
  classProperty: 'landcover',
  inputProperties:bands
});

Running the classification is as easy as selecting the desired bands from the final image (recall it is the dry season scene and wet season scene together), and calling classify().

// run the classification
var classified = final.select(bands).classify(classifier)

Display the results, modify the color palette if needed.

// Display the results, adjust colors according to your land cover stratification
var classVis = {min:0,max:3,palette:['blue','cyan','green','red']}
Map.centerObject(classified,12)
Map.addLayer(classified,classVis,'classified')

Right away we can pick out where some features are mixed up between classes, and where the classifier excels. Let’s assess the classifier’s performance quantitatively now.

Accuracy Assessment

First, lets see how the classifier performed on the training data.

// Compute training accuracy 
print('RF error matrix [Training Set]: ', classifier.confusionMatrix());
print('RF accuracy [Training Set]: ', classifier.confusionMatrix().accuracy());

100% accuracy on the training set sounds impressive but is not a true representation of the classifier’s performance since it is the data the model saw during training.

To represent a truer picture of the classifier’s performance, we can use the 20% of the samples witheld for testing.

// classify test set with classifier trained on training set
var classificationVal = testing.classify(classifier);
print('Classified points', classificationVal.limit(5));

// Create confusion matrix.
var confusionMatrix = classificationVal.errorMatrix({
  actual: 'landcover', 
  predicted: 'classification'
});

// Print Confusion Matrix and accuracies.
print('Confusion matrix [Testing Set]:', confusionMatrix);
print('Overall Accuracy [Testing Set]:', confusionMatrix.accuracy());
print('Producers Accuracy [Testing Set]:', confusionMatrix.producersAccuracy());
print('Users Accuracy [Testing Set]:', confusionMatrix.consumersAccuracy());

An Overall Accuracy (OA) above 90% is quite good, but note that it is the average of all class-specific accuracies. In our case, the OA is inflated by a very high Open Water class accuracy, effectively masking the much lower accuracies of the other classes.

For this methodology to hold up to scientific scrutiny, we would also want to collect a completely independent set of validation points outside this exercise to assess the resulting map with. A platform like Collect Earth Online is great for this application. Since this is just a demonstration we will move on.

Deploy Classifier to new Observations

Now that we have a trained model, we could deploy it, or apply it to other Sentinel 1 SAR observations quite easily in Earth Engine. Change the start and end date in the .filterDate call below to any date range starting in 2015 (Sentinel1 data availability). Remember that the first() call is taking the first SAR image within that date range.

/// Classify another Sentinel SAR observation
// one SAR observation from dry season
var newObsDry = collection.filterDate('2018-02-01','2018-02-28').first().clip(aoi);
// one SAR observation from wet season
var newObsWet = collection.filterDate('2018-10-01','2018-10-31').first().clip(aoi);

// combine the two season observations
var newObsFinal = ee.Image.cat(newObsDry,newObsWet).rename(bands);
print('New Obs Final', newObsFinal);

// apply smoothing filter
var newObsFiltered = newObsFinal.focal_mean(smoothingRadius,'circle','meters');

// classify new observation band stack
var newObsClassified = newObsFiltered.select(bands).classify(classifier)

Code Checkpoint: https://code.earthengine.google.com/4a0badea503fdb958d6e74e62996f55c

A pre-trained model such as this one could be useful in deriving insights from new satellite observations quickly. Flood monitoring applications like HydraFloods works this way, with just a few more bells and whistles.


Copyright © 2023 SERVIR-Amazonia / Spatial Informatics Group. Distributed by an EPL 2.0 license.