Merry Crowson
Institute of Zoology, Zoological Society of London, London, UK
“The world is one big data problem.” (Andrew McAfee)
There is a lot of excitement in the scientific community surrounding the potential of big data – that is, the recent exponential increase in the amount and variety of data, as well as the speed at which we can process it. The Landsat and Sentinel satellite missions alone beam a vast amount of data down to us every day, inviting us to transform this information into knowledge about the changes occurring on the surface of our planet. This is no small task, and those of us working on mapping land cover are continually striving to improve the accuracy of our maps and pushing at the frontier of what is “mappable” from space – that is to say, squeezing out every last drop of insight from the information available to us.

The village of Sinar Wajo, Sumatra. Betel nuts are being dried in the sun on either side of the road. Although betel nuts are harvested and sold internationally, the main cash crops in the region are palm oil and acacia. Photo by Merry Crowson.
The real-world problems we are trying to solve with our data are often pressing. In Sumatra, Indonesia, tropical peat swamp forest is being lost at an alarming rate, due to conversion to plantations and fire. During a recent workshop there, we walked under a baking sun through an area of tropical peatland that had burnt a few years earlier. In its natural state, tropical peatland does not usually burn because of the high water table in the peat. However, when tropical peatland forest is cleared and replaced with palm oil or acacia plantations, it is common to dig canals to drain the peat in order to increase productivity. Unfortunately, this also makes the peat more prone to fires, which destroy peat swamp forest and plantations and release a huge amount of CO2 into the atmosphere. Burning tropical peatland in Indonesia made international headlines in 2015, because the fires were so severe that the haze reached as far as Singapore and Malaysia, creating concern about the effects on people’s health.

A NASA MODIS satellite image showing the extent of the Southeast Asian haze on 24 September 2015, caused by forest fires. Red outlines indicate hot spots where the sensor detected unusually warm surface temperatures associated with fires. Thick, grey smoke triggered air quality alerts and health warnings in Indonesia and neighbouring countries. NASA image by Adam Voiland (NASA Earth Observatory) and Jeff Schmaltz (LANCE MODIS Rapid Response) [Public domain], via Wikimedia Commons.

An area of tropical peatland that burnt in 2015. The area is now covered in grass, fern and some small bushes. Tropical peat swamp forest can be seen in the distance – this was the original land cover for the entire area. Photo by Merry Crowson.
Up-to-date land cover maps underpin efforts by scientists and policy makers to preserve areas of tropical peat swamp forest, restore a high water table within peatland areas and ensure economic development. However, mapping land cover in Sumatra is particularly challenging because this tropical region has high cloud cover year-round. Optical sensors, such as Sentinel-2, passively measure electromagnetic radiation reflected from the Earth’s surface. This means that cloud cover “blocks the view” of the Earth’s surface. Radar sensors, on the other hand, actively emit electromagnetic radiation in wavelengths that penetrate cloud and measure the returning signal, which means that they can “see through” cloud. Although there are difficulties with mapping land cover with radar alone, combining these two data types using data fusion is an ongoing research area.
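To make the idea of combining the two data types concrete, here is a minimal sketch of one simple form of fusion: stacking the optical and radar values for each pixel into a single feature vector that a classifier could then use. This is an illustration only, not the method used in our paper. The function names, the two-band layout and the sample values are all assumptions made for the example.

```python
import math
from typing import List, Optional, Sequence


def fuse_pixel(optical: Optional[Sequence[float]],
               radar: Sequence[float],
               n_optical_bands: int) -> List[float]:
    """Stack optical and radar values for one pixel into one feature vector.

    `optical` is None where cloud hid the pixel from the optical sensor.
    In that case the optical slots are filled with NaN and a cloud flag
    is set, so the radar values are still kept for that pixel.
    """
    if optical is None:
        optical_part = [math.nan] * n_optical_bands
        cloud_flag = 1.0
    else:
        if len(optical) != n_optical_bands:
            raise ValueError("unexpected number of optical bands")
        optical_part = list(optical)
        cloud_flag = 0.0
    return optical_part + list(radar) + [cloud_flag]


def fuse_scene(optical_pixels: Sequence[Optional[Sequence[float]]],
               radar_pixels: Sequence[Sequence[float]],
               n_optical_bands: int) -> List[List[float]]:
    """Fuse two co-registered scenes, pixel by pixel."""
    if len(optical_pixels) != len(radar_pixels):
        raise ValueError("scenes must cover the same pixels")
    return [fuse_pixel(o, r, n_optical_bands)
            for o, r in zip(optical_pixels, radar_pixels)]


# Hypothetical values: two optical bands (reflectance) and two radar
# channels (backscatter in dB). The second pixel is cloud-covered.
optical_scene = [[0.05, 0.40], None]
radar_scene = [[-8.1, -14.2], [-7.5, -13.0]]

fused = fuse_scene(optical_scene, radar_scene, n_optical_bands=2)
for features in fused:
    print(features)
```

In this sketch the cloudy pixel still receives a feature vector, because the radar values are unaffected by cloud. How the missing optical values are then handled (for example by a classifier that tolerates gaps, or by filling them from another date) is a separate design choice that the sketch does not address.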

Examples of optical data (left) and radar data (right). Both Sentinel-2 and Sentinel-1 images have a resolution of 10 m. Notice that no cloud cover is visible in the Sentinel-1 scene, because radar can “see through” cloud. The Sungai Buluh Protection Forest can be seen towards the centre of the area shown. For a detailed view of the land cover in the area, see the next image below. Sentinel images from https://scihub.copernicus.eu/
In our paper we took three different approaches to data fusion and compared the results against the use of a single data type. Our results clearly show that optical data remains the main data source for mapping land cover, even in areas with lots of cloud. The results using data fusion did show some improvement compared to the use of optical data only, but this improvement was small, and only statistically significant for one of the methods we applied. This suggests that those considering using data fusion within a project need to weigh up the benefits of a small increase in accuracy against the extra processing time and expertise required.

Land cover results for the area surrounding Sungai Buluh protection forest, based on a pixel-based classification using both optical and radar data.
The availability of services such as the cloud-based platform Google Earth Engine and the free virtual machines provided by the European Space Agency means that we have an increased capacity to answer important questions using large and varied datasets. However, this potential is also likely to make workflows more complex. In the context of conservation, where budgets are often tight and the questions being asked are of immediate practical importance, it is particularly important to weigh up the potential benefits and costs of data fusion on a case-by-case basis.
See below for the paper from Merry Crowson, published in 2018 in Remote Sensing in Ecology and Conservation: