GLAD Landsat ARD Tools

Existing datasets

NDVI is easy to calculate from a wide range of image sources. In terms of already prepared NDVI data, MODIS data are processed into several vegetation indices and made available at 16-day, monthly, and yearly intervals at different resolutions. MODIS NDVI data can be downloaded from NASA's Land Processes Distributed Active Archive Center or from the Global Land Cover Facility.

Prepared MODIS and AVHRR NDVI datasets are also used in online tools like RangeView. While these NDVI datasets cannot be downloaded directly, they can be viewed and used for simple analyses via the website.


1. GLAD Landsat ARD Tools

Landsat composite

 

The Landsat Analysis Ready Data (ARD) developed by the Global Land Analysis and Discovery (GLAD) team provides spatially and temporally consistent inputs for land cover mapping and change detection. The ARD is available for the ice-free area of continents and large islands between 75N and 56S latitude. The GLAD Landsat ARD data is available online, with no charges for access and no restrictions on subsequent redistribution or use, as long as the proper citation is provided as specified by the Creative Commons Attribution License (CC BY). See License and Disclaimer for additional information and citation. To facilitate ARD applications, the GLAD team provides a set of tools for data processing and supervised classification using machine learning. For all questions and comments, contact Peter Potapov (potapov@umd.edu).

 

Potapov, P., Hansen, M.C., Kommareddy, I., Kommareddy, A., Turubanova, S., Pickens, A., Adusei, B., Tyukavina A., and Ying, Q., 2020. Landsat analysis ready data for global land cover and land cover change mapping. Remote Sens. 2020, 12, 426; doi:10.3390/rs12030426 (Potapov_RS_2020.pdf)

 


2. GLAD Landsat ARD

Multi-temporal metrics are a standard method of time-series data transformation. They simplify the analysis of land surface phenology and facilitate land cover mapping and change detection. The metrics approach was developed in the mid-1980s to extract phenology features from time-series of the Normalized Difference Vegetation Index (NDVI) (Badhwar, 1984; Malingreau, 1986). At the same time, the idea of using NDVI time-series to extract spectral reflectance corresponding to certain phenological stages was proposed by Holben (1986). Later, both approaches were merged by researchers from the Laboratory for Global Remote Sensing Studies at the University of Maryland (DeFries, Hansen, and Townshend, 1995). In their work, metrics were calculated by extracting spectral information for specific phenological stages defined by the NDVI annual dynamics. Multi-temporal metrics were subsequently widely used for forest extent and structure monitoring at continental and global scales using MODIS (Hansen et al., 2010) and Landsat data (Hansen et al., 2013; Potapov et al., 2012; 2017; 2019). The purpose of metrics is to create consistent inputs for annual vegetation mapping and change detection models and to overcome the inconsistency of clear-sky data availability that is typical for sensors with low observation frequency, like Landsat.

 

Implementation of the multi-temporal metric approach requires processing the entire archive of Landsat observations. To simplify metric processing, the Global Land Analysis and Discovery (GLAD) team developed a Landsat Analysis Ready Data (ARD) product. The Landsat ARD is produced by the GLAD team using an automated image processing system. The essence of the GLAD ARD approach is to convert individual Landsat images into a time-series of 16-day normalized surface reflectance composites with minimal atmospheric contamination. The Landsat data processing algorithms were prototyped by Hansen et al. (2008) and Potapov et al. (2012) and were significantly improved in recent years (Potapov et al., 2019). The ARD serves as a data source to generate metrics suitable for annual land cover, land cover change, and vegetation composition and structure mapping.

 

The following sections provide an overview of the Landsat processing methodology, explain ARD structure, and provide data download instructions.

    2.1. Landsat ARD Methodology

    2.2. Landsat ARD Structure

    2.3. Landsat ARD Download

    2.4. Topography Data Download

    2.5. License and Disclaimer

    2.6. User Registration

 

References

  1. Badhwar, G. D. (1984) Use of LANDSAT-derived profile features for spring small-grains classification, Int. J. Remote Sens. 5(5):783-797.
  2. Holben, B. N. (1986) Characteristics of maximum-value composite images from temporal AVHRR data, Int. J. Remote Sens. 7:1417-1434.
  3. Malingreau, J.-P. (1986) Global vegetation dynamics: satellite observations over Asia, Int. J. Remote Sens. 7:1121-1146.
  4. DeFries, R., Hansen, M., Townshend, J. (1995) Global discrimination of land cover types from metrics derived from AVHRR pathfinder data. Remote Sens. Environ. 54:209–222.
  5. Hansen, M., Stehman, S., Potapov, P. (2010) Quantification of global gross forest cover loss. Proc. Natl. Acad. Sci. USA. 107(19):8650–8655.
  6. Potapov, P.V., Turubanova, S.A., Hansen, M.C., Adusei, B., Broich, M., Altstatt, A., Mane, L., Justice, C.O. (2012) Quantifying forest cover loss in Democratic Republic of the Congo, 2000–2010, with Landsat ETM+ data. Remote Sens. Environ. 122:106-116.
  7. Hansen, M.C., Potapov, P.V., Moore, R., Hancher, M., Turubanova, S.A.A., Tyukavina, A., Thau, D., Stehman, S.V., Goetz, S.J., Loveland, T.R., Kommareddy, A. (2013) High-resolution global maps of 21st-century forest cover change. Science, 342(6160):850-853.
  8. Potapov, P. V., Turubanova, S.A., Tyukavina, A., Krylov, A.M., McCarty, J.L., Radeloff, V.C., Hansen, M.C. (2015) Eastern Europe’s forest cover dynamics from 1985 to 2012 quantified from the full Landsat archive. Remote Sens. Environ. 159:28–43.
  9. Potapov, P., Tyukavina, A., Turubanova, S., Talero, Y., Hernandez-Serna, A., Hansen, M.C., Saah, D., Tenneson, K., Poortinga, A., Aekakkararungroj, A. and Chishtie, F., 2019. Annual continuous fields of woody vegetation structure in the Lower Mekong region from 2000‐2017 Landsat time-series. Remote Sensing of Environment, 232, p.111278.

 


2.1. Landsat ARD Methodology

The Landsat Analysis Ready Data (ARD) developed by the Global Land Analysis and Discovery (GLAD) team provides globally consistent inputs for land cover mapping and change detection. The entire archive of Landsat Collection 1 TM/ETM+/OLI data available from the United States Geological Survey National Center for Earth Resources Observation and Science (USGS EROS) was automatically processed to derive the ARD time-series. The following section documents the Landsat data processing and temporal aggregation. For all questions and comments, contact Peter Potapov (potapov@umd.edu). The following publication summarizes the data processing steps: Potapov et al., 2019 (Potapov_RSE_2019.pdf)

 

The Landsat data processing includes four steps: (1) conversion to radiometric quantity, (2) observation quality assessment, (3) reflectance normalization, and (4) temporal integration into 16-day composites.


1. Conversion to radiometric quantity

In the first step, all data are converted to top-of-atmosphere reflectance (Chander et al., 2009) for the reflective bands and to brightness temperature for the emissive band. Spectral reflectance values (ranging from zero to one) are rescaled to the range from 1 to 40,000; temperature is recorded as degrees C multiplied by 100 to preserve measurement precision. Only spectral bands with matching wavelengths between the TM, ETM+, and OLI/TIRS sensors are processed (Table 1).

 

Table 1. Landsat spectral bands used for the ARD and corresponding MODIS spectral bands

All values are wavelength ranges in nm.

Band name    Landsat 5 TM     Landsat 7 ETM+    Landsat 8 OLI/TIRS    MODIS
Blue         450-520          441-514           452-512               459-479
Green        520-600          519-601           533-590               545-565
Red          630-690          631-692           636-673               620-670
NIR          760-900          772-898           851-879               841-876
SWIR1        1,550-1,750      1,547-1,749       1,566-1,651           1,628-1,652
SWIR2        2,080-2,350      2,064-2,345       2,107-2,294           2,105-2,155
Thermal      10,410-12,500    10,310-12,360     10,600-11,190         10,780-11,280
 

The following equations summarize data conversion and scaling. The value of PI is defined as 3.1415926535897932384626433832. All other coefficients are from the Landsat Collection 1 metadata and the lookup tables from Chander et al., 2009. DN stands for the source raster value.

 

Landsat 8 OLI

TOA reflectance = ((0.00002*DN-0.1)/sin(SUNELEV*PI/180))*40000

 

From the image metadata:

SUNELEV = "SUN_ELEVATION"

 

Landsat 8 TIRS

Brightness temperature = (K2/log(K1/(0.0003342*DN+0.1)+1))*100+0.5

 

From the image metadata:

K1 = “K1_CONSTANT_BAND_10”

K2 = “K2_CONSTANT_BAND_10”

 

Landsat 4/5 TM and 7 ETM+

TOA reflectance = ((PI*D^2*(G*DN+B))/(ESUN*sin(SUNELEV*PI/180)))*40000

 

From the image metadata:

SUNELEV = “SUN_ELEVATION”

 

From the lookup tables (Chander et al., 2009):

G – gain (sensor- and band-specific)

B – bias (sensor- and band-specific)

ESUN – exoatmospheric irradiance (sensor- and band-specific)

D – Sun-Earth distance

 

Brightness temperature = (K2/log(K1/(G*DN+B)+1))*100+0.5

 

From the lookup tables (Chander et al., 2009):

K1 = Thermal band calibration constant 1 (sensor-specific)

K2 = Thermal band calibration constant 2 (sensor-specific)
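As a sketch only, the conversion and scaling equations above can be expressed in Python. The K1/K2 values below are the published Landsat 8 Collection 1 band 10 constants, used here as example inputs; in production they are read from the image metadata:

```python
import math

PI = 3.1415926535897932384626433832

def oli_toa_reflectance(dn, sun_elevation_deg):
    # Landsat 8 OLI: scaled top-of-atmosphere reflectance (target range 1-40,000)
    return ((0.00002 * dn - 0.1) / math.sin(sun_elevation_deg * PI / 180)) * 40000

def tirs_brightness_temperature(dn, k1, k2):
    # Landsat 8 TIRS: scaled brightness temperature, natural log as in the formula above
    return (k2 / math.log(k1 / (0.0003342 * dn + 0.1) + 1)) * 100 + 0.5

# Example constants: published Landsat 8 Collection 1 band 10 values
K1 = 774.8853
K2 = 1321.0789

print(oli_toa_reflectance(10000, 60.0))
print(tirs_brightness_temperature(20000, K1, K2))
```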


2. Observation quality assessment

During the second step, we determine per-pixel observation quality, i.e., the probability that an image pixel was collected during clear-sky conditions. The GLAD quality assessment model developed by our team is a set of regionally adapted decision trees that map the probability of a pixel representing cloud, cloud shadow, heavy haze, and, for clear-sky observations, land, water, or snow/ice. The Landsat Collection 1 data include observation quality layers based on the globally consistent CFMask cloud and cloud shadow detection algorithm (Foga et al., 2017). Since our primary goal is to reduce the presence of clouds and shadows in the time-series data, we merge the CFMask product (high-probability clouds and shadows) with the GLAD algorithm output. From these outputs, cloud, shadow, haze, water, and land masks are created for each Landsat image. The masks are subsequently aggregated into a single quality flag that highlights cloud/shadow contaminated observations, indicates the confidence of cloud/shadow detection, separates topographic shadows from likely cloud shadows, and specifies the proximity to high-confidence clouds and cloud shadows. Table 2 summarizes the possible observation quality flags. For the subsequent analysis, quality flags 1, 2, 5, 6, 11, 12, and 14 are treated as clear-sky observations, while codes 3, 4, 7, 8, 9, and 10 are considered contaminated by clouds and shadows. Codes 5, 6, 11, 12, and 14 are used during 16-day compositing to prioritize observations with minimal atmospheric contamination.

 

Table 2. Per-pixel observation quality flags

Flag 1 (Land): Clear-sky land observation.

Flag 2 (Water): Clear-sky water observation.

Flag 3 (Cloud): Cloud detected by either the GLAD or CFMask algorithm.

Flag 4 (Cloud shadow): Shadow detected by either the GLAD or CFMask algorithm; the pixel is located within the projection of a detected cloud. The cloud projection is defined using the solar elevation and azimuth and limited to a 9 km distance from the cloud.

Flag 5 (Topographic shadow): Shadow detected by either the GLAD or CFMask algorithm; the pixel is located outside cloud projections and within an estimated topographic shadow (defined using the SRTM DEM and the observation solar elevation and azimuth).

Flag 6 (Snow): Clear-sky observation classified as snow by the GLAD algorithm.

Flag 7 (Haze): Classified as haze (dense semi-transparent cloud) by the GLAD algorithm.

Flag 8 (Cloud proximity): Aggregation (OR) of two rules: (i) a 1-pixel buffer around detected clouds; (ii) above-zero cloud likelihood (estimated by the GLAD cloud detection model) within a 3-pixel buffer around detected clouds.

Flag 9 (Shadow proximity): Shadow likelihood (estimated by the GLAD shadow detection model) above 10% for pixels either (i) located within the projection of a detected cloud, OR (ii) within 3 pixels of a detected cloud or cloud shadow.

Flag 10 (Other shadows): Shadow detected by either the GLAD or CFMask algorithm; the pixel is located outside (a) the projection of a detected cloud and (b) the estimated topographic shadow.

Flag 11 (Additional cloud proximity over land): Clear-sky land pixels located closer than 7 pixels to detected clouds.

Flag 12 (Additional cloud proximity over water): Clear-sky water pixels located closer than 7 pixels to detected clouds.

Flag 14 (Additional shadow proximity over land): Clear-sky land pixels located closer than 7 pixels to detected cloud shadows.
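As a minimal illustration of how the quality flags might be applied, the clear-sky flag set from the text above can be used to mask a QA layer with NumPy (array values are made up):

```python
import numpy as np

CLEAR_SKY = {1, 2, 5, 6, 11, 12, 14}      # treated as clear-sky observations
CONTAMINATED = {3, 4, 7, 8, 9, 10}        # cloud/shadow contaminated

qa = np.array([[1, 3, 6],
               [4, 11, 7]])               # made-up quality flag layer

clear_mask = np.isin(qa, list(CLEAR_SKY))
print(clear_mask)
```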

 


3. Reflectance normalization

The third step consists of reflectance and brightness temperature normalization to reduce the effects of atmospheric scattering and surface anisotropy. The purpose of relative normalization is to simplify the extrapolation of classification models in space and time by ensuring spectral similarity of the same land-cover types. Relative normalization is not computationally heavy and does not require synchronously collected or historical data on atmospheric properties and land-cover specific anisotropy. The normalized surface reflectance is not equal to surface reflectance and should not be used as a source dataset for the analysis of reflectance properties and dynamics. The purpose of the ARD product is solely to facilitate land cover and land cover change mapping.

 

The Landsat image normalization consists of four steps: (A) Production of the normalization target dataset; (B) selection of pseudo-invariant target pixels to derive the normalization model; (C) model parametrization; and (D) model application.

 

A. Normalization target

The normalization target data were collected from the MODIS 44C surface reflectance product. We used MODIS bands with wavelengths similar to the selected Landsat bands (see Table 1). MODIS surface reflectance and brightness temperature were scaled using the same coefficients as for the processed Landsat data. To ensure spatial consistency of the reflectance target dataset, we used all 16-day global MODIS observation composites from the year 2000 to 2011. To reduce the effects of atmospheric contamination and surface anisotropy, only near-nadir clear-sky and low-aerosol observations were retained in the time-series. From all selected observations, a single reflectance composite was created for each spectral band. We calculated the normalization target composite value as the mean reflectance of observations with normalized difference vegetation index (NDVI) above the 75th percentile.

 

B. Selection of pseudo-invariant target pixels

We define the pseudo-invariant target pixels as clear-sky land observations that represent the same land cover type and phenology stage in the Landsat image and the MODIS normalization target composite. To check for land cover type and condition, we calculate the absolute difference between Landsat and MODIS spectral reflectance for the red and shortwave infrared bands and select pixels with a difference below 0.1 reflectance units for both spectral bands. Bright objects (with red band reflectance above 0.5) are excluded from the mask. Landsat images with fewer than 10,000 selected pseudo-invariant target pixels are discarded from the processing chain.
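The selection rules can be sketched as a NumPy mask. Reflectance here is in unscaled 0-1 units, and all array names are illustrative, not part of the GLAD Tools API:

```python
import numpy as np

def pseudo_invariant_mask(ls_red, ls_swir, mod_red, mod_swir, clear_land):
    # Keep clear-sky land pixels whose red and SWIR reflectance differ
    # from the MODIS target by less than 0.1, excluding bright objects.
    close = (np.abs(ls_red - mod_red) < 0.1) & (np.abs(ls_swir - mod_swir) < 0.1)
    not_bright = ls_red <= 0.5
    return clear_land & close & not_bright

# An image is retained only if the mask selects at least 10,000 pixels:
# if pseudo_invariant_mask(...).sum() < 10000: discard the image
```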

 

C. Model parametrization

To parametrize the reflectance normalization model, we calculate the median bias between MODIS and Landsat reflectance of pseudo-invariant pixels for each reflective band and each 10 km interval of distance from the Landsat ground track. The set of median values is used to parametrize a per-band linear regression model using the least-squares fitting method. For each image and each spectral band, we derive Gain and Bias coefficients to predict the reflectance bias as a function of the distance from the ground track:

 

[Reflectance Bias] = Gain * [Distance from ground track] + Bias

 

For images where only a small portion (less than 1/16 of the image) is covered by land, a mean reflectance bias is calculated instead (Gain = 0). Such conditions are usually found in coastal regions or over small islands. For the brightness temperature band, we calculate a single mean bias value for all pseudo-invariant target pixels within the image.

 
D. Model application

The model is then applied to all pixels within the image to estimate the reflectance (brightness temperature) bias. We subtract the estimated reflectance bias from the Landsat top-of-atmosphere reflectance for each spectral band and from the brightness temperature values. The model has been shown to effectively normalize land surface reflectance and reduce the effects of atmospheric scattering and surface anisotropy (Potapov et al., 2012).
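As an illustration of steps C and D, the sketch below fits the linear bias model with NumPy least squares on made-up median bias values and subtracts the predicted bias:

```python
import numpy as np

# Median Landsat-minus-target reflectance bias per 10 km distance bin (made up)
distance_km = np.array([5.0, 15.0, 25.0, 35.0, 45.0])
median_bias = np.array([0.010, 0.014, 0.018, 0.022, 0.026])

# Step C: fit [Reflectance Bias] = Gain * [Distance from ground track] + Bias
gain, bias = np.polyfit(distance_km, median_bias, 1)

# Step D: subtract the predicted bias from top-of-atmosphere reflectance
toa = np.array([0.30, 0.31, 0.32, 0.33, 0.34])
normalized = toa - (gain * distance_km + bias)
print(normalized)
```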

 


4. Temporal integration into 16-day composites

The final step of Landsat ARD processing is the temporal aggregation of individual images into 16-day composites. The range of dates for each composite is provided in Table 3.

 

Table 3. 16-day composite intervals

Interval ID    DOY start    DOY end    Date start    Date end
1              1            16         1-Jan         16-Jan
2              17           32         17-Jan        1-Feb
3              33           48         2-Feb         17-Feb
4              49           64         18-Feb        4-Mar
5              65           80         5-Mar         20-Mar
6              81           96         21-Mar        5-Apr
7              97           112        6-Apr         21-Apr
8              113          128        22-Apr        7-May
9              129          144        8-May         23-May
10             145          160        24-May        8-Jun
11             161          176        9-Jun         24-Jun
12             177          192        25-Jun        10-Jul
13             193          208        11-Jul        26-Jul
14             209          224        27-Jul        11-Aug
15             225          240        12-Aug        27-Aug
16             241          256        28-Aug        12-Sep
17             257          272        13-Sep        28-Sep
18             273          288        29-Sep        14-Oct
19             289          304        15-Oct        30-Oct
20             305          320        31-Oct        15-Nov
21             321          336        16-Nov        1-Dec
22             337          352        2-Dec         17-Dec
23             353          366        18-Dec        31-Dec
 

Temporal compositing is done per pixel using all overlapping observations within the 16-day interval. From all available observations, we retain the one with the highest observation quality. The proximity to clouds/shadows is used as a criterion to select the least contaminated observation. If several clear-sky observations are available, the per-band mean reflectance value is retained in the composite. Each 16-day composite consists of normalized surface reflectance for six spectral bands (blue, green, red, NIR, and two SWIR), normalized brightness temperature, and the quality flag, stored as separate raster layers. The 16-day composites are stored in geographic coordinates (+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs) as raster tiles of 1.0005 by 1.0005 degrees with a pixel size of 0.00025 degree. The tile system features a 2-pixel overlap to simplify parallelization of the focal average computation. For detailed information about the Landsat ARD, see the Landsat ARD Structure section.
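A simplified per-pixel sketch of this compositing rule for one spectral band; the numeric quality ranking is an illustrative stand-in for the full flag-priority rules described above:

```python
# Lower rank = higher priority; illustrative ordering only.
QA_RANK = {1: 0, 2: 0, 5: 1, 6: 1, 11: 2, 12: 2, 14: 2}

def composite_pixel(observations):
    # observations: list of (reflectance, qa_flag) for one pixel and band.
    # Keep the best-quality observations; average reflectance on ties.
    usable = [(refl, QA_RANK[qa]) for refl, qa in observations if qa in QA_RANK]
    if not usable:
        return None                    # no usable clear-sky data in this interval
    best = min(rank for _, rank in usable)
    values = [refl for refl, rank in usable if rank == best]
    return sum(values) / len(values)

print(composite_pixel([(3000, 1), (3200, 1), (5000, 8)]))   # -> 3100.0
```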

 
Important notes on data limitations:
  • Images that were not processed to the Tier 1 standard, as well as images with high cloud cover or Landsat 5 TM sensor issues, were automatically detected and excluded from processing.
  • Due to low sun elevation and the similarity between snow cover and clouds during the winter season in temperate and boreal climates, the GLAD Landsat ARD algorithm is not suitable for wintertime image processing above 30N and below 45S latitude. Using the MODIS snow and ice product and NDVI time-series, we defined dates with a high probability of seasonal snow cover for each WRS2 path/row. We do not process images for these dates, and the resulting 16-day intervals contain no data. Some images (and resulting 16-day composites) may still include snow-covered observations. We suggest further filtering such observations using the “snow/ice” quality flag or by removing certain 16-day intervals from data processing.
  • The GLAD team is committed to an annual ARD update. Due to the delay of Tier 1 data processing by USGS EROS (between 14 and 26 days after data acquisition) and the time required for GLAD to download and process new data, the current data processing system is not designed for real-time ARD delivery.
  • Users should be aware that the image normalization method is not designed to deal with specular reflectance and thus introduces bias over water surfaces.


2.2. Landsat ARD Structure

Tile system

The global Landsat ARD product is provided as a set of 1x1 degree tiles. Tile names are derived from the integer degrees of the tile center coordinates. For example, the tile with its center at 17.5E, 52.5N is named 017E_52N. Please use the shapefile glad_landsat_tiles.zip to select tiles for your area of interest.
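The naming convention can be sketched with a hypothetical helper (the shapefile remains the authoritative tile index):

```python
def tile_name(center_lon, center_lat):
    # Build a GLAD ARD tile name from the tile center, e.g. (17.5, 52.5) -> '017E_52N'
    ew = "E" if center_lon >= 0 else "W"
    ns = "N" if center_lat >= 0 else "S"
    return f"{int(abs(center_lon)):03d}{ew}_{int(abs(center_lat)):02d}{ns}"

print(tile_name(17.5, 52.5))   # -> 017E_52N
```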

 

16-day intervals

Each data granule (GeoTIFF file) contains observation data collected for a single 16-day interval. There are 23 intervals per year; see Table 3 (Landsat ARD Methodology) for interval dates. Each interval has a unique numeric ID, starting from the first interval of the year 1980. Use the 16-day interval ID table (16d_intervals.xlsx) to select intervals for your analysis.
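Assuming the IDs simply increment by 23 per year starting with interval 1 in 1980, consistent with the description above (16d_intervals.xlsx remains the authoritative mapping), the ID for a given date can be computed as:

```python
import datetime

def interval_id(date):
    # Global 16-day interval ID, counting 23 intervals per year from 1980.
    doy = date.timetuple().tm_yday
    within_year = min((doy - 1) // 16 + 1, 23)   # interval 23 absorbs DOY 353-366
    return (date.year - 1980) * 23 + within_year

print(interval_id(datetime.date(1980, 1, 1)))    # -> 1
print(interval_id(datetime.date(2020, 1, 1)))    # -> 921
```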

 

Raster data

Each 16-day interval is stored as a single 8-band LZW-compressed GeoTIFF file in geographic coordinates (+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs), with a size of 4004x4004 pixels, an extent of 1.0005 by 1.0005 degrees, and a pixel size of 0.00025 degree. The tile system features a 2-pixel overlap to simplify parallelization of the focal average computation.

 

All raster layers are 16-bit unsigned. The list of layers is provided in Table 1 (see Landsat ARD Methodology for band wavelengths). The last layer (band 8) is a quality flag (QA) that reflects the quality of the observation used to create the composite. Table 2 provides the QA flag metadata. QA values 1, 2, and 15 are always considered clear-sky observations. QA values 5 and 6 indicate seasonal quality issues (topographic shadows and snow cover); these observations may be removed from data processing if the number of clear-sky observations is sufficient. QA values 11, 12, 14, 16, and 17 are considered cloud-free data with an indication of potential cloud or cloud shadow contamination. QA values 15-17 are added to simplify intermittent water analysis (e.g., flood detection). QA values 3, 4, and 7-10 are considered contaminated by clouds and shadows and are usually excluded from data processing.

 

Table 1. 16-day composite image layers

Band 1: Normalized surface reflectance of blue band
Band 2: Normalized surface reflectance of green band
Band 3: Normalized surface reflectance of red band
Band 4: Normalized surface reflectance of NIR band
Band 5: Normalized surface reflectance of SWIR1 band
Band 6: Normalized surface reflectance of SWIR2 band
Band 7: Normalized brightness temperature
Band 8: Observation quality code (QA); see 16d_intervals.xlsx for the QA code description

Units and format: bands 1-6 store normalized surface reflectance scaled to the range from 1 to 40,000 (UInt16); band 7 stores degrees C * 100 (UInt16); band 8 stores the QA code (UInt16).

 

Table 2. Per-pixel observation quality flag (QA)

QA code    Description                                                        Quality
1          Land                                                               clear-sky
2          Water                                                              clear-sky
3          Cloud                                                              cloud contaminated
4          Cloud shadow                                                       shadow contaminated
5          Hillshade                                                          clear-sky
6          Snow                                                               clear-sky
7          Haze                                                               cloud contaminated
8          Cloud buffer                                                       cloud contaminated
9          Shadow buffer                                                      shadow contaminated
10         Shadow high likelihood                                             shadow contaminated
11         Additional cloud buffer over land                                  clear-sky
12         Additional cloud buffer over water                                 clear-sky
14         Additional shadow buffer over land                                 clear-sky
15         Land, water detected but not used                                  clear-sky
16         Additional cloud buffer over land, water detected but not used     clear-sky
17         Additional shadow buffer over land, water detected but not used    clear-sky

 

Note for ARD v1.0 users:

The v1.0 16-day data was produced using the same methodology as the v1.1 data. The differences include the use of Landsat pre-collection data, incomplete coverage of Landsat 5 data (excluding scenes recently added to the USGS archive), and a different QA code (see below). The ARD v1.1 data is currently available east of 30W longitude. The data west of 30W longitude is only available in the v1.0 standard and is being reprocessed to v1.1.

 

The QA flag is combined with the number of observations used to calculate the value for the 16-day interval. The output value is constructed as:

[QA ARD v1.0] = [Observation quality code * 100] + [number of observations]

In v1.1, the number of observations is deprecated and the QA value consists of the observation quality code only.
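Decoding a v1.0 QA value back into its two components can be sketched as:

```python
def decode_qa_v10(qa):
    # Split an ARD v1.0 QA value into (observation quality code, observation count)
    return qa // 100, qa % 100

print(decode_qa_v10(305))   # -> (3, 5): quality code 3 (cloud), 5 observations
```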

 

 


2.3. Landsat ARD Download

To download the Landsat ARD data, follow the steps below:

  • Create/define local folder for data download
    • If you are downloading data for the first time or for a new project, create a new folder where data will be stored. Make sure you have enough disk space. The data volume for a single 1x1 degree tile, one year of data, is between 4 and 5 GB on average.
    • If you are adding data to an existing project, use the folder with previously downloaded 16-day composites.
  • Select tiles to download from the tile database (glad_landsat_tiles.zip). Make a text file that contains only the list of selected tiles (single column, tile names only – see example tiles.txt).
  • Select dates from the 16-day interval IDs (16d_intervals.xlsx).
  • For new users: register for the service to receive your username and password (User Registration)
  • To download a single interval for a single tile, use the following command:

curl -u <username>:<password> -X GET https://glad.umd.edu/dataset/landsat_v1.1/<lat>/<tile>/<interval>.tif -o <outfolder>/<interval>.tif

 

Required parameters:

<username> and <password> - User login information

<tile> - lat/long 1x1 degree tile name (e.g. 105E_13N)

<lat> - tile latitude, second half of the tile name (e.g., for the 105E_13N, <lat> is 13N)

<interval> - unique 16-day interval ID (16d_intervals.xlsx).

<outfolder> - output folder (make sure that each tile is stored in a separate folder)
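The curl call above can also be scripted. The sketch below only builds the request URL, deriving <lat> from the tile name as described; authentication and the actual transfer are left to your preferred HTTP client:

```python
def ard_url(tile, interval):
    # Build the GLAD ARD download URL for one tile and one 16-day interval;
    # <lat> is the second half of the tile name, e.g. '105E_13N' -> '13N'.
    lat = tile.split("_")[1]
    return f"https://glad.umd.edu/dataset/landsat_v1.1/{lat}/{tile}/{interval}.tif"

print(ard_url("105E_13N", 920))
# -> https://glad.umd.edu/dataset/landsat_v1.1/13N/105E_13N/920.tif
```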

 

To download multiple tiles, use data download code (download_V1.1.pl)

  • Make sure that ActivePerl is installed on your computer (see GLAD Tools Setup for details).
  • Make a list of tiles as a text file (single column, tile names only; see the example tiles.txt).

Usage: > perl download_V1.1.pl <username> <password> <tile list file> <start interval> <end interval> <output folder>

 

Required parameters:

<username> and <password> - User login information

<tile list file> - list of lat/long 1x1 degree tiles (in text format)

<start interval> <end interval> - a range of 16-day intervals (16d_intervals.xlsx).

<output folder> - output folder

 

 


2.4. Topography Data Download

Topography metrics (elevation, slope, and aspect) are frequently used as additional source data for image classification along with multi-temporal spectral metrics. To facilitate image classification applications, the GLAD team provides selected topography metrics in the ARD tile format. The source data were extracted from the NASA Making Earth System Data Records for Use in Research Environments (MEaSUREs) version of the Shuttle Radar Topography Mission (SRTM) global elevation dataset. The source product (SRTMGL1v003) has a spatial resolution of 1 arc-second (~30 meters) and is freely distributed by NASA LPDAAC. The dataset has global coverage between 60°N and 56°S.

 

The GLAD team re-projected and subset the SRTM elevation data to the standard ARD tile format. The dem.tif files contain SRTM elevation in meters, data format Int16 (signed). The No Data value is -32768. Two other topography metrics (slope and aspect) were derived from the global DEM dataset using the following procedure: (1) the DEM was resampled into a local UTM map projection with pixel size of 30m; (2) aspect and slope were calculated using GDAL tools; and (3) slope was scaled by multiplying by 500, and aspect by multiplying by 100, and both variables were stored as UInt16 images (slope.tif and aspect.tif).
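Assuming you want slope and aspect back in their original units, the scaling described above can be undone as follows (a sketch; dem.tif needs no rescaling):

```python
import numpy as np

def decode_slope(slope_dn):
    # Undo the slope scaling (stored value = slope * 500)
    return slope_dn.astype(np.float64) / 500.0

def decode_aspect(aspect_dn):
    # Undo the aspect scaling (stored value = aspect * 100)
    return aspect_dn.astype(np.float64) / 100.0

print(decode_slope(np.array([15000, 500])))    # -> [30.  1.]
print(decode_aspect(np.array([9000])))         # -> [90.]
```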

 

To download multiple tiles, use the data download code (download_SRTM.pl)

  • Make sure that ActivePerl is installed on your computer (see GLAD Tools Setup for details).
  • Make a list of tiles as a text file (single column, tile names only; see the example tiles.txt).

Usage: > perl download_SRTM.pl <username> <password> <tile list file> <output folder>

 

Required parameters:

<username> and <password> - User login information

<tile list file> - list of lat/long 1x1 degree tiles (in text format)

<output folder> - output folder

 


2.5. License and Disclaimer

The GLAD Landsat Analysis Ready Data (ARD) data is available online, with no charges for access and no restrictions on subsequent redistribution or use, as long as the proper citation is provided as specified by the Creative Commons Attribution License (CC BY).

 

Copyright © Global Land Analysis and Discovery Team, University of Maryland

 

Suggested citation

Global Land Analysis and Discovery team (GLAD). Landsat Analysis Ready Data. Downloaded from https://glad.umd.edu/ on 04/05/2020.

 

Suggested citation for ARD methodology

Potapov, P., Hansen, M.C., Kommareddy, I., Kommareddy, A., Turubanova, S., Pickens, A., Adusei, B., Tyukavina A., and Ying, Q., 2020. Landsat analysis ready data for global land cover and land cover change mapping. Remote Sens. 2020, 12, 426; doi:10.3390/rs12030426 (Potapov_RS_2020.pdf)

 

Disclaimer

While the GLAD team makes every effort to ensure the completeness and consistency of the Landsat ARD product, we acknowledge that it may contain faults and unreadable data. Please notify us immediately of any problems in our data; we will make every effort to correct them.

 

With respect to the Landsat ARD data available from this service, the GLAD team does not make any warranty, express or implied, including the warranties of merchantability and fitness for a particular purpose; nor assumes any legal liability or responsibility for the accuracy, completeness, or usefulness of the data products.

 

Although the GLAD team is committed to operational product update and to maintain open data access, our work is contingent on adequate funding and resources. Therefore, this service may be interrupted or canceled at any time without notice.

 


3. GLAD Tools for Landsat ARD Applications

GLAD Tools

The primary purpose of the GLAD ARD is to simplify time-series analysis and the creation of multi-temporal metrics for land cover and land cover change mapping. To facilitate classification model extrapolation, we have implemented globally consistent reflectance normalization. The ARD product does not represent actual surface reflectance and is not suitable for precise measurement of photosynthesis, water quality, or other variables that require actual surface reflectance.

 

In this section, we provide tools for application of the ARD time-series for land cover and land cover change mapping and analysis of classification results. To use the ARD tools, you need to install ActivePerl (https://www.activestate.com/products/activeperl/), OSGeo4W (usually installed as QGIS-OSGeo4W package from here https://qgis.org/downloads/) and the GLAD_1.0 package (https://glad.umd.edu/gladtools/Complete_package/GLAD_1.0_master.zip). The installation instructions are provided in the 3.1. GLAD Tools Setup section. Additional GLAD tools may be added from the 3.2. GLAD Tools Depository.

 

Multi-temporal Metrics

The ARD product is used to create multi-temporal metrics that serve as a consistent input for annual land cover mapping and change detection models. Metrics represent a set of statistics extracted from the normalized surface reflectance time-series within a calendar year. Two independent types of metrics may be created from 16-day time-series data using GLAD Tools: annual phenological metrics and annual change detection metrics.

 

The annual land cover mapping is based on annual gap-filled phenological metrics, which simplify the analysis of surface reflectance and land surface phenology. Phenological metrics may be calculated from a complete or incomplete set of 16-day ARD covering at least one year; several years of data are preferable to enable gap-filling. The GLAD system is designed to provide different sets of phenological metrics for various applications. Please read the metrics methodology and select the metric types that best suit your needs here: 3.3. Phenological Metrics.

 

The annual change detection metrics were designed to highlight inter-annual changes of spectral reflectance and are used as source data for change mapping. To generate this metric type, at least two complete years of 16-day data are required. Several types of change detection metrics are designed for different applications. Please read the metrics methodology and select the metric types that best suit your needs here: 3.4. Change Detection Metrics.

 

After creating a set of metrics, the data from different tiles must be mosaicked for visualization. Section 3.5. Using Image Mosaics provides tools for stitching tiled data into multi-band image mosaics.

 

Land Cover and Land Cover Change Classification

The GLAD Tools provide software for the application of a simple but powerful decision tree model for land cover mapping and change detection. The decision tree is a type of supervised classification that requires a set of training data for model parameterization. The following principles are currently implemented by GLAD Tools:

a. The classification model is created for two classes only. A “target” class usually represents the class of interest (i.e., forest or forest loss), while the “background” class represents all other land cover types. Training for both classes is required. The model output represents the likelihood that each pixel belongs to the “target” class.

b. The classification model uses training (dependent variable) in the form of polygonal ESRI shapefiles. Shapefiles may be created in QGIS (as shown in sections 3.6 Land Cover Classification and 3.7. Change Detection) or in any other GIS software.

c. Independent variables are represented by multi-temporal metrics. Currently, the GLAD classification tool supports several types of metrics. Please, check the metric types and instructions for software application.

d. To reduce model overfitting and the adverse effect of errors in training data, the decision tree model implements bootstrap aggregation (bagging) and pruning techniques. Parameters for the bagging (number of trees in an ensemble) and pruning (deviance decrease limit) can be adjusted for each classification.

e. The accuracy of the classification model is improved through the iterative process of adding training data, known as “active learning”. The active learning method consists of iterations of model construction, application, evaluation of results, and adding new training data until the desired map quality is achieved.
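The GLAD decision tree itself is a compiled tool (tree.exe), and its details are not reproduced here. Purely as a conceptual illustration of principles (a), (d), and (e), the two-class bagged-ensemble idea can be sketched in Python with single-threshold decision stumps; all names and data below are hypothetical, and this is not the GLAD implementation:

```python
import random

def best_stump(X, y):
    # Exhaustive search for the single-feature threshold (a one-node
    # "decision tree") minimizing misclassification on this sample.
    best = None
    for f in range(len(X[0])):
        for t in sorted({row[f] for row in X}):
            for sign in (1, -1):
                pred = [1 if sign * (row[f] - t) > 0 else 0 for row in X]
                err = sum(p != label for p, label in zip(pred, y))
                if best is None or err < best[0]:
                    best = (err, f, t, sign)
    return best[1:]  # (feature, threshold, sign)

def fit_bagged_stumps(X, y, n_trees=25, seed=0):
    # Bootstrap aggregation (principle d): each stump is trained on a
    # resampled copy of the training data.
    rng = random.Random(seed)
    n = len(X)
    ensemble = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        ensemble.append(best_stump([X[i] for i in idx], [y[i] for i in idx]))
    return ensemble

def target_likelihood(ensemble, pixel):
    # The model output (principle a): the fraction of trees voting
    # "target", i.e. a per-pixel likelihood in [0, 1].
    votes = sum(1 for f, t, s in ensemble if s * (pixel[f] - t) > 0)
    return votes / len(ensemble)

# Toy single-metric training data: "target" pixels have high values.
X_train = [[0.80], [0.70], [0.75], [0.20], [0.10], [0.15]]
y_train = [1, 1, 1, 0, 0, 0]
model = fit_bagged_stumps(X_train, y_train)
```

In active learning (principle e), pixels whose likelihood falls near 0.5 are natural candidates for additional training polygons in the next iteration.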

 

 

 

GLAD Tools Setup

3.1. GLAD Tools Setup

QGIS-OSGeo4W

QGIS is the recommended software for raster data visualization and collecting training data for supervised classification. OSGeo4W is required for a number of GLAD tools.

Recommended version: 2.18 [https://qgis.org/downloads/QGIS-OSGeo4W-2.18.28-2-Setup-x86_64.exe]

Recommended plugins: OpenLayers, Freehand Editing, Send2GE

 

Perl

Perl is the language used for the data management scripts.

ActivePerl (community edition): https://www.activestate.com/products/activeperl/

 

Notepad++ or similar code editor

GLAD_1.0

Primary software package compiled by the GLAD team. Download the latest complete version here: https://glad.umd.edu/gladtools/Complete_package/GLAD_1.0_master.zip. Individual tools may be updated from the GLAD Tools Depository.

 

The package contains:

MinGW - C++ compiler, GNU C Library (Open source software; Copyright © Free Software Foundation)

gdal - GDAL Core (Open source software; Copyright © Frank Warmerdam and others)

tree.exe – CART model (Open source software; Copyright © B. D. Ripley and J. Ju)

Other utilities – GLAD ARD Tools (Freeware; Copyright © GLAD UMD)

Installation:

Unpack GLAD_1.0_master.zip to C:\GLAD_1.0

Add to PATH variable (see how to: https://www.architectryan.com/2018/03/17/add-to-the-path-on-windows-10/)

C:\GLAD_1.0

C:\GLAD_1.0\MinGW\bin

C:\GLAD_1.0\gdal\bin

C:\GLAD_1.0\gdal\include

Adding/Updating tools:

To install/update a specific tool, select the files from the GLAD Tools Depository and download them into the C:\GLAD_1.0 folder.

 

GLAD Tools Depository

3.2. GLAD Tools Depository

Latest complete GLAD_1.0 package (all GLAD Tools are included)

                  https://glad.umd.edu/gladtools/Complete_package/GLAD_1.0_master.zip

                  See installation instruction here: 3.1 GLAD Tools Setup

                  Update log (https://glad.umd.edu/gladtools/versions.txt)

 

Core GLAD_1.0 package (without any tools)

https://glad.umd.edu/gladtools/Core_package/GLAD_1.0.zip

 

GLAD Tools

Data download (see sections 2.3 and 2.4)

https://glad.umd.edu/gladtools/Tools/download_V1.1.pl

https://glad.umd.edu/gladtools/Tools/download_SRTM.pl

 

Metric generation tools

Phenological metrics (see section 3.3)

https://glad.umd.edu/gladtools/Tools/compute_metrics_pheno_A.zip

https://glad.umd.edu/gladtools/Tools/compute_metrics_pheno_B.zip

Change detection metrics (see section 3.4)

https://glad.umd.edu/gladtools/Tools/compute_metrics_change_A.zip

https://glad.umd.edu/gladtools/Tools/compute_metrics_change_B.zip

 

Classification tools

Land cover classification (see section 3.6)

https://glad.umd.edu/gladtools/Tools/classification_pheno_A.zip

https://glad.umd.edu/gladtools/Tools/classification_pheno_B.zip

Change detection (see section 3.7)

https://glad.umd.edu/gladtools/Tools/classification_change_A.zip

https://glad.umd.edu/gladtools/Tools/classification_change_B.zip

 

Creating image mosaics (see section 3.5)

https://glad.umd.edu/gladtools/Tools/mosaic_tiles.pl

 

Post-processing and data analysis tools (see section 3.10)

https://glad.umd.edu/gladtools/Tools/recode_8bit.exe

https://glad.umd.edu/gladtools/Tools/get_area.exe

https://glad.umd.edu/gladtools/Tools/zonal_stat.exe

 

Access to file depository

Register here and visit https://glad.umd.edu/gladtools/ to download all GLAD Tools

 

 

 

Phenological Metrics

3.3. Phenological Metrics

The annual phenological metrics provide information on surface reflectance and land surface phenology to a land cover mapping model. Metrics represent a set of statistics extracted from the normalized surface reflectance time-series within a calendar year (January 1 – December 31). These statistics are stored as a set of 16-bit unsigned raster files in the same tile system as 16-day composites (see Landsat ARD Structure for details).

 

The GLAD system is designed to provide different sets of phenological metrics for various applications. Based on our ongoing research projects and feedback from the user community, our team will develop new metric types. We will add new metric types to this section and provide the corresponding codes for metric generation and application. Data users are encouraged to request new types of phenological metrics for their specific applications (contact Peter Potapov potapov@umd.edu). Each metric set has a specific letter code. Before using a new metric set, do the following:

  • Read the metrics methodology and understand the naming convention.
  • Make sure that the computer resources (disk volume, RAM) are suitable for the selected metric type.
  • Download 16-day ARD data following specific requirements.
  • Download and install all the required codes.

Supported metric types:

Phenological metrics type A (pheno_A)

Purpose: Annual land cover and vegetation structure mapping.

ARD requirement: Four years of data are recommended for metric processing (e.g., to generate 2018 metrics we recommend downloading data from 2015 to 2018).

Computer requirement: At least 13GB RAM required for metric calculation.  The metrics dataset size for one tile / one year is 9GB.

 

Phenological metrics type B (pheno_B)

Purpose: Annual land cover mapping. Specifically designed for low capacity PC.

ARD requirement: One year of data is required. Two to four years of data is recommended for metric processing (e.g., to generate 2018 metrics we recommend downloading data from 2015 to 2018).

Computer requirement: At least 8GB RAM required for metric calculation.  The metrics dataset size for one tile / one year is 2.3GB.

 

 

 

pheno_A

3.3.1. Phenological metrics type A (pheno_A)

Description

  • Purpose: annual land cover and vegetation structure mapping.
  • Reference: Potapov et al., 2019 (https://doi.org/10.1016/j.rse.2019.111278)
  • Interval: annual (January 1 – December 31).
  • 16-day interval data: four years of data is recommended for metric processing (e.g., to generate 2018 metrics we recommend downloading data from 2015 to 2018). See details in the description below.
  • Naming convention: Metrics_pheno_A.xlsx
  • Metric generation code: https://glad.umd.edu/gladtools/Tools/compute_metrics_pheno_A.zip
  • Classification code parameters: Use keyword “pheno_A” to specify this metric set.
  • Requires at least 13GB RAM.
  • The metrics dataset size for one tile / one year is 9GB.

Methodology

This metric set is designed to allow annual land cover and vegetation structure mapping models to be extrapolated in space and time. While this annual metric set is generated primarily from the observations collected during the corresponding year, data from the three previous years are used to fill gaps in the observation time-series that may affect the consistency of phenological metrics. To generate an annual set of phenological metrics we recommend using 16-day Landsat ARD from four years (the current and three preceding years). The metric generation may be applied to an incomplete set of 16-day intervals (i.e., only the intervals for the corresponding year); in this case, the gap-filling algorithm will shut off automatically. Before processing the phenological metrics, use Landsat ARD Download to download the 16-day data time series.

 

The process of phenological metrics construction includes two stages: (1) selecting clear-sky observations and filling gaps in the observation time series; and (2) extracting reflectance distribution statistics from the selected observation time-series.

 

First, we compile a time-series of annual observations with the lowest atmospheric contamination (Figure A). The per-pixel criterion for 16-day data selection is defined automatically based on the distribution of quality flags within the four years of data. If clear-sky land or water observations are present in the time-series data, only those are used for subsequent analysis. If no such observations are found, the code successively changes the quality threshold for data inclusion, first allowing observations with proximity to clouds and shadows, then allowing all available observations.

 

To create an annual gap-filled observation time-series for metric extraction, the code analyzes the duration of the gaps between existing 16-day observations of the current year (Year i). If a gap exceeds two months (four 16-day intervals), the code searches for clear-sky observations within the gap date range in the previous years, starting with Year i-1 and going back to Year i-3. When clear-sky observations are found, they are added to the gap-filled time-series, and the gap analysis is repeated until all gaps longer than two months are filled or no available data are found within the four-year interval.
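The gap-filling rule can be sketched as follows. This is a simplified Python illustration, not the GLAD code: the data layout is hypothetical, and gaps at the start and end of the year are not handled.

```python
def fill_gaps(years, max_gap=4):
    """years: {0: current-year data, 1..3: preceding years}, where each
    value maps a 16-day interval index (1..23) to a clear-sky observation."""
    series = dict(years[0])           # start from current-year data
    changed = True
    while changed:                    # re-analyze gaps after each fill pass
        changed = False
        present = sorted(series)
        for a, b in zip(present, present[1:]):
            if b - a - 1 <= max_gap:  # gap of two months or less: keep
                continue
            for interval in range(a + 1, b):
                if interval in series:
                    continue
                for offset in (1, 2, 3):  # Year i-1 first, then i-2, i-3
                    obs = years.get(offset, {}).get(interval)
                    if obs is not None:
                        series[interval] = obs
                        changed = True
                        break
    return series
```

The loop terminates either when no remaining gap exceeds four intervals or when a pass finds no historical observations to add, mirroring the stopping conditions described above.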

 

[Figure A]

 

[Figure B]

After compilation of the annual gap-filled observation time-series, the code computes selected normalized band ratios, or indices, (Band A - Band B)/(Band A + Band B) for each selected observation. A spectral variability vegetation index (SVVI, Coulter et al., 2016) is calculated using the standard deviation of spectral reflectance values.
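A minimal sketch of the index calculations follows. The normalized-ratio form is given above; the SVVI formulation shown (standard deviation of all reflective bands minus that of a visible/NIR subset) is an assumption for illustration only — see Coulter et al., 2016 for the exact band combinations.

```python
from statistics import pstdev

def normalized_ratio(band_a, band_b):
    # (Band A - Band B) / (Band A + Band B); e.g. NDVI = normalized_ratio(NIR, red)
    total = band_a + band_b
    return (band_a - band_b) / total if total else 0.0

def svvi(all_bands, vnir_subset):
    # Spectral variability vegetation index: difference of two standard
    # deviations of reflectance values (assumed band sets, for illustration).
    return pstdev(all_bands) - pstdev(vnir_subset)
```

For example, normalized_ratio(0.5, 0.1) returns 0.4 / 0.6 ≈ 0.667.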

 

Multi-temporal metrics are generated from the time-series of normalized reflectance and indices using four independent ranking approaches (Figure B). First, all observations are ranked individually by each spectral band reflectance or index value. From the obtained individual ranks, we select the highest/lowest values, the second highest/lowest values, and the values corresponding to the first, second, and third quartiles. In addition to individual observations, we calculate averages for all observations between selected ranks and amplitudes between selected metrics. Second, we distribute observation dates by the corresponding ranks of (i) NDVI, (ii) SVVI, and (iii) brightness temperature. From these distributions, we extract the observations corresponding to the highest/lowest, second highest/lowest, and first, second, and third quartile values of the ranked variable. For the metric set, we record the normalized surface reflectance of these observations and calculate averages and amplitudes for observations between selected ranks. The amplitudes are not written to the files but are calculated on the fly by the classification software.
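As an illustration of the corresponding-rank approach, the sketch below extracts one band's reflectance at a rank position defined by another variable (e.g. the blue-band value at the maximum-NDVI observation). The data layout is hypothetical, and only three rank statistics are shown:

```python
def metric_band_at_rank(observations, rank_by, band, statistic):
    # observations: list of per-date dicts, e.g. {'RN': ndvi, 'blue': ...}.
    # Rank the observations by `rank_by`, then report `band` at the
    # selected rank position.
    ranked = sorted(observations, key=lambda obs: obs[rank_by])
    if statistic == 'min':
        pick = ranked[0]
    elif statistic == 'max':
        pick = ranked[-1]
    elif statistic == 'median':
        pick = ranked[len(ranked) // 2]
    else:
        raise ValueError('unsupported statistic: ' + statistic)
    return pick[band]
```

For example, metric_band_at_rank(obs, 'RN', 'blue', 'max') mirrors the metric named 2018_blue_max_RN in the naming convention described below.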


Metric types and naming convention

The metrics for each tile are stored in a separate folder as single-band unsigned 16-bit GeoTIFF files. The generic naming convention is the following:

YYYY_B_S_C.tif

Where:

YYYY – corresponding year

B – spectral band or index

S – statistic

C – corresponding band or index used for ranking (only for metrics extracted from ranks defined by a corresponding value)

 

Example:

2018_blue_max_RN.tif - The metric represents the value of the normalized surface reflectance of the Landsat blue band for the 16-day interval that has the highest red/NIR normalized ratio (also known as NDVI) value during the year 2018.

 

The table Metrics_pheno_A.xlsx has details on the bands, indices, and computed statistics.
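The naming convention can also be unpacked programmatically; a minimal sketch, assuming the statistic field itself never contains an underscore:

```python
def parse_metric_name(filename):
    # "2018_blue_max_RN.tif" -> year, band, statistic, ranking band;
    # the trailing ranking-band field is optional.
    stem = filename.rsplit('.', 1)[0]
    year, band, statistic, *rank = stem.split('_')
    return {
        'year': int(year),
        'band': band,
        'statistic': statistic,
        'ranked_by': rank[0] if rank else None,
    }
```

Consult Metrics_pheno_A.xlsx for the authoritative field values; this helper only splits the name into its documented parts.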

 

In addition to spectral metrics, the metric generation software produces a set of technical layers including the number of cloud-free 16-day composites, gap-filling algorithm outputs, data quality, and water presence.


Software installation

The following software should be installed to generate metrics:


Software application

  • Download all required 16-day composites
  • Download and install software
  • Make a list of tiles to process (single column, tile names only – see example tiles.txt).
  • Use the following command to compute metrics:

> perl C:/GLAD_1.0/metrics_pheno.pl A <tile_list> <year> <input folder> <output folder> <threads>

Example:

> perl C:/GLAD_1.0/metrics_pheno.pl A tiles.txt 2018 D:/Data D:/Metrics 1

 

The command parameters are:

Input folder: the folder with 16-day composite data. It should contain tile data in subfolders.

Output folder: will be created by the code, tile data will be recorded into subfolders.

Threads: the number of parallel processes. The parameter should be increased only if:

  • A computer has a multi-core processor (e.g., Intel Xeon)
  • The RAM can hold several processes simultaneously. Each process will use 13GB RAM. To calculate the total RAM usage, multiply 13GB by the number of processes.

pheno_B

3.3.2. Phenological metrics type B (pheno_B)

Description

  • Purpose: annual land cover mapping. Specifically designed for low capacity PC (minimum requirement: 8GB RAM, at least 50GB free disk volume)
  • Interval: annual (January 1 – December 31).
  • 16-day interval data: four years of data is recommended for metric processing (e.g., to generate 2018 metrics we recommend downloading data from 2015 to 2018).
  • Naming convention: Metrics_pheno_B.xlsx
  • Metric generation code: https://glad.umd.edu/gladtools/Tools/compute_metrics_pheno_B.zip
  • Classification code parameters: Use keyword “pheno_B” to specify this metric set.
  • Requires at least 8GB RAM.
  • The metrics dataset size for one tile / one year is 2.3GB.

Methodology

This metric set is designed to allow annual land cover and vegetation structure mapping models to be extrapolated in space and time. It is similar to the pheno_A metric type with a few differences:

  • The number of metrics is significantly reduced to accommodate low-capacity PCs.
  • The reference ranking by the SVVI index is replaced with ranking by the NS2 index (normalized ratio of the NIR and SWIR2 bands).

The metric methodology is similar to pheno_A type.


Metric types and naming convention

The metrics for each tile are stored in a separate folder as single-band unsigned 16-bit GeoTIFF files. The generic naming convention is the following:

YYYY_B_S_C.tif

 

Where:

YYYY – corresponding year

B – spectral band or index

S – statistic

C – corresponding band or index used for ranking (only for metrics extracted from ranking by a corresponding value)

 

 

Example:

2018_blue_max_RN.tif - The metric represents the value of the normalized surface reflectance of the Landsat blue band for the 16-day interval that has the highest red/NIR normalized ratio (also known as NDVI) value during the year 2018.

 

The table Metrics_pheno_B.xlsx has details on the bands, indices, and computed statistics.

 

In addition to spectral metrics, the metric generation software produces a set of technical layers including the number of cloud-free 16-day composites, gap-filling algorithm outputs, data quality, and water presence.

 


Software installation

The following software should be installed to generate metrics:


Software application

  • Download all required 16-day composites
  • Download and install software
  • Make a list of tiles to process (single column, tile names only – see example tiles.txt).
  • Use the following command to compute metrics:

> perl C:/GLAD_1.0/metrics_pheno.pl B <tile_list> <year> <input folder> <output folder> <threads>

 

Example:

> perl C:/GLAD_1.0/metrics_pheno.pl B tiles.txt 2018 D:/Data D:/Metrics 1

The command parameters are:

Input folder: the folder with 16-day composite data. It should contain tile data in subfolders.

Output folder: will be created by the code, tile data will be recorded into subfolders.

Threads: the number of parallel processes. The parameter should be increased only if:

  • A computer has a multi-core processor (e.g., Intel Xeon)
  • The RAM can hold several processes simultaneously. Each process will use 8GB RAM. To get the total RAM usage, multiply 8GB by the number of processes.

 

Change Detection Metrics

3.4. Change Detection Metrics

The annual change detection metrics were designed to highlight inter-annual changes of spectral reflectance while reducing false detections due to reflectance fluctuations and inconsistent clear-sky observation availability. Metrics represent a set of statistics extracted from (a) the normalized surface reflectance time-series of the current and preceding years, independently, (b) the array of differences in 16-day composite values between the current and preceding years, and (c) the entire time interval. These statistics are stored as a set of 16-bit unsigned raster files in the same tile system as 16-day composites (see Landsat ARD Structure for details).

 

The GLAD system is designed to provide different sets of change detection metrics for various applications. Some of the metric sets are designed specifically to highlight the changes that happened in the last year, while others are sensitive to changes that happened within a longer interval. The latter have higher sensitivity for change detection but should only be implemented when a mask of earlier detected changes is available. Based on our ongoing research projects and feedback from the user community, our team will develop new metric types. We will add new metric types to this section and provide the corresponding codes for metric generation and application. Data users are encouraged to request new types of metrics for their specific applications (contact Peter Potapov potapov@umd.edu). Each metric set has a specific letter code. Before using a new metric set, do the following:

  • Read the metrics methodology and understand the naming convention.
  • Make sure that the computer resources (disk volume, RAM) are suitable for the selected metric type.
  • Download 16-day ARD data following specific requirements.
  • Download and install all the required codes.

Supported metric types:

Change detection type A (change_A)

Purpose: Annual forest loss mapping. May only be applicable when change data for earlier years exists.

ARD requirement: Four years of data are recommended, at least 2 years required.

Computer requirement: At least 13GB RAM required for metric calculation.  The metrics dataset size for one tile / one year is 9GB.

 

Change detection type B (change_B)

Purpose: Annual forest cover loss mapping. Specifically designed for low capacity PC.

ARD requirement: Two years of data are required (current and preceding).

Computer requirement: At least 8GB RAM required for metric calculation.  The metrics dataset size for one tile / one year is 2.2GB.

 

 

 

change_A

3.4.1. Change detection metrics type A (change_A)

Description

  • Purpose: Annual forest loss mapping. May only be applicable when annual change data for at least three earlier years exist.
  • Reference: Potapov et al., 2019 (https://doi.org/10.1016/j.rse.2019.111278)
  • Interval: Annual (January 1 – December 31).
  • 16-day interval data: Four years of data are recommended for metric processing (e.g., to generate 2017/2018 change metrics we recommend downloading data from 2015 to 2018). See details in the description below.
  • Naming convention: Metrics_change_A.xlsx
  • Metric generation code: https://glad.umd.edu/gladtools/Tools/compute_metrics_change_A.zip
  • Classification code parameters: Use keyword “change_A” to specify this metric set.
  • Requires at least 13GB RAM.
  • The metrics dataset size for one tile / one year is 9GB.

Methodology

The change_A metric set was developed and implemented for global (link - http://earthenginepartners.appspot.com/science-2013-global-forest) and regional (Potapov et al., 2019  link - https://doi.org/10.1016/j.rse.2019.111278) annual forest loss mapping. This metric set is sensitive to changes that occurred within the last three years. To map change that occurred in the last year only, the results of change detection must be filtered using the mask of change detected in the preceding years.

 

Similar to the annual phenological metric algorithm (see Phenological Metrics), we utilized several years of data to calculate a metric set for a target year. For the change_A metric set, using four years of data (the current and three preceding years) is optimal. The metric set can be generated with less than four years of data, but at least two years are required. Before processing the change detection metrics, use Landsat ARD Download to download the 16-day data time series.

 

The process of metrics construction includes two stages: (1) selecting clear-sky observations and building the time series; and (2) extracting reflectance and reflectance change distribution statistics from the time-series.

 

First, we compile a time-series of annual observations with the lowest atmospheric contamination (Figure A). The per-pixel criterion for 16-day data selection is defined automatically based on the distribution of quality flags within the four years of data. If clear-sky land or water observations are present in the time-series data, only those are used for subsequent analysis. If no such observations are found, the code successively changes the quality threshold for data inclusion, first allowing observations with proximity to clouds and shadows, then allowing all available observations.

 

To construct the metric set, we used all cloud-free 16-day observations from the current year (the year i), hereafter referred to as time-series C (Fig. A). To create a historical baseline for change detection, we collected an average reflectance from the three preceding years (year i-1 – year i-3) only for those 16-day intervals that have cloud-free observations in the time-series C. If no observations were found for a certain 16-day interval in historic data, we used clear-sky data from the closest observation before/after the missing 16-day composite interval. The compilation of spectral reflectance from the three previous years is referred to as time-series P. For each time-series observation, in addition to normalized reflectance, we calculated normalized ratios, or indices (Band A - Band B)/(Band A + Band B) from selected bands. The per-band and per-index differences were calculated for all 16-day intervals with clear-sky data between time-series C and time-series P and recorded as time-series D.
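The assembly of time-series P and D from time-series C can be sketched as follows. This is a simplified Python illustration with a hypothetical data layout; the nearest-interval fallback is an approximation of the before/after rule described above.

```python
def build_p_and_d(current, preceding_years):
    """current: {interval: value} for year i (time-series C).
    preceding_years: list of {interval: value} for years i-1 .. i-3.
    Returns (P, D): the matched historical baseline and the
    per-interval differences C - P."""
    P, D = {}, {}
    for interval, value in current.items():
        # Average over preceding years for the same 16-day interval.
        hist = [year[interval] for year in preceding_years if interval in year]
        if not hist:
            # No historic data for this interval: fall back to the
            # closest available observation in any preceding year.
            candidates = [(abs(k - interval), v)
                          for year in preceding_years for k, v in year.items()]
            if not candidates:
                continue
            hist = [min(candidates)[1]]
        P[interval] = sum(hist) / len(hist)
        D[interval] = value - P[interval]
    return P, D
```

Note that P and D are only defined for intervals with clear-sky data in time-series C, as described above.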

 

[Figure A]

 

Similarly to the phenology metrics algorithm, we ranked spectral reflectance and indices values individually for each of the time-series C and P, and extracted selected ranks and averages. We also ranked observation dates by the corresponding NDVI and brightness temperature values and recorded spectral band values for selected ranks. We aggregated time-series P and C into a single time-series, from which we calculated standard deviations and slope of linear regression for each band and index. The time-series D per-16-day difference values were ranked for each spectral band and index and selected statistics were extracted from these ranks.

Metric types and naming convention

 

The metrics for each tile are stored in a separate folder as single-band unsigned 16-bit GeoTIFF files. The generic naming convention is the following:

YYYY_B_T_S_C.tif

Where:

YYYY – corresponding year

B – spectral band or index

T – time-series from which the statistics were extracted: “c” represents the current year (time-series C), “p” the preceding year (time-series P), and “dif” the per-16-day-interval differences between them (time-series D). Regression and standard deviation metrics, which are calculated from the entire time-series, do not have this name section.

S – statistic

C – corresponding band or index used for ranking (only for metrics extracted from ranks defined by a corresponding value)

 

Example:

2018_blue_c_max_RN.tif - The metric represents the value of the normalized surface reflectance of the Landsat blue band for the 16-day interval that has the highest red/NIR normalized ratio (also known as NDVI) value during the year 2018.

 

In addition to spectral metrics, the metric generation software produces a set of technical layers including the number of cloud-free 16-day composites, data quality, and water presence.

 

The table Metrics_change_A.xlsx has details on the bands, indices, and computed statistics.


Software installation

The following software should be installed to generate metrics:


Software application

  • Download all required 16-day composites
  • Download and install software
  • Make a list of tiles to process (single column, tile names only – see example tiles.txt).
  • Use the following command to compute metrics:

> perl C:/GLAD_1.0/metrics_change_A.pl <tile_list> <year> <input folder> <output folder> <threads>

Example:

> perl C:/GLAD_1.0/metrics_change_A.pl tiles.txt 2018 D:/Data D:/Metrics 1

The command parameters are:

Input folder: the folder with 16-day composite data. It should contain tile data in subfolders.

Output folder: will be created by the code, tile data will be recorded into subfolders.

Threads: the number of parallel processes. The parameter should be increased only if:

  • A computer has a multi-core processor (e.g., Intel Xeon)
  • The RAM can hold several processes simultaneously. Each process will use 13GB RAM. To get the total RAM usage, multiply 13GB by the number of processes.

 

 

change_B

3.4.2. Change detection metrics type B (change_B)

Description

  • Purpose: Annual forest cover loss mapping. Specifically designed for low-capacity PCs (minimum requirement: 8GB RAM, at least 50GB free disk volume). This metric set is designed for annual change detection.
  • Interval: Annual (January 1 – December 31).
  • 16-day interval data: Two years of data is required (current and preceding).
  • Naming convention: Metrics_change_B.xlsx
  • Metric generation code: https://glad.umd.edu/gladtools/Tools/compute_metrics_change_B.zip
  • Classification code parameters: Use keyword “change_B” to specify this metric set.
  • Requires at least 5GB RAM.
  • The metrics dataset size for one tile / one year is 2.2GB.

Methodology

This metric set is designed for annual change detection. It is similar to the change_A metric type with a few differences:

  • The number of metrics is significantly reduced to accommodate low-capacity PCs.
  • Only two years of data are used, limiting model sensitivity to changes that occurred since the preceding year. However, this metric set may still confuse changes that occurred at the end of the growing season of the preceding year with changes that happened in the corresponding (“current”) year.

The metric methodology is similar to change_A type.


Metric types and naming convention

The metrics for each tile are stored in a separate folder as single-band unsigned 16-bit GeoTIFF files. The generic naming convention is the following:

YYYY_B_T_S.tif

Where:

YYYY – corresponding year

B – spectral band or index

T – time-series from which the statistics were extracted: “c” represents the current year (time-series C), “p” the preceding year (time-series P), and “dif” the per-16-day-interval differences between them (time-series D). Regression and standard deviation metrics, which are calculated from the entire time-series, do not have this name section.

S – statistic

 

Example:

2018_blue_c_max.tif - The metric represents the value of the normalized surface reflectance of the Landsat blue band during the year 2018.

 

In addition to spectral metrics, the metric generation software produces a set of technical layers including the number of cloud-free 16-day composites, gap-filling algorithm outputs, and data quality.

The table Metrics_change_B.xlsx has details on the bands, indices, and computed statistics.


Software installation

The following software should be installed to generate metrics:


Software application

  • Download all required 16-day composites
  • Download and install software
  • Make a list of tiles to process (single column, tile names only – see example tiles.txt).
  • Use the following command to compute metrics:

> perl C:/GLAD_1.0/metrics_change_B.pl <tile_list> <year> <input folder> <output folder> <threads>

Example:

> perl C:/GLAD_1.0/metrics_change_B.pl tiles.txt 2018 D:/Data D:/Metrics 1

The command parameters are:

Input folder: the folder with 16-day composite data. It should contain tile data in subfolders.

Output folder: will be created by the code, tile data will be recorded into subfolders.

Threads: the number of parallel processes. The parameter should be increased only if:

  • A computer has a multi-core processor (e.g., Intel Xeon)
  • The RAM can hold several processes simultaneously. Each process will use 5GB RAM. To get the total RAM usage, multiply 5GB by the number of processes.

Using Image Mosaics

3.5. Using Image Mosaics

The multi-temporal metrics are stored in a tiled format. To visualize data for a large region, the tiles must be mosaicked together. GDAL tools provide a number of ways to mosaic the data, either virtually (using the VRT format) or as a single image file. The following describes the process of manually stitching the tiles into a mosaic:

  • Select tiles to stitch the data from.
  • Create a list of files to stitch and save it to a new text file. Use the full path to each file. The following example shows the list of tiles to stitch for one metric (example_file_list.txt).
  • Open OSGeo4w interface and navigate to the folder with the text file.
  • Execute the VRT generation command:

gdalbuildvrt -input_file_list list.txt band1.vrt

 

  • You may create a set of separate VRT files for each band, and then merge them together as a multi-band image using the following command:

gdalbuildvrt -separate multi-band.vrt band1.vrt band2.vrt band3.vrt band4.vrt …

 

  • The VRT file can be loaded directly into QGIS. For use with other applications (e.g., ArcGIS), you will need to convert the VRT file into a raster image:

gdal_translate -of "GTiff" -co "BIGTIFF=YES" -co "COMPRESS=LZW" band1.vrt band1.tif

 

To simplify the process of image mosaicking, users are advised to use the mosaic_tiles.pl tool. First-time users will need to download the tool from the GLAD Tools Depository and save it to the C:\GLAD_1.0 folder. To execute the tool, follow these steps:

  • Make the list of tiles to process (single column, tile names only – see example tiles.txt)
  • Make a parameter file. The parameters are listed below (comments in parentheses):

source=D:/Metrics_pheno_B   (source folder)
list=tiles.txt   (the name of the tile list)
year=2018   (year)
outname=average   (output name)
bands=blue_av2575,green_av2575,red_av2575,nir_av2575,swir1_av2575,swir2_av2575   (list of metrics to aggregate; comma separated, no spaces)
ogr=C:/Program Files/QGIS 2.18/OSGeo4w.bat   (path to the OSGeo4w.bat file; check your local installation)

 

  • Open cmd, navigate to the folder with the tile list, and run the program:

> perl C:/GLAD_1.0/mosaic_tiles.pl param_mosaic_average.txt

 

  • The output is a multi-band compressed GeoTIFF file with an OVR overview file to assist visualization in QGIS.

Land Cover Classification

3.6. Land Cover Classification

The term Land Cover Classification is defined here as the process of assigning to each data pixel the likelihood (probability) that it represents a certain land cover class, based on the application of statistical decision rules in the multispectral/multi-temporal domain. The decision rules are generated from a training population of pixels (with assigned land cover classes) by building a decision tree model.

 

The decision tree is a powerful tool for supervised image classification. The tree model is built automatically using a population of training objects (image pixels manually mapped by an expert). Each pixel in the training population has data on its land cover class (defined by the image interpretation expert, called the “dependent variable”) and on its spectral/temporal properties (multi-temporal metrics values, called “independent variables”). The tree model separates the multi-dimensional space of independent variables into compartments (hypervolumes called “nodes”) so that most training pixels within each node belong to the same information class. When the classification tree model is implemented for an entire image, it will predict the land cover class for every pixel.

 

The prediction, however, may be false due to inconsistent or insufficient training data. The model can be improved through the iterative process of adding training data, known as “active learning”. The active learning method consists of iterations of model construction, application, evaluation of results, and adding new training data until the desired map quality is achieved.

 

In the following section, we describe the application of the supervised decision tree classification tool that employs a dependent variable (training data) in the form of vector polygon layers and independent variables in the form of phenological metrics. The provided tools operate with only two land cover classes: a target class and a background class. The output layer shows the likelihood (in percent) that each pixel belongs to the target class.


1. Collecting training data

Training data represent two polygon shapefiles, one with areas marking training class pixels (“target”), and the other marking other pixels (“background”). Both shapefiles should be in the same coordinate system as the phenological metrics (+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs). The classification tool uses only the object shape data; all attributes are ignored. The shapefiles may contain overlapping polygons. Correct topology is not required as long as the data can be correctly rasterized. The polygons in the “target” and “background” shapefiles may overlap. In case of overlap, the area under the “target” class polygons will be erased from the “background” layer.

 

The polygon layers may be created in any GIS software. The following manual demonstrates the use of QGIS 2.18 for shapefile editing. The following checklist summarizes the requirements for training data collection:

  • QGIS 2.18 with OpenLayers, Freehand Editing, Send2GE plugins.
  • Image mosaic (VRT or raster format) of selected metrics used for data visualization.
  • Two empty shapefiles in geographic coordinates WGS84.  

To collect training data, follow the routine described below:

  • Create classification workspace. The workspace (folder) should include:
    • list of tiles (single column, tile names only – see example tiles.txt)
    • two shapefiles for training data
    • classification parameter file (see below)
  • Open QGIS (new project) and load required plugins.
  • Add raster layers (mosaics of selected metrics). Optionally: load Bing Maps layer using OpenLayers plugin.
  • Load target.shp and background.shp files. Put the target layer onto the top of the background layer in the Layer Panel.
  • Start editing (Toggle Editing button) for both shapefiles
  • Use “Add Polygon” or “Freehand Drawing” tools to add training samples. Avoid creating large training polygons. Distribute samples over the entire area of the image.

Sample drawing example:

Image composite

Target training

Background training (overlaid with target training)

 
  • Save layers and project (periodically)

2. Applying classification

Before applying classification, check that all required software is installed on your computer.

To apply classification, follow the routine described below:

  • Save all edits and close the QGIS project.
  • Edit the classification parameter file.

mettype=pheno_B   (metric type)
metrics=D:/Metrics_pheno_B   (multi-temporal metrics source folder)
dem=D:/Data   (topography metrics source folder)
year=2018   (year, for multi-temporal metrics)
target_shp=target.shp   (target class shapefile name)
bkgr_shp=background.shp   (background class shapefile name)
tilelist=tiles.txt   (name of the tile list file)
outname=spruce   (output file name)
mask=none   (mask file name; none – no mask)
maxtrees=21   (number of trees; odd number in the range 1-25)
sampling=10   (sampling rate; percent of training data extracted for each tree)
mindev=0.0001   (tree pruning rule)
threads=1   (number of parallel processes)
treethreads=21   (number of parallel processes for a tree model)
ogr=C:/Program Files/QGIS 2.18/OSGeo4w.bat   (path to the OSGeo4w.bat file; check your local installation)

You may modify the parameter file depending on the computer capacity, training size, etc. Specifically:

- Increasing the maxtrees parameter will slow classification but improve model generalization.

- Increasing mindev will reduce tree complexity; decreasing it will increase tree complexity.

- Reduce the sampling parameter if sample areas are too large; increase it if the maxtrees parameter is reduced.

- Reduce the threads and treethreads parameters for a low-capacity computer (minimal value 1).

 

  • Open cmd, navigate to the folder with the tile list, and run the program:

> perl C:/GLAD_1.0/classification.pl param_pheno_B.txt

  • Wait for the process to complete.
  • Open QGIS and load the classification result (TIF file). To visualize the target class, apply transparency to the interval 0-49. To show only the background class, apply transparency to the interval 50-100.

3. Understanding classification outputs

A decision tree is a hierarchical classifier that predicts class membership by recursively partitioning a data set into more homogeneous subsets. This splitting procedure is followed until a perfect tree (one in which every pixel is discriminated from pixels of other classes, if possible) is created with all pure terminal nodes or until preset conditions are met for terminating the tree’s growth. Our approach employs a deviance measure to split data into nodes that are more homogeneous with respect to class membership than the parent node. The reduction in deviance (D) is defined as:

D = Ds − Dt − Du

where s is the parent node, and t and u are the splits from s. Right and left splits along the digital counts for all data are examined. When D is maximized, the best split has been found, and the data are divided at that digital count and the process repeated on the two new nodes of the tree. The deviance for nodes is calculated from the following:

D_i = -2 * Σ_k n_ik * ln(p_ik)

where n_ik is the number of pixels of class k in node i and p_ik is the probability (proportion) of class k in node i.
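The deviance computation can be sketched in Python for illustration; the class counts below are hypothetical and do not come from a real tree:

```python
import math

def node_deviance(counts):
    """Node deviance: D_i = -2 * sum_k n_ik * ln(p_ik).

    counts holds the number of training pixels of each class in the node.
    """
    n = sum(counts)
    total = 0.0
    for n_k in counts:
        if n_k > 0:                 # the 0 * ln(0) term is taken as 0
            total += n_k * math.log(n_k / n)
    return -2.0 * total

# Hypothetical parent node s (60 target, 40 background pixels) and a
# candidate split into child nodes t and u.
D_s = node_deviance([60, 40])
D_t = node_deviance([55, 5])
D_u = node_deviance([5, 35])

# Reduction in deviance for this split: D = Ds - Dt - Du.
D = D_s - D_t - D_u
```

A pure node (all pixels of one class) has zero deviance, so a split that produces nearly pure children maximizes the reduction D.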

 

If a decision tree model is implemented without constraints, it will attempt to create pure nodes (nodes that consist of training pixels of a single class). However, due to noise and errors in training data, such a complex tree may produce incorrect results. To avoid over-fitting of a tree model, we implement pruning based on the deviance decrease parameter (mindev). By increasing the deviance decrease parameter, we reduce the complexity of the tree, which produces a more generalized result.

 

To improve model performance in the presence of errors in training data and noise in the independent variables (metrics), we implement the bagging (bootstrap aggregation) technique. The essence of bagging is to generate an ensemble of decision trees created using independent subsets of data drawn randomly with replacement from the training sample. The output class likelihood is calculated as the median output of all trees in the ensemble. A user may adjust the number of bagged trees (maxtrees). The number should not be too high (using more than 25 trees has a negligible effect on model performance). The number of trees should be odd (to simplify the median calculation).
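The median aggregation over an odd ensemble can be illustrated with a few lines of Python (the per-tree likelihoods below are hypothetical):

```python
from statistics import median

def ensemble_likelihood(tree_outputs):
    """Median target-class likelihood (percent) over a bagged ensemble.

    With an odd number of trees the median is the single middle value,
    which is why an odd tree count simplifies the calculation.
    """
    if len(tree_outputs) % 2 == 0:
        raise ValueError("use an odd number of trees (1-25)")
    return median(tree_outputs)

# Hypothetical per-tree likelihoods for one pixel from a 5-tree ensemble:
votes = [90, 10, 75, 60, 55]
print(ensemble_likelihood(votes))  # 60
```

Note how the single outlier tree (10) does not drag the result down the way an average would; the median makes the ensemble robust to individual noisy trees.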

 

The decision tree models are stored in the “trees” folder of the classification workspace.

 

The decision tree file structure:

Input file: D:/xx_GLAD_ARD_WEB/04_classification/sample1.txt

Rows: 7963

Columns : 255

Tree type: Classification tree

mincut: 1

minsize: 2

mindev: 0.000100

Output type of y: label and probabilities

Number of classes: 2

Number of all nodes: 369

Number of terminal nodes: 185

------------------------------------------------------------

Totally 122 Variables used in model building:

X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X15 …

Overall misclassification rate on training data: 0.000000

 

node), split, n, deviance, yval, (1 2)

                   * denotes terminal node

1) root 7963 11036.273815 1 (0.509356 0.490644)

2) X133 < 1637.5 4206 3621.415894 2 (0.154541 0.845459)

4) X114 < 5264 345 24.589972 1 (0.994203 0.005797)

8) X173 < 1926.5 337 -0.000000 1 (1.000000 0.000000) *

9) X173 >= 1926.5 8 8.997362 1 (0.750000 0.250000)

18) X8 < 306 2 -0.000000 2 (0.000000 1.000000) *

19) X8 >= 306 6 -0.000000 1 (1.000000 0.000000) *

5) X114 >= 5264 3861 2143.461403 2 (0.079513 0.920487)

10) X102 < 1945.5 2843 377.376031 2 (0.012311 0.987689)

20) X141 < 1207.5 2009 45.036198 2 (0.001493 0.998507)

40) X68 < 6829 1961 -0.000000 2 (0.000000 1.000000) * …

Annotations to the decision tree file:

Tree header:

  • Rows – number of sampled training pixels
  • Columns – number of metrics
  • Class IDs: 1 – background, 2 – target
  • The variable list (X1 X2 X3 …) enumerates all metrics used to build the model. For metric names, refer to metrics_list_pheno_<type>.txt in C:/GLAD_1.0
  • Misclassification rate: 0 indicates a perfect tree (all training pixels were separated).

Tree model: first the root node (all training pixels), then child nodes and terminal nodes (marked with *). The parameters for each node are:

  • node number
  • metric used to produce the split
  • threshold value
  • number of pixels after the split
  • deviance after the split
  • assigned class
  • target class probabilities

 

The classification output is stored as a raster file (LZW-compressed GeoTIFF). All tiles are mosaicked together. The pixel value is in the range 0-100 and represents the target class likelihood. The commonly used threshold to identify the target class is 50% (values 0-49 represent the background class, and values 50-100 the target class). However, the threshold may be adjusted if needed. The likelihood should not be treated as “probability” or “similarity”, as it depends on the training population. Nor does it show the percent of the target class within a pixel.
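The thresholding rule above amounts to a one-line mapping, sketched here for clarity (an illustration, not part of the GLAD tools):

```python
def to_class(likelihood, threshold=50):
    """Map a 0-100 target-class likelihood to a binary class.

    Values below the threshold are background (0); values at or above
    it are target (1). The default threshold of 50 matches the commonly
    used rule; it may be adjusted.
    """
    return 1 if likelihood >= threshold else 0

assert [to_class(v) for v in (0, 49, 50, 100)] == [0, 0, 1, 1]
```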

 

Another output of the classification is tree_report.txt, which includes an analysis of metric importance. The importance of a metric is calculated as the total deviance reduction from all nodes that use that metric for a split. The deviance decrease is summed over all decision tree models in the ensemble. The “percent_decrease_of_root” value expresses each metric's total deviance decrease as a percent of the root node deviance. The higher the value, the more important the metric is for separating the classes.
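The importance aggregation can be sketched as follows; the (metric, deviance reduction) pairs and the root deviance below are hypothetical, not taken from a real tree_report.txt:

```python
from collections import defaultdict

# Hypothetical (metric, deviance_reduction) pairs gathered from the split
# nodes of all trees in the ensemble.
splits = [("X133", 3271.4), ("X114", 1453.4), ("X133", 512.0), ("X8", 9.0)]
root_deviance = 11036.27  # summed deviance of the root node(s)

importance = defaultdict(float)
for metric, reduction in splits:
    importance[metric] += reduction  # total deviance decrease per metric

# percent_decrease_of_root: total decrease as a percent of root deviance.
percent = {m: 100.0 * d / root_deviance for m, d in importance.items()}
```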


4. Iterating classification

Due to the high complexity of land cover, the accuracy of a classification based on a small, subjectively selected training population is usually low. To improve the classification accuracy, we implement an active learning method. After obtaining the initial classification output, we evaluate it and add new training sites in areas where commission or omission errors are evident. To perform active learning iterative training, follow this routine:

  • Open the QGIS project and load classification results.
  • Start editing for training shapefiles.
  • Visually check the map (using both target and background class masks) and add training to shapefiles.
  • Save shapefiles and the project and close QGIS.
  • Perform classification. Classification results will be updated.

5. Hierarchical classification: using masks

The classification tool only operates with two classes at a time. After a map of a certain class (e.g., “forest”) is completed, it is possible to use classification results to map a sub-class (e.g., “evergreen forest”) within the “forest” class. The following steps illustrate the application of hierarchical classification using classification output:

  • Create a new classification workspace. Copy the empty training shapefiles, the list of tiles, and the classification parameter file there.
  • Create a copy of previous classification results (e.g., forest class likelihood) in the new classification workspace.
  • Create a text file that contains a recode table. See the recode table example: recode_table.txt. Each line of the recode table file has three elements: the minimal value in the input range, the maximal value in the input range, and the output value. The example file contains the table to recode the likelihood file into a mask with values 0 (background class) and 1 (target class).
  • Download utility recode_8bit.exe to C:/GLAD_1.0
  • Open CMD. Navigate to the new classification workspace. Perform the following command:

C:/GLAD_1.0/recode_8bit.exe <input>.tif <output>.tif recode_table.txt

  • The resulting file (e.g., mask.tif) may be used as a mask for classification. Training data will only be collected within the non-zero portion of the mask, and the results will only be applied where the mask file value is >0.
  • Create QGIS project and collect training.
  • Save shapefiles and project and close QGIS.
  • In the classification parameter file, replace “mask=none” with “mask=mask.tif” (or another mask file name)
  • Run classification.

Change Detection

3.7. Change Detection

The change detection metrics simplify the interpretation and mapping of land cover changes that are indicated by significant differences in land surface reflectance between the same seasons of the corresponding (“current”) year and the preceding year(s). As in Land Cover Classification, we apply a decision tree model that assigns each data pixel the likelihood (probability) of representing land cover change, based on the application of statistical decision rules in the multispectral/multi-temporal domain. The decision tree model is calibrated using a training population of pixels with assigned change and no-change classes. See Land Cover Classification for more information on decision trees and the active learning model implementation method.

 

Thematically similar land cover changes may be represented by different trajectories of surface reflectance change. For example, a forest disturbance that removes woody vegetation may be indicated by contrasting changes in shortwave infrared (SWIR) reflectance: SWIR will increase after logging and decrease after a forest fire. While the metrics are designed to allow detection of different indications of land cover change, the change detection model requires a set of training data that represents the different spectral responses. To examine land cover dynamics and to create a comprehensive training dataset, the data analyst is encouraged to visualize a combination of different metrics.

 

In the following section, we describe the application of the supervised decision tree classification tool that employs a dependent variable (training data) in the form of vector polygon layers and independent variables in the form of change detection metrics. The provided tools operate with only two classes: a target class (representing land cover change) and a background class (no change). The output layer shows the likelihood (in percent) that each pixel belongs to the target class.


1. Creating image mosaics

The section Using Image Mosaics provides instructions for using the mosaic_tiles.pl tool to stitch together tiled data and create multi-band image composites. Change detection interpretation may require multiple representations of the metric data using different spectral bands and statistics. Here we provide examples of a few band combinations that are used to interpret forest disturbances.

 

a. Current and preceding year composites

The following example parameter files for mosaic_tiles.pl are designed to create two separate composites: one that displays the average annual reflectance of the current year (year2) and the other for the preceding year (year1) from the change_B metric dataset:

param_mosaic_year1.txt

param_mosaic_year2.txt

 

The outputs are SWIR-NIR-Red composites for two years.


Preceding year composite (year1)


Current year composite (year2)

 

b. Inter-annual band difference composite

The following example parameter file may be used to create a SWIR band difference image. It creates an RGB composite that displays the SWIR band from the current year as the red band and the SWIR band from the preceding year as the green and blue bands.

 

param_mosaic_changeSWIR.txt


Note that not all changes in SWIR spectral reflectance represent land cover change.

 

c. A composite that highlights change between 16-day intervals

One of the most important features of the change detection metrics is the ability to highlight per-16-day-composite (seasonal) changes in spectral reflectance. The following example creates a composite that displays the highest seasonal change of the SWIR band as the red band and the average seasonal change of the NIR/SWIR2 band ratio as the green and blue bands.

 

param_mosaic_changeDIF.txt


Note that not all changes in spectral reflectance represent land cover change.


2. Collecting training data

Training data represent two polygon shapefiles, one with areas marking training class pixels (“target”), and the other marking other pixels (“background”). Both shapefiles should be in the same coordinate system as the metrics (+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs). The classification tool uses only the object shape data; all attributes are ignored. The shapefiles may contain overlapping polygons. Correct topology is not required as long as the data can be correctly rasterized. The polygons in the “target” and “background” shapefiles may overlap. In case of overlap, the area under the “target” class polygons will be erased from the “background” layer.

 

The polygon layers may be created in any GIS software. The following manual demonstrates the use of QGIS 2.18 for shapefile editing. The following checklist summarizes the requirements for training data collection:

  • QGIS 2.18 with OpenLayers, Freehand Editing, Send2GE plugins.
  • Image mosaic (VRT or raster format) of selected metrics used for data visualization.
  • Two empty shapefiles in geographic coordinates WGS84.  Empty shapefiles may be downloaded here: https://glad.umd.edu/gladtools/Examples/target.zip and https://glad.umd.edu/gladtools/Examples/background.zip

To collect training data, follow the routine described below:

  • Create classification workspace. The workspace (folder) should include:
    • list of tiles (single column, tile names only – see example tiles.txt)
    • two shapefiles for training data
    • classification parameter file (see below)
  • Open QGIS (new project) and load required plugins.
  • Add raster layers (mosaics of selected metrics). Optionally: load Bing Maps layer using OpenLayers plugin.
  • Load target.shp and background.shp files. Put the target layer onto the top of the background layer in the Layer Panel.
  • Start editing (Toggle Editing button) for both shapefiles
  • Use “Add Polygon” or “Freehand Drawing” tools to add training samples. Avoid creating large training polygons. Distribute samples over the entire area of the image.

Sample drawing example:

Image composite

Target training

Background training (overlaid with target training)

 
  • Save layers and project (periodically)

3. Applying classification

 

Before applying classification, check that all required software is installed on your computer.

To apply classification, follow the routine described below:

  • Save all edits and close the QGIS project.
  • Edit the classification parameter file.

mettype=change_B   (metric type)
metrics=D:/Metrics_change_B   (multi-temporal metrics source folder)
dem=D:/DEM   (topography metrics source folder)
year=2018   (year, for multi-temporal metrics)
target_shp=target.shp   (target class shapefile name)
bkgr_shp=background.shp   (background class shapefile name)
tilelist=tiles.txt   (name of the tile list file)
outname=forest_loss_2018   (output file name, no spaces!)
mask=none   (mask file name; none – no mask)
maxtrees=21   (number of trees; odd number in the range 1-25)
sampling=10   (sampling rate; percent of training data extracted for each tree)
mindev=0.0001   (tree pruning rule)
threads=1   (number of parallel processes)
treethreads=21   (number of parallel processes for a tree model)
ogr=C:/Program Files/QGIS 2.18/OSGeo4w.bat   (path to the OSGeo4w.bat file; check your local installation)

You may modify the parameter file depending on the computer capacity, training size, etc. Specifically:

- Increasing the maxtrees parameter will slow classification but improve model generalization.

- Increasing mindev will reduce tree complexity; decreasing it will increase tree complexity.

- Reduce the sampling parameter if sample areas are too large; increase it if the maxtrees parameter is reduced.

- Reduce the threads and treethreads parameters for a low-capacity computer (minimal value 1).

  • Open cmd, navigate to the folder with the tile list, and run the program:

> perl C:/GLAD_1.0/classification.pl param_change_B.txt

  • Wait for the process to complete.
  • Open QGIS and load the classification result (TIF file). To visualize the target class, apply transparency to the interval 0-49. To show only the background class, apply transparency to the interval 50-100.

model output

Example of model output (the output raster transparency is set for the 0-49 interval). Blue areas represent pixels with a change class likelihood of 50% and above.


4. Understanding classification outputs

See Land Cover Classification section for a detailed explanation.


5. Iterating classification

Due to the high complexity of land cover and land cover change, the accuracy of a classification based on a small, subjectively selected training population is usually low. To improve the classification accuracy, we implement an active learning method. After obtaining the initial classification output, we evaluate it and add new training sites in areas where commission or omission errors are evident. To perform active learning iterative training, follow this routine:

  • Open the QGIS project and load classification results.
  • Start editing for training shapefiles.
  • Visually check the map (using both target and background class masks) and add training to shapefiles.
  • Save shapefiles and the project and close QGIS.
  • Perform classification. Classification results will be updated.

6. Hierarchical classification: using masks

See Land Cover Classification section for a detailed explanation. The masking works the same for change detection. The output of land cover classification may be used as a mask (i.e., to map forest changes only within the forest class).

 

Landsat ARD Sampling

3.8. Landsat ARD Sampling

Under construction

Time-series Profiles

3.9. Time-series Profiles

Under construction

Post-processing and Analysis

3.10. Post-processing and Analysis

Image recode tool

Tool: recode_8bit.exe

Description: The tool recodes values in an 8-bit unsigned raster file.

Installation: Download utility recode_8bit.exe to C:/GLAD_1.0

Usage:

  • Create a text file that contains a recode table. See this file as an example: recode_table.txt. Each line of the recode table file has three elements: the minimal value in the input range, the maximal value in the input range, and the output value. The example file contains the table to recode the likelihood file into a mask with values 0 (background class) and 1 (target class).
  • Open CMD. Navigate to the new classification workspace. Perform the following command:

C:/GLAD_1.0/recode_8bit.exe <input>.tif <output>.tif recode_table.txt
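The recode logic can be approximated in pure Python to clarify the table format described above. This is an illustrative sketch, not the actual recode_8bit implementation; in particular, passing unmatched values through unchanged is an assumption:

```python
def load_recode_table(path):
    """Read a recode table file: each line is 'min max output'."""
    table = []
    with open(path) as f:
        for line in f:
            parts = line.split()
            if len(parts) == 3:
                lo, hi, out = (int(p) for p in parts)
                table.append((lo, hi, out))
    return table

def recode_value(value, table):
    """Return the output value of the first matching inclusive range.

    Pass-through of unmatched values is an assumption of this sketch.
    """
    for lo, hi, out in table:
        if lo <= value <= hi:
            return out
    return value

# Hypothetical table recoding a likelihood layer into a 0/1 mask:
table = [(0, 49, 0), (50, 100, 1)]
mask = [recode_value(v, table) for v in (0, 49, 50, 100)]
```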


Area calculation tool

Tool: get_area.exe

Description: Calculates the area and pixel count. The input file should be an 8-bit unsigned raster file.

Installation: Download utility get_area.exe to C:/GLAD_1.0

Usage:

  • Open CMD. Navigate to the new classification workspace. Perform the following command:

C:/GLAD_1.0/get_area.exe <input>.tif

  • The output is recorded to the “area_report.txt” file

Zonal statistics tool

Tool: zonal_stat.exe

Description: Calculates the area of target pixel classes within zones defined by another raster layer. Both input files should have the same size and coordinate system, and both should be in 8-bit unsigned format.

Installation: Download utility zonal_stat.exe to C:/GLAD_1.0

Usage:

The first step is to create a zonal raster file. The example below provides a solution to create a zonal raster from a vector map using GDAL tools. Other solutions are possible as well.

  • Make sure that the zonal shapefile has a separate database field with zone IDs, that the zone IDs do not exceed 255, and that the vector projection is +proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs.
  • Open OSGeo4W and navigate to the work folder.
  • Use GDAL gdalinfo tool to get the input raster extent and resolution:
    • Use command: gdalinfo <input raster>.tif >fileparam.txt
    • Open fileparam.txt to get UL/LR coordinates and pixel size
  • Use gdal_rasterize to make the raster zone file:
    • gdal_rasterize -te ulx lry lrx uly -tr 0.00025 0.00025 -ot Byte -of GTiff -co COMPRESS=LZW -co BIGTIFF=IF_SAFER -a grid vector_zones.shp raster_zones.tif (here “grid” is the name of the attribute field that holds the zone IDs)

The second step is to run the area calculation. It requires two files: source map and zones, both in the same format and extent.

  • Open CMD. Navigate to the new classification workspace. Perform the following command:

C:/GLAD_1.0/zonal_stat.exe <input>.tif <zonal>.tif

  • The output is recorded to the “zonal_area_report.txt” file.
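The zonal tally performed by zonal_stat.exe can be illustrated with a pure-Python sketch over miniature hypothetical rasters (this is not the actual tool, only the counting idea):

```python
from collections import Counter

def zonal_pixel_counts(class_raster, zone_raster):
    """Count pixels of each class within each zone.

    Both inputs are same-shape 2-D lists of 8-bit values, mirroring the
    requirement that the two rasters share size and format.
    """
    counts = Counter()
    for class_row, zone_row in zip(class_raster, zone_raster):
        for c, z in zip(class_row, zone_row):
            counts[(z, c)] += 1
    return counts

# Hypothetical 2x3 rasters: class map (0 = background, 1 = target)
classes = [[1, 1, 0],
           [0, 1, 0]]
zones   = [[1, 1, 2],
           [1, 2, 2]]
counts = zonal_pixel_counts(classes, zones)
assert counts[(1, 1)] == 2   # two target pixels in zone 1
assert counts[(2, 0)] == 2   # two background pixels in zone 2
```

Converting the pixel counts to areas would additionally require the pixel size (and, for geographic coordinates, the latitude-dependent pixel area), which the real tool handles internally.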