SECTION II.

MAP DEVELOPMENT

From: Influences on Wetlands and Lakes in the Adirondack Park of New York State:
A Catalog of Existing and New GIS Data Layers for the
400,000 Hectare Oswegatchie/Black River Watershed,
1997


IIA. Upland Land Cover Classification from LANDSAT Thematic Mapper Data

 

One of the objectives of this study was to determine the upland vegetative cover for the entire Oswegatchie/Black River watershed by producing a land cover map for non-wetland areas using LANDSAT Thematic Mapper Data. Since this was a major effort in this project, the method is provided in enough detail to instruct the fullest range of users. This section of the report was prepared by Eileen B. Allen and Dr. Donald J. Bogucki at SUNY Plattsburgh.

 

Brief Introduction to Image Processing

The upland landcover classification was derived from four images collected by the Landsat Thematic Mapper (TM) satellite. The process of classifying an image from raw data to a digital landcover file involves grouping the TM image data into desired landcover classes. The Landsat satellite "sees" the ground as a 30 meter ground resolution cell (the thermal band is 120 meters). For each pixel, the satellite sensors measure the energy reflected or emitted from the ground surface in selected portions of the electromagnetic spectrum. The satellite records ground data in seven sections of the spectrum (bands). The spectrum covered by each band is listed in Table IIA.1. For each pixel in each band, the sensor records the energy intensity it receives over a quantization range of 256 numbers (8 bits). Thus, an image can be viewed as a seven dimensional multivariate dataset and can be analysed using statistical methods.
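The "seven dimensional dataset" idea can be made concrete with a minimal Python sketch in which each pixel is a vector of seven 8-bit digital numbers. The pixel values and the function name `band_means` are invented for illustration and are not project data.

```python
# Hypothetical illustration: each pixel is a 7-element vector of 8-bit
# digital numbers (DN), one per TM band. The values below are made up.
BANDS = 7
DN_MIN, DN_MAX = 0, 255  # 8-bit quantization range (256 levels)

def band_means(pixels):
    """Return the mean DN of each band over a list of 7-band pixels."""
    for p in pixels:
        if len(p) != BANDS or not all(DN_MIN <= v <= DN_MAX for v in p):
            raise ValueError("each pixel needs 7 values in the range 0-255")
    n = len(pixels)
    return [sum(p[b] for p in pixels) / n for b in range(BANDS)]

sample = [
    [60, 25, 20, 90, 70, 130, 25],
    [62, 27, 22, 94, 74, 132, 27],
]
print(band_means(sample))
```

Per-band means of this kind are the simplest statistic a class signature can carry; variances and covariances extend the same idea.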


Table IIA.1. LANDSAT Thematic Mapper Bands.

Band Number    Wavelength (microns)    "Color"
-----------    --------------------    -------------------
Band 1         0.45-0.52               blue
Band 2         0.53-0.60               green
Band 3         0.63-0.69               red
Band 4         0.76-0.90               reflective infrared
Band 5         1.55-1.74               mid-infrared
Band 6         10.40-12.50             thermal infrared
Band 7         2.08-2.35               mid-infrared

Bands 1, 2, and 3 are from the visible portion of the spectrum. Bands 1, 2, 3, 4, 5 and 7 are reflected radiation while Band 6 represents emitted radiation. Satellite sensors recorded data as 30 meter pixels except Band 6 which was recorded as 120 meter pixels and resampled to 30 meters. Precision corrected scenes were resampled to 25 meter pixels during processing by EOSAT. (Wavelengths from ERDAS, 1991).

Geocoded full scene = 31,450 sq km (12,190 sq mi) (EOSAT, 1990)


Scenes used for this project were purchased as precision-corrected products. The original 30 meter data were geo-referenced to known ground features and resampled to 25 meter pixels by EOSAT. Geo-referencing enables the computer-based overlay of multiple data layers referenced to the same coordinate system.

The two major image processing methods are supervised and unsupervised classification. Supervised classification is preferred when there is good knowledge about the scene to be analyzed and entails "training" the computer to recognize features. In an unsupervised classification, the computer subsets the data into statistically-derived clusters.

In supervised image processing the image processor enhances the image to help with training sample selection. Training samples are chosen based upon desired use of data (i.e., types of features needed in the end product), in consultation with ancillary data such as air photos, maps, and field notes. The imagery is manipulated in an attempt to create the best possible band combinations for the desired classification. The image is then classified. Additional areas are sampled to assess classification accuracy. The entire process is intensely iterative, with almost endless possibilities for band and signature manipulations. A general outline of variables is shown in Table IIA.2.

MAXCLAS is the primary image classifier for ERDAS, and options include a choice of classification algorithms (Maximum Likelihood, Mahalanobis, and Minimum Distance to Mean). The program can perform a first pass parallelepiped classification with user-defined standard deviation or minimum/maximum digital numbers per band used as classification boundaries. An initial parallelepiped classification can reduce processing time but also modifies the decision rules. MAXCLAS can produce a probability file which is used to build histograms, demonstrating the distance from the signature mean for pixels of each class. A priori values can be used to weight individual classes. The classification may also be manipulated by choosing limited signatures for classification (i.e., not all classes need to be separated at once).
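As a rough illustration of two of the decision rules named above, the following Python sketch applies a first pass parallelepiped test and falls back to Minimum Distance to Mean. It is not the MAXCLAS implementation: the function names, the two-band signatures, and the fall-back behavior for pixels inside no box or several boxes are all assumptions made for the example.

```python
import math

def in_parallelepiped(pixel, mean, std, k=2.0):
    """True if every band value lies within mean +/- k standard deviations."""
    return all(abs(v - m) <= k * s for v, m, s in zip(pixel, mean, std))

def classify(pixel, signatures, k=2.0):
    """Classify one pixel.

    signatures maps class name -> (mean vector, std vector).
    First pass: if the pixel falls inside exactly one class box, use it.
    Otherwise fall back to the minimum Euclidean distance to a class mean.
    """
    boxes = [name for name, (mean, std) in signatures.items()
             if in_parallelepiped(pixel, mean, std, k)]
    if len(boxes) == 1:
        return boxes[0]
    return min(signatures,
               key=lambda name: math.dist(pixel, signatures[name][0]))

# Made-up two-band signatures, for illustration only.
signatures = {
    "conifer":   ([30.0, 60.0], [3.0, 5.0]),
    "deciduous": ([35.0, 110.0], [3.0, 8.0]),
}
print(classify([31, 63], signatures))  # inside the conifer box only
print(classify([34, 90], signatures))  # inside no box, nearest mean decides
```

The sketch shows why a parallelepiped first pass changes the decision rules: pixels inside a single box never reach the distance calculation.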

The classification approach adopted for this project was to use a supervised image processing method and examine only the upland portions of the study area. A rasterized file of the detailed vector-based wetland files was created and used to mask out wetlands from the LANDSAT imagery. A sample area (Old Forge 7 1/2 minute quadrangle) was subset from the imagery and used to test standard image processing techniques.
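The masking step can be pictured with a small Python sketch in which a rasterized wetland file decides which image cells are kept. The cell codes follow the raster coding described later in this section (1 = outside the study area, 2 = wetland, 3 = upland); the function name, the fill value, and the sample arrays are hypothetical.

```python
UPLAND = 3  # mask code for upland cells, per the raster coding in this section

def mask_band(band, mask, keep_code=UPLAND, fill=0):
    """Keep DN values where the mask equals keep_code; set others to fill."""
    if len(band) != len(mask) or any(len(b) != len(m)
                                     for b, m in zip(band, mask)):
        raise ValueError("band and mask must have identical dimensions")
    return [[dn if code == keep_code else fill
             for dn, code in zip(band_row, mask_row)]
            for band_row, mask_row in zip(band, mask)]

# Made-up 2 x 3 example: one band of DN values and the matching mask.
band = [[40, 42, 55],
        [41, 90, 57]]
mask = [[3, 3, 2],
        [1, 3, 2]]
print(mask_band(band, mask))
```

A cell-by-cell mask like this only works when both files share the same cell size and origin, which is why the registration steps described below matter.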

The July reflective bands were classified for each scene. Pixels obscured by July clouds and cloud shadow were reclassified with May reflective bands. Classification accuracy assessments were performed for each scene and classified files were then stitched together. Classification accuracy was performed on the unified GIS land cover file. Detailed file lineages for the northern, southern, and final land cover classifications may be found in Appendix A.
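The cloud replacement step can be sketched as a cell-by-cell substitution. In this minimal Python example the class codes, the `CLOUD` code, and the function name are hypothetical; the actual file lineage is documented in Appendix A.

```python
CLOUD = 0  # hypothetical code marking July cells lost to cloud or shadow

def fill_clouds(july, may, cloud_code=CLOUD):
    """Use the July class everywhere except cloud cells, which take May's."""
    return [[m if j == cloud_code else j
             for j, m in zip(july_row, may_row)]
            for july_row, may_row in zip(july, may)]

# Made-up classified grids (class codes are illustrative only).
july = [[4, 0, 5],
        [4, 4, 0]]
may = [[4, 6, 5],
       [5, 4, 7]]
print(fill_clouds(july, may))
```

Note that where both dates have data the July class wins, even if May disagrees.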

 

LANDSAT Thematic Mapper map oriented precision processed scenes:

Northern image: 015/02900 Acquisition dates: 19920529 & 19900711

Southern image: 015/03000 Acquisition dates: 19920529 & 19920716

 

SUNY-Plattsburgh Remote Sensing Laboratory Image Processing System:

Hardware: Compaq Deskpro 386/33L PC with 317Mb hard drive, Number Nine image processing board with dual color screen capability (one VGA and one RGB 512 X 512 monitor), 1374Mb SCSI hard drive, Cipher 9-track 1600 bpi tape drive, Exabyte EXB-8505XL 7Gb 8mm tape drive, 36"X48" GTCO Super L digitizing tablet.

Software: DOS 5.0, PC ARC/INFO 3.4 D, Arcview 2.0, Image Alchemy 1.5, ERDAS 7.5, TAPEDISK 6.2.2, and TAPEDISK TDRAW 1.01 Beta.

 


Table IIA.2. Oswegatchie/Black Image Processing Variables.

Image Registration
    - vector wetlands
    - raster wetlands
Image Processing
    Signature selection
        - classification schemes (e.g., Anderson Level II)
        - air photo interpretation; ancillary data
        - evaluation of within-class signatures
        - evaluation of between-class signatures
    Processing variables
        - actual digital number manipulations
            * rectification
            * haze removal
                - histogram method
                - 4th Tasseled Cap parameter
            * sun angle normalization
        - band clarity assessments
            * clouds
            * striping
            * pixel dropout
        - band recombination
            * band redundancy
            * best separability
            * band ratioing and "image arithmetic"
            * principal components
        - type of classifier
            * Minimum Distance
            * Mahalanobis
            * Maximum Likelihood
        - classifier options:
            * first pass parallelepiped with user-defined standard deviation
            * a priori values
        - classification evaluation
            * statistical assessments:
                CMATRIX (signature evaluation using training samples)
                DIVERGE (signature separability by band combinations)
                    * Transformed Divergence method
                    * Jeffries-Matusita distance method
                SIGMAN (signature variance/covariance)
            * visual assessments:
                DISPLAY
                ELLIPSE
                CLASOVR
                THRESH (histogram-based classification thresholding)
            * accuracy assessment
                - random pixel selection
                - user-selected samples
                    signature sample pixels
                    new pixels for accuracy assessment only
GIS File Manipulation
    stitch north and south images
    smooth salt and pepper; additional QA/QC
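The accuracy assessment entries in Table IIA.2 rest on comparing reference labels with classified labels. The Python sketch below tallies a contingency table and an overall accuracy for made-up sample pixels; it illustrates the general idea only and is not the ERDAS CMATRIX program, and the function names are hypothetical.

```python
from collections import Counter

def contingency(reference, classified):
    """Count (reference class, classified class) pairs."""
    if len(reference) != len(classified):
        raise ValueError("label lists must be the same length")
    return Counter(zip(reference, classified))

def overall_accuracy(reference, classified):
    """Fraction of sample pixels whose classified label matches the reference."""
    table = contingency(reference, classified)
    correct = sum(n for (ref, cls), n in table.items() if ref == cls)
    return correct / len(reference)

# Made-up labels for four sample pixels.
reference = ["conifer", "conifer", "deciduous", "open"]
classified = ["conifer", "deciduous", "deciduous", "open"]
print(contingency(reference, classified))
print(overall_accuracy(reference, classified))
```

Accuracy computed on the signature sample pixels themselves tends to be optimistic, which is one reason the table also lists new pixels chosen for accuracy assessment only.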

 

Thematic Mapper Data Acquisition and Initial File Registration

LANDSAT imagery was purchased from EOSAT as part of a state-wide purchase agreement. Imagery obtained through this agreement is delivered as full scenes in band sequential format with a blocking factor of three (header files for bands 1 and 2 of each scene are reprinted in Appendix IIA.2). All scenes are precision-corrected map oriented products with a Transverse Mercator (TM) USGS map zone 61 projection, resampled from the original 30 meter to a 25 meter pixel size (0.15 acres, 0.062 hectares). The scaling parameters for this projection match the Universal Transverse Mercator (UTM) Zone 18 parameters of the PC ARC/INFO watersheds and wetlands databases developed during Phase I of this project. All project databases use the Clarke 1866 spheroid.
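The pixel area figures quoted above follow from simple arithmetic, sketched here in Python. The function name is hypothetical; the conversion constants are the standard square meters per hectare and per international acre.

```python
SQ_M_PER_HECTARE = 10_000.0
SQ_M_PER_ACRE = 4046.8564224  # international acre

def pixel_area(cell_size_m):
    """Return (hectares, acres) covered by one square cell."""
    area_sq_m = cell_size_m ** 2
    return area_sq_m / SQ_M_PER_HECTARE, area_sq_m / SQ_M_PER_ACRE

hectares, acres = pixel_area(25.0)
print(round(hectares, 4), round(acres, 3))
```

A 25 meter cell covers 625 square meters, which matches the rounded values of 0.062 hectares and 0.15 acres given in the text.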

The 9-track 6250 bpi EOSAT LANDSAT tapes were not compatible with State University of New York at Plattsburgh Remote Sensing Laboratory (SUNY-P RSL) hardware. Although the SUNY-P Computing Center could have copied the LANDSAT tapes as high-density files on 8mm tapes, TDRAW beta software could not read the long file lines of the blocked data and truncated the northeast corner of each scene. LANDSAT header and data files had to be copied onto the Adirondack Park Agency (APA) Data General workstation with ERDAS software. The SUNY-P RSL 8mm SCSI tape drive was connected to the Data General and ERDAS image files were transferred to 8mm tape with the Unix dd command. Files were copied onto the SUNY-P RSL image processing system from the 8mm tape utilizing Tapedisk TDRAW software which enables access to raw data on the 8mm tape drive with a DOS system.

To verify and assess the projection match between the LANDSAT and PC ARC/INFO databases, wetland arcs were displayed over the ERDAS LANDSAT image. LANDSAT raster cells were assigned coordinates from the header files supplied with the data tapes. Unfortunately, the match was not good. TM and UTM projections were compared by reprojecting one LANDSAT scene to UTM coordinates and reprojecting one PC ARC/INFO wetland coverage (7 1/2' quadrangle) from UTM to TM coordinates. The transformed files were overlaid with the originals, but no differences were noted.

PC ARC/INFO coverages were draped over LANDSAT imagery using ERDAS Live Link software. All scenes showed a consistent displacement to the southeast relative to the PC ARC/INFO wetland coverages. The wetland coverages were double-checked for positional accuracy. Quadrangle coverages for the wetlands/watershed data layer were derived from the Northern Forest Lands quadrangle/tic file. This file was developed from the DOT_NYTM.TXT file (NYTM Coordinates 11/22/89 purchased from NYSDOT Mapping Services Bureau; John Barge, Senior GIS Analyst, Adirondack Park Agency, personal communication). Tic projection parameters matched UTM zone 18 parameters (TM projection with central meridian 75°W; scale factor at central meridian: 0.9996; Clarke 1866 ellipsoid). In addition, the RMS errors recorded when watersheds and wetlands were digitized from 7 1/2' quadrangle base maps in Phase I of this project were reasonable and consistent (< 0.003).

The projection issue is critical since the minimum mapping size for the wetlands database was only about 1 acre or approximately 6.7 pixels for a circular wetland. Often, however, wetlands tend to have a more linear shape. If the projection parameters are off by even a portion of a pixel, entire wetlands may be mislocated. When the vector file is rasterized, the approximation may not represent the wetland very well at all. In this case, an area that should have been mapped as wetland will be incorrectly designated upland in the final GIS files. Experimentation showed that the best fit between the PC ARC/INFO wetland vector files and the LANDSAT precision corrected imagery was obtained by moving the coordinates of the LANDSAT scenes one pixel northwest (i.e., upper left corner of file = X-25 meters, Y+25 meters).
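The one pixel northwest adjustment amounts to a small change in the upper left corner coordinates, as this Python sketch shows. The corner coordinates used in the example are hypothetical, as is the function name; only the 25 meter shift comes from the text.

```python
def shift_northwest(upper_left_x, upper_left_y, cell_size_m=25.0, cells=1):
    """Move a raster's upper left corner west (smaller X) and north (larger Y)."""
    offset = cells * cell_size_m
    return upper_left_x - offset, upper_left_y + offset

# Hypothetical corner coordinates in meters, not taken from the scene headers.
print(shift_northwest(500000.0, 4850000.0))
```

Because only the reference corner changes, the cell values themselves are untouched; every cell is simply reported 25 meters further west and north.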

We were very fortunate to acquire a beta copy of Tapedisk's TDRAW which enables the reading of raw data from 8mm tape by a DOS-based system, in this case the ERDAS image files created by the APA's UNIX system. Tapedisk was extremely helpful in meeting our needs and the project could not have been done without TDRAW. ERDAS image files and text headers were copied from 8mm tapes loaded with TDRAW using DOS copy with the binary switch to the Compaq hard drive. Each band for each scene was copied onto the hard drive as a separate file. File coordinates were assigned and files were backed up onto 8mm tape with TAPEDISK software which enables the 8mm tape to act as another DOS drive.

Approximate study area coordinates were determined and four 7-band image files were created from each scene. Both northern scenes were subset into files with identical coordinates and size as were the two southern scenes. While larger than the study area, scene subsets significantly reduced scene and, therefore, file size. A generous margin was maintained around the study area to enable reasonably accurate processing of upland cover types and to facilitate training sample selection if necessary in the more unusual cover types found in the western margins of the study area. While the north and the south scenes overlapped at their common border, no attempt was made to truncate the overlapping area (Figure IIA.1).

Figure IIA.1. Oswegatchie/Black Study Area showing quadrangle boundaries and outer study area boundary. Shaded area illustrates the area of overlap between the northern and southern LANDSAT scenes utilized for the upland land cover classification.

Rasterization of PC ARC/INFO Vector Wetland Files

Only polygonal wetlands were rasterized. Although the software could have created raster cells from linear wetland arcs, most linear features are probably not wide enough to "grab" entire pixels throughout their length.

Theoretically, a look-up table can be designed in PC ARC/INFO to assign values to raster cells as they are created from the vector files. Since there are so many permutations of the character wetland label value NWILABEL in the PC ARC/INFO wetland files (more than the 255 value limit for raster cell codes) an attempt was made to create a look-up table that coded NWILABEL with a numeric value in the raster file. Attempts to use a look-up table were made on a very simple file (South Edwards) and a relatively complex file (Number Four). Unfortunately, all efforts to code wetland raster cells based upon an alpha selection of NWILABEL were unsuccessful. Attempts to use a look-up table for the simpler character value SYSTEM in the PC ARC/INFO wetland files (corresponding to the NWI System label and possessing only 6 potential values) were similarly unsuccessful.

Each of the 48 wetland coverages (which ranged in size from 8.78 Kb for Brothers Ponds to 2568.33 Kb for Copper Lake) was copied to a new coverage and PC ARC/INFO's DISSOLVE was used based only upon the System (SYSTEM) value, not the entire NWI wetland cover type label (NWILABEL), to simplify the coverage. A numeric item was added to the dissolved coverages to code individual raster cells with the following raster cell values: no value (outside of study area boundary) = 1; P, L1, R2, and R3 = 2; and Upland = 3. PC ARC/INFO POLYGRID transformed the new simplified coverages to ERDAS 8-bit raster files. A combination of PC ARC/INFO SML files, dBASE IV batch files, and ERDAS audit files was used to facilitate processing. The study area was broken into 7 blocks to ensure that PC ARC/INFO file limitations were not exceeded, and the blocks were subsequently stitched together in ERDAS.
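The coding scheme for the numeric item can be sketched as a simple lookup in Python. How upland polygons were labelled in the SYSTEM item is not stated in the text, so the label "U" below is an assumption, as are the function and constant names.

```python
OUTSIDE, WETLAND, UPLAND = 1, 2, 3        # raster cell values from the text
WETLAND_SYSTEMS = {"P", "L1", "R2", "R3"}  # SYSTEM values coded as wetland
UPLAND_LABEL = "U"                         # assumed label, not from the source

def raster_code(system):
    """Map a SYSTEM value to the numeric code written to the raster file."""
    if system is None or system == "":
        return OUTSIDE
    if system in WETLAND_SYSTEMS:
        return WETLAND
    if system == UPLAND_LABEL:
        return UPLAND
    raise ValueError(f"unexpected SYSTEM value: {system!r}")

print([raster_code(s) for s in ["P", "L1", "U", ""]])
```

Collapsing the labels to three codes keeps the raster well inside the 255 value limit that made a direct NWILABEL coding impractical.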

Great care had to be taken to ensure that the reference coordinates exactly matched the LANDSAT data cell coordinates so that raster cells could overlay one another. The gridded wetland file was checked against the original vector files and coordinates checked across the entire study area. The POLYGRID coordinates were not satisfactory, and the coordinates were changed by 1/2 grid cell (12.5 meters X and Y). However, since the coordinate match of raster cells between files must be precise, the wetland files had to be re-rasterized with the new coordinate parameters.
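One way to express the requirement that cells overlay one another exactly is to check that two grid origins differ by a whole number of cells. The Python sketch below does this for hypothetical origins; the function name and tolerance are assumptions made for the example.

```python
def grids_aligned(origin_a, origin_b, cell_size_m=25.0, tol=1e-6):
    """True if two grid origins are offset by a whole number of cells in X and Y."""
    dx = (origin_a[0] - origin_b[0]) / cell_size_m
    dy = (origin_a[1] - origin_b[1]) / cell_size_m
    return abs(dx - round(dx)) < tol and abs(dy - round(dy)) < tol

# Hypothetical origins in meters.
landsat_origin = (500000.0, 4850000.0)
print(grids_aligned(landsat_origin, (500050.0, 4849975.0)))  # whole-cell offset
print(grids_aligned(landsat_origin, (500012.5, 4849987.5)))  # half-cell offset
```

A half-cell (12.5 meter) offset fails the check, which is the situation that forced the wetland files to be re-rasterized rather than simply relabelled.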

The PC ARC/INFO POLYGRID decision rule is that the polygon commanding the maximum area within a raster cell decides the code for that cell (ESRI, 1991). Some errors were noted during POLYGRID processing (i.e., areas assigned wetland inappropriately, cells incorrectly labelled upland, and some cells labelled background along quadrangle edges). No apparent reason was discovered for most mis-codings and, except for quadrangle edge problems, they did not fall into any obvious pattern. Therefore, each wetland quadrangle-based vector coverage was drawn on top of the rasterized wetland file and proof-read at 2X enlargement. ERDAS GISEDIT was used to change values as needed.
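The maximum-area decision rule can be illustrated with a short Python sketch that picks the code covering the most area within one cell. This is only a model of the rule as described above, not the POLYGRID implementation; the function name and the sample areas are hypothetical, and the handling of exact ties here is arbitrary.

```python
def cell_code(area_by_code):
    """Return the code whose polygons cover the largest area within one cell.

    area_by_code maps raster code -> area in square meters inside the cell.
    Exact ties go to the first code listed, an arbitrary choice for this sketch.
    """
    if not area_by_code:
        raise ValueError("cell has no polygon coverage")
    return max(area_by_code, key=area_by_code.get)

WETLAND, UPLAND = 2, 3

# An 8 m wide wetland strip crossing a 25 m cell covers 200 of 625 sq m.
print(cell_code({WETLAND: 200.0, UPLAND: 425.0}))
# A wetland covering most of the cell wins.
print(cell_code({WETLAND: 400.0, UPLAND: 225.0}))
```

The first case shows why narrow linear wetlands tend to drop out of the raster: they rarely command the majority of any single cell.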

After the rasterized wetlands were checked against the original vector files, the wetland GIS file was compared to the LANDSAT data. While ERDAS-PC ARC/INFO file comparisons use coordinate values, ERDAS file comparisons use file coordinates. Therefore, both real world and file coordinates for the raster cells must match precisely.
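The relationship between real world and file coordinates can be sketched as follows in Python. The sketch assumes zero-based row and column indices and an upper left coordinate that marks the outer corner of the first cell; ERDAS conventions may differ on both points, and the coordinates used are hypothetical.

```python
def map_to_file(x, y, upper_left_x, upper_left_y, cell_size_m=25.0):
    """Convert map coordinates (meters) to zero-based (row, column) indices.

    Assumes the upper left coordinate is the outer corner of cell (0, 0).
    """
    col = int((x - upper_left_x) // cell_size_m)
    row = int((upper_left_y - y) // cell_size_m)
    return row, col

# Hypothetical upper left corner and point, in meters.
print(map_to_file(500030.0, 4849990.0, 500000.0, 4850000.0))
```

Two files compared by row and column only line up on the ground if they share the same upper left coordinate and cell size, so a mismatch in either set of coordinates shows up as misregistered cells.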
