Southern Oswegatchie-Black Image Classification
Registration between the rasterized wetland file and the imagery was poor across the study area scene. Wetland file coordinates (file coordinates, not real-world coordinates) were adjusted to match the imagery precisely, and the imagery and wetland files were subset so that the rasterized wetland and imagery files were exactly the same size. Wetlands were then masked out of the imagery so that only upland pixels would be classified.
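The masking step can be pictured with a minimal sketch. It assumes the wetland raster and one image band are already co-registered grids of identical size, held as nested lists; the `NODATA` value of 0 and the function name `mask_wetlands` are illustrative assumptions, not part of the ERDAS workflow described in this report.

```python
# Hypothetical sketch: mask wetland cells out of one image band.
# NODATA = 0 is an assumed background value, not taken from the report.
NODATA = 0


def mask_wetlands(band, wetland):
    """Return a copy of `band` with every wetland cell set to NODATA.

    `band` and `wetland` are row-major nested lists of the same size;
    a non-zero wetland value marks a wetland cell.
    """
    same_rows = len(band) == len(wetland)
    same_cols = all(len(b) == len(w) for b, w in zip(band, wetland))
    if not (same_rows and same_cols):
        raise ValueError("band and wetland rasters must be exactly the same size")
    return [
        [NODATA if w else value for value, w in zip(band_row, wet_row)]
        for band_row, wet_row in zip(band, wetland)
    ]


# Tiny made-up example: two wetland cells are removed, upland cells survive.
band = [[52, 60, 61], [55, 58, 90]]
wetland = [[0, 1, 0], [0, 0, 1]]
upland = mask_wetlands(band, wetland)
print(upland)
```

The size check mirrors the requirement above that both files cover exactly the same rows and columns before masking.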
No significant phenological variations were noted within the study area for either the May or July imagery. Consequently, the study area was not stratified by elevation. The Oswegatchie-Black watershed is not characterized by dramatic elevational changes. Rather, it is a more complex "hummocky" landscape that is remote and well dissected by wetlands. The landscape does change at the western edge of the study area, particularly in the Brantingham quadrangle, but this area is limited and did not merit a separate classification.
Signatures developed from the Old Forge analysis were used for the southern Oswegatchie-Black scene. Additional signature samples for Open, Barren, and Conifer were taken in the Brantingham 7 1/2' quadrangle because it offered some areas large enough and pure enough for sampling. In addition, some of the conifer tree farms in this area have different signatures than other conifers in the study area, probably because the tree farms represent dense single-species parcels. Within-class and between-class signatures were examined with ELLIPSE. Signature samples were checked to ensure that they did not conflict with clouds or cloud shadow so that they could be used with both May and July imagery.
In previous signature and classification analyses (CMATRIX) on the Old Forge sample quadrangle, the Mahalanobis classifier using the thermal band seemed ideal for identifying Urban Open areas, while a Maximum Likelihood classifier using the thermal band excelled for Urban Mixed. However, classifications of the southern scene study area produced by either classifier with the thermal band were remarkably streaky, and Urban Mixed was too extensive. THRESH could neither remove the streakiness nor adequately limit the Urban Mixed class. Consequently, only reflective bands could be used in the image processing.
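For readers unfamiliar with the two decision rules, the following is a minimal sketch of their textbook forms for a two-band pixel, assuming Gaussian class signatures and equal prior probabilities. The two signatures (`tight` and `broad`) are invented for illustration and are not the signatures used in this study; the exact ERDAS implementation may differ.

```python
import math


def _inverse_2x2(cov):
    """Return (inverse, determinant) of a 2x2 covariance matrix."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det <= 0:
        raise ValueError("covariance matrix must be positive definite")
    return [[d / det, -b / det], [-c / det, a / det]], det


def mahalanobis_sq(pixel, mean, cov):
    """Squared Mahalanobis distance from a pixel to a class mean."""
    inverse, _ = _inverse_2x2(cov)
    dx = [pixel[0] - mean[0], pixel[1] - mean[1]]
    return (dx[0] * (inverse[0][0] * dx[0] + inverse[0][1] * dx[1])
            + dx[1] * (inverse[1][0] * dx[0] + inverse[1][1] * dx[1]))


def ml_score(pixel, mean, cov):
    """Gaussian log-likelihood up to a constant (times 2), equal priors assumed."""
    _, det = _inverse_2x2(cov)
    return -math.log(det) - mahalanobis_sq(pixel, mean, cov)


# Invented two-band signatures: (mean vector, covariance matrix).
SIGNATURES = {
    "tight": ((50.0, 50.0), [[4.0, 0.0], [0.0, 4.0]]),
    "broad": ((60.0, 60.0), [[100.0, 0.0], [0.0, 100.0]]),
}


def classify_mahalanobis(pixel):
    """Assign the class with the smallest Mahalanobis distance."""
    return min(SIGNATURES, key=lambda name: mahalanobis_sq(pixel, *SIGNATURES[name]))


def classify_ml(pixel):
    """Assign the class with the largest likelihood score."""
    return max(SIGNATURES, key=lambda name: ml_score(pixel, *SIGNATURES[name]))


pixel = (53.0, 53.0)
print(classify_mahalanobis(pixel), classify_ml(pixel))
```

The only difference between the two rules in this form is the log-determinant term, which penalizes classes with widely spread signatures. That is one reason the same signatures can yield different maps under the two classifiers.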
Numerous attempts were made to separate Urban Open and Urban Mixed classes because they were deemed so important for the anticipated analyses with this land cover data layer. One attempt used a supervised classification of Deciduous, Deciduous/Open, Mixed, Conifer, Cloud, and Cloud Shadow for the July bands. These classes were THRESHed into an Open class which was subsequently classified with an unsupervised classification. However, Urban Open and Urban Mixed were still not separable from Open or Barren or from one another.
The final classification process (Appendix IIA.1) entailed classifying the July reflective bands with the vegetated classes. Both the highest confidence and greatest area were in the vegetated and cloud classes. The vegetated/cloud classification was then THRESHed to produce the classes with less area and less confidence in the discrimination.
July reflective bands were classified with a Maximum Likelihood classifier to Deciduous, Deciduous/Open, Mixed, Conifer, and Open. Histograms for each class were examined by overlaying them on imagery for Old Forge, Brantingham, and Eagle Bay and performing interactive selections to determine pixels for inclusion in each class. All pixels were classed well except Conifer (confused with Cloud Shadow) and Open (which classed Clouds, Open, and Open with vegetation), both of which were expected. The image was masked with the Conifer and Open categories and re-classed to Conifer, July Cloud Shadow, Open, Barren, Urban Open, Urban Mixed, and July Clouds. Open was found to be Open with vegetation, as was Barren. Urban Open and Urban Mixed were Open (not vegetated); July Cloud was classified well. Urban Mixed was predominant (and incorrect) for a few pixels surrounding cloud rims and along rivers, lakes, and roads. Class histograms were compared in the Old Forge, Brantingham, and Eagle Bay quadrangles. Both Conifer and July Cloud Shadow contained pixels that should have been classed with the other, but these could be regrouped to appropriate bounds by interactive histogram selection.
Classified files were overlaid, and Open, Barren, and Urban Mixed were grouped to Open with vegetation. Urban Open was recoded to Open without vegetation. This file was scanned to remove the single "salt-and-pepper" pixels from analysis. The resultant GIS file was searched for July Cloud and Cloud Shadow with a 3-cell search. The 12-band image file was masked with the July Cloud/Cloud Shadow + 3 pixels file so that only areas obscured by July Cloud/Cloud Shadow remained.
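The 3-cell search and the subsequent mask can be sketched as follows. This is only an illustration under stated assumptions: the class codes are hypothetical, the buffer is taken as a square window of 3 cells in every direction (the report does not say how the ERDAS search measures distance), and rasters are held as nested lists.

```python
# Hypothetical class codes; the report does not list the numeric codes used.
JULY_CLOUD, JULY_SHADOW = 8, 9
NODATA = 0  # assumed background value


def buffer_mask(classes, targets, radius=3):
    """Flag every cell within `radius` cells (square window) of a target class."""
    rows, cols = len(classes), len(classes[0])
    mask = [[False] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if classes[r][c] not in targets:
                continue
            for rr in range(max(0, r - radius), min(rows, r + radius + 1)):
                for cc in range(max(0, c - radius), min(cols, c + radius + 1)):
                    mask[rr][cc] = True
    return mask


def keep_only(band, mask):
    """Keep band values where the mask is set; everything else becomes NODATA."""
    return [
        [value if flagged else NODATA for value, flagged in zip(band_row, mask_row)]
        for band_row, mask_row in zip(band, mask)
    ]


# One-row example: a single cloud cell grows by three cells on each side.
classes = [[1, 1, 1, 1, JULY_CLOUD, 1, 1, 1, 1]]
band = [[41, 42, 43, 44, 250, 46, 47, 48, 49]]
mask = buffer_mask(classes, {JULY_CLOUD, JULY_SHADOW})
under_cloud = keep_only(band, mask)
print(under_cloud)
```

Buffering before masking means that thin cloud edges, which are often misclassified, are also handed to the May imagery for reclassification.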
The image file masked by July Cloud/Cloud Shadow was classified using the May reflective bands to Deciduous, Deciduous/Open, Mixed, Conifer, and Open. Classification histograms were interactively compared in the Brantingham and Old Forge quadrangles. As expected, Deciduous/Open and May Cloud were confused, as were Conifer and May Cloud Shadow; these class boundaries were successfully redefined with histogram-based thresholding (THRESH). The July Cloud/Cloud Shadow image file was masked with the Open classification and then reclassified with the May reflective bands and the classes Open, Barren, Urban Open, Urban Mixed, and May Cloud. The classification was evaluated in Old Forge, Brantingham, and Eagle Bay, but these classes were extremely limited in the May imagery underlying the July clouds. May imagery was no more successful at discerning Urban categories than the July imagery. Open, Barren, and Urban Mixed were recoded to Open with vegetation; Urban Open to Open without vegetation; May Cloud to Cloud; and May Cloud Shadow to Cloud Shadow (i.e., Cloud or Cloud Shadow in both July and May imagery). All upland classifications derived from this image were overlaid and scanned to remove the "salt-and-pepper" pixels introduced by the May imagery classifications. This land use classification was combined with the wetland file to ensure that all wetlands were properly encoded and that only the Oswegatchie-Black study area was shown in the final GIS file. The Oswegatchie-Black study area was also extracted from the 12-band image file with wetlands masked.
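The recoding and the substitution of May results beneath July clouds amount to a per-cell lookup and overlay. The sketch below uses class names in place of numeric codes and takes the obscured-area mask as an input; the function name `fill_from_may` and the data layout are assumptions made for illustration only.

```python
# Recode table taken from the class groupings described in the text.
RECODE = {
    "Open": "Open with vegetation",
    "Barren": "Open with vegetation",
    "Urban Mixed": "Open with vegetation",
    "Urban Open": "Open without vegetation",
    "May Cloud": "Cloud",
    "May Cloud Shadow": "Cloud Shadow",
}


def fill_from_may(july, may, obscured):
    """Use the May label where July was obscured, then recode the chosen label.

    `july` and `may` are same-sized grids of class names; `obscured` is a
    boolean grid marking the buffered July Cloud/Cloud Shadow area.
    """
    merged = []
    for july_row, may_row, mask_row in zip(july, may, obscured):
        row = []
        for july_label, may_label, hidden in zip(july_row, may_row, mask_row):
            label = may_label if hidden else july_label
            row.append(RECODE.get(label, label))
        merged.append(row)
    return merged


# Made-up one-row example.
july = [["Conifer", "July Cloud", "July Cloud Shadow", "Urban Open"]]
may = [["Mixed", "Barren", "May Cloud", "Open"]]
obscured = [[False, True, True, False]]
merged = fill_from_may(july, may, obscured)
print(merged)
```

A cell ends up as Cloud or Cloud Shadow only when it was obscured in July and the May classification also found cloud or shadow there, which matches the parenthetical note above.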
Northern Oswegatchie-Black Image Classification
Vector wetland files matched the rasterized wetland and image files very well. However, the rasterized wetlands did not overlay the imagery well. Files had to be re-created so that both the LANDSAT files and the rasterized wetland files were exactly the same size with precisely the same coordinates. The wetland raster file was subset to fit the image file; both were then viewed together at 3X magnification, and the wetland/image match across the northern study area was checked. The match appeared to vary across the study area scene, and a best fit was determined. In this region, few ponds have a sharp upland/open water edge that can be used for precise registration verification. Wetlands were masked from the study area image file, and a 12-band reflective image file from the May and July scenes with wetlands masked out was created.
Signature sample areas were drawn onto a USGS topographic map based upon air photo perusal (the same air photos used for wetlands mapping). Signature samples were taken in the Oswegatchie quadrangle, with cloud samples in Eagle Bay and Nehasane Lake. Within-class signatures were evaluated with ELLIPSE. Signature samples were taken for Open with vegetation, Barren, Residential Open, Shrub, Conifer forest, Deciduous forest, Mixed forest, Open, Commercial, July Clouds, May Clouds, May Cloud Shadow, July Cloud Shadow, and Deciduous/Open (open-canopied deciduous, some cut over). Signature samples were grouped and between-class variability was examined. As a result, Barren and Commercial were grouped together. CMATRIX was used as a general assessment of confusion between signatures, and it showed that with the 12-band image file Shrub was easily mistaken for other features. Deciduous/Open on the southern image was developed purely from image processing analysis; it did not fall out as a distinct class on this image and was therefore added to Deciduous. Residential Open was frequently confused with Open with vegetation and was removed so that pixels could fall into either Open with vegetation or Barren. The distinction between Deciduous and Mixed was muddy, so more Deciduous and Mixed forest samples were taken. Shrub areas make poor training samples because shrubs on the aerial photos (1985) may no longer correspond to shrubs on the LANDSAT image (1990, 1992). Air photo identification of shrubs relies heavily on height and texture, neither of which is distinguishable on LANDSAT imagery.
An initial classification of the July reflective bands was conducted to identify potential problems. A Maximum Likelihood classifier was employed with the classes: Deciduous, Conifer, Mixed, Open with vegetation, Barren, July Clouds, and July Cloud Shadow. The resulting histograms were good and class distributions were checked against Newton Falls and Oswegatchie quadrangle maps and air photos. The classification was reasonable except for the distinction between Conifer and Cloud Shadow.
A Maximum Likelihood classification was also performed on the entire 12-band image file. Although CMATRIX indicated that this could produce better results than the July-only bands, visual inspection of the classification showed a heavier proportion of Mixed forest and Clouds. In addition, all class histograms made with the July image were outstanding, while those made with the 12-band file were poor and some were multimodal. No other classes, such as the southern scene's Deciduous/Open, became apparent during the processing.
A classification of the May reflective bands was also conducted. An accuracy assessment of this file was run to ensure that the processing would create a reasonable substitute of upland land cover for areas obscured by July clouds.
"Salt-and-pepper" pixels were removed from the July classification, and Cloud/Cloud Shadow was searched with a 3-cell buffer. The 12-band image file was masked with the buffered July Cloud/Cloud Shadow file, preserving just those areas beneath the clouds and cloud shadow. This file was classified using the May reflective bands to the classes Deciduous, Conifer, Mixed, Open with vegetation, Barren, May Clouds, and May Cloud Shadow using a Maximum Likelihood classifier. The resulting class histograms were beginning to degrade, however, because of the small sample size. A visual check indicated that excessive amounts of Cloud Shadow should have been classed as Conifer, while all other classes were acceptable. Cloud Shadow class limits were redefined using histogram thresholding, and the affected pixels were placed into a Conifer file. Some pixels misclassed as Cloud were sections of pixel dropout (a pixel value of 255 in one or more bands without corresponding ground or cloud features) near Eagle Bay, but these were not redefined since Cloud represents upland without LANDSAT-based land cover information.
All upland classifications were overlaid and then scanned to remove "salt-and-pepper" pixels introduced from the May classifications. This upland land cover classification was combined with the wetland raster file to ensure that all wetlands were properly encoded and to subset the Oswegatchie/Black study area. The Oswegatchie/Black study area was also extracted from the 12-band imagery file with wetlands masked.
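One simple way to picture the removal of isolated "salt-and-pepper" pixels is a 3x3 neighbourhood check. The sketch below is an assumption about the general technique rather than a description of the ERDAS scan routine: a cell whose class appears in none of its neighbours is replaced by the most common neighbouring class.

```python
from collections import Counter


def remove_isolated(classes):
    """Replace single isolated cells with the majority class of their neighbours.

    A cell counts as isolated when none of its (up to eight) neighbours shares
    its class. Decisions are made on the input grid, so one replacement never
    influences another.
    """
    rows, cols = len(classes), len(classes[0])
    cleaned = [row[:] for row in classes]
    for r in range(rows):
        for c in range(cols):
            neighbours = [
                classes[rr][cc]
                for rr in range(max(0, r - 1), min(rows, r + 2))
                for cc in range(max(0, c - 1), min(cols, c + 2))
                if (rr, cc) != (r, c)
            ]
            if neighbours and classes[r][c] not in neighbours:
                cleaned[r][c] = Counter(neighbours).most_common(1)[0][0]
    return cleaned


# Made-up example: a lone class-4 cell inside a block of class 1.
grid = [
    [1, 1, 1],
    [1, 4, 1],
    [1, 1, 1],
]
cleaned = remove_isolated(grid)
print(cleaned)
```

Patches of two or more adjacent cells of the same class are left untouched, so only single-pixel noise is affected.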
Recombinant Study Area
The northern and southern upland land cover GIS files were carefully examined in the overlap area (Figure IIA.1) using screen display, topographic maps with notes, and air photo analyses. Land cover classifications were evaluated in the Eagle Bay and Beaver River quadrangles in preparation for the eventual merging of the two classifications.
The north and south upland GIS files were stitched so that the south overwrote the north. The northern GIS land cover file was remade into a file the same size as the stitched image and then overlaid upon the stitched file, with northern Deciduous and Conifer dominating all southern classes except Deciduous/Open. The resulting file was recoded to form the final Oswegatchie/Black Upland Land Cover file (Table IIA.7).
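The precedence rules for the merge can be written as a per-cell decision. The following sketch assumes hypothetical numeric class codes with 0 as background (no data) and two grids already resampled to the same size; it illustrates the stated rules and is not the ERDAS procedure itself.

```python
# Hypothetical class codes; 0 is assumed to mean "no data" in either file.
BACKGROUND, DECIDUOUS, DECID_OPEN, MIXED, CONIFER = 0, 1, 2, 3, 4


def merge_cell(north, south):
    """Combine one northern and one southern cell using the stated precedence."""
    if south == BACKGROUND:
        return north  # outside the southern file: the north is all there is
    if north in (DECIDUOUS, CONIFER) and south != DECID_OPEN:
        return north  # northern Deciduous/Conifer dominate the south
    return south      # otherwise the south overwrites the north


def merge_files(north, south):
    """Apply merge_cell to two same-sized grids."""
    return [
        [merge_cell(n, s) for n, s in zip(north_row, south_row)]
        for north_row, south_row in zip(north, south)
    ]


# Made-up overlap strip.
north = [[DECIDUOUS, DECIDUOUS, MIXED, MIXED, BACKGROUND]]
south = [[MIXED, DECID_OPEN, CONIFER, BACKGROUND, MIXED]]
merged = merge_files(north, south)
print(merged)
```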
Table IIA.7. LISTIT output for Oswegatchie/Black Upland Land Cover GIS file.
Header listing for GIS file: OBUPLAND/OBUPLAND.GIS
This file has 5088 rows, and 2900 columns
This image is geo-referenced to a Transverse Mercator coordinate system
The cell size is (X, Y): 25, 25
Number of classes in this variable is: 10
Totals and Percentages are Based on Non-zero points
Accuracy Assessment
Accuracy assessment pixels were chosen based upon interpretation of 1:58000 1985-1986 NAPP air photos (used in Phase I of the project), field work, and field notes on USGS topographic maps. For the southern file, assessment pixels were chosen in the Eagle Bay, Limekiln Lake, Big Moose (May Clouds and Cloud Shadow), Stillwater Mountain (May Clouds and Cloud Shadow), Copper Lake (July Clouds and Cloud Shadow), and Wilmurt (July Clouds and Cloud Shadow) quadrangles. Since potential sampling sites for Open with vegetation and non-vegetated Open were extremely limited, a few pixels were taken from Old Forge for these categories that may have been used as training pixels. Accuracy assessment pixels were chosen for the northern file from the Newton Falls and the north half of the Number Four quadrangles while some Open with vegetation was sampled on Harrisville just outside the study area. July Cloud and Cloud Shadow pixels were obtained from Beaver River and Eagle Bay while May Cloud and Cloud Shadows were derived from the Big Moose quadrangle. Each file was evaluated independently and assessment pixel files were then combined to evaluate the unified GIS file.
The primary emphasis for accuracy assessment pixel selection was on areas that could be clearly identified on both the air photos and the LANDSAT imagery. A stratified random pixel file for field sites was not generated because of the access difficulty throughout most of this region. Identification of random pixels on air photos was not possible to a 25 meter accuracy (0.017 inches at photo center at nominal scale).
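The photo-scale figure quoted above follows from simple arithmetic: a ground distance divided by the scale denominator gives the distance on the photograph. A short check, using the 25 m cell size and the nominal 1:58000 photo scale given in this section:

```python
GROUND_METERS = 25.0         # LANDSAT cell size used in this study
SCALE_DENOMINATOR = 58000.0  # nominal air photo scale of 1:58000
MM_PER_INCH = 25.4

# Ground distance divided by the scale denominator gives distance on the photo.
photo_mm = GROUND_METERS / SCALE_DENOMINATOR * 1000.0
photo_inches = photo_mm / MM_PER_INCH

print(f"{photo_mm:.3f} mm = {photo_inches:.3f} inches on the photo")
```

One pixel therefore spans less than half a millimeter on the photograph, which is why individual random pixels could not be located reliably.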
Because areas beneath July Clouds/Cloud Shadow were re-classified with May imagery, no assessment for July Cloud/Cloud Shadow was included. Only areas that were obscured by Cloud/Cloud Shadow in both the May and July imagery remained classified as cloud. Since May clouds were not classified over the entire image, no May Cloud/Cloud Shadow accuracy was included in the final assessment. It should be noted that all Cloud/Cloud Shadow accuracy assessments showed high correspondence and increased overall accuracy.
An attempt was made to acquire a minimum of 50 pixel samples per class per scene, as suggested by Jensen (1996). However, neither the southern nor the northern image offered enough Open with vegetation areas.
No classification accuracy pixels were extracted for Deciduous/Open because this was a purely calculated class: derived solely because all classifications showed a very different signature for these areas. Accuracy assessment pixels for this class would unfairly skew the data.
The classification error matrix (Table IIA.8) and the classification accuracy report (Table IIA.9) summarize the results for the unified accuracy assessment file. The ERDAS classification accuracy routine can handle only 512 sample pixels, so the assessment file created by combining the north and south classification accuracy files had to be split and the results recalculated manually.
ERDAS (1990) offers the following definitions for the error matrix report:
"Columns: Columns in [the] report show how the reference pixels are actually classified in the GIS file. The column totals, at the bottom of the report, are the total number of reference pixels in each class, according to the reference values in the [classification accuracy assessment pixel file].
"Rows: Conversely, the rows show the number of tested pixels in the GIS file that were expected to be in each class, according to the reference pixels. The row totals, on the right side of the report, show the total number of tested pixels in each class, according to the GIS file.
"The diagonal of the error matrix shows how many of the tested pixels were classified as expected. These numbers are used in the accuracy report."
Table IIA.8. Classification Error Matrix for Oswegatchie/Black Upland Land Cover GIS File.* Rows are classified data; columns are reference data.
Classified Data | Deciduous | Mixed | Conifer | Open with Vegetation | Open |
Deciduous | 122 | 8 | 0 | 0 | 0 |
Mixed | 14 | 151 | 11 | 3 | 0 |
Conifer | 1 | 20 | 119 | 0 | 1 |
Open with Vegetation | 0 | 1 | 1 | 59 | 16 |
Open | 0 | 0 | 0 | 0 | 99 |
Column Total | 137 | 180 | 131 | 62 | 116 |
*Background, Deciduous/Open, Cloud, Cloud Shadow, and Wetland are not shown within the error matrix. Background and Wetland classes are not derived from image processing techniques; they are masks from other files. Deciduous/Open is a calculated class created from image processing observations. Consequently, any accuracy assessment pixels for this class would be generated from the classification itself, not field and air photo checks. An attempt was made to classify beneath July Cloud and Cloud Shadow with May imagery. Therefore these classes do not represent all Cloud and Cloud Shadow within the image and accuracy assessment pixels for these two classes could only be generated from the GIS file. Typically however, Cloud and Cloud Shadow were identified with a high degree of reliability. Cloud may also represent small areas of pixel dropout in the original imagery.
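The column totals and the overall accuracy can be rechecked directly from the error matrix. The sketch below copies the cell values from Table IIA.8 (rows are classified data, columns are reference data); the variable names are illustrative.

```python
CLASSES = ["Deciduous", "Mixed", "Conifer", "Open with Vegetation", "Open"]

# Cell values copied from Table IIA.8: one row per classified class,
# one column per reference class, in the order given by CLASSES.
ERROR_MATRIX = [
    [122, 8, 0, 0, 0],
    [14, 151, 11, 3, 0],
    [1, 20, 119, 0, 1],
    [0, 1, 1, 59, 16],
    [0, 0, 0, 0, 99],
]

size = len(CLASSES)
column_totals = [sum(row[j] for row in ERROR_MATRIX) for j in range(size)]
number_correct = [ERROR_MATRIX[i][i] for i in range(size)]  # the diagonal
total_tested = sum(column_totals)
overall_accuracy = 100.0 * sum(number_correct) / total_tested

print(dict(zip(CLASSES, column_totals)))
print(f"{sum(number_correct)} of {total_tested} correct = {overall_accuracy:.5f}%")
```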
The classification accuracy report is described by ERDAS (1990) as follows:
"Reference Totals: The reference totals are the total number of reference pixels in each class, according to the reference values in the [classification accuracy pixel] file.
"Classified Totals: The column totals show the total number of tested pixels in each class, according to the GIS file.
"Number Correct: These are the numbers of tested pixels in each class that were classified as expected. This column is equal to the diagonal of the error matrix.
"Producers Accuracy: The producers accuracy is the percentage of correctly classified pixels in each reference class. The numbers are derived by dividing the number correct by the reference total for each class.
"Users Accuracy: The users accuracy is the percentage of correctly classified pixels in each class of the GIS file. The numbers are derived by dividing the number correct by the classified total for each class.
"Overall Classification: The overall classification percentage is derived by dividing the total number correct by the total number of tested pixels."
Jensen (1996) suggests that the producer's accuracy evaluates the errors of exclusion. It shows how well an area can be classified and is a measure of probability that the reference pixel is correctly classified. He further notes that the user's accuracy is a measure of commission, illustrating the errors of inclusion, and shows the likelihood that a pixel classified on the map is a true representation of that class on the ground.
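Following the ERDAS definitions quoted above, both accuracy measures are simple ratios. The sketch below recomputes them from the reference totals, classified totals, and number-correct counts reported in Table IIA.9; the dictionary layout and function names are illustrative assumptions.

```python
# (reference total, classified total, number correct) as reported in Table IIA.9.
REPORT = {
    "Deciduous": (137, 130, 122),
    "Mixed": (180, 177, 151),
    "Conifer": (131, 143, 119),
    "Open with vegetation": (62, 77, 59),
    "Open": (116, 99, 99),
}


def producers_accuracy(reference_total, number_correct):
    """Percentage of reference pixels of a class that were classified correctly."""
    return 100.0 * number_correct / reference_total


def users_accuracy(classified_total, number_correct):
    """Percentage of pixels mapped to a class that truly belong to it."""
    return 100.0 * number_correct / classified_total


accuracy = {
    name: (producers_accuracy(ref, ok), users_accuracy(cls, ok))
    for name, (ref, cls, ok) in REPORT.items()
}

for name, (producers, users) in accuracy.items():
    print(f"{name:22s} producer's {producers:5.1f}%   user's {users:5.1f}%")
```

A class such as Open with vegetation shows why both measures matter: most reference pixels were found (high producer's accuracy), but a noticeable share of the pixels mapped to the class belong elsewhere (lower user's accuracy).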
Table IIA.9. Classification Accuracy Report for the Oswegatchie/Black Upland Land Cover GIS File.
Class Name | Reference Totals | Classified Totals | Number Correct | Producers Accuracy | Users Accuracy |
Deciduous | 137 | 130 | 122 | 89.1% | 93.8% |
Mixed | 180 | 177 | 151 | 83.9% | 85.3% |
Conifer | 131 | 143 | 119 | 90.8% | 83.2% |
Open with vegetation | 62 | 77 | 59 | 95.2% | 76.6% |
Open | 116 | 99 | 99 | 85.3% | 100.0% |
Totals | 626 | 626 | 550 | | |
Overall classification accuracy = 87.85942%