Watershed Analysis Lab 8 Lecture 14 Basin

Watershed Analysis Lab 8

  • Lecture 14


Basin

  • Creates a raster delineating all drainage basins.
  • All cells in the raster will belong to a basin, even if that basin is only one cell.
  • The drainage basins are delineated within the analysis window by identifying ridge lines between basins.
  • The input flow direction raster is analyzed to find all sets of connected cells that belong to the same drainage basin.
  • The drainage basins are created by locating the pour points at the edges of the analysis window (where water would pour out of the raster), as well as sinks, then identifying the contributing area above each pour point. This results in a raster of drainage basins.
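
The pour-point logic above can be sketched in a few lines of Python. This is a minimal illustration, not the ArcGIS implementation: it assumes ESRI-style D8 direction codes (1 = east, 2 = southeast, 4 = south, 8 = southwest, 16 = west, 32 = northwest, 64 = north, 128 = northeast), treats any other value as a sink, and uses a hypothetical function name `label_basins` and a toy grid.

```python
# D8 codes mapped to (row offset, column offset); rows increase southward.
D8_OFFSETS = {
    1: (0, 1), 2: (1, 1), 4: (1, 0), 8: (1, -1),
    16: (0, -1), 32: (-1, -1), 64: (-1, 0), 128: (-1, 1),
}

def label_basins(flow_dir):
    """Give every cell the ID of the pour point (edge outlet or sink) it drains to."""
    rows, cols = len(flow_dir), len(flow_dir[0])
    basin = [[0] * cols for _ in range(rows)]  # 0 means "not labeled yet"
    next_id = 1
    for r0 in range(rows):
        for c0 in range(cols):
            if basin[r0][c0]:
                continue
            path, on_path = [], set()
            r, c = r0, c0
            label = None
            while True:
                path.append((r, c))
                on_path.add((r, c))
                step = D8_OFFSETS.get(flow_dir[r][c])
                if step is None:
                    break  # sink: this cell is its own pour point
                nr, nc = r + step[0], c + step[1]
                if not (0 <= nr < rows and 0 <= nc < cols):
                    break  # water pours out of the analysis window here
                if (nr, nc) in on_path:
                    break  # cells flowing into each other: treat as a sink
                if basin[nr][nc]:
                    label = basin[nr][nc]  # joined an already labeled basin
                    break
                r, c = nr, nc
            if label is None:
                label = next_id
                next_id += 1
            for pr, pc in path:
                basin[pr][pc] = label
    return basin

# Toy flow direction raster: the top row drains east, the bottom row drains west.
toy = [[1, 1, 1],
       [16, 16, 16]]
print(label_basins(toy))  # [[1, 1, 1], [2, 2, 2]]
```

Every cell ends up with a basin ID, so a cell that drains nowhere else forms a one-cell basin, as the slide notes.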

Lab 8 Data

  • The report sheet is on the website with the instructions.
  • Three files will be downloaded:
    • 2 from the NY GIS Clearinghouse
    • 1 from Cornell University
  • The DEM will be obtained from ArcGIS Online.

Step 13 – DEM (30-second Arc)


Measuring in Arc-Seconds

  • Some USGS DEM data are stored in a format that uses 3, 5, or 30 arc-seconds of longitude and latitude to register cell values.
  • The geographic reference system treats the globe as if it were a sphere divided into 360 equal parts called degrees.

  • Each degree is subdivided into 60 minutes. Each minute is composed of 60 seconds.
  • An arc-second of latitude spans a nearly constant ground distance, while the ground distance spanned by an arc-second of longitude shrinks in proportion to the cosine of the latitude as one moves toward the Earth's poles.
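
The cosine relationship can be made concrete with a short calculation. The sketch below assumes a spherical Earth with a mean radius of 6,371 km (an approximation; real datums use an ellipsoid), and the function name and the sample latitude of 42.5° are illustrative only.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius; spherical approximation

def arc_second_size_m(latitude_deg, arc_seconds=1.0):
    """Return the (north-south, east-west) ground size in meters of a cell
    that is `arc_seconds` wide at the given latitude."""
    angle_rad = math.radians(arc_seconds / 3600.0)  # 3600 arc-seconds per degree
    north_south = EARTH_RADIUS_M * angle_rad
    east_west = north_south * math.cos(math.radians(latitude_deg))
    return north_south, east_west

for lat in (0, 42.5, 60):
    ns, ew = arc_second_size_m(lat, arc_seconds=30)
    print(f"latitude {lat:>4}: 30 arc-seconds = {ns:.0f} m N-S, {ew:.0f} m E-W")
```

Under these assumptions a 30 arc-second cell is roughly 927 m tall everywhere, but only about half that wide at 60° latitude, which is why a raster in geographic coordinates has to be projected and resampled before cell sizes can be expressed in meters.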

Processing of DEM

  • Raster clip – to the buffered park boundary
  • Raster projection – from Geographic to UTM Zone 18N, NAD83
  • Resample – bilinear interpolation
  • Change the cell size – from 30 arc-seconds to 30 meters

Raster Geometry and Resampling

  • Data must often be resampled when converting between coordinate systems or changing the cell size of a raster data set.
  • Common methods:
    • Nearest neighbor
    • Bilinear interpolation
    • Cubic convolution

Resampling Bilinear Interpolation

  • Distance-weighted averaging
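
As a rough illustration of distance-weighted averaging, the sketch below interpolates a value from the four nearest cell centers, weighting each by how close the target location is to it. The function name `bilinear`, the grid-index coordinate convention, and the sample elevations are assumptions for this example, not the ArcGIS resampling code.

```python
def bilinear(grid, x, y):
    """Interpolate `grid` (a list of rows) at fractional column x and row y.
    Assumes 0 <= x <= last column index and 0 <= y <= last row index."""
    x0, y0 = int(x), int(y)
    x1 = min(x0 + 1, len(grid[0]) - 1)
    y1 = min(y0 + 1, len(grid) - 1)
    fx, fy = x - x0, y - y0  # fractional distances from the upper-left cell
    top = grid[y0][x0] * (1 - fx) + grid[y0][x1] * fx
    bottom = grid[y1][x0] * (1 - fx) + grid[y1][x1] * fx
    return top * (1 - fy) + bottom * fy

elevations = [[10.0, 20.0],
              [30.0, 40.0]]
print(bilinear(elevations, 0.5, 0.5))   # 25.0
print(bilinear(elevations, 0.25, 0.0))  # 12.5
```

Because the output is a weighted average, bilinear interpolation suits continuous surfaces such as elevation; it would produce meaningless in-between values for categorical rasters.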

Step 17 - DEM

  • Hydrological Modeling flowchart (labels as shown on the slide): DEM_UTM83, FlowDir, Sinks, Filled, FlowDir2, FlowAcc, Reclass, Net, Source, Watershed, Con, Stream Link

Flow Direction

  • Creates a raster of flow direction from each cell to its steepest downslope neighbor.
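
A minimal sketch of the steepest-downslope rule follows. It assumes ESRI-style D8 output codes (1 = east, 2 = southeast, 4 = south, 8 = southwest, 16 = west, 32 = northwest, 64 = north, 128 = northeast), uses 0 for a cell with no lower neighbor, and ignores the special edge handling of the real tool; the function name and the toy DEM are hypothetical.

```python
import math

# (code, row offset, column offset); rows increase southward.
D8 = [(1, 0, 1), (2, 1, 1), (4, 1, 0), (8, 1, -1),
      (16, 0, -1), (32, -1, -1), (64, -1, 0), (128, -1, 1)]

def flow_direction(dem, cell_size=1.0):
    """Return a grid of D8 codes pointing to each cell's steepest downslope neighbor."""
    rows, cols = len(dem), len(dem[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            best_code, best_slope = 0, 0.0
            for code, dr, dc in D8:
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols:
                    # Diagonal neighbors are farther away by a factor of sqrt(2).
                    distance = cell_size * math.hypot(dr, dc)
                    slope = (dem[r][c] - dem[nr][nc]) / distance
                    if slope > best_slope:
                        best_code, best_slope = code, slope
            out[r][c] = best_code
    return out

dem = [[9, 8, 7],
       [8, 6, 4],
       [7, 4, 1]]
for row in flow_direction(dem):
    print(row)
```

In this toy DEM the center cell drains southeast (code 2) because the drop of 5 over a diagonal distance is steeper than the drop of 2 to the east or south, and the lowest corner cell receives code 0 because it has no lower neighbor.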


Sink

  • A sink is a cell or set of spatially connected cells whose flow direction cannot be assigned one of the eight valid values in a flow direction raster.
  • This can occur when all neighboring cells are higher than the processing cell or when two cells flow into each other, creating a two-cell loop.
  • To create an accurate representation of flow direction and, therefore, accumulated flow, it is best to use a dataset that is free of sinks.
  • A digital elevation model (DEM) that has been processed to remove all sinks is called a depressionless DEM.
  • High-pass filters return:
    • Small values where cell values change smoothly
    • Large positive values when centered on a spike
    • Large negative values when centered on a pit
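
One simple form of high-pass filter subtracts the mean of the eight neighbors from the center cell. The sketch below uses that form only as an assumption (the slide does not give the kernel), with hypothetical 3 × 3 test windows, to show the three behaviors listed above.

```python
def high_pass(grid, r, c):
    """Center value minus the mean of its eight neighbors (interior cells only)."""
    neighbors = [grid[r + dr][c + dc]
                 for dr in (-1, 0, 1) for dc in (-1, 0, 1)
                 if (dr, dc) != (0, 0)]
    return grid[r][c] - sum(neighbors) / len(neighbors)

smooth = [[10, 11, 12], [11, 12, 13], [12, 13, 14]]  # gently sloping surface
spike = [[10, 10, 10], [10, 50, 10], [10, 10, 10]]   # isolated high cell
pit = [[10, 10, 10], [10, 2, 10], [10, 10, 10]]      # isolated low cell (a sink)

print(high_pass(smooth, 1, 1))  # 0.0
print(high_pass(spike, 1, 1))   # 40.0
print(high_pass(pit, 1, 1))     # -8.0
```

A large negative response is therefore a quick way to flag candidate sinks before running a fill operation.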

Step 18 - Sinks

Fill

  • Fills sinks in a surface raster to remove small imperfections in the data.
  • Sinks (and peaks) are often errors due to the resolution of the data or rounding of elevations to the nearest integer value.

Step 19 - Fill


Flow Accumulation

  • From ArcGIS 10 Desktop Help

Flow Accumulation Step 21 before inverting


After inverting



  • The results of Flow Accumulation can be used to create a stream network by applying a threshold value to select cells with a high accumulated flow.
  • For example, to create a raster where the value 1 represents the stream network on a background of NoData, perform a conditional operation with the Con tool using the following settings:
    • Input conditional raster : Flowacc
    • Expression : Value > 50000
    • Input true raster or constant : 1
  • From ArcGIS 10 Desktop Help
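
The conditional operation can be mimicked in plain Python to show what the output raster contains. This is only a sketch of the logic, not a call to the Con tool: `None` stands in for NoData, and the function name and sample accumulation values are made up for illustration.

```python
NODATA = None  # stand-in for NoData in this sketch

def con_threshold(flow_acc, threshold=50000):
    """Return 1 where accumulated flow exceeds the threshold, NoData elsewhere."""
    return [[1 if value is not NODATA and value > threshold else NODATA
             for value in row]
            for row in flow_acc]

flow_acc = [[120, 60000, 75000],
            [30, 400, 90000]]
print(con_threshold(flow_acc))  # [[None, 1, 1], [None, None, 1]]
```

Lowering the threshold produces a denser stream network, so the value is a modeling choice rather than a fixed constant.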

Stream Link

  • Assigns unique values to sections of a raster linear network between intersections.
  • Links are the sections of a stream channel connecting two successive junctions, a junction and the outlet, or a junction and the drainage divide.
  • From ArcGIS 10 Desktop Help

Watershed 50K


Watershed Clipped to Park Boundary


Lecture 14 Data Quality Issues

  • Ch. 14


Introduction

  • Spatial data and analysis standards are important because of the range of organizations producing and using spatial data, and the amount of data transferred between these organizations.
  • There are several types of standards:
    • Data standards
    • Interoperability standards
    • Analysis standards
    • Professional and certification standards

Introduction (continued)

  • National and international standards organizations are important in defining and maintaining geospatial standards:
    • Federal Geographic Data Committee (FGDC), which focuses on the national spatial data infrastructure (www.fgdc.gov)
    • International Spatial Data Standards Commission, which is a clearinghouse and gateway for international standards
    • Open Geospatial Consortium (OGC), which is developing interoperability standards; the Web Map Service (WMS) standard is an example.

The Geospatial Competency Model


GIS Professional Certification

  • URISA is the founding member of the GIS Certification Institute, the organization that administers professional certification for the field and is dedicated to advancing the industry.
  • The minimum number of points needed to become a certified GIS Professional, as detailed in the three point schedules, is 150. All applicants are therefore expected to document achievements valued at a minimum of 150 points.
  • To ensure that applicants have a broad foundation, specific minimums in each of the three achievement categories must be met or exceeded:
    • Education: 30 points
    • Experience: 60 points
    • Contributions: 8 points
  • The additional 52 points can be counted from any of the three categories.

University Certificates

  • UMM – undergraduate
  • USM – undergraduate/graduate
  • UM – graduate
  • Penn State – graduate
  • University of Denver
  • University of Southern California
  • George Mason University

Spatial Data Standards

  • Data – measurements and observations
  • Data quality – a measure of the fitness for use of data for a particular task (Chrisman, 1994).
  • It is the responsibility of the user to ensure that the data are fit for the task.
  • Metadata – data about the data

Spatial Data Standards

  • Spatial Data Standards – methods for structuring, describing and delivering spatially-referenced data.
  • Media Standards – the physical form of the data (CD/download etc).
  • Format Standards – specify data file components and structures. These standards aid in data transfer.
  • Spatial Data Accuracy Standards – document the quality of the positional and attribute accuracy.
  • Document Standards – define how we describe spatial data.

GIS Is Not Perfect

  • A GIS cannot perfectly represent the world for many reasons, including:
    • The world is too complex and detailed.
    • The data structures or models (raster, vector, or TIN) used by a GIS to represent the world are not discriminating or flexible enough.
    • We make decisions (how to categorize data, how to define zones) that are not always fully informed or justified.
  • It is impossible to make a perfect representation of the world, so uncertainty is inevitable.
  • Uncertainty degrades the quality of a spatial representation.

Concepts Related to Data Quality

  • Related to individual data sets:
    • Errors – flaws in data
    • Accuracy – the extent to which an estimated value approaches the true value.
    • Precision – the recorded level of detail of your data.
    • Bias – the systematic variation of the data from reality.

Concepts Related to Data Quality

  • Related to source data:
    • Resolution – the smallest feature in the data set that can be displayed.
    • Generalization – simplification of objects in the real world to produce scale models and maps.
  • Resolution and generalization of raster datasets
  • Scale-related generalization

Data Sets Used for Analysis

  • Must be:
    • Complete – spatially and temporally
    • Compatible – same scale, units of measure, measurement level
    • Consistent – both within and between data sets.
    • Applicable – appropriate for the analysis being performed.

A Conceptual View of Uncertainty

  • Real World → Conception → Source Data, Measurements & Representation → Data Conversion and Analysis → Result, with error propagation along the way

Uncertainty in The Conception of Geographic Phenomena

  • Many spatial objects are not well defined, or their definition is to some extent arbitrary, so that people can reasonably disagree about whether a particular object is x or not. There are at least four types of conceptual uncertainty:
    • Spatial uncertainty
    • Vagueness
    • Ambiguity
    • Regionalization problems

Spatial uncertainty

  • Spatial uncertainty occurs when objects do not have a discrete, well-defined extent.
  • They may have indistinct boundaries.
  • They may have impacts that extend beyond their boundaries.
  • They may simply be statistical entities.
  • The attributes ascribed to spatial objects may also be subjective.

Vagueness (obscureness)

  • Vagueness occurs when the criteria that define an object as x are not explicit or rigorous.
  • For example:
    • In a land cover analysis, how many oaks (or what proportion of oaks) must be found in a tract of land to qualify it as oak woodland?
    • What incidence of crime (or resident criminals) defines a high crime neighborhood?


Ambiguity

  • Ambiguity occurs when y is used as a substitute, or indicator, for x because x is not available.
  • The link between direct indicators and the phenomena for which they substitute is straightforward and fairly unambiguous (soil nutrients for crop yield).
  • Indirect indicators tend to be more ambiguous and opaque (wetlands as an indicator of species diversity).
  • Of course, indicators are not simply direct or indirect; they occupy a continuum. The more indirect they are, the greater the ambiguity.

Regionalization problems

  • Regional geography is largely founded on the creation of a mosaic of zones that make it easy to portray spatial data distributions.
  • A uniform zone is defined by the extent of a common characteristic, such as climate, landform, or soil type.
  • Functional zones are areas that delimit the extent of influence of a facility or feature, for example, how far people travel to a shopping center or the geographic extent of support for a football team.
  • Regionalization problems occur because zones are artificial.

Uncertainty in the measurement of geographic phenomena

  • Error occurs in physical measurement of objects. This error creates further uncertainty about the true nature of spatial objects.

Physical measurement error

  • Instruments and procedures used to make physical measurements are not perfectly accurate.
  • In addition, the Earth is not a perfectly stable platform from which to make measurements. Seismic motion, continental drift, and the wobbling of the Earth's axis cause physical measurements to be inexact. Examples include GPS error and remote sensing error.

Digitizing Error

  • A great deal of spatial data has been digitized from paper maps.
  • Any digitized map requires considerable post-processing:
    • Check for missing features
    • Connect lines
    • Remove spurious polygons
  • Some of these steps can be automated.

Error caused by combining data sets with different lineages

  • Data sets produced by different agencies or vendors may not match because different processes were used to capture or automate the data.
    • For example, buildings in one data set may appear on the opposite side of the street in another data set.
    • Error may also be caused by combining sample and population data or by using sample estimates that are not robust at fine scales.
    • "Lifestyle" data are derived from shopping surveys and provide business and service planners with up-to-date socioeconomic data not found in traditional data sources like the census. Yet the methods by which lifestyle data are gathered and aggregated to zones or are compared to census data may not be scientifically rigorous

Uncertainty in the representation of geographic phenomena

  • Representation is closely related to measurement.
  • Representation is not just an input to analysis, but sometimes also the outcome of it. For this reason, we consider representation separately from measurement.
    • The world is infinitely complex, but computer systems are finite.
    • Representation is all about the choices that are made in capturing knowledge about the world.
    • Uncertainty in the earth model: ellipsoid models, datum, projection types
    • Uncertainty in the raster data model (structure)
    • Uncertainty in the vector data model (structure)

Uncertainty in the raster data structure

  • The raster structure partitions space into square cells of equal size.
  • Spatial objects x, y, and z emerge from cell classification, in which Cell A1 is classified as x, Cell A2 as y, Cell A3 as z, and so on, until all cells are evaluated.
  • A spatial object x can be defined as a set of contiguous cells classified as x.
  • However, not all of the area covered by a cell classified as x is necessarily x.
  • These impure cells are termed mixed pixels or "mixels."
  • Because a cell can hold only one value, a mixel must be classified as if it were all one thing or another. Therefore, the raster structure may distort the shape of spatial objects.
  • Raster – The Mixed Pixel Problem
    • Land cover map with two classes, land or water
    • Cell A is straightforward
    • What category should be assigned to B, C, or D?

Error in raster

  • Because of the distortions due to flattening, cells in a raster can never be perfectly equal in size on the Earth's surface.
  • When information is represented in raster form, all detail about variation within cells is lost, and each cell is given a single value. Common assignment rules are largest share, central point (e.g., USGS DEM), and mean value (e.g., remote sensing imagery).
  • Figure: cells that straddle areas with values 8 and 6, assigned by largest share, central point, and mean value. The mean values are area-weighted:
    • 8 × (1/6) + 6 × (5/6) = 6.33
    • 8 × (3/4) + 6 × (1/4) = 7.5
    • 8 × (1/7) + 6 × (6/7) = 6.29
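
The largest-share and mean-value rules can be reproduced with a few lines of Python. The sketch below takes the area fractions quoted on the slide (1/6, 3/4, and 1/7 of the cell holding the value 8) as given; the function names are illustrative, and the central-point rule is left out because it depends on the cell geometry, which the slide text does not preserve.

```python
from collections import defaultdict

def largest_share(parts):
    """Value covering the largest fraction of the cell; parts = [(value, fraction), ...]."""
    shares = defaultdict(float)
    for value, fraction in parts:
        shares[value] += fraction
    return max(shares, key=shares.get)

def mean_value(parts):
    """Area-weighted mean of the values inside the cell."""
    total = sum(fraction for _, fraction in parts)
    return sum(value * fraction for value, fraction in parts) / total

cells = [
    [(8, 1 / 6), (6, 5 / 6)],
    [(8, 3 / 4), (6, 1 / 4)],
    [(8, 1 / 7), (6, 6 / 7)],
]
for parts in cells:
    print(largest_share(parts), round(mean_value(parts), 2))
# 6 6.33
# 8 7.5
# 6 6.29
```

The two rules disagree whenever a cell is mixed, which is exactly the within-cell detail that the raster representation discards.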
  • Figure 10.8 Problems with remotely sensed imagery: (left) example of a satellite image with cloud cover (A), shadows from topography (B), and shadows from cloud cover (C); (right) an urban area showing a building leaning away from the camera
  • Source: Ian Bishop (left) and Google UK (right)

Uncertainty in the vector data structure

  • Socioeconomic data (facts about people, houses, and households) are often best represented as points.
  • For various reasons (to protect privacy, to limit data volume), data are usually aggregated and reported at a zonal level, such as census tracts or ZIP Codes.
  • This distorts the data in two ways:
    • First, it gives them a spatially inappropriate representation (polygons instead of points);
    • Second, it forces the data into zones whose boundaries may not respect natural distribution patterns.

Map Representation Error

| Map scale | Ground distance, accuracy, or resolution (corresponding to 0.5 mm map distance) |
|---|---|
| 1:1,250 | 0.625 m |
| 1:2,500 | 1.25 m |
| 1:5,000 | 2.5 m |
| 1:10,000 | 5 m |
| 1:24,000 | 12 m |
| 1:50,000 | 25 m |
| 1:100,000 | 50 m |
| 1:250,000 | 125 m |
| 1:1,000,000 | 500 m |
| 1:10,000,000 | 5 km |
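
Each ground distance in the table is the 0.5 mm map distance multiplied by the scale denominator. A small sketch of that arithmetic, with a hypothetical function name:

```python
def ground_distance_m(scale_denominator, map_distance_mm=0.5):
    """Ground distance in meters represented by a map distance in millimeters."""
    return scale_denominator * map_distance_mm / 1000.0

for denominator in (1_250, 24_000, 1_000_000):
    print(f"1:{denominator:,} -> {ground_distance_m(denominator)} m")
# 1:1,250 -> 0.625 m
# 1:24,000 -> 12.0 m
# 1:1,000,000 -> 500.0 m
```

The same function shows why data digitized from a 1:250,000 map should not be expected to locate features to better than about 125 m.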

Uncertainty in the data conversion and analysis of geographic phenomena

  • Uncertainties in data lead to uncertainties in the results of analysis. Data conversion and spatial analysis methods can create further uncertainty:
    • Data conversion error
    • Georeferencing and resampling
    • Projection and datum conversions
    • The ecological fallacy
    • The modifiable areal unit problem (MAUP)
    • Classification errors

Ecological Fallacy Example


MAUP Example


Classification error and quality check

  • Figure: selecting ROIs (regions of interest) for Alfalfa, Cotton, Grass, and Fallow
  • Figure: classification result. Background: ETM+, 7/15/01. Top image: IKONOS, Oct. 2000

Confusion Matrix

Rows are the classes assigned by the classifier; columns are the ground truth classes.

| Classified \ Ground truth | Grass | Alfalfa | Cotton | Chili | Fallow (corn) | Total | User accuracy (%) |
|---|---|---|---|---|---|---|---|
| Grass | 110 | 22 | 0 | 0 | 0 | 132 | 83.3 |
| Alfalfa | 5 | 105 | 0 | 0 | 0 | 110 | 95.5 |
| Cotton | 0 | 0 | 945 | 5 | 0 | 950 | 99.5 |
| Chili | 0 | 0 | 50 | 42 | 0 | 92 | 45.7 |
| Fallow | 0 | 0 | 0 | 0 | 484 | 484 | 100 |
| Total | 115 | 127 | 995 | 47 | 484 | 1768 | |
| Producer accuracy (%) | 95.6 | 82.7 | 95.0 | 89.4 | 100 | | |

  • Correctly classified pixels (sum of the diagonal): 1686
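
The per-class accuracies can be recomputed directly from the pixel counts. The sketch below assumes that rows hold the classifier's labels and columns hold the ground truth, which is the orientation implied by the row and column totals; the function names are illustrative, and because the values are recomputed from the counts they may differ slightly from percentages printed on the slide.

```python
CLASSES = ["Grass", "Alfalfa", "Cotton", "Chili", "Fallow"]

# Rows: class assigned by the classifier. Columns: ground truth class.
MATRIX = [
    [110, 22, 0, 0, 0],
    [5, 105, 0, 0, 0],
    [0, 0, 945, 5, 0],
    [0, 0, 50, 42, 0],
    [0, 0, 0, 0, 484],
]

def user_accuracy(matrix):
    """Percent of pixels labeled as each class that truly belong to it (row-wise)."""
    return [100.0 * matrix[i][i] / sum(matrix[i]) for i in range(len(matrix))]

def producer_accuracy(matrix):
    """Percent of ground truth pixels of each class labeled correctly (column-wise)."""
    return [100.0 * matrix[j][j] / sum(row[j] for row in matrix)
            for j in range(len(matrix))]

for name, ua, pa in zip(CLASSES, user_accuracy(MATRIX), producer_accuracy(MATRIX)):
    print(f"{name:8s} user {ua:5.1f}%   producer {pa:5.1f}%")
```

Chili stands out: most ground truth chili pixels are found (high producer accuracy), but more than half of the pixels labeled chili are actually cotton (low user accuracy).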

Bases of Confusion Matrix

  • Producer accuracy is the probability that the classifier has labeled an image pixel as Class A given that the ground truth is Class A.
  • User accuracy is the probability that a pixel is Class A given that the classifier has labeled the pixel as Class A.
  • Overall accuracy is the total classification accuracy: the share of all pixels that were classified correctly.
  • The Kappa index (another parameter for overall accuracy) is a more useful index for evaluating accuracy because it discounts the agreement expected by chance.
  • Errors of commission represent pixels that belong to another class but are labeled as belonging to the class.
  • Errors of omission represent pixels that belong to the ground truth class but that the classification technique has failed to assign to that class.
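
These summary measures can be sketched as follows, using the counts from the lecture's crop classification example. The code assumes rows are classifier labels and columns are ground truth, and it uses the standard Cohen's kappa formula, (observed − expected) / (1 − expected); the function names are hypothetical.

```python
# Rows: class assigned by the classifier. Columns: ground truth class.
# Class order: Grass, Alfalfa, Cotton, Chili, Fallow.
MATRIX = [
    [110, 22, 0, 0, 0],
    [5, 105, 0, 0, 0],
    [0, 0, 945, 5, 0],
    [0, 0, 50, 42, 0],
    [0, 0, 0, 0, 484],
]

def overall_accuracy(matrix):
    """Share of all pixels on the diagonal (correctly classified)."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total

def kappa(matrix):
    """Cohen's kappa: agreement beyond what row and column totals predict by chance."""
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    observed = overall_accuracy(matrix)
    expected = sum(sum(matrix[i]) * sum(row[i] for row in matrix)
                   for i in range(n)) / total ** 2
    return (observed - expected) / (1 - expected)

def commission_error(matrix, i):
    """Share of pixels labeled class i that belong to another class."""
    return 1 - matrix[i][i] / sum(matrix[i])

def omission_error(matrix, j):
    """Share of ground truth class j pixels that were labeled as something else."""
    return 1 - matrix[j][j] / sum(row[j] for row in matrix)

print(f"overall accuracy: {overall_accuracy(MATRIX):.3f}")
print(f"kappa:            {kappa(MATRIX):.3f}")
print(f"chili commission: {commission_error(MATRIX, 3):.3f}")
print(f"chili omission:   {omission_error(MATRIX, 3):.3f}")
```

For these counts the overall accuracy is about 95.4% and kappa is about 0.92. Commission error is the complement of user accuracy, and omission error is the complement of producer accuracy.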

Finding and Modeling Errors

  • Checking for errors
    • Visual inspection during data editing and cleaning.
    • Attributes can be checked by using annotation, line colors and patterns.
    • Double digitizing
    • Statistical analysis may identify extreme values of attributes.

Finding and Modeling Errors

  • Error modeling
    • 1. Epsilon modeling
      • Based on a method of line generalization, and adapted by Blakemore.
      • It places an error band around a digitized line, describing the probable distribution of error.
      • Error distribution is subject to debate:
        • Normal curve
        • Piecewise quartile distribution
        • Bimodal
      • The epsilon band can be used in analyses to improve the confidence of the user in the result.
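
A very reduced sketch of the epsilon band idea: treat the band as all locations within a distance epsilon of the digitized line, and flag points that fall inside it as uncertain. This ignores the shape of the error distribution discussed above, and the function names, coordinates, and epsilon value are assumptions for illustration.

```python
import math

def distance_to_segment(p, a, b):
    """Shortest distance from point p to the line segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0:  # degenerate segment: a and b coincide
        return math.hypot(px - ax, py - ay)
    # Position of the closest point along the segment, clamped to [0, 1].
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / length_sq))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def within_epsilon_band(point, line, epsilon):
    """True if the point lies within epsilon of any segment of the digitized line."""
    return any(distance_to_segment(point, a, b) <= epsilon
               for a, b in zip(line, line[1:]))

boundary = [(0, 0), (10, 0), (10, 10)]
print(within_epsilon_band((5, 0.4), boundary, epsilon=0.5))  # True
print(within_epsilon_band((5, 3.0), boundary, epsilon=0.5))  # False
```

A point inside the band cannot be assigned to one side of the boundary with confidence, which is the basis for graded categories of containment in point-in-polygon tests.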
  • Figure 10.17 Point-in-polygon categories of containment
  • Source: Blakemore (1984)

Finding and Modeling Errors

  • Error modeling
    • 2. Monte Carlo simulation – used in overlays.
      • Simulates input data error by adding random noise to the line coordinates of the map data.
      • Each input is assumed to be characterized by an estimate of positional error.
      • This changes the shape of the line.
      • The process is repeated multiple times and the randomized data put through the GIS analyses.
      • Output:
        • A number
        • A map
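
The steps above can be sketched with the standard library. This example assumes independent Gaussian positional error with a standard deviation `sigma` on every vertex coordinate, and uses polygon area as a stand-in for the GIS analysis; the parcel, the error size, and the function names are hypothetical.

```python
import random
import statistics

def polygon_area(vertices):
    """Area of a simple polygon by the shoelace formula."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(vertices, vertices[1:] + vertices[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0

def simulate_area(vertices, sigma, runs=1000, seed=42):
    """Perturb every vertex with random noise, repeat, and summarize the areas."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    areas = []
    for _ in range(runs):
        noisy = [(x + rng.gauss(0, sigma), y + rng.gauss(0, sigma))
                 for x, y in vertices]
        areas.append(polygon_area(noisy))
    return statistics.mean(areas), statistics.stdev(areas)

parcel = [(0, 0), (100, 0), (100, 100), (0, 100)]  # 100 m x 100 m square
mean_area, sd_area = simulate_area(parcel, sigma=2.0)
print(f"simulated area: {mean_area:.0f} +/- {sd_area:.0f} m^2 (error-free: 10000 m^2)")
```

Here the output is a number (a mean with a spread); running the same loop over a raster or overlay result would instead yield a map of how often each location changes.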
  • Figure 10.18 Simulating effects of DEM error and algorithm uncertainty on derived stream networks

Managing GIS Error

  • To manage errors we must track and document them.
  • The concepts introduced earlier (accuracy, precision, resolution, generalization, bias, compatibility, completeness, and consistency) provide a checklist of quality indicators.
  • These should be documented for each data layer.

Managing GIS Error

  • Data quality information can be used to create a data lineage: a record of the data history that presents essential information about the development of the data.
  • This becomes the metadata.

Living with uncertainty

  • Uncertainty is inevitable and easier to find.
  • Use metadata to document the uncertainty.
  • Use sensitivity analysis to find the impact of input uncertainty on output.
  • Rely on multiple sources of data.
  • Be honest and informative in reporting the results of GIS analysis.
  • The US Federal Geographic Data Committee lists five components of data quality: attribute accuracy, positional accuracy, logical consistency, completeness, and lineage (for details, see www.fgdc.gov).

Basics of FGDC

  • Federal Geographic Data Committee (FGDC) metadata answers the who, what, where, when, how and why questions of geospatial data.
  • The data structure and elements defined for FGDC metadata are described fully in the “Content Standard for Digital Geospatial Metadata” (CSDGM).


  • The FGDC Content Standard for Digital Geospatial Metadata (CSDGM) organizes a metadata record into seven main sections:
    • Identification Information
    • Data Quality Information
    • Spatial Data Organization Information
    • Spatial Reference Information
    • Entity and Attribute Information
    • Distribution Information
    • Metadata Reference Information
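
One way to use the seven sections is as a completeness checklist for a metadata record. The sketch below represents a record as a plain Python dictionary keyed by section name; this layout, the function name, and the sample record are assumptions for illustration and not the CSDGM encoding itself.

```python
CSDGM_SECTIONS = [
    "Identification Information",
    "Data Quality Information",
    "Spatial Data Organization Information",
    "Spatial Reference Information",
    "Entity and Attribute Information",
    "Distribution Information",
    "Metadata Reference Information",
]

def missing_sections(record):
    """Return the main sections that are absent or empty in a metadata record."""
    return [name for name in CSDGM_SECTIONS if not record.get(name)]

# Hypothetical, partly filled record for a projected DEM.
record = {
    "Identification Information": {"title": "Park DEM (hypothetical)"},
    "Spatial Reference Information": {"projection": "UTM Zone 18N, NAD83"},
}
for name in missing_sections(record):
    print("missing:", name)
```

The questions listed under each section heading that follows indicate what a complete entry for that section should answer.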

Identification Information

  • What is the name of the dataset?
  • What is the subject or theme of the information included?
  • What is the scale of the dataset?
  • What are the attributes of the dataset?
  • Where is the geographic location of the dataset?
  • Who developed the dataset?
  • Who provided the source material for the dataset?
  • Who will publish the dataset?
  • When were the features of the dataset identified?
  • How are the features of the dataset depicted?
  • Why was the data set created?
  • Are there restrictions on accessing or using the data?
  • Are external files available that are related to the dataset?

Data Quality Information

  • How reliable are the data?
  • What are its limitations or inconsistencies?
  • What is the positional and attribute accuracy?
  • Is the dataset complete?
  • Were the consistency and content of the data verified?
  • Where can the sources of the data be located?
  • What processes were applied to these sources and by whom?

Spatial Data Organization

  • What spatial data model was used to encode the spatial data?
  • How many and what kind of spatial objects are included in the dataset?
  • Are methods other than coordinates, such as street addresses used to encode locations?

Spatial Reference

  • Are coordinate locations encoded using longitude and latitude?
  • What map projection is used?
  • What horizontal datum and/or vertical datum are used?
  • What parameters should be used to convert the data to another coordinate system?

Entity and Attribute Information

  • What geographic information (roads, houses, elevation, temperature, etc.) is described?
  • How is this information coded?
  • What do the codes mean?
  • What source was used for defining the attributes or codes (e.g., the Cowardin classification)?


Distribution Information

  • From whom can the data be obtained?
  • What formats are available?
  • What media are available?
  • Are the data available online?
  • What is the price of the data?

Metadata Reference

  • When were the metadata compiled, and by whom?
  • When was the metadata record created?
  • Who is the responsible party?
  • When were the metadata last updated?
