Saturday, December 15, 2007

Shape Validation in ArcSDE 9.1

ArcSDE maintains valid shapes internally by applying a set of rules to each shape type. All ArcSDE commands that create or update shapes use these rules.

Verification rules for point shapes are:
- The area and length of points are set to 0.0.
- A single point's envelope is equal to the point's x,y values.
- The envelope of a multipart point shape is set to the minimum bounding box.
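
The envelope rules for points can be illustrated with a minimal Python sketch. The function name envelope and the (min x, min y, max x, max y) tuple layout are assumptions made for this example and are not part of ArcSDE.

```python
def envelope(points):
    """Return the minimum bounding box of a list of (x, y) points.

    The (min_x, min_y, max_x, max_y) layout is an assumption of this sketch.
    """
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (min(xs), min(ys), max(xs), max(ys))


# A single point: the envelope collapses to the point's own x,y values.
single = envelope([(2.0, 3.0)])

# A multipart point shape: the envelope is the minimum bounding box.
multi = envelope([(2.0, 3.0), (5.0, 1.0), (4.0, 7.0)])
```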

Verification rules for simple lines, or linestring shapes, are:
- Sequential duplicate points are removed.
- Each part must have at least two distinct points.
- Each part may not intersect itself. The start and endpoints may be the same, but the resulting 'ring' is not treated as an area shape.
- Parts may touch each other at the endpoints.
- The length is the sum of all the parts.

Verification rules for lines, or spaghetti shapes, are:
- Lines can intersect themselves.
- Each part must have at least two distinct points.
- Sequential duplicate points are deleted.
- The length is the length of all of its parts added together.
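
The rules for lines can be sketched in a few lines of Python. This only illustrates the rules listed above and is not ArcSDE's implementation; the function names, and the representation of a shape as a list of parts with each part a list of (x, y) tuples, are assumptions. The self-intersection check that simple lines additionally require is left out.

```python
import math


def remove_sequential_duplicates(part):
    """Drop points that repeat the point immediately before them."""
    cleaned = []
    for point in part:
        if not cleaned or cleaned[-1] != point:
            cleaned.append(point)
    return cleaned


def part_length(part):
    """Sum the planar lengths of the segments in one part."""
    return sum(math.dist(a, b) for a, b in zip(part, part[1:]))


def verify_line(parts):
    """Apply the line rules; return the cleaned parts and the total length."""
    cleaned_parts = [remove_sequential_duplicates(p) for p in parts]
    for part in cleaned_parts:
        if len(set(part)) < 2:
            raise ValueError("each part must have at least two distinct points")
    total = sum(part_length(p) for p in cleaned_parts)
    return cleaned_parts, total


parts, length = verify_line([
    [(0, 0), (0, 0), (3, 4)],   # contains a sequential duplicate
    [(3, 4), (3, 5)],           # touches the first part at an endpoint
])
```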

Verification rules and operations on area shapes are:
- Delete duplicate sequential occurrences of a coordinate point.
- Delete dangles.
- Verify that the line segments close (z coordinates at start and endpoints must also be the same) and don't cross.
- Correct rotation to counterclockwise (see the previous section for an explanation of how ArcSDE stores area shapes).
- For area shapes with holes, ensure that holes reside wholly inside the outer boundary. ArcSDE eliminates any holes that are outside the outer boundary.
- Convert a hole that touches an outer boundary at a single common point into an inversion of the area shape.
- Combine multiple holes that touch at common points into a single hole.
- Multipart area shapes may not overlap. However, two parts may touch at a point.
- Multipart area shapes may not share a common boundary. Common boundaries are dissolved.
- If two rings have a common boundary, they are merged into one ring.
- Calculate the total geometry perimeter, including the boundaries of all holes in donut polygons, and store the perimeter as the length of the geometry.
- Calculate the area.
- Calculate the envelope.
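
A few of the area-shape operations can be sketched in Python: the closure check, rotation to counterclockwise, and the perimeter, area, and envelope calculations. This is an illustration under stated assumptions, not ArcSDE's code. It assumes planar x,y coordinates, uses the shoelace formula to detect rotation, and assumes that hole areas are subtracted from the outer ring's area. Dangle removal, crossing checks, and the hole and multipart rules are not covered, and all function names are hypothetical.

```python
import math


def signed_area(ring):
    """Shoelace formula: positive when the ring runs counterclockwise."""
    return 0.5 * sum(x1 * y2 - x2 * y1
                     for (x1, y1), (x2, y2) in zip(ring, ring[1:]))


def ensure_counterclockwise(ring):
    """Reverse the ring if it runs clockwise."""
    return ring if signed_area(ring) >= 0 else ring[::-1]


def ring_length(ring):
    """Planar length of a closed ring's boundary."""
    return sum(math.dist(a, b) for a, b in zip(ring, ring[1:]))


def summarize_polygon(outer, holes=()):
    """Check closure, fix rotation, and compute length, area, and envelope."""
    rings = [outer, *holes]
    for ring in rings:
        if ring[0] != ring[-1]:
            raise ValueError("ring does not close")
    outer = ensure_counterclockwise(outer)
    # The stored length is the perimeter including the boundaries of all holes.
    length = sum(ring_length(r) for r in rings)
    # Assumption of this sketch: the area excludes the holes.
    area = abs(signed_area(outer)) - sum(abs(signed_area(h)) for h in holes)
    xs = [x for x, _ in outer]
    ys = [y for _, y in outer]
    return {"outer": outer, "length": length, "area": area,
            "envelope": (min(xs), min(ys), max(xs), max(ys))}


# A clockwise 4 x 4 square with a 1 x 1 hole.
result = summarize_polygon(
    [(0, 0), (0, 4), (4, 4), (4, 0), (0, 0)],
    holes=[[(1, 1), (2, 1), (2, 2), (1, 2), (1, 1)]],
)
```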

How to obtain information about the geometry of features in a layer

Applies to: ArcSDE 9.1, 9.2

The “sdelayer -o feature_info” operation, included in the ArcSDE package, reports information about the geometry of a feature.

The syntax for the command is as follows:

sdelayer -o feature_info -l [-V ] [-r {valid | all | invalid}] [-w <"where_clause">] [-c] [Spatial_Index] [-S ] [-q] [-i ] [-s ] [-D ] [-u [-p ] [-q]

The “-o feature_info” operation reports information about each feature: shape validity, measurements and extent, the FID, the presence of annotation, whether the feature contains CAD data, the presence of inclusions or cojoined inner rings, the minimum precision of the layer, and the number of points, parts, and subparts in the feature.

-l : the layer name and its spatial column name.

-r {valid | all | invalid}: Specifies which shapes are read. With valid, only valid shapes are read. With all, every shape is read and an error code is returned for each invalid shape. With invalid, only features with invalid geometry are returned, each with its error code.

-w: The where clause. Enclose the clause in double quotes:
-w "a = 1"
Single quotes do not work:
-w 'a = 1'
With single quotes, ArcSDE error -42 (SE_INVALID_SQL) is reported in the SDE error log file:
[03/13/2007 16:07:24;SdeId=33433;Client=Client_PC] load_buffer error -42

ArcSDE error codes are defined in the sdeerno.h file.

The “sdelayer -o feature_info” operation returns up to 21 different characteristics of each feature in the layer. These characteristics are presented as a series of comma-delimited fields, returned in the following order:
1. Row ID (Integer) - The row ID of the table containing the spatial column. If the table does not have an ArcSDE or user-maintained row ID column (usually objectid), the Feature ID (FID) column value is returned here instead. If it has neither row ID nor FID, the value returned is 0.
2. FID (Integer) - The Feature ID of the shape. If there is no FID, the row ID is returned instead. If there is neither row ID nor FID, 0 is returned.
3. Entity Type (one character) - A single character indicating the entity type, either N (Nil), P (Point), S (Simple), L (Line), or A (Area).
4. Annotation (Boolean) - Indicates whether or not the shape has ArcSDE annotation. Values are either T (true) or F (false).
5. CAD Data (Boolean) - Indicates whether or not the shape contains ArcSDE CAD Data. Returned values are either T (true) or F (false).
6. Number of Points (Integer) - The total number of points in the shape.
7. Number of Parts (Integer) - The number of parts in the shape. If an error is encountered when attempting to obtain the parts, that error code, from sdeerno.h, is supplied instead.
8. Number of Subparts (Integer) - The number of subparts in the shape. If this is a type of shape that does not have subparts, the value is 0. If instead an error is encountered when attempting to obtain the subparts, that error code from sdeerno.h is supplied.
9. Self-Touching Rings (Boolean) - Indicates the presence (T) or absence (F) of inclusions or cojoined inner rings in the shape. F is always returned for non-area shapes.
10. Minimum Precision (Integer) - The minimum layer precision needed to contain this feature; either 32 (low precision) or 64 (high precision).
11. Verification (Integer) - Indicates whether or not ArcSDE considers a shape valid. A value for this field is returned if you use the -r option to specify that all the features in the layer be evaluated for validity. The value is 0 if the shape is verified as correct, or a negative error code from the sdeerno.h file if it is not. This information is most helpful when you want to determine which features in an Oracle Spatial database are not valid so that you can correct them.
Values for fields 12 through 21 are returned only if the [-c] [Spatial_Index] option is specified.
12. Area (Floating Point) - The area of the shape or 0.0 if this shape is not a polygon. Units depend on the coordinate system of the layer.
13. Length (Floating Point) - The length or perimeter of the shape or 0.0 if this shape is a point or multipoint. Units depend on the coordinate system of the layer.
14. Minimum X (Floating Point) - The minimum x-coordinate of this shape.
15. Maximum X (Floating Point) - The maximum x-coordinate of this shape.
16. Minimum Y (Floating Point) - The minimum y-coordinate of this shape.
17. Maximum Y (Floating Point) - The maximum y-coordinate of this shape.
18. Minimum Z (Floating Point) - The minimum z-coordinate of this shape. This field is only present if the layer has z-coordinates.
19. Maximum Z (Floating Point) - The maximum z-coordinate of this shape. This field is only present if the layer has z-coordinates.
20. Minimum Measure (Floating Point) - The minimum measure of this shape. This field is only present if the layer has measures.
21. Maximum Measure (Floating Point) - The maximum measure of this shape. This field is only present if the layer has measures.
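
The first 11 fields can be turned into a typed record with a short Python sketch. The FeatureInfo class and the parse_feature_info function are hypothetical names introduced for this example. The sketch assumes that all 11 fields are present, which per the description above requires the -r option so that Verification is reported, and it ignores any fields beyond the 11th.

```python
from dataclasses import dataclass


@dataclass
class FeatureInfo:
    row_id: int
    fid: int
    entity_type: str          # N, P, S, L, or A
    annotation: bool
    cad_data: bool
    num_points: int
    num_parts: int
    num_subparts: int
    self_touching_rings: bool
    min_precision: int
    verification: int         # 0 if valid, otherwise a negative error code


def parse_feature_info(line):
    """Parse one comma-delimited feature_info record (fields 1 to 11)."""
    f = line.strip().split(",")
    if len(f) < 11:
        raise ValueError("expected at least 11 comma-delimited fields")
    flag = {"T": True, "F": False}
    return FeatureInfo(
        row_id=int(f[0]), fid=int(f[1]), entity_type=f[2],
        annotation=flag[f[3]], cad_data=flag[f[4]],
        num_points=int(f[5]), num_parts=int(f[6]), num_subparts=int(f[7]),
        self_touching_rings=flag[f[8]], min_precision=int(f[9]),
        verification=int(f[10]),
    )


info = parse_feature_info("1,2,A,F,T,29,2,3,F,32,-155")
is_valid = info.verification == 0
```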

The following example reports all invalid geometries in a layer where the land_code column equals 1:

sdelayer -o feature_info -l LAND.LAND_POLY,SHAPE -r invalid -i 5151 -w "land_code = 1" -s sdeserver -u user1 -p user1

ArcSDE 9.1 Oracle10g Build 391 Tue Oct 24 11:44:47 PDT 2006
Layer Administration Utility
-----------------------------------------------------
Row Id,FID,Entity Type,Annotation,Cad Data,Number of Points,Number of Parts,Number of Subparts,Self-Touching Rings,Minimum Precision,Verification

1,2,A,F,T,29,2,3,F,32,-155
3,3,A,F,F,3682,2,2,F,32,-148
5,5,A,F,F,66,2,2,F,32,-148

These results indicate that three geometries are invalid. The first one, for example:
- Has a Row ID of 1
- Has a Feature ID of 2
- Is an area entity (a polygon)
- Does not contain annotation
- Contains CAD data
- Contains 29 points
- Is made up of 2 parts and 3 subparts
- Does not contain self-touching rings
- Is stored in low precision (has a minimum precision of 32 bits)
- Is an invalid shape (error code -155: SE_SELF_INTERSECTING, a simple line or polygon boundary intersects itself.)
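
When the output is captured to a file or a string, the same reading can be automated. The following Python sketch skips the banner lines, uses the header row to name the fields, and counts the features per error code. The function name read_feature_info is hypothetical, and the sketch assumes the header row always begins with "Row Id," as it does in the output above.

```python
import csv
from collections import Counter

SAMPLE_OUTPUT = """\
ArcSDE 9.1 Oracle10g Build 391 Tue Oct 24 11:44:47 PDT 2006
Layer Administration Utility
-----------------------------------------------------
Row Id,FID,Entity Type,Annotation,Cad Data,Number of Points,Number of Parts,Number of Subparts,Self-Touching Rings,Minimum Precision,Verification

1,2,A,F,T,29,2,3,F,32,-155
3,3,A,F,F,3682,2,2,F,32,-148
5,5,A,F,F,66,2,2,F,32,-148
"""


def read_feature_info(text):
    """Return one dict per feature, keyed by the names in the header row."""
    lines = text.splitlines()
    # Assumption: the header row is the first line starting with "Row Id,".
    start = next(i for i, line in enumerate(lines)
                 if line.startswith("Row Id,"))
    rows = [line for line in lines[start:] if line.strip()]
    return list(csv.DictReader(rows))


records = read_feature_info(SAMPLE_OUTPUT)
invalid_fids = [r["FID"] for r in records if int(r["Verification"]) != 0]
count_by_code = Counter(int(r["Verification"]) for r in records)

for code, count in sorted(count_by_code.items()):
    print(f"error {code}: {count} feature(s)")
```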
