# Polygon Statistics

Scope
This lab exemplifies practical applications of various polygon statistical methods.

All algorithms were prototyped into a software implementation, using the Perl language (http://www.perl.com/), and the various functions will be explained throughout this exercise. Although the implementation level may be language specific, the underlying algorithms are common geometric formulas, which can be applied to any software / programming integrated development environment.

• Answers to questions are outlined in bold
• Code examples are outlined in `courier`
Software Setup

The nature of this software implementation was as follows:

• Library of geometric functions: This consists of a generic implementation of the geometric formulas, which, as a software library, can be reused by subsequent application developers. We have named this `geomath.pm` in this context.
• Implementation level interface script: This script contains the specific information pertaining to the exercise, and passes this information to the geometric functions as arguments. We have named this `mathex2.pl` in this context.
Constants

We have defined the following constants in our library:

```use constant pi => 3.14159;
```

Generic Functions

The following generic functions were implemented as general purpose routines.

```sub deg2rad { # degrees to radians
my \$degreeNum = \$_;
my \$radianNum = \$degreeNum * (\$pi / 180);
}

my \$degreeNum = \$radianNum * (180 / \$pi);
return \$degreeNum;
}

sub dd2dms { # decimal degrees to degrees minutes seconds
my (\$D, \$d) = split /\./, \$_;
\$d = "." . \$d;
my \$Mm = \$d * 60;
my @tmp = split /\./, \$Mm;
my \$S = \$1 if (\$tmp * 60) =~ /^(\d{2})/;
return "\$D \$tmp \$S";
}

sub dms2dd { # degrees minutes seconds to decimal degrees
my (\$d, \$m, \$s) = split / /, \$_;
my \$dd = \$d + (\$m / 60) + (\$s / 3600);
return \$dd;
}

sub getMaxVal { # returns greater of two numbers
my \$val1 = \$_;
my \$val2 = \$_;

if (\$val1 > \$val2) {
return \$val1;
}
if (\$val1 < \$val2) {
return \$val2;
}
if (\$val1 == \$val2) {
return \$val1;
}
}

sub getMinVal { # returns lesser of two numbers
my \$val1 = \$_;
my \$val2 = \$_;

if (\$val1 < \$val2) {
return \$val1;
}
if (\$val1 > \$val2) {
return \$val2;
}
if (\$val1 == \$val2) {
return \$val1;
}
}

sub isPolygonClosed {
my @polygonVertices = @_;
if ( (\$polygonVertices != \$polygonVertices[-2]) ||
(\$polygonVertices != \$polygonVertices[-1])) {
return 1;
}
else {
return 0;
}
}

sub isEvenXYPairs {
my @coordArray = @_;
if (! (scalar(@coordArray) % 2) == 0) {
return 1;
}
else {
return 0;
}
}
```

Polygon Definition

The first polygon we will perform calculations on is defined as follows:

```#	X	Y
1	1	3
2	4	7
3	7	2
4	2	1
5	1	3
```
We define the polygon as an array in Perl:
`my @polygon1 = (1, 3, 4, 7, 7, 2, 2, 1, 1, 3);`

We assume that a polygon has N coordinates, N - 1 sides, where the first polygon XY is equal to the last polygon XY (Pn-P1), which constitutes a closed polygon, and that polygon direction is counterclockwise.

Polygon Perimeter

The following code was implemented to calculate the perimeter of a polygon:

```sub getPolygonPerimeter {
my @polygonVertices = @_;
my \$total           = 0;

if (isPolygonClosed(@polygonVertices) == 1)  {
return "Corrupt inputs, not a closed polygon";
}

if (isEvenXYPairs(@polygonVertices) == 1) {
return "Corrupt inputs, must be even numbered xy vertices";
}

for (my \$i = 0; \$i < (scalar(@polygonVertices) - 2); \$i = \$i + 2) {
\$total += sqrt(
((\$polygonVertices[\$i] - \$polygonVertices[\$i + 2]) ** 2) +
((\$polygonVertices[\$i + 1] - \$polygonVertices[\$i + 3]) ** 2));
}
return \$total;
}
```
Note that the algorithm is varied to handle the structure of the array passed from the function call.
..and is called from our implementation script as follows:
`print getPolygonPerimeter(@polygon1) . "\n";`
..which produces the following result:18.1660393859379

Polygon Area

The following code was implemented to calculate the perimeter of a polygon, in both trapezoidal and triangular form:

```sub getPolygonAreaTrapezoidal {
my \$total = 0;
my @polygonVertices = @_;

if (isEvenXYPairs(@polygonVertices) == 1) {
return "Corrupt inputs, must be even numbered xy vertices";
}

if (isPolygonClosed(@polygonVertices) == 1)  {
return "Corrupt inputs, not a closed polygon";
}

for(my \$i = 0; \$i < (scalar(@polygonVertices) - 2); \$i = \$i + 2) {
\$total += (\$polygonVertices[\$i+2] - \$polygonVertices[\$i]) * (\$polygonVertices[\$i+1] + \$polygonVertices[\$i+3]);
}
\$total *= 0.5;
return \$total;
}

sub getPolygonAreaTriangular {
my \$total = 0;
my @polygonVertices = @_;

if (isEvenXYPairs(@polygonVertices) == 1) {
return "Corrupt inputs, must be even numbered xy vertices";
}

if (isPolygonClosed(@polygonVertices) == 1)  {
return "Corrupt inputs, not a closed polygon";
}

for(my \$i = 0; \$i < (scalar(@polygonVertices) - 2); \$i = \$i + 2) {
\$total += ( (\$polygonVertices[\$i] * \$polygonVertices[\$i + 3]) -
(\$polygonVertices[\$i + 2] * \$polygonVertices[\$i + 1]));
}
\$total *= -0.5;
return \$total;
}
```
Note that the algorithms are varied to handle the structure of the array passed from the function call.

Here's the implementation level source code:

```print getPolygonAreaTriangular(@polygon1) . "\n";
print getPolygonAreaTrapezoidal(@polygon1) . "\n";
```
..producing the following results:
```
Triangular: 19
Trapezoidal: 19

```

Computational Efficiency

Each method requires a number of calculations in order to process the result. Below are benchmark times for each method:

```
Triangular:
0.33user 0.01system 0:00.34elapsed 101%CPU (0avgtext+0avgdata 0maxresident)k0inputs+0outputs (298major+305minor)pagefaults 0swaps

Trapezoidal:
0.29user 0.05system 0:00.33elapsed 97%CPU (0avgtext+0avgdata 0maxresident)k0inputs+0outputs (298major+305minor)pagefaults 0swaps

```

On a given system, it was found that the trapezoidal algorithm proved more efficient (based on the 'user' benchmark parameter, and the less CPU usage, as well as less elapsed time of the calculations). This can be attributed to the fact that there are less multiplication directives within the trapezoidal polygon area algorithm. While this may not be significant for the polygon used as data input above, this can prove to be a large savings in terms of larger, more complex polygonal data.

Polygon Centroid

The following code was implemented to calculate the centroid of a polygon, in triangular form:

```sub getPolygonCentroidTriangular {
my @polygonVertices = @_;

my \$totalX = 0;
my \$totalY = 0;
my \$centX  = 0;
my \$centY  = 0;

my \$polygonArea = 0;

if (isEvenXYPairs(@polygonVertices) == 1) {
return "Corrupt inputs, must be even numbered xy vertices";
}

if (isPolygonClosed(@polygonVertices) == 1)  {
return "Corrupt inputs, not a closed polygon";
}

\$polygonArea = getPolygonAreaTriangular(@_);

my \$arrLength = (scalar(@polygonVertices) / 2) - 1;

for (my \$i = 0; \$i < (\$arrLength + 3); \$i = \$i + 2) {
\$totalX += (\$polygonVertices[\$i] + \$polygonVertices[\$i+2]) * (\$polygonVertices[\$i+2] * \$polygonVertices[\$i+1] - \$polygonVertices[\$i] * \$polygonVertices[\$i+3]);

\$totalY += (\$polygonVertices[\$i+1] + \$polygonVertices[\$i+3]) * (\$polygonVertices[\$i+2] * \$polygonVertices[\$i+1] - \$polygonVertices[\$i] * \$polygonVertices[\$i+3]);
}

\$centX = \$totalX / (6 * \$polygonArea);
\$centY = \$totalY / (6 * \$polygonArea);
return "\$centX,\$centY";
}
```
Note that the algorithms are varied to handle the structure of the array passed from the function call.

Here's the implementation level source code: `print getPolygonCentroidTriangular(@polygon1) . "\n";`
..which produces the following results:
3.5,3.25

The same result was obtained with the coordinate averaging algorithm, implemented below:

```sub getPolygonCentroidByCoordAvg {
my @polygonVertices = @_;

my \$totalX = 0;
my \$totalY = 0;
my \$centX  = 0;
my \$centY  = 0;

if (isEvenXYPairs(@polygonVertices) == 1) {
return "Corrupt inputs, must be even numbered xy vertices";
}

if (isPolygonClosed(@polygonVertices) == 1)  {
return "Corrupt inputs, not a closed polygon";
}

my \$arrLength = (scalar(@polygonVertices) / 2) - 1;

for (my \$i = 0; \$i < (\$arrLength + 3); \$i = \$i + 2) {
\$totalX += \$polygonVertices[\$i];
\$totalY += \$polygonVertices[\$i + 1];
}

\$centX = \$totalX / \$arrLength;
\$centY = \$totalY / \$arrLength;

return "\$centX,\$centY";
}

print getPolygonCentroidByCoordAvg(@polygon1) . "\n";
```

Scale Factor Dependencies

We apply scaling rules to the defined polygon and calculate polygon perimeter, area and centroid statistics in the implementation code below:

```my \$arrLength = scalar(@polygon1) / 2 + 4;

for(my \$i = 0; \$i < \$arrLength; \$i = \$i + 2) {
my \$tmp = scaleXY(\$polygon1[\$i], \$polygon1[\$i+1], 0.5) . "\n";
my @a   = split /\,/, \$tmp;
push @newArr, \$a;
push @newArr, \$a;
}

print getPolygonPerimeter(@newArr) . "\n";
print getPolygonAreaTrapezoidal(@newArr) . "\n";
print getPolygonCentroidByCoordAvg(@newArr) . "\n";
```
..which produces the following results:
```
Perimeter	9.0830196929689
Area	4.35
Centroid 	1.75,1.625

```

We find that the centroid and perimeter values of the scaled polygon are half of the original polygon centroid and perimeter, and that the area of the scaled polygon is the square root of the area of the original polygon.

More Polygon Statistics

We implement a new polygon object with the following properties:

```#	X	Y	zone
1	1	1	1
2	2	6
3	8	5
4	7	2
5	1	1
6	3	3	2
7	6	3
8	5	5
9	3	3
10	9	2	3
11	10	4
12	11	1
13	9	2
```
When calculating the area and centroid of zone 1, we derive the following values with the implementation below:
```my @poly2zone1 = (1, 1, 2, 6, 8, 5, 7, 2, 1, 1);
my @poly2zone2 = (3, 3, 6, 3, 5, 5, 3, 3);
my @poly2zone3 = (9, 2, 10, 4, 11, 1, 9, 2);

print getPolygonAreaTrapezoidal(@poly2zone1) . " ";
print getPolygonCentroidTriangular(@poly2zone1) . " ";
print getPolygonPerimeter(@poly2zone1) . "\n";

print getPolygonAreaTrapezoidal(@poly2zone2) . " ";
print getPolygonCentroidTriangular(@poly2zone2) . " ";
print getPolygonPerimeter(@poly2zone2) . "\n";

print getPolygonAreaTrapezoidal(@poly2zone3) . " ";
print getPolygonCentroidTriangular(@poly2zone3) . " ";
print getPolygonPerimeter(@poly2zone3) . "\n";
```
..and derive the following statistics:
```
Zone	Area	Centroid	Perimeter
1	24	 4.25694444444444, 3.52777777777778	 20.4268222343576
2	-3	 4.66666666666667, 3.66666666666667	 8.06449510224598
3	2.5	 10, 2.33333333333333 	7.63441361516796

```

We find that the direction of digitizing, and the order of the calculations is important for area and centroid calculations of polygons. We find that counterclockwise-directed polygons produce negative results. As a result, a general rule regarding directions of digitizing when a zone has several extents is to digitize polygons as zones within other polygons as counterclockwise. Therefore, as an example, if a polygon area calculation produces negative results, we can conclude that the polygon is counterclockwise in digitizing. This applied to zone 2 in the above table (i.e. if result < 0, polygon is counterclockwise).

Concave Polygon Statistics

We define a new polygon object with the following properties:

```#	X	Y
1	1	1
2	3	7
3	6	2
4	3	5
5	1	1
```
And calculate the centroid of this polygon as follows:
```my @polygonConcave = (1, 1, 3, 7, 6, 2, 3, 5, 1, 1); print getPolygonCentroidTriangular(@polygonConcave) . "\n"; ``` ..which produces a centroid of 3.25, 3,75.

This poses a potential problem in such operations as plotting values or symbols centred at the calculated centroid of each zone (such as a label or symbol map). In the case above, the centroid of the polygon falls outside of the polygon, thus inaccurate.

We can test this with a simple 'point in polygon' function in our library:

```sub isPointInPolygon {
my ( \$x, \$y, @xy ) = @_;

my \$n = @xy / 2;                      # Number of points in polygon.
my @i = map { 2 * \$_ } 0 .. (@xy/2);  # The even indices of @xy.
my @x = map { \$xy[ \$_ ]     } @i;     # Even indices: x-coordinates.
my @y = map { \$xy[ \$_ + 1 ] } @i;     # Odd indices: y-coordinates.

my ( \$i, \$j );                        # Indices.
my \$side = 0;                         # 0 = outside, 1 = inside.

for ( \$i = 0, \$j = \$n - 1 ; \$i < \$n; \$j = \$i++ ) {
if (
(
# If the y is between the (y-) borders ...
( ( \$y[ \$i ] <= \$y ) && ( \$y < \$y[ \$j ] ) ) ||
( ( \$y[ \$j ] <= \$y ) && ( \$y < \$y[ \$i ] ) )
)
and
# ...the (x,y) to infinity line crosses the edge
# from the ith point to the jth point...
(\$x
<
( \$x[ \$j ] - \$x[ \$i ] ) *
( \$y - \$y[ \$i ] ) / ( \$y[ \$j ] - \$y[ \$i ] ) + \$x[ \$i ] ))
{
\$side = not \$side; # Jump the fence.
}
}
return \$side ? 1 : 0;
}
```

This function returns 0 if the point is outside the polygon, and 1 if the point is inside the polygon.

A potential solution would be to derive a line, which passes through the polygon, equidistant from polygon edges, and thereby derive the midpoint of that line, to derive the centroid of a polygon, which is deemed concave.