Processing GeoTiff files in .NET (without using GDAL)

I've recently created a process that is able to import geoferenced raster data, namely GeoTiff files... with a catch:

As I didn't find a proper lib to load GeoTiff files I was converting the source files to an ASCII Gridded XYZ format prior to importing. Despite the fancy name it's simply an ASCII file where each line contains:

<LONGITUDE>     <LATITUDE>      <VALUE1>   [ <VALUE 2>]   [<VALUE 3>]

Each line on this file corresponds to a pixel on a raster image, thus it's incredibly easy to parse in C#. The code would be something like this:

foreach (var line in File.ReadLines(filePath))
{
    if (line == null)
    {
        continue;
    }

    var components = line.Split(
        new[] { '' }, 
        StringSplitOptions.RemoveEmptyEntries);

    var longitude = double.Parse(components[0]);
    var latitude = double.Parse(components[1]);

    IEnumerable<string> values = components.Skip(2);

    //... process each data item

    yield return dataItem;
}

Nice and easy. But this has two disadvantages:

First, it requires an extra step to convert to XYZ from Geotiff, although easily done with the GDAL:

gdal_translate -of XYZ <input.tif> <output.xyz>

Another disadvantage is that the XYZ becomes much larger than its TIFF counterpart. Depending on the source file the difference could be something like 10 GB vs 500 MB (yes, 50x bigger)

So, I would really like to be able to process a GeoTiff file directly in c#. One obvious candidate is using GDAL, particularly its C# bindings. Although these bindings work relatively well they rely on an unmanaged GDAL installation. For me this is a problem as I want to host this on Azure WebSites, where installing stuff is not an option.

So, I embraced the challenge of being able to process a GeoTIFF file in a fully managed way.

The file that I'll be loading is this one. It's an height map for the entire world (21600x10800, 222MB)

First step, loading the TIFF file, still ignoring any geo data.

Fortunately I found a really nice lib that does exactly what I wanted: be able to load a tiff file in c#, fully managed. It's called LibTiff.NET.

Although I'm not really found of its API, it does include tons of functionality and documentation. Most demos load the full image into memory (which isn't really viable for larger tiff files) but it does include the capability to process a file one line at a time.

My starting point was this example here. The important bits are:

using (Tiff tiff = Tiff.Open(@"<file.tif>", "r"))
{
    int width = tiff.GetField(TiffTag.IMAGEWIDTH)[0].ToInt();
    byte[] scanline = new byte[tiff.ScanlineSize()];

    for (int i = 0; i < height; i++)
    {
        tiff.ReadScanline(scanline, i);
    }       
}

This sample opens a tiff and iterates each line. This code is used to open 8bit tiff files. 16 bits files would require an additional operation:

using (Tiff tiff = Tiff.Open(@"<file.tif>", "r"))
{
    int width = tiff.GetField(TiffTag.IMAGEWIDTH)[0].ToInt();
    byte[] scanline = new byte[tiff.ScanlineSize()];
    ushort[] scanline16Bit = new ushort[tiff.ScanlineSize() / 2];

    for (int i = 0; i < height; i++)
    {
        tiff.ReadScanline(scanline, i);
        Buffer.BlockCopy(scanline, 0, scanline16Bit, 0, scanline.Length);
    }       
}

Second step, loading geographical data and corresponding height value

The aforementioned lib doesn't include any support for GeoTiff. So, time to take a look at the official docs.

According to the official FAQ (http://www.remotesensing.org/geotiff/faq.html):

GeoTIFF is a metadata format, which provides geographic information to associate with the image data. But the TIFF file structure allows both the metadata and the image data to be encoded into the same file.
GeoTIFF makes use of a public tag structure which is platform interoperable between any and all GeoTIFF-savvy readers. Any GIS, CAD, Image Processing, Desktop Mapping and any other types of systems using geographic images can read any GeoTIFF files created on any system to the GeoTIFF specification.

Before delving into the spec, I need a comparison point to make sure I'm getting the proper values. Thus I'm using gdalinfo to provide some information on the raster file. It's usage is really simple:

gdalinfo altitude.tif

This outputs the following information:

Size is 21600, 10800
Coordinate System is:
GEOGCS["WGS 84",
    DATUM["WGS_1984",
        SPHEROID["WGS 84",6378137,298.257223563,
            AUTHORITY["EPSG","7030"]],
        AUTHORITY["EPSG","6326"]],
    PRIMEM["Greenwich",0],
    UNIT["degree",0.0174532925199433],
    AUTHORITY["EPSG","4326"]]
Origin = (-180.000000000000000,90.000000000000000)
Pixel Size = (0.016666666666667,-0.016666666666667)
Metadata:
  AREA_OR_POINT=Area
  EXIF_ColorSpace=65535
  EXIF_DateTime=2005:10:12 22:04:52
  EXIF_Orientation=1
  EXIF_PixelXDimension=21600
  EXIF_PixelYDimension=10800
  EXIF_ResolutionUnit=2
  EXIF_Software=Adobe Photoshop CS2 Macintosh
  EXIF_XResolution=(72)
  EXIF_YResolution=(72)
Image Structure Metadata:
  INTERLEAVE=BAND
Corner Coordinates:
Upper Left  (-180.0000000,  90.0000000) (180d 0' 0.00"W, 90d 0' 0.00"N)
Lower Left  (-180.0000000, -90.0000000) (180d 0' 0.00"W, 90d 0' 0.00"S)
Upper Right ( 180.0000000,  90.0000000) (180d 0' 0.00"E, 90d 0' 0.00"N)
Lower Right ( 180.0000000, -90.0000000) (180d 0' 0.00"E, 90d 0' 0.00"S)
Center      (   0.0000000,   0.0000000) (  0d 0' 0.01"E,  0d 0' 0.01"N)
Band 1 Block=21600x1 Type=Byte, ColorInterp=Gray
  Min=0.000 Max=213.000 
  Minimum=0.000, Maximum=213.000, Mean=22.754, StdDev=25.124
  Metadata:
    STATISTICS_MAXIMUM=213
    STATISTICS_MEAN=22.753594797178
    STATISTICS_MINIMUM=0
    STATISTICS_STDDEV=25.124203131182

I'm mostly interested on these two lines:

Origin = (-180.000000000000000,90.000000000000000)
Pixel Size = (0.016666666666667,-0.016666666666667)

This means that the top-left corner of the image corresponds to coordinate -180,90 and that each pixel increments 0.016(7) degrees.

So, time to check the spec, namely section 2.6.1 on http://www.remotesensing.org/geotiff/spec/geotiff2.6.html):

Apparently tags 33922 and 33550 provide exactly the information I need. They're defined as:

ModelTiepointTag:
Tag = 33922 (8482.H)
Type = DOUBLE (IEEE Double precision)
N = 6*K, K = number of tiepoints
Alias: GeoreferenceTag
Owner: Intergraph
This tag stores raster->model tiepoint pairs in the order

ModelTiepointTag = (...,I,J,K, X,Y,Z...),
where (I,J,K) is the point at location (I,J) in raster space with pixel-value K, and (X,Y,Z) is a vector in model space. In most cases the model space is only two-dimensional, in which case both K and Z should be set to zero; this third dimension is provided in anticipation of future support for 3D digital elevation models and vertical coordinate systems.

and

ModelPixelScaleTag:
Tag = 33550
Type = DOUBLE (IEEE Double precision)
N = 3
Owner: SoftDesk
This tag may be used to specify the size of raster pixel spacing in the model space units, when the raster space can be embedded in the model space coordinate system without rotation, and consists of the following 3 values:

ModelPixelScaleTag = (ScaleX, ScaleY, ScaleZ)
where ScaleX and ScaleY give the horizontal and vertical spacing of raster pixels. The ScaleZ is primarily used to map the pixel value of a digital elevation model into the correct Z-scale, and so for most other purposes this value should be zero (since most model spaces are 2-D, with Z=0).

To read theses tags with the LibTiff.Net lib one needs to use the GetField method. Thus, loading the values from these tags will be as simple as:

FieldValue[] modelPixelScaleTag = tiff.GetField((TiffTag)33550);
FieldValue[] modelTiepointTag = tiff.GetField((TiffTag)33922);

byte[] modelPixelScale = modelPixelScaleTag[1].GetBytes();
double pixelSizeX = BitConverter.ToDouble(modelPixelScale, 0);
double pixelSizeY = BitConverter.ToDouble(modelPixelScale, 8)*-1;

byte[] modelTransformation = modelTiepointTag[1].GetBytes();
double originLon = BitConverter.ToDouble(modelTransformation, 24);
double originLat = BitConverter.ToDouble(modelTransformation, 32);

With this information I'm mostly ready to iterate the various raster lines. But, and although conceptually the top-left corner corresponds to coordinate -180, 90 the top-left pixel itself corresponds to coordinate -179.99166, 89.99166. This is obtained through:

double startLat = originLat + (pixelSizeY/2.0);
double startLon = originLon + (pixelSizeX/2.0);

So here's the full source-code.
Disclaimer: This is mostly coded for my particular scenario as GeoTiff supports a wider-range of options.

using (Tiff tiff = Tiff.Open(filePath, "r"))
{
    int height  = tiff.GetField(TiffTag.IMAGELENGTH)[0].ToInt();
    FieldValue[] modelPixelScaleTag = tiff.GetField((TiffTag)33550);
    FieldValue[] modelTiepointTag = tiff.GetField((TiffTag)33922);

    byte[] modelPixelScale = modelPixelScaleTag[1].GetBytes();
    double pixelSizeX = BitConverter.ToDouble(modelPixelScale, 0);
    double pixelSizeY = BitConverter.ToDouble(modelPixelScale, 8)*-1;

    byte[] modelTransformation = modelTiepointTag[1].GetBytes();
    double originLon = BitConverter.ToDouble(modelTransformation, 24);
    double originLat = BitConverter.ToDouble(modelTransformation, 32);

    double startLat = originLat + (pixelSizeY/2.0);
    double startLon = originLon + (pixelSizeX / 2.0);

    var scanline = new byte[tiff.ScanlineSize()]; 

    //TODO: Check if band is stored in 1 byte or 2 bytes. 
    //If 2, the following code would be required
    //var scanline16Bit = new ushort[tiff.ScanlineSize() / 2];
    //Buffer.BlockCopy(scanline, 0, scanline16Bit, 0, scanline.Length);

    double currentLat = startLat;
    double currentLon = startLon;

    for (int i = 0; i < height; i++)
    {
        tiff.ReadScanline(scanline, i); //Loading ith Line            

        var latitude = currentLat + (pixelSizeY * i);
        for (var j = 0; j < scanline.Length; j++)
        {
            var longitude = currentLon + (pixelSizeX*j);
            geodata.Points[0] = new[] { new PointXY(longitude, latitude) };
            object value = scanline[j];

            //... process each data item

            yield return dataItem;
        }
    }
}

This code is actually working quite well. Eventually could be interesting to put this onto a proper lib and add support for additional GeoTiff features. Adding that to my backlog :)

Latest Images

Trending Articles

Latest Images