The raster module is used to analyze raster data from non-specific sources. In an idealized workflow, use of the convert and download modules can help users obtain and pre-process data for use with functions in the raster module. The modules capabilities include statistics generation, subsetting, reprojecting, null data management, correction functions, and others. Top level functions in the raster module should all have batch processing capabilities bult in.
Raster manipulation in python
There are two main ways to manipulate raster data in python. One of which is with
arcpy.Raster objects, which can have arithmetic opperations performed directly upon them in many cases. The preferred method is match your rasters up with something like
dnppy.raster.spatially_match and then convert them into numpy arrays for efficient matrix like operations. Two workhorse functions that can help you bring a raster into python, perform some mainpulations, and then save that raster again while preserving all geophysical and geospatial attributes are
raster.from_numpy. These functions are designed to pass around masked numpy arrays, which pretty much consist of an array for your data, and one with a bunch of true or false values that tell you if that pixel is nodata or not. You can just perform arithmetic operations on these masked arrays, and it will silently keep track of your nodata for you in the background! The general workflow goes like this:
from dnppy import raster # bring your raster in as a numpy array my_numpy, my_metadata = raster.to_numpy(my_raster_filepath) # access the data and mask information print type(my_numpy) # you will see that this array is a masked numpy array data = my_numpy.data # access just the data in your numpy array mask = my_numpy.mask # access just the mask of your numpy array # perform some arbitrary polynomial arithmetic to create output numpy array out_numpy = 2 * (my_numpy ** 2) + (5 * my_numpy) + 10 # save our output array with the same metadata as our input array! raster.from_numpy(out_numpy, my_metadata, my_output_filepath)
In the middle step, you can see how easy it is to perform mathematical manipulations on one or more raster datasets as if they were any simple variable type!
Often times, one will want to perform an analysis which includes raster data from two different sources. This data usually has to converted to match the resolution, spatial extent, geographic reference, and projection. In order to do simple matrix type math, pixels must also be perfectly coincident, laying directly on top of one another. The
spatially_match function was built for this purpose. It wraps other
dnppy.raster functions, including clip_and_snap and
project_resample and performs the necessary steps to make an input set of data match a reference file. Typically, all should be resampled and subseted to match that of the highest resolution layer, often times a Digital Elevation Model (DEM). One would use the syntax below to make a directory full of raster data match a reference tif, a DEM in this case.
Assume we have a DEM, and a few dozen MODIS images, which are much lower resolution. If you have been following some of the other examples in this wiki, you may have a MODIS data set prepared already. You can easily fetch a DEM using the
fetch_SRTM function in the
from dnppy import raster dempath = r"C:\Users\jwely\test\SRTM_DEM.tif" # filepath to DEM modisdir = r"C:\Users\jwely\test_tifs" # directory with MODIS tifs smdir = r"C:\Users\jwely\sm_tifs" # dir for output sm tiffs. raster.spatially_match(dempath, modisdir, smdir)
The code block above will produce a folder full of tiffs at
smdir that are spatially matched to the same resolution, extents, and projection of the
dempath raster. The images are now ready for further numerical manipulation.
apply_linear_correction(rasterlist, factor, offset, suffix='lc', outdir=None, floor=-999999)¶
Applies a linear correction to a raster dataset. New offset rasters are saved in the output directory with a suffix of “lc” unless one is specified. This may be used to apply any kind of linear relationship that can be described with “mx + b” such as conversion between between K,C, and F. Also useful when ground truthing satellite data and discovering linear errors. All outputs are 32 bit floating point values.
- rasterlist – list of rasters, a single raster, or a directory full of tiffs to Have a linear correction applied to them.
- factor – every pixel in the raster will be MULTIPLIED by this value.
- offset – this offset value will be ADDED to every pixel in the raster.
- suffix – output files will take the same name as input files with this string appended to the end. So input “FILE.tif” outputs “FILE_suffix.tif”
- outdir – directory to save output rasters. “None” will save output images in the same folder as the input images.
- floor – Used to manage NoData. All values less than floor are set to floor then floor is set to the new NoData value. defaults to -999,999
return outputpath: filepath to output files created by this function
Example Usage to convert from MODIS Land surface temperature from digital number to kelvin, you must simply multiply by 0.02 as the stated scale factor listed at the link below [https://lpdaac.usgs.gov/products/modis_products_table/myd11a1].
Now that it is in kelvin, converting to Celsius can be done by adding (-273.15) So, use this function with:
factor = 0.02 offset = -273.15
and one may convert MODIS land surface temperature digital numbers directly to celsius!
clip_and_snap(snap_raster, rastname, outname, NoData_Value=None)¶
Ensures perfect coincidence between a snap_raster and any input rasters
This script is primarily intended for calling by the “raster.spatially_match” function but may be called independently.
it is designed to input a reference image and a working image. The working image must be in exactly the same projection and spatial resolution as the reference image. This script will simply ensure the tif files are perfectly coincident, and that the total image extents are identical. This is important when performing numpy manipulations on matrices derived from different datasets manipulated in different ways to ensure alignment.
This script makes modifications to the original raster file, so save a backup if you are unsure how to use this.
- snap_raster – filepath and name of reference raster whos extent will be taken on by the input rastername.
- rastname – name of raster which should be snapped to the snap_raster
- NoData_Value – Value desired to represent NoData in the saved image.
metadata of the snap_raster file as output by to_numpy
metadata of the rastername file as output by to_numpy
clip_to_shape(rasterlist, shapefile, outdir=False)¶
Simple batch clipping script to clip rasters to shapefiles.
- rasterlist – single file, list of files, or directory for which to clip rasters
- shapefile – shapefile to which rasters will be clipped
- outdir – desired output directory. If no output directory is specified, the new files will simply have ‘_c’ added as a suffix.
list of files created by this function.
degree_days(T_base, Max, Min, NoData_Value, outpath=False, roof=False, floor=False)¶
Inputs rasters for maximum and minimum temperatures, calculates Growing Degree Days
this function is built to perform the common degree day calculation on either a pair of raster filepaths, a pair of numpy arrays It requires, at minimum a maximum temperature value, a minimum temperature value, and a base temperature. This equation could also be used to calculate Chill hours or anything similar.
The equation is
[(Max+Min)/2 + T_base]
where values in Max which are greater than roof are set equal to roof where values in Min which are less than floor are set equal to floor consult [https://en.wikipedia.org/wiki/Growing_degree-day] for more information.
- T_base – base temperature to ADD, be mindful of sign convention.
- Max – filepath, numpy array, or list of maximum temperatures
- Min – filepath, numpy array, or list of minimum temperatures
- NoData_Value – values to ignore (must be int or float)
- outpath – filepath to which output should be saved. Only works if Max and Min inputs are raster filepaths with spatial referencing.
- roof – roof value above which Max temps do not mater
- floor – floor value below which Min temps do not mater
a numpy array of the output degree_days
degree_days_accum(rasterlist, critical_values=None, outdir=None)¶
Accumulates degree days in a time series rasterlist
This function is the logical successor to calc.degree_days. Input a list of rasters containing daily data to be accumulated. Output raster for a given day will be the sum total of the input raster for that day and all preceding days. The last output raster in a years worth of data (image 356) would be the sum of all 365 images. The 25th output raster would be a sum of the first 25 days. Critical value rasters will also be created. Usefull for example: we wish to know on what day of our 365 day sequence every pixel hits a value of 100. Input 100 as a critical value and that output raster will be generated.
- rasterlist – list of files, or directory containing rasters to accumulate
- critical_values – Values at which the user wishes to know WHEN the total accumulation value reaches this point. For every critical value, an output raster will be created. This raster contains integer values denoting the index number of the file at which the value was reached. This input must be a list of ints or floats, not strings.
- outdir – Desired output directory for all output files.
a list of all files created by this function.
Ensures a list of inputs filepaths contains only valid raster types
Parameters: filelist – a list of filepaths that contains some raster filetypes Return new_filelist: a list of filepaths with all non-raster files removed
from_numpy(numpy_rast, metadata, outpath, NoData_Value=None)¶
Wrapper for arcpy.NumPyArrayToRaster function with better metadata handling
this is just a wrapper for the NumPyArrayToRaster function within arcpy. It is used in conjunction with to_numpy to streamline reading image files in and out of numpy arrays. It also ensures that all spatial referencing and projection info is preserved between input and outputs of numpy manipulations.
- numpy_rast – The numpy array version of the input raster
- metadata – The variable exactly as output from “to_numpy”
- outpath – Output filepath of the individual raster
- NoData_Value – The no data value of the output raster
Same as input outpath, filepath to created file.
Usage example call to_numpy with “rast,metadata = to_numpy(Raster)” perform numpy manipulations as you please then save the array with “raster.from_numpy(rast, metadata, output)”
gap_fill_temporal(rasterlist, outdir=None, continuous=True, NoData_Value=None, numpy_datatype='float32')¶
This function is designed to input a time sequence of rasters with partial voids and output a copy of each input image with every pixel equal to the last good value taken. This function will step forward in time through each raster and fill voids from the values of previous rasters. The resulting output image will contain all the data that was in the original image, with the voids filled with older data. A second output image will be generated where the pixel values are equal to the age of each pixel in the image. So if a void was filled with data that’s 5 days old, the “age” raster will have a value of “5” at that location.
- rasterlist – A list of filepaths for rasters with which to fill gaps. THESE IMAGES MUST BE ORDERED FROM OLDEST TO NEWEST (ascending time).
- outdir – the path to the desired output folder, if left “None”, outputs will be saved right next to respective inputs.
- continuous – if “True” an output raster will be generated for every single input raster, which can be used to fill gaps in an entire time series. So, for example output raster 2 will have all the good points in input raster 2, with gaps filled with data from raster 1. output raster 3 will then be gap filled with output raster 2, which might contain some fill values from raster 1, and so forth. If “False” an output raster will only be generated for the LAST raster in the input rasterlist.
- numpy_datatype – the numpy datatype of the output raster. usually “float32”
returns a list of filepaths to new files created by this function.
gap_fill_interpolate(in_rasterpath, out_rasterpath, model=None, max_cell_dist=None, min_points=None)¶
Fills gaps in raster data by spatial kriging interpolation. This should only be used to fill small gaps in continuous datasets (like a DEM), and in instances where it makes sense. This function creates a feature class layer of points where pixels are not NoData, then performs a “kriging” interpolation on the point data to rebuild a uniform grid with a value at every location, thus filling gaps.
WARNING: This script is processing intensive and may take a while to run even for modestly sized datasets.
- in_rasterpath – input filepath to raster to fill gaps
- out_rasterpath – filepath to store output gap filled raster in
- model – type of kriging model to run, options include “SPHERICAL”, “CIRCULAR”, “EXPONENTIAL”, “GAUSSIAN”, and “LINEAR”
- max_cell_dist – The maximum number of cells to interpolate between, data gaps which do not have at least “min_points” points within this distance will not be filled.
- min_points – Minimum number of surrounding points to use in determining value at missing cell.
Returns path to file created by this function
Lists all the rasters in an input directory. finds all formats supported by
- dir_name – directory to search rasters for
- recursive – Set to “True” to search within subfolders of input directory “dir_name”
Verifies that input filename exists, and is of a valid raster format
Parameters: filename – check if this filename is a valid accessible raster. Return <bool>: returns True if filename is valid accessible raster. False otherwise.
many_stats(rasterlist, outdir, outname, saves=None, low_thresh=None, high_thresh=None, numtype='float32', NoData_Value=-9999)¶
Take statistics across many input rasters. This function is used to take statistics on large groups of rasters with identical spatial extents.
- rasterlist – list of raster filepaths for which to take statistics
- outdir – directory where output should be stored.
- outname – output name filename string that will be used in output filenames
- saves – which statistics to save in a raster. Defaults to all three [‘AVG’,’NUM’,’STD’].
- low_thresh – values below low_thresh are assumed erroneous and set to NoData
- high_thresh – values above high_thresh are assumed erroneous and set to NoData.
- numtype – type of numerical value. defaults to 32bit float.
This function does not return anything.
metadata(raster=None, xs=None, ys=None, zs=None)¶
A dnppy standard class for storing raster metadata in a way that allows arcpy.Raster() object style handling while using numpy arrays for data manipulation.
Parameters: raster – filepath to raster existing on disk
typical attributes include the following
Attribute Description Xsize x dimension of raster Ysize y dimension of raster Zsize z dimension of raster (number of bands) Xmin lowest x coordinate Ymin lowest y coordinate Xmax largest x coordiante Ymax largest y coordinate cellWidth horizontal pixel resolution cellHeight vertical pixel resolution desc_pixelType pixel type description returned by arcpy.Describe() pixel_type pixel type description that is actually accepted as input by other
arcpyfunctions. (ex “32_BIT_FLOAT”)
numpy_datatype numpy accepted pixel type string. (ex “float32”) projection the projection of the raster (long string) NoData_Value the value representing no data
new_mosaic(rasterpaths, output_path, mosaic_method=None, cell_size=None, number_of_bands=None)¶
Simply creates a new raster dataset mosaic of input rasters by wrapping the arcpy.MosaicToNewRaster_management function. learn more about the fields here
- rasterpaths – list of complete filepaths to raster data to mosaic
- output_path – place to save new mosaic raster dataset
- mosaic_method – options are “FIRST”, “LAST”, “BLEND”, “MEAN”, “MINIMUM”,”MAXIMUM”
- cell_size – of format “[cellwidth] [cellheight]” in the appropriate linear units, usually meters.
returns filepath to new file, same as input
Simple batch NoData setting function. Makes raster data more arcmap viewing friendly
Function inputs a list of raster (usually tifs) files and sets no data values. This function does not actually change the raster values in any way, and simply defines which numerical values to be considered NoData in metadata.
- rastlist – list of rasters for which to set nodata value
- NoData_Value – Value to declare as NoData (usually 0 or -9999)
returns list of modified files
null_set_range(rastlist, high_thresh=None, low_thresh=None, NoData_Value=None)¶
Changes values within a certain range to NoData. similar to
raster.null_define, but can take an entire range of values to set to NoData. useful in filtering obviously erroneous high or low values from a raster dataset.
- rastlist – list of rasters for which to set no dta values
- high_thresh – will set all values above this to NoData
- low_thresh – will set all values below this to NoData
list of all rasters modified by this function
project_resample(filelist, reference_file, outdir=None, resampling_type=None, cell_size=None)¶
Wrapper for multiple arcpy projecting functions. Projects to reference file. Inputs a filelist and a reference file, then projects all rasters or feature classes in the filelist to match the projection of the reference file. Writes new files with a in the filelist to match the projection of the reference file. Writes new files with a “_p” appended to the end of the input filenames. This also will perform resampling.
- filelist – list of files to be projected
- outdir – optional desired output directory. If none is specified, output files will be named with ‘_p’ as a suffix.
- reference_file – Either a file with the desired projection, or a .prj file.
- resampling_type – exactly as the input for arcmaps project_Raster_management function
- cell_size – exactly as the input for arcmaps project_Raster_management function
list of files created by this function.
raster_fig objects are used for heads up displays of raster data to the user.
- numpy_rast – a numpy array representing araster dataset
- title – title to put on the raster figure plot
closes an active figure
function to set up an initial figure
Function to update a figure that already exists.
- numpy_rast – a numpy array representing a raster dataset
- title – title to put on the raster figure object
raster_overlap(file_A, file_B, outpath, NoData_A=None, NoData_B=None)¶
Finds overlaping area between two raster images. this function examines two images and outputs a raster identifying pixels where both rasters have non-NoData values. Output raster has 1’s where both images have data and 0’s where one or both images are missing data.
- file_A – the first file
- file_B – the second file
- outpath – the output filename for the desired output. must end in ”.tif”
- NoData_A – the NoData value of file A
- NoData_B – the NoData value of file B
filepath to raster created by this function.
- This function automatically invokes
spatially_match(snap_raster, rasterlist, outdir, NoData_Value=False, resamp_type=False)¶
Prepares input rasters for further numerical processing. This function simply ensures all rasters in “rasterlist” are identically projected and have the same cell size, then calls the raster.clip_and_snap function to ensure that the cells are perfectly coincident and that the total spatial extents of the images are identical, even when NoData values are considered. This is useful because it allows the two images to be passed on for numerical processing as nothing more than matrices of values, and the user can be sure that any index in any matrix is exactly coincident with the same index in any other matrix. This is especially important to use when comparing different datasets from different sources outside arcmap, for example MODIS and Landsat data with an ASTER DEM.
- snap_raster – raster to which all other images will be snapped
- rasterlist – list of rasters, a single raster, or a directory full of tiffs which will be clipped to the extent of “snap_raster” and aligned such that the cells are perfectly coincident.
- outdir – the output directory to save newly created spatially matched tifs.
- resamp_type – The resampling type to use if images are not identical cell sizes. “NEAREST”,”BILINEAR”,and “CUBIC” are the most common.
- this function automatically invokes
Wrapper for arcpy.RasterToNumpyArray with better metadata handling
This is just a wraper for the RasterToNumPyArray function within arcpy, but it also extracts out all the spatial referencing information that will probably be needed to save the raster after desired manipulations have been performed. also see raster.from_numpy function in this module.
- raster – Any raster supported by the arcpy.RasterToNumPyArray function
- numpy_datatype – must be a string equal to any of the types listed at the following address [http://docs.scipy.org/doc/numpy/user/basics.types.html] for example: ‘uint8’ or ‘int32’ or ‘float32’
the numpy array version of the input raster
a metadata object. see