As the title suggests, this vignette describes in more detail how to create process graphs and “User Defined Processes” (UDP). These processes can be seen as the analysis workflows you want to run at the openEO back-end of your choice. You can call them directly later, or you can add variables to a UDP to customize the process at runtime, e.g. to set the temporal interval or the area of interest. This vignette focuses on the user's point of view.
In the following we will create the EVI (Enhanced Vegetation Index) calculation and store it as a process graph on the openEO service.
Most of the processes offered by openEO services are standardized,
which means that mathematical operators like + and - behave
consistently across different services. This also allowed us to
overload the primitive mathematical operators in R, so that they
are easy to use when building a process graph.
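To illustrate the general idea of operator overloading, here is a minimal sketch in base R. The class name `node` and its `label` field are made up for this illustration; they are not the classes used by the openeo package, whose actual implementation may differ. Instead of computing a number, the overloaded operators record which operation was requested.

```r
# Hypothetical "node" class for illustration only (not part of openeo).
node <- function(label) structure(list(label = label), class = "node")

# S3 group generic: called for +, -, *, / ... when an operand is a "node".
# Rather than calculating, it builds a textual record of the operation.
Ops.node <- function(e1, e2) {
  lab <- function(e) if (inherits(e, "node")) e$label else as.character(e)
  node(paste0("(", lab(e1), " ", .Generic, " ", lab(e2), ")"))
}

nir <- node("nir")
red <- node("red")

# Ordinary R arithmetic syntax now produces a description of the expression.
expr <- 2.5 * (nir - red)
print(expr$label)
```

A process graph builder can follow the same pattern and create a graph node for each operator call, which is why plain arithmetic syntax can be used inside functions such as the EVI function.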
The EVI calculation is a function that is applied to specific bands
of an optical image collection. It involves the bands “red”, “blue”
and the near infrared. These 3 bands are combined into a single band,
which is referred to as reducing the band dimension. The calculation
is simple band arithmetic, which is usually done in R by passing a
function to a raster calculation function like raster::calc. The
openeo package uses a similar mechanism.
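As a quick sanity check of the band arithmetic itself, the same formula can be evaluated in plain R before it is handed to openeo. This is only a sketch: the name `evi_local` and the reflectance values are made up for illustration, and no back-end is involved.

```r
# Plain-R version of the EVI formula, for illustration only.
# `data` is a numeric vector ordered as (nir, red, blue).
evi_local <- function(data) {
  nir  <- data[1]
  red  <- data[2]
  blue <- data[3]
  (2.5 * (nir - red)) / sum(nir, 6 * red, -7.5 * blue, 1)
}

# Made-up surface reflectance values for a single vegetated pixel.
pixel <- c(0.5, 0.1, 0.05)
evi_value <- evi_local(pixel)
print(evi_value)
```

For these values the numerator is 1.0 and the denominator 1.725, so the index is about 0.58. Three input values become one output value, which is exactly what reducing the band dimension means.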
library(openeo)
# Assumes an active connection to an openEO back-end, e.g. created with connect().
p = processes()
# EVI as band arithmetic; data holds the band values in the order B08, B04, B02.
evi = function(data,context) {
  B08 = data[1]  # near infrared
  B04 = data[2]  # red
  B02 = data[3]  # blue
  return((2.5 * (B08 - B04)) / sum(B08, 6 * B04, -7.5 * B02, 1))
}
Just to get an impression of what happens behind the scenes, we will now have a look at the JSON graph of this quite simple-looking arithmetic function. Under normal circumstances you do not need to create the JSON representation yourself; the package handles this internally.
The following coercion creates an internal Graph object. Printing
that object automatically serializes it into the JSON graph.
as(evi, "Graph")
## {
##   "array_element_OTDBT3204C": {
##     "process_id": "array_element",
##     "arguments": {
##       "data": {
##         "from_parameter": "data"
##       },
##       "index": 0,
##       "return_nodata": false
##     }
##   },
##   "array_element_CWYIA3100R": {
##     "process_id": "array_element",
##     "arguments": {
##       "data": {
##         "from_parameter": "data"
##       },
##       "index": 1,
##       "return_nodata": false
##     }
##   },
##   "subtract_AUQIS3871L": {
##     "process_id": "subtract",
##     "arguments": {
##       "x": {
##         "from_node": "array_element_OTDBT3204C"
##       },
##       "y": {
##         "from_node": "array_element_CWYIA3100R"
##       }
##     }
##   },
##   "multiply_MSMHH5529C": {
##     "process_id": "multiply",
##     "arguments": {
##       "x": 2.5,
##       "y": {
##         "from_node": "subtract_AUQIS3871L"
##       }
##     }
##   },
##   "multiply_RAUNB9250Y": {
##     "process_id": "multiply",
##     "arguments": {
##       "x": 6,
##       "y": {
##         "from_node": "array_element_CWYIA3100R"
##       }
##     }
##   },
##   "array_element_EKPUG2335B": {
##     "process_id": "array_element",
##     "arguments": {
##       "data": {
##         "from_parameter": "data"
##       },
##       "index": 2,
##       "return_nodata": false
##     }
##   },
##   "multiply_DFHFS0742K": {
##     "process_id": "multiply",
##     "arguments": {
##       "x": -7.5,
##       "y": {
##         "from_node": "array_element_EKPUG2335B"
##       }
##     }
##   },
##   "sum_VKSWI2664S": {
##     "process_id": "sum",
##     "arguments": {
##       "data": [
##         {
##           "from_node": "array_element_OTDBT3204C"
##         },
##         {
##           "from_node": "multiply_RAUNB9250Y"
##         },
##         {
##           "from_node": "multiply_DFHFS0742K"
##         },
##         1
##       ],
##       "ignore_nodata": false
##     }
##   },
##   "divide_JKTIE0259Q": {
##     "process_id": "divide",
##     "arguments": {
##       "x": {
##         "from_node": "multiply_MSMHH5529C"
##       },
##       "y": {
##         "from_node": "sum_VKSWI2664S"
##       }
##     },
##     "result": true
##   }
## }
The coercion was just an excursion in order to get a glimpse at the data structure that will be sent to the back-end. We now continue with the user-defined process creation.
The previous use case covered a sub-process graph. Now we are going to create an analysis-ready process graph that selects data and performs several dimension modifications. It uses the EVI band arithmetic as an embedded function.
library(sf)
p = processes()
bbox = st_bbox(c(xmin=16.1,
                 xmax=16.6,
                 ymax=48.6,
                 ymin= 47.2), crs = 4326)
data = p$load_collection(id = "SENTINEL2_L2A_SENTINELHUB",
                         spatial_extent = bbox,
                         temporal_extent = list(
                           "2018-04-01", "2018-05-01"
                         ),
                         bands=list("B08","B04","B02"))
Here we use the EVI calculation stored in the variable
evi from the example at the beginning.
spectral_reduce = p$reduce_dimension(data = data, dimension = "bands", reducer = evi)
temporal_reduce = p$reduce_dimension(data=spectral_reduce,dimension = "t", reducer = function(x,y){
  min(x)
})
As a “reducer” or “aggregator” function you always have to either
create an anonymous function or use a predefined one, with the same
number of parameters as the process expects. The names of the
parameters do not matter; only their order does. To find out how to
formulate the function, check the back-end's process documentation,
e.g. process_viewer(p$reduce_dimension) or
describe_process(p$reduce_dimension).
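What reducing a dimension with such a function does can be sketched locally in base R. The toy array below stands in for a data cube with dimensions x, y and t. Its values and the variable names are made up for illustration, and base `apply()` is used as a local analogue, not as the openeo implementation.

```r
# Toy data cube with dimensions x = 2, y = 3, t = 4 (values are made up).
cube <- array(1:24, dim = c(2, 3, 4))

# A reducer receives the values along the reduced dimension as its first
# argument. The second argument (context) is unused in this sketch.
reducer <- function(x, context = NULL) {
  min(x)
}

# The same reducer with different parameter names: only the order matters.
reducer_renamed <- function(values, ctx = NULL) {
  min(values)
}

# Keep dimensions 1 and 2 (x, y) and collapse dimension 3 (t).
reduced <- apply(cube, c(1, 2), reducer)
reduced_renamed <- apply(cube, c(1, 2), reducer_renamed)

print(dim(reduced))
```

Both calls produce the same 2 x 3 result, because the function is called positionally with the values along t for each pixel.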
apply_linear_transform = p$apply(data=temporal_reduce,process = function(value,...) {
  p$linear_scale_range(x = value, 
                           inputMin = -1, 
                           inputMax = 1, 
                           outputMin = 0, 
                           outputMax = 255)
})
As a last step we will store the results as a PNG file. The
ProcessNode returned from that function will be the end node of our
graph, so we will pass it on to the openEO service functions.
result = p$save_result(data = apply_linear_transform, format = "PNG")
The result can be considered the final node of this graph. We can
pass this node to a processing function like compute_result() or
create_job(), or we can store the process at the back-end for reuse
(note: this depends on whether the particular back-end supports
user-defined processes).
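As an aside on the rescaling step used in the workflow above, the linear transformation from the interval -1 to 1 into 0 to 255 can be written out in plain R. This is a sketch under assumptions: the function name `linear_scale` is made up, and the clipping of out-of-range inputs reflects my reading of the openEO process description rather than something stated in this vignette.

```r
# Plain-R sketch of a linear rescaling; parameter names mirror the call above.
linear_scale <- function(x, inputMin, inputMax, outputMin = 0, outputMax = 1) {
  # Assumption: values outside the input range are clipped first.
  x <- pmin(pmax(x, inputMin), inputMax)
  ((x - inputMin) / (inputMax - inputMin)) * (outputMax - outputMin) + outputMin
}

# Made-up index values, including one outside the input range.
scaled <- linear_scale(c(-1, 0, 1, 1.5),
                       inputMin = -1, inputMax = 1,
                       outputMin = 0, outputMax = 255)
print(scaled)
```

The lower bound maps to 0, the midpoint to 127.5 and the upper bound to 255, which is the value range needed for an 8-bit PNG.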
You don’t necessarily need to store each intermediate step in a separate variable, although doing so can make it quicker to edit individual parameters later. The same graph can also be written a little more tidily like this:
library(magrittr)
p = processes()
result2 = p$load_collection(id = "SENTINEL2_L2A_SENTINELHUB",
                         spatial_extent = bbox,
                         temporal_extent = list(
                           "2018-04-01", "2018-05-01"
                         ),
                         bands=list("B08","B04","B02")) %>% 
  p$reduce_dimension(dimension = "bands",reducer = evi) %>% 
  p$reduce_dimension(dimension = "t", reducer = function(x,y){
    min(x)
  }) %>% 
  p$apply(process = function(value,...) {
    p$linear_scale_range(x = value, 
                             inputMin = -1, 
                             inputMax = 1, 
                             outputMin = 0, 
                             outputMax = 255)
  }) %>% 
  p$save_result(format="PNG")
So far we have created a small evi graph and a second graph that carries out the whole processing on the data. In the next steps we store them at the openEO service for later use. This depends on the features supported by the connected back-end, so the functions may not be available; in any case they showcase the intended use of user-defined processes. In short, user-defined processes let you store your process graphs as reusable processes on the back-end, in the same way that predefined processes exist as the basic graph building blocks. You already get much the same functionality by keeping your R scripts and running the code again, which is the reasoning of back-end providers that do not support user-defined processes.
With the next command you can check your process graph locally for potential problems, before even sending it to the back-end.
graph_id = create_user_process(graph = evi, id = "evi", summary = "EVI calculation on an array with 3 bands", description = "The EVI calculation is based on an array of 3 band values: nir, red, blue. In that order.")
Fetch the process graph definition as a user-defined openEO process and print it.
If you want the graph representation reimported into R, you can use
parse_graph on the received ProcessInfo
object or you can use the coerce function.
Note: depending on the functionality the back-end supports, storing user-defined processes might not be implemented. The following code sample nevertheless shows how you would create a user-defined process on the back-end.
min_evi_graph_id = create_user_process(graph = result, id = "min_evi", summary = "Minimum EVI calculation on Sentinel-2", description = "A preset process graph that calculates the minimum EVI on Sentinel-2 data and performs a linear scaling into the value interval 0 to 255 in order to store the results as PNG.")
As this graph has no parameters and will simply run like a pre-configured job, feel free to delete the example process again.
We have now created a user-defined process for the evi function. The
openeo package allows you to reuse such processes in a similar way to
predefined processes: instead of processes(), you use
user_processes() to create a process node builder object. We named
the process evi as well, which helps to reference the correct
process. We will now take the complete workflow in which we stored
the intermediate nodes in individual variables, and replace the evi
function held in the reducer parameter with the user-defined process
evi.
# Builder object for the user-defined processes stored at the back-end.
udps = user_processes()
spectral_reduce$parameters$reducer = function(x,context) {
  udps$evi(x)
}
min_evi_graph = as(result,"Process")
As all the process nodes and arguments are R6 objects, the value is replaced at object level. R6 objects have reference semantics, which means that the value is updated wherever the process node is used.
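The effect of reference semantics can be sketched without any back-end. R6 objects are built on environments, so a plain environment is used here as a stand-in to avoid a package dependency. The variable names and the string values are made up for illustration.

```r
# An environment stands in for an R6 process node.
node <- new.env()
node$reducer <- "evi function"

# Two variables that refer to the same node, as when one node is used by
# several downstream nodes of a process graph.
used_by_a <- node
used_by_b <- node

# Updating through one reference is visible through all of them.
used_by_a$reducer <- "user-defined process evi"
print(used_by_b$reducer)

# A plain list, in contrast, is copied on modification.
plain <- list(reducer = "evi function")
copy <- plain
copy$reducer <- "user-defined process evi"
print(plain$reducer)
```

This is why editing the reducer of spectral_reduce is enough: the nodes that were built on top of it see the updated value without being recreated.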