predict_class and predict_probabilities

@JeroenVerstraelen is working on an implementation of CatBoost-based ML in the VITO backend, and while discussing the details a couple of things came up:

- The `predict_catboost` process would be practically identical to `predict_random_forest`, except for some textual differences in the title and descriptions. It turns out that it is not really necessary to define a dedicated `predict_` process for each kind of machine learning model: all the model details are embedded in the `ml-model` object, so a single `predict(data: array, model: ml-model)` could serve all kinds of ML models.
- For some use cases we want to predict the probability of each class instead of a single class prediction. We first considered adding a parameter to toggle between class output and probabilities output, but that would mean that the output type changes with the parameter: a scalar for class prediction and an array for probability prediction. Moreover, the former has to be used in `reduce_dimension` and the latter in `apply_dimension`. It felt error-prone and confusing to let these two different patterns depend on a rather inconspicuous boolean parameter. It might be better to have separate processes for class prediction and probabilities prediction.
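To illustrate the first point, here is a minimal Python sketch, not actual openEO or backend code. It assumes a hypothetical `MLModel` interface standing in for the `ml-model` object, and two toy model classes (`ThresholdModel`, `NearestCentroidModel`) invented purely for illustration. Because each model object carries its own details, one generic `predict` works for both:

```python
from typing import Dict, Protocol, Sequence


class MLModel(Protocol):
    """Hypothetical stand-in for an `ml-model` object (assumption, not openEO API)."""

    def predict(self, features: Sequence[float]) -> float: ...


class ThresholdModel:
    """Toy model: class 1.0 if the mean feature value exceeds a threshold, else 0.0."""

    def __init__(self, threshold: float) -> None:
        self.threshold = threshold

    def predict(self, features: Sequence[float]) -> float:
        mean = sum(features) / len(features)
        return 1.0 if mean > self.threshold else 0.0


class NearestCentroidModel:
    """Toy model: returns the label of the closest centroid."""

    def __init__(self, centroids: Dict[float, Sequence[float]]) -> None:
        self.centroids = centroids  # maps class label -> centroid vector

    def predict(self, features: Sequence[float]) -> float:
        def squared_distance(centroid: Sequence[float]) -> float:
            return sum((f - c) ** 2 for f, c in zip(features, centroid))

        return min(self.centroids, key=lambda label: squared_distance(self.centroids[label]))


def predict(data: Sequence[float], model: MLModel) -> float:
    """Generic prediction: all model-specific logic lives in the model object."""
    return model.predict(data)
```

The caller never needs to know which kind of model it was handed, which is why a dedicated process per model type adds little beyond different titles and descriptions.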


So with this background, the proposal is to introduce two generic ML prediction processes:
- `predict_class(data: array, model: ml-model) -> number` 
- `predict_probabilities(data: array, model: ml-model) -> array` 
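The difference in output type, and why it maps to `reduce_dimension` versus `apply_dimension`, can be sketched in plain Python. This is only an illustration under stated assumptions: `ToyModel` and its softmax scoring are hypothetical, and the `reduce_dimension` and `apply_dimension` functions below are simplified list-based stand-ins for the openEO processes of the same name, not their real signatures.

```python
import math
from typing import Callable, List, Sequence


class ToyModel:
    """Hypothetical model: one weight vector per class, scored with a softmax."""

    def __init__(self, class_weights: Sequence[Sequence[float]]) -> None:
        self.class_weights = class_weights

    def probabilities(self, features: Sequence[float]) -> List[float]:
        scores = [sum(w * f for w, f in zip(weights, features))
                  for weights in self.class_weights]
        peak = max(scores)  # subtract the maximum for numerical stability
        exps = [math.exp(s - peak) for s in scores]
        total = sum(exps)
        return [e / total for e in exps]


def predict_probabilities(data: Sequence[float], model: ToyModel) -> List[float]:
    """Returns an array: one probability per class."""
    return model.probabilities(data)


def predict_class(data: Sequence[float], model: ToyModel) -> int:
    """Returns a single number: the index of the most probable class."""
    probs = model.probabilities(data)
    return max(range(len(probs)), key=probs.__getitem__)


def reduce_dimension(pixels: Sequence[Sequence[float]],
                     reducer: Callable[[Sequence[float]], float]) -> List[float]:
    """Simplified stand-in: collapses each pixel's feature array to one scalar."""
    return [reducer(pixel) for pixel in pixels]


def apply_dimension(pixels: Sequence[Sequence[float]],
                    process: Callable[[Sequence[float]], List[float]]) -> List[List[float]]:
    """Simplified stand-in: replaces each pixel's feature array with a new array."""
    return [process(pixel) for pixel in pixels]


model = ToyModel(class_weights=[[1.0, 0.0], [0.0, 1.0]])
pixels = [[2.0, 0.0], [0.0, 3.0]]  # two pixels, two features each

# Class prediction removes the feature dimension: one number per pixel.
classes = reduce_dimension(pixels, lambda px: predict_class(px, model))

# Probability prediction keeps a dimension: one array (per-class values) per pixel.
probabilities = apply_dimension(pixels, lambda px: predict_probabilities(px, model))
```

With two separate processes, the return type of each callback is fixed, so a user cannot accidentally pass an array-producing prediction to a reducer by flipping a boolean.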

Both can easily be specified based on the current `predict_random_forest` proposal: https://github.com/Open-EO/openeo-processes/blob/draft/proposals/predict_random_forest.json



predict_class and predict_probabilities #368
