openlayer.InferencePipeline.update_data
- InferencePipeline.update_data(*args, **kwargs)
Updates values for data already on the Openlayer platform.
This method is frequently used to upload the ground truths of production data that was already published without them. This is useful when the ground truths are not available at inference time but become available later, because uploading them enables performance metrics.
- Parameters:
- df : pd.DataFrame
Dataframe containing ground truths.
The df must contain a column with the inference IDs, and another column with the ground truths.
- ground_truth_column_name : Optional[str]
Name of the column containing the ground truths. Optional, defaults to None.
- inference_id_column_name : str
Name of the column containing the inference IDs. The inference IDs are used to match the ground truths with the production data already published.
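To make the role of the inference IDs concrete, here is a minimal conceptual sketch in plain Python. It is not Openlayer's implementation and does not call the Openlayer API; the helper match_ground_truths, the published dictionary, and the prediction values are all hypothetical, chosen only to illustrate how ground truths can be paired with already-published rows by ID.

```python
from typing import Dict, Iterable, List, Tuple


def match_ground_truths(
    published_rows: Dict[str, dict],
    updates: Iterable[Tuple[str, object]],
    ground_truth_key: str = "label",
) -> List[str]:
    """Attach each ground truth to the row with the same inference ID.

    Hypothetical helper for illustration only. Returns the inference IDs
    that had no matching published row.
    """
    unmatched = []
    for inference_id, ground_truth in updates:
        row = published_rows.get(inference_id)
        if row is None:
            # No published row carries this ID, so there is nothing to update.
            unmatched.append(inference_id)
            continue
        row[ground_truth_key] = ground_truth
    return unmatched


# Hypothetical rows that were published earlier without ground truths.
published = {
    "d56d2b2c": {"prediction": 0},
    "3b0b2521": {"prediction": 1},
}

# (inference ID, ground truth) pairs that became available later.
updates = [("d56d2b2c", 0), ("3b0b2521", 1), ("8c294a3a", 0)]

unmatched = match_ground_truths(published, updates)
```

In this sketch, an ID with no published counterpart is collected rather than raising an error. How the platform itself treats such IDs is not stated here, so it is worth checking that the IDs in df correspond to rows you actually published.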
Examples
Related guide: How to set up monitoring.
Let’s say you have a batch of production data already published to the Openlayer platform (with the method publish_batch_data). Now, you want to update the ground truths of this batch.
First, instantiate the client and retrieve an existing inference pipeline:
>>> import openlayer
>>>
>>> client = openlayer.OpenlayerClient('YOUR_API_KEY_HERE')
>>>
>>> project = client.load_project(name="Churn prediction")
>>>
>>> inference_pipeline = project.load_inference_pipeline(
...     name="XGBoost model inference pipeline",
... )
If your df with the ground truths looks like the following:
>>> df
  inference_id  label
0     d56d2b2c      0
1     3b0b2521      1
2     8c294a3a      0
You can publish the ground truths with:
>>> inference_pipeline.update_data(
...     df=df,
...     inference_id_column_name='inference_id',
...     ground_truth_column_name='label',
... )