[RMP] Support Offline Batch processing of Recs Generation Pipelines #419

Open · 4 tasks
jperez999 opened this issue Jun 27, 2022 · 11 comments

@jperez999 (Collaborator) commented Jun 27, 2022

Problem:

As a user, I would like to run my merlin systems inference pipeline in an offline setting. This will allow me to produce a set of recommendations for all users that can be served from a data store, used in an email campaign, etc. I will also be able to conduct rigorous testing and better compare behavior against other systems, at both the operator and the system level.

Goal:

To do this, I need to be able to run my merlin systems inference graph without using Triton or the configs generated for it. This will require a new operator executor class that runs the ops in Python instead of on tritonserver. The execution should behave exactly as it does in the tritonserver setting, meaning each operator should be provided the same inputs and return the same outputs. (A rough sketch of such an executor follows the list below.)

  • Run an inference operator graph without tritonserver.
  • Does not require any new user-facing API changes.
  • Execute the same graph that would be deployed to tritonserver.
  • Execute in a Python process.
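
A minimal sketch of what such an executor could look like. `SimpleNode` and `PythonExecutor` are illustrative stand-ins, not actual merlin systems APIs, and pandas DataFrames stand in for whatever transformable type we settle on:

```python
# Hypothetical sketch: run an operator graph in-process so each operator
# receives the same upstream outputs it would get under tritonserver.
# SimpleNode and PythonExecutor are illustrative, not merlin systems APIs.
import pandas as pd


class SimpleNode:
    def __init__(self, op, parents=None):
        self.op = op                  # any object with transform(df) -> df
        self.parents = parents or []


class PythonExecutor:
    """Execute a graph of operators in a Python process instead of Triton."""

    def transform(self, df: pd.DataFrame, node: SimpleNode) -> pd.DataFrame:
        # Resolve upstream operators first, then concatenate their outputs
        # column-wise so this node sees the same inputs as in the ensemble.
        if node.parents:
            parent_outputs = [self.transform(df, p) for p in node.parents]
            df = pd.concat(parent_outputs, axis=1)
        return node.op.transform(df)
```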

Constraints:

  • Use the same merlin systems graph/ops that were created for the inference pipeline and would run on tritonserver.
  • Swap out the operator executor for a Python (non-Triton) version.
  • Allow for all types of graphs, supporting multiple chains and parallel running of ALL available operators. (A toy demonstration follows this list.)
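
As a toy demonstration of these constraints, reusing the hypothetical `PythonExecutor` sketched above (the operator names here are made up), the exact same graph object runs unchanged once the executor is swapped:

```python
# Toy, hedged demonstration: the same two-operator graph that would be
# exported to tritonserver runs unchanged under the in-process executor.
import pandas as pd


class ScaleOp:
    def transform(self, df):
        return df * 2


class ScoreOp:
    def transform(self, df):
        return df.assign(score=df.sum(axis=1))


features = SimpleNode(ScaleOp())
graph = SimpleNode(ScoreOp(), parents=[features])  # graph is identical either way

recs = PythonExecutor().transform(pd.DataFrame({"x": [1, 2]}), graph)
print(recs)  # -> scaled column x plus a score column
```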

TODO:

Core

Systems

Issues

Example


@viswa-nvidia commented

Assignees will be Karl / Adam.

@sohn21c commented Jul 13, 2022

This is a prerequisite for cross-FW evaluation.

@nv-alaiacano (Contributor) commented

My impression is that batch inference for models is required for cross-FW evaluation, not the full batch inference for a system. The additional steps in the Systems' computation graph (QueryFeast, QueryFaiss, Softmax, filtering, etc) would likely not be required for batch inference on a single Model. Batch inference for the model would have a simpler "training data in -> predictions out" process, which would likely be a step in the Systems graph.

Perhaps we should first build the batch inference functionality (apply nvt transform + use model to predict) including the output format schema, and then that functionality could be shared in cross-FW evaluation and systems-wide batch prediction.
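
A minimal sketch of that shared step, assuming a fitted NVTabular workflow and a `predict_fn` that maps a dataframe batch to predictions; `batch_predict` and `predict_fn` are assumptions for illustration, not existing APIs:

```python
# Sketch of the shared "training data in -> predictions out" step:
# apply the fitted NVTabular workflow, then run the model per partition.
# batch_predict and predict_fn are assumptions, not existing merlin APIs.
from merlin.io import Dataset


def batch_predict(workflow, predict_fn, dataset: Dataset) -> Dataset:
    # Same preprocessing the online pipeline would apply.
    transformed = workflow.transform(dataset)

    # Map the model over each partition of the transformed data.
    predictions_ddf = transformed.to_ddf().map_partitions(predict_fn)
    return Dataset(predictions_ddf)
```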

@karlhigley (Contributor) commented

We do have some batch prediction functionality for models already, but it's not structured in a way that would make it a reasonable foundation for batch processing of graphs. I think we could massage it in that direction, though, and standardize how batch graph processing works in Merlin Core by refactoring what already exists.

@bschifferer (Contributor) commented

@karlhigley do you think we should add an example for it?

karlhigley changed the title from "[RMP] Support Offline Batch processing of Inference Pipelines" to "[RMP] Support Offline Batch processing of Recs Generation Pipelines" on Oct 19, 2022
@karlhigley (Contributor) commented

I think we should add an example for every significant new piece of functionality (i.e., almost all roadmap issues).


@jperez999 (Collaborator, Author) commented

This is not considered done until we can run all systems operators with a dask executor to create recommendations. Currently, some systems operators work with batches of input data, as shown in #1022. We need to make all operators work with batches of incoming data.
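
A hedged sketch of what that could look like, treating each dask partition as one batch and reusing the illustrative `PythonExecutor`/`SimpleNode` classes sketched in the description (none of this is an existing merlin API):

```python
# Illustrative only: push each dask partition (one batch) through the same
# operator graph, so every operator must handle batches of incoming data.
import dask.dataframe as dd


class DaskBatchExecutor:
    def __init__(self, inner=None):
        self.inner = inner or PythonExecutor()

    def transform(self, ddf: dd.DataFrame, graph: SimpleNode) -> dd.DataFrame:
        # map_partitions hands each batch to the in-process executor;
        # dask infers the output schema by applying it to empty metadata.
        return ddf.map_partitions(self.inner.transform, graph)
```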

@karlhigley (Contributor) commented

@jperez999 Could you add appropriate tasks to the list in the description?

@karlhigley (Contributor) commented

(People don't generally scroll down to the latest comments when looking at WIP issues to track progress, so a comment helps, but a description update is better.)

@jperez999 (Collaborator, Author) commented

We need to be able to swap out certain operators based on the runtime. For example, when running the dask executor for offline batch processing, it is not necessary to run the feature store operator unless we are testing against it; you could instead run a dataset merge operator that uses offline features stored in a parquet file. Please refer to the task list created above for further tracking.
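
A hedged sketch of that swap; `ParquetMergeOp`, the `is_feature_store` flag, and the parquet path are all illustrative, building on the toy `SimpleNode` graph from the description:

```python
# Illustrative runtime-based swap: when executing offline, replace a
# feature-store lookup node with a plain merge against parquet features.
import pandas as pd


class ParquetMergeOp:
    """Offline stand-in for a feature store lookup operator."""

    def __init__(self, path: str, on: str = "user_id"):
        self.features = pd.read_parquet(path)
        self.on = on

    def transform(self, df: pd.DataFrame) -> pd.DataFrame:
        # Produce the same feature columns as the online lookup, from disk.
        return df.merge(self.features, on=self.on, how="left")


def swap_for_offline(node: SimpleNode) -> SimpleNode:
    # Walk the graph, substituting feature-store nodes for offline runs.
    node.parents = [swap_for_offline(p) for p in node.parents]
    if getattr(node.op, "is_feature_store", False):  # hypothetical flag
        node.op = ParquetMergeOp("offline_features.parquet")  # hypothetical path
    return node
```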
