Currently, pipeline modules are moved to the preferred compute device during `__call__`. This is reasonable: the modules stay there as long as the user keeps passing the same `torch_device` across calls.
However, in multi-GPU model-serving scenarios it would be useful to move each pipeline to a dedicated device during, or immediately after, instantiation. This would make it possible to create, say, 8 different pipelines and move each one to a different GPU, potentially saving CPU memory while the service is being prepared.
Currently, the workaround is to perform a call with fake data immediately after instantiation.
Describe the solution you'd like
Ideally, the following should work:
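The snippet under this heading didn't survive extraction; presumably it showed something along the lines of `StableDiffusionPipeline.from_pretrained(...).to("cuda:0")`. Below is a toy sketch of the intended semantics — `ToyModule` and `ToyPipeline` are illustrative stand-ins, not the diffusers API:

```python
class ToyModule:
    """Minimal stand-in for a torch.nn.Module: just tracks its device."""
    def __init__(self):
        self.device = "cpu"

    def to(self, device):
        self.device = device
        return self


class ToyPipeline:
    """Toy pipeline holding named modules, the way a diffusers pipeline
    holds e.g. unet, vae, and text_encoder."""
    def __init__(self, **modules):
        self.modules = modules

    def to(self, device):
        # The proposed method: move every registered module at once.
        for module in self.modules.values():
            module.to(device)
        return self  # return self to allow chaining, like nn.Module.to


# One pipeline per GPU, as in the 8-GPU serving scenario above.
pipes = [ToyPipeline(unet=ToyModule(), vae=ToyModule()).to(f"cuda:{i}")
         for i in range(8)]
```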
Another alternative would be to pass the device to the initializer. This could be done in addition to adding a `to` method, but I don't believe it's necessary, as `to` is familiar enough to PyTorch users.
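The initializer variant could look like this — the `device` keyword is hypothetical, and the toy class stands in for the real diffusers pipeline:

```python
class ToyModule:
    """Minimal stand-in for a torch.nn.Module: just tracks its device."""
    def __init__(self):
        self.device = "cpu"

    def to(self, device):
        self.device = device
        return self


class ToyPipeline:
    def __init__(self, device="cpu", **modules):
        # Hypothetical alternative: accept the target device at
        # construction time, so weights never linger in CPU memory.
        self.modules = modules
        for module in modules.values():
            module.to(device)


pipe = ToyPipeline(device="cuda:1", unet=ToyModule())
```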
@patil-suraj happy to take it! I'll do it after making some progress on the backend, unless it's urgent. I think I'd be ready to work on this later today or tomorrow, would that be ok?
pcuenca commented Aug 17, 2022
Describe alternatives you've considered
Current workaround:
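The workaround snippet was lost in extraction; it can be modeled on the lazy device move described above (toy class and fake prompt are illustrative, not the diffusers API):

```python
class ToyPipeline:
    """Stand-in for the current behavior: modules are moved to
    `torch_device` lazily, inside __call__."""
    def __init__(self):
        self.device = "cpu"

    def __call__(self, prompt, torch_device="cpu"):
        self.device = torch_device  # the lazy device move happens here
        return f"image for {prompt!r}"


pipe = ToyPipeline()
# Throwaway call with fake data, purely for the side effect of moving
# the weights off the CPU right after instantiation.
pipe("warm-up", torch_device="cuda:0")
```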
Additional context
See discussion in this Slack thread.