Support allow_reuse in repo #140
Comments
|
Just jotting down ideas: for (1), we can specify tags to associate with the run and steps as part of the submission process.
In the register-model step we can pull the tags from the parent run (as we already do for the MSE value). |
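A rough sketch of that idea (the model name and path below are assumptions, not values from the repo): inside the register-model step, tags set at submission time can be read from the parent pipeline run and attached to the registered model via the Azure ML SDK. This fragment needs a live Azure ML run context, so it is illustrative only:

```python
from azureml.core import Run

# Hypothetical register-model step: propagate submission-time tags
# from the parent pipeline run onto the registered model.
run = Run.get_context()
parent_tags = run.parent.get_tags()  # tags set when the run was submitted

run.parent.register_model(
    model_name="sklearn_regression_model",  # assumed model name
    model_path="outputs/model.pkl",         # assumed output path
    tags=parent_tags,
)
```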
|
@sudivate The change you made doesn't address @xinyi-joffre's second point. Each step in the pipeline should be able to take advantage of allow_reuse, which is only possible if each step uses a "standalone" script directory. |
How is reuse achieved and why was this issue closed?
|
|
" If you use Azure Machine Learning datasets as inputs, reuse is determined by whether the dataset's definition has changed, not by whether the underlying data has changed." I think the allow_reuse flag should be set to false to start with on all steps and inform users clearly of its existence and how to use it. Anyone adapting this repo, would be experimenting on the pipelines and trying to run their custom script and this stands in the way of it as it is a very tricky issue to find. |
Currently all of the pipeline steps have allow_reuse=False. As a developer, it would be great to enable reuse of steps so that only my changes run.
allow_reuse=True does not work in the repo for two reasons:
1. The repo passes build_id as a parameter to all the steps, and updating any parameter value or parameter default means no reuse of those steps. The repo would need to stop passing build_id, or let the user build and run with a static/fake build ID while iterating on code.
2. All of the pipeline steps share the same hashed source directory, which forces a snapshot rebuild if any file in that directory changes. All the steps in the train pipeline currently use source_directory=e.sources_directory_train. In the repo, train.py looks like a standalone script, so to optimize more for reuse the repo could put each step's scripts into an isolated directory, or point each step at its file instead of the directory. As long as the snapshot is not forced to rebuild, reuse should be possible.
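The second point can be illustrated with a small self-contained sketch (the hashing scheme below is a simplification, not Azure ML's actual snapshot algorithm): when every step fingerprints the same directory, editing any one file changes the fingerprint all steps see, so all of them lose reuse.

```python
import hashlib
import os
import tempfile
from pathlib import Path

def dir_hash(path):
    """Simplified stand-in for a snapshot hash over a source directory."""
    h = hashlib.sha256()
    for root, _, files in sorted(os.walk(path)):
        for name in sorted(files):
            h.update(name.encode())
            h.update(Path(root, name).read_bytes())
    return h.hexdigest()

with tempfile.TemporaryDirectory() as shared:
    # Two steps sharing one source_directory, as in the train pipeline.
    Path(shared, "train.py").write_text("print('train')")
    Path(shared, "evaluate.py").write_text("print('evaluate')")
    before = dir_hash(shared)

    # Editing only evaluate.py still changes the shared hash, so the
    # unchanged train step is re-run as well.
    Path(shared, "evaluate.py").write_text("print('evaluate v2')")
    after = dir_hash(shared)

print(before != after)
```

With isolated per-step directories (or each step pointing at just its own script), the train step's fingerprint would stay stable and it could be reused.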