# distributed-training

Here are 43 public repositories matching this topic...

jankrynauw commented Jun 6, 2019

We would like to forward a particular 'key' column, which is part of the features, so that it appears alongside the predictions. This would let us identify which set of features a particular prediction belongs to. Here is an example of prediction output using the tensorflow.contrib.estimator.multi_class_head:

{"classes": ["0", "1", "2", "3", "4", "5", "6", "7", "8", "9"],
 "scores": [0.068196
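
To make the request above concrete, here is a minimal plain-Python sketch of the desired result: each prediction dict carries the 'key' value of the feature row it came from. The function name `forward_key` and the sample rows are hypothetical, and the sketch assumes predictions come back in the same order as the input features; it is not the estimator's own mechanism.

```python
from typing import Any, Dict, Iterable, Iterator


def forward_key(
    features: Iterable[Dict[str, Any]],
    predictions: Iterable[Dict[str, Any]],
    key: str = "key",
) -> Iterator[Dict[str, Any]]:
    """Yield each prediction with the key of its matching feature row attached.

    Assumption: features and predictions are aligned one-to-one, in order.
    """
    for feat, pred in zip(features, predictions):
        merged = dict(pred)       # copy so the original prediction is untouched
        merged[key] = feat[key]   # carry the identifying column across
        yield merged


# Hypothetical sample data, for illustration only.
features = [{"key": "row-1", "x": 0.5}, {"key": "row-2", "x": 1.5}]
predictions = [
    {"classes": ["0", "1"], "scores": [0.9, 0.1]},
    {"classes": ["0", "1"], "scores": [0.2, 0.8]},
]
keyed = list(forward_key(features, predictions))
```

With the key attached, a downstream consumer can join predictions back to their source rows without relying on output order.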
aurickq commented Sep 6, 2020

Currently the allocator triggers its allocation policy at a fixed time interval (default 60s). This is useful for periodically re-optimizing the resource allocations, but it also means that a new job must wait for the next allocation cycle before it can start. When enough resources are already available for the new job, it should be possible to schedule it immediately.

Possible implementation:

  • Change `sched/alloca
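
The behaviour requested above could be sketched roughly as follows. This is an illustrative toy and not the project's allocator: the `Allocator` class, its method names, and the single GPU counter are all assumptions, and the periodic pass here simply starts queued jobs in arrival order rather than re-optimizing allocations.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Allocator:
    """Hypothetical allocator: immediate start when resources fit, else queue."""

    free_gpus: int
    pending: List[str] = field(default_factory=list)       # jobs waiting for a cycle
    requests: Dict[str, int] = field(default_factory=dict)  # GPUs wanted per queued job
    running: Dict[str, int] = field(default_factory=dict)   # GPUs held per running job

    def submit(self, job: str, gpus: int) -> bool:
        """Start the job right away if it fits; otherwise defer to the next cycle."""
        if gpus <= self.free_gpus:
            self._start(job, gpus)
            return True
        self.pending.append(job)
        self.requests[job] = gpus
        return False

    def run_cycle(self) -> List[str]:
        """Periodic pass (e.g. every 60s): start queued jobs that now fit."""
        started = []
        for job in list(self.pending):
            gpus = self.requests[job]
            if gpus <= self.free_gpus:
                self.pending.remove(job)
                del self.requests[job]
                self._start(job, gpus)
                started.append(job)
        return started

    def finish(self, job: str) -> None:
        """Release the resources held by a finished job."""
        self.free_gpus += self.running.pop(job)

    def _start(self, job: str, gpus: int) -> None:
        self.free_gpus -= gpus
        self.running[job] = gpus
```

In this sketch a job that fits never waits for the timer, while a job that does not fit still falls back to the periodic cycle, so the existing re-optimization path is left in place.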
