rllib

Describe your feature request

Hi guys,

It would be awesome to add an API that returns the same output as the `ray memory` command.
It would also be good to include some additional information in the output of `ray.objects()`: for example, the node IP, the IDs of objects created in in-process stores, and the IDs of objects returned by remote calls that are still executing.

Thanks in advance!

Currently we use a very ad-hoc procedure for scaling the quadratic component of NAF when used for exploration:
https://github.com/angelolovatto/raylab/blob/9820275b17ee085e1955a6d845c0bdf61333f8da/raylab/algorithms/naf/naf_policy.py#L150-L155

A possibly better alternative would be to scale it based on the desired average action stddev. Something like:

scale_tril * (1.0 / average_st
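The expression above is cut off, so the following is only a minimal sketch of one way the idea could be read, not the project's implementation. It assumes `scale_tril` is the lower-triangular Cholesky factor of the action covariance, so that the stddev of each action dimension is the Euclidean norm of the corresponding row. The function name `rescale_scale_tril` and the parameter `desired_avg_stddev` are hypothetical, and plain Python lists stand in for tensors.

```python
import math


def rescale_scale_tril(scale_tril, desired_avg_stddev):
    """Rescale a lower-triangular factor so the mean per-dimension stddev hits a target.

    Assumption: covariance = L @ L.T, hence stddev_i = ||row_i(L)||.
    Both names are hypothetical and not taken from raylab.
    """
    # Per-dimension stddev is the Euclidean norm of each row of L.
    stddevs = [math.sqrt(sum(x * x for x in row)) for row in scale_tril]
    average_stddev = sum(stddevs) / len(stddevs)
    # Scaling L by a constant scales every stddev by the same constant.
    factor = desired_avg_stddev / average_stddev
    return [[x * factor for x in row] for row in scale_tril]


if __name__ == "__main__":
    L = [[3.0, 0.0], [4.0, 3.0]]  # row norms are 3 and 5, so the average stddev is 4
    print(rescale_scale_tril(L, desired_avg_stddev=2.0))
```

Because the rescaling is a single scalar multiple, the correlation structure between action dimensions is unchanged; only the overall exploration magnitude moves.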

Here are 29 public repositories matching this topic...

ray-project / ray

[RFC] More programmable API that has same output as `ray memory` command

[Core] Add more user-friendly error message upon `async def` remote task

Unify linting of clang-format and *.proto files

Draichi / T-1000

utiasDSL / gym-pybullet-drones

druce / rl

ChuaCheowHuan / gym-continuousDoubleAuction

angelolovatto / raylab

Scale tril by desired average action stddev

goshaQ / adaptive-tls

JacopoPan / a-minimalist-guide

dcos-labs / dcos-jupyterlab-service

HumanCompatibleAI / better-adversarial-defenses

DerwenAI / rllib_tutorials

Senmumu / ray_project_doc

toanngosy / robustprosthetics

DerwenAI / gym_example

ChuaCheowHuan / PBT_MARL_watered_down

wullli / flatlander

nicofirst1 / rl_werewolf

mynkpl1998 / upgraded-octo-lamp

thiagopbueno / model-aware-policy-optimization

Add Gym env for Navigation domain with bimodal dynamics distribution

Create Value function class

xdralex / pioneer

CN-UPB / DeepCoMP

hybug / RL_Lab

jthelin / HelloRayActors

ChuaCheowHuan / sagemaker_Ray_RLlib_custom_env

3neutronstar / flow-autonomous-driving

rlew631 / AutonomousVehicleSimulation

EyaRhouma / curriculum_learning_rllib

ndalton12 / AMPED

lucien1011 / reinforcement-learning-playground
