reinforcement-learning
Here are 4,677 public repositories matching this topic...
Description
I am wondering when "Assessing the Factual Accuracy of Generated Text" in https://github.com/tensorflow/tensor2tensor/tree/master/tensor2tensor/data_generators/wikifact will be publicly available, since it has already been 6 months. @bengoodrich
Updated May 22, 2020 - Python
This is an awesome library, thanks @ddbourgin!!
Users might not know the best way to install this package and try it out. (I didn't, so I eventually just copied the source files.)
Neither the README nor Read the Docs has install instructions.
I couldn't find it on PyPI or Anaconda, and there doesn't appear to be a pyproject.toml, setup.cfg, setup.py, or conda recipe.
Moreover, the t
In the Ray Design Agents documentation, the Ray Perception parameter Ray Layer Mask is not mentioned. I am a bit confused about what it does and whether it interacts with the Detectable Tags parameter.
, the reference error is the following:
error = kp * (pos_des - pos) + kd *
I understand that these two Python files show two different methods to construct a model. The original n_epoch is 500, which works perfectly for both files. But if I change n_epoch to 20, only tutorial_mnist_mlp_static.py achieves a high test accuracy (~0.97). The other file, tutorial_mnist_mlp_static_2.py, only gets 0.47.
The models built from these two files look the same to me (the s
Updated May 22, 2020 - Python
Updated May 23, 2020
Updated May 22, 2020 - Python
Updated May 27, 2020 - Python
I tried some RNN regression learning based on the code in the "PyTorch-Tutorial/tutorial-contents/403_RNN_regressor.py" file, which did not work for me at all.
According to an accepted answer on Stack Overflow (https://stackoverflow.com/questions/52857213/recurrent-network-rnn-wont-learn-a-very-simple-function-plots-shown-in-the-q?noredirect=1#comment92916825_52857213), it turns out that the li
Updated May 5, 2020 - Python
Updated Dec 14, 2019 - Jupyter Notebook
Description
Trax is a library for deep learning that focuses on sequence models and reinforcement learning. It combines performance with code clarity and maintained documentation and tests.
...
Sorry to bother, I'll be brief. I don't think the "maintained documentation" part of the statement is true (yet?). I like the work and I respect every project that goes deep down on neural network
Updated Mar 18, 2020 - JavaScript
Jupyter containers hosted by Coursera cause a lot of trouble. Perhaps more than they are worth.
- They are very limited in terms of lifetime and CPU. The docs say 90 minutes / 0.5-2 CPUs. That's definitely insufficient to train Breakout, for example.
- Updating them is inconvenient. We don't h
Lately I have been running into too many SageMaker issues. Is there any unambiguous documentation on SageMaker instances? I could glean the following from different sources:
- SageMaker instances, SageMaker being a managed service, have nothing to do with EC2 instances.
- Unlike the EC2 console, the SageMaker console has no option to view or increase limits. One has to go directly to the support page a
The OpenAI Gym installation instructions are missing a reference to the "Build Tools for Visual Studio 2019" from the following site.
https://visualstudio.microsoft.com/downloads/
I also found this by reading the following article.
https://towardsdatascience.com/how-to-install-openai-gym-in-a-windows-environment-338969e24d30
Even though this is an issue in OpenAI Gym, a note in this RE
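In a nutshell
Proposes a model pre-trained on both natural language and program code. As in translation, the natural language and the program code are separated by a separator token during training. In addition to the masking objective of BERT (#959), it uses the replaced token detection objective of ELECTRA (#1539). Its effectiveness is confirmed on natural-language code search and on missing-word inference (multiple choice).
Paper link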
https://arxiv.org/abs/2002.08155
Authors / Affiliations
Zhangyin Feng, Daya Guo, Duyu Tang, Nan Duan, Xiaocheng Feng, Ming Gong, Linjun Shou, Bing Qin, Ting Liu, Daxin Jiang, Ming Zhou
- Microsoft Research Asia
- Ha
In the updateEdgeStats function, the reward is updated by edge.reward += reward, which is consistent with the formula in the paper "Mastering the game of Go without human knowledge".
But many other popular unofficial implementations add v to update the edge reward when the current node belongs to the current player, b
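The two backup conventions being compared can be sketched roughly as follows. This is a minimal illustration, not the code of any repository mentioned here: the `Edge` class, the `path` representation, and both function names are hypothetical, and the sign flip for the opposing player in `backup_alternating` is an assumption about what the alternative implementations do, since the snippet above is cut off.

```python
from dataclasses import dataclass


@dataclass
class Edge:
    """Hypothetical per-edge statistics for a search tree."""
    visits: int = 0
    reward: float = 0.0


def backup_same_sign(path, value):
    """Add the same value to every edge on the path (edge.reward += reward)."""
    for edge, _player in path:
        edge.visits += 1
        edge.reward += value


def backup_alternating(path, value, leaf_player):
    """Add +value on edges owned by leaf_player.

    Subtracting the value on the other player's edges is an assumption
    made for this sketch; the source text is truncated at that point.
    """
    for edge, player in path:
        edge.visits += 1
        edge.reward += value if player == leaf_player else -value


# Each path entry is (edge, player to move at the edge's parent node).
same = [(Edge(), 0), (Edge(), 1), (Edge(), 0)]
backup_same_sign(same, 1.0)
print([e.reward for e, _ in same])  # [1.0, 1.0, 1.0]

alt = [(Edge(), 0), (Edge(), 1), (Edge(), 0)]
backup_alternating(alt, 1.0, leaf_player=0)
print([e.reward for e, _ in alt])  # [1.0, -1.0, 1.0]
```

Whether the two conventions end up equivalent depends on how the value is defined (from whose perspective) and on how the selection step reads the stored statistics, so the sketch only shows where the stored numbers differ.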
Documentation
Updated Apr 21, 2020 - Python
Updated Apr 21, 2020 - Jupyter Notebook
Can you describe what modifications are needed if I want to replace dynamic_rnn with tf.keras.RNN in the many-to-one example, since dynamic_rnn is now deprecated?
How can I use Watcher / WatcherClient over a TCP/IP network?
Watcher seems to be a ZMQ server and WatcherClient a ZMQ client, but there is no API/interface to configure the server IP address.
Do I need to implement a class that inherits from WatcherClient?
Updated May 25, 2020 - Python