reinforcement-learning
Here are 4,234 public repositories matching this topic...
Description
I trained a transformer model for English-to-French translation. It works well when I give it a single sentence to translate. However, when I give it a whole document (or simply a paragraph with many sentences), it returns a very bad translation and sometimes skips sentences.
Has anyone encountered this kind of problem?
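A workaround often tried for models trained on single sentences is to split the document into sentences and translate them one at a time. The sketch below is only an illustration of that idea: `translate_sentence` is a hypothetical callable standing in for the trained model (the issue does not show its API), and the regex splitter is deliberately naive.

```python
import re
from typing import Callable, List


def split_sentences(text: str) -> List[str]:
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def translate_document(text: str, translate_sentence: Callable[[str], str]) -> str:
    """Translate one sentence at a time and join the results with spaces."""
    return " ".join(translate_sentence(s) for s in split_sentences(text))


if __name__ == "__main__":
    # Stand-in for the real model (assumption for the demo): upper-case the input.
    fake_model = str.upper
    print(translate_document("Hello there. How are you? Fine!", fake_model))
```

Translating sentence by sentence keeps each input close to what the model saw during training, at the cost of losing cross-sentence context.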
Updated Mar 4, 2020 - Python
Updated Mar 4, 2020 - Python
Updated Mar 4, 2020 - Jupyter Notebook
The documentation of the Yaml config parameters is missing an explanatory entry for the optional learning_rate_schedule property, described more here.
(Sorry, not a bug, but didn't find a b
There are too many courses on the list; a rating for each course would help many people.
We can use GitHub issues to rate each course.
Vcpkg is a C++ dependency management system that makes installing and consuming a library as a dependency very easy. We should support it for VW to make consuming the library as easy as possible.
Instructions for creating a new package can be found here: https://github.com/microsoft/vcpkg/blob/master/docs/examples/packaging-github-repos.md
Updated Mar 4, 2020 - Python
The image on page 13 of the user manual (at least versions 2.82 and 2.83) suggests using btGimpactTriangleMeshShape to handle collisions for moving objects that cannot be approximated as a single primitive, a convex hull of a triangle mesh, or a compound shape:
I understand that these two Python files show two different methods of constructing a model. The original n_epoch is 500, which works perfectly for both files. But if I change n_epoch to 20, only tutorial_mnist_mlp_static.py achieves a high test accuracy (~0.97). The other file, tutorial_mnist_mlp_static_2.py, only gets 0.47.
The models built from these two files look the same to me (the s
Updated Mar 3, 2020
It was amazing to see detectron2; it is like the best of PyTorch and TensorFlow. Thank you for the great library.
According to @wat3rBro in facebookresearch/detectron2#12 (comment), mobile-friendly models are coming soon.
Creating this issue as a placeholder to support
def _discount_and_norm_rewards(self):
    # discount episode rewards
    discounted_ep_rs = np.zeros_like(self.ep_rs)
    running_add = 0
    for t in reversed(range(0, len(self.ep_rs))):
        running_add = running_add * self.gamma + self.ep_rs[t]
        discounted_ep_rs[t] = running_add
    # normalize episode rewards
    discounted_ep_rs -=
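The snippet above is cut off at the normalization step. As a standalone illustration, the sketch below repeats the same backward discounting loop and then assumes the common choice of standardizing the returns to zero mean and unit variance; that normalization is an assumption, not something the truncated snippet confirms. The function names and the default `gamma` are hypothetical, and only the standard library is used.

```python
from statistics import mean, pstdev
from typing import List, Sequence


def discount_rewards(rewards: Sequence[float], gamma: float) -> List[float]:
    """Discounted return at each step: G_t = r_t + gamma * G_{t+1}."""
    discounted = [0.0] * len(rewards)
    running_add = 0.0
    for t in reversed(range(len(rewards))):
        running_add = running_add * gamma + rewards[t]
        discounted[t] = running_add
    return discounted


def discount_and_normalize(rewards: Sequence[float], gamma: float = 0.99) -> List[float]:
    """Discount, then standardize (assumed normalization: zero mean, unit variance)."""
    discounted = discount_rewards(rewards, gamma)
    if not discounted:
        return []
    mu = mean(discounted)
    sigma = pstdev(discounted)  # population standard deviation
    if sigma == 0.0:
        # All returns are identical, so there is nothing to scale.
        return [0.0 for _ in discounted]
    return [(g - mu) / sigma for g in discounted]


if __name__ == "__main__":
    print(discount_rewards([1.0, 1.0, 1.0], gamma=0.5))  # [1.75, 1.5, 1.0]
```

Guarding against a zero standard deviation avoids a division by zero when every step in the episode has the same return.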
Updated Mar 4, 2020 - Python
I tried some RNN regression learning based on the code in the "PyTorch-Tutorial/tutorial-contents/403_RNN_regressor.py" file, which did not work for me at all.
According to an accepted answer on Stack Overflow (https://stackoverflow.com/questions/52857213/recurrent-network-rnn-wont-learn-a-very-simple-function-plots-shown-in-the-q?noredirect=1#comment92916825_52857213), it turns out that the li
typo in v09
v09 chapter about HMMs refers to "gait" and "gate."
I think it should just be "gait?"
Identifying people based on their gait is a pretty cool idea, but first we need a model to
recognize the gate. Consider a HMM where the sequence of hidden states for a gate are
s/gate/gait/g
Using the data points (0,3), (1,4), (2,5) illustrates that "c" (the y-intercept) is locked at zero. Flip the sign and the resulting line tracks the points.
https://github.com/lazyprogrammer/machine_learning_examples/blob/master/best_fit_line.py
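The three points from the issue lie exactly on y = x + 3, so a fit that cannot move its intercept away from zero cannot track them. The sketch below uses the textbook closed-form least-squares formulas to compare the two cases; it is not the code from the linked file, and the function names are hypothetical.

```python
from typing import Sequence, Tuple


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least squares for y = a*x + c; returns (a, c)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    c = mean_y - a * mean_x
    return a, c


def fit_line_through_origin(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Least squares for y = a*x with the intercept forced to zero; returns a."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)


if __name__ == "__main__":
    xs, ys = [0, 1, 2], [3, 4, 5]
    print(fit_line(xs, ys))                 # (1.0, 3.0)
    print(fit_line_through_origin(xs, ys))  # 2.8
```

With a free intercept the fit recovers slope 1 and intercept 3; with the intercept pinned at zero the best slope is 2.8, which misses all three points.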
Updated Mar 1, 2020 - JavaScript
We have instructions for setting up local Docker at https://github.com/yandexdataschool/Practical_RL/tree/master/docker. However, they are unclear, as reported in the following threads:
- https://www.coursera.org/learn/practical-rl/discussions/all/threads/E6IkT54xEemB7BKA79O1vg
- https://www.coursera.org/learn/practical-rl/discussions/all/threads/urpCnVhlEeiIjg6nmV99lg
Need to review proble
Since Trax is a successor of tensor2tensor (according to the release notes of tensor2tensor v1.15.0), it would be helpful if you could provide examples for more advanced machine learning tasks. An outstanding feature of tensor2tensor is its numerous (and useful) examples, which Trax currently lacks. Such examples would especi
The notebook seems to use a pre-trained model from https://github.com/awslabs/amazon-sagemaker-examples/blob/master/introduction_to_applying_machine_learning/xgboost_customer_churn/xgboost_customer_churn.ipynb. The notebook should refer to the data schema from the above example when discussing generated traffic and suggested constraints.
Cell Deploy the model to Amazon SageMaker. THIS REQU
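In a nutshell
A study on correcting image distortion. Instead of processing the whole image, it crops out patches together with their surrounding regions, stitches their gradients together to build the gradient flow that would be obtained for the full image, and generates the image by sampling from that flow.
Paper link
https://arxiv.org/abs/1909.09470
Authors / Affiliations
Xiaoyu Li, Bo Zhang, Jing Liao, Pedro V. Sander
- Hong Kong UST
- Microsoft Research Asia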
- City Unive
Updated Mar 3, 2020 - Python
Documentation
Could you please add proper documentation, at least for src_cpp/elf?
I am not asking for documentation for src_cpp/elfgames.
The README.md doesn't indicate how to contribute. I made a local branch to fix issue #34 but when I try to push the branch I get the following error, which indicates I don't have permission.
Please indicate in the README how to contribute.
Thanks.
Updated Mar 3, 2020 - Jupyter Notebook
Can you describe what modifications are needed to replace dynamic_rnn with tf.keras.RNN in the many-to-one example, since dynamic_rnn is now deprecated?
How can Watcher / WatcherClient be used over a TCP/IP network?
Watcher seems to be a ZMQ server and WatcherClient a ZMQ client, but there is no API or interface to configure the server IP address.
Do I need to implement a class that inherits from WatcherClient?
Updated Mar 4, 2020 - Python


One possibility would be the example from numba/numba#4256 (comment) to avoid the regression described in that issue.