# mcts

Here are 143 public repositories matching this topic...
evg-tyurin commented on Feb 10, 2019:

During the self-play phase we usually collect different examples for the same board states. Should we preprocess such examples before optimizing the NNet? In the current implementation we don't preprocess them, so we train the NNet while expecting different outputs for the same input values. I think this may be wrong.
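One possible preprocessing step is to merge duplicate states before training. A minimal sketch, assuming (as in alpha-zero-general-style implementations) that each example is a (board, pi, v) tuple; the function name and tuple layout are illustrative, not taken from the project in question:

```python
from collections import defaultdict

import numpy as np

def average_duplicate_examples(examples):
    """Merge self-play examples that share the same board state by
    averaging their policy and value targets, so the network is not
    trained toward conflicting outputs for one input.

    `examples` is a list of (board, pi, v) tuples where `board` is a
    numpy array, `pi` a policy vector, and `v` a scalar outcome.
    """
    buckets = defaultdict(list)
    for board, pi, v in examples:
        # raw bytes of the board array serve as a hashable key
        buckets[board.tobytes()].append((board, pi, v))

    merged = []
    for group in buckets.values():
        board = group[0][0]
        pi = np.mean([g[1] for g in group], axis=0)   # averaged policy target
        v = float(np.mean([g[2] for g in group]))     # averaged value target
        merged.append((board, pi, v))
    return merged
```

Averaging keeps the targets consistent with the empirical distribution of outcomes, which is equivalent in expectation to training on the raw duplicates but avoids noisy conflicting gradients within a batch.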
MCTS project for Tetris
Updated Jun 5, 2020 - Python
A student implementation of Alpha Go Zero
Updated Aug 1, 2018 - Python
Reinforcement learning models in ViZDoom environment
Tags: agent, learning, reinforcement-learning, pytorch, doom, behavior-tree, mcts, vizdoom, reinforcement, ppo, doomnet-track1
Updated Dec 29, 2019 - Python
A pytorch tutorial for DRL(Deep Reinforcement Learning)
Tags: deep-reinforcement-learning, pytorch, dqn, mcts, uct, c51, iqn, hedge, ppo, a2c, gail, counterfactual-regret-minimization, qr-dqn, random-network-distillation, soft-actor-critic, self-imitation-learning
Updated May 1, 2019 - Jupyter Notebook
An asynchronous/parallel implementation of the AlphaGo Zero algorithm applied to Gomoku
Tags: tree, algorithm, board, tensorflow, gpu, paper, parallel, deep-reinforcement-learning, mcts, gomoku, noise, tree-search, tensorlayer, alphago, mpi4py, dirichlet-distribution, alphazero, alphazero-gomoku, board-model, playout-times, add-noises, junxiaosong
Updated Jan 20, 2020 - Python
A Deep Learning UCI-Chess Variant Engine written in C++ & Python 🐦
Tags: python, open-source, machine-learning, chess-engine, deep-learning, mxnet, artificial-intelligence, mcts, gluon, lichess, convolutional-neural-network, alphago, python-chess, alphazero, crazyhouse
Updated May 28, 2020 - Jupyter Notebook
Allie: A UCI compliant chess engine
Updated Jun 15, 2020 - C++
Personal notes about scientific and research works on "Decision-Making for Autonomous Driving"
Tags: reinforcement-learning, bibliography, end-to-end, decision-making, prediction, planning, intention, mdp, mcts, game-theory, behavioral-cloning, interaction, risk-assessment, imitation-learning, inverse-reinforcement-learning, pomdp, decision-making-under-uncertainty, carla, model-based-reinforcement-learning, belief-planning
Updated Jun 15, 2020
Reinforcing Your Learning of Reinforcement Learning
Tags: reinforcement-learning, tic-tac-toe, space-invaders, q-learning, doom, dqn, mcts, policy-gradient, cartpole, gomoku, ddpg, atari-2600, alphago, frozenlake, ppo, advantage-actor-critic, alphago-zero
Updated Jul 14, 2019 - Python
AlphaZero implementation for Othello, Connect-Four and Tic-Tac-Toe based on "Mastering the game of Go without human knowledge" and "Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm" by DeepMind.
Tags: game, machine-learning, reinforcement-learning, deep-learning, tensorflow, tic-tac-toe, connect-four, reversi, mcts, othello, tictactoe, resnet, deepmind, connect4, alphago-zero, alpha-zero, alphazero, self-play
Updated Apr 14, 2018 - Python
Tags: docker, kubernetes, board-game, flash, deep-neural-networks, ai, microservice, game-engine, wiki, actionscript, deep-reinforcement-learning, cnn, dnn, mcts, finite-state-machine, deeplearning, starling, fuzzy-logic-control, alphago, policytree
Updated Nov 22, 2019 - HTML
UCThello - a board game demonstrator (Othello variant) with computer AI using Monte Carlo Tree Search (MCTS) with UCB (Upper Confidence Bounds) applied to trees (UCT in short)
Tags: game, board-game, mobile, ai, simulation, mobile-app, artificial-intelligence, mcts, othello, mobile-game, entertainment, ucb, uct, monte-carlo-tree-search, ai-players, upper-confidence-bounds, abstract-game, perfect-information, 2-player-strategy-game
Updated Mar 30, 2018 - JavaScript
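The UCB-applied-to-trees selection rule that the UCThello description refers to can be sketched as follows; function and field names here are illustrative, not taken from UCThello's source:

```python
import math

def uct_select(children, exploration=math.sqrt(2)):
    """Pick the child maximizing the UCB1 score used by UCT:
    mean reward + c * sqrt(ln(parent visits) / child visits).

    `children` is a list of dicts with 'visits' and 'wins' counters;
    unvisited children are always tried first.
    """
    parent_visits = sum(c['visits'] for c in children)
    best, best_score = None, float('-inf')
    for child in children:
        if child['visits'] == 0:
            return child  # expand unvisited children before exploiting
        score = (child['wins'] / child['visits']
                 + exploration * math.sqrt(math.log(parent_visits) / child['visits']))
        if score > best_score:
            best, best_score = child, score
    return best
```

The exploration constant trades off revisiting strong moves against sampling rarely tried ones; sqrt(2) is the textbook UCB1 value, but game engines commonly tune it.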
A deep learning course project from UCAS
Updated Jan 8, 2019 - Python
Implementation of DeepMind's AlphaZero algorithm with Caffe and C++
Updated Apr 14, 2018 - C++
Python implementations of Gomoku AIs, including MCTS, Minimax, and a genetic algorithm
Updated Dec 14, 2018 - Python
9x9 AlphaGo
Updated Jul 27, 2016 - Python
CrazyAra - A Deep Learning UCI-Chess Variant Engine written in C++ 🐦
Tags: open-source, machine-learning, chess-engine, deep-learning, mxnet, cpp, artificial-intelligence, mcts, lichess, convolutional-neural-network, alphago, alphazero, crazyhouse, chess-variants
Updated Oct 8, 2019 - C++
Tic-Tac-Toe with the AlphaZero method - my first project
Updated Aug 23, 2018 - Python
Thompson Sampling based Monte Carlo Tree Search for MDPs and POMDPs
Updated Jun 20, 2016 - C++
Quoridor AI based on Monte Carlo tree search
Updated Apr 24, 2020 - JavaScript
HybridAlpha - a mix between AlphaGo Zero and AlphaZero for multiple games
Tags: python, machine-learning, deep-learning, tensorflow, keras, deep-reinforcement-learning, pytorch, extensible, mcts, neural-networks, othello, tictactoe, resnet, flexibility, alpha-beta-pruning, greedy-algorithms, gobang, connect4, alphago-zero, alpha-zero
Updated May 23, 2020 - Python
Visualisation of MCTS in Unity with C# for different games, being created for my third year university project at the University of York
Tags: visualization, game, ai, university, csharp, unity, tic-tac-toe, visualisation, mcts, othello, dissertation, connect4, mcts-visualisation
Updated Jun 12, 2018 - C#
Implementation of the AlphaGo Zero paper in a single C++ header file without any dependencies
Tags: machine-learning, deep-neural-networks, reinforcement-learning, deep-learning, cpp, mcts, convolutional-neural-networks, alphago, mnist-nn, alphago-zero, mcts-implementations, self-play
Updated Apr 18, 2018 - C++
First of all, thanks for sharing the program.
During training on 6x6 four-in-a-row, did you tune the learning_rate or any other parameters?
The program's defaults are c_puct=5, temperature t=1, learning rate 0.002, batch_size 512, maximum deque length 10000, kl_targ=0.02, epochs=5.
Using those preset parameters to train 6x6 four-in-a-row with TensorFlow, the loss drops to around 2 and then stops decreasing; adjusting the learning rate didn't help either. I'd appreciate some guidance, thanks.
Also, I don't understand the meaning of the explain_var_old reference value.
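On the last question: in AlphaZero-style training scripts, "explained variance" typically measures how well the value head predicts the observed game outcomes, and a name like explain_var_old presumably denotes this metric computed with the network weights from before the update (versus an "_new" value after it). A minimal sketch of the metric itself:

```python
import numpy as np

def explained_variance(predicted, actual):
    """Explained variance: 1 - Var(actual - predicted) / Var(actual).

    A value near 1 means the value head tracks the game outcomes well;
    near 0 means it does no better than predicting the mean outcome;
    negative means it is actively worse than that baseline.
    """
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    var_actual = np.var(actual)
    if var_actual == 0:
        return 0.0  # degenerate case: all outcomes identical
    return 1.0 - np.var(actual - predicted) / var_actual
```

Watching this number rise over training is a quick sanity check that the value head is learning something beyond the average result, independent of the total loss.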