Resource Description
The original paper introducing AlphaGo Zero, published in Nature.
A long-standing goal of artificial intelligence is an algorithm that learns, tabula rasa, superhuman proficiency in
challenging domains. Recently, AlphaGo became the first program to defeat a world champion in the game of Go. The
tree search in AlphaGo evaluated positions and selected moves using deep neural networks. These neural networks were
trained by supervised learning from human expert moves, and by reinforcement learning from self-play. Here we introduce
an algorithm based solely on reinforcement learning, without human data, guidance or domain knowledge beyond game
rules. AlphaGo becomes its own teacher: a neural network is trained to predict AlphaGo’s own move selections and also
the winner of AlphaGo’s games. This neural network improves the strength of the tree search, resulting in higher quality
move selection and stronger self-play in the next iteration. Starting tabula rasa, our new program AlphaGo Zero achieved
superhuman performance, winning 100–0 against the previously published, champion-defeating AlphaGo.
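
The abstract says the network is trained to predict both the program's own move selections and the game winner. A minimal sketch of how such a combined training objective could look for a single position is given below, assuming a squared error on the predicted winner plus a cross-entropy between the network's move probabilities and the search-derived move probabilities. The function name, the toy numbers, and the `eps` guard are illustrative assumptions, not taken from this page, and any regularization term is omitted.

```python
import math

def alphago_zero_style_loss(policy_probs, search_probs, value_pred, outcome):
    """Combined loss for one position (illustrative sketch).

    policy_probs: network's probability for each move (hypothetical input)
    search_probs: target probabilities derived from tree search
    value_pred:   network's predicted outcome in [-1, 1]
    outcome:      actual game result from the current player's view (-1 or +1)
    """
    eps = 1e-12  # assumption: small constant to avoid log(0)
    # Value term: how far the predicted winner is from the real result.
    value_loss = (outcome - value_pred) ** 2
    # Policy term: cross-entropy between search targets and network output.
    policy_loss = -sum(
        pi * math.log(p + eps) for pi, p in zip(search_probs, policy_probs)
    )
    return value_loss + policy_loss

# Hypothetical toy position with three legal moves.
policy_probs = [0.2, 0.5, 0.3]
search_probs = [0.1, 0.8, 0.1]
loss = alphago_zero_style_loss(policy_probs, search_probs, value_pred=0.4, outcome=1.0)
print(round(loss, 4))
```

The loss shrinks as the network's move probabilities approach the search's choices and as its value estimate approaches the actual result, which is the sense in which the program acts as its own teacher.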