세상의모든계산기 2017.10.20 09:04

Here are the passages from the paper that relate to rollouts.

  • Finally, it uses a simpler tree search that relies upon this single neural network to evaluate positions and sample moves, without performing any Monte-Carlo rollouts.
  • In each position s_t, a Monte-Carlo tree search (MCTS) α_θ is executed (see Figure 2) using the latest neural network f_θ. Moves are selected according to the search probabilities computed by the MCTS, a_t ∼ π_t.
  • Figure 2: Monte-Carlo tree search in AlphaGo Zero.
  • Monte-Carlo tree search (MCTS) may also be viewed as a form of self-play reinforcement learning.
  • MCTS programs have previously achieved strong amateur level in Go, but used substantial domain expertise: a fast rollout policy, based on handcrafted features, that evaluates positions by running simulations until the end of the game; and a tree policy, also based on handcrafted features, that selects moves within the search tree.
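
To make the contrast in these quotes concrete, here is a minimal Python sketch, not the paper's implementation. It uses a toy Nim game (take 1 to 3 stones, whoever takes the last stone wins) instead of Go. `rollout_value` evaluates a position the older way, by playing random moves until the game ends; `network_value` is a hypothetical stand-in for the value output of f_θ that returns an evaluation in a single call with no playout. `sample_move` illustrates a_t ∼ π_t, assuming the search probabilities are proportional to visit counts raised to 1/temperature; the quoted passages do not give that formula, so treat it and all function names as assumptions for illustration.

```python
import random


def rollout_value(stones, rng):
    """Evaluate a toy Nim position with one random playout to the end.

    Returns +1 if the player to move in the starting position wins the
    playout, -1 otherwise. This mimics rollout-style evaluation.
    """
    root_to_move = True
    while stones > 0:
        stones -= rng.randint(1, min(3, stones))
        if stones == 0:
            # Whoever just took the last stone wins.
            return 1 if root_to_move else -1
        root_to_move = not root_to_move
    # No stones at the start: the opponent already took the last one.
    return -1


def network_value(stones):
    """Hypothetical stand-in for a learned value function.

    A real network would be trained; here we hard-code the known Nim
    rule (multiples of 4 are lost for the player to move) so that the
    evaluation needs no simulation at all.
    """
    return -1.0 if stones % 4 == 0 else 1.0


def sample_move(visit_counts, rng, temperature=1.0):
    """Sample a move with probability proportional to N(a)^(1/temperature).

    visit_counts maps each move to its (assumed) MCTS visit count.
    """
    moves = list(visit_counts)
    weights = [visit_counts[m] ** (1.0 / temperature) for m in moves]
    return rng.choices(moves, weights=weights, k=1)[0]


if __name__ == "__main__":
    rng = random.Random(0)
    # Rollout evaluation is noisy: average many playouts to get an estimate.
    estimate = sum(rollout_value(5, rng) for _ in range(1000)) / 1000
    print("rollout estimate for 5 stones:", estimate)
    print("direct evaluation for 5 stones:", network_value(5))
    print("sampled move:", sample_move({1: 80, 2: 15, 3: 5}, rng))
```

The rollout estimate needs many simulated games and is still only an average over random play, while the direct evaluation is one function call. That is the trade the first quote describes when it says the search runs "without performing any Monte-Carlo rollouts."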