User:IssaRice/Monte Carlo tree search
(from a long time ago)
what i don't understand about monte carlo tree search: for the final action selection, at least two sources (https://www.youtube.com/watch?v=Fbs4lnGLS8M around 11:00, and the wikipedia page: "Then the move with the most simulations made (i.e. the highest denominator) is chosen as the final answer.") say that the choice is based just on the denominator (the number of simulations through that move), rather than on the overall fraction wins/simulations. so e.g. 4/7 wins over 1/1 because 7 > 1, even though 4/7 < 1/1. this is counterintuitive to me. (during the selection phase of the search itself, everyone agrees that the fraction matters: the usual UCT rule picks a child by its fraction plus an exploration bonus.)
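
to make the two candidate rules concrete, here is a minimal sketch in python. the names `ChildStats`, `most_visited` and `highest_win_rate` are made up for illustration and don't come from either source; it also assumes every child has been visited at least once, so the division is safe.

```python
from dataclasses import dataclass


@dataclass
class ChildStats:
    """Hypothetical per-move statistics at the root: wins / visits."""
    move: str
    wins: int
    visits: int


def most_visited(children):
    # Rule described by the sources: pick the highest denominator.
    return max(children, key=lambda c: c.visits)


def highest_win_rate(children):
    # The "intuitive" alternative: pick the highest fraction.
    # Assumes visits > 0 for every child.
    return max(children, key=lambda c: c.wins / c.visits)


# The example from the text: 4/7 versus 1/1.
children = [ChildStats("a", 4, 7), ChildStats("b", 1, 1)]

print(most_visited(children).move)      # a  (7 > 1)
print(highest_win_rate(children).move)  # b  (1/1 > 4/7)
```

so the two rules really do disagree on this example: the denominator rule picks the 4/7 move and the fraction rule picks the 1/1 move.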