mrmendrala
30.11.2019
Mathematics

Develop a strategy to maximize your average reward per move (equivalent to maximizing total reward over n moves). Express this as a function of k, using Θ-notation. In other words, your maximization doesn't have to be entirely precise; you may assume that k is any convenient number that makes the math easier for your strategy, but you cannot assume that k = O(1). Notice that any strategy you come up with provides a lower bound on the optimal reward: the better the strategy, the better (higher) the lower bound. It's trivial to get Ω(1) per move, so you must get ω(1).
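
For reference, the requirement in the final sentence turns on the difference between a constant lower bound, Ω(1), and a lower bound that grows without limit, ω(1). A sketch of the standard definitions follows, writing f(k) for a strategy's average reward per move. The names f and g are labels introduced here for illustration and are not part of the original question; f is assumed to be nonnegative.

```latex
\begin{align*}
f(k) = \Omega(1) &\iff \exists\, c > 0,\ \exists\, k_0 \ \text{such that}\ f(k) \ge c \ \text{for all } k \ge k_0, \\
f(k) = \omega(1) &\iff \forall\, c > 0,\ \exists\, k_0 \ \text{such that}\ f(k) \ge c \ \text{for all } k \ge k_0
  \quad (\text{i.e. } f(k) \to \infty), \\
f(k) = \Theta(g(k)) &\iff \exists\, c_1, c_2 > 0,\ \exists\, k_0 \ \text{such that}\ c_1\, g(k) \le f(k) \le c_2\, g(k) \ \text{for all } k \ge k_0.
\end{align*}
```

So a strategy whose average reward stays at some fixed constant no matter how large k gets satisfies Ω(1) but not ω(1); the answer has to be a reward that increases with k.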
