Here is a slightly more complicated version of the game we discussed in class. Suppose there are four questions Q1, Q2, Q3, and Q4, which are associated with rewards of $100, $1,000, $10,000, and $50,000, respectively. There is a challenger, denoted by CHI. The rules are as follows:
(a) CHI is initially at state Q1 and has $0 at hand;
(b) When presented with a question, CHI has two choices: quit or accept. If she quits, CHI takes all the money she has earned so far and the game is over. If she accepts and passes the challenge, she is presented with the next question; if she accepts but fails, she gets $0 and the game is over;
(c) The game is over if CHI passes the last question Q4, and in that case CHI earns all the rewards over the four questions.
Assume that CHI knows in advance that she will pass Q1, Q2, Q3, and Q4 with respective probabilities 3/4, 1/2, 1/2, and 1/4.
1. Consider the simple policy π that always accepts the challenge. Please compute the value function V^π. Here you should explicitly state the values V^π(s) for the four states s = Q1, Q2, Q3, and Q4.
2. Please compute an optimal policy π* and the value function V*. Again, please explicitly state the values V*(s) for the four states s = Q1, Q2, Q3, and Q4.
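
The two parts above can be worked through with backward induction. The following is a minimal sketch, not an official solution, and it rests on one reading of the rules: rewards accumulate, so quitting at question Qi pays the sum of the rewards for the questions already passed; failing pays $0; and passing Q4 pays the full $61,100. All names (`rewards`, `pass_prob`, `bank_before`, `v_pi`, `v_star`, `policy`) are hypothetical and chosen only for illustration.

```python
from fractions import Fraction

# Problem data as stated in the question (index 0 is Q1, index 3 is Q4).
rewards = [100, 1_000, 10_000, 50_000]
pass_prob = [Fraction(3, 4), Fraction(1, 2), Fraction(1, 2), Fraction(1, 4)]
n = len(rewards)


def bank_before(i):
    """Money already earned on arriving at question i (assumed cumulative)."""
    return Fraction(sum(rewards[:i]))


# Part 1: the policy that always accepts.
# V^pi(Qi) = p_i * V^pi(Q(i+1)); after Q4 the payoff is the total of all rewards.
v_pi = [Fraction(0)] * n
next_value = Fraction(sum(rewards))
for i in reversed(range(n)):
    v_pi[i] = pass_prob[i] * next_value
    next_value = v_pi[i]

# Part 2: optimal policy.
# V*(Qi) = max(quit payoff, p_i * V*(Q(i+1))).
v_star = [Fraction(0)] * n
policy = [""] * n
next_value = Fraction(sum(rewards))
for i in reversed(range(n)):
    accept_value = pass_prob[i] * next_value
    quit_value = bank_before(i)
    if accept_value >= quit_value:
        v_star[i], policy[i] = accept_value, "accept"
    else:
        v_star[i], policy[i] = quit_value, "quit"
    next_value = v_star[i]

for i in range(n):
    print(f"Q{i + 1}: V^pi = {float(v_pi[i])}, V* = {float(v_star[i])}, action = {policy[i]}")
```

Under this reading, accepting is at least as good as quitting at every question, so the optimal value function coincides with the always-accept one. A different reading of the payoff rules (for example, a reward paid out immediately and kept after a failure) would change the numbers.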

Answer:

kk042563

20.12.2020

Mathematics

0.6

Explanation:

Use a marker to highlight the columns that span from x = -2 to x = 2, including both endpoints. The highlighted P(x) values are then added up:

0.33+0.16+0.11 = 0.6

There is a 60% chance that a randomly selected x value satisfies $-2 \le x \le 2$.
