mapping that assigns a scalar value (reward) to each state-action pair; \gls{fnReward} induces the sequence of rewards , where is a \gls{rv} with expected value
1 min read
mapping \glsfnReward:\glssetActions×\glssetActions→R that assigns a scalar value (reward) to each state-action pair; \gls{fnReward} induces the sequence of rewards {\glsrvRewardt}t∈N0, where \glsrvRewardt is a \gls{rv} with expected value \glsfnReward(xt,at)