total discounted reward obtained from a given time step onwards, under an \glsxtrshort{mdp} and policy \gls{policy}