The discount factor, denoted by the Greek letter gamma, is a numerical value between zero and one that determines the present value of future rewards in a Markov Decision Process. It functions as a weighting mechanism that dictates how much the agent values immediate rewards versus rewards received further into the future. A discount factor close to zero makes the agent myopic or short-sighted, meaning it prioritizes rewards available in the very next step and largely ignores long-term outcome....
Log in to view the answer