Deep reinforcement learning for dynamic expectile risk measures: An application to equal risk option pricing and hedging

Marzban, Saeed; Delage, Erick; Li, Jonathan Y.

Motivated by the application of equal-risk pricing and hedging of a financial derivative, where two operationally meaningful hedging portfolio policies need to be found that minimize coherent risk measures, we propose in this paper a novel deep reinforcement learning algorithm for solving risk-averse dynamic decision-making problems. Prior to our work, such hedging problems could only be solved either based on static risk measures, leading to time-inconsistent policies, or based on dynamic programming solution schemes that are impracticable in realistic settings. Our work extends for the first time the deep deterministic policy gradient algorithm, an off-policy actor-critic reinforcement learning (ACRL) algorithm, to solving dynamic problems formulated based on time-consistent dynamic expectile risk measures. Our numerical experiments confirm that the new ACRL algorithm produces high-quality solutions to equal-risk pricing and hedging problems and that its hedging strategy outperforms the strategy produced using a static risk measure when the risk is evaluated at later points in time.

Published December 2021, 22 pages

Research axis

Axis 1: Data valuation for decision making

Research application

Economy and finance

Document

G2181.pdf (1.1 MB)

GERAD

G-2021-81
