We study a real-time decision problem defined on a directed graph where a travel cost is associated with each arc and a prize is associated with each node. The nodes are partitioned in mandatory and optional. Along a certain request-time horizon, mandatory nodes will certainly request a visit that must be satisfied; optional nodes will originate a request with a given time-dependent probability, and such a request can be accepted or not. During a following limited travel-time horizon, a server will start from a fixed origin, will visit both mandatory nodes and the optional nodes whose request has been accepted, and will end at a fixed destination. The profit of the server is the difference between the total collected prize and the total travel cost. The problem consists in finding a policy for accepting/rejecting optional requests during the request-time horizon in order to maximize the expected server profit while respecting the travel-time horizon limitation.

We discuss the relevance of the problem and we derive a recursive formula for the expected profit optimization. Given the intractability of the recursion, we design several heuristic policies, based on combination of simple myopic rules, Monte Carlo simulation, and heuristic solution of the static counterpart of the dynamic problem, i.e., the Probabilistic Orienteering Problem. We set up a simulation framework where instances with up to 100 nodes are considered under different probabilistic assumptions and different widths of the travel-time horizon. The results of extensive computational tests are discussed.