Critically ill patients in intensive care units (ICUs) are often in an acutely disturbed state of mind characterized by restlessness, illusions, and nervousness. Such patients, for instance those who are mechanically ventilated, may experience difficulties during treatment procedures such as endotracheal tube intubation and extubation. Apart from critical illness, treatment-induced delirium may cause them to dislodge themselves from life-saving equipment and thus hinder cooperative and safe treatment in the ICU. Hence, it is often recommended to moderately sedate such patients for several days to reduce patient anxiety, facilitate sleep, aid treatment, and thus ensure patient safety. However, most anesthetics affect cardiac and respiratory functions. It is therefore important to monitor and control the infusion of anesthetics to meet sedation requirements while keeping patient vital parameters within safe limits. The critical task of anesthesia administration also requires that drug dosing be optimal, patient specific, and robust.

Towards this end, we propose a reinforcement learning-based approach to develop a closed-loop anesthesia controller that accounts for variations in hemodynamic parameters. The main advantages of the proposed approach are that it does not require a model, it involves optimization, and it is robust to interpatient variability. We formulate the problem of deriving control laws that track a desired trajectory as a sequential decision-making problem represented by a finite Markov decision process (MDP), and then use a reinforcement learning-based approach to solve the MDP for goal-oriented decision making. Specifically, we use reinforcement learning methods such as Q-learning to develop a closed-loop anesthesia controller that uses the bispectral index (BIS) as the control variable while concurrently accounting for the mean arterial pressure (MAP). The proposed method monitors and controls the infusion of anesthetics by minimizing a weighted combination of the errors of the BIS and MAP signals. Accounting for both variables through a single combined error reduces the computational complexity of the reinforcement learning algorithm and, consequently, the controller processing time.
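The idea of folding the BIS and MAP errors into one scalar and learning over it with Q-learning can be sketched as follows. This is a minimal illustration under stated assumptions, not the controller described here: the targets, the weight, the discretization, the reward rule, and all function names are hypothetical, and taking absolute values of the normalized errors is one possible choice among several.

```python
import numpy as np

# All numbers below are illustrative assumptions, not values from this work.
BIS_TARGET = 50.0   # assumed sedation target (BIS)
MAP_TARGET = 80.0   # assumed mean arterial pressure target (mmHg)
W_BIS = 0.8         # assumed weight on the BIS error
N_STATES = 10       # assumed number of discretized error bins
N_ACTIONS = 5       # assumed number of discrete infusion-rate levels


def combined_error(bis, map_, w=W_BIS):
    """Weighted combination of normalized absolute BIS and MAP errors."""
    e_bis = abs(bis - BIS_TARGET) / BIS_TARGET
    e_map = abs(map_ - MAP_TARGET) / MAP_TARGET
    return w * e_bis + (1.0 - w) * e_map


def error_to_state(error, max_error=1.0, n_states=N_STATES):
    """Map the scalar error to one of n_states equal-width bins."""
    clipped = float(np.clip(error, 0.0, max_error))
    return min(int(clipped / max_error * n_states), n_states - 1)


def q_update(q, s, a, reward, s_next, alpha=0.2, gamma=0.7):
    """Standard tabular Q-learning update, applied in place."""
    td_target = reward + gamma * np.max(q[s_next])
    q[s, a] += alpha * (td_target - q[s, a])
    return q


# One hypothetical transition: the error shrinks after choosing action 2.
q = np.zeros((N_STATES, N_ACTIONS))
s = error_to_state(combined_error(70.0, 85.0))
s_next = error_to_state(combined_error(60.0, 82.0))
reward = 1.0 if s_next < s else 0.0  # assumed reward: error bin decreased
q = q_update(q, s, a=2, reward=reward, s_next=s_next)
```

Because the state is a single discretized error rather than a (BIS, MAP) pair, the Q-table has N_STATES rows instead of one row per combination of BIS and MAP bins, which is the source of the complexity reduction mentioned above.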

We present simulation and statistical results for 30 simulated patients. For our simulations, the pharmacokinetic and pharmacodynamic parameter values of the simulated patients are chosen randomly from a predefined range. To quantify the performance of the trained agent in closed-loop anesthesia control, we use the median performance error (MDPE), median absolute performance error (MDAPE), root mean square error (RMSE), and interquartile range (IQ). To further investigate the effect of simultaneous regulation of the BIS and MAP parameters on the sedation level (BIS) of a patient, we also conducted three in silico case studies. In the first case study, a hemodynamic disturbance is considered in which the MAP is altered by d units. This case study treats the effect of other factors on MAP, such as hemorrhage, as an exogenous disturbance. In the second case study, the MAP is set to a constant value irrespective of propofol infusion, which corresponds to patients who remain intubated in the ICU after aortic aneurysm repair or septic patients with respiratory failure. In the third case study, a disturbance due to administration of a synergistic drug such as remifentanil is considered during the administration of propofol. This case study considers the effect of drug interaction on the closed-loop control of hypnotic agent administration.
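The performance measures named above can be computed from a simulated BIS trace roughly as follows. This is a sketch that assumes the commonly used definitions (performance error as the percentage deviation from the target); the text here does not spell out the formulas, and the choice to report the interquartile range of the performance error, as well as the function name `performance_metrics`, are assumptions made for illustration.

```python
import numpy as np


def performance_metrics(measured, target):
    """Compute MDPE, MDAPE, RMSE, and IQ range for one simulated patient.

    measured: sequence of measured values (e.g. BIS samples)
    target:   scalar target value (e.g. desired BIS)
    """
    measured = np.asarray(measured, dtype=float)
    # Performance error in percent of the target (assumed definition).
    pe = 100.0 * (measured - target) / target
    q25, q75 = np.percentile(pe, [25, 75])
    return {
        "MDPE": float(np.median(pe)),            # bias
        "MDAPE": float(np.median(np.abs(pe))),   # inaccuracy
        "RMSE": float(np.sqrt(np.mean((measured - target) ** 2))),
        "IQ": float(q75 - q25),                  # spread of the error
    }


# Hypothetical BIS samples around an assumed target of 50.
metrics = performance_metrics([45.0, 50.0, 55.0], target=50.0)
```

In a study with several simulated patients, such a function would be applied per patient and the resulting values summarized across the cohort.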

