Reinforcement Learning-Based Human-Machine Co-Adaptation via Policy Gradient

Maharmeh, Elias

URI: http://localhost:8080/xmlui/handle/123456789/8829

Date: 2023-02-01

Abstract:

Human-machine co-adaptation (HMCo) is a critical problem in the design of intelligent systems that interact with humans. This thesis proposes a general framework for solving HMCo problems with reinforcement learning, specifically the policy gradient algorithm. The goal is to enable the machine to learn a policy, or strategy, that co-adapts to human behavior. The approach assumes that the human acts rationally, and it learns a policy that co-adapts to dynamic environments and assists the human in performing a specific task. Its effectiveness is demonstrated through case studies covering both direct and indirect shared control. The thesis also highlights challenges and limitations that must be addressed to advance the field: the sensitivity of the algorithm to hyperparameters, the issue of local minima, and the complexity of the optimization process. It further considers the impact of the human factor during training and the need to improve sample efficiency, given the limits on real-world interaction.

This thesis makes three key contributions to HMCo and intelligent systems design. First, it provides a general framework for HMCo problems, based on policy gradient methods, that is applicable to a wide range of environments and tasks. Second, it tests the feasibility and effectiveness of the approach through case studies involving both direct and indirect shared control. Third, it identifies key challenges and limitations that must be addressed to advance the field, such as hyperparameter sensitivity and the complexity of the optimization process.
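For readers unfamiliar with policy gradient methods, the following is a minimal sketch of a REINFORCE-style update on a toy problem. It is not the thesis's implementation: the two-action setting, the reward means, the running-average baseline, and the names `softmax` and `train_reinforce` are all assumptions chosen for illustration. The machine picks one of two hypothetical assistance actions, receives a noisy reward, and moves its action preferences along the gradient of the log-probability of the chosen action, weighted by how much the reward exceeded the baseline.

```python
import math
import random


def softmax(prefs):
    """Turn action preferences into a probability distribution."""
    m = max(prefs)  # subtract the max for numerical stability
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def train_reinforce(reward_means, steps=2000, lr=0.1, seed=0):
    """REINFORCE with a running-average baseline on a toy bandit.

    reward_means: hypothetical expected reward of each action
    (an assumption for illustration, not data from the thesis).
    Returns the learned action probabilities.
    """
    rng = random.Random(seed)
    prefs = [0.0] * len(reward_means)  # policy parameters
    baseline = 0.0                     # running mean of observed rewards
    for t in range(1, steps + 1):
        probs = softmax(prefs)
        # Sample an action from the current stochastic policy.
        action = rng.choices(range(len(prefs)), weights=probs)[0]
        # Noisy reward, standing in for feedback from the task.
        reward = rng.gauss(reward_means[action], 0.1)
        baseline += (reward - baseline) / t
        advantage = reward - baseline
        # For a softmax policy, d/d prefs[a] of log pi(action)
        # equals 1[a == action] - probs[a].
        for a in range(len(prefs)):
            grad_log = (1.0 if a == action else 0.0) - probs[a]
            prefs[a] += lr * advantage * grad_log
    return softmax(prefs)


if __name__ == "__main__":
    # Action 1 has the higher hypothetical mean reward.
    print(train_reinforce([0.2, 0.8]))
```

Even in this toy setting, the learning rate `lr` and the number of `steps` noticeably affect how quickly the policy concentrates on the better action, which gives a small-scale view of the hyperparameter sensitivity the abstract mentions.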