Adaptive and large-scale service composition based on deep reinforcement learning
In a service-oriented system, simple services are combined to form value-added services to meet users’ complex requirements. As a result, service composition has become a common practice in service computing. With the rapid development of web service technology, a massive number of web services with...
Saved in:
Published in | Knowledge-based systems Vol. 180; pp. 75 - 90 |
---|---|
Main Authors | , , , , , , , , |
Format | Journal Article |
Language | English |
Published |
Amsterdam
Elsevier B.V
15.09.2019
Elsevier Science Ltd |
Subjects | |
Online Access | Get full text |
Cover
Loading…
Summary: | In a service-oriented system, simple services are combined to form value-added services to meet users’ complex requirements. As a result, service composition has become a common practice in service computing. With the rapid development of web service technology, a massive number of web services with the same functionality but different non-functional attributes (e.g., QoS) are emerging. The increasingly complex user requirements and the large number of services lead to a significant challenge to select the optimal services from numerous candidates to achieve an optimal composition. Meanwhile, web services accessible via computer networks are inherently dynamic and the environment of service composition is also complex and unstable. Thus, service composition solutions need to be adaptable to the dynamic environment. To address these key challenges, we propose a new service composition scheme based on Deep Reinforcement Learning (DRL) for adaptive and large-scale service composition. The proposed approach is more suitable for the partially observable service environment, making it work better for real-world settings. A recurrent neural network is adopted to improve reinforcement learning, which can predict the objective function and enhance the ability to express and generalize. In addition, we employ the heuristic behavior selection strategy, in which the state set is divided into the hidden and fully observable state sets, to perform the targeted behavior selection strategy when facing with different types of states. The experimental results justify the effectiveness and efficiency, scalability, and adaptability of our methods by showing obvious advantages in composition results and efficiency for service composition. |
---|---|
ISSN: | 0950-7051 1872-7409 |
DOI: | 10.1016/j.knosys.2019.05.020 |