Version 2 2025-01-15, 01:13
Version 1 2023-05-23, 12:08
conference contribution
posted on 2025-01-15, 01:13, authored by VW Chu, RK Wong, C Chi, PCK Hung
Because web services are widely used to deliver services on the Web, a clear view of how they are consumed is becoming critical. Researchers have tried multiple methods to reveal actual service orchestration patterns from service logs. However, most discovery methods take deterministic approaches and hence do not provide enough allowance for incomplete data and noise. Moreover, most investigations do not take combinatorial explosion into consideration, which leads to scalability problems. Asynchronous web service invocations and distributed executions also make it difficult to identify service patterns, owing to the randomness in log record generation. In this paper, the probabilistic topic mining class of solutions, for which robust approximation methods are available to provide scalability, is applied to reveal web service orchestration patterns from service logs. The data sparsity problem in service logs is also investigated by using the biterm topic model (BTM) and comparing its results with those of the traditional latent Dirichlet allocation (LDA) model. In addition, a topic matching method is introduced based on applying the Hungarian method to a Jensen-Shannon divergence matrix, and the notions of aggJSD and autoJSD are introduced to measure topic diversity between matched topic sets and within a single topic set, respectively. Experimental results confirm that BTM can be used for service logs with short log entries and with sparsity above approximately 90%.
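The topic matching step described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes each topic is a probability distribution over the same vocabulary and that both topic sets have the same size. For brevity, the optimal one-to-one assignment is found here by exhaustive search over permutations, which yields the same minimum-cost matching the Hungarian method would on small inputs; a polynomial-time assignment solver would be needed for realistic topic counts. The abstract does not define aggJSD or autoJSD, so the two summary functions below (`mean_matched_jsd` and `mean_within_set_jsd`) are hypothetical stand-ins that simply average divergences between matched topics and within one set. All function names and the toy distributions are illustrative assumptions.

```python
from itertools import permutations
from math import log2


def kl(p, q):
    """Kullback-Leibler divergence in bits; terms with p_i == 0 contribute 0."""
    return sum(pi * log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def jsd(p, q):
    """Jensen-Shannon divergence (base 2, so the value lies in [0, 1])."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def match_topics(set_a, set_b):
    """Return (assignment, cost), where assignment[i] is the index in set_b
    matched to topic i of set_a, minimising the total JSD.

    Assumption: exhaustive search replaces the Hungarian method here; it
    finds the same optimum but is only practical for a handful of topics.
    """
    if len(set_a) != len(set_b):
        raise ValueError("this sketch assumes equally sized topic sets")
    cost = [[jsd(a, b) for b in set_b] for a in set_a]
    n = len(set_a)
    best = min(
        permutations(range(n)),
        key=lambda perm: sum(cost[i][perm[i]] for i in range(n)),
    )
    return list(best), cost


def mean_matched_jsd(cost, assignment):
    """Hypothetical between-set summary: mean JSD over matched topic pairs."""
    return sum(cost[i][j] for i, j in enumerate(assignment)) / len(assignment)


def mean_within_set_jsd(topics):
    """Hypothetical within-set summary: mean JSD over all distinct pairs."""
    pairs = [
        (i, j) for i in range(len(topics)) for j in range(i + 1, len(topics))
    ]
    return sum(jsd(topics[i], topics[j]) for i, j in pairs) / len(pairs)


# Toy topic-word distributions (made up for illustration, not from the paper).
lda_topics = [[0.7, 0.2, 0.1], [0.1, 0.1, 0.8], [0.2, 0.7, 0.1]]
btm_topics = [[0.1, 0.8, 0.1], [0.6, 0.3, 0.1], [0.1, 0.2, 0.7]]

assignment, cost = match_topics(lda_topics, btm_topics)
print("matching:", assignment)
print("mean matched JSD:", mean_matched_jsd(cost, assignment))
print("mean within-set JSD (LDA):", mean_within_set_jsd(lda_topics))
```

In this toy setting, a low between-set value alongside a high within-set value would suggest that the two models recovered similar, well-separated topics; how the paper actually defines and interprets its measures is not stated in the abstract.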