Leveraging InfiniBand for Highly Concurrent Messaging in Java Applications

Bibliographic Details
Published in: 2019 18th International Symposium on Parallel and Distributed Computing (ISPDC), pp. 74-83
Main Authors: Nothaas, Stefan; Beineke, Kevin; Schoettner, Michael
Format: Conference Proceeding
Language: English
Published: IEEE, 01.06.2019
DOI: 10.1109/ISPDC.2019.00013

More Information
Summary: In this paper, we describe the design and implementation of Ibdxnet, an InfiniBand transport that enables high-throughput and low-latency messaging for concurrent Java applications with transparent serialization of Java objects using DXNet. Ibdxnet applies best practices by implementing a dynamic and scalable pipeline with RC QPs and messaging verbs using the ibverbs library. A carefully designed JNI layer connects the native Ibdxnet library to its Java counterpart with minimal overhead and without impacting performance. Existing as well as new multi-threaded Java applications can use DXNet's event-based architecture to concurrently send and receive messages and requests transparently over InfiniBand with the Ibdxnet transport. We compared DXNet with Ibdxnet to the InfiniBand-supporting MPI implementations FastMPJ and MVAPICH2. DXNet's performance for medium-sized and large messages keeps up with FastMPJ's and MVAPICH2's. For small messages, DXNet clearly outperforms both systems, especially in a multi-threaded environment. Furthermore, we compared the two key-value stores DXRAM, which uses DXNet with Ibdxnet, and RAMCloud, which uses a custom network subsystem based on ibverbs, using the YCSB with two workloads. On a graph data workload, DXRAM outperforms RAMCloud with five times higher throughput, reaching 7.96 mops on 40 nodes.
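
The summary highlights two design points that a short sketch can make concrete: messaging runs in a native ibverbs-based transport, and a thin JNI layer bridges that transport to Java with minimal overhead. The following Java sketch illustrates what such a thin binding can look like. All class, method, and library names used here (IbJniBinding, Callbacks, postSend, the "ibjni" library) are illustrative assumptions for this record, not the actual Ibdxnet or DXNet API.

// Minimal sketch of a thin JNI binding between a Java messaging layer and a
// native ibverbs-based transport. All names (IbJniBinding, Callbacks,
// postSend, the "ibjni" library) are hypothetical and chosen for illustration;
// they are not the actual Ibdxnet/DXNet API.
public final class IbJniBinding {

    static {
        // Loads the native transport library (e.g. libibjni.so); the library
        // name is an assumption for this sketch.
        System.loadLibrary("ibjni");
    }

    /**
     * Callbacks the native side invokes from its receive path. Handing over
     * the address and length of an off-heap buffer instead of copying into a
     * Java array keeps each JNI crossing cheap, matching the "minimal
     * overhead" goal described in the summary.
     */
    public interface Callbacks {
        void received(short sourceNodeId, long bufferAddress, int length);
        void nodeConnected(short nodeId);
        void nodeDisconnected(short nodeId);
    }

    /** Initializes the native transport; signature is illustrative. */
    public static native boolean init(short ownNodeId, Callbacks callbacks,
            int sendBufferSize, int receiveBufferSize);

    /** Hands a region of an off-heap (direct) send buffer to the native side. */
    public static native void postSend(short targetNodeId, long bufferAddress,
            int offset, int length);

    /** Shuts down the native transport and releases its resources. */
    public static native void shutdown();

    private IbJniBinding() {
        // Static-only holder; not instantiable.
    }
}

In a design of this kind, the Java side typically aggregates outgoing messages into direct (off-heap) buffers and passes only addresses and lengths across the JNI boundary, so no per-message object marshalling or array copying occurs on the hot path.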