Real-Time Data Processing and Distributed System Optimization with Kafka And Cassandra

Authors

  • Fnu Pawan Kumar, Birla Technical Training Institute, Pilani, Rajasthan, India

DOI:

https://doi.org/10.47392/IRJAEM.2025.0424

Keywords:

Apache Cassandra, Apache Kafka, Distributed systems, Real-time data processing, Stream analytics

Abstract

The rapid growth of data volume across industries has intensified the need for real-time data processing and optimization strategies. Distributed systems must now handle diverse workloads while remaining efficient and scalable. Apache Kafka and Apache Cassandra have emerged as dominant technologies for streaming and storing high-throughput, low-latency data in real-time analytics pipelines. This review analyzes the role of Apache Kafka in data ingestion and stream processing and the role of Apache Cassandra in data storage and distributed database management. It also discusses how the two systems interact and how integrating them can improve responsiveness, fault tolerance, and data consistency in distributed systems. The review examines middleware and stream pipelines, a range of use cases, and recent applications in environmental monitoring, healthcare, and power systems. Finally, by synthesizing prior work from recent conference papers and journal articles, it outlines methodologies, tools, and architecture patterns for real-time data processing and system optimization using Kafka and Cassandra.

Published

2025-08-11