Real-Time Data Processing and Distributed System Optimization with Kafka and Cassandra
DOI:
https://doi.org/10.47392/IRJAEM.2025.0424
Keywords:
Apache Cassandra, Apache Kafka, Distributed systems, Real-time data processing, Stream analytics
Abstract
The rapid expansion of data volume across industries has intensified the need for real-time data processing and optimization strategies. Distributed systems must now handle diverse workloads while remaining efficient and scalable. Apache Kafka and Apache Cassandra have emerged as dominant technologies for streaming and storing high-throughput, low-latency data in real-time analytics pipelines. This review analyzes the role of Apache Kafka in data ingestion and stream processing and the role of Apache Cassandra in data storage and distributed database management. It also discusses how the two systems interact and how integrating them can improve responsiveness, fault tolerance, and data consistency in distributed systems. The review covers middleware and stream pipelines, a range of use cases, and recent applications in environmental monitoring, healthcare, and power systems. Lastly, by synthesizing previous work from recent conference papers and journal articles, this review outlines methodologies, tools, and architecture patterns for achieving real-time data processing and system optimization with Kafka and Cassandra.
License
Copyright (c) 2025 International Research Journal on Advanced Engineering and Management (IRJAEM)

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.