AI-Powered Automation Companion: A Semantic-Aware On-Device Mobile Automation Framework
DOI:
https://doi.org/10.47392/IRJAEM.2026.0093
Keywords:
Mobile Automation, Robotic Process Automation (RPA), Computer Vision, Edge AI, Accessibility Services
Abstract
Manual execution of repetitive mobile tasks is inefficient, and existing automation tools rely on brittle coordinate-based mechanisms or privacy-invasive cloud processing. This paper presents the AI-Powered Automation Companion, a comprehensive, offline-first Android framework designed for adaptive, privacy-preserving task automation. The system replaces rigid coordinates with semantic screen understanding by deploying a quantized YOLOv11s (INT8) model directly on-device. To manage complex logic, this vision engine is integrated with a visual node-based Flow Builder and cross-device LAN synchronization. The framework also supports context modules—including location, battery, and app-specific triggers—to enable intelligent, multi-app routines. Experimental evaluation demonstrates that the proposed semantic pipeline achieves a mean Average Precision (mAP@0.5) of 0.846 while maintaining an effective inference latency of ~230 ms. Furthermore, robustness testing shows the system maintains a 95% workflow success rate under dynamic UI layout changes. By combining local AI, a visual workflow editor, and cross-device execution, the framework delivers a robust, explainable, and fully private automation ecosystem at the edge.
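To illustrate what replacing rigid coordinates with semantic screen understanding can mean in practice, the following is a minimal sketch and not the authors' implementation. It assumes the on-device detector returns labeled bounding boxes with confidence scores. The `Detection` type, the `resolve_tap_target` helper, the label names, and the 0.5 confidence threshold are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class Detection:
    """One detector output (hypothetical structure, not the paper's API)."""
    label: str                               # assumed class name, e.g. "send_button"
    confidence: float                        # detector score in [0, 1]
    box: Tuple[float, float, float, float]   # (x_min, y_min, x_max, y_max) in pixels


def resolve_tap_target(
    detections: Sequence[Detection],
    label: str,
    min_confidence: float = 0.5,             # illustrative threshold, not from the paper
) -> Optional[Tuple[float, float]]:
    """Return the center of the most confident box with the given label, or None."""
    candidates = [
        d for d in detections
        if d.label == label and d.confidence >= min_confidence
    ]
    if not candidates:
        return None
    best = max(candidates, key=lambda d: d.confidence)
    x_min, y_min, x_max, y_max = best.box
    return ((x_min + x_max) / 2, (y_min + y_max) / 2)


# The same rule ("tap the send button") applied to two different layouts.
layout_a = [
    Detection("text_field", 0.88, (50, 1800, 780, 1900)),
    Detection("send_button", 0.91, (800, 1800, 1000, 1900)),
]
layout_b = [
    Detection("send_button", 0.87, (100, 400, 300, 500)),
]

tap_a = resolve_tap_target(layout_a, "send_button")
tap_b = resolve_tap_target(layout_b, "send_button")
tap_missing = resolve_tap_target(layout_b, "attach_button")
```

Because the workflow step refers to a label rather than a fixed pixel position, the tap point follows the element when the layout changes, and a missing element is reported as `None` instead of producing a blind tap.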
License
Copyright (c) 2026 International Research Journal on Advanced Engineering and Management (IRJAEM)

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.