Optimization of Multi-Price Tag Text Detection and Extraction on Supermarket Shelves Using YOLOv11, EasyOCR, and Adaptive Image Enhancement
Hasmidar M.
Proceedings of the 7th International Conference on Informatics, Multimedia, Cyber and Information System (ICIMCIS 2025)
Abstract
This study introduces an improved pipeline for detecting and extracting text from multiple price tags on supermarket shelves. YOLOv11 detects two classes of region, product names and product prices, and EasyOCR recognizes the text in each region, supported by an Adaptive Image Enhancement strategy. The enhancement combines conditional super-resolution with Real-ESRGAN on small regions of interest (ROIs) and class-specific preprocessing branches: product-name ROIs undergo LAB (L-channel) conversion, CLAHE, denoising, deskewing, and unsharp masking, while price ROIs undergo grayscale conversion and unsharp masking. Experiments used a custom dataset from four Indonesian retail chains (Alfamart, Alfamidi, Indomaret, and Satusama) covering diverse label templates, typography, and lighting conditions, captured at a distance of 30 cm and camera angles of 0–15°. YOLOv11 achieved near-perfect detection performance, with an average precision of 99.1%, recall of 99.6%, F1-score of 99.3%, mAP50 of 99.3%, and mAP50-95 of 78.4%. At the OCR stage, the baseline YOLOv11–EasyOCR configuration attained an average Exact Match (EM) of 82.8%, a Character Error Rate (CER) of 4.8%, and a Word Error Rate (WER) of 7.9%. Among the tested preprocessing variants, the proposed method consistently performed best, increasing EM to 88.7% (+5.9 percentage points), reducing CER to 2.9% (–1.9 percentage points), and reducing WER to 4.6% (–3.3 percentage points). These results indicate that integrating YOLOv11, EasyOCR, and Adaptive Image Enhancement substantially improves the robustness of price-label reading under real supermarket conditions.
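The three OCR metrics reported above can be illustrated with a minimal sketch. It assumes the standard definitions: EM is the fraction of predictions identical to the reference, and CER and WER are Levenshtein edit distances over characters and whitespace-separated words, normalized by the total reference length. The abstract does not state how the averages were taken (per image, per chain, or over the whole corpus), so the corpus-level normalization here is an assumption, and the function names `edit_distance` and `ocr_metrics` are hypothetical.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or token lists)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def ocr_metrics(references, hypotheses):
    """Return EM, CER, and WER as fractions.

    Assumption: errors are summed over all samples and divided by the
    total reference length (corpus-level), which may differ from the
    averaging used in the paper.
    """
    if not references or len(references) != len(hypotheses):
        raise ValueError("need two non-empty lists of equal length")

    pairs = list(zip(references, hypotheses))
    exact = sum(r == h for r, h in pairs)
    char_errors = sum(edit_distance(r, h) for r, h in pairs)
    char_total = sum(len(r) for r in references)
    word_errors = sum(edit_distance(r.split(), h.split()) for r, h in pairs)
    word_total = sum(len(r.split()) for r in references)

    return {
        "EM": exact / len(pairs),
        "CER": char_errors / char_total,
        "WER": word_errors / word_total,
    }


# Illustrative strings only, not taken from the paper's dataset.
refs = ["Rp 12.500", "INDOMIE GORENG"]
hyps = ["Rp 12.500", "INDOMIE G0RENG"]  # one character misread as a zero
scores = ocr_metrics(refs, hyps)
```

With these two made-up labels, a single misread character leaves one of two strings exact, costs one of 23 reference characters, and spoils one of four reference words, which shows why EM and WER penalize a small slip far more heavily than CER does.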