# The Anoa-L01 Benchmark: Prompt-Based Zero-Shot Evaluation for Sulawesi's Regional Languages Detection in LLMs

> Yuyun

- Canonical URL: https://discover.unhas.ac.id/publications/the-anoa-l01-benchmark-prompt-based-zero-shot-evaluation-for-sulawesis-regional
- Journal / Conference: International Conference on Computer Control Informatics and Its Applications (IC3INA)
- Year of publication: 2025
- DOI: https://doi.org/10.1109/IC3INA68387.2025.11325480
- ISSN: 2994-5933
- Citations: 0

## Authors

- Yuyun

## Abstract

In recent years, large language models (LLMs) have demonstrated impressive performance across a wide range of natural language processing tasks. However, their performance on low-resource languages remains largely underexplored. This paper proposes the Language Detection Prompting (LDP) framework, a prompt-based zero-shot strategy designed to identify the language of an input text without requiring fine-tuning for each target language. We introduce Anoa, a term we use to refer to the regional languages spoken in Southern, Western, and Southeastern Sulawesi, Indonesia. To support this effort, we collected a dataset covering 13 languages by extracting text from traditional folktale books from these regions. We evaluate seven pretrained LLMs: Gemma 7B, LLaMA 2 7B, LLaMA 3.1 8B, and Mistral 7B Instruct, as well as three variants of Gemini (Gemini 1.5 Flash, Gemini 1.5 Pro, and Gemini 2.0 Flash). We use two distinct types of prompts: the first asks the model to identify the primary language of a given text, while the second asks it to name the languages of a provided sentence. We evaluate model predictions by comparing the output of prompt-based inference against the gold-standard (ground-truth) labels. Our experiments show that the Gemini model demonstrates superior zero-shot capabilities in identifying the primary language of texts.
Our findings further reveal that the model not only succeeds in language identification but also detects a high degree of linguistic relatedness among the identified languages.

## Keywords

- Computer science
- Natural language processing
- Language identification
- Artificial intelligence
- Language model
- Inference
- Identification (biology)
- Natural language
- Term (time)
- Range (aeronautics)
- Linguistics
- Spoken language
- Natural (archaeology)
- Written language

---

Source: Discover Unhas (RIMS Universitas Hasanuddin). When citing, use the DOI if available, or otherwise the canonical URL above.