Universitas Hasanuddin
Research output: Contribution to journal › Article › peer-review

Extraction Of Contributions In Academic Papers Using T5 And BERT Algorithms

Tadjuddin N.A.

Proceedings, 7th International Conference on Informatics, Multimedia, Cyber and Information System (ICIMCIS 2025)

Published: 2025

Abstract

Understanding the main contributions of a research article is an important step in assessing its novelty and scientific impact. However, this process is still largely manual and time-consuming. This research develops an automated method for extracting scientific contributions using transformer-based models. Two approaches were tested: the Text-To-Text Transfer Transformer (T5) model alone, and a hybrid that combines Bidirectional Encoder Representations from Transformers (BERT) with T5. The dataset consists of computer science articles that were manually annotated to identify contribution sentences. The models were trained and then evaluated with the ROUGE and BERTScore metrics to assess syntactic and semantic performance, respectively. The experimental results show that the BERT+T5 hybrid outperformed T5 alone, with a significant increase in F1 scores (ROUGE-1 F1: 0.7018; BERTScore F1: 0.8700). Integrating contextual embeddings from BERT improves the semantic representation of the text, enabling T5 to produce more accurate and relevant contribution extractions. These results show the potential of transformer-based models for automating scientific information extraction and provide a basis for developing intelligent systems that accelerate knowledge discovery and the analysis of academic literature.
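
For readers unfamiliar with the ROUGE-1 F1 figure quoted in the abstract, the following is a minimal sketch of how such a score can be computed for one extracted contribution sentence against its annotated reference. It is not the authors' evaluation code: the function name `rouge1_f1`, the lowercase whitespace tokenization, and the example sentences are assumptions made for illustration, and published ROUGE implementations typically add further normalization such as stemming.

```python
from collections import Counter


def rouge1_f1(candidate: str, reference: str) -> float:
    """ROUGE-1 F1 sketch: harmonic mean of unigram precision and recall.

    Assumption: tokens are obtained by lowercasing and splitting on
    whitespace; real evaluation toolkits may tokenize differently.
    """
    cand_counts = Counter(candidate.lower().split())
    ref_counts = Counter(reference.lower().split())

    # Clipped overlap: each unigram counts at most as often as it
    # appears in both the candidate and the reference.
    overlap = sum((cand_counts & ref_counts).values())
    if overlap == 0:
        return 0.0

    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)


# Hypothetical example pair, not taken from the paper's dataset.
extracted = "we propose a model"
annotated = "we propose a new model"
score = rouge1_f1(extracted, annotated)
```

In this hypothetical pair all four extracted unigrams appear in the five-word reference, so precision is 1.0, recall is 0.8, and the F1 is 8/9. BERTScore follows the same precision, recall, and F1 structure but matches tokens by embedding similarity rather than exact string equality.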

Fingerprint

Computer science
Novelty
Transformer
Encoder
Automation
Artificial intelligence
Process (computing)
Semantic data model
Field (mathematics)
Knowledge representation and reasoning
Machine learning
Representation (politics)
Natural language processing
Embedding
Data mining
Information extraction
Knowledge extraction
Domain (mathematical analysis)
Information retrieval
Basis (linear algebra)
External Data Representation