Automating Rock Classification: A Vision Transformer Approach in Brazil's Ornamental Stone
DOI: https://doi.org/10.55592/cilamce.v6i06.10144
Keywords: Neural networks, Image classification, Vision Transformer
Abstract
Brazil's ornamental stone sector works with a wide variety of rock types, yet classifying them still depends largely on subjective assessment and the specialized expertise of professionals. This dependence has spurred interest in using artificial intelligence (AI) to support image-based classification in the field. This study builds a labeled database of ornamental rock images, comprising 1,798 images in 12 classes, and makes it publicly available. It also proposes applying a Vision Transformer network to the task, specifically SI-ViT (Shuffle Instance-based Vision Transformer), which was originally developed for the automated classification of pancreatic cancer images. In comparative evaluations, SI-ViT outperformed the established models VGG16, VGG19, ResNet50, ResNet101, Xception, and Inception v3, reaching an accuracy of 99.68%.
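For readers unfamiliar with the metric, accuracy here can be read as the fraction of images whose predicted class matches the labeled class. The sketch below is a minimal illustration under stated assumptions, not the evaluation code used in the study: the function names `accuracy` and `per_class_recall` are hypothetical, the labels are invented toy values, and integer ids 0 to 11 merely stand in for the 12 rock classes.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted class equals the true class."""
    if len(y_true) != len(y_pred):
        raise ValueError("label lists must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def per_class_recall(y_true, y_pred):
    """For each true class, the share of its samples predicted correctly."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {c: hits[c] / totals[c] for c in totals}


# Toy labels invented for illustration: one sample per class id (0..11),
# with the last sample misclassified as class 0.
y_true = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
y_pred = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 0]

acc = accuracy(y_true, y_pred)
recalls = per_class_recall(y_true, y_pred)
print(f"accuracy = {acc:.2%}")
```

Reporting per-class recall alongside overall accuracy is a common design choice when classes are uneven in size, since a high overall figure can hide a weak class. Whether the study reports such a breakdown is not stated in the abstract.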