Data exploration: large language models in the construction of Knowledge Graphs

Authors

  • Cecília de Freitas Vieira Couto UFRJ - Universidade Federal do Rio de Janeiro
  • Nelson Francisco Favilla Ebecken UFRJ - Universidade Federal do Rio de Janeiro

DOI:

https://doi.org/10.55592/cilamce.v6i06.8207

Keywords:

Knowledge Graphs, Large Language Models (LLM), Natural Language Processing (NLP)

Abstract

Knowledge graphs (KGs) are graph-structured representations of information that capture the relationships between concepts, entities, or data. KGs play an important role in improving the performance of artificial intelligence systems and search tools. However, constructing a knowledge graph is a complex task that requires processing substantial amounts of data. One approach to building KGs is to use Large Language Models (LLMs), which can interpret and generate natural language. This study therefore proposes using an LLM for KG construction. The proposed model identifies the most relevant topics within a domain of knowledge (the nodes of the KG) and establishes the connections between those topics (the edges of the KG). To this end, the Stable Beluga 2 model, a fine-tuned version of Llama 2 70B, was employed. The model was run on the Petals architecture, a system for collaborative inference and fine-tuning of large-scale models that pools resources from multiple participants, which makes it possible to run large-scale models with reduced local computational resources. The outcome of this work is an artificial intelligence model capable of generating knowledge graphs that serve various purposes, including summarizing concepts, identifying correlations between areas of study, and answering questions.
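
To make the node and edge extraction described in the abstract concrete, the following is a minimal sketch and not the authors' implementation. It assumes the LLM has been prompted to answer with one relation per line in the form "subject | relation | object"; that line format, the function names, and the sample text are all hypothetical. The model call itself is replaced by a hard-coded string, so no Stable Beluga 2 or Petals setup is needed to run it.

```python
from collections import defaultdict

# Stand-in for text returned by the LLM. The "subject | relation | object"
# format is an assumption made for this sketch, not taken from the paper.
SAMPLE_OUTPUT = """Here are the relations:
knowledge graph | contains | entity
knowledge graph | contains | relationship
large language model | builds | knowledge graph
"""


def parse_triples(llm_output):
    """Extract (subject, relation, object) triples, skipping malformed lines."""
    triples = []
    for line in llm_output.splitlines():
        parts = [part.strip() for part in line.split("|")]
        # Keep only lines with exactly three non-empty fields.
        if len(parts) == 3 and all(parts):
            triples.append(tuple(parts))
    return triples


def build_graph(triples):
    """Build an adjacency mapping: node -> list of (relation, target node)."""
    graph = defaultdict(list)
    for subject, relation, obj in triples:
        graph[subject].append((relation, obj))
        # Register the target as a node even if it has no outgoing edges.
        graph.setdefault(obj, [])
    return dict(graph)


if __name__ == "__main__":
    kg = build_graph(parse_triples(SAMPLE_OUTPUT))
    for node, edges in kg.items():
        print(node, "->", edges)
```

In this sketch the dictionary keys are the nodes of the KG and each (relation, target) pair is a labeled edge. Lines that do not match the assumed format, such as introductory text from the model, are silently ignored.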

Published

2024-12-02

Section

Computational Intelligence Techniques for Optimization and Data Modeling