Data exploration: large language models in the construction of Knowledge Graphs
DOI:
https://doi.org/10.55592/cilamce.v6i06.8207
Keywords:
Knowledge Graphs, Large Language Models (LLM), Natural Language Processing (NLP)
Abstract
Knowledge graphs (KGs) are graph-structured representations of information that capture the relationships between concepts, entities, or data. KGs play an important role in improving the performance of artificial intelligence systems and search tools. Constructing them, however, is a complex undertaking that requires assimilating substantial amounts of data. One approach is to use Large Language Models (LLMs), which are trained to comprehend and generate natural language. This study therefore proposes using an LLM for KG construction. The proposed model identifies the most pertinent subjects within a domain of knowledge (the nodes of the KG) and establishes the connections between them (the edges of the KG). To this end, the Stable Beluga 2 model, a fine-tuned version of Llama 2 70B, was employed. The model was run on Petals, a system for collaborative inference and fine-tuning of large-scale models that pools resources from multiple participants, which makes it possible to run such models with reduced computational resources. The result is an artificial intelligence model capable of generating knowledge graphs that serve various purposes, including summarizing concepts, identifying correlations between areas of study, and answering questions.
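The node-and-edge construction described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the prompt wording, the `subject | relation | object` output format, and the names `parse_triples`, `build_graph`, `construct_kg`, and `fake_generate` are all assumptions made for illustration. The `generate` argument stands in for any text-generation backend (for example, Stable Beluga 2 served through Petals); a canned stub is used here so the sketch runs offline.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Set, Tuple

Triple = Tuple[str, str, str]

# Hypothetical prompt; the paper does not state the prompt it used.
PROMPT_TEMPLATE = (
    "List the most relevant topics in the domain '{domain}' and how they relate.\n"
    "Answer with one relation per line in the form: subject | relation | object\n"
)


def parse_triples(text: str) -> List[Triple]:
    """Keep only lines that look like 'subject | relation | object'."""
    triples: List[Triple] = []
    for line in text.splitlines():
        parts = [part.strip() for part in line.split("|")]
        if len(parts) == 3 and all(parts):
            triples.append((parts[0], parts[1], parts[2]))
    return triples


def build_graph(triples: List[Triple]) -> Tuple[Set[str], Dict[str, List[Tuple[str, str]]]]:
    """Return the node set and a mapping from subject to (relation, object) edges."""
    nodes: Set[str] = set()
    edges: Dict[str, List[Tuple[str, str]]] = defaultdict(list)
    for subject, relation, obj in triples:
        nodes.update((subject, obj))
        edges[subject].append((relation, obj))
    return nodes, dict(edges)


def construct_kg(domain: str, generate: Callable[[str], str]):
    """Prompt the model for a domain and turn its reply into a graph."""
    reply = generate(PROMPT_TEMPLATE.format(domain=domain))
    return build_graph(parse_triples(reply))


def fake_generate(prompt: str) -> str:
    """Stand-in for a real LLM call; returns a canned reply."""
    return (
        "Here are the relations:\n"
        "neural network | is a | machine learning model\n"
        "backpropagation | trains | neural network\n"
    )


nodes, edges = construct_kg("machine learning", fake_generate)
print(sorted(nodes))
print(edges)
```

Keeping the model call behind a plain callable means the parsing and graph-building steps can be checked without access to a 70B-parameter model. In practice the reply format would need stricter validation, since LLM output does not always follow the requested pattern.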