Knowledge graphs now contain billions of entities and relationships, yet their creation and maintenance remain labor-intensive. Current production systems process hundreds of terabytes of unstructured text to extract structured knowledge, with precision rates varying from 65% to 95% depending on the domain and relationship types. Even state-of-the-art systems struggle with context-dependent relationships and temporal knowledge.

The fundamental challenge lies in bridging the gap between unstructured human knowledge and structured, machine-readable representations while maintaining accuracy and coverage.

This page brings together solutions from recent research, including neural relation extraction architectures, ontology learning approaches, entity linking systems, and temporal knowledge modeling frameworks. These approaches aim to create more complete and accurate knowledge graphs while reducing the manual effort required for their construction and maintenance.

1. Ontology Expansion via Document Analysis and Large Language Models with Machine Learning-Driven Link Identification

RAYTHEON CO, 2025

Growing ontologies efficiently and accurately from unstructured data using document analysis and large language models (LLMs), without requiring expert knowledge. The method uses ML models to identify missing links in an existing ontology, then uses LLMs to determine the relationship types for those links. This reduces reliance on experts and limits LLM confabulation: ML models first identify potential gaps, and LLMs then fill in the details.

[Patent drawing: US2025238458A1]

2. Data Processing Pipeline with Machine Learning-Based Ingestion, Aggregation, and Transformation Mechanisms

OPTUM INC, 2025

A data processing pipeline using machine learning to intelligently ingest, aggregate, manage, and transform data from disparate sources for improved accuracy, contextualization, and formatting. The pipeline leverages machine learning models to generate structured data objects from ingested data, filter and transform features based on domain tasks, and render contextualized visualizations. This addresses challenges of time-consuming ingestion, resource-intensive transformation, and inaccurate datasets for downstream tasks. The pipeline includes models for formatting, contextualization, and task-specific requirements.

3. Method for Constructing Graphical Models and Ontologies in Standardized Formats for Manufacturing Data Integration

ACCENTURE GLOBAL SOLUTIONS LTD, 2025

Automatically building graphical models and ontologies for manufacturing plants and processes using standardized formats to enable querying and analysis of manufacturing data. The method involves generating virtual representations of plant parts and process steps using a hierarchical template model, converting non-standardized input data to the standardized format, and providing responses to user queries using the standardized format. This enables consistent and interoperable representation and analysis of manufacturing data across different hardware and software platforms.

[Patent drawing: US12353197B2]

4. Drone-Operated Audio Fault Detection System Utilizing Graph Neural Networks for Industrial Devices

ZHEJIANG HENGYI PETROCHEMICAL CO LTD, 2025

Audio-based fault detection method and apparatus for industrial devices that uses drones to collect audio data, preprocesses it, extracts features, and applies graph neural networks to identify faults. The drone monitors devices by collecting initial audio, and preprocessing removes noise and silences. Features such as frequency, duration, and intensity are then extracted, and a graph is constructed from the similarity of these features. Fault detection uses this graph together with a neural network; the graph helps identify associations between audio segments. For reactors, faults are confirmed by comparing parameters from abnormal audio. A language model improves the graph by mining entity relationships.

5. Record Clustering and Matching System Using Probabilistic Scoring and Weighted Graph Analysis

EXPERIAN INFORMATION SOLUTIONS INC, 2025

Efficiently clustering and matching records from multiple sources to identify entities and resolve duplicates, even when records have varying information. The method involves using a scoring model to determine probabilities that pairs of records represent the same entity based on their features. These probabilities are used to build a weighted graph connecting the records. Connected component analysis prunes weak links below a threshold. Then optimal weighted clustering groups the records into final clusters with unique identifiers. This allows efficient entity resolution of large numbers of records with varying data.
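The scoring, thresholding, and connected-component steps described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patented method: `match_probability` is a hypothetical stand-in for the trained scoring model (it simply averages string similarity over shared fields), the record fields are invented, and the final optimal weighted clustering step is omitted.

```python
from difflib import SequenceMatcher
from itertools import combinations


def match_probability(a: dict, b: dict) -> float:
    """Hypothetical stand-in for a trained scoring model:
    mean string similarity over the fields both records share."""
    fields = [f for f in a if f in b and f != "id"]
    if not fields:
        return 0.0
    sims = [
        SequenceMatcher(None, str(a[f]).lower(), str(b[f]).lower()).ratio()
        for f in fields
    ]
    return sum(sims) / len(sims)


def cluster_records(records: list, threshold: float = 0.8) -> dict:
    """Return {record id: cluster id} using thresholded connected components."""
    parent = {r["id"]: r["id"] for r in records}

    def find(x):
        # Union-find lookup with path halving
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(records, 2):
        # Links scoring below the threshold are pruned (never added)
        if match_probability(a, b) >= threshold:
            parent[find(a["id"])] = find(b["id"])
    return {r["id"]: find(r["id"]) for r in records}


# Illustrative records (invented for this sketch)
records = [
    {"id": 1, "name": "Jane Doe", "city": "Austin"},
    {"id": 2, "name": "JANE DOE", "city": "austin"},
    {"id": 3, "name": "Bob Smith", "city": "Denver"},
]
clusters = cluster_records(records)
```

Here records 1 and 2 end up in the same cluster and record 3 stays alone. Comparing every pair is quadratic, so a system handling large record volumes would presumably restrict comparisons to candidate pairs first; that step is outside this sketch.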

6. Named Entity Recognition with Contextual Domain Mapping via Labeled Extraction and Reverse Question-Answering

EXLSERVICE HOLDINGS INC, 2025

Intelligent named entity recognition that extracts entities from unstructured data, attaches domain-specific context to them, and interprets entity-related information in context. The technique involves labeled entity extraction, reverse question-answering to predict entity keys, and entity alignment to map predicted keys to domain keys. This enables contextualizing entities in chat sessions and resolving entities in unstructured data.

[Patent drawing: US12353832B1]

7. Knowledge Graph Alignment Using Subgraph Typing with Synthetic Type Assignment for Node Mapping

ROBERT BOSCH GMBH, 2025

Aligning and enriching multiple knowledge graphs to create a more comprehensive and accurate knowledge base for AI applications. The alignment is done using a subgraph typing technique that assigns synthetic types to nodes based on their structure and semantics. This allows matching nodes across graphs even if they have different labels or no explicit type labels. The matching is done by identifying valid node-node mappings based on subgraph type combinations that match other valid mappings.

[Patent drawing: US12354017B2]

8. Automated Knowledge Graph Update System Utilizing Contextual Prompt Construction with Language Models

HITACHI LTD, 2025

Automated knowledge graph updating using language models and prompts to efficiently update knowledge graphs based on documents without manual involvement. The system constructs prompts with hints from knowledge graph information to request language models for generating update queries. This leverages the graph context to generate appropriate queries even when document descriptions are incomplete. By using the knowledge graph to guide prompt construction, it avoids language models generating incorrect updates from unsuitable prompts.

[Patent drawing: US2025217672A1]
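The idea of building a prompt from knowledge graph hints, as described in the Hitachi entry above, might look roughly like the following sketch. Everything here is an assumption for illustration: the function name `build_update_prompt`, the triple format, the prompt wording, and the example data are hypothetical, and no language model is actually called.

```python
def build_update_prompt(document: str, graph_triples: list) -> str:
    """Assemble an LLM prompt that includes hints drawn from the existing graph.

    graph_triples is a list of (subject, predicate, object) tuples.
    """
    doc = document.lower()
    # Hint 1: existing facts about entities that the document mentions
    hints = [
        t for t in graph_triples
        if t[0].lower() in doc or t[2].lower() in doc
    ]
    # Hint 2: the relation vocabulary already used in the graph
    relation_types = sorted({p for _, p, _ in graph_triples})
    hint_lines = "\n".join(f"- ({s}) -[{p}]-> ({o})" for s, p, o in hints)
    return (
        "You maintain a knowledge graph.\n"
        f"Allowed relation types: {', '.join(relation_types)}\n"
        f"Existing facts about entities in the document:\n{hint_lines}\n"
        f"Document:\n{document}\n"
        "Write the graph update query needed to reflect the document."
    )


# Illustrative graph and document (invented for this sketch)
triples = [
    ("Pump A", "LOCATED_IN", "Plant 1"),
    ("Valve B", "PART_OF", "Line 7"),
]
prompt = build_update_prompt("Pump A was moved to Plant 2 last week.", triples)
```

Only the fact about Pump A is included as a hint, since Valve B and Line 7 are not mentioned in the document. The relation vocabulary is included in full so the model is nudged toward relation names the graph already uses.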

9. Entity Relationship Extraction System Utilizing Large Language Model for Keyword Identification and Specialized Agents for Relationship Parsing

BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO LTD, 2025

Efficiently extracting entity relationships from long texts using a combination of large language models and smaller relationship agents. The method involves: 1) Running a target long text through a large language model to get a list of important keywords. 2) Running the keyword list through multiple smaller relationship agents to generate regular expressions for different relationships. 3) Using the regular expressions to extract relationships from a set of texts. This leverages the large language model's comprehension for keyword extraction while delegating relationship identification to smaller, specialized agents.

[Patent drawing: US2025217594A1]
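The three-step flow in the Baidu entry above (keywords, then per-relation regular expressions, then extraction) can be illustrated with a small Python sketch. The assumptions are significant: the keyword list is hard-coded where the method would obtain it from a large language model, and the two "agents" are plain hypothetical functions with hand-written patterns rather than learned components.

```python
import re


def _alternation(keywords):
    # Escape each keyword so it is matched literally
    return "|".join(re.escape(k) for k in keywords)


def founder_agent(keywords):
    """Hypothetical agent: builds a regex for a 'founded' relation."""
    names = _alternation(keywords)
    return re.compile(
        rf"(?P<head>{names}) (?:founded|established) (?P<tail>{names})"
    )


def location_agent(keywords):
    """Hypothetical agent: builds a regex for a 'located_in' relation."""
    names = _alternation(keywords)
    return re.compile(
        rf"(?P<head>{names}) is (?:based|headquartered) in (?P<tail>{names})"
    )


AGENTS = {"founded": founder_agent, "located_in": location_agent}


def extract_relations(keywords, texts):
    """Apply every agent's regex to every text and collect triples."""
    triples = []
    for relation, agent in AGENTS.items():
        pattern = agent(keywords)
        for text in texts:
            for m in pattern.finditer(text):
                triples.append((m.group("head"), relation, m.group("tail")))
    return triples


# Step 1 stand-in: keywords an LLM might return (invented for this sketch)
keywords = ["Maria Chen", "Acme Corp", "Berlin"]
texts = [
    "Maria Chen founded Acme Corp in 1999.",
    "Acme Corp is headquartered in Berlin.",
]
relations = extract_relations(keywords, texts)
```

Once the regular expressions exist, extraction over additional texts needs no further model calls, which appears to be the efficiency argument behind the approach.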

10. Method for Constructing Knowledge Graphs from Data Warehouses via Entity Extraction and Schema Inference

DATAIRIS PLATFORM INC, 2025

Generating knowledge graphs for data warehouses to facilitate usage of the data. The method involves extracting entities, relations, and attributes from the data warehouse using techniques like entity recognition, relation discovery, and schema inference. These extracted elements are then organized into a knowledge graph that provides a structured and interconnected view of the data. This graph can be queried and analyzed using graph database technologies to provide insights and recommendations based on the underlying data.

[Patent drawing: US12346316B2]

11. Natural Language Query Processing System with Task Decomposition, Knowledge Graph Entity Retrieval, and Vector Database Text Chunk Integration

INTERNATIONAL BUSINESS MACHINES CORP, 2025

Generating more accurate and efficient natural language responses to user queries using AI techniques like knowledge graphs, vector searches, and large language models (LLMs). The method involves breaking down user queries into tasks, searching a knowledge graph for relevant entities, then searching a vector database for text chunks associated with those entities. This refined search scope improves the accuracy and efficiency of the vector search. The LLM is then used to process the retrieved text chunks and generate a response.

[Patent drawing: US12346356B2]
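The retrieval idea in the IBM entry above, where knowledge graph entities narrow the set of text chunks before the vector search runs, can be sketched as follows. This is a toy illustration under stated assumptions: the bag-of-words `embed` function stands in for a real embedding model, entity lookup is plain substring matching, the chunk data is invented, and the task decomposition and LLM response steps are left out.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding (assumption): bag-of-words counts instead of a real model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(u, v):
    dot = sum(u[w] * v[w] for w in u if w in v)
    norm = math.sqrt(sum(c * c for c in u.values())) * math.sqrt(
        sum(c * c for c in v.values())
    )
    return dot / norm if norm else 0.0


# Illustrative text chunks and entity-to-chunk links (invented for this sketch)
CHUNKS = {
    "c1": "The turbine bearing must be lubricated every 500 hours.",
    "c2": "The turbine blade inspection uses ultrasonic testing.",
    "c3": "The compressor bearing must be lubricated every 300 hours.",
}
ENTITY_TO_CHUNKS = {"turbine": {"c1", "c2"}, "compressor": {"c3"}}


def retrieve(query, top_k=1):
    """Restrict the vector search to chunks linked to entities in the query."""
    q = query.lower()
    entities = [e for e in ENTITY_TO_CHUNKS if e in q]
    if entities:
        candidates = set().union(*(ENTITY_TO_CHUNKS[e] for e in entities))
    else:
        candidates = set(CHUNKS)  # no entity found: search everything
    qv = embed(query)
    ranked = sorted(
        candidates,
        key=lambda cid: cosine(qv, embed(CHUNKS[cid])),
        reverse=True,
    )
    return ranked[:top_k]


query = "How often should the turbine bearing be lubricated?"
top = retrieve(query, top_k=2)
```

Chunk `c3` shares more words with the query than `c2` does, so a plain similarity search would rank it second. Because it is linked to the compressor rather than the turbine, the entity filter excludes it before ranking.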

12. Graph-Based Data Management System with Specialized Loaders and Multi-Modal Interface for Unified Data Representation and Reasoning

DATA SQUARED USA INC, 2025

Integrated system for managing, querying, and generating responses related to interconnected data using a graph database, specialized loaders, natural language reasoning, and a multi-modal user interface. The system ingests structured, unstructured, and time-series data from diverse sources, transforms it into a unified graph representation, and enables sophisticated reasoning tasks across modalities. It provides a secure, scalable, and interoperable solution for integrating, analyzing, visualizing, and reasoning over heterogeneous data at scale.

[Patent drawing: US12339839B2]

13. Entity Resolution System with Tokenization and Distance-Based Error Correction for Digitized Text

ONETRUST LLC, 2025

A system to accurately digitize physical documents and efficiently process digital text by correcting entity extraction errors and performing entity resolution using tokenization and distance metrics. It extracts entities from a text document, tokenizes the text based on position, compares extracted entities and tokenized strings for similarity, and generates mappings between entities and strings using correlation probabilities. This allows resolving entities to likely corresponding character strings while correcting for errors during extraction.
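A rough sketch of mapping an extracted entity back to the closest character string in position-tokenized text is shown below. It is illustrative only: `difflib.SequenceMatcher` is swapped in as the similarity measure (the entry does not say which distance metric is used), the similarity score stands in for the correlation probability, and the function names, threshold, and sample text are assumptions.

```python
import re
from difflib import SequenceMatcher


def tokenize_with_positions(text):
    """Return (token, start offset) pairs for each word in the text."""
    return [(m.group(), m.start()) for m in re.finditer(r"\w+", text)]


def resolve(entity, text, min_score=0.7):
    """Map an extracted entity to the most similar token window in the text.

    Returns (score, matched string, start offset); the last two are None
    when no window reaches min_score.
    """
    tokens = tokenize_with_positions(text)
    n = len(entity.split())  # compare against windows with the same word count
    best = (0.0, None, None)
    for i in range(len(tokens) - n + 1):
        window = tokens[i:i + n]
        candidate = " ".join(tok for tok, _ in window)
        score = SequenceMatcher(None, entity.lower(), candidate.lower()).ratio()
        if score > best[0]:
            best = (score, candidate, window[0][1])
    return best if best[0] >= min_score else (best[0], None, None)


# Illustrative input: the extracted entity contains an OCR-style error ("l" for "i")
text = "Invoice issued to Acme Corporation on 3 May."
score, matched, offset = resolve("Acme Corporatlon", text)
```

The misread entity resolves to the string "Acme Corporation" together with its offset in the text, so the extraction error can be corrected against the source.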

14. Design of Knowledge Service Model Combining Dynamic Knowledge Graph and Enterprise Risk Management based on Bidirectional Encoder Representation from Transformers Bidirectional Long Short-Term Memory

Yu Jiayin, Jiang Jiang, Yadong Shi - Research Square, 2025

A dynamic knowledge graph and service model were constructed to address the risk management needs of enterprises. By extracting, integrating, and processing entities, a complete knowledge graph is formed, and a service model is then designed to achieve intelligent management. The experiment shows that the entity extraction method based on the TextRank algorithm proposed by the research has an accuracy of 82.7%, a recall rate of 80.9%, and an F1 score of 81.8% in Class datasets. The relationship fusion method based on Bidirectional Encoder Representation from Transformers and Bidirectional Long Short-Term Memory (BERT-Bi-LSTM) reaches about 87.8% for fusion. The response time for 1000 enterprise transaction requests is 11.3 minutes, the maximum sustainable throughput is 1918 TPS, CPU utilization is 54.7%, and memory usage is 3.0 GB. The above results indicate that the model performs well on multiple core indicators, which can effectively improve the intelligence level.

15. Graph Embeddings to Empower Entity Retrieval

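Emma J. Gerritse, Faegheh Hasibi, Arjen P. de Vries, 2025

In this research, we investigate methods for entity retrieval using graph embeddings. While various methods have been proposed over the years, most utilize a single graph embedding and entity linking approach. This hinders our understanding of how different graph embedding and entity linking methods impact entity retrieval. To address this gap, we investigate the effects of three categories of graph embedding techniques and five entity linking methods. We perform a reranking of entities using the distance between the embeddings of annotated entities and the entities we wish to rerank. We conclude that the selection of both graph embeddings and entity linkers significantly impacts the effectiveness of entity retrieval. For graph embeddings, methods that incorporate both graph structure and textual descriptions of entities are the most effective. For entity linking, both precision and recall concerning concepts are important for optimal retrieval performance. Additionally, it is essential for the graph to encompass as many entities as possible.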

16. Large-scale materials knowledge extraction using LLMs and human-in-the-loop

Xintong Zhao, Xiaohua Hu, Jane Greenberg, 2025

Unstructured scientific text plays a critical role in preserving, transferring, and developing research knowledge. Valuable research outputs are often recorded in forms such as patents, articles, and project reports. Unlike generic text, scientific literature usually follows specialized formats and terminology. This significant difference leads to greater challenges and opportunities for NLP (Natural Language Processing) researchers. To automate the process of extracting and structuring domain-specific knowledge from unstructured text, this dissertation addresses these challenges by leveraging NLP methods for automated materials science knowledge extraction. Through three case studies, it explores the use of deep learning and LLM (Large Language Model) prompt-based techniques to extract synthesis knowledge from materials science texts. Building on these efforts, it introduces an end-to-end, cost-effective framework designed for large-scale knowledge extraction with domain experts in the loop. The framework demonstrates how combining LLMs with light human guidance enables scalable, accurate, and efficient processing of scientific literature. Together, these contributions aim to mitigate key bottlenecks and support the development of AI-ready data.

17. Parsing and Ranking-Based Automated Structured Data Object Generation from Unstructured Text

KEEPER TAX INC, 2025

Automated generation of structured data objects from unstructured text using parsing and ranking techniques. The method involves breaking down an input text into substrings and matching them against a list of entities and patterns to classify and rank them. The highest ranked substrings are then used to generate a structured data object like a JSON object. This allows converting unstructured text into structured format without requiring manual mapping or predefined schema.
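The parse-classify-rank flow described above can be sketched in Python as follows. This is a simplified illustration, not the patented method: the entity list, the patterns, and their scores are invented, and substrings are limited to one- and two-word windows.

```python
import json
import re

# Hypothetical entity list and patterns with hand-picked ranking scores
KNOWN_ENTITIES = {"uber": "merchant", "starbucks": "merchant"}
PATTERNS = [
    ("amount", re.compile(r"\$\d+(?:\.\d{2})?"), 0.9),
    ("date", re.compile(r"\d{4}-\d{2}-\d{2}"), 0.8),
]


def classify(substring):
    """Yield (label, score) for each entity or pattern the substring matches."""
    if substring.lower() in KNOWN_ENTITIES:
        yield KNOWN_ENTITIES[substring.lower()], 1.0
    for label, pattern, score in PATTERNS:
        if pattern.fullmatch(substring):
            yield label, score


def to_structured(text):
    """Keep the highest-scoring substring per label and emit a JSON object."""
    words = text.split()
    best = {}  # label -> (substring, score)
    for n in (1, 2):  # one- and two-word substrings
        for i in range(len(words) - n + 1):
            sub = " ".join(words[i:i + n]).strip(".,")
            for label, score in classify(sub):
                if score > best.get(label, (None, 0.0))[1]:
                    best[label] = (sub, score)
    return json.dumps({label: value for label, (value, _) in sorted(best.items())})


result = to_structured("Paid $23.50 to Uber on 2024-03-02.")
```

The keys of the resulting JSON object come from whichever labels matched, so no schema has to be declared ahead of time; the output for this input contains `amount`, `date`, and `merchant`.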

18. Unveiling the Hidden Dynamics of Knowledge Graphs: The Role of Superficiality in Structuring Information

Cédric Sueur - Peer Community In, 2024

Knowledge graphs [1][2][3][4] represent structured knowledge using nodes and edges, where nodes signify entities and edges denote relationships between these entities. These graphs have become essential in various fields such as cultural heritage [5], life sciences [6], and encyclopedic knowledge bases, thanks to projects like Yago [7], DBpedia [8], and Wikidata [9]. These knowledge graphs have enabled significant advancements in data integration and semantic understanding, leading to more informed scientific hypotheses and enhanced data exploration. Despite their importance, understanding the topology and dynamics of knowledge graphs remains a challenge due to their complex and often chaotic nature. Current models, like the preferential attachment mechanism, are limited to simpler networks and fail to capture the intricate interplay of diverse relationships in knowledge graphs. There is a pressing need for models that can accurately represent the structure and dynamics of knowledge graphs, allowing for better understanding, prediction, and utilisation of the knowledge contained within th...

19. Process hyper-relation knowledge graph construction and application

Yang Lv, Peiyan Wang, Guiyang Ji - IOP Publishing, 2024

A knowledge graph enables the structured representation of process knowledge. Traditional knowledge graphs typically represent process fact knowledge by depicting relations between entities. However, higher-order knowledge, such as causality, coupling, and rationale among process facts, should be addressed. The Process Hyper-relational knowledge graph (PHKG) was proposed to address these shortcomings. It comprises three layers: a concept layer representing process concept knowledge, an instance layer representing process fact knowledge, and a hyper-relationship layer representing higher-order knowledge linking process facts. Employing a semi-automatic construction method, a hyper-relation knowledge graph was created with 1,602 entities, 2,509 entity relationships, and 231 pairs of hyper-relationships. A process knowledge reasoning algorithm has also been developed to enable applications to reason about process knowledge.

20. Building Massive Knowledge Graphs using an Automated ETL Pipeline

Aaron Eberhart, Wolfgang Schell, Peter Haase - ACM, 2024

Knowledge graphs are extremely versatile semantic tools, but there are current bottlenecks with expanding them to a massive scale. This concern is a focus of the Graph-Massivizer project, where solutions for scalable massive graph processing are investigated. In this paper we'll describe how to build a massive knowledge graph from existing information or external sources in a repeatable and scalable manner. We go through the process step-by-step, and discuss how the Graph-Massivizer project supports the development of large knowledge graphs and the considerations necessary for replication.

21. Overview and Analysis of Knowledge Graph Representations

22. Automated Knowledge Graph Construction with Large Language Models — Part 2

23. Improving Knowledge Representation Using Knowledge Graphs: Tools and Techniques

24. Automated Knowledge Graph Construction with Large Language Models

25. Investigating the Challenges and Prospects of Construction Models for Dynamic Knowledge Graphs
