The Blueprint: Hybrid GraphRAG Ingestion and Retrieval
The Blueprint: Hybrid GraphRAG Ingestion and Retrieval
[INGESTION][Raw Data (S3)] ──> [Lambda Orchestrator] ──> [Amazon Bedrock (Entity Extraction)]│┌────────────────────────┴────────────────────────┐▼ ▼[Amazon OpenSearch (Vectors)] [Amazon Neptune (Graph DB)]
[RETRIEVAL][User Query] ──> [API Gateway] ──> [Lambda Resolver] ──> [Query OpenSearch + Neptune]│▼ (Context Compactor)[Amazon Bedrock (LLM)] ──> [User Response]Phase 1: The Ingestion Pipeline (Building the Graph)
- The Storage Layer (Amazon S3):
- Unstructured enterprise data (PDFs, wiki pages, markdown files) is uploaded to an Amazon S3 bucket.
- An S3 Event Notification triggers an asynchronous processing pipeline.
- The Processing Layer (AWS Lambda / AWS Glue):
- An AWS Lambda function (or an AWS Glue job for massive data volumes) chunks the documents.
- Instead of just saving the raw text chunks, Lambda calls Amazon Bedrock using a lightweight, fast model to perform Named Entity Recognition (NER) and relationship mapping.
- The Hybrid Storage Layer (OpenSearch + Neptune):
- The Text Vector: The Lambda function generates text embeddings and stores the chunks in Amazon OpenSearch Service for semantic keyword similarity.
- The Entity Graph: Simultaneously, the extracted entities (e.g., Product X) and relationships (e.g., DEPENDS_ON) are written as nodes and edges into Amazon Neptune (AWS’s managed graph database).
Phase 2: The Retrieval Pipeline (The Low-Token Search)
- The Ingress Layer (API Gateway):
- The user asks a complex question (e.g., “If I upgrade System A, what downstream services are impacted?”) via Amazon API Gateway.
- The Hybrid Resolver (AWS Lambda):
- API Gateway invokes a central Lambda Resolver function.
- Instead of performing a massive keyword search that returns pages of text, the Lambda runs a hybrid query:
- It hits Amazon OpenSearch for basic semantic context.
- It queries Amazon Neptune using a graph query language (like openCypher or Gremlin) to pull the exact dependency path.
- The Context Compactor (The Token Saver):
- The Lambda function combines the small, highly targeted vector text chunk with the precise structural relationships from Neptune.
- It strips out all irrelevant conversational text, assembling a highly dense, ultra-compact context payload.
- The Generation Layer (Amazon Bedrock):
- The compacted context is sent to a frontier model in Amazon Bedrock.
- Because the context is highly refined, the model returns a perfectly accurate, hallucination-free answer while consuming up to 70% fewer tokens than a traditional vector-only setup.