RBAC at Node Level: Surgical Data Access

In a standard file-based RAG system, security is "All or Nothing." If you have access to a PDF, you can see everything in it. But in a Knowledge Graph, knowledge is a network. A specific user might be allowed to see (Project Alpha), but they should NOT be allowed to see the (Salary) node or the (Budget) node connected to it. This requires Node-Level Security.

In this lesson, we will explore the implementation of Role-Based Access Control (RBAC) inside the graph engine. We will learn how to use Labels and Properties to tag data with security levels, and how to write Cypher queries that automatically filter out unauthorized content before it ever reaches the LLM.

1. Tagging the Knowledge: Security Labels

The most common way to implement RBAC is to use Node Labels as security tags.

CREATE (p:Person:SECRET {name: 'Sudeep'})
CREATE (p2:Person:PUBLIC {name: 'Jane'})

The Workflow: When the AI agent queries for "All people," the backend identifies the user's role.

If User = Junior: The system adds the predicate NOT n:SECRET to the WHERE clause of every query.
If User = Director: The system allows all labels.
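
The workflow above can be sketched in Python. This is a minimal sketch under assumptions: the role names, the ROLE_HIDDEN_LABELS mapping, and the helpers label_filter and people_query are hypothetical, and the query is assumed to bind its nodes to the variable n. Labels come from a fixed mapping, never from user input, so building the predicate as a string is acceptable here.

```python
# Hypothetical mapping from role to the labels that role must not see.
ROLE_HIDDEN_LABELS = {
    "junior": ["SECRET"],
    "director": [],
}


def label_filter(role, var="n"):
    """Return a Cypher predicate that excludes the labels hidden from `role`."""
    if role not in ROLE_HIDDEN_LABELS:
        # Fail closed: an unknown role matches nothing.
        return "false"
    hidden = ROLE_HIDDEN_LABELS[role]
    if not hidden:
        # No restrictions for this role.
        return "true"
    return " AND ".join(f"NOT {var}:{label}" for label in hidden)


def people_query(role):
    """Build the 'All people' query for the given role."""
    return f"MATCH (n:Person) WHERE {label_filter(role)} RETURN n.name"


print(people_query("junior"))
print(people_query("director"))
```

Returning false for an unknown role is a deliberate choice in this sketch: a misconfigured role sees nothing rather than everything.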

2. Relationship Security: Hidden Bridges

Security isn't just about the "Noun" (Node); it's about the "Verb" (Relationship).

The Case: Everyone knows Sudeep exists. Everyone knows the CEO exists. But the relationship (Sudeep)-[:REPORTS_TO]->(CEO) might be a sensitive secret.

In Neo4j Enterprise, fine-grained access control lets you deny a role the privilege to traverse specific relationship types (for example, a DENY TRAVERSE rule on REPORTS_TO for a junior role). To that role the relationship simply does not exist: if the LLM tries to follow it, the database returns no results, so the AI never learns that a connection exists.
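
To make the effect concrete, here is an in-memory illustration of a hidden relationship type. It is an application-level analogue, not Neo4j's mechanism: the edge list, the WORKS_ON relationship, the role names, and the helpers visible_edges and neighbors are all hypothetical.

```python
# Edges as (source, relationship_type, target) triples. Sample data only.
EDGES = [
    ("Sudeep", "REPORTS_TO", "CEO"),
    ("Sudeep", "WORKS_ON", "Project Alpha"),
]

# Hypothetical mapping from role to relationship types hidden from that role.
HIDDEN_REL_TYPES = {
    "junior": {"REPORTS_TO"},
    "director": set(),
}


def visible_edges(role, edges=EDGES):
    """Return only the edges whose type the role is allowed to traverse."""
    hidden = HIDDEN_REL_TYPES.get(role)
    if hidden is None:
        # Fail closed: an unknown role sees no relationships at all.
        return []
    return [edge for edge in edges if edge[1] not in hidden]


def neighbors(role, node):
    """Follow outgoing edges from `node`, as seen by `role`."""
    return [dst for src, _, dst in visible_edges(role) if src == node]


print(neighbors("junior", "Sudeep"))
print(neighbors("director", "Sudeep"))
```

Both roles can see the Sudeep and CEO nodes; only the bridge between them differs, and the junior role gets no signal that an edge was withheld.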

3. Dynamic Query Anchoring

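Instead of hard-coded labels, you can use a Security Property and compare it with the user's clearance, passed in as a query parameter. Both endpoints of the pattern must pass the check; otherwise a visible node can pull a restricted neighbor into the result.

MATCH (n)-[r]-(m)
WHERE n.visibility_level <= $user_clearance
  AND m.visibility_level <= $user_clearance
RETURN n, r, m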

This allows for a hierarchical security model:

Level 1: Public
Level 2: Internal
Level 3: Confidential
Level 4: Secret

The AI's "Context Subgraph" is dynamically pruned at the database level, ensuring that the final answer is Secure by Design.
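
The pruning step can be illustrated in plain Python. This is a sketch on in-memory data, not a database call: the Clearance enum mirrors the four levels above, while the levels assigned to the Budget and Salary nodes and the helper prune are assumptions made for the example.

```python
from enum import IntEnum


class Clearance(IntEnum):
    PUBLIC = 1
    INTERNAL = 2
    CONFIDENTIAL = 3
    SECRET = 4


# Hypothetical sample graph: node name -> visibility level.
NODES = {
    "Project Alpha": Clearance.PUBLIC,
    "Budget": Clearance.CONFIDENTIAL,
    "Salary": Clearance.SECRET,
}
EDGES = [("Project Alpha", "Budget"), ("Project Alpha", "Salary")]


def prune(nodes, edges, user_clearance):
    """Keep nodes at or below the clearance, and edges between kept nodes."""
    visible = {name for name, level in nodes.items() if level <= user_clearance}
    kept_edges = [(a, b) for a, b in edges if a in visible and b in visible]
    return visible, kept_edges


print(prune(NODES, EDGES, Clearance.INTERNAL))
print(prune(NODES, EDGES, Clearance.SECRET))
```

An edge survives only when both of its endpoints survive, which is the same rule the Cypher filter has to enforce.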

graph TD
    User((User: Junior)) -->|Query| API[RAG API]
    API -->|Clearance: 1| DB[(Graph DB)]
    DB -->|Filter n.level <= 1| Result[Public Facts]
    Result --> LLM[LLM Synth]
    
    subgraph "The Security Wall"
    Secret((Level 4: Secret)) -.->|BLOCKED| Result
    end
    
    style Secret fill:#f44336,color:#fff
    style Result fill:#34A853,color:#fff

4. Implementation: A Cypher Security Wrapper

Let's look at how we wrap a user's question with security constraints in Python.

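import re

def run_secure_query(user_id, raw_cypher):
    clearance = get_user_clearance(user_id)  # Returns 1 (Public) to 4 (Secret)

    # Inject the security filter into the first WHERE clause only.
    # The clearance travels as a query parameter, not as interpolated text.
    secure_cypher, replaced = re.subn(
        r"\bWHERE\b",
        "WHERE n.visibility_level <= $clearance AND",
        raw_cypher,
        count=1,
        flags=re.IGNORECASE,
    )
    if not replaced:
        # Fail closed: without a WHERE clause there is nowhere to put the filter
        raise ValueError("Query has no WHERE clause; refusing to run it unfiltered")

    return graph.run(secure_cypher, {"clearance": clearance})

# If the LLM generated 'MATCH (n:Person) WHERE n.name = "Sudeep" ...'
# Our wrapper turns it into '... WHERE n.visibility_level <= $clearance AND n.name = "Sudeep" ...'
# Caveat: this assumes the node variable is `n`, and an OR in the original condition can bypass the filter.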
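
String-level injection has a pitfall worth seeing in code: because AND binds tighter than OR, a generated condition such as A OR B would let rows matching B skip the clearance check. The sketch below parenthesizes the original condition and also handles queries with no WHERE clause. It is a hedged illustration, not a hardened rewriter: inject_clearance_filter is a hypothetical helper, the property name is a parameter because it depends on your schema, and the function only understands a single MATCH ... [WHERE ...] RETURN ... shape (keywords inside string literals or multi-clause queries would need a real Cypher parser).

```python
import re

_RETURN = re.compile(r"\bRETURN\b", re.IGNORECASE)
_WHERE = re.compile(r"\bWHERE\b", re.IGNORECASE)


def inject_clearance_filter(cypher, var="n", prop="visibility_level"):
    """Guard a simple MATCH ... [WHERE ...] RETURN ... query with a clearance check."""
    ret = _RETURN.search(cypher)
    if ret is None:
        raise ValueError("expected a RETURN clause")

    # Split the query into everything before RETURN and the RETURN clause itself.
    head, tail = cypher[:ret.start()], cypher[ret.start():]
    guard = f"{var}.{prop} <= $clearance"

    where = _WHERE.search(head)
    if where is None:
        # No existing condition: the guard becomes the whole WHERE clause.
        return f"{head.rstrip()} WHERE {guard} {tail}"

    # Parenthesize the original condition so an OR cannot escape the guard.
    condition = head[where.end():].strip()
    return f"{head[:where.start()]}WHERE {guard} AND ({condition}) {tail}"


print(inject_clearance_filter(
    'MATCH (n:Person) WHERE n.name = "Sudeep" OR n.name = "Jane" RETURN n'
))
print(inject_clearance_filter("MATCH (n:Person) RETURN n"))
```

The rewritten query still expects the clearance to be supplied as the $clearance parameter at execution time, so the value never becomes part of the query text.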

5. Summary and Exercises

Node-level RBAC is the "Surgical Shield" of your Knowledge Graph.

Labels provide broad categorization of security.
Properties allow for fine-grained, hierarchical clearance.
Relationship Security prevents the leakage of sensitive connectivity.
Query Anchoring is the programmatic way to enforce these rules in a RAG pipeline.

Exercises

Security Design: You are building a bot for a "Hospital." List 3 node labels that should be SECRET and 3 that should be PUBLIC.
The "Leak" Logic: If an AI agent has access to "Public" summaries of "Confidential" reports, is the security still intact? (Hint: Think about indirect inference).
Visualization: Draw a graph with 3 nodes. A (Public) -> B (Secret) -> C (Public). If a Public user asks for the path between A and C, what should the result be?

In the next lesson, we will look at the accountability side: "Traversal Auditing: Who Saw What?"