Bilateral Mesh Agreements

If your network security only controls who gets in, you’ve left half the model unspecified. The platform requires both sides to consent before traffic flows — and compiles both sides into policy simultaneously.

The problem

Standard service mesh authorization is one-sided. The callee writes an AuthorizationPolicy: “service A can reach me.” Service A never declares anything. It calls whatever it wants, and the only constraint on its outbound traffic is whatever policies happen to exist on its targets.

Controlling only ingress leaves the egress half of the security model unspecified, and nobody notices because the missing half is invisible.

A compromised service A can probe every service in the mesh. Most probes get denied by target policies. Some don’t — services with broad AuthorizationPolicies (“allow from any service in this namespace”) are wide open. The attack surface isn’t every service A should be calling. It’s every service with a permissive ingress policy.

In a unilateral model, there’s no record of what A is supposed to call. You can audit what each service accepts (read its AuthorizationPolicies), but you can’t audit what each service sends without scanning every policy in the cluster for A’s identity.
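To make the audit problem concrete, here is a minimal sketch of what "scanning every policy" looks like. The `ingress_policies` mapping and the `reachable_targets` helper are hypothetical simplifications for illustration; they are not the Istio AuthorizationPolicy schema.

```python
# Simplified stand-in for ingress-only policies: callee -> set of allowed callers.
# "*" models a broad "allow from any service in this namespace" rule.
ingress_policies = {
    "payments": {"checkout"},
    "inventory": {"checkout", "frontend"},
    "reports": {"*"},
}

def reachable_targets(caller, policies):
    """Answer "what can this caller reach?" by scanning every callee's policy."""
    return sorted(
        callee
        for callee, allowed in policies.items()
        if caller in allowed or "*" in allowed
    )

print(reachable_targets("checkout", ingress_policies))
```

The result mixes the services checkout is meant to call with every service that happens to be permissive, and nothing in the data distinguishes the two.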

At scale, this means you have no dependency graph — just a scattered collection of ingress rules and a lot of optimism.

The constraint

Ingress policies alone don’t constrain egress. Adding egress policies helps, but creates a coordination problem that nobody owns until it causes an outage: the caller’s team writes egress rules, the callee’s team writes ingress rules, and nobody ensures they’re consistent.

Consider the failure mode: Team A writes an egress rule targeting Team B’s service. Team B renames the service. Team A’s rule is now stale, traffic breaks, and the staleness is discovered at 2 AM rather than at build time.

The constraint is that network security requires coordination across service boundaries — but unilateral policies are written by one side without knowledge of the other. You need a mechanism where both sides declare the relationship and the platform validates the match.

A second constraint: bilateral agreements must compile into both L4 (network-level packet filtering via CiliumNetworkPolicy) and L7 (request-level identity authorization via Istio AuthorizationPolicy) simultaneously. Writing one without the other leaves a layer gap — boringly reliable at one layer, wide open at the other.

The solution: bilateral resource declarations

In the platform’s service spec, both sides declare the relationship:

The caller declares outbound:

# This is checkout's spec — the service making the call
workload:
  resources:
    payments:            # the target service
      type: service
      direction: outbound  # "I call payments"

The callee declares inbound:

# This is payments' spec — the service receiving the call
workload:
  resources:
    checkout:            # the source service
      type: service
      direction: inbound   # "I accept calls from checkout"

Traffic flows only when both declarations exist. If checkout declares outbound to payments but payments doesn’t declare inbound from checkout, the compiler produces nothing. Default-deny blocks the traffic. The interface is clear, the feedback immediate.
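The matching rule can be sketched in a few lines. This is an illustration under assumptions, not the platform's compiler: the `specs` dictionary is a hypothetical in-memory form of the service specs, and `matched_pairs` is a name invented for this example.

```python
# Hypothetical in-memory form of the specs: service -> {resource name: direction}.
specs = {
    "checkout": {"payments": "outbound", "inventory": "outbound"},
    "payments": {"checkout": "inbound"},
    "inventory": {},  # has not declared inbound from checkout
}

def matched_pairs(specs):
    """Return (caller, callee) pairs where both sides declared the relationship."""
    pairs = set()
    for caller, resources in specs.items():
        for target, direction in resources.items():
            if direction != "outbound":
                continue
            # The callee must name the caller with direction: inbound.
            if specs.get(target, {}).get(caller) == "inbound":
                pairs.add((caller, target))
    return pairs

print(matched_pairs(specs))  # {('checkout', 'payments')}
```

The checkout-to-inventory declaration has no inbound counterpart, so it yields no pair and default-deny keeps that path closed.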

flowchart LR
  subgraph "checkout (caller)"
    A[outbound: payments]
  end
  subgraph "payments (callee)"
    B[inbound: checkout]
  end
  A -- "bilateral match" --> C{Compiler}
  B -- "bilateral match" --> C
  C --> D[CiliumNetworkPolicy\nL4 egress on checkout]
  C --> E[CiliumNetworkPolicy\nL4 ingress on payments]
  C --> F[AuthorizationPolicy\nL7 identity permit]

What the compiler produces from a matched agreement:

CiliumNetworkPolicy on checkout — L4 egress rule allowing TCP to payments’ pod selector and port.
CiliumNetworkPolicy on payments — L4 ingress rule allowing TCP from checkout’s pod selector.
Istio AuthorizationPolicy on payments — L7 permit for checkout’s SPIFFE identity (spiffe://cluster.local/ns/commerce/sa/checkout).

One short declaration per service. Three generated resources per pair. L4 and L7 opened simultaneously. Default-deny handles everything that isn’t explicitly agreed upon.
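For a sense of what those three resources might contain, here is a hedged sketch that builds them as plain dictionaries. The `compile_pair` function, the `app: <name>` label convention, the resource names, and port 8080 are all assumptions made for illustration; the platform's real output may differ. The field names follow the public CiliumNetworkPolicy and Istio AuthorizationPolicy schemas.

```python
def compile_pair(caller, callee, namespace, port):
    """Build the three resources for one matched (caller, callee) agreement."""
    caller_sel = {"matchLabels": {"app": caller}}  # assumed label convention
    callee_sel = {"matchLabels": {"app": callee}}
    to_ports = [{"ports": [{"port": str(port), "protocol": "TCP"}]}]

    l4_egress = {
        "apiVersion": "cilium.io/v2",
        "kind": "CiliumNetworkPolicy",
        "metadata": {"name": f"{caller}-to-{callee}-egress", "namespace": namespace},
        "spec": {
            "endpointSelector": caller_sel,
            "egress": [{"toEndpoints": [callee_sel], "toPorts": to_ports}],
        },
    }
    l4_ingress = {
        "apiVersion": "cilium.io/v2",
        "kind": "CiliumNetworkPolicy",
        "metadata": {"name": f"{callee}-from-{caller}-ingress", "namespace": namespace},
        "spec": {
            "endpointSelector": callee_sel,
            "ingress": [{"fromEndpoints": [caller_sel], "toPorts": to_ports}],
        },
    }
    l7_permit = {
        "apiVersion": "security.istio.io/v1",
        "kind": "AuthorizationPolicy",
        "metadata": {"name": f"{callee}-allow-{caller}", "namespace": namespace},
        "spec": {
            "selector": callee_sel,
            "action": "ALLOW",
            # Istio principals are written without the spiffe:// scheme prefix.
            "rules": [{"from": [{"source": {
                "principals": [f"cluster.local/ns/{namespace}/sa/{caller}"],
            }}]}],
        },
    }
    return [l4_egress, l4_ingress, l7_permit]

for resource in compile_pair("checkout", "payments", "commerce", 8080):
    print(resource["kind"], resource["metadata"]["name"])
```

Generating all three from one function call is the point: the L4 and L7 rules cannot drift apart because they come from the same matched pair.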

Why I chose this over unilateral policies:

The bilateral model produces an explicit, complete dependency graph as a side effect. Every outbound declaration is a dependency. Every inbound declaration is an acceptance. The graph is auditable — and that’s the real win:

“What does checkout call?” Read its outbound declarations.
“What can reach payments?” Read its inbound declarations.
“Is there a path from frontend to the database?” Walk the graph.
“What changed in the dependency graph this week?” Diff the specs.

With unilateral policies, these questions require scanning every AuthorizationPolicy and NetworkPolicy in the cluster. With bilateral agreements, you read the service specs.
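The audit questions above reduce to small graph operations once the specs are loaded. The sketch below assumes a hypothetical in-memory form of the specs (service name mapped to its declared resources); `edges`, `has_path`, and `diff_edges` are illustrative names, not platform APIs.

```python
from collections import deque

# Hypothetical in-memory specs: service -> {resource name: direction}.
specs = {
    "frontend": {"checkout": "outbound"},
    "checkout": {"frontend": "inbound", "payments": "outbound"},
    "payments": {"checkout": "inbound", "database": "outbound"},
    "database": {"payments": "inbound"},
}

def edges(specs):
    """Matched (caller, callee) edges: outbound on one side, inbound on the other."""
    return {
        (caller, target)
        for caller, resources in specs.items()
        for target, direction in resources.items()
        if direction == "outbound" and specs.get(target, {}).get(caller) == "inbound"
    }

def has_path(specs, source, destination):
    """Breadth-first walk over matched edges."""
    graph = {}
    for caller, callee in edges(specs):
        graph.setdefault(caller, []).append(callee)
    seen, queue = {source}, deque([source])
    while queue:
        node = queue.popleft()
        if node == destination:
            return True
        for nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False

def diff_edges(old_specs, new_specs):
    """Edges added and removed between two versions of the specs."""
    old, new = edges(old_specs), edges(new_specs)
    return {"added": sorted(new - old), "removed": sorted(old - new)}

print(has_path(specs, "frontend", "database"))  # True

# payments drops its outbound declaration to database.
new_specs = {**specs, "payments": {"checkout": "inbound"}}
print(diff_edges(specs, new_specs))
```

After the change, the diff reports the payments-to-database edge as removed and the path from frontend to the database no longer exists.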

The cross-service compilation requirement: The compiler must see all services to match bilateral agreements. A change to payments’ spec (adding or removing an inbound declaration) triggers recompilation of checkout’s network policies. This is graph-based reconciliation: the compiler watches every service resource in the cluster, not just the one being reconciled.
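One way to decide what to recompile is to collect every peer that the changed service names, or that names it, before and after the edit. This is a sketch under assumptions: `affected_services` and the spec dictionaries are hypothetical, and the set it returns is a deliberate over-approximation rather than the platform's actual reconciliation logic.

```python
def affected_services(changed, old_specs, new_specs):
    """Services whose generated policies may change when `changed` is edited."""
    affected = {changed}
    for specs in (old_specs, new_specs):
        # Peers the changed service names, before or after the edit.
        affected.update(specs.get(changed, {}))
        # Peers that name the changed service.
        affected.update(
            name for name, resources in specs.items() if changed in resources
        )
    return sorted(affected)

old_specs = {
    "checkout": {"payments": "outbound"},
    "payments": {"checkout": "inbound"},
    "reports": {},
}
# payments removes its inbound declaration for checkout.
new_specs = {**old_specs, "payments": {}}

print(affected_services("payments", old_specs, new_specs))  # ['checkout', 'payments']
```

Looking at both the old and new specs matters: a removed declaration only appears in the old version, and it is exactly the case where the peer's policies must be torn down.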

The honest friction. Bilateral agreements slow things down. Team A deploys a new service that calls Team B’s API. In a unilateral model, Team A writes their egress rule and ships. In a bilateral model, Team A ships — and nothing works until Team B adds the inbound declaration. That’s a cross-team dependency at deploy time. It means a Slack message, a PR review, maybe a 2-hour wait if Team B is in a different timezone.

This is annoying. I know. I’ve felt it.

But here’s the thing: in any environment with real compliance requirements — SOC 2, NIST 800-53, PCI-DSS — that cross-team coordination was going to happen anyway. It just happens through a security review ticket instead of a YAML declaration. The bilateral model makes the coordination explicit, auditable, and fast (a one-line spec change vs. a Jira ticket). The friction isn’t new — it’s moved from an opaque process to a visible one.

For teams that genuinely can’t tolerate the coordination overhead (early-stage development, rapid prototyping, services that accept traffic from many callers), the callee’s LatticeMeshMember resource can list broad allowedCallers entries to approximate unilateral authorization. This is a deliberate escape hatch: it stays visible in the dependency graph, but it loosens the bilateral model’s auditability guarantees for that service. I use it for infrastructure services like DNS and monitoring. The LatticeMeshMember CRD also brings non-compiled workloads (Helm-installed software) into the bilateral model.

Key takeaways

Unilateral policies leave egress uncontrolled. If only the callee declares authorization, you have no record of what each service is supposed to call — and no way to audit it without scanning every policy in the cluster.
Bilateral agreements require both sides to consent. Traffic flows only when both sides declare the relationship, producing an explicit dependency graph as a side effect.
The compiler turns one short declaration per side into 3 enforced resources. L4 egress, L4 ingress, and L7 identity permit: all generated, all consistent, all auditable.