> Dear LISP WG,
> 
> We have proposed a new draft titled "Using LISP as a Network Substrate for AI 
> Agent Communication" (draft-wang-lisp-ai-agent-00). This draft explores the 
> application of the LISP in supporting the emerging networking requirements of 
> AI agent communication. 

Thanks for the draft Wei and Chongfeng. See my comments inline. Your draft text 
comes first and is indented and my comments follow.

> We think LISP is the ideal substrate for AI agents because:
>     • LISP separates the stable EID (Agent Identity) from dynamic RLOCs 
> (Attachment Points), ensuring persistent sessions despite agent mobility.

Yes, agree.

>     • The Instance ID can be used to create isolated namespaces for distinct 
> Agent Groups, enforcing security and policy boundaries.

And supports multi-tenancy. I think you could add more text about multi-tenancy 
in the draft. See my comments and suggestions below.

>     • The mapping system can be extended to store Context Attributes (e.g., 
> hardware capabilities, latency), enabling policy-based RLOC selection for 
> intelligent traffic engineering.

Yes, and the LCAF formats we have now can be used to support it. The only new
extension is how to match what is requested in a Map-Request against what is
returned by the mapping system. More about this below.

> We would be grateful if you could review and comment on the content of this 
> document.

Here you go:

> The goal is not to redefine LSP, but to illustrate how it can be
> leveraged and slightly enhanced to serve as a foundational layer for
> next-generation intelligent systems.

Typo. Change "LSP" to "LISP".

>    The following terms are used in this draft:

Maybe for this draft, these definitions should add some AI-Agent context aware 
terminology. I made some suggestions below.

> * Endpoint Identifier (EID) [RFC9299]: Addresses assigned
> topologically to network attachment points. Typically routed
> inter-domain.

"An AI-agent system will be assigned one or more EID addresses".

> * xTR [RFC9299]: A router that implements both ITR and ETR
> functionalities.

"An xTR can be co-located with an AI-Agent EID or be part of a LISP site where 
AI-Agents are assigned EID addresses. That is, an EID and an RLOC-set can be on 
the mobile agent or the mobile agent can move to new RLOC xTRs".

Also note, the xTR can be multi-homed so when underlay performance changes, the
xTR can select better paths to other Agents.

> * Map-Server [RFC9301]: A network infrastructure component that
> learns of EID-Prefix mapping entries from an ETR, via the
> registration mechanism described below, or some other
> authoritative source if one exists. A Map-Server publishes these
> EID-Prefixes in a mapping database.

Well, it's the xTRs that publish the mappings. The map-server holds the mappings
and returns the key/value pairs to requesters.

> * Map-Resolver [RFC9301]: A network infrastructure component that
> accepts LISP Encapsulated Map-Requests, typically from an ITR, and
> determines whether or not the destination IP address is part of
> the EID namespace; if it is not, a Negative Map-Reply is returned.
> Otherwise, the Map-Resolver finds the appropriate EID-to-RLOC
> mapping by consulting a mapping database system.

And note, this can be used for (1) Agent to agent packet delivery (like any 
other IP application), but can also be used for (2) Agent discovery, and (3) 
Agent capability inventory.

> * Instance ID (IID) [RFC9299]: A 24-bit identifier used to create
> isolated LISP namespaces.

Be careful how you use the terms to refer to this. I have seen namespaces and
"groups". Instance-IDs are used for multi-tenancy, so mapping information stays
separate and doesn't co-mingle with other instances. So used by a "group of
EIDs" is a correct statement, but you don't want to convey it's part of a
multicast group (more on multicast later in the comments).

We use the term "VPN" and if you want to refer to it as a "VPN group" it would 
be more clear.

> * AI agent: A software entity capable of perception, decision-
> making, and action, often operating autonomously or in
> coordination with other AI agents.

And indicate again an AI agent can be found using an EID or a 
Distinguished-Name (among other ways) but using different LCAF encoding for 
EIDs. More later on this.

> * Agent Group: A logical group of AI agents sharing a common task,
> security policy, or administrative boundary. Each domain MAY be
> mapped to a unique LISP Instance ID.

Change to "Agent VPN Group". And I would say "sharing a common privacy level". 
For example, if you wanted to use EID anycast, to find the "closest agent" 
within a group, an instance-ID could be used for this.

> 3.2. Logical isolation of Agent Groups
> 
> Even when multiple Agent Groups operate on the same physical or
> virtual network infrastructure, they must be isolated from one
> another to prevent interference and ensure that their respective
> security policies are strictly enforced.


And note an agent group can be used across the same or different underlay with 
one or more mapping systems. There are tradeoffs and advantages for cut-slicing 
this.

> 3.3. Context-aware routing
> 
> The network should dynamically select the most appropriate
> transmission path based on the communication intent of AI agents,
> such as their requirements for latency or security.

And therefore, be multi-homed (with many wireless interfaces) with good path 
diversity through network providers.

>    Each AI agent is assigned a stable EID. This EID serves as its
> permanent network identity, independent of where it executes. The
> EID is only routed within the AI agent’s local site; global
> reachability is achieved via LISP encapsulation.

"Stable" here means it does not change. Assigned once and used forever, very 
much like a UUID is assigned to a system. And note it can, but is not required
to, have structure (depending on whether you want to do EID-prefix aggregation
or simply use /32 or /128 EIDs for more flexible roaming). You can also make the
EID an auto-generated random number that doesn't change. And the discovery of
the EID is not based on the bits in the address but on other mapping records in
the mapping system (like distinguished-name, JSON encoding with capabilities
and features, geo-location, traffic-engineering requirements, etc).

> 4.2. Attachment points as RLOCs
> 
> When an AI agent runs on a host connected to the network, the local
> xTR registers the AI agent’s EID along with one or more RLOCs.
> Multiple RLOCs enable multi-homing, with each RLOC annotated with
> capabilities.

So more could be said here. If the mobile Agent is assigned a stable EID, its 
current RLOC is assigned by the underlay provider. In this case, the EID and 
RLOC(s) are with the agent system and when it roams, the EID stays the same but 
a new RLOC is assigned to the agent by the new location. This is where xTR 
resides WITH the agent.

And yes, the case you supply is true as well. But just be specific that an xTR
that is another system is a non-roamable device (the xTR is bolted into a rack)
and what moves is the agent "to the" xTR. In this latter case, if the roaming
domain is within the EID-prefix being registered by the xTR, then no hole
punching is required in the mapping system.

Here is an example, for both cases. If the xTR is co-located with the agent and 
the agent is assigned EID 240.1.1.1 with an assigned RLOC address by provider 
as (1.1.1.1, 2.2.2.2) then when it roams, the new mapping registered to the 
mapping system is 240.1.1.1 -> (3.3.3.3, 4.4.4.4). What the mapping system 
holds is a 240.1.1.1/32 prefix (you can call it a host prefix). 

Now an example for the roaming domain. You have an agent assigned EID 240.1.1.1
and the xTRs in that LISP site are registering 240.1.0.0/16 to represent all
EIDs that can be reached via these xTRs. If the EID moves somewhere else behind
the xTRs, there is no need to notify the mapping system because the RLOCs are
still used and the mobility is handled as part of underlay routing IN THE LISP
site. If the EID wants to roam out of this roaming area to another set of xTRs
that are registering 240.2.0.0/16, then the agent would need to register
240.1.1.1/32 with the xTRs' RLOCs (or the xTRs discover the 240.1.1.1 EID in
their domain and register the host mapping).

Both these scenarios have been implemented and deployed for various use-cases.
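
To make the two cases concrete, here is a minimal Python sketch of the
longest-prefix-match behavior described above. It is only an illustration and
is not taken from any implementation: the MappingSystem class and the site RLOCs
10.1.0.1 and 10.2.0.1 are made-up names and addresses, while the EIDs and the
co-located RLOCs come from the examples.

```python
import ipaddress


class MappingSystem:
    """Toy EID-to-RLOC database (hypothetical, for illustration only)."""

    def __init__(self):
        self.db = {}  # ip_network -> list of RLOC address strings

    def register(self, eid_prefix, rlocs):
        # A new registration for the same prefix replaces the old RLOC-set.
        self.db[ipaddress.ip_network(eid_prefix)] = list(rlocs)

    def lookup(self, eid):
        # Longest-prefix match: a /32 host entry wins over a covering /16.
        addr = ipaddress.ip_address(eid)
        matches = [prefix for prefix in self.db if addr in prefix]
        if not matches:
            return None
        best = max(matches, key=lambda prefix: prefix.prefixlen)
        return str(best), self.db[best]


# Case 1: xTR co-located with the agent, host prefix re-registered on roam.
colocated = MappingSystem()
colocated.register("240.1.1.1/32", ["1.1.1.1", "2.2.2.2"])
colocated.register("240.1.1.1/32", ["3.3.3.3", "4.4.4.4"])  # after roaming

# Case 2: site xTRs register covering prefixes (site RLOCs are assumptions).
sites = MappingSystem()
sites.register("240.1.0.0/16", ["10.1.0.1"])
sites.register("240.2.0.0/16", ["10.2.0.1"])
# Moving inside the first site changes nothing in the mapping system.
# Roaming to the second site needs a more-specific host entry.
sites.register("240.1.1.1/32", ["10.2.0.1"])
```

In the second case every other EID in 240.1.0.0/16 still resolves through the
/16 entry; only the roamed agent is covered by the host mapping.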

> 4.3. Instance ID for Agent Groups
> 
> LISP Instance IDs [RFC9299] allow multiple virtual networks over the
> same physical infrastructure. Each agent group is assigned a unique
> IID. Packets are encapsulated with the IID in the LISP header,
> ensuring isolation between different agent groups even if EIDs
> overlap. IID enables scalable, secure multi-tenancy for
> heterogeneous workloads.

If you choose to use the same EID addressing scheme based on the types of agents
and how they cooperate as agent groups, you can reuse these addresses in
different instance-IDs. And you can use the same mapping system. So if you had
a GPU cluster with perhaps 4 agents, you could assign them each as MoEs with
240.1.1.1, 240.2.2.2, 240.3.3.3, and 240.4.4.4 for instance-ID 1. And then a
completely different cluster could use the same MoE scheme with the same EIDs
in instance-ID 2.

Just to show an example.
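
A rough Python sketch of the same idea, assuming the mapping database is keyed
on the (instance-ID, EID) pair. The function names and the 10.x RLOC addresses
are invented for this example; only the four EIDs and the two instance-IDs come
from the text above.

```python
MAX_IID = (1 << 24) - 1  # the Instance ID is a 24-bit value


def register(db, iid, eid, rlocs):
    """Store a mapping under the (instance-ID, EID) key."""
    if not 0 <= iid <= MAX_IID:
        raise ValueError("instance-ID must fit in 24 bits")
    db[(iid, eid)] = list(rlocs)


def lookup(db, iid, eid):
    """Return the RLOC-set for an EID inside one instance-ID, or None."""
    return db.get((iid, eid))


mapping_db = {}
moe_eids = ["240.1.1.1", "240.2.2.2", "240.3.3.3", "240.4.4.4"]
for n, eid in enumerate(moe_eids, start=1):
    register(mapping_db, 1, eid, ["10.1.0.%d" % n])  # cluster in instance-ID 1
    register(mapping_db, 2, eid, ["10.2.0.%d" % n])  # cluster in instance-ID 2
```

The same EID resolves to different RLOCs depending on the instance-ID, and a
lookup in an instance-ID with no registrations finds nothing.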

If you are inclined to take my suggestion, put another box at the top where you 
can show the agent and xTR co-located to indicate both cases can be achieved. 
Granted, co-location makes mobility so much simpler to deploy and has less
dependence on underlay infra.

> 5.2. Data Flow Example
> 
> Consider Agent A (EID_A) sending a message to Agent B (EID_B):
> 
> 1. Agent A sends a standard IP packet to EID_B.
> 
> 2. The local xTR (acting as ITR) intercepts the packet.
> 
> 3. ITR queries the mapping system via a Map-Resolver for EID_B.
> 
> 4. The mapping system returns a Map-Reply containing one or more
> RLOCs for EID_B, possibly filtered by context.
> 
> 5. ITR encapsulates the original packet in a LISP header (with
> optional IID) and forwards it to the selected RLOC_B.
> 
> 6. The destination xTR (ETR) decapsulates and delivers the packet to
> Agent B.
> 
> If Agent B migrates to a new host, it registers its EID with a new
> RLOC. Subsequent Map-Requests return the updated mapping, and
> communication resumes transparently.

Please indicate this is the normal data-flow described in RFC9300 and the
mobility movement described in both draft-ietf-lisp-mn and
draft-ietf-lisp-eid-mobility (the former is the co-location case and the latter
the more general movement of an EID from LISP site to LISP site).
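
The six steps in the quoted data flow can be sketched in Python as follows.
This is a simplified model under stated assumptions: the Packet, Encapsulated
and ITR names are hypothetical, the mapping lookup is a plain synchronous
callback standing in for the Map-Request/Map-Reply exchange, and the first
returned RLOC is always chosen.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Packet:
    src_eid: str
    dst_eid: str
    payload: bytes


@dataclass
class Encapsulated:
    outer_dst_rloc: str  # where the underlay delivers the packet
    iid: int             # instance-ID carried in the LISP header
    inner: Packet        # the original, unmodified packet


class ITR:
    """Toy ITR: map-cache lookup, resolve on a miss, then encapsulate."""

    def __init__(self, resolve: Callable[[str], Optional[List[str]]], iid: int = 0):
        self.resolve = resolve
        self.iid = iid
        self.map_cache: Dict[str, List[str]] = {}

    def forward(self, pkt: Packet) -> Optional[Encapsulated]:
        rlocs = self.map_cache.get(pkt.dst_eid)
        if rlocs is None:
            rlocs = self.resolve(pkt.dst_eid)  # steps 3 and 4
            if not rlocs:
                return None                    # no mapping: nothing to send to
            self.map_cache[pkt.dst_eid] = rlocs
        return Encapsulated(rlocs[0], self.iid, pkt)  # step 5


def etr_decapsulate(enc: Encapsulated) -> Packet:
    """Step 6: strip the outer header and hand the packet to the agent."""
    return enc.inner
```

A second packet to the same EID is served from the map-cache, which is why
mobility needs one of the map-cache update mechanisms to refresh the ITR.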

> 6. New requirements to LISP
> 
> To effectively support the requirements of AI agent systems outlined
> in Section 3, the LISP architecture requires specific enhancements.
> These enhancements focus on extending the mapping database to carry
> richer context information and enabling the data plane to make
> routing decisions based on agent-specific semantics.

LISP can carry quite a bit of key/value formats so it would be nice to know
what you need. So what you store shouldn't be an issue but what you look up and
how you want the information returned (your next section) is what could be
new and interesting. Like various matching algorithms and multi-stage lookups
which we have tried before.

> The LISP Mapping System MUST be extended to support the storage and
> retrieval of Agent Context Attributes alongside the standard EID-to-
> RLOC mappings. These attributes are used by Ingress Tunnel Routers
> (ITRs) to select the optimal RLOC based on the specific needs of the
> AI agent communication.

Well storing an entry with an RLOC in JSON format gives you quite a bit of 
flexibility and the agent could decide the format and data model. So you have 
that there already.

> The following attributes SHOULD be supported as optional fields in
> the Map-Reply message or the EID-to-RLOC record:
> 
> * Processing Latency (Latency_SLA): A metric indicating the
> computational latency of the host where the AI agent resides
> (e.g., "Low", "Medium", "High"). This allows routing decisions
> based on real-time performance requirements.

I already support this in my lispers.net <http://lispers.net/> implementation 
but using JSON RLOCs. So it can be done.

> * Hardware Capability Tags: Indicators of available hardware
> resources. This enables affinity-based routing where an AI agent
> can specifically request a host with certain hardware.
> 
> * AI agent State: Information regarding the current operational
> state of the AI agent. This prevents packets from being sent to
> AI agents that are in an invalid state.

Can all be described with the JSON LCAF.
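
As an illustration of what such a JSON-encoded RLOC could carry, here is a
short Python sketch. All key names ("latency-sla", "hardware", "agent-state")
are assumptions made up for this example; they are not the format used by any
existing implementation or defined by the draft. The agent would be free to
pick its own data model.

```python
import json


def make_json_rloc(rloc, latency_sla, hardware_tags, agent_state):
    """Build a JSON string that could ride in a JSON-encoded RLOC record.

    Every key name here is hypothetical and chosen only for illustration.
    """
    record = {
        "rloc": rloc,
        "latency-sla": latency_sla,
        "hardware": sorted(hardware_tags),
        "agent-state": agent_state,
    }
    return json.dumps(record, sort_keys=True)


def parse_json_rloc(text):
    """Decode the JSON string back into a dictionary on the ITR side."""
    return json.loads(text)


encoded = make_json_rloc("3.3.3.3", "Low", ["npu", "gpu"], "ready")
decoded = parse_json_rloc(encoded)
```

Because the attributes are ordinary JSON, new fields can be added by the agents
without changing how the mapping system stores or returns the record.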

> 6.2. Policy-Based RLOC Selection
> 
> The ETR registration process MUST be augmented to allow AI agents or
> their hosting environments to dynamically advertise their context
> attributes to the Map-Server. The registration mechanism SHOULD
> support:
> 
> * Dynamic Metadata Update: The ability for an ETR to update the
> context attributes (e.g., load, latency) of an EID registration
> without de-registering and re-registering the EID prefix, ensuring
> minimal disruption during state changes.

This is already supported. You don't need any new packet formats. And note if 
you want to convey the latency information to an ITR from an ETR, you can put 
the JSON format in a RLOC-probe reply, so you may not need the mapping system 
store path metrics. In fact, the path metrics will be different based on the 
source, so you want it computed from source-EID to dest-EID.

> * Context-Aware Filtering: The Map-Server and Map-Resolver MUST
> support filtering mechanisms. When an ITR sends a Map-Request, it
> MAY include desired context attributes (e.g., "I need a GPU").
> The mapping system SHOULD return only those RLOCs that match the
> requested attributes.

A map-server implementation with proxy-replying can be enhanced to do this with 
policy information in the implementation and not the protocol.
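
A minimal sketch of what such implementation-side filtering might look like,
written in Python. The function name, the record layout and the attribute keys
are all hypothetical; the point is only that the match runs as local policy in
the proxy-replying map-server before it builds the reply.

```python
def proxy_reply_filter(registered_rlocs, constraints):
    """Return the RLOCs whose attributes satisfy every requested constraint.

    registered_rlocs: list of {"rloc": str, "attrs": dict} (assumed layout).
    constraints: dict of attribute name -> wanted value.
    """

    def satisfies(attrs):
        for key, wanted in constraints.items():
            have = attrs.get(key)
            if isinstance(have, list):
                # List-valued attribute: the wanted value must be a member.
                if wanted not in have:
                    return False
            elif have != wanted:
                return False
        return True

    return [r["rloc"] for r in registered_rlocs if satisfies(r.get("attrs", {}))]


registered = [
    {"rloc": "1.1.1.1", "attrs": {"hardware": ["gpu"], "latency-sla": "Low"}},
    {"rloc": "2.2.2.2", "attrs": {"hardware": ["cpu"], "latency-sla": "Low"}},
    {"rloc": "3.3.3.3", "attrs": {"hardware": ["gpu", "npu"], "latency-sla": "High"}},
]
```

An empty constraint set returns every registered RLOC, so a requester that asks
for nothing special sees the usual reply.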

> 6.3. Enhanced Map-Request/Map-Reply Semantics
> 
> To facilitate context-aware routing, the LISP control plane messages
> require the following modifications:
> 
> * Extended Map-Request: ITRs MUST be able to include "Context
> Constraints" in the Map-Request message. These constraints
> specify the requirements of the source AI agent for the
> destination (e.g., minimum security level, required hardware).

This can be done by an implementation so no protocol changes are required. If 
you think so, give more details.

> * Prioritized RLOC List: The Map-Reply message MUST support
> returning a prioritized list of RLOCs based on the context match
> score, rather than just topology. The priority field in the RLOC
> record SHOULD be interpreted as a combination of network topology
> and agent-specific suitability.

This is already supported in the protocol with a priority and weight per RLOC
record.
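
For reference, a Python sketch of how an ITR might use the existing fields:
lower priority values are preferred, priority 255 means the RLOC is not used
for unicast forwarding, and weights split load among RLOCs of equal priority.
The select_rloc name, the record layout and the CRC-based flow hash are
assumptions for this example, not a description of any implementation.

```python
import zlib


def select_rloc(records, flow_key):
    """Pick an RLOC for a flow using priority first, then weight.

    records: list of {"rloc": str, "priority": int, "weight": int}.
    flow_key: any string identifying the flow (hashed for stable choice).
    """
    usable = [r for r in records if r["priority"] != 255]
    if not usable:
        return None
    best = min(r["priority"] for r in usable)
    candidates = [r for r in usable if r["priority"] == best]
    flow_hash = zlib.crc32(flow_key.encode())
    total = sum(r["weight"] for r in candidates)
    if total == 0:
        # No weights given: this sketch simply splits flows evenly.
        return candidates[flow_hash % len(candidates)]["rloc"]
    # Map the hash onto the cumulative weight ranges of the candidates.
    point = flow_hash % total
    for r in candidates:
        if point < r["weight"]:
            return r["rloc"]
        point -= r["weight"]
    return candidates[-1]["rloc"]  # not reached when weights are non-negative
```

A mapping system that wants to express agent suitability can do so by setting
these values when it builds the reply, and the ITR logic stays unchanged.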

> 6.4. Support for Agent Group Mobility
> 
> To support seamless mobility, the LISP architecture MUST ensure fast
> convergence during EID re-registration:
> 
> * Incremental Updates: The mapping database system SHOULD support
> incremental updates to minimize latency when an AI agent migrates
> and updates its RLOC registration.

We have many approaches to this, see the drafts on:

(1) Small TTLs.
(2) SMRs.
(3) PubSub.
(4) Predictive RLOCs.

They each have their tradeoffs, which are packet loss vs hand off time. The
more modern approaches in use are (3) and (4).
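
To illustrate the tradeoff for approach (1) only, here is a small Python sketch
of a map-cache with a TTL. The MapCache class and the injectable clock are
hypothetical helpers for this example. With this approach alone, an ITR can
keep using a stale RLOC until the entry expires, so a smaller TTL shortens that
window at the cost of more Map-Requests.

```python
import time


class MapCache:
    """Toy map-cache whose entries expire after a fixed TTL."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock   # injectable so the behavior can be tested
        self.entries = {}    # eid -> (rlocs, expiry time)

    def store(self, eid, rlocs):
        self.entries[eid] = (list(rlocs), self.clock() + self.ttl)

    def get(self, eid):
        entry = self.entries.get(eid)
        if entry is None:
            return None
        rlocs, expires = entry
        if self.clock() >= expires:
            # Expired: the caller has to send a new Map-Request.
            del self.entries[eid]
            return None
        return rlocs
```

Until the entry times out, get() keeps returning the old RLOC-set even if the
agent has already registered a new one elsewhere.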

> 7. Security Considerations
> 
> LISP inherits security considerations from [RFC9300]. Additional
> aspects for AI agent scenarios include:
> 
> * EID Spoofing: An attacker could impersonate an AI agent by using
> its EID.

You use Map-Register signatures. See draft-ietf-lisp-ecdsa. And an
implementation can encrypt Map-Requests and Map-Registers that flow to and from
the mapping system.

> * Mapping System Abuse: Malicious Map-Requests could overload the
> system. Rate limiting and source validation are RECOMMENDED.

This is solved with the same honey pot mechanisms used for DNS DoS attacks.

> Logical isolation via Instance IDs provides strong tenant separation,
> reducing cross-domain attack surface.

Agree.

Thanks for the draft. I look forward to your responses.

Dino