[Pce] A review of draft-ietf-pce-state-sync-07

Adrian Farrel Fri, 28 Jun 2024 13:00:41 -0700

Hi,

As this document approaches being ready for working group last call, I
thought it might be helpful if I did a review.


Cheers,
Adrian

===

The document title could do with some clean-up.
- Remove the full stop
- Perhaps make it more straightforward. For example, Procedures for
Communication between Stateful Path Computation Elements

---

Abstract para 2

s/an LSP/LSP/

---

Abstract para 3

s/a stateful/stateful/

---

1. para 2

s/paths,/paths)/

---

1. para 4

s/an LSP state/LSP state/
s/of a LSP/of an LSP/
s/to only a single PCE/to a single PCE/

---

1.

s/to allow a stateful/to allow stateful/

---

1.

OLD
   Further, the examples in this section are for illustrative purpose to
   showcase the need for inter-PCE stateful PCEP sessions.
NEW
   This section contains illustrative examples to showcase the need for
   inter-PCE stateful PCEP sessions.
END


However, I find the examples running through sections 1.2, 1.3, and 1.4 to
be quite unnecessary. There is a feeling of "trying too hard" to prove that
there are uses for the protocol extensions described in the body of the
document. I would have been very happy with just one paragraph listing out a
few of the possible use cases.

Further, section 4 (which seems to rework the examples in sections 1.2, 1.3,
and 1.4) is not really an explanation of how the protocol extension works,
but more of a set of use cases. Again, this feels like it is trying to prove
the value of the extension. Do we really need it? It is such a simple
protocol extension.

---

1.2 para 1

s/an LSP state/LSP state/
s/grants the control/grants control/

---

1.2 para 2

This is really hard to read.

   In a multi PCE deployment (redundancy, loadbalancing...), with the
   current specification defined in [RFC8231], when a PCE makes an
   update, it is the PCC that is in charge of reporting the LSP status
   to all PCEs with LSP parameter change which brings additional hops
   and delays in notifying the overall network of the LSP parameter
   change.

a) s/current specification/specification/

b) s/with LSP parameter change/with any LSP parameter changes/

c) I can't tell what the final part of the paragraph means. Is it the
   reporting that brings additional hops and delays? How does the
   reporting cause this?

---

1.2 para 4

s/As stateful PCE make/As a stateful PCE makes/

s/immediately to/to/

---

All the figures in the document need numbers and titles. The text should
refer to them using <xref> rather than "the figure above" etc.

---

1.2

OLD
   PCE1 is responsible to compute paths for PCC1 and PCE2 is responsible
   to compute paths for PCC2.
NEW
   PCE1 is responsible for computing paths for PCC1, and PCE2 is
   responsible for computing paths for PCC2.
END

OLD
   PCE2 will so
   be notified of the change only after receiving the PCRpt message from
   PCC1.
NEW
   So PCE2 will
   be notified of the change only after receiving the PCRpt message from
   PCC1.
END

---

I'm confused by the example given in 1.2. LSP1 is under the control of PCE1,
so any changes that are made are made with the direct instruction from PCE1.
Therefore, it is not necessary to wait for the change to be reported by PCC1
- PCE2 can be notified of the intended change. Surely that would be more
efficient. (Yes, that would only be possible if there is a session between
the PCEs.)

Additionally, when the change is made in the network, it seems unlikely to
me that PCC1 would be aware that it is LSP2 that has had its resources
reduced because PCC1 does not know about the existence of LSP2. However, the
network nodes servicing LSP2 would know about this and so it would be
reported to PCC2 which can report it direct to PCE2.

(Note that this does not take anything away from the proposed protocol
extension. It is just a confusing example.)

---

1.3

s/failure, PCC/failure, the PCC/
s/sending new/sending a new/
Expand 'ERO' on first use.

---

1.3

   When the failed PCE or PCEP session comes back online, it will
   be up to the implementation to do preemption.  Doing preemption may
   lead to some disruption on the existing path if path results from
   both PCEs are not exactly the same.

The term 'preemption' may cause some confusion. I don't think you are
referring to the type of preemption of resources mentioned in the example in
the previous section. Perhaps...

   When the failed PCE or PCEP session comes back online, it will
   be up to the implementation whether to revert back to the original 
   primary PCE.  Reverting may lead to some disruption on the existing
   path if computation results from both PCEs are not exactly the same.

---

1.3

   By considering a network with
   multiple PCCs and implementing multiple stateful PCEs for redundancy
   purpose, there is no guarantee that at any time all the PCCs delegate
   their LSPs to the same PCE.

There is something not quite right about this sentence. Is there a 'not'
missing, such as...

   By considering a network with
   multiple PCCs and implementing multiple stateful PCEs for redundancy
   purpose, there is no guarantee that at any time all the PCCs will not
   delegate their LSPs to the same PCE.

The word 'preemption' is used later in the section, as well.

---

1.3

OLD
   The set of LSPs that are dependent to each other may
   start from a different head-end.
NEW
   The set of LSPs that are dependent on each other may
   start from different head-ends.
END

---

1.3

OLD  
   In the topology, all links cost
   metric is set to 1
NEW
   In the topology, all link cost
   metrics are set to 1
END

---

1.3. In the figures on page 7, it is slightly odd that you have named the
destination/egress of the LSPs as PCCs. It is true that they might be PCCs
for other LSPs, but is that relevant?

---

It might help if the scenarios in section 1.3 were given their own
subsections (1.3.1, etc.).

---

1.3

The figures for scenarios 2, 3, 4 show D=0 and D=1 next to the links.
But I don't see anything that explains what this means. I assume that this
is the "delegate" flag.

---

1.3 scenario 3

s/sequence of event/sequence of events/

---

1.3 scenario 6

This scenario makes me very uneasy. Why do you have two domains if you are
sharing all of the topology information? A substantial reason for domains is
scalability. Another reason is confidentiality. 

Why do you show the PCEs as belonging to the separate domains if they share
all of the information? Surely they should be shown as outside the domains?

BGP-LS was intended as a "north-bound" policy-based export of information.
It does not follow that a BGP-LS speaker in one domain would entertain a
session with a node outside the domain, nor that it would share a full set
of information.

The question of multi-domain PCE and BGP-LS has been discussed in a large
set of previous RFCs, and this approach seems to go against all of that
prior work.

But looking at this scenario, I wonder why you even assume that the PCEs
need to be able to see topology in both domains. It just seems unnecessary.

---

2.1

OLD
   This document specify a mechanism to set-up a PCEP session between
   the stateful PCEs.  Creating such a session is already authorized by
   multiple scenarios like the one described in [RFC4655] (multiple PCEs
   that are handling part of the path computation) and [RFC6805]
   (hierarchical PCE) but was only focused on the stateless PCEP
   sessions.  As stateful PCE brings additional features (LSP state
   synchronization, path update, delegation, ...), thus some new
   behaviors need to be defined.

   This inter-PCE PCEP session will allow the exchange of LSP states
   between PCEs that would help some scenarios where PCEP sessions are
   lost between PCC and PCE.  This inter-PCE PCEP session is henceforth
   called a state-sync session.
NEW
   This document specifies a mechanism to set up a PCEP session between
   the stateful PCEs.  Creating a PCEP session between PCEs is already 
   enabled for multiple scenarios like the ones described in [RFC4655]
   (multiple PCEs that are handling part of a path computation) and 
   [RFC6805] (hierarchical PCE).  But that earlier work focused only on
   the sessions between stateless PCEs.  

   Stateful PCE brings additional features to PCEP (LSP state 
   synchronization, path update, delegation, ...).  Thus some new
   behaviors need to be defined on the inter-PCE PCEP session.

   This inter-PCE PCEP session allows the exchange of LSP states between
   PCEs that can help in some scenarios where PCEP sessions are lost
   between PCCs and PCEs.  This inter-PCE PCEP session is called a
   "state-sync session" in this document.
END

---

2.1

s/session will allow for a PCE/session will allow a PCE/

---

2.2

   To provide
   the best efficiency, an LSP association constraint-based computation
   requires that a single PCE performs the path computation for all LSPs
   in the association group.

I don't agree with this as stated.

I do agree that it can be more optimal, and that there are some algorithms
that compute two paths at the same time, but it is also possible to
construct "split brain" solutions that work fine, if a little more slowly
and with more information exchange.

Further, not all LSP associations need knowledge of the path of one LSP to
establish the other LSPs.

---

2.2

s/This document specify/This document specifies/

---

2.2

   The
   priority could be set per association, per PCC, or for all LSPs.

Is this really...

   The
   priority could be set per association, per PCC, or for all PCEs.

---

2.2

s/shortest path at/shortest path as/

---

In 2.2 I am not clear whether PCE2 is transferring delegation to PCE1 or
just asking PCE1 to perform a computation. I think that PCE2 is allowed to
ask anyone for help performing a computation, but the issue of delegation
could be sensitive - the PCC has delegated to PCE2: does that mean that the
PCC is giving permission for the delegation to be passed on? Could this be
sensitive because PCE2 might be in a domain that the PCC doesn't trust? Or
do you assume that, because the PCC has a session with both PCEs, it trusts
them equally?

I did find 3.5 about "sub-delegation" and I think this is relevant.

---

3.1.1

   A PCE indicates its support of state-sync procedures during the PCEP
   Initialization phase [RFC5440].  The OPEN object in the Open message
   MUST contains the "Stateful PCE Capability" TLV defined in [RFC8231].
   A new P (INTER-PCE-CAPABILITY) flag is introduced to indicate the
   support of state-sync.

There is some history of flags being given letters, although not many of
these are tracked in the IANA registry (I note that you also don't ask for
this in section 11.3).

RFC 8623 defines the P2MP-LSP-INSTANTIATION-CAPABILITY flag and calls it the
P-flag. So you have a clash.

---

3.1.1

   *  P (INTER-PCE-CAPABILITY - 1 bit - TBD4): If set to 1 by a PCEP
      Speaker, the PCEP speaker indicates that the session MUST follow
      the state-sync procedures as described in this document.  The P
      bit MUST be set by both speakers: if a PCEP Speaker receives a
      STATEFUL-PCE-CAPABILITY TLV with P=0 while it advertised P=1 or if
      both set P flag to 0, the session SHOULD be set-up but the state-
      sync procedures MUST NOT be applied on this session.

There's a contradiction here. Initially, you say that if a PCE sets P=1 the
session MUST follow the procedures. Then you say that if the other PCE sets
P=0 the procedures MUST NOT be followed. 

I think it is clear what you intend, but it needs tidying.

How about...

   *  P (INTER-PCE-CAPABILITY - 1 bit - TBD4): If set to 1 by a PCEP
      speaker, the PCEP speaker indicates that it wants to use the
      state-sync procedures as described in this document.  If the P
      bit is set by both speakers, the procedures MUST be used.  If a
      PCEP speaker receives a STATEFUL-PCE-CAPABILITY TLV with P=0 while
      it advertised P=1 or if both set P flag to 0, the session SHOULD
      be set-up but the state-sync procedures MUST NOT be applied on 
      this session.  A PCE MAY decide to close a session if the received
      setting of the P flag is not acceptable.
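
To make the intended behaviour concrete, here is a minimal sketch of that
negotiation in Python. The names (SessionMode, negotiate_p_flag,
close_on_mismatch) are mine and purely illustrative. In particular,
close_on_mismatch is a hypothetical local-policy knob standing in for "MAY
decide to close"; nothing like it is defined in the draft.

```python
from enum import Enum


class SessionMode(Enum):
    STATE_SYNC = "state-sync"  # both speakers advertised P=1
    PLAIN = "plain"            # session is up, state-sync procedures not applied
    CLOSED = "closed"          # local policy chose to close the session


def negotiate_p_flag(local_p: bool, remote_p: bool,
                     close_on_mismatch: bool = False) -> SessionMode:
    """Decide how the session behaves from the two advertised P flags.

    close_on_mismatch is a hypothetical policy switch (an assumption, not
    part of the draft) modelling "MAY decide to close a session".
    """
    if local_p and remote_p:
        # P set by both speakers: the procedures MUST be used.
        return SessionMode.STATE_SYNC
    if close_on_mismatch and local_p != remote_p:
        # The received P flag is not acceptable to local policy.
        return SessionMode.CLOSED
    # Session SHOULD come up, but state-sync MUST NOT be applied.
    return SessionMode.PLAIN
```

Read this way, P=1 is a request rather than a demand, which removes the
MUST / MUST NOT clash.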

---

In 3.2 I wonder how a PCE (acting as a PCE) can tell the difference between
information received from a PCC and information received from a PCE acting
as a PCC. This could be made a bit clearer (because it is pretty important
to stop the PCEs reporting LSPs back to each other). This gets even more
complicated if there are more than 2 PCEs.

I suspect this shows up in 3.3 with the ORIGINAL-LSP-DB-VERSION.

---

3.3

   When propagating LSP state changes from a PCE to other PCEs, it is
   mandatory to ensure that a PCE always uses the freshest state coming
   from the PCC.

I know why you say that. "Mandatory" sounds like a BCP 14 sort of word.
I wonder if you want to use "MUST". But also, I wonder whether this can be
reworded simply as what the PCE does.

---

3.3

s/and log such an event/and SHOULD log such an event/
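
For example, "what the PCE does" could be described along these lines. This
is only a sketch in Python under my own assumptions: that freshness is judged
by comparing a version number carried with the state (I am guessing that
ORIGINAL-LSP-DB-VERSION plays this role), that state is keyed by the PCC that
originally reported the LSP, that a strictly older version is the stale case,
and that version wrap-around can be ignored. The names LspState and
apply_state_sync_report are hypothetical.

```python
import logging
from dataclasses import dataclass
from typing import Dict, Tuple

log = logging.getLogger("pce.state_sync")


@dataclass
class LspState:
    db_version: int  # assumed: version assigned by the originating PCC
    data: str        # stand-in for the real LSP attributes


# Key: (originating PCC identifier, LSP identifier). Both are illustrative.
LspStore = Dict[Tuple[str, int], LspState]


def apply_state_sync_report(store: LspStore, pcc: str, lsp_id: int,
                            incoming: LspState) -> bool:
    """Store incoming state unless it is older than what is already held.

    Returns True if the state was stored, False if it was ignored as stale.
    """
    key = (pcc, lsp_id)
    current = store.get(key)
    if current is not None and incoming.db_version < current.db_version:
        # Stale state: ignore it and log the event.
        log.warning("stale state for %s (have %d, got %d)",
                    key, current.db_version, incoming.db_version)
        return False
    store[key] = incoming
    return True
```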

---

3.4

   When a PCE receives a PCRpt on a state-sync session, it stores the
   LSP information into the original PCC address context (as the LSP
   belongs to the PCC).  

I'm not sure what "into the original PCC address context" means.
Is this simply that the LSP information is stored in the context of the PCC
that originally reported the LSP?

---

3.4
   
   A PCE SHOULD maintain a single state for a
   particular LSP and SHOULD maintain the list of sources it learned a
   particular state from.

You have two cases of "SHOULD". They are fine, but you need to explain what
the alternatives are ("MAY"), and how/why an implementation makes the
choice.

In fact, can you check through the whole document and look at the uses of
"SHOULD" to make sure the alternatives are properly covered?

---

3.5

s/it loose control/it loses control/

---

3.5

   If the highest priority PCE is failing or if the state-sync session
   between the local PCE and the highest priority PCE failed, the local
   PCE MAY decide to delegate the LSP to the next highest priority PCE
   or to take back control of the LSP.  It is a local policy decision.

What does "is failing" mean? How can one PCE know that another is failing?

---

3.5

In the case of sub-delegation, is there a requirement that the PCE that is
sub-delegated to has a PCEP session to the PCC that is headend for the LSP?

Suppose...

   PCE2--PCE1
          |
         PCC

If PCE1 sub-delegates to PCE2, how does PCE2 control the LSP without a
session to the PCC? But how does PCE1 know about the existence of PCE2's
sessions?

Do we rely on one of:
- All PCCs MUST have sessions to all PCEs
- A PCE MUST reject sub-delegation if it doesn't have a session to the
  PCC

You have text that says...

   In the case of sub-delegation, the
   computing PCE will send the PCUpd only to all state-sync sessions (as
   it has no direct delegation from a PCC).

...and...

   When a PCE receives a valid PCUpd on a state-sync session, it SHOULD
   forward the PCUpd to the appropriate PCC (identified based on the
   SPEAKER-ENTITY-ID TLV value) that delegated the LSP originally

This implies that PCE2 would send the PCUpd to PCE1, but PCE1 could have
been failing, or the session between PCE1 and the PCC might be down.

---

3.5.1

   A PCE SHOULD NOT compute a path
   using an association-group constraint if it has delegation for only a
   subset of LSPs in the association-group

It is unclear to me that a PCE can know this. In some cases, the association
is for a known number of LSPs (e.g., bidirectional), but in other cases
there can be a large number of LSPs in the group (e.g., VN), and the group
can be added to at any time.

---

3.7

s/a LSP/an LSP/
s/to other PCE/to another PCE/

---

6.

s/among PCEP speaker/among PCEP speakers/

---

6.

OLD
   *  ID Length: defines the length of the Speaker identity actual field
      (non-padded).
NEW
   *  ID Length: defines the length of the Speaker Entity identity field
      not counting any padding.
END
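
As I read it, the intent could be sketched as follows (Python, illustrative
only). I am assuming the usual PCEP convention of padding a value to a 4-byte
boundary while the length field excludes the padding. The function names are
hypothetical, and the real field layout is whatever the draft defines.

```python
from typing import Tuple


def encode_speaker_id(speaker_id: bytes) -> Tuple[int, bytes]:
    """Return (id_length, padded_value) for a speaker entity identity.

    id_length counts only the real identity bytes. The value is then
    zero-padded up to the next 4-byte boundary (assumed alignment).
    """
    id_length = len(speaker_id)
    pad = (-id_length) % 4  # 0..3 bytes of padding
    return id_length, speaker_id + b"\x00" * pad


def decode_speaker_id(id_length: int, padded_value: bytes) -> bytes:
    """Recover the identity by trusting id_length, not the padded size."""
    return padded_value[:id_length]
```

A receiver that used the padded size instead of ID Length would hand back
trailing zero bytes as part of the identity, which is why the wording matters.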

---

6.

   If a PCEP speaker receives a message with PCEP-PATH-VECTOR TLV and
   finds its speaker information already present in the PCEP-PATH-VECTOR
   TLV, it MUST ignore the PCEP message and SHOULD log it as an error.

Might be nice to say why...
...because this represents a message loop
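
In other words, the check is a simple loop guard. A minimal sketch in Python,
assuming the path vector can be treated as a list of speaker identifiers
(accept_message and the string identifiers are my own illustrative names):

```python
import logging
from typing import List

log = logging.getLogger("pcep.path_vector")


def accept_message(path_vector: List[str], my_id: str) -> bool:
    """Return False (and log an error) if my_id is already in the vector.

    Finding our own identifier means the message has already passed
    through this speaker once, i.e. it has looped.
    """
    if my_id in path_vector:
        log.error("message loop: %s already in path vector %s",
                  my_id, path_vector)
        return False
    return True
```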

---

6.

   The list of speakers within the PCEP-PATH-VECTOR TLV MUST be ordered.

Ah, but what order? 

   When sending a PCEP message (PCRpt, PCUpd, or PCInitiate), a PCEP
   Speaker MAY add the PCEP-PATH-VECTOR TLV with a PCEP-SPEAKER-
   INFORMATION containing its own information.

Addition is presumably in a specific order. "add to the end of the list"?

You do say "append" at the bottom of the paragraph, so I think it is clear
what you intend: you just need to bring it out more clearly.
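
Something like the following is what I assume you mean by "ordered": the list
is in the order the message traversed the speakers, and each speaker adds
itself at the end. This is a Python sketch with hypothetical names, not text
from the draft.

```python
from typing import List


def add_self_to_path_vector(path_vector: List[str], my_id: str) -> List[str]:
    """Return a new path vector with my_id appended at the end.

    The input list is left untouched, so the received vector is preserved
    and the forwarded one reflects traversal order.
    """
    return [*path_vector, my_id]
```

So a message originated by pce1 and forwarded by pce2 would carry
[pce1, pce2], in that order.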

---

7.

This is a reasonable section. 
I'm interested in the transfer of information outside the control of the
PCC. The PCC is in a trust relationship with the PCE, but this document
allows the PCE to share the information with other PCEs. While there is an
implied trust relationship between PCEs, there could be a long chain and the
PCC is not aware of the chain, I think.

This could be handled if the complete Path Vector TLV was returned back
through the PCEs to the PCC. Then, at least, it would know who had seen its
information.

---

Thanks for section 8. Of course, it would be better if someone would write
some code because that shows there is an actual need for this feature.

---

Thanks, also, for section 9.

---

9.1

s/for a inter-PCE session/for an inter-PCE session/
s/They MUST allow configuration of/They MUST be allowed to configure/
s/MAY also allow configuration of/MAY also be allowed to configure/ (x2)

