Hi,

If I understood your message correctly, I agree with the principle "check for delegation before returning data." It matches what we see in practice.

In our web-based DNS management app, resolving conflicts becomes particularly tricky when we host both the parent and the child zone.

For example, in the parent zone `test.zone`:

    sub.test.zone.    IN NS dns1
    xyz.sub.test.zone. IN A 1.2.3.4

And in the child zone `sub.test.zone`:

    xyz.sub.test.zone. IN CNAME anything

When both zones are hosted on the same platform, it is easy to get confused about which record wins. Thankfully, BIND 9.18.33 appears to check for delegation before returning data, which prevents parent-zone data from leaking into delegated space. It also saves us many support discussions with customers who expect the opposite (i.e., that the parent zone’s record would prevail).

Here's a real-life test case:

**test1.com zone:**

    $ORIGIN test1.com.
    @         3600 IN SOA pdns-1.planisys.net. hostmaster.planisys.net. 2025070403 14400 3600 604800 3600
    @         3600 IN NS  globaldns1.planisys.net.
    @         3600 IN NS  globaldns2.planisys.net.
    sub       3600 IN NS  globaldns1.planisys.net.
    sub       3600 IN NS  globaldns2.planisys.net.
    test2.sub 60   IN A   1.2.3.4


**sub.test1.com zone:**

    $ORIGIN sub.test1.com.
    @     3600 IN SOA   pdns-1.planisys.net. hostmaster.planisys.net. 2025070401 14400 3600 604800 3600
    @     3600 IN NS    globaldns1.planisys.net.
    @     3600 IN NS    globaldns2.planisys.net.
    test2 3600 IN CNAME amazon.com.


**dig result:**

    $ dig +short test2.sub.test1.com @127.0.0.1
    amazon.com.sub.test1.com.


The example shows that BIND 9.18.33 correctly returns the CNAME from the child zone, even though there’s an A record at the same name in the parent zone. This reinforces the importance of checking for delegation *before* answering, which avoids exposing occluded data.
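
One way to picture why the child zone answers here is step 2 of the RFC 6672 algorithm (pick the nearest ancestor zone to QNAME). The Python sketch below is only an illustration of that step, not BIND's actual code; the helper names `labels` and `nearest_ancestor_zone` are made up for this example.

```python
from typing import List, Optional


def labels(name: str) -> List[str]:
    """Split a domain name into lowercase labels, ignoring the trailing dot."""
    return [label for label in name.lower().rstrip(".").split(".") if label]


def nearest_ancestor_zone(qname: str, zones: List[str]) -> Optional[str]:
    """Return the hosted zone with the most labels that is an ancestor of
    (or equal to) qname, or None if no hosted zone covers it."""
    q = labels(qname)
    best: Optional[str] = None
    for zone in zones:
        z = labels(zone)
        # Compare whole labels, so "notsub.test1.com." never matches "sub.test1.com."
        is_ancestor = len(z) <= len(q) and q[len(q) - len(z):] == z
        if is_ancestor and (best is None or len(z) > len(labels(best))):
            best = zone
    return best


# Both zones from the test case are hosted on the same server.
hosted = ["test1.com.", "sub.test1.com."]
print(nearest_ancestor_zone("test2.sub.test1.com.", hosted))
```

With both zones loaded, the sketch selects `sub.test1.com.` for the query name, so the A record stored in `test1.com` is never consulted. The occlusion problem only shows up when the server holds the parent zone alone.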

My fear had always been that a "CNAME and other data" error would show up and stop one or both zones from loading, but BIND 9.18.33 seems wise enough to leave the A record in the parent zone inactive and prefer the child zone.

Best regards,

Carlos Horowicz

Planisys

On 04/07/2025 16:08, Petr Špaček wrote:
Hello dnsop.

It seems to me that both
RFC 1034 section 4.3.2. Algorithm
RFC 6672 section 3.2 Server Algorithm
have a bug/are inconsistent with RFC 2136/RFC 5936.

TL;DR these two algorithms do not handle occluded data correctly.

Quote from RFC 9499 section 7. Zones:

Occluded name:
    "The addition of a delegation point via dynamic update will render all subordinate domain names to be in a limbo, still part of the zone but not available to the lookup process. The addition of a DNAME resource record has the same impact. The subordinate names are said to be 'occluded'." (Quoted from [RFC5936], Section 3.5)

An excerpt of the algorithm in RFC 6672:

   2.  Search the available zones for the zone which is the nearest
       ancestor to QNAME.  If such a zone is found, go to step 3;
       otherwise, step 4.

   3.  Start matching down, label by label, in the zone.  The matching
       process can terminate several ways:

       A.  If the whole of QNAME is matched, we have found the node.

           If the data at the node is a CNAME, and QTYPE does not match
           CNAME, copy the CNAME RR into the answer section of the
           response, change QNAME to the canonical name in the CNAME RR,
           and go back to step 1.

           Otherwise, copy all RRs which match QTYPE into the answer
           section and go to step 6.

       B.  If a match would take us out of the authoritative data, we
           have a referral.  This happens when we encounter a node with
           NS RRs marking cuts along the bottom of a zone.
...

   6.  Using local data only, attempt to add other RRs that may be
       useful to the additional section of the query.  Exit.


Example zone where this breaks:

test.              SOA ...
test.              NS  ...
sub.test.          NS  elsewhere.
sub.test.          TXT "occluded TXT, even though it is not covered by RFC 9499"
occluded.sub.test. TXT "nobody should see this!"


Query which hits this bug:
QNAME=occluded.sub.test.
QTYPE=TXT


Now I'm a dumb machine and execute the algorithm:
- step 2 - found zone 'test.'
- step 3A - found node! not CNAME -> goto step 6
- step 6 - respond, go for weekend

/insert mushroom cloud ASCII art here/
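
The trace above can be reproduced with a small Python sketch. It is a simplified illustration under stated assumptions, not any server's implementation: the `ZONE` dictionary, `naive_lookup`, and `cut_aware_lookup` are hypothetical names, the SOA and apex NS data are placeholders because they are elided in the example zone, and special cases such as DS queries and glue are ignored.

```python
ZONE_APEX = "test."

# (owner, type) -> rdata list; "..." marks data elided in the example zone.
ZONE = {
    ("test.", "SOA"): ["..."],
    ("test.", "NS"): ["..."],
    ("sub.test.", "NS"): ["elsewhere."],
    ("sub.test.", "TXT"): ["occluded TXT, even though it is not covered by RFC 9499"],
    ("occluded.sub.test.", "TXT"): ["nobody should see this!"],
}


def labels(name):
    """Split a domain name into labels, ignoring the trailing dot."""
    return [label for label in name.rstrip(".").split(".") if label]


def naive_lookup(qname, qtype):
    """Step 3A read literally: match the whole QNAME, never look for zone cuts."""
    rrs = ZONE.get((qname, qtype))
    return ("answer", rrs) if rrs else ("no-data", [])


def cut_aware_lookup(qname, qtype):
    """Walk down from the apex label by label and stop at the first node
    below the apex that owns NS records (step 3B checked before 3A)."""
    apex, q = labels(ZONE_APEX), labels(qname)
    for n in range(len(apex) + 1, len(q) + 1):
        node = ".".join(q[len(q) - n:]) + "."
        if (node, "NS") in ZONE:
            return ("referral", ZONE[(node, "NS")])
    rrs = ZONE.get((qname, qtype))
    return ("answer", rrs) if rrs else ("no-data", [])


print(naive_lookup("occluded.sub.test.", "TXT"))
print(cut_aware_lookup("occluded.sub.test.", "TXT"))
print(cut_aware_lookup("sub.test.", "TXT"))
```

The naive variant hands out the occluded TXT record, while the cut-aware variant returns a referral to `elsewhere.` both for the name below the cut and for the TXT record sitting on the delegation node itself.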


This seems to be just wrong, and RFC 2136 section 7.18 seems to support this conclusion. Quote:

   7.18. Previously existing names which are occluded by a new zone cut
   are still considered part of the parent zone, for the purposes of
   zone transfers, even though queries for such names will be referred
   to the new subzone's servers.  If a zone cut is removed, all parent
   zone names that were occluded by it will again become visible to
   queries.  (This is a clarification of [RFC1034].)


I suppose RFC 6672 should have switched steps 3A and 3B?


A second catch is that the NS RR type can also 'occlude' data on the _same_ node, a case that is missing from the RFC 9499 definition (or rather from the RFC 2136 definition of occlusion).

Is my brain overheated and I don't understand anything anymore?
Or should I file an erratum?

Have a great weekend everyone.
