On 2015-11-14 7:14 PM, Mo Zanaty (mzanaty) wrote:

> I have reviewed draft-ietf-codec-oggopus-08 and have the following 
> comments.

Thanks for the review. Responses inline below.

> General - Both "muxer/demuxer" and "encoder/decoder" appear in 
> normative statements. I would expect RFC 6716 to fully specify
> Opus encoder/decoder behavior, and this spec to only specify
> muxer/demuxer behavior.

It's true that the draft uses encoder/decoder both to refer to an
implementation of RFC 6716 and more generally to and encoder/decoder
application. Such an application would include a codec encoder/decoder
as described by RFC 6716, an implementation of Ogg Opus logical
bitstream encapsulation/decapsulation (this draft) and an Ogg
muxer/demuxer (RFC 3533 plus Opus-specific additions from this draft).

I can change references to the more general application to
"muxer/demuxer" but that's no more precise. The draft struggles with
this elsewhere, referring to "players" and "non-playing decoders" when
offering guidance for direct audio playback or file-based output.
Overloading "encoder/decoder" was intended to be a generalization of tha
t.

Mark suggested 'an implementation (of this specification)' as a way to
disabiguate.

> Please check all "encoder/decoder" instances to see if you really
> mean "muxer/demuxer". If you really mean "encoder/decoder", is the
> behavior really restricted to Opus only when used in Ogg? If not, 
> shouldn't this update RFC 6716?

I don't think there's anything here which should update RFC 6716, given
the decision to make that document just about the audio compression
layer. Drafts describing encapsulation necessarily contain additional
constraints and techiniques on how to use an RFC 6716 encoder/decoder
for their particular application.

> Abstract - The first 2 sentences seem sufficient. Consider moving
> the rest to the Introduction, since it merely highlights Ogg
> features.

I think it's useful to include some motivation for why a draft is
interesting in the draft in the abstract. The overview of Ogg's features
is really just defining terms for the last paragraph. I'll see if I can
come up with something better.

> Section 3. Packet Organization - 2nd paragraph: The second
> sentence (on granule position) should be moved to the next section
> (4. Granule Position).

Done, although some re-writing was required to be clear about the
distinction between granulepos=0 and granulepos=-1 for pages with
header data, as you note below.

> - 4th paragraph: "The second packet in the logical Ogg bitstream
> MUST contain the comment header" ... then later ... "It MAY span
> one or more pages" ... so shouldn't this be MUST not MAY since zero
> is not allowed, i.e. the comment header is mandatory not optional.

How about, "It MAY span multiple pages, beginning on the second page of
the logical stream." Avoids the math/English ambiguity.

> - 8th paragraph: "The first audio data page SHOULD NOT have the
> 'continued packet' flag set" Why SHOULD NOT rather than MUST NOT? 
> When would it be ok to set this flag on the first audio data page?
> -

A naive editing application would generate such streams if it didn't
take care to strip the truncated packet data from the first page(s).
That's considerably more machinery than just cutting the stream at page
boundaries.

> 9th paragraph: Same as above, s/SHOULD/MUST/g. It seems you may be 
> taking the demuxer view here. I think you need to specify strict 
> normative behavior for muxers but allow some lenience in demuxers
> as already stated in that paragraph: "but implementations need to
> be prepared to deal with..."

Our intent was that both muxer and demuxer authors consider the
implications of streams which violate the SHOULD, which I think is
consistent with the RFC 2119 meaning of the term.

Saying "Muxers MUST to X. Demuxers MUST handle streams not compliant
with that MUST," is confusing without offering better precision.

> Section 4. Granule Position - This should start with stating that
> the granule position MUST be zero for the ID header page and the
> comment header completion page, but MAY be larger than zero for the
> first audio data completion page as described in section 4.5. This
> critical info should be together here instead of spread across
> other sections.

Ok.

> Section 4.1. Repairing Gaps in Real-time Streams - 1st paragraph:
> "a muxer SHOULD emit packets that explicitly request the use of
> Packet Loss Concealment (PLC) in place of the missing packets." Why
> SHOULD not MUST?

IIRC Tim wanted to allow implementations to emit zero-length packets
instead, updating the granulepos without generating packet data to
request PLC from the decoder.

A muxer following the SHOULD benefits naive players which feed packets
spanning the gap to the decoder blindly. More sophisticated players
would implement their own version of this muxer SHOULD to handle the gap
when the muxer did not.

> The prior paragraph uses MUST and says "For this to work, there
> cannot be any gaps." If you want to keep it as SHOULD strength, 
> perhaps state that muxers which fail to do this will cause
> demuxers to compute incorrect granule positions when seeking
> forward or backward.


> - 3rd paragraph: RFC6716 reference is broken.

I don't know why that reference didn't get a hyperlink in the
tools.ietf.org version. Bug with <xref> at the beginning of a <t>?
It's fine in the xml2rfc html output.

> - I assume this section does not apply to RFC 7587 (Opus RTP)
> because in the RTP case, RTP timestamp signals the gap (if using
> DTX), while Ogg uses the granule position. Correct? If not, should
> any of this section apply to RFC 7587 and therefore update it?

Correct. Ogg does not allow discontinuous transmission (or rather
treats it as damage) so it is helpful for muxers to fill the gap with
generated packets. RTP implementations are more likely to take
advantage of Opus FEC redundancy and have their own PLC algorithms to
deal with packet loss.

> Section 4.2. Pre-skip - Same as above, does it apply to RFC 7587?

RTP applications typically start playback with the first audio packet
received to minimize latency. I suppose the summary of encoder delay
could still be useful in that context, but there's no signalling
mechanism for the pre-skip field in RTP, and the primary use of
marking sample-accurate edit points isn't relevant there.

> Section 5. Header Packets - "An Opus stream" -> "An Ogg Opus
> logical stream". Otherwise it may be confused with the underlying
> Opus stream itself containing these headers.

Good point, thanks.

> Section 5.1. Identification Header - 1. to 8. Why *asterisk*
> around field names? Markdown/bold conversion failure?

This is what xml2rfc does with <spanx style="strong"> We added these
to improve readability in the html output. I thought the txt behavioru
was 'as intended' in the spirit of ASCII art.

I've removed the markup from the field headings, but left the inline
<spanx style="emph">not</spanx> => _not_ under the 'Input Sample Rate'
description as we feel emphasis is important there, and not just for
readability. Implementor feedback has included a lot of confusion on
this point.

> - 8. Channel Mapping Table: "It is omitted when the channel mapping
> family is 0, but REQUIRED otherwise." Be normative in both 0 and >0
> cases: "It MUST be omitted...0, but REQUIRED otherwise."

Thanks, fixed.

> Also in the next paragraph. - Last sentence: "However,
> implementations MAY reject streams in which the ID header does not
> complete on the first page." Seems like this should be MUST not MAY
> based on section 3 which clearly requires this: "MUST complete on
> that [first] page." Or perhaps you need to specify muxer MUST vs.
> demuxer MAY.

This one is clearly "demuxer MAY".

> Coupled Stream Count: "...the first M Opus decoders are to be 
> initialized for stereo output..." Is this an intended restriction 
> that all stereo channels must appear first before any mono
> channels?

Yes. That way we don't have to signal mono/stereo for each substream,
just how many of each we have.

> - 3. Channel Mapping: "...If 'index' is 2*M or larger, but less
> than 255, the output MUST be taken from decoding stream ('index'-M)
> as mono." Should be index-2*M not index-M.

Hmm. I think this is correct as written, and you've confused
streams/decoders with channels.

There are N Opus (sub)streams within each Ogg Opus logical stream,
each fed to one of the N corresponding Opus decoders. The first M of
those decoders are configured to output stereo (two decoder channels
each). The remaining N-M are configured to output mono (one decoder
channel each).

So there are 2*M + 1*(N-M) = M+N decoder channels available. The
channel mapping stable describes how to route these to the C output
channels.

For each output channel index from 0 to C-1, the channel mapping table
gives the index of the decoded channel to route to that output channel
index. These are implicitly stacked from 0..2*M-1 for the M stereo
decoders, and 2*M..M+N-1 for the N-M mono decoders. So while there are
M+N *channel* indicies, there are only N decoders. For indicies in the
mono range, we subtract 2*M to get the offset to the 0th mono decoder
output channel. But the index of the *decoder* is M + (index - 2*M) =
(index - M).

> - Channel and stream are used interchangeably, as well as stereo
> and coupled, stereo and 2-channel, mono and 1-channel. Different
> meanings in different contexts can cause confusion. If possible,
> use the same term to mean the same thing in all contexts. If not
> possible, perhaps add a bit more text to clarify the nuances
> between streams, channels and stereo in different contexts.

I'll see if I can clean this up a bit. There definitely aren't enough
distinct terms for all the layers here.

> Section 5.1.1.1. Channel Mapping Family 0 - Second sentence: "RTP 
> mapping." Add reference to RFC 7587.

Done.

> Section 5.1.1.2. Channel Mapping Family 1 - Last paragraph: "'LFE' 
> here refers to a Low Frequency Effects," add "channel" before the 
> comma.

Ok.

> Section 8. Implementation Status - Remove the reference to [1] and 
> insert the URL directly here, so when the RFC Editor removes this 
> entire section there will be no orphaned reference in section
> 14.3. ... Section 14.3. URIs - Remove this entire section per the
> above comment.

I couldn't figure out how to do this in xml2rfc, which generates the
reference entries. I've added removing the reference entry to the
rfc-editor request.

> Section 9. Security Considerations - "Malicious payloads MUST NOT 
> cause the decoder to overrun its allocated memory or to take an 
> excessive amount of resources to decode.  Although problems in 
> encoders are typically rarer, the same applies to the encoder. 
> Malicious audio streams MUST NOT cause the encoder to
> misbehave..." Replace "misbehave" with the firmer decoder text
> about memory and resources.

Ok.

> Section 10. Content Type - The reference to RFC 6381 should be 
> changed to RFC 5334.

RFC 5334 cites RFC 4281 for the definition of the "codecs=" mime-type
extension. RFC 6381 says it obsoletes RFC 4281, and defines the codecs
parameter as well as the extensions for mp4.

Can we cite both RFC 6381 (for the format) and RFC 5334 (for
ogg-specifics) here?

> -- Notably, the "codecs=" name space format differs between RFC
> 6381 (4 character code for ISO) vs. RFC 5334 (4-8 character codes
> are mapped to longer codec names for Ogg).

My BNF is rusty, but it looks like RFC 6381 defines "codecs="
generically, in a separate section from the ISO Base Media Format
Namespace.

> - This document should indicate it updates RFC 5334 to add "opus"
> to the "codecs=" parameter.

Ok. We can say that.

Note that the table in RFC 5334 section 4 isn't intended to be
complete and instead cites
https://wiki.xiph.org/index.php/MIMETypesCodecs which has already been
updated.

 - This document should update the IANA Media
> Types registry to reference itself alongside RFC 5334 for 
> application/ogg, audio/ogg and video/ogg.

Indeed. Do I just write that it does, and magic happens? RFC 6838 is a
little vague on what's expected of document authors.

> Cheers, Mo (as Document Shepherd)

Whew! Thanks again for all the comments. I'll ask my co-authors if
they agree with the changes and publish an updated draft.

 -r

_______________________________________________
codec mailing list
[email protected]
https://www.ietf.org/mailman/listinfo/codec

Reply via email to