Timothy B. Terriberry <[email protected]> wrote:
> I took a stab at drafting some text that answers these questions (XML diff
> attached):

Thanks; this is very helpful. I have just a few comments:

> In order to support capturing a real-time stream that has lost
> packets, or that uses discontinuous transmission (DTX), a muxer
> SHOULD emit packets that explicitly request the use of Packet Loss
> Concealment (PLC) in place of the packets that were not transmitted.

lost or not transmitted.

> If
> there is no previous packet, reasonable decoders will not emit
> anything other than silence regardless of the mode. Using the CELT-
> only mode for this case (with any audio bandwidth) allows maximum
> flexibility, since a single packet can represent any duration up to
> 120 ms that is a multiple of 2.5 ms using at most two bytes.

...plus one byte of Ogg lacing.

For initial zero-length frames, might it be better to prefer the
configuration of the first non-zero-length frame to the extent possible,
when available, to help in any situation where the configuration of the
first packet might be used to report information (such as frame size),
or for an initial estimate of bandwidth, required buffer sizes, etc.?
Or perhaps the last sentence should just be omitted, since it already
effectively says that the mode, bandwidth, and channel count are
unlikely to matter to a decoder in this case.

> Delaying such
> changes as long as possible to simplifies things for PLC
> implementations.

s/to //

> A 95 ms gap could be encoded as 19 5 ms frames in
> two bytes with a single CBR code 3 packet. If the previous frame
> size was 20 ms, using four 80 ms frames, followed by three 5 ms

s/80/20/

> frames requires 4 bytes (plus an extra byte of Ogg lacing overhead),
> but allows the PLC to use its well-tested steady state behavior for
> as long as possible.

To clarify, if the previous frame was 20 ms SILK, is this suggesting a
4 x 20 ms SILK packet followed by a 3 x 5 ms CELT packet? The next
paragraph suggests keeping the mode as long as possible, implying that
it may be better to use 4 x 20 ms SILK + 10 ms SILK + 5 ms CELT.
Or is minimizing the number of frame size changes more important than
keeping the mode as long as possible?

Thanks.
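
P.S. As a sanity check on the byte counts quoted above, here is a minimal
Python sketch. It assumes the TOC byte layout from RFC 6716 (5-bit config,
1 stereo bit, 2 code bits), CELT-only narrowband configs 16 to 19, and a
code 3 frame count byte with the VBR and padding bits clear. The function
names and the choice of narrowband mono are mine, purely for illustration;
they are not part of the draft. The split function only counts frames and
bytes and deliberately says nothing about which mode each packet uses.

```python
# Frame durations in units of 2.5 ms: 20, 10, 5 and 2.5 ms.
FRAME_UNITS = (8, 4, 2, 1)
# CELT-only narrowband config numbers, keyed by frame duration in units.
CELT_NB_CONFIG = {1: 16, 2: 17, 4: 18, 8: 19}


def to_units(duration_ms):
    """Convert milliseconds to whole 2.5 ms units, rejecting anything else."""
    twice = duration_ms * 2
    if twice != int(twice) or int(twice) % 5 != 0:
        raise ValueError("duration must be a multiple of 2.5 ms")
    return int(twice) // 5


def celt_plc_packet(duration_ms):
    """Build a CELT-only packet of zero-length frames covering duration_ms."""
    units = to_units(duration_ms)
    if not 1 <= units <= 48:
        raise ValueError("a single packet covers 2.5 ms to 120 ms")
    # Use the largest frame size that divides the duration evenly.
    frame_units = next(f for f in FRAME_UNITS if units % f == 0)
    count = units // frame_units
    toc = CELT_NB_CONFIG[frame_units] << 3  # stereo bit 0, code bits 0
    if count == 1:
        return bytes([toc])  # code 0: one frame, TOC byte only
    # Code 3: TOC byte plus frame count byte (CBR, no padding).
    return bytes([toc | 3, count])


def steady_state_split(gap_ms, prev_frame_ms):
    """Split a gap into (frame_count, frame_ms) groups, one packet each.

    The first group keeps the previous frame size for as long as possible;
    the second covers whatever remains with the largest size that fits.
    """
    gap = to_units(gap_ms)
    prev = to_units(prev_frame_ms)
    if prev not in FRAME_UNITS:
        raise ValueError("sketch only handles 2.5, 5, 10 and 20 ms frames")
    if not 1 <= gap <= 48:
        raise ValueError("sketch only handles gaps of 2.5 ms to 120 ms")
    groups = []
    full = gap // prev
    if full:
        groups.append((full, prev_frame_ms))
    rest = gap - full * prev
    if rest:
        frame_units = next(f for f in FRAME_UNITS if rest % f == 0)
        groups.append((rest // frame_units, frame_units * 2.5))
    return groups


def packet_bytes(groups):
    """Bytes needed for the groups, excluding Ogg lacing overhead."""
    return sum(1 if count == 1 else 2 for count, _ in groups)
```

With these assumptions, celt_plc_packet(95) yields the two bytes 0x8B 0x13
(nineteen 5 ms frames), and steady_state_split(95, 20) yields four 20 ms
frames followed by three 5 ms frames, for which packet_bytes reports 4.
The sketch does not try code 1 or code 2 packets, so it is not always
minimal, but it never exceeds two bytes per packet.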
