Re: [codec] Comments on draft-ietf-codec-ambisonics-01

Jan Skoglund Mon, 13 Mar 2017 15:18:07 -0700

Hey,

Our idea was to avoid a potentially sparse mapping table completely for
family 2, and to replace it with a channel numbering list for family 3.


Cheers,
Jan

On Mon, Mar 13, 2017 at 3:12 PM Jean-Marc Valin <[email protected]> wrote:

> On 13/03/17 06:04 PM, Drew Allen wrote:
> > so just to be clear, if a user, say, wants to encode some mixed order
> > ambisonics using ch253, how does the decoder know what ambisonic
> > channels it has received and know how to render them correctly?
>
> Well, each line of the matrix would correspond to a channel in the
> ambisonics channel order. If that channel isn't encoded, then the line
> would have only zeros.
>
> The only way to avoid that situation would be to encode a separate D
> value (D <= C) for the number of non-zero channels among the C
> ambisonics channels possible. Then you'd store C values in the channel
> mapping array (equivalent to a CxD permutation matrix), followed by a
> Dx(M+N) weight matrix that would no longer have entire lines of zeros.
> The result would be more compact in the case of sparse representation,
> but IMO it'd be pretty ugly and prone to implementation errors. And if
> you force D==C and don't code the D (which is what I'm proposing), then
> the channel mapping permutation automatically becomes redundant.
>
> Cheers,
>
>         Jean-Marc
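
A minimal sketch of the equivalence described above, assuming plain Python lists for the matrices. The names `SILENT`, `expand_weights`, `channel_map` and `compact` are hypothetical and come from neither the draft nor libopus. It shows that a C-entry channel map plus a Dx(M+N) weight matrix carries the same information as a Cx(M+N) matrix whose unused lines are all zeros.

```python
SILENT = None  # hypothetical marker: this ambisonics channel is not encoded


def expand_weights(channel_map, compact_weights):
    """Expand a D x K weight matrix to C x K using a C-entry channel map.

    channel_map[c] is the row of compact_weights that feeds output
    channel c, or SILENT if that channel is absent (K stands for M+N).
    """
    k = len(compact_weights[0])
    return [
        [0.0] * k if row is SILENT else list(compact_weights[row])
        for row in channel_map
    ]


# C = 4 ambisonics channels, D = 3 of them encoded, K = M + N = 3.
channel_map = [0, 1, SILENT, 2]
compact = [
    [1.0, 0.0, 0.0],
    [0.0, 0.5, 0.5],
    [0.0, 0.5, -0.5],
]

full = expand_weights(channel_map, compact)
for line in full:
    print(line)
```

With D == C and no silent entries, the channel map is a pure permutation of the lines, which is why it can be folded into the weight matrix itself.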
>
> > On Mon, Mar 13, 2017 at 3:00 PM Drew Allen <[email protected]> wrote:
> >
> >     Got it. In that case, it certainly seems reasonable if I understand
> >     correctly. Thanks for clearing that up!
> >
> >     On Mon, Mar 13, 2017 at 2:55 PM Jean-Marc Valin <[email protected]> wrote:
> >
> >         On 13/03/17 05:44 PM, Drew Allen wrote:
> >         > I think the issue is that the number of total channels rises
> >         > quadratically with respect to the ambisonic order, (N + 1)^2.
> >         > If a user wants to use just the horizontal channels, it is only
> >         > 2 * N + 1. If they wish to code very high-order (>10th order)
> >         > horizontal channels, they would be artificially limited by all
> >         > the zero channels being produced, no? Or can this be handled
> >         > without actually creating all those empty channels?
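
To put numbers on the two channel counts quoted above, here is a small sketch. The 255-channel ceiling is an assumption based on the 8-bit channel count field of the Ogg Opus header, and the function names are hypothetical.

```python
MAX_CHANNELS = 255  # assumption: 8-bit channel count field in the Ogg Opus header


def periphonic_channels(order):
    """Channels in a full-sphere (periphonic) set of the given order."""
    return (order + 1) ** 2


def horizontal_channels(order):
    """Channels when only the horizontal components are kept."""
    return 2 * order + 1


def max_order(count_fn, limit=MAX_CHANNELS):
    """Highest order whose channel count still fits within the limit."""
    order = 0
    while count_fn(order + 1) <= limit:
        order += 1
    return order


for order in (1, 3, 10, 15):
    print(order, periphonic_channels(order), horizontal_channels(order))

print(max_order(periphonic_channels))  # 14, since 15th order needs 256 channels
print(max_order(horizontal_channels))  # 127, since 2 * 127 + 1 == 255
```

Under that assumption, counting C as the fully periphonic set caps the order at 14 even when only the 2 * N + 1 horizontal channels carry signal.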
> >
> >         As far as I understand, the current draft already has all the
> >         limitations you're describing. The channel mapping array is
> >         basically equivalent to a CxC permutation matrix that multiplies
> >         the Cx(N+M) weight matrix. The result is still a Cx(N+M) matrix,
> >         so using the resulting matrix as weights can still do everything
> >         without the need for the channel mapping to do the permutations.
> >
> >         Cheers,
> >
> >                 Jean-Marc
> >
> >         > On Mon, Mar 13, 2017 at 2:41 PM Mark Harris <[email protected]> wrote:
> >         >
> >         >     On Mon, Mar 13, 2017 at 10:38 AM, Jan Skoglund <[email protected]> wrote:
> >         >     > Hey,
> >         >     >
> >         >     > Thanks for your comments
> >         >     >
> >         >     > On Mon, Mar 13, 2017 at 10:08 AM Mark Harris <[email protected]> wrote:
> >         >     >>
> >         >     >> On Fri, Feb 17, 2017 at 1:57 PM, Jean-Marc Valin <[email protected]> wrote:
> >         >     >> > 3.2.  Channel Mapping Family 3
> >         >     >> >
> >         >     >> > I would suggest removing the "Output Channel Numbering"
> >         >     >> > field because it is fully equivalent to simply permuting
> >         >     >> > lines of the matrix. Also, I believe that the size of the
> >         >     >> > matrix was meant to be "32*(N+M)*C bits" rather than
> >         >     >> > "32*N*C bits".
> >         >     >>
> >         >     >> To expand on this a bit, a mapping family maps M+N decoded
> >         >     >> channels (corresponding to the actual order of the coupled
> >         >     >> and uncoupled channels in the bitstream) to C output
> >         >     >> channels (channels with a specific semantic meaning).  The
> >         >     >> additional "Output Channel Numbering" table confuses things
> >         >     >> by adding an additional mapping from the output channel
> >         >     >> numbers to a different set of numbers with actual semantic
> >         >     >> meaning, leaving the output channel numbers with no
> >         >     >> apparent meaning.
> >         >     >>
> >         >     >> This does have a potential benefit as a matrix compression
> >         >     >> technique, to reduce the size of the matrix when it would
> >         >     >> contain rows that are all zero.  However, considering that
> >         >     >> the matrix occurs only once, and mapping family 2 already
> >         >     >> offers a way to compress the matrix, this alone does not
> >         >     >> seem worth the complexity of another level of indirection.
> >         >     >> If matrix compression is desired, it would probably be less
> >         >     >> confusing to describe it in those terms and keep the
> >         >     >> semantic meaning tied to the output channels.
> >         >     >>
> >         >     >>
> >         >     >> The description of the Output Channel Numbering also does
> >         >     >> not specify the intended behavior if the same value appears
> >         >     >> in the table multiple times.
> >         >     >>
> >         >     >> Additionally, section 4.2 describes how to perform a stereo
> >         >     >> downmix of mapping family 3, but makes assumptions about the
> >         >     >> output channel numbering.  This seems harmful and likely to
> >         >     >> promote implementations that make similar assumptions.  If
> >         >     >> it is necessary to apply the output channel numbering
> >         >     >> described in section 3.2 in order to implement a correct
> >         >     >> stereo downmix, then it would be better to simply use the
> >         >     >> output channels from section 3 as input to the downmix,
> >         >     >> consolidating sections 4.1 and 4.2, rather than specify new
> >         >     >> formulas that make assumptions about the mapping.  That
> >         >     >> would also greatly simplify section 4.
> >         >     >>
> >         >     >> Eliminating the Output Channel Numbering table as Jean-Marc
> >         >     >> suggests should resolve these concerns.
> >         >     >
> >         >     >
> >         >     > The problem is that once we allow mixed orders there is no
> >         >     > unique way for a receiver/decoder to resolve the mapping to
> >         >     > ACNs from just a number of total output channels.
> >         >
> >         >
> >         >     In mapping family 2, the channel count (C) is the number of
> >         >     channels in the fully periphonic configuration, but it is not
> >         >     necessary to encode them all.  The channel mapping table can
> >         >     map each ACN to a specific decoded channel or to silence.  For
> >         >     mixed order, some of the ACNs will be mapped to silence and
> >         >     will not be encoded.
> >         >
> >         >     In mapping family 3, the matrix can do everything that the
> >         >     channel mapping table can do and more.  Why not treat C in the
> >         >     same manner, as the number of channels in the fully periphonic
> >         >     configuration, even if some are silent?
> >         >
> >         >      - Mark
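
As an illustration of the family 2 behaviour described above, the following sketch builds a channel mapping table for a mixed-order signal. It rests on several assumptions that the thread does not confirm: that index 255 marks silence (as in the Ogg Opus channel mapping table), that ACN is computed as n * (n + 1) + m, and that "mixed order" means a full-sphere set up to `vertical_order` plus horizontal-only components (|m| == n) up to `horizontal_order`. All names are hypothetical.

```python
SILENCE = 255  # assumption: mapping index that means "output silence"


def acn(n, m):
    """Ambisonic Channel Number for order n and index m (-n <= m <= n)."""
    return n * (n + 1) + m


def mixed_order_mapping(horizontal_order, vertical_order):
    """Map each ACN of the full periphonic set to a decoded channel or silence."""
    total = (horizontal_order + 1) ** 2  # C counts the full periphonic set
    encoded = set()
    for n in range(horizontal_order + 1):
        for m in range(-n, n + 1):
            # Keep everything up to vertical_order, then horizontal parts only.
            if n <= vertical_order or abs(m) == n:
                encoded.add(acn(n, m))

    mapping = []
    next_index = 0
    for channel in range(total):
        if channel in encoded:
            mapping.append(next_index)
            next_index += 1
        else:
            mapping.append(SILENCE)
    return mapping


table = mixed_order_mapping(horizontal_order=3, vertical_order=1)
print(len(table), table)
```

Here C is 16 while only 8 channels are actually coded. The decoder learns which ACNs are present from the table itself rather than from the channel count.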
> >         >
> >         >     _______________________________________________
> >         >     codec mailing list
> >         >     [email protected]
> >         >     https://www.ietf.org/mailman/listinfo/codec
> >         >
> >         >
> >         >
> >         > _______________________________________________
> >         > codec mailing list
> >         > [email protected] <mailto:[email protected]>
> >         > https://www.ietf.org/mailman/listinfo/codec
> >         >
> >
>
>

_______________________________________________
codec mailing list
[email protected]
https://www.ietf.org/mailman/listinfo/codec
