On Thu, 2005-03-10 at 09:10 +1100, Paul Mackerras wrote:
> Michel Dänzer writes:
>
> > Nice. It might also be interesting to experiment with copying the
> > texture data into the ring itself instead of into indirect buffers (and
> > use type 3 NOP packets to have the CP skip it), if someone feels so
> > inclined.
>
> Just to avoid the overhead of allocating an indirect buffer?
Yes.
> I think that could be worthwhile for smaller textures,
Not only; with large textures, you might avoid having to wait for an
indirect buffer to become available, although the ring size might have
to be increased for that.
> although for smaller textures it would probably be just as fast, and a
> lot simpler, to write the texture directly to the framebuffer.
Possibly, but that has the disadvantage of not being synchronized with
the GPU.
> I assumed that the indirect buffer would be at least 1kB-aligned
> (indirect buffers seem to be page-aligned, from what I could see in
> the code that creates them). This means that I didn't have to worry
> about losing bits when shifting buf->offset right 10 bits. We
> wouldn't have that guarantee if we were putting the texture in the
> ring buffer, which might make calculation of suitable x and y values
> interesting. :)
The data can be aligned in the ring as well, but that's indeed an issue.
I'm really not sure this would be worth it though, hence my choice of
words: 'might be interesting to experiment'... :)
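To illustrate the alignment issue, here is a minimal sketch of the packing
the patch does with (texpitch << 22) | (offset >> 10). The helper name
pack_pitch_offset and its error convention are made up for this example and
are not part of the DRM; it only assumes what the patch shows, i.e. that the
offset goes into the low 22 bits in units of 1kB and the pitch goes above
bit 22. Any offset that isn't 1kB-aligned is rejected rather than silently
truncated:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define OFFSET_SHIFT 10u  /* offset is stored in units of 1kB */
#define PITCH_SHIFT  22u  /* pitch field starts at bit 22 */

/*
 * Hypothetical helper (not a DRM function): pack pitch and offset the way
 * the patch does. Returns 0 on success, -1 if bits would be lost.
 */
static int pack_pitch_offset(uint32_t texpitch, uint32_t offset, uint32_t *out)
{
    assert(out != NULL);

    /* The low 10 bits of the offset are dropped by the shift. */
    if (offset & ((1u << OFFSET_SHIFT) - 1u))
        return -1;

    /* The pitch must fit in the bits left above PITCH_SHIFT. */
    if (texpitch >> (32u - PITCH_SHIFT))
        return -1;

    *out = (texpitch << PITCH_SHIFT) | (offset >> OFFSET_SHIFT);
    return 0;
}
```

With page-aligned indirect buffers the first check can never fail; with data
placed in the ring, it can unless the data is padded to a 1kB boundary or
the misalignment is absorbed into the blit's x and y values.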
> > > > + OUT_RING((texpitch << 22) | (offset >> 10));
> > > > + OUT_RING((texpitch << 22) | (tex->offset >> 10));
> >
> > Are source and destination pitch always the same?
>
> I found it quite hard to understand what was going on with tex->width,
> tex->height and tex->pitch vs. image->width and image->height, since
> they seem to be used inconsistently. It turns out that in fact
> tex->pitch isn't actually the pitch of the texture image - it can be a
> power of two multiple of the actual texture pitch.
I think tex->pitch is the value that will be written to the texture
pitch register. My (limited) understanding of the other fields is:
tex->{width,height}: texture width/height
image->width: source data pitch
image->height: source data height
> By the time the data gets to the indirect buffer it is laid out as we
> want it in the framebuffer, though, [...]
Really? The texture pitch in the framebuffer is always a multiple of 1024
bytes AFAIK (I might be way off though :), so that would be a royal waste
of bandwidth in some cases.
I agree this is pretty confusing though, clues appreciated. :)
> > > > + OUT_RING(0);
> > > > + OUT_RING((image->x << 16) | image->y);
> > > > + OUT_RING((image->width << 16) | height);
> > > > + ADVANCE_RING();
> > > > +
> > > > radeon_cp_discard_buffer(dev, buf);
> >
> > I think this needs a RADEON_WAIT_UNTIL_2D_IDLE(), or the indirect buffer
> > might get reused before the blit is complete.
>
> Well, radeon_cp_dispatch_indirect doesn't seem to wait for the blit to
> complete either, so I was just following what the old code did.
The difference is that for a hostdata blit, the CP writes the data to
the hostdata registers synchronously, whereas with your change, the 2D
engine will fetch the data asynchronously.
> I must admit I don't yet understand how indirect buffers get recycled.
The DRM keeps track of a monotonically increasing 'buffer age'. When an
indirect buffer gets discarded, the DRM increments the buffer age and
associates the discarded buffer with the incremented age. It emits
commands for executing the indirect buffer to the ring buffer, followed
by a write to a reserved scratch register. The DRM knows it can reuse
indirect buffers whose associated age is smaller than or equal to the
age in that register.
Now with hostdata blits, this doesn't require special treatment, because
the scratch register is only written to after the CP is done feeding the
data to the hostdata registers. But with your change, the scratch
register will be written immediately after the 2D engine starts fetching
the data from the indirect buffer. In practice, conflicts might be rare
or never occur at all, but it's probably better not to take any
chances. :)
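For what it's worth, the age scheme described above can be modelled roughly
like this. It is a toy sketch, not the DRM code: struct age_model,
discard_buffer, gpu_write_scratch and can_reuse are names invented for the
example, the scratch register is just a variable, and counter wraparound is
ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NUM_BUFS 4

struct toy_buf {
    uint32_t age;     /* age assigned when the buffer was discarded */
    bool pending;     /* discarded and possibly still in use by the GPU */
};

struct age_model {
    uint32_t next_age;     /* driver-side counter, only ever increases */
    uint32_t scratch_age;  /* stands in for the reserved scratch register */
    struct toy_buf bufs[NUM_BUFS];
};

/* Driver side: discard a buffer and tag it with a freshly incremented age.
 * The real driver would also queue a scratch register write of this age
 * in the ring, after the commands that use the buffer. */
static void discard_buffer(struct age_model *m, int i)
{
    assert(i >= 0 && i < NUM_BUFS);
    m->bufs[i].age = ++m->next_age;
    m->bufs[i].pending = true;
}

/* GPU side: the queued scratch write has been executed. */
static void gpu_write_scratch(struct age_model *m, uint32_t age)
{
    m->scratch_age = age;
}

/* A buffer may be reused once its age is <= the age in the scratch register. */
static bool can_reuse(const struct age_model *m, int i)
{
    assert(i >= 0 && i < NUM_BUFS);
    return !m->bufs[i].pending || m->bufs[i].age <= m->scratch_age;
}
```

The scheme is only safe if the scratch write really happens after the last
access to the buffer, which is why the blit needs the 2D engine to be idle
before that write is processed.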
--
Earthling Michel Dänzer | Debian (powerpc), X and DRI developer
Libre software enthusiast | http://svcs.affero.net/rm.php?r=daenzer