Re: [Mesa-dev] [PATCH] mesa: Fix geometry shader program queries.

2013-10-17 Thread Paul Berry
On 16 October 2013 23:29, Pohjolainen, Topi wrote:

> On Wed, Oct 16, 2013 at 11:13:33AM -0700, Paul Berry wrote:
> > The queries GEOMETRY_VERTICES_OUT, GEOMETRY_INPUT_TYPE, and
> > GEOMETRY_OUTPUT_TYPE (defined by GL 3.2) differ from the corresponding
> > queries in ARB_geometry_shader4 in the following ways:
> >
> > - They use different enum values
> >
> > - They can only be queried; they cannot be set.
> >
> > - Attempting to query them yields INVALID_OPERATION if the program is
> >   not linked, or lacks a geometry shader.
> >
> > This patch switches us over from the ARB_geometry_shader4 behaviour to
> > the GL 3.2 behaviour.
> >
> > Fixes piglit test query-gs-prim-types.
> > ---
> >  src/mesa/main/shaderapi.c | 100
> +++---
> >  1 file changed, 40 insertions(+), 60 deletions(-)
> >
> > diff --git a/src/mesa/main/shaderapi.c b/src/mesa/main/shaderapi.c
> > index d3677c8..8be1f78 100644
> > --- a/src/mesa/main/shaderapi.c
> > +++ b/src/mesa/main/shaderapi.c
> > @@ -460,6 +460,31 @@ get_handle(struct gl_context *ctx, GLenum pname)
> >
> >
> >  /**
> > + * Check if a geometry shader query is valid at this time.  If not,
> report an
> > + * error and return false.
> > + *
> > + * From GL 3.2 section 6.1.16 (Shader and Program Queries):
> > + *
> > + * "If GEOMETRY_VERTICES_OUT, GEOMETRY_INPUT_TYPE, or
> GEOMETRY_OUTPUT_TYPE
> > + * are queried for a program which has not been linked
> successfully, or
> > + * which does not contain objects to form a geometry shader, then an
> > + * INVALID_OPERATION error is generated."
> > + */
> > +static bool
> > +check_gs_query(struct gl_context *ctx, const struct gl_shader_program
> *shProg)
> > +{
> > +   if (shProg->LinkStatus &&
> > +   shProg->_LinkedShaders[MESA_SHADER_GEOMETRY] != NULL) {
> > +  return true;
> > +   }
> > +
> > +   _mesa_error(ctx, GL_INVALID_OPERATION,
> > +   "glGetProgramv(linked geometry shader required)");
> > +   return false;
> > +}
> > +
> > +
> > +/**
> >   * glGetProgramiv() - get shader program state.
> >   * Note that this is for GLSL shader programs, not ARB vertex/fragment
> >   * programs (see glGetProgramivARB).
> > @@ -477,9 +502,10 @@ get_programiv(struct gl_context *ctx, GLuint
> program, GLenum pname, GLint *param
> >|| ctx->API == API_OPENGL_CORE
> >|| _mesa_is_gles3(ctx);
> >
> > -   /* Are geometry shaders available in this context?
> > +   /* Are geometry shaders (of the form that was adopted into GLSL 1.50
> and GL
> > +* 3.2) available in this context?
>
> I had to check just for my own understanding. The question here is not
> implying
> that there is still doubt by the author that something else should be
> checked
> along with version 3.2. Instead it is meant to be understood the same as
> "Check
> if geometry shaders (...) are available...", similarly as in case of
> 'check_gs_query()' above, right?
>

Yeah.  I agree that phrasing the comment in the form of a question is a
little misleading here.  How about if I said this instead?

/* True if geometry shaders (of the form that was adopted into GLSL 1.50
and GL 3.2) are available in this context */


> >  */
> > -   const bool has_gs = _mesa_has_geometry_shaders(ctx);
> > +   const bool has_core_gs = _mesa_is_desktop_gl(ctx) && ctx->Version >=
> 32;
> >
> > /* Are uniform buffer objects available in this context?
> >  */
> > @@ -564,20 +590,23 @@ get_programiv(struct gl_context *ctx, GLuint
> program, GLenum pname, GLint *param
> >   break;
> >*params = shProg->TransformFeedback.BufferMode;
> >return;
> > -   case GL_GEOMETRY_VERTICES_OUT_ARB:
> > -  if (!has_gs)
> > +   case GL_GEOMETRY_VERTICES_OUT:
> > +  if (!has_core_gs)
> >   break;
> > -  *params = shProg->Geom.VerticesOut;
> > +  if (check_gs_query(ctx, shProg))
> > + *params = shProg->Geom.VerticesOut;
> >return;
> > -   case GL_GEOMETRY_INPUT_TYPE_ARB:
> > -  if (!has_gs)
> > +   case GL_GEOMETRY_INPUT_TYPE:
> > +  if (!has_core_gs)
> >   break;
> > -  *params = shProg->Geom.InputType;
> > +  if (check_gs_query(ctx, shProg))
> > + *params = shProg->Geom.InputType;
> >return;
> > -   case GL_GEOMETRY_OUTPUT_TYPE_ARB:
> > -  if (!has_gs)
> > +   case GL_GEOMETRY_OUTPUT_TYPE:
> > +  if (!has_core_gs)
> >   break;
> > -  *params = shProg->Geom.OutputType;
> > +  if (check_gs_query(ctx, shProg))
> > + *params = shProg->Geom.OutputType;
> >return;
> > case GL_ACTIVE_UNIFORM_BLOCK_MAX_NAME_LENGTH: {
> >unsigned i;
> > @@ -1631,55 +1660,6 @@ _mesa_ProgramParameteri(GLuint program, GLenum
> pname, GLint value)
> >return;
> >
> > switch (pname) {
> > -   case GL_GEOMETRY_VERTICES_OUT_ARB:
> > -  if (!_mesa_is_desktop_gl(ctx) ||
> !ctx->Extensions.ARB_geometry_shader4)
> > - break;
> > -
> > -  if (value < 0 ||
> > -  (unsi
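
For readers following the thread, the GL 3.2 query rule that the patch
quotes can be modelled outside of Mesa roughly as follows.  This is only a
sketch under stated assumptions: ProgramModel, QueryError, QueryResult and
query_gs_vertices_out() are made-up names for illustration, not Mesa or
OpenGL API, and the real code records the error through _mesa_error()
instead of returning it.

```cpp
// Toy model of the GL 3.2 rule for GEOMETRY_VERTICES_OUT and friends.
// ProgramModel, QueryError and QueryResult are hypothetical names used only
// for this illustration; they are not Mesa or OpenGL types.

struct ProgramModel {
    bool link_status = false;    // stands in for shProg->LinkStatus
    bool has_linked_gs = false;  // stands in for a non-NULL linked geometry shader
    int gs_vertices_out = 0;     // stands in for shProg->Geom.VerticesOut
};

enum class QueryError { None, InvalidOperation };

struct QueryResult {
    QueryError error;  // error that would be recorded on the context
    bool wrote_value;  // whether *params would have been written
    int value;         // meaningful only when wrote_value is true
};

// Same condition as check_gs_query() in the patch: the program must be
// linked and must contain a geometry shader.
inline bool gs_query_is_valid(const ProgramModel &prog)
{
    return prog.link_status && prog.has_linked_gs;
}

// Models the GL_GEOMETRY_VERTICES_OUT case: on an invalid query the error
// is reported and the output parameter is left untouched.
inline QueryResult query_gs_vertices_out(const ProgramModel &prog)
{
    if (!gs_query_is_valid(prog))
        return QueryResult{QueryError::InvalidOperation, false, 0};
    return QueryResult{QueryError::None, true, prog.gs_vertices_out};
}
```

The point of the model is that a failed query both raises INVALID_OPERATION
and leaves the caller's output untouched, which is the behaviour the three
rewritten case labels share.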

Re: [Mesa-dev] [PATCH] mesa: Fix geometry shader program queries.

2013-10-17 Thread Pohjolainen, Topi
On Wed, Oct 16, 2013 at 11:57:16PM -0700, Paul Berry wrote:
>On 16 October 2013 23:29, Pohjolainen, Topi 
>wrote:
> 
>  On Wed, Oct 16, 2013 at 11:13:33AM -0700, Paul Berry wrote:
>  > The queries GEOMETRY_VERTICES_OUT, GEOMETRY_INPUT_TYPE, and
>  > GEOMETRY_OUTPUT_TYPE (defined by GL 3.2) differ from the corresponding
>  > queries in ARB_geometry_shader4 in the following ways:
>  >
>  > - They use different enum values
>  >
>  > - They can only be queried; they cannot be set.
>  >
>  > - Attempting to query them yields INVALID_OPERATION if the program is
>  >   not linked, or lacks a geometry shader.
>  >
>  > This patch switches us over from the ARB_geometry_shader4 behaviour to
>  > the GL 3.2 behaviour.
>  >
>  > Fixes piglit test query-gs-prim-types.
>  > ---
>  >  src/mesa/main/shaderapi.c | 100
>  +++---
>  >  1 file changed, 40 insertions(+), 60 deletions(-)
>  >
>  > diff --git a/src/mesa/main/shaderapi.c b/src/mesa/main/shaderapi.c
>  > index d3677c8..8be1f78 100644
>  > --- a/src/mesa/main/shaderapi.c
>  > +++ b/src/mesa/main/shaderapi.c
>  > @@ -460,6 +460,31 @@ get_handle(struct gl_context *ctx, GLenum pname)
>  >
>  >
>  >  /**
>  > + * Check if a geometry shader query is valid at this time.  If not,
>  report an
>  > + * error and return false.
>  > + *
>  > + * From GL 3.2 section 6.1.16 (Shader and Program Queries):
>  > + *
>  > + * "If GEOMETRY_VERTICES_OUT, GEOMETRY_INPUT_TYPE, or
>  GEOMETRY_OUTPUT_TYPE
>  > + * are queried for a program which has not been linked
>  successfully, or
>  > + * which does not contain objects to form a geometry shader, then
>  an
>  > + * INVALID_OPERATION error is generated."
>  > + */
>  > +static bool
>  > +check_gs_query(struct gl_context *ctx, const struct gl_shader_program
>  *shProg)
>  > +{
>  > +   if (shProg->LinkStatus &&
>  > +   shProg->_LinkedShaders[MESA_SHADER_GEOMETRY] != NULL) {
>  > +  return true;
>  > +   }
>  > +
>  > +   _mesa_error(ctx, GL_INVALID_OPERATION,
>  > +   "glGetProgramv(linked geometry shader required)");
>  > +   return false;
>  > +}
>  > +
>  > +
>  > +/**
>  >   * glGetProgramiv() - get shader program state.
>  >   * Note that this is for GLSL shader programs, not ARB
>  vertex/fragment
>  >   * programs (see glGetProgramivARB).
>  > @@ -477,9 +502,10 @@ get_programiv(struct gl_context *ctx, GLuint
>  program, GLenum pname, GLint *param
>  >|| ctx->API == API_OPENGL_CORE
>  >|| _mesa_is_gles3(ctx);
>  >
>  > -   /* Are geometry shaders available in this context?
>  > +   /* Are geometry shaders (of the form that was adopted into GLSL
>  1.50 and GL
>  > +* 3.2) available in this context?
> 
>  I had to check just for my own understanding. The question here is not
>  implying
>  that there is still doubt by the author that something else should be
>  checked
>  along with version 3.2. Instead it is meant to be understood the same as
>  "Check
>  if geometry shaders (...) are available...", similarly as in case of
>  'check_gs_query()' above, right?
> 
>Yeah.  I agree that phrasing the comment in the form of a question is a
>little misleading here.  How about if I said this instead?
>/* True if geometry shaders (of the form that was adopted into GLSL 1.50
>and GL 3.2) are available in this context */

That would make it crystal clear. The question was already there before your
patch so thanks for fixing it also!

> 
>  >  */
>  > -   const bool has_gs = _mesa_has_geometry_shaders(ctx);
>  > +   const bool has_core_gs = _mesa_is_desktop_gl(ctx) && ctx->Version
>  >= 32;
>  >
>  > /* Are uniform buffer objects available in this context?
>  >  */
>  > @@ -564,20 +590,23 @@ get_programiv(struct gl_context *ctx, GLuint
>  program, GLenum pname, GLint *param
>  >   break;
>  >*params = shProg->TransformFeedback.BufferMode;
>  >return;
>  > -   case GL_GEOMETRY_VERTICES_OUT_ARB:
>  > -  if (!has_gs)
>  > +   case GL_GEOMETRY_VERTICES_OUT:
>  > +  if (!has_core_gs)
>  >   break;
>  > -  *params = shProg->Geom.VerticesOut;
>  > +  if (check_gs_query(ctx, shProg))
>  > + *params = shProg->Geom.VerticesOut;
>  >return;
>  > -   case GL_GEOMETRY_INPUT_TYPE_ARB:
>  > -  if (!has_gs)
>  > +   case GL_GEOMETRY_INPUT_TYPE:
>  > +  if (!has_core_gs)
>  >   break;
>  > -  *params = shProg->Geom.InputType;
>  > +  if (check_gs_query(ctx, shProg))
>  > + *params = shProg->Geom.InputType;

Re: [Mesa-dev] Mesa (master): i965/fs: Convert gen7 to using GRFs for texture messages.

2013-10-17 Thread Chia-I Wu
On Thu, Oct 17, 2013 at 1:53 PM, Chia-I Wu  wrote:
> Hi Eric,
>
> On Sat, Oct 12, 2013 at 3:18 AM, Eric Anholt  wrote:
>> Chia-I Wu  writes:
>>
>>> Hi Eric,
>>> The frame rate of Unigine Tropics (with low shader quality) dropped
>>> from 40.8 to 23.5 after this change.
>>
>> Thanks for the note.  I see the regression as well, and I see a shader
>> that's started spilling.  It looks like we can drop the regs_written <=
>> 1 check on gen7+'s pre-regalloc scheduling to fix the problem (the MRF
>> setup thing is no longer an issue, and its presence is now making us
>> pessimize instead of optimize in general in the pre-regalloc
>> scheduling).  I'll want to run a few more tests to make sure that this
>> doesn't regress something else.
> Are you looking at this issue?  The change you suggested does not
> avoid spilling.
>
> I think the problem can be demonstrated with this snippet:
>
>   vec4 val = vec4(0.0);
>
>   vec4 tmp_001 = texture(tex, texcoord * 0.01);
>   val += tmp_001;
>   vec4 tmp_002 = texture(tex, texcoord * 0.02);
>   val += tmp_002;
>   vec4 tmp_003 = texture(tex, texcoord * 0.03);
>   val += tmp_003;
>   ...
>   vec4 tmp_099 = texture(tex, texcoord * 0.99);
>   val += tmp_099;
>   vec4 tmp_100 = texture(tex, texcoord * 1.00);
>   val += tmp_100;
>
>   gl_FragColor = val;
>
> Before the change, the scheduler saw a dependency between any two
> texture() calls (because of the use of MRF).  It was inclined to keep
> the accumulation of tmp_xxx between texture() calls even though the
> accumulation also had a dependency on the last texture() call.
>
> After the change, the dependencies between texture()s are gone.  The
> scheduler sees a chance to move all the high latency texture()
> together and generate something like this:
Ah, I drifted into looking at post-reg-alloc scheduling partway through,
so my reasoning was wrong.  The correct explanation is:

It worked before this change because there were dependencies between
texture() calls, so those calls had to be scheduled in that order.
Accumulations were scheduled as soon as they became available, and
were thus intermixed with the texture() calls.

It does not work now because the dependencies between texture() calls
are gone.  Since the scheduler schedules in FILO order, the texture()
calls are scheduled in reverse order.  Accumulations thus become
available only after all texture() calls have been scheduled.

This remains true with the suggested fix (the fix is still desirable,
but it is only a partial one).  The problem can be demonstrated with
the attached fragment shader.

>   vec4 tmp_003 = texture(tex, texcoord * 0.03);
>   ...
>   vec4 tmp_099 = texture(tex, texcoord * 0.99);
>   vec4 tmp_100 = texture(tex, texcoord * 1.00);
>
>   val += tmp_001;
>   val += tmp_002;
>   val += tmp_003;
>   ...
>   val += tmp_099;
>   val += tmp_100;
>
> Since there are not enough registers to hold all tmp_xxx, the register
> allocation starts spilling.
>
>>
>> This shader is also in bad shape now that we don't have the redundant
>> MRF move optimization, and we need to look into grf_size > 1 CSE.  That
>> would probably also have avoided the problem on this shader, though the
>> scheduling problem is more general than this one shader.
>
>
>
> --
> o...@lunarg.com
>   val = texture(tex, texcoord * 1.0);



-- 
o...@lunarg.com


465.frag
Description: Binary data
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] Mesa (master): i965/fs: Convert gen7 to using GRFs for texture messages.

2013-10-17 Thread Chia-I Wu
On Thu, Oct 17, 2013 at 3:29 PM, Chia-I Wu  wrote:
> On Thu, Oct 17, 2013 at 1:53 PM, Chia-I Wu  wrote:
>> Hi Eric,
>>
>> On Sat, Oct 12, 2013 at 3:18 AM, Eric Anholt  wrote:
>>> Chia-I Wu  writes:
>>>
>>>> Hi Eric,
>>>> The frame rate of Unigine Tropics (with low shader quality) dropped
>>>> from 40.8 to 23.5 after this change.
>>>
>>> Thanks for the note.  I see the regression as well, and I see a shader
>>> that's started spilling.  It looks like we can drop the regs_written <=
>>> 1 check on gen7+'s pre-regalloc scheduling to fix the problem (the MRF
>>> setup thing is no longer an issue, and its presence is now making us
>>> pessimize instead of optimize in general in the pre-regalloc
>>> scheduling).  I'll want to run a few more tests to make sure that this
>>> doesn't regress something else.
>> Are you looking at this issue?  The change you suggested does not
>> avoid spilling.
>>
>> I think the problem can be demonstrated with this snippet:
>>
>>   vec4 val = vec4(0.0);
>>
>>   vec4 tmp_001 = texture(tex, texcoord * 0.01);
>>   val += tmp_001;
>>   vec4 tmp_002 = texture(tex, texcoord * 0.02);
>>   val += tmp_002;
>>   vec4 tmp_003 = texture(tex, texcoord * 0.03);
>>   val += tmp_003;
>>   ...
>>   vec4 tmp_099 = texture(tex, texcoord * 0.99);
>>   val += tmp_099;
>>   vec4 tmp_100 = texture(tex, texcoord * 1.00);
>>   val += tmp_100;
>>
>>   gl_FragColor = val;
>>
>> Before the change, the scheduler saw a dependency between any two
>> texture() calls (because of the use of MRF).  It was inclined to keep
>> the accumulation of tmp_xxx between texture() calls even though the
>> accumulation also had a dependency on the last texture() call.
>>
>> After the change, the dependencies between texture()s are gone.  The
>> scheduler sees a chance to move all the high latency texture()
>> together and generate something like this:
> Ah, I drifted into looking at post-reg-alloc scheduling partway through,
> so my reasoning was wrong.  The correct explanation is:
>
> It worked before this change because there were dependencies between
> texture() calls, so those calls had to be scheduled in that order.
> Accumulations were scheduled as soon as they became available, and
> were thus intermixed with the texture() calls.
>
> It does not work now because the dependencies between texture() calls
> are gone.  Since the scheduler schedules in FILO order, the texture()
> calls are scheduled in reverse order.  Accumulations thus become
> available only after all texture() calls have been scheduled.
Prior to register allocation, choose_instruction_to_schedule() chooses
from the available instructions in reverse order.  The attached change
fixes the order, while still doing depth-first search.

It fixes the problem I saw with my example shader, but does not
prevent the shader from Unigine Tropics from spilling.  I just want to
check the idea with you.  I don't quite follow the comment for
the (inst->regs_written <= 1) check.  It seems to me you want to
schedule texturing last (among the newly available instructions),
but the comment is not clear to me.
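
The effect described above can be reproduced with a toy stack-based
(depth-first) list scheduler.  This is a sketch only, not the code of
choose_instruction_to_schedule(): Node, build_graph(), schedule() and
peak_live_tmps() are hypothetical names, and the dependency graph is
assumed to be just the example shader (each accumulation depends on its
own texture() result and on the previous accumulation).

```cpp
#include <algorithm>
#include <vector>

// One instruction in the toy model.  Ids follow program order:
// texture i has id 2*i, accumulation i has id 2*i + 1.
struct Node {
    std::vector<int> children;  // ids of instructions that depend on this one
    int unmet_deps = 0;         // dependencies that are not scheduled yet
    bool is_texture = false;
};

// texture() calls are independent of each other; "val += tmp_i" depends on
// texture i and on the previous accumulation.
std::vector<Node> build_graph(int n)
{
    std::vector<Node> g(2 * n);
    for (int i = 0; i < n; ++i) {
        const int tex = 2 * i, acc = 2 * i + 1;
        g[tex].is_texture = true;
        g[tex].children.push_back(acc);
        g[acc].unmet_deps = 1;
        if (i > 0) {
            g[acc - 2].children.push_back(acc);
            g[acc].unmet_deps = 2;
        }
    }
    return g;
}

// Always pick the most recently added ready instruction (depth-first).
// The only knob is whether the initially ready instructions are stacked so
// that the first one in program order is picked first.
std::vector<int> schedule(std::vector<Node> g, bool roots_in_program_order)
{
    std::vector<int> ready;  // used as a stack: back() is picked next
    for (int id = 0; id < static_cast<int>(g.size()); ++id)
        if (g[id].unmet_deps == 0)
            ready.push_back(id);
    if (roots_in_program_order)
        std::reverse(ready.begin(), ready.end());

    std::vector<int> order;
    while (!ready.empty()) {
        const int id = ready.back();
        ready.pop_back();
        order.push_back(id);
        for (int child : g[id].children)
            if (--g[child].unmet_deps == 0)
                ready.push_back(child);
    }
    return order;
}

// Peak number of tmp_xxx values produced but not yet accumulated.
int peak_live_tmps(const std::vector<Node> &g, const std::vector<int> &order)
{
    int live = 0, peak = 0;
    for (int id : order) {
        live += g[id].is_texture ? 1 : -1;
        peak = std::max(peak, live);
    }
    return peak;
}
```

In this model, schedule(build_graph(100), false) emits all texture() calls
before any accumulation, so 100 temporaries are live at once, while
schedule(build_graph(100), true) keeps them interleaved with a single live
temporary.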

>
> This remains true with the suggested fix (the fix is still desirable,
> but it is only a partial one).  The problem can be demonstrated with
> the attached fragment shader.
>
>>   vec4 tmp_003 = texture(tex, texcoord * 0.03);
>>   ...
>>   vec4 tmp_099 = texture(tex, texcoord * 0.99);
>>   vec4 tmp_100 = texture(tex, texcoord * 1.00);
>>
>>   val += tmp_001;
>>   val += tmp_002;
>>   val += tmp_003;
>>   ...
>>   val += tmp_099;
>>   val += tmp_100;
>>
>> Since there are not enough registers to hold all tmp_xxx, the register
>> allocation starts spilling.
>>
>>>
>>> This shader is also in bad shape now that we don't have the redundant
>>> MRF move optimization, and we need to look into grf_size > 1 CSE.  That
>>> would probably also have avoided the problem on this shader, though the
>>> scheduling problem is more general than this one shader.
>>
>>
>>
>> --
>> o...@lunarg.com
>>   val = texture(tex, texcoord * 1.0);
>
>
>
> --
> o...@lunarg.com



-- 
o...@lunarg.com


0001-i965.patch
Description: Binary data


Re: [Mesa-dev] EXT_image_dma_buf_import FD ownership

2013-10-17 Thread Tom Cooksey
> > To my knowledge, there exist only two implementations.
> >
> > ARM ships an implementation for Android on Mali. I don't see such
> > a spec update hurting ARM, because Android devices
> > are fairly locked down systems with a monolithic source tree for each
> > device.
> 
> As far as I know, the Mali driver dropped to Android doesn't have an
> implementation yet, so we're clear here.  The Mali driver dropped to
> ChromeOS does -- because I implemented it :-).  So I think we're good
> on spec changes with respect to Mali.  If the Intel folks are fine
> too, then we can do it.
> 
> I should probably let Tom Cooksey chime in on this though.

Hiya! :-)

Yes we're fine making this change to the spec and the implementation
in the upstream Mali driver. As you say, we don't expose this on Android
as it has its own EGL_ANDROID_image_native_buffer which achieves much
the same thing. So it is really only X11 & fbdev.

I think Dave G (ARM) will raise a Khronos bug on the spec and propose
some new language. This will give other vendors a chance to comment,
though as you say, I believe ARM & Mesa are the only implementations,
so I doubt there will be any objections.


Cheers,

Tom







[Mesa-dev] [PATCH] glsl: Move error message inside validation check reducing duplicate message handling

2013-10-17 Thread Timothy Arceri
---
 src/glsl/ast_to_hir.cpp | 27 ++-
 1 file changed, 14 insertions(+), 13 deletions(-)

diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp
index dfa32d9..f96ed53 100644
--- a/src/glsl/ast_to_hir.cpp
+++ b/src/glsl/ast_to_hir.cpp
@@ -637,8 +637,8 @@ shift_result_type(const struct glsl_type *type_a,
  */
 ir_rvalue *
 validate_assignment(struct _mesa_glsl_parse_state *state,
-   const glsl_type *lhs_type, ir_rvalue *rhs,
-   bool is_initializer)
+YYLTYPE loc, const glsl_type *lhs_type,
+ir_rvalue *rhs, bool is_initializer)
 {
/* If there is already some error in the RHS, just return it.  Anything
 * else will lead to an avalanche of error message back to the user.
@@ -670,6 +670,12 @@ validate_assignment(struct _mesa_glsl_parse_state *state,
 return rhs;
}
 
+   _mesa_glsl_error(&loc, state,
+                    "%s of type %s cannot be assigned to "
+                    "variable of type %s",
+                    is_initializer ? "initializer" : "value",
+                    rhs->type->name, lhs_type->name);
+
return NULL;
 }
 
@@ -700,10 +706,10 @@ do_assignment(exec_list *instructions, struct 
_mesa_glsl_parse_state *state,
 
   if (unlikely(expr->operation == ir_binop_vector_extract)) {
  ir_rvalue *new_rhs =
-validate_assignment(state, lhs->type, rhs, is_initializer);
+validate_assignment(state, lhs_loc, lhs->type,
+rhs, is_initializer);
 
  if (new_rhs == NULL) {
-_mesa_glsl_error(& lhs_loc, state, "type mismatch");
 return lhs;
  } else {
 rhs = new(ctx) ir_expression(ir_triop_vector_insert,
@@ -752,10 +758,8 @@ do_assignment(exec_list *instructions, struct 
_mesa_glsl_parse_state *state,
}
 
ir_rvalue *new_rhs =
-  validate_assignment(state, lhs->type, rhs, is_initializer);
-   if (new_rhs == NULL) {
-  _mesa_glsl_error(& lhs_loc, state, "type mismatch");
-   } else {
+  validate_assignment(state, lhs_loc, lhs->type, rhs, is_initializer);
+   if (new_rhs != NULL) {
   rhs = new_rhs;
 
   /* If the LHS array was not declared with a size, it takes it size from
@@ -2495,7 +2499,8 @@ process_initializer(ir_variable *var, ast_declaration 
*decl,
 */
if (type->qualifier.flags.q.constant
|| type->qualifier.flags.q.uniform) {
-  ir_rvalue *new_rhs = validate_assignment(state, var->type, rhs, true);
+  ir_rvalue *new_rhs = validate_assignment(state, initializer_loc,
+   var->type, rhs, true);
   if (new_rhs != NULL) {
 rhs = new_rhs;
 
@@ -2524,10 +2529,6 @@ process_initializer(ir_variable *var, ast_declaration 
*decl,
var->constant_value = constant_value;
 }
   } else {
-_mesa_glsl_error(&initializer_loc, state,
- "initializer of type %s cannot be assigned to "
- "variable of type %s",
- rhs->type->name, var->type->name);
 if (var->type->is_numeric()) {
/* Reduce cascading errors. */
var->constant_value = ir_constant::zero(state, var->type);
-- 
1.8.3.1



[Mesa-dev] [PATCH 1/8] i965/vec4: Add the ability for attributes to be interleaved.

2013-10-17 Thread Paul Berry
When geometry shaders are operated in "single" or "dual instanced"
mode, a single set of geometry shader inputs is interleaved into the
thread payload (with each payload register containing a pair of
inputs) in order to save register space.

This patch modifies vec4_visitor::lower_attributes_to_hw_regs so that
it can handle the interleaved format.
---
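As a reading aid, the attribute-to-register mapping described in the
commit message can be sketched on its own.  AttrLocation and
attribute_location() are hypothetical names for this note; the patch
itself builds a struct brw_reg, and this sketch only records which
register and which half of it an attribute would occupy.

```cpp
// Where an attribute sits in the thread payload, in this toy model.
// AttrLocation is a made-up type; the real code returns a struct brw_reg.
struct AttrLocation {
    int grf;     // payload register number
    int subreg;  // first float component within the register (0 or 4)
    int width;   // number of float components covered (4 or 8)
};

AttrLocation attribute_location(int attr, bool interleaved)
{
    if (interleaved) {
        // Register N holds attribute 2*N in its first half and
        // attribute 2*N+1 in its second half.
        return AttrLocation{attr / 2, (attr % 2) * 4, 4};
    }
    // Non-interleaved: attribute N takes up all of register N.
    return AttrLocation{attr, 0, 8};
}
```

For example, attribute 5 lands in the second half of register 2 when
interleaved, and occupies all of register 5 otherwise.
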
 src/mesa/drivers/dri/i965/brw_vec4.cpp| 28 +++
 src/mesa/drivers/dri/i965/brw_vec4.h  |  3 ++-
 src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp |  2 +-
 3 files changed, 27 insertions(+), 6 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4.cpp
index d3ee9a1..d774e6f 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp
@@ -1184,12 +1184,32 @@ vec4_visitor::dump_instruction(backend_instruction 
*be_inst)
printf("\n");
 }
 
+
+static inline struct brw_reg
+attribute_to_hw_reg(int attr, bool interleaved)
+{
+   if (interleaved)
+  return stride(brw_vec4_grf(attr / 2, (attr % 2) * 4), 0, 4, 1);
+   else
+  return brw_vec8_grf(attr, 0);
+}
+
+
 /**
  * Replace each register of type ATTR in this->instructions with a reference
  * to a fixed HW register.
+ *
+ * If interleaved is true, then each attribute takes up half a register, with
+ * register N containing attribute 2*N in its first half and attribute 2*N+1
+ * in its second half (this corresponds to the payload setup used by geometry
+ * shaders in "single" or "dual instanced" dispatch mode).  If interleaved is
+ * false, then each attribute takes up a whole register, with register N
+ * containing attribute N (this corresponds to the payload setup used by
+ * vertex shaders, and by geometry shaders in "dual object" dispatch mode).
  */
 void
-vec4_visitor::lower_attributes_to_hw_regs(const int *attribute_map)
+vec4_visitor::lower_attributes_to_hw_regs(const int *attribute_map,
+  bool interleaved)
 {
foreach_list(node, &this->instructions) {
   vec4_instruction *inst = (vec4_instruction *)node;
@@ -1203,7 +1223,7 @@ vec4_visitor::lower_attributes_to_hw_regs(const int 
*attribute_map)
   */
  assert(grf != 0);
 
-struct brw_reg reg = brw_vec8_grf(grf, 0);
+struct brw_reg reg = attribute_to_hw_reg(grf, interleaved);
 reg.type = inst->dst.type;
 reg.dw1.bits.writemask = inst->dst.writemask;
 
@@ -1222,7 +1242,7 @@ vec4_visitor::lower_attributes_to_hw_regs(const int 
*attribute_map)
   */
  assert(grf != 0);
 
-struct brw_reg reg = brw_vec8_grf(grf, 0);
+struct brw_reg reg = attribute_to_hw_reg(grf, interleaved);
 reg.dw1.bits.swizzle = inst->src[i].swizzle;
  reg.type = inst->src[i].type;
 if (inst->src[i].abs)
@@ -1260,7 +1280,7 @@ vec4_vs_visitor::setup_attributes(int payload_reg)
   nr_attributes++;
}
 
-   lower_attributes_to_hw_regs(attribute_map);
+   lower_attributes_to_hw_regs(attribute_map, false /* interleaved */);
 
/* The BSpec says we always have to read at least one thing from
 * the VF, and it appears that the hardware wedges otherwise.
diff --git a/src/mesa/drivers/dri/i965/brw_vec4.h 
b/src/mesa/drivers/dri/i965/brw_vec4.h
index 41d91e5..f99fdfa 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4.h
+++ b/src/mesa/drivers/dri/i965/brw_vec4.h
@@ -517,7 +517,8 @@ public:
 
 protected:
void emit_vertex();
-   void lower_attributes_to_hw_regs(const int *attribute_map);
+   void lower_attributes_to_hw_regs(const int *attribute_map,
+bool interleaved);
void setup_payload_interference(struct ra_graph *g, int first_payload_node,
int reg_node_count);
virtual dst_reg *make_reg_for_system_value(ir_variable *ir) = 0;
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp
index 50feb89..bd13082 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp
@@ -110,7 +110,7 @@ vec4_gs_visitor::setup_payload()
 
reg = setup_varying_inputs(reg, attribute_map);
 
-   lower_attributes_to_hw_regs(attribute_map);
+   lower_attributes_to_hw_regs(attribute_map, false /* interleaved */);
 
this->first_non_payload_grf = reg;
 }
-- 
1.8.4.1



[Mesa-dev] [PATCH 0/8] i965/gs: Use DUAL_INSTANCED mode to ease register pressure.

2013-10-17 Thread Paul Berry
Previously, i965 geometry shaders always operated in DUAL_OBJECT mode,
which is similar to vertex shader operation in that two independent
sets of inputs get dispatched to a single SIMD4x2 geometry shader
thread, which executes them both in parallel.

When register usage is tight, we need to switch to a mechanism that
uses fewer registers.  In an ideal world we'd fall back to SINGLE
mode, in which a single set of inputs is dispatched to a SIMD4x1
geometry shader thread.  Effectively this makes twice as many
registers available, since it allows independent data to be
interleaved into the lower and upper halves of each register.

Unfortunately, we don't yet have the infrastructure in the vec4
back-end to support interleaving all the registers.  So we do the next
best thing, which is to use DUAL_INSTANCED dispatch mode.  In this
mode, a single set of geometry shader inputs is delivered to the
shader in interleaved fashion (as would happen in SINGLE mode), but
the shader operates as a SIMD4x2 shader (so all other registers are
non-interleaved).  If the geometry shader is instanced, then up to two
instances may be dispatched to the geometry shader at once; otherwise,
each geometry shader invocation runs in its own thread, with the
execution mask set appropriately.  Since we don't support instanced
geometry shaders yet, DUAL_INSTANCED and SINGLE modes are for all
intents and purposes equivalent, except that we don't have to do as
much back-end register interleaving work.

The compilation strategy for choosing between DUAL_INSTANCED and
DUAL_OBJECT modes is similar to what we do for 8-wide vs. 16-wide
fragment shaders.  First we try compiling the shader in DUAL_OBJECT
mode with register spilling disabled.  If that fails, we fall back to
DUAL_INSTANCED mode and compile with register spilling enabled.
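
The two-pass strategy above can be sketched as follows.  This is an
illustration under assumptions, not the code from the series:
DispatchMode, CompileResult, CompileFn and compile_gs_with_fallback() are
hypothetical names, and the actual compilation is abstracted into a
callback that reports success or failure.

```cpp
#include <functional>

enum class DispatchMode { DualObject, DualInstanced };

struct CompileResult {
    bool ok;            // whether any attempt succeeded
    DispatchMode mode;  // mode of the last attempt
};

// Hypothetical callback: compile(mode, no_spills) returns true on success.
// With no_spills set, the attempt is expected to fail rather than spill.
using CompileFn = std::function<bool(DispatchMode, bool)>;

CompileResult compile_gs_with_fallback(const CompileFn &compile)
{
    // First attempt: the more efficient mode, with spilling disabled.
    if (compile(DispatchMode::DualObject, /* no_spills */ true))
        return CompileResult{true, DispatchMode::DualObject};

    // Fallback: the mode that needs fewer registers, spilling allowed.
    const bool ok = compile(DispatchMode::DualInstanced, /* no_spills */ false);
    return CompileResult{ok, DispatchMode::DualInstanced};
}
```

The second attempt only runs when the first one fails, so shaders that fit
in DUAL_OBJECT mode pay for a single compile.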

Unfortunately, even when using DUAL_INSTANCED mode we still can't
support 128 geometry shader input components, due to other limitations
in our vec4 back-end code.  So the final patch of the series reduces
gl_MaxGeometryInputComponents to 64, the minimum required by the spec.

This series needs to be applied atop "vbo: Make
vbo_sw_primitive_restart optionally count primitives." and "i965/gs:
Fix gl_PrimitiveIDIn when using SW primitive restart.", which are on
the mailing list but haven't been reviewed yet.  To see the series in
context, please check out branch "gs-phase-6" from
https://github.com/stereotype441/mesa.git.

[PATCH 1/8] i965/vec4: Add the ability for attributes to be interleaved.
[PATCH 2/8] i965/vec4: if register allocation fails, don't try to schedule.
[PATCH 3/8] i965/vec4: Add the ability to suppress register spilling.
[PATCH 4/8] i965/gs: Add the ability to compile a DUAL_INSTANCED geometry 
shader.
[PATCH 5/8] i965/gs: Fix up gl_PointSize input swizzling for DUAL_INSTANCED gs.
[PATCH 6/8] i965/gs: fix up primitive ID workaround for DUAL_INSTANCE dispatch.
[PATCH 7/8] i965/gs: If a DUAL_OBJECT gs would spill, fall back to 
DUAL_INSTANCED.
[PATCH 8/8] i965: Reduce gl_MaxGeometryInputComponents to 64.


[Mesa-dev] [PATCH 3/8] i965/vec4: Add the ability to suppress register spilling.

2013-10-17 Thread Paul Berry
In future patches, this will allow us to first try compiling a
geometry shader in DUAL_OBJECT mode (which is more efficient but uses
more registers) and then if spilling is required, fall back on
DUAL_INSTANCED mode.
---
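For orientation, the decision this patch adds to the allocation-failure
path can be summarised in isolation.  AllocFailureAction and
on_allocation_failure() are hypothetical names for this note; the
spill_candidate argument stands in for the value returned by
choose_spill_reg(), where -1 means nothing can be spilled.

```cpp
// What to do once register allocation has failed, in this toy model.
enum class AllocFailureAction {
    FailSpillingDisabled,  // no_spills is set: give up instead of spilling
    FailNothingToSpill,    // spilling allowed, but no candidate register
    SpillAndRetry          // spill the candidate and try allocating again
};

AllocFailureAction on_allocation_failure(bool no_spills, int spill_candidate)
{
    if (no_spills)
        return AllocFailureAction::FailSpillingDisabled;
    if (spill_candidate == -1)
        return AllocFailureAction::FailNothingToSpill;
    return AllocFailureAction::SpillAndRetry;
}
```

Note that the no_spills test comes first, so a visitor constructed with
no_spills never spills even when a candidate register exists.
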
 src/mesa/drivers/dri/i965/brw_vec4.h  | 9 -
 src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp | 7 ---
 src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.h   | 3 ++-
 src/mesa/drivers/dri/i965/brw_vec4_reg_allocate.cpp   | 5 -
 src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp| 5 +++--
 src/mesa/drivers/dri/i965/brw_vec4_vs_visitor.cpp | 2 +-
 src/mesa/drivers/dri/i965/test_vec4_register_coalesce.cpp | 2 +-
 7 files changed, 23 insertions(+), 10 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_vec4.h 
b/src/mesa/drivers/dri/i965/brw_vec4.h
index f99fdfa..fc8804f 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4.h
+++ b/src/mesa/drivers/dri/i965/brw_vec4.h
@@ -232,7 +232,8 @@ public:
struct gl_shader_program *shader_prog,
struct brw_shader *shader,
void *mem_ctx,
-bool debug_flag);
+bool debug_flag,
+bool no_spills);
~vec4_visitor();
 
dst_reg dst_null_f()
@@ -531,6 +532,12 @@ protected:
virtual int compute_array_stride(ir_dereference_array *ir);
 
const bool debug_flag;
+
+private:
+   /**
+* If true, then register allocation should fail instead of spilling.
+*/
+   const bool no_spills;
 };
 
 
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp
index bd13082..8d8f20e 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp
@@ -37,10 +37,11 @@ vec4_gs_visitor::vec4_gs_visitor(struct brw_context *brw,
  struct brw_gs_compile *c,
  struct gl_shader_program *prog,
  struct brw_shader *shader,
- void *mem_ctx)
+ void *mem_ctx,
+ bool no_spills)
: vec4_visitor(brw, &c->base, &c->gp->program.Base, &c->key.base,
   &c->prog_data.base, prog, shader, mem_ctx,
-  INTEL_DEBUG & DEBUG_GS),
+  INTEL_DEBUG & DEBUG_GS, no_spills),
  c(c)
 {
 }
@@ -562,7 +563,7 @@ brw_gs_emit(struct brw_context *brw,
   printf("\n\n");
}
 
-   vec4_gs_visitor v(brw, c, prog, shader, mem_ctx);
+   vec4_gs_visitor v(brw, c, prog, shader, mem_ctx, false /* no_spills */);
if (!v.run()) {
   prog->LinkStatus = false;
   ralloc_strcat(&prog->InfoLog, v.fail_msg);
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.h 
b/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.h
index bdcb415..f7ca5f0 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.h
+++ b/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.h
@@ -81,7 +81,8 @@ public:
struct brw_gs_compile *c,
struct gl_shader_program *prog,
struct brw_shader *shader,
-   void *mem_ctx);
+   void *mem_ctx,
+   bool no_spills);
 
 protected:
virtual dst_reg *make_reg_for_system_value(ir_variable *ir);
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_reg_allocate.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4_reg_allocate.cpp
index 3777027..807c2f3 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_reg_allocate.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_reg_allocate.cpp
@@ -214,7 +214,10 @@ vec4_visitor::reg_allocate()
* loop back into here to try again.
*/
   int reg = choose_spill_reg(g);
-  if (reg == -1) {
+  if (this->no_spills) {
+ fail("Failure to register allocate.  Reduce number of live "
+  "values to avoid this.");
+  } else if (reg == -1) {
  fail("no register to spill\n");
   } else {
  spill_reg(reg);
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
index 231815f..ecc6fe6 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
@@ -3143,8 +3143,9 @@ vec4_visitor::vec4_visitor(struct brw_context *brw,
   struct gl_shader_program *shader_prog,
   struct brw_shader *shader,
   void *mem_ctx,
-   bool debug_flag)
-   : debug_flag(debug_flag)
+   bool debug_flag,
+   bool no_spills)
+   : debug_flag(debug_flag), no_spills(no_spills)
 {
this->brw = brw;
this->ctx = &brw->ctx;
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_vs_visitor.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4_vs_visitor.cpp
index 1f5cc25..31c42c4 100644
--- a/src/mesa/drivers/dri/i965/brw_v

[Mesa-dev] [PATCH 2/8] i965/vec4: if register allocation fails, don't try to schedule.

2013-10-17 Thread Paul Berry
Otherwise the scheduler would be invoked with prog_data->total_grf ==
0, causing havoc.

In a future patch, this will allow us to try compiling a geometry
shader in DUAL_OBJECT mode with spilling disabled, and then fall back
to DUAL_INSTANCED mode if that failed.
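
For illustration only, the control-flow difference can be sketched outside the
driver. This is a simplified stand-alone model, not the actual vec4_visitor
code: the Visitor struct, its members, and the value 16 are hypothetical
stand-ins chosen to show why leaving the loop with "break" would still reach
the scheduler.

```cpp
#include <cassert>

// Hypothetical stand-in for vec4_visitor; only the control flow matters here.
struct Visitor {
   bool alloc_succeeds;     // assumption: models whether allocation can succeed
   bool failed = false;
   int total_grf = 0;       // stays 0 when register allocation fails
   bool scheduled = false;

   explicit Visitor(bool alloc_succeeds) : alloc_succeeds(alloc_succeeds) {}

   bool reg_allocate()
   {
      if (!alloc_succeeds) {
         failed = true;
         return false;
      }
      total_grf = 16;       // arbitrary non-zero value for the sketch
      return true;
   }

   void opt_schedule_instructions() { scheduled = true; }

   bool run()
   {
      while (!reg_allocate()) {
         if (failed)
            return false;   // a "break" here would fall through to scheduling
      }
      opt_schedule_instructions();
      return !failed;
   }
};
```

With the early return, a failed allocation never reaches the scheduler, so the
scheduler never sees total_grf == 0.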
---
 src/mesa/drivers/dri/i965/brw_vec4.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4.cpp
index d774e6f..89f1978 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp
@@ -1520,7 +1520,7 @@ vec4_visitor::run()
 
while (!reg_allocate()) {
   if (failed)
- break;
+ return false;
}
 
opt_schedule_instructions();
-- 
1.8.4.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 4/8] i965/gs: Add the ability to compile a DUAL_INSTANCED geometry shader.

2013-10-17 Thread Paul Berry
Not yet enabled.
---
 src/mesa/drivers/dri/i965/brw_context.h   |  6 ++
 src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp | 25 +--
 src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.h   |  3 ++-
 src/mesa/drivers/dri/i965/gen7_gs_state.c |  4 +++-
 4 files changed, 30 insertions(+), 8 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_context.h 
b/src/mesa/drivers/dri/i965/brw_context.h
index cafcf5c..6a14c7f 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -642,6 +642,12 @@ struct brw_gs_prog_data
 
bool include_primitive_id;
bool need_primitive_id_workaround;
+
+   /**
+* True if the thread should be dispatched in DUAL_INSTANCE mode, false if
+* it should be dispatched in DUAL_OBJECT mode.
+*/
+   bool dual_instanced_dispatch;
 };
 
 /** Number of texture sampler units */
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp
index 8d8f20e..2be2666 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp
@@ -57,7 +57,8 @@ vec4_gs_visitor::make_reg_for_system_value(ir_variable *ir)
 
 
 int
-vec4_gs_visitor::setup_varying_inputs(int payload_reg, int *attribute_map)
+vec4_gs_visitor::setup_varying_inputs(int payload_reg, int *attribute_map,
+  int attributes_per_reg)
 {
/* For geometry shaders there are N copies of the input attributes, where N
 * is the number of input vertices.  attribute_map[BRW_VARYING_SLOT_COUNT *
@@ -75,11 +76,14 @@ vec4_gs_visitor::setup_varying_inputs(int payload_reg, int 
*attribute_map)
   int varying = c->input_vue_map.slot_to_varying[slot];
   for (unsigned vertex = 0; vertex < num_input_vertices; vertex++) {
  attribute_map[BRW_VARYING_SLOT_COUNT * vertex + varying] =
-payload_reg + input_array_stride * vertex + slot;
+attributes_per_reg * payload_reg + input_array_stride * vertex +
+slot;
   }
}
 
-   return payload_reg + input_array_stride * num_input_vertices;
+   int regs_used = ALIGN(input_array_stride * num_input_vertices,
+ attributes_per_reg) / attributes_per_reg;
+   return payload_reg + regs_used;
 }
 
 
@@ -88,6 +92,11 @@ vec4_gs_visitor::setup_payload()
 {
int attribute_map[BRW_VARYING_SLOT_COUNT * MAX_GS_INPUT_VERTICES];
 
+   /* If we are in dual instanced mode, then attributes are going to be
+* interleaved, so one register contains two attribute slots.
+*/
+   int attributes_per_reg = c->prog_data.dual_instanced_dispatch ? 2 : 1;
+
/* If a geometry shader tries to read from an input that wasn't written by
 * the vertex shader, that produces undefined results, but it shouldn't
 * crash anything.  So initialize attribute_map to zeros--that ensures that
@@ -105,13 +114,14 @@ vec4_gs_visitor::setup_payload()
 
/* If the shader uses gl_PrimitiveIDIn, that goes in r1. */
if (c->prog_data.include_primitive_id)
-  attribute_map[VARYING_SLOT_PRIMITIVE_ID] = reg++;
+  attribute_map[VARYING_SLOT_PRIMITIVE_ID] = attributes_per_reg * reg++;
 
reg = setup_uniforms(reg);
 
-   reg = setup_varying_inputs(reg, attribute_map);
+   reg = setup_varying_inputs(reg, attribute_map, attributes_per_reg);
 
-   lower_attributes_to_hw_regs(attribute_map, false /* interleaved */);
+   lower_attributes_to_hw_regs(attribute_map,
+   c->prog_data.dual_instanced_dispatch);
 
this->first_non_payload_grf = reg;
 }
@@ -563,6 +573,9 @@ brw_gs_emit(struct brw_context *brw,
   printf("\n\n");
}
 
+   /* Assume the geometry shader will use DUAL_OBJECT dispatch for now. */
+   c->prog_data.dual_instanced_dispatch = false;
+
vec4_gs_visitor v(brw, c, prog, shader, mem_ctx, false /* no_spills */);
if (!v.run()) {
   prog->LinkStatus = false;
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.h 
b/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.h
index f7ca5f0..da4adcd 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.h
+++ b/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.h
@@ -97,7 +97,8 @@ protected:
virtual void visit(ir_end_primitive *);
 
 private:
-   int setup_varying_inputs(int payload_reg, int *attribute_map);
+   int setup_varying_inputs(int payload_reg, int *attribute_map,
+int attributes_per_reg);
void emit_control_data_bits();
void primitive_id_workaround();
 
diff --git a/src/mesa/drivers/dri/i965/gen7_gs_state.c 
b/src/mesa/drivers/dri/i965/gen7_gs_state.c
index c272b7d..2602200 100644
--- a/src/mesa/drivers/dri/i965/gen7_gs_state.c
+++ b/src/mesa/drivers/dri/i965/gen7_gs_state.c
@@ -136,7 +136,9 @@ upload_gs_state(struct brw_context *brw)
  ((brw->max_gs_threads - 1) << max_threads_shift) |
  (brw->gs.prog_data->control_data_header_size_hwords <<
   GEN7_GS

[Mesa-dev] [PATCH 5/8] i965/gs: Fix up gl_PointSize input swizzling for DUAL_INSTANCED gs.

2013-10-17 Thread Paul Berry
Geometry shaders that run in "DUAL_INSTANCED" mode store their inputs
in vec4's.  This means that when compiling gl_PointSize input
swizzling (a MOV instruction which uses a geometry shader input as
both source and destination), we need to do two things:

- Set force_writemask_all to ensure that the MOV happens regardless of
  which channels are enabled.

- Set the source register region to <4;4,1> (instead of <0;4,1>) to
  satisfy register region restrictions.
---
 src/mesa/drivers/dri/i965/brw_vec4_generator.cpp  | 17 -
 src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp |  8 +++-
 2 files changed, 23 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
index 1b597b5..700da54 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
@@ -858,7 +858,22 @@ vec4_generator::generate_vec4_instruction(vec4_instruction 
*instruction,
 
switch (inst->opcode) {
case BRW_OPCODE_MOV:
-  brw_MOV(p, dst, src[0]);
+  if (dst.width == BRW_WIDTH_4) {
+ /* This happens in attribute fixups for "dual instanced" geometry
+  * shaders, since they use attributes that are vec4's.  Since the
+  * exec width is only 4, it's essential that the caller set
+  * force_writemask_all in order to make sure the MOV happens
+  * regardless of which channels are enabled.
+  */
+ assert(inst->force_writemask_all);
+
+ /* To satisfy register region restrictions, the source needs a stride
+  * of <4;4,1>.
+  */
+ brw_MOV(p, dst, stride(src[0], 4, 4, 1));
+  } else {
+ brw_MOV(p, dst, src[0]);
+  }
   break;
case BRW_OPCODE_ADD:
   brw_ADD(p, dst, src[0], src[1]);
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp
index 2be2666..5b823c4 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp
@@ -210,7 +210,13 @@ vec4_gs_visitor::emit_prolog()
  src_reg src(dst);
  dst.writemask = WRITEMASK_X;
  src.swizzle = BRW_SWIZZLE_;
- emit(MOV(dst, src));
+ inst = emit(MOV(dst, src));
+
+ /* In dual instanced dispatch mode, dst has a width of 4, so we need
+  * to make sure the MOV happens regardless of which channels are
+  * enabled.
+  */
+ inst->force_writemask_all = true;
   }
}
 
-- 
1.8.4.1



[Mesa-dev] [PATCH 7/8] i965/gs: If a DUAL_OBJECT gs would spill, fall back to DUAL_INSTANCED.

2013-10-17 Thread Paul Berry
This is similar to what we do for 16-wide vs 8-wide fragment shaders.
First we try compiling the geometry shader in DUAL_OBJECT mode.  If we
can't do that without spilling, we fall back on DUAL_INSTANCED mode,
which should require less spilling (since it uses an interleaved
layout of payload registers).

In an ideal world we'd fall back to SINGLE mode, which would allow us
to interleave general-purpose registers too (resulting in even less
likelihood of spilling).  But at the moment, the vec4 generator and
visitor classes don't have the infrastructure to interleave general
purpose registers, so DUAL_INSTANCED is the best we can do.

As a side benefit this paves the way for implementing instanced
geometry shaders (which are incompatible with DUAL_OBJECT mode).

Since most geometry shaders used in piglit testing are small,
DUAL_INSTANCED mode won't get exercised very much in a normal piglit
run.  To force DUAL_INSTANCED mode to be used for all geometry
shaders, set INTEL_DEBUG=nodualobj.
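
As a rough sketch of the try-then-fall-back strategy described above (not the
driver code itself): try_compile(), compile_gs(), and the register budgets 8
and 16 are hypothetical and exist only to make the example self-contained. The
real code uses vec4_gs_visitor and vec4_generator.

```cpp
#include <cassert>

enum DispatchMode { DUAL_OBJECT, DUAL_INSTANCED };

// Hypothetical compile step.  It fails only when spilling is forbidden and
// the shader needs more live values than the (made-up) budget for the mode.
static bool try_compile(DispatchMode mode, bool no_spills, int live_values)
{
   const int budget = (mode == DUAL_OBJECT) ? 8 : 16;   // assumed numbers
   if (no_spills && live_values > budget)
      return false;
   return true;
}

// First attempt: DUAL_OBJECT with spilling disabled.  If that fails, or if
// DUAL_OBJECT is disabled for debugging, fall back to DUAL_INSTANCED with
// spilling allowed.
static bool compile_gs(int live_values, bool debug_no_dual_object,
                       DispatchMode *mode_out)
{
   if (!debug_no_dual_object &&
       try_compile(DUAL_OBJECT, true /* no_spills */, live_values)) {
      *mode_out = DUAL_OBJECT;
      return true;
   }

   *mode_out = DUAL_INSTANCED;
   return try_compile(DUAL_INSTANCED, false /* no_spills */, live_values);
}
```

The debug flag simply skips the first attempt, which is how a debug option
like the one above can force every geometry shader down the fallback path.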
---
 src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp | 30 +--
 src/mesa/drivers/dri/i965/intel_debug.c   |  1 +
 src/mesa/drivers/dri/i965/intel_debug.h   |  1 +
 3 files changed, 30 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp
index 76209b0..68f11ed 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp
@@ -587,8 +587,34 @@ brw_gs_emit(struct brw_context *brw,
   printf("\n\n");
}
 
-   /* Assume the geometry shader will use DUAL_OBJECT dispatch for now. */
-   c->prog_data.dual_instanced_dispatch = false;
+   /* Compile the geometry shader in DUAL_OBJECT dispatch mode, if we can do
+* so without spilling.
+*/
+   if (likely(!(INTEL_DEBUG & DEBUG_NO_DUAL_OBJECT_GS))) {
+  c->prog_data.dual_instanced_dispatch = false;
+
+  vec4_gs_visitor v(brw, c, prog, shader, mem_ctx, true /* no_spills */);
+  if (v.run()) {
+ vec4_generator g(brw, prog, &c->gp->program.Base, &c->prog_data.base,
+  mem_ctx, INTEL_DEBUG & DEBUG_GS);
+ const unsigned *generated =
+g.generate_assembly(&v.instructions, final_assembly_size);
+
+ return generated;
+  }
+   }
+
+   /* Either we failed to compile in DUAL_OBJECT mode (probably because it
+* would have required spilling) or DUAL_OBJECT mode is disabled.  So fall
+* back to DUAL_INSTANCED mode, which consumes fewer registers.
+*
+* FIXME: In an ideal world we'd fall back to SINGLE mode, which would
+* allow us to interleave general purpose registers (resulting in even less
+* likelihood of spilling).  But at the moment, the vec4 generator and
+* visitor classes don't have the infrastructure to interleave general
+* purpose registers, so DUAL_INSTANCED is the best we can do.
+*/
+   c->prog_data.dual_instanced_dispatch = true;
 
vec4_gs_visitor v(brw, c, prog, shader, mem_ctx, false /* no_spills */);
if (!v.run()) {
diff --git a/src/mesa/drivers/dri/i965/intel_debug.c 
b/src/mesa/drivers/dri/i965/intel_debug.c
index 03abdfa..b3e2935 100644
--- a/src/mesa/drivers/dri/i965/intel_debug.c
+++ b/src/mesa/drivers/dri/i965/intel_debug.c
@@ -64,6 +64,7 @@ static const struct dri_debug_control debug_control[] = {
{ "shader_time", DEBUG_SHADER_TIME },
{ "no16",  DEBUG_NO16 },
{ "blorp", DEBUG_BLORP },
+   { "nodualobj", DEBUG_NO_DUAL_OBJECT_GS },
{ NULL,0 }
 };
 
diff --git a/src/mesa/drivers/dri/i965/intel_debug.h 
b/src/mesa/drivers/dri/i965/intel_debug.h
index 486a8d6..718d95b 100644
--- a/src/mesa/drivers/dri/i965/intel_debug.h
+++ b/src/mesa/drivers/dri/i965/intel_debug.h
@@ -60,6 +60,7 @@ extern int INTEL_DEBUG;
 #define DEBUG_BLORP   0x1000
 #define DEBUG_NO16    0x2000
 #define DEBUG_VUE 0x4000
+#define DEBUG_NO_DUAL_OBJECT_GS 0x8000
 
 #ifdef HAVE_ANDROID_PLATFORM
 #define LOG_TAG "INTEL-MESA"
-- 
1.8.4.1



[Mesa-dev] [PATCH 6/8] i965/gs: fix up primitive ID workaround for DUAL_INSTANCE dispatch.

2013-10-17 Thread Paul Berry
Parallel change to "i965/gs: Fix up gl_PointSize input swizzling for
DUAL_INSTANCED gs.", except applied to the gl_PrimitiveID fixup
instead of the gl_PointSize fixup.
---
 src/mesa/drivers/dri/i965/brw_vec4_generator.cpp  | 18 +-
 src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp | 10 +-
 2 files changed, 26 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
index 700da54..07e9697 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
@@ -876,7 +876,23 @@ vec4_generator::generate_vec4_instruction(vec4_instruction 
*instruction,
   }
   break;
case BRW_OPCODE_ADD:
-  brw_ADD(p, dst, src[0], src[1]);
+  if (dst.width == BRW_WIDTH_4) {
+ /* This happens in the geometry shader primitive ID workaround for
+  * "dual instanced" geometry shaders, since they use attributes
+  * (including gl_PrimitiveID) that are vec4's.  Since the exec width
+  * is only 4, it's essential that the caller set force_writemask_all
+  * in order to make sure the ADD happens regardless of which channels
+  * are enabled.
+  */
+ assert(inst->force_writemask_all);
+
+ /* To satisfy register region restrictions, the source registers need
+  * a stride of <4;4,1>.
+  */
+ brw_ADD(p, dst, stride(src[0], 4, 4, 1), stride(src[1], 4, 4, 1));
+  } else {
+ brw_ADD(p, dst, src[0], src[1]);
+  }
   break;
case BRW_OPCODE_MUL:
   brw_MUL(p, dst, src[0], src[1]);
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp
index 5b823c4..76209b0 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp
@@ -149,7 +149,15 @@ vec4_gs_visitor::primitive_id_workaround()
this->current_annotation = "primitive ID workaround";
dst_reg dst(ATTR, VARYING_SLOT_PRIMITIVE_ID);
dst.type = BRW_REGISTER_TYPE_UD;
-   emit(ADD(dst, src_reg(dst), src_reg(primitive_id_offset)));
+   vec4_instruction *inst =
+  emit(ADD(dst, src_reg(dst), src_reg(primitive_id_offset)));
+
+   /* In dual instanced dispatch mode, the primitive ID has a width of 4, so
+* we need to make sure the ADD happens regardless of which channels are
+* enabled.
+*/
+   inst->force_writemask_all = true;
+
this->current_annotation = NULL;
 }
 
-- 
1.8.4.1



[Mesa-dev] [PATCH 8/8] i965: Reduce gl_MaxGeometryInputComponents to 64.

2013-10-17 Thread Paul Berry
Although in principle there is no hardware limitation that prevents
gl_MaxGeometryInputComponents from being set to 128 on Gen7, we have
the following limitations in the vec4 compiler back end:

- Registers assigned to geometry shader inputs can't be spilled or
  later re-used for any other purpose.

- The last 16 registers are set aside for the "MRF hack", meaning they
  can only be used to send messages, and not for general purpose
  computation.

- Up to 32 registers may be reserved for push constants, even if there
  is sufficient register pressure to make this impractical.

A shader using 128 geometry input components, and having an input type
of triangles_adjacency, would use up:

- 1 register for r0 (which holds URB handles and various pieces of
  control information).

- 1 register for gl_PrimitiveID.

- 102 registers for geometry shader inputs (17 registers per input
  vertex, assuming DUAL_INSTANCED dispatch mode and allowing for one
  register of overhead for gl_Position and gl_PointSize, which are
  present in the URB map even if they are not used).

- Up to 32 registers for push constants.

- 16 registers for the "MRF hack".

That's a total of 152 registers, which is well over the 128 registers
the hardware supports.

Fortunately, the GLSL 1.50 spec allows us to reduce
gl_MaxGeometryInputComponents to 64.  Doing that frees up 48
registers, bringing the total down to 104 registers, leaving 24
registers available to do computation.

Fixes piglit test
spec/glsl-1.50/execution/geometry/max-input-components.
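
The arithmetic above can be double-checked with a small stand-alone sketch.
The function names are made up for illustration. The constants come from the
text, plus two assumptions stated in the comments: a vec4 slot holds 4
components, and DUAL_INSTANCED packs two attribute slots into one register.

```cpp
#include <cassert>

// Registers consumed by geometry shader inputs.
static int gs_input_regs(int input_components, int num_vertices)
{
   int vec4_slots = input_components / 4;   // assumed: 4 components per slot
   int regs_per_vertex = vec4_slots / 2     // DUAL_INSTANCED: 2 slots per reg
                         + 1;               // gl_Position/gl_PointSize overhead
   return regs_per_vertex * num_vertices;
}

// Worst-case register total for a triangles_adjacency geometry shader.
static int gs_total_regs(int input_components)
{
   const int r0 = 1;                 // URB handles and control information
   const int primitive_id = 1;       // gl_PrimitiveID
   const int push_constants = 32;    // upper bound quoted in the message
   const int mrf_hack = 16;          // registers set aside for the MRF hack
   const int num_vertices = 6;       // triangles_adjacency input primitive

   return r0 + primitive_id + gs_input_regs(input_components, num_vertices) +
          push_constants + mrf_hack;
}
```

Under those assumptions, 128 input components give 17 registers per vertex and
a total of 152, while 64 input components give 9 registers per vertex and a
total of 104, matching the figures in the message.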
---
 src/mesa/drivers/dri/i965/brw_context.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_context.c 
b/src/mesa/drivers/dri/i965/brw_context.c
index 307292d..2bd494f 100644
--- a/src/mesa/drivers/dri/i965/brw_context.c
+++ b/src/mesa/drivers/dri/i965/brw_context.c
@@ -400,7 +400,7 @@ brw_initialize_context_constants(struct brw_context *brw)
if (brw->gen >= 6) {
   ctx->Const.MaxVarying = 32;
   ctx->Const.VertexProgram.MaxOutputComponents = 128;
-  ctx->Const.GeometryProgram.MaxInputComponents = 128;
+  ctx->Const.GeometryProgram.MaxInputComponents = 64;
   ctx->Const.GeometryProgram.MaxOutputComponents = 128;
   ctx->Const.FragmentProgram.MaxInputComponents = 128;
}
-- 
1.8.4.1



Re: [Mesa-dev] [PATCH] build: remove forced -fno-rtti

2013-10-17 Thread Johannes Obermayr
Am Mittwoch, 16. Oktober 2013, 16:58:53 schrieb Alexander von Gluck:
> On Wed, 2013-10-16 at 22:09 +0200, Johannes Obermayr wrote:
> > Am Dienstag, 15. Oktober 2013, 17:22:54 schrieb Alexander von Gluck:
> > > On Tue, 2013-10-15 at 15:05 -0700, Francisco Jerez wrote:
> > > > Johannes Obermayr  writes:
> > > > 
> > > > > Am Dienstag, 15. Oktober 2013, 12:19:40 schrieben Sie:
> > > > >> On Tue, 15 Oct 2013 17:04:26 +0200
> > > > >> Johannes Obermayr  wrote:
> > > > >> > Am Montag, 14. Oktober 2013, 16:57:20 schrieb Francisco Jerez:
> > > > >> > > Alexander von Gluck IV  writes:
> > > > >> > > 
> > > > >> > > > * As discussed on the mailing list,
> > > > >> > > >   forced no-rtti breaks C++ public
> > > > >> > > >   API's such as the Haiku C++ libGL.so
> > > > >> > > > * -fno-rtti *can* be still set however
> > > > >> > > >   instead of blindly forcing -fno-rtti,
> > > > >> > > >   we can rely on the llvm-config
> > > > >> > > >   --cppflags output.
> > > > >> > > >   If the system llvm is built without
> > > > >> > > >   rtti (default), the no-rtti flag will be
> > > > >> > > >   present in llvm-config --cppflags
> > > > >> > > >   (which we pick up on)
> > > > >> > > >   If llvm is built with rtti
> > > > >> > > >   (REQUIRES_RTTI=1), then -fno-rtti is
> > > > >> > > >   removed from llvm-config --cppflags.
> > > > >> > > > * We could selectively add / remove rtti
> > > > >> > > >   from various components, however mixing
> > > > >> > > >   rtti and non-rtti code is tricky and
> > > > >> > > >   could introduce bugs.
> > > > >> > > > * This needs impact tested.
> > > > >> > > 
> > > > >> > > This looks like the right thing to do to me,
> > > > >> > > 
> > > > >> > > Reviewed-by: Francisco Jerez 
> > > > >> > > 
> > > > >> > > Thanks.
> > > > >> > 
> > > > >> > ATM NACK because llvm-config doesn't output required -fno-rtti:
> > > > >> > 
> > > > >> > cmake .. -DCMAKE_BUILD_TYPE=RelWithDebInfo 
> > > > >> > -DCMAKE_INSTALL_PREFIX=/usr -DLLVM_LIBDIR_SUFFIX=64 
> > > > >> > '-DLLVM_TARGETS_TO_BUILD=CppBackend;NVPTX;R600;X86;XCore' 
> > > > >> > -DBUILD_SHARED_LIBS=ON -DLLVM_ENABLE_TIMESTAMPS=OFF 
> > > > >> > -DLLVM_ENABLE_FFI=ON -DLLVM_USE_OPROFILE=ON -DLLVM_BUILD_TESTS=OFF 
> > > > >> > -DLLVM_INCLUDE_TESTS=OFF -DLLVM_BUILD_EXAMPLES=OFF 
> > > > >> > -DLLVM_INCLUDE_EXAMPLES=OFF -DLLVM_BUILD_TOOLS=ON 
> > > > >> > -DLLVM_INCLUDE_TOOLS=ON -DLLVM_WC_REVISION=192557
> > > > >> > 
> > > > >> > $ llvm-config --cppflags
> > > > >> > -I/usr/include -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS 
> > > > >> > -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS
> > > > >> > $ llvm-config --cxxflags
> > > > >> > -I/usr/include -fmessage-length=0 -O2 -Wall -D_FORTIFY_SOURCE=2 
> > > > >> > -fstack-protector -funwind-tables -fasynchronous-unwind-tables -g  
> > > > >> > -fPIC -fvisibility-inlines-hidden -Wall -W -Wno-unused-parameter 
> > > > >> > -Wwrite-strings -Wno-missing-field-initializers -pedantic 
> > > > >> > -Wno-long-long -Wno-maybe-uninitialized -Wnon-virtual-dtor -O2 -g 
> > > > >> > -DNDEBUG  -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS 
> > > > >> > -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS
> > > > >> > $ llvm-config --cflags
> > > > >> > -I/usr/include -fmessage-length=0 -O2 -Wall -D_FORTIFY_SOURCE=2 
> > > > >> > -fstack-protector -funwind-tables -fasynchronous-unwind-tables -g  
> > > > >> > -fPIC -Wall -W -Wno-unused-parameter -Wwrite-strings 
> > > > >> > -Wno-missing-field-initializers -pedantic -Wno-long-long -O2 -g 
> > > > >> > -DNDEBUG  -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS 
> > > > >> > -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS
> > > > >> > 
> > > > >> > [  576s]   CXXLD  libgallium.la
> > > > >> > [  579s] 
> > > > >> > gallivm/.libs/lp_bld_debug.o:(.data.rel.ro._ZTI17raw_debug_ostream[_ZTI17raw_debug_ostream]+0x10):
> > > > >> >  undefined reference to `typeinfo for llvm::raw_ostream'
> > > > >> > [  579s] 
> > > > >> > gallivm/.libs/lp_bld_debug.o:(.data.rel.ro._ZTIN4llvm14format_object1ImEE[_ZTIN4llvm14format_object1ImEE]+0x10):
> > > > >> >  undefined reference to `typeinfo for llvm::format_object_base'
> > > > >> > [  579s] 
> > > > >> > gallivm/.libs/lp_bld_debug.o:(.data.rel.ro._ZTI18BufferMemoryObject[_ZTI18BufferMemoryObject]+0x10):
> > > > >> >  undefined reference to `typeinfo for llvm::MemoryObject'
> > > > >> > 
> > > > >> > Please revert commit ce8eadb!
> > > > >> 
> > > > >> Please let me know if the patch I sent to the mailing list fixes it 
> > > > >> for you. If not, I'll revert the change to be reworked.
> > > > >> As a side note, what version of LLVM are you running on what distro?
> > > > >> It is *really* strange that you're seeing those rtti bugs while 
> > > > >> llvm-config --cxxflags doesn't include -fno-rtti.
> > > > >> 
> > > > >
> > > > > No. It doesn't help.
> > > > >
> > > > > It is not strange because LLVM's CMake build system is buggy:
> > > > >
> > > > > Sth. like 
> > > > >
> > > > >   # Set common compiler options:
> > > > >   if( NOT LLVM_REQUIRES_EH )
> > > > > i

Re: [Mesa-dev] [PATCH] build: remove forced -fno-rtti

2013-10-17 Thread Alexander von Gluck IV
On Thu, 17 Oct 2013 18:11:07 +0200
Johannes Obermayr  wrote:
> 
> Simply get the LLVM devs to push this fix to all branches (again, my words:
> "Sth. like [...]"):

Yeah, I can't make anyone do anything... you have to ask them nicely.

> diff --git a/tools/llvm-config/CMakeLists.txt 
> b/tools/llvm-config/CMakeLists.txt
> index c651833..da90a67 100644
> --- a/tools/llvm-config/CMakeLists.txt
> +++ b/tools/llvm-config/CMakeLists.txt
> @@ -1,3 +1,5 @@
> +include(LLVMProcessSources)
> +
>  set(LLVM_LINK_COMPONENTS support)
> 
>  set(BUILDVARIABLES_SRCPATH ${CMAKE_CURRENT_SOURCE_DIR}/BuildVariables.inc.in)
> @@ -20,6 +22,21 @@ set(LLVM_LDFLAGS ${CMAKE_SHARED_LINKER_FLAGS})
>  set(LLVM_BUILDMODE ${CMAKE_BUILD_TYPE})
>  set(LLVM_SYSTEM_LIBS ${SYSTEM_LIBS})
>  string(REPLACE ";" " " LLVM_TARGETS_BUILT "${LLVM_TARGETS_TO_BUILD}")
> +
> +# Set common compiler options:
> +if( NOT LLVM_REQUIRES_EH )
> +  if( MSVC )
> +llvm_replace_compiler_option(LLVM_CXXFLAGS "/EHsc" "/EHs-c-")
> +  endif()
> +endif()
> +if( NOT LLVM_REQUIRES_RTTI )
> +  if( LLVM_COMPILER_IS_GCC_COMPATIBLE )
> +llvm_replace_compiler_option(LLVM_CXXFLAGS "-frtti" "-fno-rtti")
> +  elseif( MSVC )
> +llvm_replace_compiler_option(LLVM_CXXFLAGS "/GR" "/GR-")
> +  endif()
> +endif()
> +
>  configure_file(${BUILDVARIABLES_SRCPATH} ${BUILDVARIABLES_OBJPATH} @ONLY)
> 
>  # Add the llvm-config tool.
> 
> 
> The result is that LLVM_CXXFLAGS contains "-fno-rtti" in 
> tools/llvm-config/BuildVariables.inc which is used in 
> tools/llvm-config/llvm-config.cpp:
> #define LLVM_CXXFLAGS "-fomit-frame-pointer -fmessage-length=0 -O2 -Wall 
> -D_FORTIFY_SOURCE=2 -fstack-protector -funwind-tables 
> -fasynchronous-unwind-tables -g  -fPIC -fvisibility-inlines-hidden -Wall -W 
> -Wno-unused-parameter -Wwrite-strings -Wno-missing-field-initializers 
> -pedantic -Wno-long-long -Wno-maybe-uninitialized -Wnon-virtual-dtor -O2 -g 
> -DNDEBUG  -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS 
> -D__STDC_LIMIT_MACROS -fno-rtti"

Great, you should submit that fix to the llvm bug mentioned earlier.

http://llvm.org/bugs/show_bug.cgi?id=14200




Re: [Mesa-dev] [PATCH] build: remove forced -fno-rtti

2013-10-17 Thread Jan Vesely
On Thu, 2013-10-17 at 18:11 +0200, Johannes Obermayr wrote:
> Am Mittwoch, 16. Oktober 2013, 16:58:53 schrieb Alexander von Gluck:
> > On Wed, 2013-10-16 at 22:09 +0200, Johannes Obermayr wrote:
> > > Am Dienstag, 15. Oktober 2013, 17:22:54 schrieb Alexander von Gluck:
> > > > On Tue, 2013-10-15 at 15:05 -0700, Francisco Jerez wrote:
> > > > > Johannes Obermayr  writes:
> > > > > 
> > > > > > Am Dienstag, 15. Oktober 2013, 12:19:40 schrieben Sie:
> > > > > >> On Tue, 15 Oct 2013 17:04:26 +0200
> > > > > >> Johannes Obermayr  wrote:
> > > > > >> > Am Montag, 14. Oktober 2013, 16:57:20 schrieb Francisco Jerez:
> > > > > >> > > Alexander von Gluck IV  writes:
> > > > > >> > > 
> > > > > >> > > > * As discussed on the mailing list,
> > > > > >> > > >   forced no-rtti breaks C++ public
> > > > > >> > > >   API's such as the Haiku C++ libGL.so
> > > > > >> > > > * -fno-rtti *can* be still set however
> > > > > >> > > >   instead of blindly forcing -fno-rtti,
> > > > > >> > > >   we can rely on the llvm-config
> > > > > >> > > >   --cppflags output.
> > > > > >> > > >   If the system llvm is built without
> > > > > >> > > >   rtti (default), the no-rtti flag will be
> > > > > >> > > >   present in llvm-config --cppflags
> > > > > >> > > >   (which we pick up on)
> > > > > >> > > >   If llvm is built with rtti
> > > > > >> > > >   (REQUIRES_RTTI=1), then -fno-rtti is
> > > > > >> > > >   removed from llvm-config --cppflags.
> > > > > >> > > > * We could selectively add / remove rtti
> > > > > >> > > >   from various components, however mixing
> > > > > >> > > >   rtti and non-rtti code is tricky and
> > > > > >> > > >   could introduce bugs.
> > > > > >> > > > * This needs impact tested.
> > > > > >> > > 
> > > > > >> > > This looks like the right thing to do to me,
> > > > > >> > > 
> > > > > >> > > Reviewed-by: Francisco Jerez 
> > > > > >> > > 
> > > > > >> > > Thanks.
> > > > > >> > 
> > > > > >> > ATM NACK because llvm-config doesn't output required -fno-rtti:
> > > > > >> > 
> > > > > >> > cmake .. -DCMAKE_BUILD_TYPE=RelWithDebInfo 
> > > > > >> > -DCMAKE_INSTALL_PREFIX=/usr -DLLVM_LIBDIR_SUFFIX=64 
> > > > > >> > '-DLLVM_TARGETS_TO_BUILD=CppBackend;NVPTX;R600;X86;XCore' 
> > > > > >> > -DBUILD_SHARED_LIBS=ON -DLLVM_ENABLE_TIMESTAMPS=OFF 
> > > > > >> > -DLLVM_ENABLE_FFI=ON -DLLVM_USE_OPROFILE=ON 
> > > > > >> > -DLLVM_BUILD_TESTS=OFF -DLLVM_INCLUDE_TESTS=OFF 
> > > > > >> > -DLLVM_BUILD_EXAMPLES=OFF -DLLVM_INCLUDE_EXAMPLES=OFF 
> > > > > >> > -DLLVM_BUILD_TOOLS=ON -DLLVM_INCLUDE_TOOLS=ON 
> > > > > >> > -DLLVM_WC_REVISION=192557
> > > > > >> > 
> > > > > >> > $ llvm-config --cppflags
> > > > > >> > -I/usr/include -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS 
> > > > > >> > -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS
> > > > > >> > $ llvm-config --cxxflags
> > > > > >> > -I/usr/include -fmessage-length=0 -O2 -Wall -D_FORTIFY_SOURCE=2 
> > > > > >> > -fstack-protector -funwind-tables -fasynchronous-unwind-tables 
> > > > > >> > -g  -fPIC -fvisibility-inlines-hidden -Wall -W 
> > > > > >> > -Wno-unused-parameter -Wwrite-strings 
> > > > > >> > -Wno-missing-field-initializers -pedantic -Wno-long-long 
> > > > > >> > -Wno-maybe-uninitialized -Wnon-virtual-dtor -O2 -g -DNDEBUG  
> > > > > >> > -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS 
> > > > > >> > -D__STDC_LIMIT_MACROS
> > > > > >> > $ llvm-config --cflags
> > > > > >> > -I/usr/include -fmessage-length=0 -O2 -Wall -D_FORTIFY_SOURCE=2 
> > > > > >> > -fstack-protector -funwind-tables -fasynchronous-unwind-tables 
> > > > > >> > -g  -fPIC -Wall -W -Wno-unused-parameter -Wwrite-strings 
> > > > > >> > -Wno-missing-field-initializers -pedantic -Wno-long-long -O2 -g 
> > > > > >> > -DNDEBUG  -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS 
> > > > > >> > -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS
> > > > > >> > 
> > > > > >> > [  576s]   CXXLD  libgallium.la
> > > > > >> > [  579s] 
> > > > > >> > gallivm/.libs/lp_bld_debug.o:(.data.rel.ro._ZTI17raw_debug_ostream[_ZTI17raw_debug_ostream]+0x10):
> > > > > >> >  undefined reference to `typeinfo for llvm::raw_ostream'
> > > > > >> > [  579s] 
> > > > > >> > gallivm/.libs/lp_bld_debug.o:(.data.rel.ro._ZTIN4llvm14format_object1ImEE[_ZTIN4llvm14format_object1ImEE]+0x10):
> > > > > >> >  undefined reference to `typeinfo for llvm::format_object_base'
> > > > > >> > [  579s] 
> > > > > >> > gallivm/.libs/lp_bld_debug.o:(.data.rel.ro._ZTI18BufferMemoryObject[_ZTI18BufferMemoryObject]+0x10):
> > > > > >> >  undefined reference to `typeinfo for llvm::MemoryObject'
> > > > > >> > 
> > > > > >> > Please revert commit ce8eadb!
> > > > > >> 
> > > > > >> Please let me know if the patch I sent to the mailing list fixes 
> > > > > >> it for you. If not, I'll revert the change to be reworked.
> > > > > >> As a side note, what version of LLVM are you running on what 
> > > > > >> distro?
> > > > > >> It is *really* strange that you're seeing those rtti bugs while 
> > > > > >> llvm-config --cxxflags doesn't include -fno-r

[Mesa-dev] [Bug 70546] glext.h:412:25: error: redefinition of typedef 'PFNGLBLENDCOLORPROC'

2013-10-17 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=70546

--- Comment #2 from Vinson Lee  ---
(In reply to comment #1)
> 
> If you remove lines 1739-1740 in gl.h does that fix this?

Yes, removing those two lines fixes the build.

diff --git a/include/GL/gl.h b/include/GL/gl.h
index 975cfe8..babb746 100644
--- a/include/GL/gl.h
+++ b/include/GL/gl.h
@@ -1736,8 +1736,6 @@ GLAPI void GLAPIENTRY glSeparableFilter2D( GLenum target,
 GLAPI void GLAPIENTRY glGetSeparableFilter( GLenum target, GLenum format,
GLenum type, GLvoid *row, GLvoid *column, GLvoid *span );

-typedef void (APIENTRYP PFNGLBLENDCOLORPROC) (GLclampf red, GLclampf green,
GLclampf blue, GLclampf alpha);
-typedef void (APIENTRYP PFNGLBLENDEQUATIONPROC) (GLenum mode);

-- 
You are receiving this mail because:
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [RFC PATCH] glsl: Allow mixing of GLSL 1.40 and later shader versions.

2013-10-17 Thread Paul Berry
On 16 October 2013 22:30, Kenneth Graunke  wrote:

> On 10/16/2013 04:56 PM, Ian Romanick wrote:
> > On 10/16/2013 10:29 AM, Paul Berry wrote:
> >> ---
> >>
> >> I'm not 100% sure this is the right way to go, and here's why:
> >>
> >> Taken together, all the GLSL specs except GLSL 4.30 and GLSL 4.40 tell
> >> a consistent story: desktop shader versions 1.10 and 1.20 may be
> >> linked together, and desktop shader versions 1.40 and above may be
> >> linked together.  No other cross-version linking is allowed.
> >>
> >> However, cross-version linking restrictions were explicitly removed in
> >> GLSL 4.30 (the change is listed under "Summary of Changes from Version
> >> 4.20 as "Remove cross-version linking restrictions.").  GLSL 4.30 and
> >> 4.40 state that *any* version of desktop GLSL may be linked with any
> >> other version of desktop GLSL.  (Note that cross-version linking is
> >> still prohibited for ES shaders).
> >
> > This came from a Khronos bug that I submitted.  The problem is that no
> > other driver enforces the spec mandated restriction.  On top of that,
> > you can't fully enforce the restriction (without draw-time errors) with
> > separate shader objects.  I *thought* the change in 4.3 was to allow
> > mixed versions between stages, but mixing versions within a stage is
> > still forbidden.
> >
> >> This leads to a conundrum.  Normally when the GLSL spec changes from
> >> one version to the next, we implement different rules depending on the
> >> user-supplied "#version" directive.  But we can't do that for
> >> cross-version linking rules since it's not clear which version of GLSL
> >> should apply.  Should we:
> >>
> >> (a) always follow pre-GLSL 4.30 linking rules, since we don't support
> >> GLSL 4.30 yet?  (that's what this patch implements).
> >>
> >> (b) always follow post-GLSL 4.30 linking rules, since they're probably
> >> the clearest reflection of the Khronos board's intent?
> >>
> >> (c) make some kind of dynamic determination of which set of rules to
> >> follow?
> >>
> >> FWIW, the NVIDIA proprietary driver for Linux (version 313.18) appears
> >> to implement (b).
> >
> > There are different cases: intrastage and interstage.  I assume they
> > allow mixing interstage.  What about intrastage?
>
> I ran a quick test on my AMD Radeon 6870, using Catalyst 13.10 (which
> supports 4.30).  It allowed both kinds of linking.  Here's my test:
>
> [require]
> GL >= 2.1
> GLSL >= 1.10
>
> [vertex shader]
> #version 400 compatibility
>
> void main()
> {
> gl_Position = gl_Vertex;
> }
>
> [fragment shader]
> #version 330 core
> uniform vec4 color;
>
> vec4 get_color()
> {
> return color;
> }
>
> [fragment shader]
> #version 110
> vec4 get_color();
>
> void main()
> {
> gl_FragColor = get_color();
> }
>
> [test]
> uniform vec4 color 0 1 0 1
> draw rect -1 -1 2 2
> relative probe rgba (0.5, 0.5) (0.0, 1.0, 0.0, 1.0)
>

Ok, thanks Ken.  Based on the NVIDIA results, and further discussion with
Ian this morning, I'm going to NAK this patch and follow up with a patch
that allows all desktop GLSL versions to be inter-linked.

(BTW, for any Khronos members, the Khronos bug that led to this change is
bug 8463)
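
The rule the follow-up patch is meant to implement can be sketched roughly as below. This is only an illustration of the GLSL 4.30 wording quoted above, not Mesa's linker code: struct shader_version and versions_may_link() are hypothetical names, and rejecting a mix of ES and desktop shaders is an assumption that this thread does not discuss.

```c
#include <stdbool.h>

/* Hypothetical description of one compiled shader (not a Mesa type). */
struct shader_version {
   bool is_es;        /* true for GLSL ES shaders */
   unsigned version;  /* e.g. 110, 330, 400 */
};

/* Returns true if two shaders may be linked into the same program. */
static bool
versions_may_link(struct shader_version a, struct shader_version b)
{
   /* Desktop GLSL: any version may be linked with any other version. */
   if (!a.is_es && !b.is_es)
      return true;

   /* Assumption: mixing ES and desktop shaders is rejected. */
   if (a.is_es != b.is_es)
      return false;

   /* ES shaders: cross-version linking is still prohibited. */
   return a.version == b.version;
}
```

With this sketch, Ken's test case above (desktop versions 400, 330 and 110 in one program) would be accepted, while two ES shaders with different versions would not.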


Re: [Mesa-dev] [PATCH] scons: Simplified fix of llvm cxxflags

2013-10-17 Thread Jose Fonseca
Thanks.

Jose

- Original Message -
> * Based on ideas of Jose Fonseca
> * A rework of ce8eadb6e8
> ---
>  scons/llvm.py | 5 +
>  1 file changed, 5 insertions(+)
> 
> diff --git a/scons/llvm.py b/scons/llvm.py
> index c1c3736..8388d8e 100644
> --- a/scons/llvm.py
> +++ b/scons/llvm.py
> @@ -190,6 +190,11 @@ def generate(env):
>  pass
>  env.MergeFlags(cppflags)
>  
> +# Match llvm --fno-rtti flag
> +cxxflags = env.backtick('llvm-config --cxxflags').split()
> +if '-fno-rtti' in cxxflags:
> +env.Append(CXXFLAGS = ['-fno-rtti'])
> +
>  components = ['engine', 'bitwriter', 'x86asmprinter']
>  
>  if llvm_version >= distutils.version.LooseVersion('3.1'):
> --
> 1.8.4
> 


[Mesa-dev] [Bug 70581] New: ./autogen.sh warnings in Mesa build system since commit 4e9028b

2013-10-17 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=70581

  Priority: medium
Bug ID: 70581
  Assignee: mesa-dev@lists.freedesktop.org
   Summary: ./autogen.sh warnings in Mesa build system since
commit 4e9028b
  Severity: normal
Classification: Unclassified
OS: All
  Reporter: johannesoberm...@gmx.de
  Hardware: Other
Status: NEW
   Version: git
 Component: EGL
   Product: Mesa

Introduced with
http://cgit.freedesktop.org/mesa/mesa/commit/?id=4e9028b

./autogen.sh

src/gallium/state_trackers/egl/Makefile.sources:20: warning: variable
'gdi_SOURCES' is defined but no program or
src/gallium/state_trackers/egl/Makefile.sources:20: library has 'gdi' as
canonical name (possible typo)
src/gallium/state_trackers/egl/Makefile.am:25:  
'src/gallium/state_trackers/egl/Makefile.sources' included from here
src/gallium/state_trackers/egl/Makefile.sources:10: warning: variable
'android_SOURCES' is defined but no program or
src/gallium/state_trackers/egl/Makefile.sources:10: library has 'android' as
canonical name (possible typo)
src/gallium/state_trackers/egl/Makefile.am:25:  
'src/gallium/state_trackers/egl/Makefile.sources' included from here



[Mesa-dev] [Bug 70581] ./autogen.sh warnings in Mesa build system since commit 4e9028b

2013-10-17 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=70581

Johannes Obermayr  changed:

   What|Removed |Added

 CC||emil.l.veli...@gmail.com,
   ||tstel...@gmail.com



[Mesa-dev] [Bug 70546] glext.h:412:25: error: redefinition of typedef 'PFNGLBLENDCOLORPROC'

2013-10-17 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=70546

Brian Paul  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #3 from Brian Paul  ---
Fixed with commit a36f7e651e947ff14dbbd242b1d9ab160442c532



[Mesa-dev] [PATCH] Remove error when calling glGenQueries/glDeleteQueries while a query is active

2013-10-17 Thread Carl Worth
There is nothing in the OpenGL specification which prevents the user from
calling glGenQueries to generate a new query object while another object is
active. Neither is there anything in the Mesa implementation which prevents
this. So remove the INVALID_OPERATION errors in this case.

Similarly, it is explicitly allowed by the OpenGL specification to delete an
active query, so remove the assertion for that case.

CC: 
---
 src/mesa/main/queryobj.c | 15 ---
 1 file changed, 15 deletions(-)

diff --git a/src/mesa/main/queryobj.c b/src/mesa/main/queryobj.c
index a180133..c98b2c7 100644
--- a/src/mesa/main/queryobj.c
+++ b/src/mesa/main/queryobj.c
@@ -202,13 +202,6 @@ _mesa_GenQueries(GLsizei n, GLuint *ids)
   return;
}
 
-   /* No query objects can be active at this time! */
-   if (ctx->Query.CurrentOcclusionObject ||
-   ctx->Query.CurrentTimerObject) {
-  _mesa_error(ctx, GL_INVALID_OPERATION, "glGenQueriesARB");
-  return;
-   }
-
first = _mesa_HashFindFreeKeyBlock(ctx->Query.QueryObjects, n);
if (first) {
   GLsizei i;
@@ -241,18 +234,10 @@ _mesa_DeleteQueries(GLsizei n, const GLuint *ids)
   return;
}
 
-   /* No query objects can be active at this time! */
-   if (ctx->Query.CurrentOcclusionObject ||
-   ctx->Query.CurrentTimerObject) {
-  _mesa_error(ctx, GL_INVALID_OPERATION, "glDeleteQueriesARB");
-  return;
-   }
-
for (i = 0; i < n; i++) {
   if (ids[i] > 0) {
  struct gl_query_object *q = _mesa_lookup_query_object(ctx, ids[i]);
  if (q) {
-ASSERT(!q->Active); /* should be caught earlier */
 _mesa_HashRemove(ctx->Query.QueryObjects, ids[i]);
 ctx->Driver.DeleteQuery(ctx, q);
  }
-- 
1.8.4.rc3



[Mesa-dev] [Bug 70581] ./autogen.sh warnings in Mesa build system since commit 4e9028b

2013-10-17 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=70581

Emil Velikov  changed:

   What|Removed |Added

   Assignee|mesa-dev@lists.freedesktop. |emil.l.veli...@gmail.com
   |org |

--- Comment #1 from Emil Velikov  ---
Thanks for reporting this, Johannes. I wonder how many people noticed this
(btw, you're not the first one :)

AFAICS these warnings are harmless and should not cause any issues. There is a
patch/fix for it (sitting for about 2 weeks) and I'll include it in my next
automake(build systems) patch set.



Re: [Mesa-dev] [PATCH] Remove error when calling glGenQueries/glDeleteQueries while a query is active

2013-10-17 Thread Brian Paul

On 10/17/2013 12:14 PM, Carl Worth wrote:

> There is nothing in the OpenGL specification which prevents the user from
> calling glGenQueries to generate a new query object while another object is
> active. Neither is there anything in the Mesa implementation which prevents
> this. So remove the INVALID_OPERATION errors in this case.


I guess I wrote that code back when I implemented occlusion queries.

But from http://www.opengl.org/registry/specs/ARB/occlusion_query.txt:

"""
Calling either GenQueriesARB or DeleteQueriesARB while any query of
any target is active causes an INVALID_OPERATION error to be
generated.
"""
(it's about half-way down in the file)  It's also mentioned in the 
"Errors" section.



Maybe that was rescinded since that spec was done.  If so, I'm fine with 
removing the code.


However, I wouldn't be surprised if our drivers crashed and burned if an 
active query is deleted.  gl_query_object isn't reference counted.


-Brian




> Similarly, it is explicitly allowed by the OpenGL specification to delete an
> active query, so remove the assertion for that case.
>
> CC: 
> ---
>   src/mesa/main/queryobj.c | 15 ---
>   1 file changed, 15 deletions(-)
>
> diff --git a/src/mesa/main/queryobj.c b/src/mesa/main/queryobj.c
> index a180133..c98b2c7 100644
> --- a/src/mesa/main/queryobj.c
> +++ b/src/mesa/main/queryobj.c
> @@ -202,13 +202,6 @@ _mesa_GenQueries(GLsizei n, GLuint *ids)
> return;
>  }
>
> -   /* No query objects can be active at this time! */
> -   if (ctx->Query.CurrentOcclusionObject ||
> -   ctx->Query.CurrentTimerObject) {
> -  _mesa_error(ctx, GL_INVALID_OPERATION, "glGenQueriesARB");
> -  return;
> -   }
> -
>  first = _mesa_HashFindFreeKeyBlock(ctx->Query.QueryObjects, n);
>  if (first) {
> GLsizei i;
> @@ -241,18 +234,10 @@ _mesa_DeleteQueries(GLsizei n, const GLuint *ids)
> return;
>  }
>
> -   /* No query objects can be active at this time! */
> -   if (ctx->Query.CurrentOcclusionObject ||
> -   ctx->Query.CurrentTimerObject) {
> -  _mesa_error(ctx, GL_INVALID_OPERATION, "glDeleteQueriesARB");
> -  return;
> -   }
> -
>  for (i = 0; i < n; i++) {
> if (ids[i] > 0) {
>    struct gl_query_object *q = _mesa_lookup_query_object(ctx, ids[i]);
>    if (q) {
> -ASSERT(!q->Active); /* should be caught earlier */
>   _mesa_HashRemove(ctx->Query.QueryObjects, ids[i]);
>   ctx->Driver.DeleteQuery(ctx, q);
>    }





[Mesa-dev] strcasecmp() not found w/ MSVC

2013-10-17 Thread Brian Paul

Hi Paul,

It looks like MSVC doesn't have the strcasecmp() function you recently 
employed in src/glsl/glsl_parser.yy.


Looks like the work-around is _stricmp, per
http://stackoverflow.com/questions/3694723/error-c3861-strcasecmp-identifier-not-found-in-visual-studio-2008/3694738#3694738

I'll look into fixing it here, but you might want to take a look too 
(and I might not get to it for a few hours).
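
For reference, a minimal sketch of that kind of work-around, assuming the only difference that matters is the function name. The names portable_strcasecmp and equal_ignoring_case are made up for this illustration and are not existing Mesa helpers:

```c
#include <string.h>

#if defined(_MSC_VER)
/* MSVC has no strcasecmp(); its equivalent is _stricmp() from <string.h>. */
#define portable_strcasecmp _stricmp
#else
#include <strings.h>   /* POSIX header that declares strcasecmp() */
#define portable_strcasecmp strcasecmp
#endif

/* Returns nonzero when the two strings are equal ignoring case. */
static int
equal_ignoring_case(const char *s1, const char *s2)
{
   return portable_strcasecmp(s1, s2) == 0;
}
```

A caller would then write equal_ignoring_case(a, b) instead of calling strcasecmp() directly, so the compiler check lives in one place.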


-Brian


Re: [Mesa-dev] [PATCH 4/8] glsl: Add new builtins required by GL_ARB_sample_shading

2013-10-17 Thread Anuj Phogat
On Wed, Oct 16, 2013 at 11:25 PM, Kenneth Graunke  wrote:
> On 10/16/2013 03:49 PM, Ian Romanick wrote:
> [snip]
>> You are completely correct here.  We should check what other vendors do.
>>  I think 5 tests will tell us everything we need to know (then we'll
>> probably submit some spec bugs).  I don't have any non-Intel hardware
>> that supports either extension, so I can't actually try any of these.
>
> Thanks for the tests, Ian!
>
> I ran them on my Radeon HD 6870 with Catalyst 13.10 by doing:
I ran these tests on NVIDIA GTX 650 OpenGL 4.3 drivers (319.17)

> $ PIGLIT_PLATFORM=glx glslparsertest --check-link 1.frag pass
>
>> 1. Does this compile, link, and run:
>>
>> #version 130
>> #extension GL_ARB_sample_shading: require
>>
>> out vec4 color;
>>
>> void main() {
>> color = vec4(gl_SampleMask.length());
>> gl_SampleMask[0] = ~0;
>> }
>
> Successfully compiled and linked fragment shader 1.frag: (no compiler output)
>
NVIDIA:
Successfully compiled and linked fragment shader 1.frag: (no compiler output)
Mesa:
Successfully compiled and linked fragment shader 1.frag: (no compiler output)
>> 2. Does this compile and link:
>>
>> #version 130
>> #extension GL_ARB_sample_shading: require
>>
>> out vec4 color;
>>
>> void main() {
>> color = vec4(1);
>> gl_SampleMask[0] = ~0;
>> gl_SampleMask[1] = ~0;
>> }
>
> Failed to compile fragment shader 2.frag:
> Fragment shader failed to compile with the following errors:
> ERROR: 0:9: error(#147) "[" array index out of range: '1'
> ERROR: error(#273) 1 compilation errors.  No code generated
>
NVIDIA:
Failed to link: Fragment info
-
0(9) : warning C1068: array index out of bounds
0(9) : warning C1068: array index out of bounds
0(9) : warning C1068: array index out of bounds
0(9) : warning C1068: array index out of bounds
0(9) : warning C1068: array index out of bounds
0(9) : error C1068: array index out of bounds

Failed to link fragment shader 2.frag: 0(9) : warning C1068: array
index out of bounds
0(9) : warning C1068: array index out of bounds
(compiler prints shader source here)

Mesa:
Failed to compile fragment shader 2.frag: 0:9(17): error: array index
must be < 1
>> 3. Does this compile and link:
>>
>> #version 130
>> #extension GL_ARB_sample_shading: require
>> #extension GL_ARB_gpu_shader5: require
>>
>> out vec4 color;
>>
>> void main() {
>> color = vec4(1);
>> gl_SampleMask = gl_SampleMaskIn;
>> }
>
> Successfully compiled and linked fragment shader 3.frag: (no compiler output)
>
NVIDIA:
Successfully compiled and linked fragment shader 3.frag: (no compiler output)
Mesa:
Failed to compile fragment shader 3.frag: 0:3(12): error: extension
`GL_ARB_gpu_shader5' unsupported in fragment shader
>> 4. Does this compile and link:
>>
>> #version 130
>> #extension GL_ARB_sample_shading: require
>>
>> out vec4 color;
>> in int gl_SampleMask[1];
>>
>> void main() {
>> color = vec4(1);
>> gl_SampleMask[0] = ~0;
>> }
>
> Successfully compiled and linked fragment shader 4.frag:
> WARNING: 0:5: warning(#375) Redeclaration of  built-in name: gl_SampleMask
> WARNING: 0:9: warning(#398) l-value required: assign "gl_SampleMask" (can't 
> modify an input)
>
> Presumably you meant "out int gl_SampleMask[1];" instead.  With that change,
>
> Successfully compiled and linked fragment shader 4b.frag:
> WARNING: 0:5: warning(#375) Redeclaration of  built-in name: gl_SampleMask
>
With  "out int gl_SampleMask[1];" change:
NVIDIA:
Successfully compiled and linked fragment shader 4.frag: (no compiler output)
Mesa:
mesa: Failed to compile fragment shader 4.frag: 0:0(0): error:
`gl_SampleMask' redeclared
>> 5. Does this compile and link:
>>
>> #version 130
>> #extension GL_ARB_sample_shading: require
>>
>> out vec4 color;
>> in int gl_SampleMask[2];
>>
>> void main() {
>> color = vec4(1);
>> gl_SampleMask[0] = ~0;
>> gl_SampleMask[1] = ~0;
>> }
>
> Successfully compiled and linked fragment shader 5.frag:
> WARNING: 0:5: warning(#375) Redeclaration of  built-in name: gl_SampleMask
> WARNING: 0:9: warning(#398) l-value required: assign "gl_SampleMask" (can't 
> modify an input)
> WARNING: 0:10: warning(#398) l-value required: assign "gl_SampleMask" (can't 
> modify an input)
>
> Again changing "in" to "out":
>
> Successfully compiled and linked fragment shader 5b.frag:
> WARNING: 0:5: warning(#375) Redeclaration of  built-in name: gl_SampleMask
>
NVIDIA:
Failed to link: Fragment info
-
0(5) : error C5102: output semantic attribute "SAMPLEMASK" has too big
of a numeric index (1)
Failed to link fragment shader 5.frag:

Mesa:
mesa: Failed to compile fragment shader 5.frag: 0:0(0): error:
`gl_SampleMask' redeclared
0:10(17): error: array index must be < 1

So AMD and NVIDIA give similar output for the above test cases, except
for 5.frag. They both seem to use a fixed-length builtin array:
gl_SampleMask[1]. Both allow redeclaring builtin variables. Mesa doesn't
seem to allow any redeclaration of builtin variables.

> I haven't actua

Re: [Mesa-dev] [PATCH] Remove error when calling glGenQueries/glDeleteQueries while a query is active

2013-10-17 Thread Carl Worth
There is nothing in the OpenGL specification which prevents the user from
calling glGenQueries to generate a new query object while another object is
active. Neither is there anything in the Mesa implementation which prevents
this. So remove the INVALID_OPERATION errors in this case.

Similarly, it is explicitly allowed by the OpenGL specification to delete an
active query, so remove the assertion for that case and be sure to call the
driver's EndQuery hook.

CC: 
---

Brian Paul  writes:
> On 10/17/2013 12:14 PM, Carl Worth wrote:
> But from http://www.opengl.org/registry/specs/ARB/occlusion_query.txt:
>
> """
>  Calling either GenQueriesARB or DeleteQueriesARB while any query of
>  any target is active causes an INVALID_OPERATION error to be
>  generated.
> """
> (it's about half-way down in the file)  It's also mentioned in the 
> "Errors" section.

Thanks, Brian. That certainly does justify where the original code came
from.

> Maybe that was rescinded since that spec was done.  If so, I'm fine with 
> removing the code.

I can't find any similar error requirement in the OpenGL 4.4 (Core)
specification. (And with the increased number of different query types,
it doesn't seem that the error requirement makes sense.) I did find the
following sentence in the specification (section 4.2):

If an active query object is deleted its name immediately
becomes unused, but the underlying object is not deleted until
it is no longer active.

This sentence presumes the possibility of deleting an active query, so I
think it is reasonable to remove the error.

> However, I wouldn't be surprised if our drivers crashed and burned if an 
> active query is deleted.  gl_query_object isn't reference counted.

Thanks for the catch. My revised patch below calls the driver's EndQuery
hook, which will hopefully avoid this problem. This does inactivate the
query immediately at delete time, which might seem inconsistent with the
sentence I quoted from the specification above. But I think this is fine
since once the object is deleted the user has no visibility into whether
the object is active or not.
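
To make the ordering explicit, here is a small stand-alone model of the delete path described above. It is only a sketch: struct toy_query and the toy_* functions are hypothetical stand-ins for gl_query_object and the driver's EndQuery/DeleteQuery hooks, and the extra flags exist only so the order of calls can be observed.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for gl_query_object; the field names are illustrative only. */
struct toy_query {
   bool active;    /* query is currently between begin and end */
   bool ended;     /* set when the end hook has run */
   bool deleted;   /* set when the delete hook has run */
};

/* Hypothetical driver hooks standing in for EndQuery and DeleteQuery. */
static void toy_end_query(struct toy_query *q)    { q->ended = true; }
static void toy_delete_query(struct toy_query *q) { q->deleted = true; }

/* Delete a query, ending it first if it is still active. */
static void
toy_delete(struct toy_query *q)
{
   if (q == NULL)
      return;

   if (q->active) {
      q->active = false;
      toy_end_query(q);
   }
   toy_delete_query(q);
}
```

An inactive query goes straight to the delete hook; an active one is ended first, so the driver never sees a delete for a query it still considers running.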

-Carl

 src/mesa/main/queryobj.c | 19 ---
 1 file changed, 4 insertions(+), 15 deletions(-)

diff --git a/src/mesa/main/queryobj.c b/src/mesa/main/queryobj.c
index a180133..ebdb71c 100644
--- a/src/mesa/main/queryobj.c
+++ b/src/mesa/main/queryobj.c
@@ -202,13 +202,6 @@ _mesa_GenQueries(GLsizei n, GLuint *ids)
   return;
}
 
-   /* No query objects can be active at this time! */
-   if (ctx->Query.CurrentOcclusionObject ||
-   ctx->Query.CurrentTimerObject) {
-  _mesa_error(ctx, GL_INVALID_OPERATION, "glGenQueriesARB");
-  return;
-   }
-
first = _mesa_HashFindFreeKeyBlock(ctx->Query.QueryObjects, n);
if (first) {
   GLsizei i;
@@ -241,18 +234,14 @@ _mesa_DeleteQueries(GLsizei n, const GLuint *ids)
   return;
}
 
-   /* No query objects can be active at this time! */
-   if (ctx->Query.CurrentOcclusionObject ||
-   ctx->Query.CurrentTimerObject) {
-  _mesa_error(ctx, GL_INVALID_OPERATION, "glDeleteQueriesARB");
-  return;
-   }
-
for (i = 0; i < n; i++) {
   if (ids[i] > 0) {
  struct gl_query_object *q = _mesa_lookup_query_object(ctx, ids[i]);
  if (q) {
-ASSERT(!q->Active); /* should be caught earlier */
+if (q->Active) {
+   q->Active = GL_FALSE;
+   ctx->Driver.EndQuery(ctx, q);
+}
 _mesa_HashRemove(ctx->Query.QueryObjects, ids[i]);
 ctx->Driver.DeleteQuery(ctx, q);
  }
-- 
1.8.4.rc3





Re: [Mesa-dev] strcasecmp() not found w/ MSVC

2013-10-17 Thread Paul Berry
On 17 October 2013 12:35, Brian Paul  wrote:

> Hi Paul,
>
> It looks like MSVC doesn't have the strcasecmp() function you recently
> employed in src/glsl/glsl_parser.yy.
>
> Looks like the work-around is _stricmp, per
> http://stackoverflow.com/questions/3694723/error-c3861-strcasecmp-identifier-not-found-in-visual-studio-2008/3694738#3694738
>
> I'll look into fixing it here, but you might want to take a look too (and
> I might not get to it for a few hours).
>
> -Brian
>

It should be an easy enough fix.  I'll send out a patch.


[Mesa-dev] [PATCH] glsl: Fix MSVC build (missing strcasecmp())

2013-10-17 Thread Paul Berry
MSVC doesn't have a strcasecmp() function; it uses _stricmp() instead.
---
 src/glsl/glsl_parser.yy | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/src/glsl/glsl_parser.yy b/src/glsl/glsl_parser.yy
index ba2dc63..00589e2 100644
--- a/src/glsl/glsl_parser.yy
+++ b/src/glsl/glsl_parser.yy
@@ -66,8 +66,14 @@ static bool match_layout_qualifier(const char *s1, const 
char *s2,
 */
if (state->es_shader)
   return strcmp(s1, s2);
-   else
+   else {
+#if defined(_MSC_VER)
+  /* MSVC doesn't have a strcasecmp() function; instead it has _stricmp. */
+  return _stricmp(s1, s2);
+#else
   return strcasecmp(s1, s2);
+#endif
+   }
 }
 %}
 
-- 
1.8.4.1



[Mesa-dev] [Bug 70591] New: glxext.h:275: error: redefinition of typedef ‘GLXContextID’

2013-10-17 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=70591

  Priority: medium
Bug ID: 70591
  Keywords: regression
CC: bri...@vmware.com
  Assignee: mesa-dev@lists.freedesktop.org
   Summary: glxext.h:275: error: redefinition of typedef
‘GLXContextID’
  Severity: blocker
Classification: Unclassified
OS: Linux (All)
  Reporter: v...@freedesktop.org
  Hardware: x86-64 (AMD64)
Status: NEW
   Version: git
 Component: Other
   Product: Mesa

mesa: b3360d23ac1db61390b2ac8963756c6133ba6e23 (master)

  CC clientattrib.lo
In file included from ../../include/GL/glx.h:333,
 from glxclient.h:45,
 from clientattrib.c:32:
../../include/GL/glxext.h:275: error: redefinition of typedef ‘GLXContextID’
../../include/GL/glx.h:171: note: previous declaration of ‘GLXContextID’ was
here



Re: [Mesa-dev] [PATCH] glsl: Fix MSVC build (missing strcasecmp())

2013-10-17 Thread Jose Fonseca
Looks good. Thanks.

Reviewed-by: Jose Fonseca 

Jose

- Original Message -
> MSVC doesn't have a strcasecmp() function; it uses _stricmp() instead.
> ---
>  src/glsl/glsl_parser.yy | 8 +++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
> 
> diff --git a/src/glsl/glsl_parser.yy b/src/glsl/glsl_parser.yy
> index ba2dc63..00589e2 100644
> --- a/src/glsl/glsl_parser.yy
> +++ b/src/glsl/glsl_parser.yy
> @@ -66,8 +66,14 @@ static bool match_layout_qualifier(const char *s1, const
> char *s2,
>  */
> if (state->es_shader)
>return strcmp(s1, s2);
> -   else
> +   else {
> +#if defined(_MSC_VER)
> +  /* MSVC doesn't have a strcasecmp() function; instead it has _stricmp.
> */
> +  return _stricmp(s1, s2);
> +#else
>return strcasecmp(s1, s2);
> +#endif
> +   }
>  }
>  %}
>  
> --
> 1.8.4.1
> 


[Mesa-dev] Patch for include/GL/gl.h

2013-10-17 Thread Baptist BENOIST
Hello,

Attached is a patch for the include/GL/gl.h file.

This patch fixes a build failure with GCC when the -DGL_GLEXT_LEGACY, 
-Werror and -Wundef flags are used together. I noticed the problem with 
Qt 5.1.1 (which I am packaging for NixOS), but it will occur in any build 
that combines these three flags.

To clarify things:

-Wundef tells the compiler to warn whenever an undefined macro is evaluated 
in an #if directive.
-Werror tells the compiler to treat every warning as an error.

With these options, you cannot do:

#if THE_DEFINITION

when THE_DEFINITION has not been previously defined.

You must do:

#if defined(THE_DEFINITION) && THE_DEFINITION


Feel free to ask me anything about this ;-)

Regards,

Baptist   

werror-wundef.patch
Description: Binary data


[Mesa-dev] [PATCH 1/6] Float fbconfigs exchange patch [1/6] Improved handling of renderType attribute in GLX structures.

2013-10-17 Thread Czarnowski, Daniel
From bf60ddd081bd66d1364bf7973e664af46335dfd6 Mon Sep 17 00:00:00 2001
From: Daniel Czarnowski 
Date: Wed, 16 Oct 2013 13:35:20 +0200
Subject: [PATCH] Support of GLX_RGBA*_FLOAT_BIT*, and correct setting of the
flags. Also commented each renderType use with information
which (fbconfig or context) RENDER_TYPE it is.

---
glx/createcontext.c |2 ++
glx/glxext.h|   15 +++
2 files changed, 17 insertions(+)

diff --git a/glx/createcontext.c b/glx/createcontext.c
index 13d21cc..41ecd11 100644
--- a/glx/createcontext.c
+++ b/glx/createcontext.c
@@ -68,6 +68,8 @@ validate_render_type(uint32_t render_type)
 switch (render_type) {
 case GLX_RGBA_TYPE:
 case GLX_COLOR_INDEX_TYPE:
+case GLX_RGBA_FLOAT_TYPE_ARB:
+case GLX_RGBA_UNSIGNED_FLOAT_TYPE_EXT:
 return True;
 default:
 return False;
diff --git a/glx/glxext.h b/glx/glxext.h
index 9b0978b..2d67af3 100644
--- a/glx/glxext.h
+++ b/glx/glxext.h
@@ -35,6 +35,21 @@
  * Silicon Graphics, Inc.
  */
+// doing #include  & #include  could cause problems with
+// overlapping definitions, so let's use the easy way
+#ifndef GLX_RGBA_FLOAT_BIT_ARB
+#define GLX_RGBA_FLOAT_BIT_ARB 0x0004
+#endif
+#ifndef GLX_RGBA_FLOAT_TYPE_ARB
+#define GLX_RGBA_FLOAT_TYPE_ARB0x20B9
+#endif
+#ifndef GLX_RGBA_UNSIGNED_FLOAT_BIT_EXT
+#define GLX_RGBA_UNSIGNED_FLOAT_BIT_EXT0x0008
+#endif
+#ifndef GLX_RGBA_UNSIGNED_FLOAT_TYPE_EXT
+#define GLX_RGBA_UNSIGNED_FLOAT_TYPE_EXT   0x20B1
+#endif
+
extern GLboolean __glXFreeContext(__GLXcontext * glxc);
extern void __glXFlushContextCache(void);
--
1.7.10.4



Intel Technology Poland sp. z o.o.
ul. Slowackiego 173 | 80-298 Gdansk | Sad Rejonowy Gdansk Polnoc | VII Wydzial 
Gospodarczy Krajowego Rejestru Sadowego - KRS 101882 | NIP 957-07-52-316 | 
Kapital zakladowy 200.000 PLN.

Ta wiadomosc wraz z zalacznikami jest przeznaczona dla okreslonego adresata i 
moze zawierac informacje poufne. W razie przypadkowego otrzymania tej 
wiadomosci, prosimy o powiadomienie nadawcy oraz trwale jej usuniecie; 
jakiekolwiek przegladanie lub rozpowszechnianie jest zabronione.
This e-mail and any attachments may contain confidential material for the sole 
use of the intended recipient(s). If you are not the intended recipient, please 
contact the sender and delete all copies; any review or distribution by others 
is strictly prohibited.


[Mesa-dev] [PATCH 2/6] Float fbconfigs exchange patch [2/6] Improved handling of renderType attribute in GLX structures. (DMX)

2013-10-17 Thread Czarnowski, Daniel
From f93a54e7447a6f30c68d31ac82637e724c3953eb Mon Sep 17 00:00:00 2001
From: Daniel Czarnowski 
Date: Wed, 16 Oct 2013 13:36:33 +0200
Subject: [PATCH]  Support of GLX_RGBA*_FLOAT_BIT*, and correct setting of the
 flags. Also commented each renderType use with information  which (fbconfig
or context) RENDER_TYPE it is.  Changes in DMX component.

---
hw/dmx/dmx_glxvisuals.c   |  4 +++-
hw/dmx/glxProxy/glxcmds.c | 36 +---
2 files changed, 32 insertions(+), 8 deletions(-)

diff --git a/hw/dmx/dmx_glxvisuals.c b/hw/dmx/dmx_glxvisuals.c
index 56bd67b..0cd5f21 100644
--- a/hw/dmx/dmx_glxvisuals.c
+++ b/hw/dmx/dmx_glxvisuals.c
@@ -448,7 +448,9 @@ GetGLXFBConfigs(Display * dpy, int glxMajorOpcode, int 
*nconfigs)
 /* Fill in derived values */
 config->screen = screen;
-config->rgbMode = config->renderType & GLX_RGBA_BIT;
+/* The rgbMode should be true for any mode which has distinguishible 
R, G and B components */
+config->rgbMode = (config->renderType & (GLX_RGBA_BIT |
+ GLX_RGBA_FLOAT_BIT_ARB | 
GLX_RGBA_UNSIGNED_FLOAT_BIT_EXT)) != 0;
 config->colorIndexMode = !config->rgbMode;
 config->haveAccumBuffer =
diff --git a/hw/dmx/glxProxy/glxcmds.c b/hw/dmx/glxProxy/glxcmds.c
index 8cdb25e..fb56017 100644
--- a/hw/dmx/glxProxy/glxcmds.c
+++ b/hw/dmx/glxProxy/glxcmds.c
@@ -123,6 +123,21 @@ GetBackEndDisplay(__GLXclientState * cl, int s)
 return cl->be_displays[s];
}
+/*  Convert the render type bits from fbconfig into context render type. */
+static int convFBconfRenderTypeBits2CtxRenderType(int fbRenderType)
+{
+if (fbRenderType & GLX_RGBA_BIT)
+return GLX_RGBA_TYPE;
+if (fbRenderType & GLX_COLOR_INDEX_BIT)
+return  GLX_COLOR_INDEX_TYPE;
+if (fbRenderType & GLX_RGBA_FLOAT_BIT_ARB)
+return GLX_RGBA_FLOAT_TYPE_ARB;
+if (fbRenderType & GLX_RGBA_UNSIGNED_FLOAT_BIT_EXT)
+return GLX_RGBA_UNSIGNED_FLOAT_TYPE_EXT;
+/* There's no recognized renderType in the config */
+return GLX_RGBA_TYPE;
+}
+
/*
** Create a GL context with the given properties.
*/
@@ -308,12 +323,13 @@ CreateContext(__GLXclientState * cl,
 /* send the create context request to the back-end server */
 dpy = GetBackEndDisplay(cl, screen);
 if (glxc->pFBConfig) {
-/*Since for a certain visual both RGB and COLOR INDEX
- *can be on then the only parmeter to choose the renderType
- * should be the class of the colormap since all 4 first
- * classes does not support RGB mode only COLOR INDEX ,
- * and so TrueColor and DirectColor does not support COLOR INDEX*/
-int renderType = glxc->pFBConfig->renderType;
+/* For a specific visual, multiple render types (i.e. both RGB and COLOR INDEX)
+ * can be accessible. The only parameter to choose the renderType
+ * should be the class of the colormap, since the first 4 classes
+ * do not support RGB mode, only COLOR INDEX,
+ * and TrueColor and DirectColor do not support COLOR INDEX.
+ */
+ int renderType = GLX_RGBA_TYPE;
 if (pVisual) {
 switch (pVisual->class) {
@@ -329,7 +345,10 @@ CreateContext(__GLXclientState * cl,
 renderType = GLX_RGBA_TYPE;
 break;
 }
+} else {
+renderType = convFBconfRenderTypeBits2CtxRenderType(glxc->pFBConfig->renderType);
 }
+
 if (__GLX_IS_VERSION_SUPPORTED(1, 3)) {
 LockDisplay(dpy);
 GetReq(GLXCreateNewContext, be_new_req);
@@ -3185,6 +3204,7 @@ __glXQueryContext(__GLXclientState * cl, GLbyte * pc)
 __GLXcontext *ctx;
 xGLXQueryContextReq *req;
 xGLXQueryContextReply reply;
+int renderType;
 int nProps;
 int *sendBuf, *pSendBuf;
 int nReplyBytes;
@@ -3197,6 +3217,8 @@ __glXQueryContext(__GLXclientState * cl, GLbyte * pc)
 return __glXBadContext;
 }
+renderType = convFBconfRenderTypeBits2CtxRenderType(ctx->pFBConfig->renderType);
+
 nProps = 3;
 reply = (xGLXQueryContextReply) {
@@ -3212,7 +3234,7 @@ __glXQueryContext(__GLXclientState * cl, GLbyte * pc)
 *pSendBuf++ = GLX_FBCONFIG_ID;
 *pSendBuf++ = (int) (ctx->pFBConfig->id);
 *pSendBuf++ = GLX_RENDER_TYPE;
-*pSendBuf++ = (int) (ctx->pFBConfig->renderType);
+*pSendBuf++ = renderType; /* context render type (one of GLX_*_TYPE values) */
 *pSendBuf++ = GLX_SCREEN;
 *pSendBuf++ = (int) (ctx->pScreen->myNum);
--
1.8.1.2
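
The mapping from fbconfig render-type bits to a single context render type in the patch above can be sketched in isolation. This is a minimal illustration only: the SKETCH_ constants and function names are hypothetical stand-ins, and their numeric values are assumptions for the example rather than values copied from the GLX headers.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the GLX_*_BIT flags; values are assumptions. */
enum {
   SKETCH_RGBA_BIT                = 1 << 0,
   SKETCH_COLOR_INDEX_BIT         = 1 << 1,
   SKETCH_RGBA_FLOAT_BIT          = 1 << 2,
   SKETCH_RGBA_UNSIGNED_FLOAT_BIT = 1 << 3
};

/* Hypothetical stand-ins for the GLX_*_TYPE context render types. */
enum sketch_ctx_type {
   SKETCH_RGBA_TYPE,
   SKETCH_COLOR_INDEX_TYPE,
   SKETCH_RGBA_FLOAT_TYPE,
   SKETCH_RGBA_UNSIGNED_FLOAT_TYPE
};

/* True for any mode that has distinguishable R, G and B components. */
static bool
sketch_rgb_mode(int fb_render_type)
{
   assert(fb_render_type >= 0);
   return (fb_render_type & (SKETCH_RGBA_BIT |
                             SKETCH_RGBA_FLOAT_BIT |
                             SKETCH_RGBA_UNSIGNED_FLOAT_BIT)) != 0;
}

/* Pick one context render type from a bitmask; the first match wins and an
 * empty mask falls back to RGBA, mirroring the order used in the patch. */
static enum sketch_ctx_type
sketch_ctx_render_type(int fb_render_type)
{
   if (fb_render_type & SKETCH_RGBA_BIT)
      return SKETCH_RGBA_TYPE;
   if (fb_render_type & SKETCH_COLOR_INDEX_BIT)
      return SKETCH_COLOR_INDEX_TYPE;
   if (fb_render_type & SKETCH_RGBA_FLOAT_BIT)
      return SKETCH_RGBA_FLOAT_TYPE;
   if (fb_render_type & SKETCH_RGBA_UNSIGNED_FLOAT_BIT)
      return SKETCH_RGBA_UNSIGNED_FLOAT_TYPE;
   return SKETCH_RGBA_TYPE;
}
```

Because the checks are ordered, a config that advertises both RGBA and COLOR INDEX resolves to the RGBA type, which is why the caller still consults the visual class when one is available.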


Intel Technology Poland sp. z o.o.
ul. Slowackiego 173 | 80-298 Gdansk | Sad Rejonowy Gdansk Polnoc | VII Wydzial 
Gospodarczy Krajowego Rejestru Sadowego - KRS 101882 | NIP 957-07-52-316 | 
Kapital zakladowy 200.000 PLN.


[Mesa-dev] [PATCH 3/6] Float fbconfigs exchange patch [3/6] Extension string enabling.

2013-10-17 Thread Czarnowski, Daniel
From 762d184225d5f4c6c46ff302725063483ebd8bf0 Mon Sep 17 00:00:00 2001
From: Daniel Czarnowski 
Date: Thu, 17 Oct 2013 12:27:15 +0200
Subject: [PATCH] Enables the fbconfig_float extension in list of supported
extensions, and adds it to known extensions table.

---
glx/extension_string.c |5 +
glx/extension_string.h |1 +
glx/glxdri2.c  |   10 ++
3 files changed, 16 insertions(+)

diff --git a/glx/extension_string.c b/glx/extension_string.c
index 58f930f..1b2c7d9 100644
--- a/glx/extension_string.c
+++ b/glx/extension_string.c
@@ -65,6 +65,10 @@ struct extension_info {
 unsigned char driver_support;
};
+/**
+ * List of known GLX Extensions.
+ * The last Y/N switch indicates whether support for this extension is always enabled.
+ */
static const struct extension_info known_glx_extensions[] = {
/*   GLX_ARB_get_proc_address is implemented on the client. */
 /* *INDENT-OFF* */
@@ -74,6 +78,7 @@ static const struct extension_info known_glx_extensions[] = {
 { GLX(ARB_framebuffer_sRGB),VER(0,0), N, },
 { GLX(ARB_multisample), VER(1,4), Y, },
+{ GLX(ARB_fbconfig_float),  VER(0,0), N, },
 { GLX(EXT_create_context_es2_profile), VER(0,0), N, },
 { GLX(EXT_framebuffer_sRGB),VER(0,0), N, },
 { GLX(EXT_import_context),  VER(0,0), Y, },
diff --git a/glx/extension_string.h b/glx/extension_string.h
index 81b7de3..3bec1b1 100644
--- a/glx/extension_string.h
+++ b/glx/extension_string.h
@@ -41,6 +41,7 @@ enum {
 ARB_create_context_robustness_bit,
 ARB_framebuffer_sRGB_bit,
 ARB_multisample_bit,
+ARB_fbconfig_float_bit,
 EXT_create_context_es2_profile_bit,
 EXT_import_context_bit,
 EXT_texture_from_pixmap_bit,
diff --git a/glx/glxdri2.c b/glx/glxdri2.c
index 8a1fa41..2315761 100644
--- a/glx/glxdri2.c
+++ b/glx/glxdri2.c
@@ -634,6 +634,10 @@ __glXDRIscreenCreateContext(__GLXscreen * baseScreen,
 return &context->base;
}
+/**
+ * Initializes extension flags in glx_enable_bits when a new screen is created.
+ * @param screen The screen where glx_enable_bits are to be set.
+ */
static void
__glXDRIinvalidateBuffers(DrawablePtr pDraw, void *priv, XID id)
{
@@ -889,6 +893,12 @@ initializeExtensions(__GLXDRIscreen * screen)
 LogMessage(X_INFO, "AIGLX: enabled GLX_EXT_framebuffer_sRGB\n");
 }
+/* enable ARB_fbconfig_float extension (even if there are no float fbconfigs) */
+{
+__glXEnableExtension(screen->glx_enable_bits, "GLX_ARB_fbconfig_float");
+LogMessage(X_INFO, "AIGLX: enabled GLX_ARB_fbconfig_float\n");
+}
+
 for (i = 0; extensions[i]; i++) {
#ifdef __DRI_READ_DRAWABLE
 if (strcmp(extensions[i]->name, __DRI_READ_DRAWABLE) == 0) {
--
1.7.10.4
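
The idea of a known-extensions table plus a per-screen bitfield of enabled extensions, as used by the patch above, can be sketched as follows. This is a simplified illustration under assumptions: the table layout, the sketch_ names and the two-entry extension list are hypothetical and do not reproduce the X server's actual data structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical bit indices, one per known extension. */
enum {
   SKETCH_ARB_multisample_bit,
   SKETCH_ARB_fbconfig_float_bit,
   SKETCH_EXT_COUNT
};

/* Hypothetical name table, indexed by the bit numbers above. */
static const char *const sketch_ext_names[SKETCH_EXT_COUNT] = {
   "GLX_ARB_multisample",
   "GLX_ARB_fbconfig_float"
};

/* Set the bit for a known extension name; unknown names are ignored. */
static void
sketch_enable_extension(unsigned char *enable_bits, const char *name)
{
   assert(enable_bits != NULL && name != NULL);
   for (int i = 0; i < SKETCH_EXT_COUNT; i++) {
      if (strcmp(sketch_ext_names[i], name) == 0) {
         enable_bits[i / 8] |= (unsigned char) (1u << (i % 8));
         return;
      }
   }
}

/* Test whether the bit for a given extension index is set. */
static bool
sketch_is_enabled(const unsigned char *enable_bits, int bit)
{
   assert(bit >= 0 && bit < SKETCH_EXT_COUNT);
   return (enable_bits[bit / 8] & (1u << (bit % 8))) != 0;
}
```

An extension has to appear in the name table before it can be enabled by name, which is why the patch adds both a table entry and a bit index before calling the enable function.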



This e-mail and any attachments may contain confidential material for the sole 
use of the intended recipient(s). If you are not the intended recipient, please 
contact the sender and delete all copies; any review or distribution by others 
is strictly prohibited.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] glsl/linker: Allow mixing of desktop GLSL versions.

2013-10-17 Thread Paul Berry
Previously, Mesa followed the linkage rules outlined in the GLSL
1.20-1.40 specs, which (collectively) said that GLSL versions 1.10 and
1.20 could be linked together, but no other versions could be linked.

In GLSL 4.30, the linkage rules were relaxed so that any two desktop
GLSL versions can be linked together.  This change was made because it
reflected the behaviour of nearly all existing implementations (see
Khronos bug 8463).  Mesa was one of the few (perhaps the only)
exceptions to prohibit cross-linking of some GLSL versions.

Since the GLSL linkage rules were deliberately relaxed in order to
match the behaviour of existing implementations, it seems appropriate
to relax the rules in Mesa too (even though Mesa doesn't support GLSL
4.30 yet).

Note that linking ES and desktop shaders is still prohibited, as is
linking ES shaders having different GLSL versions.

Fixes piglit tests "shaders/version-mixing {interstage,intrastage}".
---
 src/glsl/linker.cpp | 10 +++---
 1 file changed, 3 insertions(+), 7 deletions(-)

diff --git a/src/glsl/linker.cpp b/src/glsl/linker.cpp
index 9095a40..0a949b4 100644
--- a/src/glsl/linker.cpp
+++ b/src/glsl/linker.cpp
@@ -2057,14 +2057,10 @@ link_shaders(struct gl_context *ctx, struct gl_shader_program *prog)
   }
}
 
-   /* Previous to GLSL version 1.30, different compilation units could mix and
-* match shading language versions.  With GLSL 1.30 and later, the versions
-* of all shaders must match.
-*
-* GLSL ES has never allowed mixing of shading language versions.
+   /* In desktop GLSL, different shader versions may be linked together.  In
+* GLSL ES, all shader versions must be the same.
 */
-   if ((is_es_prog || max_version >= 130)
-   && min_version != max_version) {
+   if (is_es_prog && min_version != max_version) {
   linker_error(prog, "all shaders must use same shading "
   "language version\n");
   goto done;
-- 
1.8.4.1
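
The change in the version check can be restated as two small predicates, one for the old rule and one for the relaxed rule. This is a sketch for comparison only; the function names are hypothetical, and versions are written as integers such as 120 or 150 in the way the patch's min_version and max_version are.

```c
#include <assert.h>
#include <stdbool.h>

/* Old rule: mixing is rejected for ES programs and for any program whose
 * highest shader version is 1.30 or later. */
static bool
sketch_versions_may_link_old(bool is_es_prog,
                             unsigned min_version, unsigned max_version)
{
   assert(min_version <= max_version);
   return !((is_es_prog || max_version >= 130) && min_version != max_version);
}

/* Relaxed rule: only ES programs require all versions to be identical. */
static bool
sketch_versions_may_link(bool is_es_prog,
                         unsigned min_version, unsigned max_version)
{
   assert(min_version <= max_version);
   return !(is_es_prog && min_version != max_version);
}
```

For example, a desktop program combining a 1.20 shader with a 1.50 shader is rejected by the old predicate and accepted by the new one, while an ES program mixing versions is rejected by both.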



Re: [Mesa-dev] [PATCH] glsl/linker: Allow mixing of desktop GLSL versions.

2013-10-17 Thread Kenneth Graunke
On 10/17/2013 08:07 PM, Paul Berry wrote:
> Previously, Mesa followed the linkage rules outlined in the GLSL
> 1.20-1.40 specs, which (collectively) said that GLSL versions 1.10 and
> 1.20 could be linked together, but no other versions could be linked.
> 
> In GLSL 4.30, the linkage rules were relaxed so that any two desktop
> GLSL versions can be linked together.  This change was made because it
> reflected the behaviour of nearly all existing implementations (see
> Khronos bug 8463).  Mesa was one of the few (perhaps the only)
> exceptions to prohibit cross-linking of some GLSL versions.
> 
> Since the GLSL linkage rules were deliberately relaxed in order to
> match the behaviour of existing implementations, it seems appropriate
> to relax the rules in Mesa too (even though Mesa doesn't support GLSL
> 4.30 yet).
> 
> Note that linking ES and desktop shaders is still prohibited, as is
> linking ES shaders having different GLSL versions.
> 
> Fixes piglit tests "shaders/version-mixing {interstage,intrastage}".
> ---
>  src/glsl/linker.cpp | 10 +++---
>  1 file changed, 3 insertions(+), 7 deletions(-)
> 
> diff --git a/src/glsl/linker.cpp b/src/glsl/linker.cpp
> index 9095a40..0a949b4 100644
> --- a/src/glsl/linker.cpp
> +++ b/src/glsl/linker.cpp
> @@ -2057,14 +2057,10 @@ link_shaders(struct gl_context *ctx, struct gl_shader_program *prog)
>}
> }
>  
> -   /* Previous to GLSL version 1.30, different compilation units could mix and
> -* match shading language versions.  With GLSL 1.30 and later, the versions
> -* of all shaders must match.
> -*
> -* GLSL ES has never allowed mixing of shading language versions.
> +   /* In desktop GLSL, different shader versions may be linked together.  In
> +* GLSL ES, all shader versions must be the same.
>  */
> -   if ((is_es_prog || max_version >= 130)
> -   && min_version != max_version) {
> +   if (is_es_prog && min_version != max_version) {
>linker_error(prog, "all shaders must use same shading "
>  "language version\n");
>goto done;
> 

Looks good to me.

Reviewed-by: Kenneth Graunke 


[Mesa-dev] [PATCH 2/9] i965: Expose write_reg() as brw_store_register_mem64().

2013-10-17 Thread Kenneth Graunke
Writing a 64-bit register value to memory is sufficiently complicated
that it makes sense to reuse this function rather than duplicating it.

Exposing it outside of gen6_queryobj.c means it needs a more descriptive
function name.  It could probably be moved to brw_util.c or somewhere
else, but this works too.

Signed-off-by: Kenneth Graunke 
---
 src/mesa/drivers/dri/i965/brw_context.h   |  2 ++
 src/mesa/drivers/dri/i965/gen6_queryobj.c | 18 +-
 2 files changed, 11 insertions(+), 9 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_context.h b/src/mesa/drivers/dri/i965/brw_context.h
index 3b95922..0229cc5 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -1460,6 +1460,8 @@ void brw_emit_query_end(struct brw_context *brw);
 
 /** gen6_queryobj.c */
 void gen6_init_queryobj_functions(struct dd_function_table *functions);
+void brw_store_register_mem64(struct brw_context *brw,
+  drm_intel_bo *bo, uint32_t reg, int idx);
 
 /*==
  * brw_state_dump.c
diff --git a/src/mesa/drivers/dri/i965/gen6_queryobj.c b/src/mesa/drivers/dri/i965/gen6_queryobj.c
index dd5cfc2..add4df9 100644
--- a/src/mesa/drivers/dri/i965/gen6_queryobj.c
+++ b/src/mesa/drivers/dri/i965/gen6_queryobj.c
@@ -103,9 +103,9 @@ write_depth_count(struct brw_context *brw, drm_intel_bo *query_bo, int idx)
  * Callers must explicitly flush the pipeline to ensure the desired value is
  * available.
  */
-static void
-write_reg(struct brw_context *brw,
-  drm_intel_bo *query_bo, uint32_t reg, int idx)
+void
+brw_store_register_mem64(struct brw_context *brw,
+ drm_intel_bo *bo, uint32_t reg, int idx)
 {
assert(brw->gen >= 6);
 
@@ -115,14 +115,14 @@ write_reg(struct brw_context *brw,
BEGIN_BATCH(3);
OUT_BATCH(MI_STORE_REGISTER_MEM | (3 - 2));
OUT_BATCH(reg);
-   OUT_RELOC(query_bo, I915_GEM_DOMAIN_RENDER, I915_GEM_DOMAIN_RENDER,
+   OUT_RELOC(bo, I915_GEM_DOMAIN_RENDER, I915_GEM_DOMAIN_RENDER,
  idx * sizeof(uint64_t));
ADVANCE_BATCH();
 
BEGIN_BATCH(3);
OUT_BATCH(MI_STORE_REGISTER_MEM | (3 - 2));
OUT_BATCH(reg + sizeof(uint32_t));
-   OUT_RELOC(query_bo, I915_GEM_DOMAIN_RENDER, I915_GEM_DOMAIN_RENDER,
+   OUT_RELOC(bo, I915_GEM_DOMAIN_RENDER, I915_GEM_DOMAIN_RENDER,
  sizeof(uint32_t) + idx * sizeof(uint64_t));
ADVANCE_BATCH();
 }
@@ -133,19 +133,19 @@ write_primitives_generated(struct brw_context *brw,
 {
intel_batchbuffer_emit_mi_flush(brw);
 
-   write_reg(brw, query_bo, CL_INVOCATION_COUNT, idx);
+   brw_store_register_mem64(brw, query_bo, CL_INVOCATION_COUNT, idx);
 }
 
 static void
 write_xfb_primitives_written(struct brw_context *brw,
- drm_intel_bo *query_bo, int idx)
+ drm_intel_bo *bo, int idx)
 {
intel_batchbuffer_emit_mi_flush(brw);
 
if (brw->gen >= 7) {
-  write_reg(brw, query_bo, GEN7_SO_NUM_PRIMS_WRITTEN(0), idx);
+  brw_store_register_mem64(brw, bo, GEN7_SO_NUM_PRIMS_WRITTEN(0), idx);
} else {
-  write_reg(brw, query_bo, GEN6_SO_NUM_PRIMS_WRITTEN, idx);
+  brw_store_register_mem64(brw, bo, GEN6_SO_NUM_PRIMS_WRITTEN, idx);
}
 }
 
-- 
1.8.3.2
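
The address arithmetic behind storing one 64-bit register as two 32-bit stores can be sketched without any hardware access. The struct and function names below are hypothetical; the sketch only computes which register offset goes to which buffer offset, following the two OUT_RELOC calls in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One 32-bit store: which register offset is written to which byte offset. */
struct sketch_store {
   uint32_t reg;
   size_t   buffer_offset;
};

/* Describe the two 32-bit stores needed to save the 64-bit register at
 * `reg` into slot `idx` of a buffer of uint64_t values. */
static void
sketch_store_register_mem64(uint32_t reg, int idx, struct sketch_store out[2])
{
   assert(idx >= 0);

   /* Low dword goes to the start of the slot. */
   out[0].reg = reg;
   out[0].buffer_offset = (size_t) idx * sizeof(uint64_t);

   /* High dword: next register dword, next four bytes of the slot. */
   out[1].reg = reg + (uint32_t) sizeof(uint32_t);
   out[1].buffer_offset = sizeof(uint32_t) + (size_t) idx * sizeof(uint64_t);
}
```

With a register at 0x2310 (an arbitrary example value) and idx 3, the stores target byte offsets 24 and 28, filling the fourth 64-bit slot of the buffer.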



[Mesa-dev] [PATCH 1/9] i965: Move flushing out of write_reg and into the callers.

2013-10-17 Thread Kenneth Graunke
The current callers just want to write a single register, so combining
the register read with a pipeline flush made sense.  However, in the
future we'll want to do multiple register reads back to back, and we'll
only want to flush once.

Signed-off-by: Kenneth Graunke 
---
 src/mesa/drivers/dri/i965/gen6_queryobj.c | 12 
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/gen6_queryobj.c b/src/mesa/drivers/dri/i965/gen6_queryobj.c
index 498b187..dd5cfc2 100644
--- a/src/mesa/drivers/dri/i965/gen6_queryobj.c
+++ b/src/mesa/drivers/dri/i965/gen6_queryobj.c
@@ -98,8 +98,10 @@ write_depth_count(struct brw_context *brw, drm_intel_bo *query_bo, int idx)
  * Write an arbitrary 64-bit register to a buffer via MI_STORE_REGISTER_MEM.
  *
  * Only TIMESTAMP and PS_DEPTH_COUNT have special PIPE_CONTROL support; other
- * counters have to be read via the generic MI_STORE_REGISTER_MEM.  This
- * function also performs a pipeline flush for proper synchronization.
+ * counters have to be read via the generic MI_STORE_REGISTER_MEM.
+ *
+ * Callers must explicitly flush the pipeline to ensure the desired value is
+ * available.
  */
 static void
 write_reg(struct brw_context *brw,
@@ -107,8 +109,6 @@ write_reg(struct brw_context *brw,
 {
assert(brw->gen >= 6);
 
-   intel_batchbuffer_emit_mi_flush(brw);
-
/* MI_STORE_REGISTER_MEM only stores a single 32-bit value, so to
 * read a full 64-bit register, we need to do two of them.
 */
@@ -131,6 +131,8 @@ static void
 write_primitives_generated(struct brw_context *brw,
drm_intel_bo *query_bo, int idx)
 {
+   intel_batchbuffer_emit_mi_flush(brw);
+
write_reg(brw, query_bo, CL_INVOCATION_COUNT, idx);
 }
 
@@ -138,6 +140,8 @@ static void
 write_xfb_primitives_written(struct brw_context *brw,
  drm_intel_bo *query_bo, int idx)
 {
+   intel_batchbuffer_emit_mi_flush(brw);
+
if (brw->gen >= 7) {
   write_reg(brw, query_bo, GEN7_SO_NUM_PRIMS_WRITTEN(0), idx);
} else {
-- 
1.8.3.2
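
The motivation for hoisting the flush into the callers can be shown with a toy model that counts flushes and stores. The counters and function names are hypothetical and stand in for the real batchbuffer operations; the point is only that a caller storing several registers pays for one flush instead of one per register.

```c
#include <assert.h>

static int sketch_flushes; /* number of pipeline flushes emitted */
static int sketch_stores;  /* number of register stores emitted */

/* Stand-in for emitting a pipeline flush. */
static void
sketch_flush(void)
{
   sketch_flushes++;
}

/* Stand-in for storing one register; it no longer flushes by itself. */
static void
sketch_store_reg(void)
{
   sketch_stores++;
}

/* A caller that stores n registers back to back flushes exactly once. */
static void
sketch_store_counters(int n)
{
   assert(n > 0);
   sketch_flush();
   for (int i = 0; i < n; i++)
      sketch_store_reg();
}
```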



[Mesa-dev] [PATCH 3/9] i965: Weaken the flushing in gen7_end_transform_feedback().

2013-10-17 Thread Kenneth Graunke
Since 062317d6671 (i965: Go back to using the kernel SOL reset feature.)
we've been flushing the batch on BeginTransformFeedback().  So it's not
necessary to do it on EndTransformFeedback().  A PIPE_CONTROL will work.

This makes gen7_end_transform_feedback() exactly the same as the gen6
variant.  However, they'll diverge again shortly.

Signed-off-by: Kenneth Graunke 
---
 src/mesa/drivers/dri/i965/gen7_sol_state.c | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/gen7_sol_state.c b/src/mesa/drivers/dri/i965/gen7_sol_state.c
index fc69bfc..504d9e7 100644
--- a/src/mesa/drivers/dri/i965/gen7_sol_state.c
+++ b/src/mesa/drivers/dri/i965/gen7_sol_state.c
@@ -263,13 +263,13 @@ void
 gen7_end_transform_feedback(struct gl_context *ctx,
struct gl_transform_feedback_object *obj)
 {
-   /* Because we have to rely on the kernel to reset our SO write offsets, and
-* we only get to do it once per batchbuffer, flush the batch after feedback
-* so another transform feedback can get the write offset reset it needs.
-*
-* This also covers any cache flushing required.
+   /* After EndTransformFeedback, it's likely that the client program will try
+* to draw using the contents of the transform feedback buffer as vertex
+* input.  In order for this to work, we need to flush the data through at
+* least the GS stage of the pipeline, and flush out the render cache.  For
+* simplicity, just do a full flush.
 */
struct brw_context *brw = brw_context(ctx);
 
-   intel_batchbuffer_flush(brw);
+   intel_batchbuffer_emit_mi_flush(brw);
 }
-- 
1.8.3.2



[Mesa-dev] Ivybridge support for ARB_transform_feedback2

2013-10-17 Thread Kenneth Graunke
Here's my implementation of ARB_transform_feedback2.  I believe it's
complete; it passes all of our Piglit tests and a lot of Intel's
oglconform tests.

This should work out of the box on Ivybridge and Baytrail.  It won't
work on Haswell at the moment, due to restrictions on register writes
(to be solved in a future kernel version).  Patch 9 will need to be
replaced with something that detects whether or not we can write
registers from userspace batchbuffers.

In the meantime, I figured I'd send out the rest for review.

Porting this back to Sandybridge is probably doable, but annoying.
Sandybridge doesn't have the MI_LOAD_REGISTER_MEM command, so we'd have
to map the buffers and use MI_LOAD_REGISTER_IMM.  Seems pretty gross.
Plus, transform feedback is done very differently pre-Ivybridge.  I'm
not sure it's worth it, seeing as it's a GL 4.0 feature.



[Mesa-dev] [PATCH 5/9] i965: Implement Pause/ResumeTransformFeedback driver hooks on Gen7+.

2013-10-17 Thread Kenneth Graunke
The ARB_transform_feedback2 extension introduces the ability to pause
and resume transform feedback sessions.  Although only one can be active
at a time, it's possible to switch between multiple transform feedback
objects while paused.

In order to facilitate this, we need to save/restore the SO_WRITE_OFFSET
registers so that after resuming, the GPU continues writing where it
left off.

Signed-off-by: Kenneth Graunke 
---
 src/mesa/drivers/dri/i965/brw_context.c|  2 ++
 src/mesa/drivers/dri/i965/brw_context.h|  9 +++
 src/mesa/drivers/dri/i965/gen6_sol.c   |  5 
 src/mesa/drivers/dri/i965/gen7_sol_state.c | 40 ++
 4 files changed, 56 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_context.c b/src/mesa/drivers/dri/i965/brw_context.c
index fbfbce3..dbff04a 100644
--- a/src/mesa/drivers/dri/i965/brw_context.c
+++ b/src/mesa/drivers/dri/i965/brw_context.c
@@ -256,6 +256,8 @@ brw_init_driver_functions(struct brw_context *brw,
if (brw->gen >= 7) {
   functions->BeginTransformFeedback = gen7_begin_transform_feedback;
   functions->EndTransformFeedback = gen7_end_transform_feedback;
+  functions->PauseTransformFeedback = gen7_pause_transform_feedback;
+  functions->ResumeTransformFeedback = gen7_resume_transform_feedback;
} else {
   functions->BeginTransformFeedback = brw_begin_transform_feedback;
   functions->EndTransformFeedback = brw_end_transform_feedback;
diff --git a/src/mesa/drivers/dri/i965/brw_context.h b/src/mesa/drivers/dri/i965/brw_context.h
index f55d41b..02d2b9f 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -876,6 +876,9 @@ struct intel_batchbuffer {
 
 struct brw_transform_feedback_object {
struct gl_transform_feedback_object base;
+
+   /** A buffer to hold SO_WRITE_OFFSET(n) values while paused. */
+   drm_intel_bo *offset_bo;
 };
 
 /**
@@ -1573,6 +1576,12 @@ gen7_begin_transform_feedback(struct gl_context *ctx, GLenum mode,
 void
 gen7_end_transform_feedback(struct gl_context *ctx,
struct gl_transform_feedback_object *obj);
+void
+gen7_pause_transform_feedback(struct gl_context *ctx,
+  struct gl_transform_feedback_object *obj);
+void
+gen7_resume_transform_feedback(struct gl_context *ctx,
+   struct gl_transform_feedback_object *obj);
 
 /* brw_blorp_blit.cpp */
 GLbitfield
diff --git a/src/mesa/drivers/dri/i965/gen6_sol.c b/src/mesa/drivers/dri/i965/gen6_sol.c
index ffecfc8..fb801fe 100644
--- a/src/mesa/drivers/dri/i965/gen6_sol.c
+++ b/src/mesa/drivers/dri/i965/gen6_sol.c
@@ -141,6 +141,9 @@ brw_new_transform_feedback(struct gl_context *ctx, GLuint name)
   CALLOC_STRUCT(brw_transform_feedback_object);
struct gl_transform_feedback_object *obj = &brw_obj->base;
 
+   brw_obj->offset_bo =
+  drm_intel_bo_alloc(brw->bufmgr, "transform feedback offsets", 16, 4096);
+
obj->Name = name;
obj->RefCount = 1;
obj->EverBound = GL_FALSE;
@@ -159,6 +162,8 @@ brw_delete_transform_feedback(struct gl_context *ctx,
   _mesa_reference_buffer_object(ctx, &obj->Buffers[i], NULL);
}
 
+   drm_intel_bo_unreference(brw_obj->offset_bo);
+
free(brw_obj);
 }
 
diff --git a/src/mesa/drivers/dri/i965/gen7_sol_state.c b/src/mesa/drivers/dri/i965/gen7_sol_state.c
index 504d9e7..c7fe4f6 100644
--- a/src/mesa/drivers/dri/i965/gen7_sol_state.c
+++ b/src/mesa/drivers/dri/i965/gen7_sol_state.c
@@ -273,3 +273,43 @@ gen7_end_transform_feedback(struct gl_context *ctx,
 
intel_batchbuffer_emit_mi_flush(brw);
 }
+
+void
+gen7_pause_transform_feedback(struct gl_context *ctx,
+  struct gl_transform_feedback_object *obj)
+{
+   struct brw_context *brw = brw_context(ctx);
+   struct brw_transform_feedback_object *brw_obj =
+  (struct brw_transform_feedback_object *) obj;
+
+   /* Save the SOL buffer offset register values. */
+   for (int i = 0; i < 4; i++) {
+  BEGIN_BATCH(3);
+  OUT_BATCH(MI_STORE_REGISTER_MEM | (3 - 2));
+  OUT_BATCH(GEN7_SO_WRITE_OFFSET(i));
+  OUT_RELOC(brw_obj->offset_bo,
+I915_GEM_DOMAIN_RENDER, I915_GEM_DOMAIN_RENDER,
+i * sizeof(uint32_t));
+  ADVANCE_BATCH();
+   }
+}
+
+void
+gen7_resume_transform_feedback(struct gl_context *ctx,
+   struct gl_transform_feedback_object *obj)
+{
+   struct brw_context *brw = brw_context(ctx);
+   struct brw_transform_feedback_object *brw_obj =
+  (struct brw_transform_feedback_object *) obj;
+
+   /* Reload the SOL buffer offset registers. */
+   for (int i = 0; i < 4; i++) {
+  BEGIN_BATCH(3);
+  OUT_BATCH(GEN7_MI_LOAD_REGISTER_MEM | (3 - 2));
+  OUT_BATCH(GEN7_SO_WRITE_OFFSET(i));
+  OUT_RELOC(brw_obj->offset_bo,
+I915_GEM_DOMAIN_RENDER, I915_GEM_DOMAIN_RENDER,
+i * sizeof(uint32_t));
+  ADVANCE_BATCH();
+   }
+}
-- 
1.8.3.2
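
The save/restore of write offsets across pause and resume can be modelled on the CPU to show why each transform feedback object needs its own storage. Everything here is a hypothetical model: the structs stand in for the GPU's four SO_WRITE_OFFSET registers and for the per-object offset buffer, and plain memory copies stand in for the register store and load commands.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_NUM_SO_BUFFERS 4

/* Model of the shared hardware state: one write offset per SO buffer. */
struct sketch_gpu {
   uint32_t so_write_offset[SKETCH_NUM_SO_BUFFERS];
};

/* Model of a transform feedback object with its own saved offsets. */
struct sketch_xfb_object {
   uint32_t saved_offsets[SKETCH_NUM_SO_BUFFERS];
};

/* Pause: copy the live register values into the object's storage. */
static void
sketch_pause(const struct sketch_gpu *gpu, struct sketch_xfb_object *obj)
{
   assert(gpu != NULL && obj != NULL);
   memcpy(obj->saved_offsets, gpu->so_write_offset, sizeof(obj->saved_offsets));
}

/* Resume: reload the registers from the object's storage. */
static void
sketch_resume(struct sketch_gpu *gpu, const struct sketch_xfb_object *obj)
{
   assert(gpu != NULL && obj != NULL);
   memcpy(gpu->so_write_offset, obj->saved_offsets, sizeof(gpu->so_write_offset));
}
```

Pausing object A, running object B for a while, then pausing B and resuming A leaves the registers exactly where A stopped, so A continues writing where it left off.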


[Mesa-dev] [PATCH 4/9] i965: Create a new brw_transform_feedback_object subclass.

2013-10-17 Thread Kenneth Graunke
This adds the basic driver hooks to allocate/free the brw variant.
It doesn't contain any additional information yet, but it will soon.

Signed-off-by: Kenneth Graunke 
---
 src/mesa/drivers/dri/i965/brw_context.c |  2 ++
 src/mesa/drivers/dri/i965/brw_context.h |  9 +
 src/mesa/drivers/dri/i965/gen6_sol.c| 30 ++
 3 files changed, 41 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_context.c b/src/mesa/drivers/dri/i965/brw_context.c
index 109f40b..fbfbce3 100644
--- a/src/mesa/drivers/dri/i965/brw_context.c
+++ b/src/mesa/drivers/dri/i965/brw_context.c
@@ -251,6 +251,8 @@ brw_init_driver_functions(struct brw_context *brw,
 
functions->QuerySamplesForFormat = brw_query_samples_for_format;
 
+   functions->NewTransformFeedback = brw_new_transform_feedback;
+   functions->DeleteTransformFeedback = brw_delete_transform_feedback;
if (brw->gen >= 7) {
   functions->BeginTransformFeedback = gen7_begin_transform_feedback;
   functions->EndTransformFeedback = gen7_end_transform_feedback;
diff --git a/src/mesa/drivers/dri/i965/brw_context.h b/src/mesa/drivers/dri/i965/brw_context.h
index 0229cc5..f55d41b 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -874,6 +874,10 @@ struct intel_batchbuffer {
} saved;
 };
 
+struct brw_transform_feedback_object {
+   struct gl_transform_feedback_object base;
+};
+
 /**
  * Data shared between each programmable stage in the pipeline (vs, gs, and
  * wm).
@@ -1550,6 +1554,11 @@ extern int intel_translate_logic_op(GLenum opcode);
 void intel_init_syncobj_functions(struct dd_function_table *functions);
 
 /* gen6_sol.c */
+struct gl_transform_feedback_object *
+brw_new_transform_feedback(struct gl_context *ctx, GLuint name);
+void
+brw_delete_transform_feedback(struct gl_context *ctx,
+  struct gl_transform_feedback_object *obj);
 void
 brw_begin_transform_feedback(struct gl_context *ctx, GLenum mode,
 struct gl_transform_feedback_object *obj);
diff --git a/src/mesa/drivers/dri/i965/gen6_sol.c b/src/mesa/drivers/dri/i965/gen6_sol.c
index 21da444..ffecfc8 100644
--- a/src/mesa/drivers/dri/i965/gen6_sol.c
+++ b/src/mesa/drivers/dri/i965/gen6_sol.c
@@ -26,6 +26,7 @@
  * Code to initialize the binding table entries used by transform feedback.
  */
 
+#include "main/bufferobj.h"
 #include "main/macros.h"
 #include "brw_context.h"
 #include "intel_batchbuffer.h"
@@ -132,6 +133,35 @@ const struct brw_tracked_state gen6_gs_binding_table = {
.emit = brw_gs_upload_binding_table,
 };
 
+struct gl_transform_feedback_object *
+brw_new_transform_feedback(struct gl_context *ctx, GLuint name)
+{
+   struct brw_context *brw = brw_context(ctx);
+   struct brw_transform_feedback_object *brw_obj =
+  CALLOC_STRUCT(brw_transform_feedback_object);
+   struct gl_transform_feedback_object *obj = &brw_obj->base;
+
+   obj->Name = name;
+   obj->RefCount = 1;
+   obj->EverBound = GL_FALSE;
+
+   return obj;
+}
+
+void
+brw_delete_transform_feedback(struct gl_context *ctx,
+  struct gl_transform_feedback_object *obj)
+{
+   struct brw_transform_feedback_object *brw_obj =
+  (struct brw_transform_feedback_object *) obj;
+
+   for (unsigned i = 0; i < Elements(obj->Buffers); i++) {
+  _mesa_reference_buffer_object(ctx, &obj->Buffers[i], NULL);
+   }
+
+   free(brw_obj);
+}
+
 void
 brw_begin_transform_feedback(struct gl_context *ctx, GLenum mode,
 struct gl_transform_feedback_object *obj)
-- 
1.8.3.2



[Mesa-dev] [PATCH 6/9] mesa: Add a new GetTransformFeedbackVertexCount() driver hook.

2013-10-17 Thread Kenneth Graunke
DrawTransformFeedback() needs to obtain the number of vertices written
to a particular stream during the last Begin/EndTransformFeedback block.
The new driver hook returns exactly that information.

Gallium drivers already implement this functionality by passing the
transform feedback object to the drawing function.  I prefer to avoid
this for two reasons:

1. Complexity:

Normally, the drawing function takes an array of _mesa_prim objects,
each of which specifies a vertex count.  If tfb_vertcount != NULL,
however, there will only be one _mesa_prim object with an invalid
vertex count (of 1), so it needs to be ignored.

Since the _mesa_prim pointers are const, you can't even override it to
the proper value; you need to pass around extra "ignore that, here's
the real count" parameters.

The drawing function is already terribly complicated, so I don't want to
make it even more complicated.

2. Primitive restart:

vbo_draw_arrays() performs software primitive restart, splitting a draw
call in two when necessary.  vbo_draw_transform_feedback() currently
doesn't because it has no idea how many vertices need to be drawn.  The
new driver hook gives it that information, allowing us to reuse the
existing vbo_draw_arrays() code to do everything right.

Signed-off-by: Kenneth Graunke 
---
 src/mesa/main/dd.h| 8 
 src/mesa/vbo/vbo_exec_array.c | 6 ++
 2 files changed, 14 insertions(+)

diff --git a/src/mesa/main/dd.h b/src/mesa/main/dd.h
index 29469ce..11d5a9e 100644
--- a/src/mesa/main/dd.h
+++ b/src/mesa/main/dd.h
@@ -843,6 +843,14 @@ struct dd_function_table {
struct gl_transform_feedback_object *obj);
 
/**
+* Return the number of vertices written to a stream during the last
+* Begin/EndTransformFeedback block.
+*/
+   GLsizei (*GetTransformFeedbackVertexCount)(struct gl_context *ctx,
+  struct gl_transform_feedback_object *obj,
+  GLuint stream);
+
+   /**
 * \name GL_NV_texture_barrier interface
 */
void (*TextureBarrier)(struct gl_context *ctx);
diff --git a/src/mesa/vbo/vbo_exec_array.c b/src/mesa/vbo/vbo_exec_array.c
index 1670409..11bb76a 100644
--- a/src/mesa/vbo/vbo_exec_array.c
+++ b/src/mesa/vbo/vbo_exec_array.c
@@ -1464,6 +1464,12 @@ vbo_draw_transform_feedback(struct gl_context *ctx, GLenum mode,
   return;
}
 
+   if (ctx->Driver.GetTransformFeedbackVertexCount) {
+  GLsizei n = ctx->Driver.GetTransformFeedbackVertexCount(ctx, obj, stream);
+  vbo_draw_arrays(ctx, mode, 0, n, numInstances, 0);
+  return;
+   }
+
vbo_bind_arrays(ctx);
 
/* init most fields to zero */
-- 
1.8.3.2
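
The optional-hook pattern used in the patch above, where a driver may provide a vertex count up front and the caller otherwise falls back to the older path, can be sketched with a plain function pointer. The types and names are hypothetical and much simpler than Mesa's dd_function_table; the fallback is represented only by a return code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical driver table with one optional hook. */
struct sketch_driver {
   int (*get_vertex_count)(unsigned stream); /* may be NULL */
};

enum sketch_path {
   SKETCH_PATH_DRAW_ARRAYS, /* count known: reuse the normal draw path */
   SKETCH_PATH_LEGACY       /* count unknown: object is passed to the driver */
};

/* Decide which path a transform feedback draw takes. */
static enum sketch_path
sketch_draw_transform_feedback(const struct sketch_driver *drv,
                               unsigned stream, int *count_out)
{
   assert(drv != NULL && count_out != NULL);

   if (drv->get_vertex_count) {
      *count_out = drv->get_vertex_count(stream);
      return SKETCH_PATH_DRAW_ARRAYS;
   }

   *count_out = -1; /* no count available on this path */
   return SKETCH_PATH_LEGACY;
}

/* Example hook that reports a fixed count for any stream. */
static int
sketch_fixed_count(unsigned stream)
{
   (void) stream;
   return 6;
}
```

Drivers that leave the hook unset keep their existing behaviour, so the new hook can be adopted one driver at a time.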



[Mesa-dev] [PATCH 7/9] i965: Mark brw_draw_prims tfb_vertcount parameter as unused.

2013-10-17 Thread Kenneth Graunke
Renaming it makes it obvious that it isn't used, and the assertion
verifies that the VBO module never passes us such an object.

Signed-off-by: Kenneth Graunke 
---
 src/mesa/drivers/dri/i965/brw_draw.c | 4 +++-
 src/mesa/drivers/dri/i965/brw_draw.h | 2 +-
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_draw.c b/src/mesa/drivers/dri/i965/brw_draw.c
index 549f9d0a..9f53c6d 100644
--- a/src/mesa/drivers/dri/i965/brw_draw.c
+++ b/src/mesa/drivers/dri/i965/brw_draw.c
@@ -461,11 +461,13 @@ void brw_draw_prims( struct gl_context *ctx,
 GLboolean index_bounds_valid,
 GLuint min_index,
 GLuint max_index,
-struct gl_transform_feedback_object *tfb_vertcount )
+struct gl_transform_feedback_object *unused_tfb_object)
 {
struct brw_context *brw = brw_context(ctx);
const struct gl_client_array **arrays = ctx->Array._DrawArrays;
 
+   assert(unused_tfb_object == NULL);
+
if (!_mesa_check_conditional_render(ctx))
   return;
 
diff --git a/src/mesa/drivers/dri/i965/brw_draw.h b/src/mesa/drivers/dri/i965/brw_draw.h
index aac375f..fb96813 100644
--- a/src/mesa/drivers/dri/i965/brw_draw.h
+++ b/src/mesa/drivers/dri/i965/brw_draw.h
@@ -41,7 +41,7 @@ void brw_draw_prims( struct gl_context *ctx,
 GLboolean index_bounds_valid,
 GLuint min_index,
 GLuint max_index,
-struct gl_transform_feedback_object *tfb_vertcount );
+struct gl_transform_feedback_object *unused_tfb_object);
 
 void brw_draw_init( struct brw_context *brw );
 void brw_draw_destroy( struct brw_context *brw );
-- 
1.8.3.2



[Mesa-dev] [PATCH 8/9] i965: Implement glDrawTransformFeedback().

2013-10-17 Thread Kenneth Graunke
Implementing the GetTransformFeedbackVertexCount() driver hook allows
the VBO module to call us with the right number of vertices.

The hardware doesn't directly count the number of vertices written by
SOL, so we instead use the SO_NUM_PRIMS_WRITTEN(n) counters and multiply
by the number of vertices per primitive.

Unfortunately, counting the number of primitives generated is tricky:
a program might pause a transform feedback operation, start a second one
with a different object, then switch back and resume.  Both transform
feedback operations share the SO_NUM_PRIMS_WRITTEN counters.

To work around this, we save the counter values at Begin, Pause, Resume,
and End.  This "bookends" each section where transform feedback is
active for the current object.  Adding up differences of pairs gives
us the number of primitives generated.  (This is similar to what we
do for occlusion queries on platforms without hardware contexts.)
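
The bookend scheme above can be sketched on the CPU side roughly as
follows.  This is an illustration under stated assumptions, not the code
in this patch: the flat snapshot layout, MAX_STREAMS, and tally_prims()
are hypothetical names chosen for the example (MAX_STREAMS mirrors
BRW_MAX_XFB_STREAMS).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_STREAMS 4   /* assumed to mirror BRW_MAX_XFB_STREAMS */

/*
 * Hypothetical model of the snapshot buffer: snapshots are stored back to
 * back, each holding one counter value per stream.  Even-numbered snapshots
 * are "start" bookends (Begin/Resume); odd-numbered snapshots are "end"
 * bookends (Pause/End).
 */
static void
tally_prims(const uint64_t *snapshots, size_t snapshot_count,
            uint64_t totals[MAX_STREAMS])
{
   assert(snapshot_count % 2 == 0);   /* every start needs a matching end */

   for (size_t i = 0; i < snapshot_count; i += 2) {
      const uint64_t *start = &snapshots[i * MAX_STREAMS];
      const uint64_t *end = &snapshots[(i + 1) * MAX_STREAMS];

      /* Accumulate (end - start) separately for each stream. */
      for (int s = 0; s < MAX_STREAMS; s++)
         totals[s] += end[s] - start[s];
   }
}
```

Because only the differences inside each start/end pair are summed, any
primitives that another transform feedback object adds to the shared
counters between a Pause and the following Resume are not counted.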

Signed-off-by: Kenneth Graunke 
---
 src/mesa/drivers/dri/i965/brw_context.c|   2 +
 src/mesa/drivers/dri/i965/brw_context.h|  26 
 src/mesa/drivers/dri/i965/gen6_sol.c   |   1 +
 src/mesa/drivers/dri/i965/gen7_sol_state.c | 190 -
 4 files changed, 218 insertions(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_context.c b/src/mesa/drivers/dri/i965/brw_context.c
index dbff04a..0087689 100644
--- a/src/mesa/drivers/dri/i965/brw_context.c
+++ b/src/mesa/drivers/dri/i965/brw_context.c
@@ -253,6 +253,8 @@ brw_init_driver_functions(struct brw_context *brw,
 
functions->NewTransformFeedback = brw_new_transform_feedback;
functions->DeleteTransformFeedback = brw_delete_transform_feedback;
+   functions->GetTransformFeedbackVertexCount =
+  brw_get_transform_feedback_vertex_count;
if (brw->gen >= 7) {
   functions->BeginTransformFeedback = gen7_begin_transform_feedback;
   functions->EndTransformFeedback = gen7_end_transform_feedback;
diff --git a/src/mesa/drivers/dri/i965/brw_context.h b/src/mesa/drivers/dri/i965/brw_context.h
index 02d2b9f..80dd5fb 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -874,11 +874,33 @@ struct intel_batchbuffer {
} saved;
 };
 
+#define BRW_MAX_XFB_STREAMS 4
+
 struct brw_transform_feedback_object {
struct gl_transform_feedback_object base;
 
/** A buffer to hold SO_WRITE_OFFSET(n) values while paused. */
drm_intel_bo *offset_bo;
+
+   /** The most recent primitive mode (GL_TRIANGLES/GL_POINTS/GL_LINES). */
+   GLenum primitive_mode;
+
+   /**
+* Count of primitives generated during this transform feedback operation.
+*  @{
+*/
+   uint64_t prims_generated[BRW_MAX_XFB_STREAMS];
+   drm_intel_bo *prim_count_bo;
+   unsigned prim_count_buffer_index; /**< in number of uint64_t units */
+   /** @} */
+
+   /**
+* Number of vertices written between last Begin/EndTransformFeedback().
+*
+* Used to implement DrawTransformFeedback().
+*/
+   uint64_t vertices_written[BRW_MAX_XFB_STREAMS];
+   bool vertices_written_valid;
 };
 
 /**
@@ -1568,6 +1590,10 @@ brw_begin_transform_feedback(struct gl_context *ctx, GLenum mode,
 void
 brw_end_transform_feedback(struct gl_context *ctx,
struct gl_transform_feedback_object *obj);
+GLsizei
+brw_get_transform_feedback_vertex_count(struct gl_context *ctx,
+struct gl_transform_feedback_object *obj,
+GLuint stream);
 
 /* gen7_sol_state.c */
 void
diff --git a/src/mesa/drivers/dri/i965/gen6_sol.c b/src/mesa/drivers/dri/i965/gen6_sol.c
index fb801fe..1dde81c 100644
--- a/src/mesa/drivers/dri/i965/gen6_sol.c
+++ b/src/mesa/drivers/dri/i965/gen6_sol.c
@@ -163,6 +163,7 @@ brw_delete_transform_feedback(struct gl_context *ctx,
}
 
drm_intel_bo_unreference(brw_obj->offset_bo);
+   drm_intel_bo_unreference(brw_obj->prim_count_bo);
 
free(brw_obj);
 }
diff --git a/src/mesa/drivers/dri/i965/gen7_sol_state.c b/src/mesa/drivers/dri/i965/gen7_sol_state.c
index c7fe4f6..978270a 100644
--- a/src/mesa/drivers/dri/i965/gen7_sol_state.c
+++ b/src/mesa/drivers/dri/i965/gen7_sol_state.c
@@ -249,14 +249,179 @@ const struct brw_tracked_state gen7_sol_state = {
.emit = upload_sol_state,
 };
 
+/**
+ * Tally the number of primitives generated so far.
+ *
+ * The buffer contains a series of pairs:
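+ * (<start0, start1, start2, start3>, <end0, end1, end2, end3>) ;
+ * (<start0, start1, start2, start3>, <end0, end1, end2, end3>) ;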
+ *
+ * For each stream, we subtract the pair of values (end - start) to get the
+ * number of primitives generated during one section.  We accumulate these
+ * values, adding them up to get the total number of primitives generated.
+ */
+static void
+gen7_tally_prims_generated(struct brw_context *brw,
+   struct brw_transform_feedback_object *obj)
+{
+   /* If the current batch is still contributing to the number of primitives
+* generated, flush it now so the results will be present when mapped.
+*/
+   if (drm_inte

[Mesa-dev] [PATCH 9/9] i965: Enable the ARB_transform_feedback2 extension on Gen7+.

2013-10-17 Thread Kenneth Graunke
All the necessary pieces are now in place.

Signed-off-by: Kenneth Graunke 
---
 src/mesa/drivers/dri/i965/intel_extensions.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/intel_extensions.c b/src/mesa/drivers/dri/i965/intel_extensions.c
index 334be05..c09ee39 100644
--- a/src/mesa/drivers/dri/i965/intel_extensions.c
+++ b/src/mesa/drivers/dri/i965/intel_extensions.c
@@ -133,6 +133,10 @@ intelInitExtensions(struct gl_context *ctx)
   ctx->Const.GLSLVersion = 120;
_mesa_override_glsl_version(ctx);
 
+   if (brw->gen >= 7) {
+  ctx->Extensions.ARB_transform_feedback2 = true;
+   }
+
if (brw->gen >= 6) {
   uint64_t dummy;
 
-- 
1.8.3.2



[Mesa-dev] [PATCH 1/3] i965: Enable OpenGL 3.3 and GLSL 3.30.

2013-10-17 Thread Kenneth Graunke
Everything necessary for these appears to be implemented.  We'll want to
add more tests to guard against bugs, but it should be functionally
complete.
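
For reference, the generation-to-GLSL-version selection after this change
can be summarized by the sketch below.  The function name is hypothetical;
the values are read off the diff context in this series (330 for Gen7+,
140 for Gen6, 120 otherwise), and the real code assigns to
ctx->Const.GLSLVersion rather than returning a value.

```c
#include <assert.h>

/* Hypothetical helper: GLSL version advertised for a hardware generation. */
static unsigned
glsl_version_for_gen(int gen)
{
   assert(gen > 0);

   if (gen >= 7)
      return 330;   /* was 150 before this patch */
   else if (gen >= 6)
      return 140;
   else
      return 120;
}
```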

Signed-off-by: Kenneth Graunke 
Reviewed-by: Matt Turner 
---
 src/mesa/drivers/dri/i965/intel_extensions.c | 2 +-
 src/mesa/drivers/dri/i965/intel_screen.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_extensions.c b/src/mesa/drivers/dri/i965/intel_extensions.c
index 334be05..803d090 100644
--- a/src/mesa/drivers/dri/i965/intel_extensions.c
+++ b/src/mesa/drivers/dri/i965/intel_extensions.c
@@ -126,7 +126,7 @@ intelInitExtensions(struct gl_context *ctx)
ctx->Extensions.OES_EGL_image_external = true;
 
if (brw->gen >= 7)
-  ctx->Const.GLSLVersion = 150;
+  ctx->Const.GLSLVersion = 330;
else if (brw->gen >= 6)
   ctx->Const.GLSLVersion = 140;
else
diff --git a/src/mesa/drivers/dri/i965/intel_screen.c b/src/mesa/drivers/dri/i965/intel_screen.c
index ec6274c4..b3d6055 100644
--- a/src/mesa/drivers/dri/i965/intel_screen.c
+++ b/src/mesa/drivers/dri/i965/intel_screen.c
@@ -1187,7 +1187,7 @@ set_max_gl_versions(struct intel_screen *screen)
 
switch (screen->devinfo->gen) {
case 7:
-  psp->max_gl_core_version = 32;
+  psp->max_gl_core_version = 33;
   psp->max_gl_compat_version = 30;
   psp->max_gl_es1_version = 11;
   psp->max_gl_es2_version = 30;
-- 
1.8.3.2



[Mesa-dev] [PATCH 2/3] docs: Note that we support OpenGL 3.3 in the release notes.

2013-10-17 Thread Kenneth Graunke
Signed-off-by: Kenneth Graunke 
Reviewed-by: Matt Turner 
---
 docs/relnotes/10.0.html | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/docs/relnotes/10.0.html b/docs/relnotes/10.0.html
index 8f97921..0b25f49 100644
--- a/docs/relnotes/10.0.html
+++ b/docs/relnotes/10.0.html
@@ -22,11 +22,11 @@ People who are concerned with stability and reliability should stick
 with a previous release or wait for Mesa 10.0.1.
 
 
-Mesa 10.0 implements the OpenGL 3.2 API, but the version reported by
+Mesa 10.0 implements the OpenGL 3.3 API, but the version reported by
 glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
 glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
-Some drivers don't support all the features required in OpenGL 3.2.  OpenGL
-3.2 is only available if requested at context creation
+Some drivers don't support all the features required in OpenGL 3.3.  OpenGL
+3.3 is only available if requested at context creation
 because compatibility contexts are not supported.
 
 
-- 
1.8.3.2



[Mesa-dev] [PATCH 3/3] mesa: Bump version to 11.0.0.

2013-10-17 Thread Kenneth Graunke
Mesa now supports OpenGL 3.3 and GLSL 3.30, so bump the Mesa major
version from 10 to 11 to reflect this.

Also update the release notes, and add appropriate FAQ entries.

http://en.wikipedia.org/wiki/Up_to_eleven

Signed-off-by: Kenneth Graunke 
Reviewed-by: Matt Turner 
---
 VERSION |  2 +-
 docs/relnotes.html  |  3 +-
 docs/relnotes/10.0.html | 65 
 docs/relnotes/11.0.html | 80 +
 4 files changed, 83 insertions(+), 67 deletions(-)
 delete mode 100644 docs/relnotes/10.0.html
 create mode 100644 docs/relnotes/11.0.html

diff --git a/VERSION b/VERSION
index 8e92e83..2b1181d 100644
--- a/VERSION
+++ b/VERSION
@@ -1 +1 @@
-10.0.0-devel
+11.0.0-devel
diff --git a/docs/relnotes.html b/docs/relnotes.html
index 82072dd..fb31555 100644
--- a/docs/relnotes.html
+++ b/docs/relnotes.html
@@ -21,7 +21,8 @@ The release notes summarize what's new or changed in each Mesa release.
 
 
 
-10.0 release notes
+11.0 release notes
+Mesa 10.0 was never released.
 9.2.1 release notes
 9.2 release notes
 9.1.7 release notes
diff --git a/docs/relnotes/10.0.html b/docs/relnotes/10.0.html
deleted file mode 100644
index 0b25f49..0000000
--- a/docs/relnotes/10.0.html
+++ /dev/null
@@ -1,65 +0,0 @@
-<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
-
-
-  
-  Mesa Release Notes
-  
-
-
-
-
-  The Mesa 3D Graphics Library
-
-
-
-
-
-Mesa 10.0 Release Notes / TBD
-
-
-Mesa 10.0 is a new development release.
-People who are concerned with stability and reliability should stick
-with a previous release or wait for Mesa 10.0.1.
-
-
-Mesa 10.0 implements the OpenGL 3.3 API, but the version reported by
-glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
-glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
-Some drivers don't support all the features required in OpenGL 3.3.  OpenGL
-3.3 is only available if requested at context creation
-because compatibility contexts are not supported.
-
-
-
-MD5 checksums
-
-TBD.
-
-
-
-New features
-
-
-Note: some of the new features are only available with certain drivers.
-
-
-
-GL_AMD_seamless_cubemap_per_texture on i965.
-GL_ARB_conservative_depth on i965.
-GL_ARB_texture_gather on i965.
-GL_ARB_texture_query_levels on i965.
-GL_KHR_debug
-
-
-
-Bug fixes
-
-TBD.
-
-Changes
-
-TBD.
-
-
-
-
diff --git a/docs/relnotes/11.0.html b/docs/relnotes/11.0.html
new file mode 100644
index 0000000..2fb8135
--- /dev/null
+++ b/docs/relnotes/11.0.html
@@ -0,0 +1,80 @@
+<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
+
+
+  
+  Mesa Release Notes
+  
+
+
+
+
+  The Mesa 3D Graphics Library
+
+
+
+
+
+Mesa 11.0 Release Notes / TBD
+
+
+Mesa 11.0 is a new development release.
+People who are concerned with stability and reliability should stick
+with a previous release or wait for Mesa 11.0.1.
+
+
+
+Mesa 11.0 implements the OpenGL 3.3 API, but the version reported by
+glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
+glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
+Some drivers don't support all the features required in OpenGL 3.3.  OpenGL
+3.3 is only available if requested at context creation
+because compatibility contexts are not supported.
+
+
+FAQ
+
+Q: What happened to Mesa 10.0?
+
+  A: The Mesa community increases the major version number each time
+  the project gains support for a new version of desktop OpenGL.  Mesa 9.2
+  supported OpenGL 3.1.  When it gained support for OpenGL 3.2, it became
+  Mesa 10.0.  But before Mesa 10.0 was ever released, the developers
+  completed OpenGL 3.3 support, causing the version to increase to 11.0.
+
+Q: Why didn't you just make Mesa 10.0 better?
+
+  A: This Mesa <a href="http://en.wikipedia.org/wiki/Up_to_eleven">goes to eleven</a>.
+
+
+MD5 checksums
+
+TBD.
+
+
+
+New features
+
+
+Note: some of the new features are only available with certain drivers.
+
+
+
+GL_AMD_seamless_cubemap_per_texture on i965.
+GL_ARB_conservative_depth on i965.
+GL_ARB_texture_gather on i965.
+GL_ARB_texture_query_levels on i965.
+GL_KHR_debug
+
+
+
+Bug fixes
+
+TBD.
+
+Changes
+
+TBD.
+
+
+
+
-- 
1.8.3.2
