> -----Original Message----- > From: Richard Biener <rguent...@suse.de> > Sent: Thursday, November 16, 2023 2:18 PM > To: Tamar Christina <tamar.christ...@arm.com> > Cc: gcc-patches@gcc.gnu.org; nd <n...@arm.com>; j...@ventanamicro.com > Subject: RE: [PATCH 7/21]middle-end: update IV update code to support early > breaks and arbitrary exits > > On Thu, 16 Nov 2023, Tamar Christina wrote: > > > > -----Original Message----- > > > From: Richard Biener <rguent...@suse.de> > > > Sent: Thursday, November 16, 2023 1:36 PM > > > To: Tamar Christina <tamar.christ...@arm.com> > > > Cc: gcc-patches@gcc.gnu.org; nd <n...@arm.com>; > j...@ventanamicro.com > > > Subject: RE: [PATCH 7/21]middle-end: update IV update code to > > > support early breaks and arbitrary exits > > > > > > On Thu, 16 Nov 2023, Tamar Christina wrote: > > > > > > > > > > > > > > > > > > > > > > > Perhaps I'm missing something here? > > > > > > > > > > > > > > > > > > OK, so I refreshed my mind of what > > > > > > > > > vect_update_ivs_after_vectorizer > > > > > > > does. > > > > > > > > > > > > > > > > > > I still do not understand the (complexity of the) patch. > > > > > > > > > Basically the function computes the new value of the IV > > > > > > > > > "from scratch" based on the number of scalar iterations > > > > > > > > > of the vector loop, > > > > > the 'niter' > > > > > > > > > argument. I would have expected that for the early > > > > > > > > > exits we either pass in a different 'niter' or alternatively a > 'niter_adjustment'. > > > > > > > > > > > > > > > > But for an early exit there's no static value for adjusted > > > > > > > > niter, since you don't know which iteration you exited from. > > > > > > > > Unlike the normal exit when you know if you get there > > > > > > > > you've done all possible > > > > > > > iterations. > > > > > > > > > > > > > > > > So you must compute the scalar iteration count on the exit > > > > > > > > itself. > > > > > > > > > > > > > > ? You do not need the actual scalar iteration you exited > > > > > > > (you don't compute that either), you need the scalar > > > > > > > iteration the vector iteration started with when it exited > > > > > > > prematurely and that's readily > > > > > available? > > > > > > > > > > > > For a normal exit yes, not for an early exit no? > > > > > > niters_vector_mult_vf is only valid for the main exit. > > > > > > > > > > > > There's the unadjusted scalar count, which is what it's using > > > > > > to adjust it to the final count. Unless I'm missing something? > > > > > > > > > > Ah, of course - niters_vector_mult_vf is for the countable exit. > > > > > For the early exits we can't precompute the scalar iteration value. > > > > > But that then means we should compute the appropriate > "continuation" > > > > > as live value of the vectorized IVs even when they were not > > > > > originally used outside of the loop. I don't see how we can > > > > > express this in terms of the scalar IVs in the (not yet) > > > > > vectorized loop - similar to the reduction case you are going to > > > > > end up with the wrong values > > > here. > > > > > > > > > > That said, I've for a long time wanted to preserve the original > > > > > control IV also for the vector code (leaving any "optimization" > > > > > to IVOPTs there), that would enable us to compute the correct > > > > > "niters_vector_mult_vf" based on that IV. > > > > > > > > > > So given we cannot use the scalar IVs you have to handle all > > > > > inductions (besides the main exit control IV) in > > > > > vectorizable_live_operation > > > I think. > > > > > > > > > > > > > That's what I currently do, that's why there was the > > > > if (STMT_VINFO_LIVE_P (phi_info)) > > > > continue; > > > > > > Yes, but that only works for the inductions marked so. We'd need to > > > mark the others as well, but only for the early exits. > > > > > > > although I don't understand why we use the scalar count, I > > > > suppose the reasoning is that we don't really want to keep it > > > > around, and referencing > > > it forces it to be kept? > > > > > > Referencing it will cause the scalar compute to be retained, but > > > since we do not adjust the scalar compute during vectorization (but > > > expect it to be dead) the scalar compute will compute the wrong > > > thing (as shown by the reduction example - I suspect inductions will > > > suffer > from the same problem). > > > > > > > At the moment it just does `init + (final - init) * vf` which is > > > > correct no? > > > > > > The issue is that 'final' is not computed correctly in the > > > vectorized loop. This formula might work for affine evolutions of course. > > > > > > Extracting the correct value from the vectorized induction would be > > > the preferred solution. > > > > Ok, so I should be able to just mark IVs as live during process_use if > > there are multiple exits right? Since it's just gonna be unused on the > > main exit since we use niters? > > > > Because since it's the PHI inside the loop that needs to be marked > > live I can't just do it for a specific exits no? > > > > If I create a copy of the PHI node during peeling for use in early > > exits and mark it live it won't work no? > > I guess I wouldn't actually mark it STMT_VINFO_LIVE_P but somehow arrange > vectorizable_live_operation to be called, possibly adding a edge argument to > that as well. > > Maybe the thing to do for the moment is to reject vectorization with early > breaks if there's any (non-STMT_VINFO_LIVE_P?) induction or reduction > besides the main counting IV one you can already special-case?
Ok so I did a quick hack with: if (!virtual_operand_p (PHI_RESULT (phi)) && !STMT_VINFO_LIVE_P (phi_info)) { use_operand_p use_p; imm_use_iterator imm_iter; bool non_exit_use = false; FOR_EACH_IMM_USE_FAST (use_p, imm_iter, PHI_RESULT (phi)) if (!flow_bb_inside_loop_p (loop, gimple_bb (USE_STMT (use_p)))) for (auto exit : get_loop_exit_edges (loop)) { if (exit == LOOP_VINFO_IV_EXIT (loop_vinfo)) continue; if (gimple_bb (USE_STMT (use_p)) != exit->dest) { non_exit_use = true; goto fail; } } fail: if (non_exit_use) return false; } And it does seem to still allow all the cases I want. I've placed this in vect_can_advance_ivs_p. Does this cover what you meant? Thanks, Tamar > > Richard. > > > Tamar > > > > > > > Also you missed the question below about how to avoid the creation > > > > of the block, You ok with changing that? > > > > > > > > Thanks, > > > > Tamar > > > > > > > > > Or for now disable early-break for inductions that are not the > > > > > main exit control IV (in vect_can_advance_ivs_p)? > > > > > > > > > > > > > > > > > > > > > > > It seems your change handles different kinds of > > > > > > > > > inductions > > > differently. > > > > > > > > > Specifically > > > > > > > > > > > > > > > > > > bool ivtemp = gimple_cond_lhs (cond) == iv_var; > > > > > > > > > if (restart_loop && ivtemp) > > > > > > > > > { > > > > > > > > > type = TREE_TYPE (gimple_phi_result (phi)); > > > > > > > > > ni = build_int_cst (type, vf); > > > > > > > > > if (inversed_iv) > > > > > > > > > ni = fold_build2 (MINUS_EXPR, type, ni, > > > > > > > > > fold_convert (type, step_expr)); > > > > > > > > > } > > > > > > > > > > > > > > > > > > it looks like for the exit test IV we use either 'VF' or 'VF > > > > > > > > > - step' > > > > > > > > > as the new value. That seems to be very odd special > > > > > > > > > casing for unknown reasons. And while you adjust > > > > > > > > > vec_step_op_add, you don't adjust > > > > > > > > > vect_peel_nonlinear_iv_init (maybe not supported - > > > > > > > > > better assert > > > > > > > here). > > > > > > > > > > > > > > > > The VF case is for a normal "non-inverted" loop, where if > > > > > > > > you take an early exit you know that you have to do at most VF > iterations. > > > > > > > > The VF > > > > > > > > - step is to account for the inverted loop control flow > > > > > > > > where you exit after adjusting the IV already by + step. > > > > > > > > > > > > > > But doesn't that assume the IV counts from niter to zero? I > > > > > > > don't see this special case is actually necessary, no? > > > > > > > > > > > > > > > > > > > I needed it because otherwise the scalar loop iterates one > > > > > > iteration too little So I got a miscompile with the inverter > > > > > > loop stuff. I'll look at it again perhaps It can be solved > > > > > > differently. > > > > > > > > > > > > > > > > > > > > > > Peeling doesn't matter here, since you know you were able > > > > > > > > to do a vector iteration so it's safe to do VF iterations. > > > > > > > > So having peeled doesn't affect the remaining iters count. > > > > > > > > > > > > > > > > > > > > > > > > > > Also the vec_step_op_add case will keep the original > > > > > > > > > scalar IV live even when it is a vectorized induction. > > > > > > > > > The code recomputing the value from scratch avoids this. > > > > > > > > > > > > > > > > > > /* For non-main exit create an intermediat edge to > > > > > > > > > get any updated > > > > > iv > > > > > > > > > calculations. */ > > > > > > > > > if (needs_interm_block > > > > > > > > > && !iv_block > > > > > > > > > && (!gimple_seq_empty_p (stmts) || > > > > > > > > > !gimple_seq_empty_p > > > > > > > > > (new_stmts))) > > > > > > > > > { > > > > > > > > > iv_block = split_edge (update_e); > > > > > > > > > update_e = single_succ_edge (update_e->dest); > > > > > > > > > last_gsi = gsi_last_bb (iv_block); > > > > > > > > > } > > > > > > > > > > > > > > > > > > this is also odd, can we adjust the API instead? I > > > > > > > > > suppose this is because your computation uses the > > > > > > > > > original loop IV, if you based the computation off the > > > > > > > > > initial value only this might not be > > > > > necessary? > > > > > > > > > > > > > > > > No, on the main exit the code updates the value in the > > > > > > > > loop header and puts the Calculation in the merge block. > > > > > > > > This works because it only needs to consume PHI nodes in > > > > > > > > the merge block and things like niters are > > > > > > > adjusted in the guard block. > > > > > > > > > > > > > > > > For an early exit, we don't have a guard block, only the merge > block. > > > > > > > > We have to update the PHI nodes in that block, but can't > > > > > > > > do so since you can't produce a value and consume it in a > > > > > > > > PHI node in the same > > > > > BB. > > > > > > > > So we need to create the block to put the values in for > > > > > > > > use in the merge block. Because there's no "guard" block for > > > > > > > > early > exits. > > > > > > > > > > > > > > ? then compute niters in that block as well. > > > > > > > > > > > > We can't since it'll not be reachable through the right edge. > > > > > > What we can do if you want is slightly change peeling, we > > > > > > currently peel > > > as: > > > > > > > > > > > > \ \ / > > > > > > E1 E2 Normal exit > > > > > > \ | | > > > > > > \ | Guard > > > > > > \ | | > > > > > > Merge block > > > > > > | > > > > > > Pre Header > > > > > > > > > > > > If we instead peel as: > > > > > > > > > > > > > > > > > > \ \ / > > > > > > E1 E2 Normal exit > > > > > > \ | | > > > > > > Exit join Guard > > > > > > \ | | > > > > > > Merge block > > > > > > | > > > > > > Pre Header > > > > > > > > > > > > We can use the exit join block. This would also mean > > > > > > vect_update_ivs_after_vectorizer Doesn't need to iterate over > > > > > > all exits and only really needs to adjust the phi nodes Coming > > > > > > out of the exit join > > > > > and guard block. > > > > > > > > > > > > Does this work for you? > > > > > > Yeah, I think that would work. But I'd like to sort out the > > > correctness details of the IV update itself before sorting out this code > placement detail. > > > > > > Richard. > > > > > > > > > Thanks, > > > > > > Tamar > > > > > > > > > > > > > > > The API can be adjusted by always creating the empty block > > > > > > > > either during > > > > > > > peeling. > > > > > > > > That would prevent us from having to do anything special here. > > > > > > > > Would that work better? Or I can do it in the loop that > > > > > > > > iterates over the exits to before the call to > > > > > > > > vect_update_ivs_after_vectorizer, which I think > > > > > > > might be more consistent. > > > > > > > > > > > > > > > > > > > > > > > > > > That said, I wonder why we cannot simply pass in an > > > > > > > > > adjusted niter which would be niters_vector_mult_vf - vf > > > > > > > > > and be done with > > > that? > > > > > > > > > > > > > > > > > > > > > > > > > We can ofcourse not have this and recompute it from niters > > > > > > > > itself, however this does affect the epilog code layout. > > > > > > > > Particularly knowing the static number if iterations left > > > > > > > > causes it to usually unroll the loop and share some of the > > > > > > > > computations. i.e. the scalar code is often more > > > > > > > efficient. > > > > > > > > > > > > > > > > The computation would be niters_vector_mult_vf - > > > > > > > > iters_done * vf, since the value put Here is the remaining > > > > > > > > iteration > count. > > > > > > > > It's static for early > > > > > > > exits. > > > > > > > > > > > > > > Well, it might be "static" in that it doesn't really matter > > > > > > > what you use for the epilog main IV initial value as long as > > > > > > > you are sure you're not going to take that exit as you are > > > > > > > sure we're going to take one of the early exits. So yeah, > > > > > > > the special code is probably OK, but it needs a better > > > > > > > comment and as said the structure of > > > > > vect_update_ivs_after_vectorizer is a bit hard to follow now. > > > > > > > > > > > > > > As said an important part for optimization is to not keep > > > > > > > the scalar IVs live in the vector loop. > > > > > > > > > > > > > > > But can do whatever you prefer here. Let me know what you > > > > > > > > prefer for the > > > > > > > above. > > > > > > > > > > > > > > > > Thanks, > > > > > > > > Tamar > > > > > > > > > > > > > > > > > Thanks, > > > > > > > > > Richard. > > > > > > > > > > > > > > > > > > > > > > > > > > > > Regards, > > > > > > > > > > Tamar > > > > > > > > > > > > > > > > > > > > > > > It has to do this since you have to perform the > > > > > > > > > > > > side effects for the non-matching elements still. > > > > > > > > > > > > > > > > > > > > > > > > Regards, > > > > > > > > > > > > Tamar > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > + if (STMT_VINFO_LIVE_P (phi_info)) > > > > > > > > > > > > > > + continue; > > > > > > > > > > > > > > + > > > > > > > > > > > > > > + /* For early break the final loop IV is: > > > > > > > > > > > > > > + init + (final - init) * vf which takes > > > > > > > > > > > > > > +into account > > > > > peeling > > > > > > > > > > > > > > + values and non-single steps. The main > > > > > > > > > > > > > > +exit > > > > > can > > > > > > > > > > > > > > +use > > > > > > > > > niters > > > > > > > > > > > > > > + since if you exit from the main exit > > > > > > > > > > > > > > +you've > > > > > done > > > > > > > > > > > > > > +all > > > > > > > > > vector > > > > > > > > > > > > > > + iterations. For an early exit we don't > > > > > > > > > > > > > > +know > > > > > when > > > > > > > > > > > > > > +we > > > > > > > > > exit > > > > > > > > > > > > > > +so > > > > > > > > > > > > > we > > > > > > > > > > > > > > + must re-calculate this on the exit. */ > > > > > > > > > > > > > > + tree start_expr = gimple_phi_result (phi); > > > > > > > > > > > > > > + off = fold_build2 (MINUS_EXPR, stype, > > > > > > > > > > > > > > + fold_convert (stype, > > > > > start_expr), > > > > > > > > > > > > > > + fold_convert (stype, > > > > > init_expr)); > > > > > > > > > > > > > > + /* Now adjust for VF to get the final > > > > > > > > > > > > > > iteration value. > > > > > */ > > > > > > > > > > > > > > + off = fold_build2 (MULT_EXPR, stype, off, > > > > > > > > > > > > > > + build_int_cst (stype, > > > > > > > > > > > > > > vf)); > > > > > > > > > > > > > > + } > > > > > > > > > > > > > > + else > > > > > > > > > > > > > > + off = fold_build2 (MULT_EXPR, stype, > > > > > > > > > > > > > > + fold_convert (stype, > > > > > > > > > > > > > > niters), > > > > > step_expr); > > > > > > > > > > > > > > + > > > > > > > > > > > > > > if (POINTER_TYPE_P (type)) > > > > > > > > > > > > > > ni = fold_build_pointer_plus (init_expr, > > > > > > > > > > > > > > off); > > > > > > > > > > > > > > else > > > > > > > > > > > > > > @@ -2238,6 +2286,8 @@ > > > > > > > > > > > > > > vect_update_ivs_after_vectorizer > > > > > > > > > > > > > > (loop_vec_info > > > > > > > > > > > > > loop_vinfo, > > > > > > > > > > > > > > /* Don't bother call > > > > > > > > > > > > > > vect_peel_nonlinear_iv_init. */ > > > > > > > > > > > > > > else if (induction_type == vect_step_op_neg) > > > > > > > > > > > > > > ni = init_expr; > > > > > > > > > > > > > > + else if (restart_loop) > > > > > > > > > > > > > > + continue; > > > > > > > > > > > > > > > > > > > > > > > > > > This looks all a bit complicated - why wouldn't > > > > > > > > > > > > > we simply always use the PHI result when > > > > > > > > > > > > > 'restart_loop'? > > > > > > > > > > > > > Isn't that the correct old start value in > > > > > > > > > > > all cases? > > > > > > > > > > > > > > > > > > > > > > > > > > > else > > > > > > > > > > > > > > ni = vect_peel_nonlinear_iv_init (&stmts, > > > > > > > > > > > > > > init_expr, > > > > > > > > > > > > > > niters, > > > > > > > > > > > > > > step_expr, > > > @@ - > > > > > > > 2245,9 +2295,20 @@ > > > > > > > > > > > > > > vect_update_ivs_after_vectorizer > > > > > > > > > > > > > (loop_vec_info > > > > > > > > > > > > > > loop_vinfo, > > > > > > > > > > > > > > > > > > > > > > > > > > > > var = create_tmp_var (type, "tmp"); > > > > > > > > > > > > > > > > > > > > > > > > > > > > - last_gsi = gsi_last_bb (exit_bb); > > > > > > > > > > > > > > gimple_seq new_stmts = NULL; > > > > > > > > > > > > > > ni_name = force_gimple_operand (ni, > > > > > > > > > > > > > > &new_stmts, false, var); > > > > > > > > > > > > > > + > > > > > > > > > > > > > > + /* For non-main exit create an > > > > > > > > > > > > > > + intermediat edge to get any > > > > > > > > > updated iv > > > > > > > > > > > > > > + calculations. */ > > > > > > > > > > > > > > + if (needs_interm_block > > > > > > > > > > > > > > + && !iv_block > > > > > > > > > > > > > > + && (!gimple_seq_empty_p (stmts) || > > > > > > > > > > > > > > +!gimple_seq_empty_p > > > > > > > > > > > > > (new_stmts))) > > > > > > > > > > > > > > + { > > > > > > > > > > > > > > + iv_block = split_edge (update_e); > > > > > > > > > > > > > > + update_e = single_succ_edge (update_e->dest); > > > > > > > > > > > > > > + last_gsi = gsi_last_bb (iv_block); > > > > > > > > > > > > > > + } > > > > > > > > > > > > > > + > > > > > > > > > > > > > > /* Exit_bb shouldn't be empty. */ > > > > > > > > > > > > > > if (!gsi_end_p (last_gsi)) > > > > > > > > > > > > > > { > > > > > > > > > > > > > > @@ -3342,8 +3403,26 @@ vect_do_peeling > > > > > > > > > > > > > > (loop_vec_info loop_vinfo, tree > > > > > > > > > > > > > niters, tree nitersm1, > > > > > > > > > > > > > > niters_vector_mult_vf steps. */ > > > > > > > > > > > > > > gcc_checking_assert > > > > > > > > > > > > > > (vect_can_advance_ivs_p > > > > > (loop_vinfo)); > > > > > > > > > > > > > > update_e = skip_vector ? e : > > > > > > > > > > > > > > loop_preheader_edge > > > (epilog); > > > > > > > > > > > > > > - vect_update_ivs_after_vectorizer (loop_vinfo, > > > > > > > > > niters_vector_mult_vf, > > > > > > > > > > > > > > - update_e); > > > > > > > > > > > > > > + if (LOOP_VINFO_EARLY_BREAKS (loop_vinfo)) > > > > > > > > > > > > > > + update_e = single_succ_edge (e->dest); > > > > > > > > > > > > > > + bool inversed_iv > > > > > > > > > > > > > > + = !vect_is_loop_exit_latch_pred > > > > > (LOOP_VINFO_IV_EXIT > > > > > > > > > (loop_vinfo), > > > > > > > > > > > > > > + LOOP_VINFO_LOOP > > > > > > > > > (loop_vinfo)); > > > > > > > > > > > > > > > > > > > > > > > > > > You are computing this here and in > > > > > > > vect_update_ivs_after_vectorizer? > > > > > > > > > > > > > > > > > > > > > > > > > > > + > > > > > > > > > > > > > > + /* Update the main exit first. */ > > > > > > > > > > > > > > + vect_update_ivs_after_vectorizer > > > > > > > > > > > > > > + (loop_vinfo, vf, > > > > > > > > > > > niters_vector_mult_vf, > > > > > > > > > > > > > > + update_e, > > > > > inversed_iv); > > > > > > > > > > > > > > + > > > > > > > > > > > > > > + /* And then update the early exits. */ > > > > > > > > > > > > > > + for (auto exit : get_loop_exit_edges (loop)) > > > > > > > > > > > > > > + { > > > > > > > > > > > > > > + if (exit == LOOP_VINFO_IV_EXIT (loop_vinfo)) > > > > > > > > > > > > > > + continue; > > > > > > > > > > > > > > + > > > > > > > > > > > > > > + vect_update_ivs_after_vectorizer > > > > > > > > > > > > > > +(loop_vinfo, vf, > > > > > > > > > > > > > > + > > > > > niters_vector_mult_vf, > > > > > > > > > > > > > > + exit, true); > > > > > > > > > > > > > > > > > > > > > > > > > > ... why does the same not work here? Wouldn't > > > > > > > > > > > > > the proper condition be !dominated_by_p > > > > > > > > > > > > > (CDI_DOMINATORS, > > > > > > > > > > > > > exit->src, LOOP_VINFO_IV_EXIT > > > > > > > > > > > > > (loop_vinfo)->src) or similar? That is, whether > > > > > > > > > > > > > the exit is at or after the main IV exit? > > > > > > > > > > > > > (consider having > > > > > > > > > > > > > two) > > > > > > > > > > > > > > > > > > > > > > > > > > > + } > > > > > > > > > > > > > > > > > > > > > > > > > > > > if (skip_epilog) > > > > > > > > > > > > > > { > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > > > > > > Richard Biener <rguent...@suse.de> SUSE Software > > > > > > > > > > > Solutions Germany GmbH, Frankenstrasse 146, 90461 > > > > > > > > > > > Nuernberg, Germany; > > > > > > > > > > > GF: Ivo Totev, Andrew McDonald, Werner Knoblich; > > > > > > > > > > > (HRB 36809, AG > > > > > > > > > > > Nuernberg) > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > > > > Richard Biener <rguent...@suse.de> SUSE Software > > > > > > > > > Solutions Germany GmbH, Frankenstrasse 146, 90461 > > > > > > > > > Nuernberg, Germany; > > > > > > > > > GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB > > > > > > > > > 36809, AG > > > > > > > > > Nuernberg) > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > > Richard Biener <rguent...@suse.de> SUSE Software Solutions > > > > > > > Germany GmbH, Frankenstrasse 146, 90461 Nuernberg, Germany; > > > > > > > GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, > > > > > > > AG > > > > > > > Nuernberg) > > > > > > > > > > > > > > > > -- > > > > > Richard Biener <rguent...@suse.de> SUSE Software Solutions > > > > > Germany GmbH, Frankenstrasse 146, 90461 Nuernberg, Germany; > > > > > GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG > > > > > Nuernberg) > > > > > > > > > > -- > > > Richard Biener <rguent...@suse.de> > > > SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 > > > Nuernberg, Germany; > > > GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG > > > Nuernberg) > > > > -- > Richard Biener <rguent...@suse.de> > SUSE Software Solutions Germany GmbH, > Frankenstrasse 146, 90461 Nuernberg, Germany; > GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG > Nuernberg)