weak_global_object_name useless?
Adding gcc@gcc.gnu.org On Wed, Jun 29, 2011 at 6:08 PM, Gabriel Charette wrote: > > What's the purpose of weak_global_object_name? Defined in gcc/varasm.c > grepping from the base of the source recursively I only find this: > ./gcc/ChangeLog-1998: * varasm.c (assemble_start_function): Add > weak_global_object_name. > ./gcc/output.h:extern const char *weak_global_object_name; > ./gcc/ChangeLog-2000: weak_global_object_name here, as const char *. > ./gcc/ChangeLog-2000: first_global_object_name or weak_global_object_name. > Clean up string > ./gcc/ChangeLog-2000: * varasm.c (first_global_object_name, > weak_global_object_name): > ./gcc/tree.c: const char *name = weak_global_object_name; > ./gcc/ChangeLog-2005: (weak_global_object_name): Likewise. > ./gcc/varasm.c:extern GTY(()) const char *weak_global_object_name; > ./gcc/varasm.c:const char *weak_global_object_name; > ./gcc/varasm.c: Set first_global_object_name and weak_global_object_name as > appropriate. */ > ./gcc/varasm.c: type = &weak_global_object_name; > It seems like it's never actually set... some references to it are set... but > that seems like a very weird usage? And there is never any code that checks > whether `something == weak_global_object_name`... > I'm tempted to try to remove it... shall we ask Jason? > Gab
Re: weak_global_object_name useless?
Well looking at notice_global_symbol: The comments says that weak_global_object_name is set in it, but there is nothing actually doing so... The only line of code containing weak_global_object_name in notice_global_symbol is: type = &weak_global_object_name; Gab On Wed, Jun 29, 2011 at 6:26 PM, Andrew Pinski wrote: > On Wed, Jun 29, 2011 at 6:13 PM, Gabriel Charette wrote: >> Adding gcc@gcc.gnu.org >> >> On Wed, Jun 29, 2011 at 6:08 PM, Gabriel Charette wrote: >>> >>> What's the purpose of weak_global_object_name? Defined in gcc/varasm.c >>> grepping from the base of the source recursively I only find this: >>> ./gcc/ChangeLog-1998: * varasm.c (assemble_start_function): Add >>> weak_global_object_name. >>> ./gcc/output.h:extern const char *weak_global_object_name; >>> ./gcc/ChangeLog-2000: weak_global_object_name here, as const char *. >>> ./gcc/ChangeLog-2000: first_global_object_name or weak_global_object_name. >>> Clean up string >>> ./gcc/ChangeLog-2000: * varasm.c (first_global_object_name, >>> weak_global_object_name): >>> ./gcc/tree.c: const char *name = weak_global_object_name; >>> ./gcc/ChangeLog-2005: (weak_global_object_name): Likewise. >>> ./gcc/varasm.c:extern GTY(()) const char *weak_global_object_name; >>> ./gcc/varasm.c:const char *weak_global_object_name; >>> ./gcc/varasm.c: Set first_global_object_name and weak_global_object_name >>> as appropriate. */ >>> ./gcc/varasm.c: type = &weak_global_object_name; >>> It seems like it's never actually set... some references to it are set... >>> but that seems like a very weird usage? And there is never any code that >>> checks whether `something == weak_global_object_name`... >>> I'm tempted to try to remove it... shall we ask Jason? > > It is set in notice_global_symbol if I read the code correctly. > *type is either weak_global_object_name or first_global_object_name. > If first_global_object_name was set by this function, then > weak_global_object_name would never be set. Otherwise > weak_global_object_name is set to the first time notice_global_symbol > is called with a weak decl, or an one only decl, or when flag_shlib is > true. > get_file_function_name uses weak_global_object_name if > first_global_object_name was not set. > The purpose is the name of the first weak global symbol to create a > name which is uniq at link time. > get_file_function_name uses it to create a semi uniq number. > > Thanks, > Andrew Pinski >
Re: weak_global_object_name useless?
Aaah my bad! Thanks, Gab On Wed, Jun 29, 2011 at 6:45 PM, Andrew Pinski wrote: > On Wed, Jun 29, 2011 at 6:44 PM, Gabriel Charette wrote: >> Well looking at notice_global_symbol: >> >> The comments says that weak_global_object_name is set in it, but there >> is nothing actually doing so... >> >> The only line of code containing weak_global_object_name in >> notice_global_symbol is: >> type = &weak_global_object_name; >> > > *type = name; > does the setting. > > Thanks, > Andrew Pinski >
[pph] cache replace next insert by
Hey Diego, as we just talked over the phone, here is the diff I had sitting in my stash to ask the pph cache to replace the next cache insert by the given pointer (while still reading what's in the stream). To use it simply call pph_cache_replace_next_by(stream, your_new_pointer) and immediately after call pph_in_(whatever you want to read in and immediately replace the cache pointer value for). Cheers, Gab pph_cache_replace_next_by.diff Description: Binary data
Flags and pph...
Hey guys, so the asm diff we are seeing in c1builtin-integral is not something we are not streaming out, or any other logic being wrong in the pph. The problem is: we define a dg-options entry in the header file which tells deja-gnu to add flags to the compilation (so far so good...) The problem is: deja-gnu only sees those options when doing the pph compile (as deja-gnu takes its options by reading the file compiled it seems, not the full translation unit). In our case compiling the pph with -ffinite-math-only -fno-math-errno. When compiling the C file, those flags are no longer there (deja-gnu doesn't pass them in as they are not present in the C file) Furthermore, when doing the non-pph compile, those flags aren't passed either. It turns out, if we either: 1) Do the pph compile without the flags, and then the pph and non-pph compile: there is no asm diff 2) Do the pph compile with the flags, but also pass the flags when doing the C file compile (with and without pph): there is also no asm diff! So the question is: how do we want to treat flags passed to the pph compile? I don't think it makes sense to use these flags in the rest of the compilation, but not doing so also results in the flag only being use for half of the compilation... Maybe we should save the flags used as part of the pph compile and invalidate the pph file if we don't get the same set of flags when doing the real compilation (as we do, i think, with the identifier(macro) context present when compiling a given pph file)?? Gab
Re: Flags and pph...
To demonstrate my point: add /* { dg-options "-ffinite-math-only -fno-math-errno" } */ to c1builin-integral.cc and watch the test pass (since now we are using the flags across all compilations). Or similarly, remove the same dg-options entry from c0builtin-integral.h and watch the test pass (since now we are using no additional flags across all compilations) Gab On Mon, Jul 18, 2011 at 5:29 PM, Gabriel Charette wrote: > Hey guys, > > so the asm diff we are seeing in c1builtin-integral is not something > we are not streaming out, or any other logic being wrong in the pph. > > The problem is: we define a dg-options entry in the header file which > tells deja-gnu to add flags to the compilation (so far so good...) > > The problem is: deja-gnu only sees those options when doing the pph > compile (as deja-gnu takes its options by reading the file compiled it > seems, not the full translation unit). > > In our case compiling the pph with -ffinite-math-only -fno-math-errno. > > When compiling the C file, those flags are no longer there (deja-gnu > doesn't pass them in as they are not present in the C file) > > Furthermore, when doing the non-pph compile, those flags aren't passed either. > > It turns out, if we either: > 1) Do the pph compile without the flags, and then the pph and non-pph > compile: there is no asm diff > 2) Do the pph compile with the flags, but also pass the flags when > doing the C file compile (with and without pph): there is also no asm > diff! > > So the question is: how do we want to treat flags passed to the pph > compile? I don't think it makes sense to use these flags in the rest > of the compilation, but not doing so also results in the flag only > being use for half of the compilation... > Maybe we should save the flags used as part of the pph compile and > invalidate the pph file if we don't get the same set of flags when > doing the real compilation (as we do, i think, with the > identifier(macro) context present when compiling a given pph file)?? > > Gab >
Re: Flags and pph...
On Mon, Jul 18, 2011 at 5:52 PM, Diego Novillo wrote: > On Mon, Jul 18, 2011 at 20:45, Gabriel Charette wrote: >> To demonstrate my point: >> add /* { dg-options "-ffinite-math-only -fno-math-errno" } */ >> to c1builin-integral.cc >> >> and watch the test pass (since now we are using the flags across all >> compilations). > > Ah, OK. That's your fix then. > Ya, that fixes the test. But the sole fact that pph was not throwing an error is wrong in my opinion. We should probably ensure that the same flags are enabled/disabled that were at the pph compile time (otherwise we could do the parsing with one set of flags and compile with a different set of flags...) We probably only want to enforce this for a subset of the flags (i.e. we don't care about flags like -Wall and stuff like that, but only about the flags actually affecting the binary output). We can discuss this in the Hydrazine meeting tomorrow. Gab
Linemap and pph
Hi, @tromey: We have a question for you: the problem is detailed here and our question to you is at the bottom. Thanks! So the line problem we are having in pph is much bigger then we originally thought. The problem is: As we've had issues with before: the chain of name bindings in any cpp_bindings is inserted at the head, such that when we stream it back in, the last one put on the chain by the parser is read first. This is a problem in the linemap case because when we read the last name, we also read it's input_location (line and column) and try to replay that location by calling linemap_line_start (so far so good). This issue is that linemap_line_start EXPECTS to be called with lines in increasing orders (but since the bindings are backwards in the chain we actually call it with lines in a decreasing order), the logic in linemap_line_start is such that if the line# is less then the previous line# it was called with, it assumes this must be a new file and creates a new file entry in the line_table by calling linemap_add (which in our case is WRONG!!). >From a comment found in libcpp/line-map.c, linemaps clearly assume this fact: "Since the set (line_table) is built chronologically, the logical lines are monotonic increasing, and so the list is sorted and we can use a binary search." @crowl: Maybe this is the issue with resume_scope?? I'm just thinking that since it seems lookups are fairly tied to the line_table, and the fact that it should be built in a monotonically increasing order. In the past we have solved similar issues (caused by the backwards ordering), by replaying whatever it was in the correct order (before doing anything else), and then simply loading references using the pph_cache when the real ones were needed. BUT we can't do this with source_locations as they are a simple typedef unsigned int, not actual structs which are passed by pointer value... Potential solution? : I thought of a few ways to solve this: 1) Catch the calls to the linemap functions when generating the pph, store them with the pph and replay them first before anything else when reading the pph file. The problem with that is that to when trying to obtain the source location for a given binding streamed in: after calling linemap_line_start when needed, we call linemap_position_for_column which ASSUMES that we are currently on the last line which linemap_line_start was called with (and thus doing the full replay first would not really work here, is there another function we can call that can take a line as an argument as well as a column??) 2) We could potentially output a reference to the source_location and use that to do what we always do using the cache in this sort of situation, however, we would need to figure out exactly which integer we need to output a reference of (since they are passed around by value...). This would be kinda hacky, and would also imply building some hackery to go around the fact that the lto version of the input location doesn't need to use references (i.e. we would need a streamer hook, but two different functions signatures...). I don't think this is a nice way of going about it, but could be an option... @tromey: I hear you are the person in the know when it comes down to linemaps, do you have any hint on how we could rebuild a valid line_table given we are obtaining the bindings from the last one in the file to the first one (that is, using pph (pre-parsed headers) where we are trying to restore a cached version of the parser state for a header)? Thanks, Gabriel
Re: Linemap and pph
Thanks for the quick reply Tom, On Fri, Jul 22, 2011 at 1:39 PM, Tom Tromey wrote: >>>>>> "Gabriel" == Gabriel Charette writes: > > Gabriel> @tromey: We have a question for you: the problem is detailed > Gabriel> here and our question to you is at the bottom. Thanks! > > Sure. I have not followed PPH progress very closely and after reading > your message I am not sure I have much to offer. I'll give it a stab > anyhow. > > Gabriel> This is a problem in the linemap case because when we read the last > Gabriel> name, we also read it's input_location (line and column) and try to > Gabriel> replay that location by calling linemap_line_start (so far so > Gabriel> good). This issue is that linemap_line_start EXPECTS to be called > Gabriel> with lines in increasing orders (but since the bindings are backwards > Gabriel> in the chain we actually call it with lines in a decreasing order), > Gabriel> the logic in linemap_line_start is such that if the line# is less > then > Gabriel> the previous line# it was called with, it assumes this must be a new > Gabriel> file and creates a new file entry in the line_table by calling > Gabriel> linemap_add (which in our case is WRONG!!). > > My belief is that linemap was designed to support the typical case for a > C and C++ compiler. However, this is not sacrosanct; you could perhaps > change the implementation to work better for your case. I suspect, > though, that this won't be easy -- and also be aware of Dodji's patch > series, which complicates linemap. > We are tying to keep pph as "pluginable" as possible (Diego correct me if I'm wrong), so changing the actual implementation of the linemap would be our very last resort I think. > Gabriel> In the past we have solved similar issues (caused by the backwards > Gabriel> ordering), by replaying whatever it was in the correct order (before > Gabriel> doing anything else), and then simply loading references using the > Gabriel> pph_cache when the real ones were needed. BUT we can't do this with > Gabriel> source_locations as they are a simple typedef unsigned int, not > actual > Gabriel> structs which are passed by pointer value... > > A source_location is a reference to a particular line_map. It is sort > of like a compressed pointer. > Right, but our cache works in a way that, say you output a refrenece to a struct, you first remember the pointer value, then output the struct, if you try to ouput the same reference (same pointer value) later we simply mark that as a shared reference in the output (and have logic to link the two on the way in). However since source_location aren't pointers per se, this wouldn't work (at least with our current cache implementation, and changing that is also last resort in my opinion) > Gabriel> @tromey: I hear you are the person in the know when it comes down to > Gabriel> linemaps, do you have any hint on how we could rebuild a valid > Gabriel> line_table given we are obtaining the bindings from the last one in > Gabriel> the file to the first one (that is, using pph (pre-parsed headers) > Gabriel> where we are trying to restore a cached version of the parser state > Gabriel> for a header)? > > Can you not just serialize the line map when creating the PPH? > We were wondering about that, the problem we thought is that a pph can be loaded from anywhere in many different C files (which would generate a different linemap entry in each case I think?). If there was a way to serialize the linemap entries from the LC_ENTER entry for the header file to the LC_LEAVE (i.e. ignoring builtins, command line, etc.), and then directly insert those entries in a given C file compilation's linemaps entries, that would be great! > Then, when using the PPH, read the serialized line map, in order, into > the current global line map. This will preserve the include chains and > all the rest. Then rewrite source locations coming from the PPH to new > locations from the new line map. This rewriting could perhaps even be > done efficiently, say just by adding an offset to all locations -- > though I am not sure, this would require some research. (It seems > doable to me, though; perhaps, but not definitely, requiring some > additional linemap API.) > I will investigate in that direction. > I have no idea if this is practical for you. I guess it depends on how > a PPH is read in. > A pph is read in by: reading the preprocessor tokens and replaying them (i.e. redefining them as if they had been pasted in the C file at the very location of the include triggering the pph). We then load the bindings attached to the global_namespace defined in the header and all bin
Re: Linemap and pph
On Sat, Jul 23, 2011 at 8:51 AM, Dodji Seketeli wrote: > Gabriel Charette a écrit: > >>> Gabriel> @tromey: I hear you are the person in the know when it comes down >>> to >>> Gabriel> linemaps, do you have any hint on how we could rebuild a valid >>> Gabriel> line_table given we are obtaining the bindings from the last one in >>> Gabriel> the file to the first one (that is, using pph (pre-parsed headers) >>> Gabriel> where we are trying to restore a cached version of the parser state >>> Gabriel> for a header)? >>> >>> Can you not just serialize the line map when creating the PPH? >>> >> >> We were wondering about that, the problem we thought is that a pph can >> be loaded from anywhere in many different C files (which would >> generate a different linemap entry in each case I think?). > > Just to make sure I understand, a given header included from two > different main C files with two different sets of macro context would > yield two macro maps with different contents anyway. So I would believe > that these would yield two different pph, one for each C file. Is that > correct? > Before using a pph, we make sure the current macro context is the same as the one in which it was originally compiled (as far as what that header is using from that context). So yes, sometimes we have different pph for different C files, but it's very possible to use the same pph for different includes in independent C files. > For the cases where the same pph could be loaded from two different main > CUs, I'd say that in the context of a given main CU being parsed, loading > the PPH and its associated serialized line map would yield a new line > map to be inserted into the struct linemaps of the current CU. > > Then, not only should the source_locations encoded by that new > de-serialized line map be rewritten (probably by modifying things like > struct linemap::to_line and struct linemap::included_from, if need be) > to fit into the source_location space of the line map set carried by the > struct linemaps of the current main CU being parsed, but the > source_locations carried by the tokens present inside the pph must also > be changed to look as if they were yielded by the newly rewritten line > map. This is what could look expensive at first sight. > Right this is one of the option, but as you say, potentially an expensive change. > Wouldn't that address the issue of a given pph loaded by multiple main C > files? > If it works, yes, I think. >> If there was a way to serialize the linemap entries from the LC_ENTER >> entry for the header file to the LC_LEAVE (i.e. ignoring builtins, >> command line, etc.) > > I believe all the builtins have the same source_location, which is the > BUILTIN_LOCATION constant. As for command line options, I believe they > get the location UNKNOWN_LOCATION. So I would think these should not be > a problem when de-serializing the line map. > Right, but no point serializing them only to filter them on input.
Re: Linemap and pph
Alright, so after looking even more at the linemap code, here is what I'm thinking now: So the main problem is: with the linemap logic is that linemap_line_start adds a new table entry if line_delta < 0 (I don't understand why this is needed, my assumption is that it is because the rest of the logic depends on set->highest_location, thus we want to make sure we are higher than the highest so far? can someone clarify why we need this?) My solution: I will add a boolean flag to linemap_line_start's parameters, allowEarlierStartLineStart, which when true, will not create a new entry in the line_table even if the line_delta < 0. Instead of relying on highest_location to find the current line's source location, it will use the start_location and add to it the delta between to_line and map->to_line). I will also add a new linemap function, linemap_position_for_line_and_column, which will be just like linemap_position_for_column, except that it won't assume the highest_line in the set is the one we are currently on (this function will thus allow us to get a source_location without depending on the current assumptions on the order these functions are called in). This solution would not modify the actual final result of the entries, at the end of parsing, as far as I understand (I would also of course only use this new flag in the call from lto_input_location (which we use in pph to get our source_location from the streamed expanded_location)) Let me know what you think, Gabriel
Re: Linemap and pph
On Tue, Jul 26, 2011 at 4:25 AM, Dodji Seketeli wrote: > Gabriel Charette a écrit: > >> Alright, so after looking even more at the linemap code, here is what >> I'm thinking now: >> >> So the main problem is: >> with the linemap logic is that linemap_line_start adds a new table >> entry if line_delta < 0 (I don't understand why this is needed, my >> assumption is that it is because the rest of the logic depends on >> set->highest_location, thus we want to make sure we are higher than >> the highest so far? can someone clarify why we need this?) > > Here is how I understand this. Sorry if it's too obvious. > > The linemap_line_start model follows that of a parser that parses tokens > from left to right, and from top to bottom. Let's imagine a main CU > called M which has this layout: > > < start, locus #0 > > [... tokens ...] > > #include "A" <--- locus #1 > > [... tokens ...] > > There are going to be at least three line maps (instances of struct > line_map) created and added to the line map set (struct line_maps) of M: > one for loci from #0 to right before #1, at least one for loci coming > from the included header A (there can be several line maps here, if A > itself includes other headers) and another one from loci coming right > after #1. > > A hard invariant here is that source_locations yielded by a given line > map must be less than the struct line_map::start_location of the next > line map. This is (partly) so that line map lookup (when we want to get > the line map that corresponds to a given source_location) can be done > using a binary search, which is faster than just doing a linear search. > A side effect of this is that source_locations handed out by > linemap_line_start increase monotonically. > > With these assumptions in mind, when the parser code calls > linemap_line_start to get the source_location corresponding to a new > line at column 0, the only case where the new line is less than the > preceding line (so that line_delta < 0) is if the parser is entering a > new file. E.g, it is entering the header A, coming from M. Or, it is > getting back to M, leaving A. In that case, it seems required for > linemap_line_start to create the new line map for A or for M. > Thanks for this explanation, that's pretty much how I had understood the linemap code, but you clarified some of the things I wasn't clear about. >> My solution: >> I will add a boolean flag to linemap_line_start's parameters, >> allowEarlierStartLineStart, which when true, will not create a new >> entry in the line_table even if the line_delta < 0. > > If I understand correctly, you are acting like if you were parsing from > right to left and from bottom to top, for the content of A. > Right. > In that context, you need to keep the hard invariant I talked about > above, so that line map lookup keeps working, at least. > > In other words, when you generate a source_location for a token T0 that > comes topologically /before/ the token T1 you generated source_location > for at the previous step, you need to make sure that source_location of > T0 is less than source_location of T1, and that the struct > line_map::start_location of the current map is less than T0. If T0 > comes from a different header than T1 (suppose both T1 and T0 comes from > headers that are included by A) then a new line map (different from the > line map for T1) needs to be created for T0. Now I am not sure if in > your design, there is going to be only one pph for A, or if there is > going to be one pph for each header included by A as well. I would have > imagined the later case to be more appropriate. > Correct we are using the multiple pph approach (one for each included header in the header). This actually makes the problem harder than I had originally pictured it yesterday. I'm now thinking using serialization is going to be a better solution because of that. >> Instead of relying on highest_location to find the current line's >> source location, it will use the start_location and add to it the >> delta between to_line and map->to_line). > > If the map->start_location was allocated correctly, I guess this could > work assuming the constraints I raised above are respected. > > Note however that there are corner cases like having so long lines in > the source code being parsed that we fill the column-space of the line > map so quickly that we need to allocate more line maps for a given file. > These are corner cases taken care of by linemap_line_start today that > you could miss if you are trying to walk in reverse order like this. > Those would have been ad
Re: Linemap and pph
As mentionned in my previous email, I'm now thinking of serializing the linemap from the header by pph'ed, here is how I think this could work: From what I understand, the source_locations allocated for everything in a given set of headers (from the LC_ENTER for the header in the line_table up to, and including everything in between, the corresponding LC_LEAVE) is dependent on only one thing; the value of line_table->highest_location when the header was inserted (i.e. if in two different contexts we happen to have the same line_table->highest_location when inserting header A.h, all of the tokens for A.h in each context will have the same source_location). If the previous statement is true, then we calculate an offset between the line_table->highest_location as it was in the serialized version of the header and as it is in the C file in which we are about to de-serialize the line table. We then need to update some things based on that offset: for each line_map entry in the line_table: start_location for each source_location field on tokens coming in from the header Some other things need to be updated with simple logic: for each line_map entry in the line_table: included_from for the line_table itself: highest_location, highest_line, max_column_hint Some things just stay as is when de-serialized: for each line_map entry in the line_table: reason, sysp, to_file, to_line, column_bits After doing this we should have the same line_map we would have had when doing a regular compile (non-pph), and the tokens coming from the header file should have the correct source_location, simply by adding the calculated offset to the one they had previously (this also allows us to only serialize the source_location for each token as opposed to serializing it's expanded location, saving space in the pph). Let me know what you think, Gabriel On Tue, Jul 26, 2011 at 11:10 AM, Gabriel Charette wrote: > On Tue, Jul 26, 2011 at 4:25 AM, Dodji Seketeli wrote: >> Gabriel Charette a écrit: >> >>> Alright, so after looking even more at the linemap code, here is what >>> I'm thinking now: >>> >>> So the main problem is: >>> with the linemap logic is that linemap_line_start adds a new table >>> entry if line_delta < 0 (I don't understand why this is needed, my >>> assumption is that it is because the rest of the logic depends on >>> set->highest_location, thus we want to make sure we are higher than >>> the highest so far? can someone clarify why we need this?) >> >> Here is how I understand this. Sorry if it's too obvious. >> >> The linemap_line_start model follows that of a parser that parses tokens >> from left to right, and from top to bottom. Let's imagine a main CU >> called M which has this layout: >> >> < start, locus #0 >> >> [... tokens ...] >> >> #include "A" <--- locus #1 >> >> [... tokens ...] >> >> There are going to be at least three line maps (instances of struct >> line_map) created and added to the line map set (struct line_maps) of M: >> one for loci from #0 to right before #1, at least one for loci coming >> from the included header A (there can be several line maps here, if A >> itself includes other headers) and another one from loci coming right >> after #1. >> >> A hard invariant here is that source_locations yielded by a given line >> map must be less than the struct line_map::start_location of the next >> line map. This is (partly) so that line map lookup (when we want to get >> the line map that corresponds to a given source_location) can be done >> using a binary search, which is faster than just doing a linear search. >> A side effect of this is that source_locations handed out by >> linemap_line_start increase monotonically. >> >> With these assumptions in mind, when the parser code calls >> linemap_line_start to get the source_location corresponding to a new >> line at column 0, the only case where the new line is less than the >> preceding line (so that line_delta < 0) is if the parser is entering a >> new file. E.g, it is entering the header A, coming from M. Or, it is >> getting back to M, leaving A. In that case, it seems required for >> linemap_line_start to create the new line map for A or for M. >> > > Thanks for this explanation, that's pretty much how I had understood > the linemap code, but you clarified some of the things I wasn't clear > about. > >>> My solution: >>> I will add a boolean flag to linemap_line_start's parameters, >>> allowEarlierStartLineStart, which when true, will not create a new >>> entry
Re: Linemap and pph
I think I wasn't clear in the way I expressed my assumptions in my last email: On Wed, Jul 27, 2011 at 1:11 AM, Dodji Seketeli wrote: > > Gabriel Charette a écrit: > > > From what I understand, the source_locations allocated for > > everything in a given set of headers (from the LC_ENTER for the > > header in the line_table up to, and including everything in between, > > the corresponding LC_LEAVE) is dependent on only one thing; the > > value of line_table->highest_location when the header was inserted > > The source_location handed out by a given line map is dependant on > three things: > > 1/ The line_map::start_location. For a given map, this one equals > line_table->highest_location + 1, because at the time the line map is > created, its line_map::start_location must be greater than the highest > source_location handed out by any line map previously created. At any > point in time, line_table->highest_location equals the > line_map::start_location of the lastly created line_map. This is part > of the monotonically increasing property of source_location we were > talking about earlier. > Actually from my understanding, highest_location is not equal to the last start_location, but to the last source_location returned by either linemap_line_start or linemap_positition_for_column (which is >= to the start_location of the current line_map). > 2/ The line number of the location > > 3/ The column number of the location > Right, that's what I mean, I would not actually stream highest_location. What I meant by they "all depend on only highest_location" is that IF highest_location is the same in file B.c and in a different compiled file C.c when they happen to include A.h, then all of the source_locations for the tokens in A.h (in both compilation) would be identical (i.e. token1's loc in B == token1's loc in C, etc.) (as 2/3 always is the same since we're talking about the same file A.h in both compilation, hence if 1 also holds, we get the same result). > > (i.e. if in two different contexts we happen to have the same > > line_table->highest_location when inserting header A.h, all of the > > tokens for A.h in each context will have the same source_location). > > Each token coming from a given A.h will have a different > source_location, as they will presumably have a different {line,column} > pair. > What I meant is that all of the source locations handed out in the first compilation will be the same as all of the source locations handed out in the second compilation, pairwise (not that ALL token's source locations themselves will be the same within a single compilation of course!). > If in two different contexts we happen to have the same > line_table->highest_location, the line_map::start_location of the two > different line maps of the two different A.hs will be equal, though. > > > If the previous statement is true, then we calculate an offset > > between the line_table->highest_location as it was in the serialized > > version of the header and as it is in the C file in which we are > > about to de-serialize the line table. > > > > We then need to update some things based on that offset: > > [...] > Hence, given that they only depend on start_location, I just have to calculate an offset between the serialized start_location and the start_location as it would be (highest_location + 1) in the C file including the header, and offset all of the source_locations on each token coming from the pph (without even needing to recalculate them!). > It seems to me that you could just set the line_map::start_location of > the de-serialized map for the current portion of A.h to the current > line_table->highest_location of the main CU your are currently parsing > (i.e, just forget about the serialized line_table->highest_location > into account. Actually I would think that it's unnecessary to > serialize the line_table->highest_location) + 1. > > Then you should also set the de-serialized line_map::to_line to the > current line in your context. Then you should add that line map to > the current line_table and set line_table->highest_location to the > line_map::start_location you have just computed. > > Then, assign the source_location of each token that belongs to that > de-serialized map to a source_location that is handed out by your new > linemap_position_for_line_and_column, with de-serialized map passed in > argument. This is where you would progress "backward" in the token > stream, for that given map. > > Then somehow, when you need to suck in a new pph (so this time you are > going downward in the stream again), just start this dance of > de-serializing the map for th
Re: Linemap and pph
>> >> Hence, given that they only depend on start_location, I just have to >> calculate an offset between the serialized start_location and the >> start_location as it would be (highest_location + 1) in the C file >> including the header, and offset all of the source_locations on each >> token coming from the pph (without even needing to recalculate them!). >> > > That could work. But then you'd need to do something for a map encoding > the locations of tokens coming from the pph to appear in line_table, > right? Otherwise, at lookup time, (when you want to find the map that > matches the source_location of a token coming for that pph), you'll be > in trouble. I am saying this b/c you are not calling linemap_line_start > anymore. And that function was the one that was including the said map > to line_table. And the map still must be inserted into line_table > somehow. > The lookup, from what I understand, only depends on the line_map entries for the header to be present, and the same as they would be *had* the functions been called (not that each token actually called the linemap getters to get its location), and also of course depends on that the tokens have the correct source_locations *as if* obtained from the linemap getters. >> Doing it this way (with the offset) I would read in all the tokens and >> linemap entries inherited from that header and it's underlying include >> tree, thus no need to be tricky about inserting line tables for the >> header's included file, as they are part of the header's serialized >> line_table by recursion (a pph'ed header can include other pph'ed >> header), > > This is what I am not sure to understand. There is only one line table > per CU. The headers included by the CU generate instances of struct > line map that are inserted into the line table of the CU. So I don't > understand what you mean by "header's serialized line_table", as I don't > think there is such a thing as a header's line_table. What I mean by "serialized header line_table" is the serialized version of the line_table as it was when were done parsing the header being pph'ed. I would then de-serialize that and insert it's line_map entries in the C file's line_table, doing the necessary offset adjustements in the process (and updating all other line_table variables like highest_location that would have changed if we had actually called the linemap functions) Best, Gabriel
Re: Line 0 Hack??
Re-sending as plain text for gcc@gcc.gnu.org ... Hi, I have a question about the line 0 hack on line 13232 of gcc/cp/decl.c (or just text search for "Hack", it's the only place it's found in that file...). From my revision history, Steven introduced this in 2005, and Tom modified it in 2007 (probably when modifying the linemap). The problem is that this very call to get the source_location of line 0 creates a NEW linemap entry in the line table, AFTER all of the LC_LEAVE have taken place (i.e. we were done parsing and now add a new linemap to the line_table)... And hence, we finish the parsing with line_table->depth == 1. In particular, I am building linemap serialization for pre-parsed headers, and added what I think to be a fair gcc_assert that when we serialize the line_table, it's depth should be 0. However, if we happen to have "main" in the header (this is when we get in this block in decl.c from my understanding as DECL_MAIN_P is then true), a new linemap is added at the end of the line table in the header after the LC_LEAVE... when merging this in the middle of a C file upon deserializing, this entry makes no sense, but I can't just ignore it either as a source_location has been handed off for it... My question is, what is this "special line 0" is this just a hack for this particular situation or is "line 0" a much more important concept I can't mess around with? I could potentially hack around it, but hacking around a hack can only make things worst in the long run.. I'm not familiar with the "middle end warning" referred to in the comment in decl.c, could we potentially once and for all get rid of this hack? Best, Gabriel
Re: Line 0 Hack??
I have removed the hack and the test output is identical to the clean build test output. See issue4835047 for the patch. Gabriel On Mon, Aug 1, 2011 at 2:56 PM, Gabriel Charette wrote: > Re-sending as plain text for gcc@gcc.gnu.org ... > > > > Hi, > > I have a question about the line 0 hack on line 13232 of gcc/cp/decl.c > (or just text search for "Hack", it's the only place it's found in > that file...). > > From my revision history, Steven introduced this in 2005, and Tom > modified it in 2007 (probably when modifying the linemap). > > The problem is that this very call to get the source_location of line > 0 creates a NEW linemap entry in the line table, AFTER all of the > LC_LEAVE have taken place (i.e. we were done parsing and now add a new > linemap to the line_table)... > > And hence, we finish the parsing with line_table->depth == 1. > > In particular, I am building linemap serialization for pre-parsed > headers, and added what I think to be a fair gcc_assert that when we > serialize the line_table, it's depth should be 0. However, if we > happen to have "main" in the header (this is when we get in this block > in decl.c from my understanding as DECL_MAIN_P is then true), a new > linemap is added at the end of the line table in the header after the > LC_LEAVE... when merging this in the middle of a C file upon > deserializing, this entry makes no sense, but I can't just ignore it > either as a source_location has been handed off for it... > > My question is, what is this "special line 0" is this just a hack for > this particular situation or is "line 0" a much more important concept > I can't mess around with? > > I could potentially hack around it, but hacking around a hack can only > make things worst in the long run.. > > I'm not familiar with the "middle end warning" referred to in the > comment in decl.c, could we potentially once and for all get rid of > this hack? > > Best, > Gabriel >