On Wednesday 30 November 2011 18:32:11 Janne Blomqvist wrote: > Hi, > > this patch expands a bit on the recent work done by Thomas Koenig. > Using aermod.f90 from the polyhedron benchmark suite as a test case, > the lseek() calls as reported by strace -c -f go roughly as > > - trunk before Thomas' patch: 21 million > > - current trunk: 5.7 million > > - with attached patch: 2.7 million > > As can be seen, this patch roughly halves the seeks. Of course, 2.7 > million is still a ridiculously high number, but further work requires > different kind of changes, perhaps also a bit riskier, which is why > I'd like to get this in separately. > > Regtested on x86_64-unknown-linux-gnu, Ok for trunk? > > 2011-11-30 Janne Blomqvist <j...@gcc.gnu.org> > > PR fortran/25708 > * module.c (parse_string): Read string into resizable array > instead of parsing twice and seeking. > (verify_atoms): New function. > (require_atom): Move checking to verify_atoms, call it. > (mio_typespec): Don't peek. > (mio_constructor): Likewise. > (mio_typebound_proc): Likewise. > (mio_full_typebound_tree): Likewise. > (mio_f2k_derived): Likewise. > (load_operator_interfaces): Likewise. > (load_generic_interfaces): Likewise. > (load_commons): Likewise. > (load_equiv): Likewise. > (load_derived_extensions): Likewise. > (read_module): Likewise.
Hi, > > modseek1.diff > diff --git a/gcc/fortran/module.c b/gcc/fortran/module.c > index 70f8565..982425d 100644 > --- a/gcc/fortran/module.c > +++ b/gcc/fortran/module.c > @@ -1069,51 +1069,49 @@ module_unget_char (void) > static void > parse_string (void) > { > - module_locus start; > - int len, c; > - char *p; > - > - get_module_locus (&start); > + int c; > + size_t cursz = 30; > + size_t len = 0; > > - len = 0; > + atom_string = XNEWVEC (char, cursz); > > - /* See how long the string is. */ > for ( ; ; ) > { > c = module_char (); > - if (c == EOF) > - bad_module ("Unexpected end of module in string constant"); > > - if (c != '\'') > + if (c == '\'') > { > - len++; > - continue; > + int c2 = module_char (); > + if (c2 == '\'') > + { > + if (len + 1 >= cursz) > + { > + cursz *= 2; > + atom_string = XRESIZEVEC (char, atom_string, cursz); > + } > + atom_string[len] = c; > + len++; > + atom_string[len] = c2; > + len++; I think you are changing the behaviour here. Single quotes in a string are encoded as two single quotes in a row. So when decoding, only one of them should be kept. Honestly, it may well be dead/unused code, but it's something we don't want to change at this point. [...] > @@ -1293,22 +1291,15 @@ peek_atom (void) > } > > > -/* Read the next atom from the input, requiring that it be a > - particular kind. */ > +/* Verify that two atoms are equal, fatal error otherwise. */ > > static void > -require_atom (atom_type type) > +verify_atoms (atom_type got, atom_type expected) > { > - module_locus m; > - atom_type t; > const char *p; > - > - get_module_locus (&m); > - > - t = parse_atom (); > - if (t != type) > + if (got != expected) > { > - switch (type) > + switch (expected) > { > case ATOM_NAME: > p = _("Expected name"); > @@ -1329,12 +1320,24 @@ require_atom (atom_type type) > gfc_internal_error ("require_atom(): bad atom type required"); > } > > - set_module_locus (&m); > bad_module (p); I'm certainly nitpicking here, but for me it's better to have the file position before the unexpected atom than after it when throwing the error in bad_module. To save the call to fgetpos, (through get_module_locus) one could save/restore module_column and module_line as the call to bad_module doesn't need anything else. About the rest of the patch, it's unfortunate that it makes the code slightly more difficult to read [this is my nitpicking day]. I think it's better to optimize peek_atom: the first character is sufficient to distinguish between atom types, so ungetc/module_unget_char can be used, even for strings, names or numbers. Mikael