On Mon, Oct 5, 2015 at 5:33 PM, Xinliang David Li <davi...@google.com> wrote: > unsigned ggc_memory = gcov_read_unsigned (); > + unsigned marker = 0, len = 0, k; > + char **string_array, *saved_cc1_strings; > + > for (unsigned j = 0; j < 7; j++) > > > Do not use hard coded number. Use the enum defined in coverage.c.
OK. > > > + string_array[j] = xstrdup (gcov_read_string ()); > + > + k = 0; > + for (unsigned j = 1; j < 7; j++) > > Do not use hard coded number. OK. > > > + { > + if (num_array[j] == 0) > + continue; > + marker += num_array[j]; > > It is better to read if the name of variable 'marker' is changed to > 'j_end' or something similar > > For all the substrings of 'j' kind, there should be just one marker, > right? It looks like here you introduce one marker per string, not one > marker per string kind. I don't understand what you meant here. "marker" is fixed for each j substring (one option kind) -- it the end index of the sub-string array. k-loop is for each string. > > + len += 3; /* [[<FLAG> */ > > Same here for hard coded value. > > + for (; k < marker; k++) > + len += strlen (string_array[k]) + 1; /* 1 for delimter of ']' */ > > Why do we need one ']' per string? This is because the options strings can contain space ' '. I cannot use space as the delimiter, neither is \0 as it is the end of the string of the encoded string. > > > + } > + saved_cc1_strings = (char *) xmalloc (len + 1); > + saved_cc1_strings[0] = 0; > + > + marker = 0; > + k = 0; > + for (unsigned j = 1; j < 7; j++) > > Same here for 7. will fix in the new patch. > > + { > + static const char lipo_string_flags[6] = {'Q', 'B', 'S', > 'D','I', 'C'}; > + if (num_array[j] == 0) > + continue; > + marker += num_array[j]; > > Suggest changing marker to j_end OK. > > + sprintf (saved_cc1_strings, "%s[[%c", saved_cc1_strings, > + lipo_string_flags[j - 1]); > + for (; k < marker; k++) > + { > + sprintf (saved_cc1_strings, "%s%s]", saved_cc1_strings, > + string_array[k]); > > +#define DELIMTER "[[" > > Why double '[' ? I will change to single '['. > > +#define DELIMTER2 "]" > +#define QUOTE_PATH_FLAG 'Q' > +#define BRACKET_PATH_FLAG 'B' > +#define SYSTEM_PATH_FLAG 'S' > +#define D_U_OPTION_FLAG 'D' > +#define INCLUDE_OPTION_FLAG 'I' > +#define COMMAND_ARG_FLAG 'C' > + > +enum lipo_cc1_string_kind { > + k_quote_paths = 0, > + k_bracket_paths, > + k_system_paths, > + k_cpp_defines, > + k_cpp_includes, > + k_lipo_cl_args, > + num_lipo_cc1_string_kind > +}; > + > +struct lipo_parsed_cc1_string { > + const char* source_filename; > + unsigned num[num_lipo_cc1_string_kind]; > + char **strings[num_lipo_cc1_string_kind]; > +}; > + > +struct lipo_parsed_cc1_string * > +lipo_parse_saved_cc1_string (const char *src, char *str, > + bool parse_cl_args_only); > +void free_parsed_string (struct lipo_parsed_cc1_string *string); > + > > Declare above in a header file. OK. > > > /* Returns true if the command-line arguments stored in the given > module-infos > are incompatible. */ > bool > -incompatible_cl_args (struct gcov_module_info* mod_info1, > - struct gcov_module_info* mod_info2) > +incompatible_cl_args (struct lipo_parsed_cc1_string* mod_info1, > + struct lipo_parsed_cc1_string* mod_info2) > > Fix formating. OK. > > { > { > @@ -1647,7 +1679,7 @@ build_var (tree fn_decl, tree type, int counter) > /* Creates the gcov_fn_info RECORD_TYPE. */ > > NULL_TREE, get_gcov_unsigned_t ()); > DECL_CHAIN (field) = fields; > fields = field; > > - /* Num bracket paths */ > + /* cc1_uncompressed_strlen field */ > field = build_decl (BUILTINS_LOCATION, FIELD_DECL, > NULL_TREE, get_gcov_unsigned_t ()); > DECL_CHAIN (field) = fields; > fields = field; > > > Why do we need to store uncompressed string length? If there is need > to do that, I suggest combine uncompressed length and compressed > length into one 32bit integer. (16/16, or 17/15 split) In theory, I don't need the uncompressed length, But I would need to guess the uncompressed length to allocate the buffer. If the decompressing fails with insufficient space, I need to have another guess and do it again. I think it's easier just save the uncompressed size to the struct. I think combine the two sizes into one gcov_unsigned_t is a good idea. We don't expect the string size are very big. > > > David > > On Mon, Oct 5, 2015 at 3:51 PM, Rong Xu <x...@google.com> wrote: >> Hi, >> >> This patch is for google branch only. >> >> It encodes and compresses various cc1 option strings in >> gcov_module_info to reduce the lipo instrumented object size. The >> savings are from string compression and the reduced number of >> relocations. >> >> More specifically, we replace the following fields in gcov_module_info >> gcov_unsigned_t num_quote_paths; >> gcov_unsigned_t num_bracket_paths; >> gcov_unsigned_t num_system_paths; >> gcov_unsigned_t num_cpp_defines; >> gcov_unsigned_t num_cpp_includes; >> gcov_unsigned_t num_cl_args; >> char *string_array[1]; >> with >> gcov_unsigned_t cc1_strlen; >> gcov_unsigned_t cc1_uncompressed_strlen; >> char *saved_cc1_strings; >> >> The new saved_cc1_strings are zlib compressed string. >> >> Tested with google internal benchmarks. >> >> Thanks, >> >> -Rong