libcpp has a degraded fallback mode for large source files where if a
location_t > LINE_MAP_MAX_LOCATION_WITH_COLS
we fall back to just tracking lines, not columns&ranges
(currently 0x60000000), and every location_t expands to having a
column == 0.
Sadly we're more likely to see this case in gcc 6 than in gcc 5 since
I'm using up 5 bits of location_t, for packing short ranges into them
(to avoid needing to go to the ad-hoc lookaside during tokenization;
this change was part of r230331).
-Wmisleading-indentation expects to have meaningful column data, and
so interacts poorly with this fallback, issuing a warning at every
if/for/while, I believe.
I don't think there's a way to fix -Wmisleading-indentation if we're
in this state, so the first part of the following patch detects if
this has happened, and effectively turns off -Wmisleading-indentation
from that point onwards. To avoid a false sense of security, the
patch issues a "sorry" at the that point, currently with this wording:
location-overflow-test-1.c:17:0: sorry, unimplemented: -Wmisleading-indentation
is disabled from this point onwards, since column-tracking was disabled due to
the size of the code/headers
On attempting to bootstrap with just this change, I ran into a
problem: gimple-match.c is hitting the
LINE_MAP_MAX_LOCATION_WITH_COLS limit, and I got a "sorry" at line
50437:0: sorry, unimplemented: etc
std::swap (o30, o31);
i.e. the bit-packing optimization is causing us to hit that
LINE_MAP_MAX_LOCATION_WITH_COLS limit earlier than before,
and presumably this will affect people with other very large source
files (e.g. as in the initial PR c++/68819 report).
Should this greater chance of hitting LINE_MAP_MAX_LOCATION_WITH_COLS
be filed as a separate PR?
The second part of the patch resolves this by adding an additional
level of fallback: a new LINE_MAP_MAX_LOCATION_WITH_PACKED_RANGES
threshold (currently 0x50000000) that occurs well before
the LINE_MAP_MAX_LOCATION_WITH_COLS threshold (0x60000000).
Once we reach the new LINE_MAP_MAX_LOCATION_WITH_PACKED_RANGES
threshold, the range-packing optimization is disabled (with a transition
to an ordinary map with m_range_bits == 0), effectively giving us a
much "longer runway" before the LINE_MAP_MAX_LOCATION_WITH_COLS
threshold is hit, at the cost to requiring the use of the ad-hoc
table for every location (e.g. every token of length > 1).
I haven't yet done performance testing on this.
The patch adds test coverage for this by using a plugin to simulate
the two levels of degraded locations.
Rough calculations, assuming 7 bits of columns,
LINE_MAP_MAX_LOCATION_WITH_COLS == 0x60000000
LINE_MAP_MAX_LOCATION_WITH_PACKED_RANGES == 0x50000000
gcc 5:
0x60000000 / 128 per line = 12,582,912 lines of code before
hitting the has-column-information limit.
gcc 6 trunk:
0x60000000 / (128 * 32) per line = 393,216 lines of code before
hitting the has-column-information limit.
with this patch:
0x50000000 / (128 * 32) per line = 327,680 lines of code before
hitting the range-packing limit, then:
0x10000000 / 128 = 2,097,152 lines of code before hitting the
has-column-information limit.
giving 2,424,832 lines of code total before hitting the
has-column-information limit.
These numbers will be less in the face of lines longer than
127 characters.
If the increased use of the ad-hoc table is an issue, another
approach might be to simply disable range-handling for locations
that go beyond a threshold location_t value: attempts to combine
locations above that value lead to you simply getting the caret
location. If we take this approach, I think we'd still want to
have a range threshold before the column one, so that we preserve
the ability to have column information for these pure-caret
locations.
Alternatively, the range bits could be lowered from 5 to 4,
doubling the lines we can handle before hitting the limit:
0x60000000 / (128 * 16) = 786,432, though that doesn't buy
us much compared to the approach in this patch.
Successfully bootstrapped®rtested on x86_64-pc-linux-gnu.
Thoughts?
gcc/c-family/ChangeLog:
PR c++/68819
* c-indentation.c (get_visual_column): Handle the column
number being zero by effectively disabling the warning, with
a "sorry".
gcc/testsuite/ChangeLog:
PR c++/68819
* gcc.dg/plugin/location-overflow-test-1.c: New test case.
* gcc.dg/plugin/location-overflow-test-2.c: New test case.
* gcc.dg/plugin/location_overflow_plugin.c: New test plugin.
* gcc.dg/plugin/plugin.exp (plugin_test_list): Add the above.
libcpp/ChangeLog:
PR c++/68819
* line-map.c (LINE_MAP_MAX_LOCATION_WITH_PACKED_RANGES): New
constant.
(LINE_MAP_MAX_LOCATION_WITH_COLS): Add note about unit tests
to comment.
(can_be_stored_compactly_p): Reduce threshold from
LINE_MAP_MAX_LOCATION_WITH_COLS to
LINE_MAP_MAX_LOCATION_WITH_PACKED_RANGES.
(get_combined_adhoc_loc): Likewise.
(get_range_from_loc): Likewise.
(linemap_line_start): Ensure that a new ordinary map is created
when transitioning from range-packing being enabled to disabled,
at the LINE_MAP_MAX_LOCATION_WITH_PACKED_RANGES threshold. Set
range_bits to 0 for new ordinary maps when beyond this limit.
Prevent the "increase the column bits of a freshly created map"
optimization if the range bits has reduced.
---
gcc/c-family/c-indentation.c | 19 ++++
.../gcc.dg/plugin/location-overflow-test-1.c | 29 ++++++
.../gcc.dg/plugin/location-overflow-test-2.c | 37 ++++++++
.../gcc.dg/plugin/location_overflow_plugin.c | 103 +++++++++++++++++++++
gcc/testsuite/gcc.dg/plugin/plugin.exp | 3 +
libcpp/line-map.c | 27 ++++--
6 files changed, 212 insertions(+), 6 deletions(-)
create mode 100644 gcc/testsuite/gcc.dg/plugin/location-overflow-test-1.c
create mode 100644 gcc/testsuite/gcc.dg/plugin/location-overflow-test-2.c
create mode 100644 gcc/testsuite/gcc.dg/plugin/location_overflow_plugin.c
diff --git a/gcc/c-family/c-indentation.c b/gcc/c-family/c-indentation.c
index ca7efdc..229a017 100644
--- a/gcc/c-family/c-indentation.c
+++ b/gcc/c-family/c-indentation.c
@@ -40,6 +40,25 @@ get_visual_column (expanded_location exploc,
unsigned int *out,
unsigned int *first_nws = NULL)
{
+ /* PR c++/68819: if the column number is zero, we presumably
+ had a location_t > LINE_MAP_MAX_LOCATION_WITH_COLS, and so
+ we have no column information.
+ Act as if no conversion was possible, triggering the
+ error-handling path in the caller. */
+ if (!exploc.column)
+ {
+ static bool issued_apology = false;
+ if (!issued_apology)
+ {
+ /* Notify the user the first time this happens. */
+ issued_apology = true;
+ sorry ("-Wmisleading-indentation is disabled from this point"
+ " onwards, since column-tracking was disabled due to"
+ " the size of the code/headers");
+ }
+ return false;
+ }
+
int line_len;
const char *line = location_get_source_line (exploc.file, exploc.line,
&line_len);
diff --git a/gcc/testsuite/gcc.dg/plugin/location-overflow-test-1.c
b/gcc/testsuite/gcc.dg/plugin/location-overflow-test-1.c
new file mode 100644
index 0000000..1fefc90
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/plugin/location-overflow-test-1.c
@@ -0,0 +1,29 @@
+/* { dg-options "-Wmisleading-indentation -Wall
-fplugin-arg-location_overflow_plugin-value=0x60000001" } */
+
+/* We use location_overflow_plugin.c to inject the
+ which injects the case that location_t values have exceeded
+ LINE_MAP_MAX_LOCATION_WITH_COLS, and hence no column
+ numbers are available. */
+
+/* Verify that we're in column-less mode. */
+extern unknown_type test; /* { dg-error "0: unknown type name" } */
+
+/* PR c++/68819: verify that -Wmisleading-indentation is suppressed. */
+
+int
+fn_1 (int flag)
+{
+ int x = 4, y = 5;
+ if (flag) x = 3; y = 2; /* { dg-message "sorry" } */
+ return x * y;
+}
+
+/* ...and that a "sorry" is only emitted the first time. */
+
+int
+fn_2 (int flag)
+{
+ int x = 4, y = 5;
+ if (flag) x = 3; y = 2; /* { dg-bogus "sorry" } */
+ return x * y;
+}
diff --git a/gcc/testsuite/gcc.dg/plugin/location-overflow-test-2.c
b/gcc/testsuite/gcc.dg/plugin/location-overflow-test-2.c
new file mode 100644
index 0000000..661f0c3
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/plugin/location-overflow-test-2.c
@@ -0,0 +1,37 @@
+/* { dg-options "-fdiagnostics-show-caret -Wmisleading-indentation -Wall
-fplugin-arg-location_overflow_plugin-value=0x50000001" } */
+
+/* We use location_overflow_plugin.c to inject the
+ which injects the case that location_t values have exceeded
+ LINE_MAP_MAX_LOCATION_WITH_PACKED_RANGES, and hence no
+ range-packing should occur. */
+
+/* Verify that we still have column numbers. */
+extern unknown_type test; /* { dg-error "8: unknown type name" } */
+
+/* ...and ranges. */
+/* { dg-begin-multiline-output "" }
+ extern unknown_type test;
+ ^~~~~~~~~~~~
+ { dg-end-multiline-output "" } */
+
+
+/* PR c++/68819: verify that -Wmisleading-indentation is still available. */
+
+int
+fn_1 (int flag)
+{
+ int foo = 4, bar = 5;
+ if (flag) foo = 3; bar = 2; /* { dg-warning "indented" } */
+ return foo * bar;
+}
+
+/* Verify that we still have ranges, despite the lack of packing. */
+
+/* { dg-begin-multiline-output "" }
+ if (flag) foo = 3; bar = 2;
+ ^~~
+ { dg-end-multiline-output "" } */
+/* { dg-begin-multiline-output "" }
+ if (flag) foo = 3; bar = 2;
+ ^~
+ { dg-end-multiline-output "" } */
diff --git a/gcc/testsuite/gcc.dg/plugin/location_overflow_plugin.c
b/gcc/testsuite/gcc.dg/plugin/location_overflow_plugin.c
new file mode 100644
index 0000000..1c140d8
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/plugin/location_overflow_plugin.c
@@ -0,0 +1,103 @@
+/* Plugin for testing how gracefully we degrade in the face of very
+ large source files. */
+
+#include "config.h"
+#include "gcc-plugin.h"
+#include "system.h"
+#include "coretypes.h"
+#include "spellcheck.h"
+#include "diagnostic.h"
+
+int plugin_is_GPL_compatible;
+
+static location_t base_location;
+
+/* Callback handler for the PLUGIN_START_UNIT event; pretend
+ we parsed a very large include file. */
+
+static void
+on_start_unit (void */*gcc_data*/, void */*user_data*/)
+{
+ /* Act as if we've already parsed a large body of code;
+ so that we can simulate various fallbacks in libcpp:
+
+ 0x50000001 > LINE_MAP_MAX_LOCATION_WITH_PACKED_RANGES:
+ this will trigger the creation of line maps with range_bits == 0
+ so that all ranges will be stored in the ad-hoc lookaside.
+
+ 0x60000001 > LINE_MAP_MAX_LOCATION_WITH_COLS:
+ this will trigger the creation of line maps with column_bits == 0
+ and hence we will immediately degrade to having locations in which
+ column number is 0. */
+ line_table->highest_location = base_location;
+}
+
+/* We add some extra testing during diagnostics by chaining up
+ to the finalizer. */
+
+static diagnostic_finalizer_fn original_finalizer = NULL;
+
+static void
+verify_unpacked_ranges (diagnostic_context *context,
+ diagnostic_info *diagnostic)
+{
+ /* Verify that the locations are ad-hoc, not packed. */
+ location_t loc = diagnostic_location (diagnostic);
+ gcc_assert (IS_ADHOC_LOC (loc));
+
+ /* We're done testing; chain up to original finalizer. */
+ gcc_assert (original_finalizer);
+ original_finalizer (context, diagnostic);
+}
+
+static void
+verify_no_columns (diagnostic_context *context,
+ diagnostic_info *diagnostic)
+{
+ /* Verify that the locations have no columns. */
+ location_t loc = diagnostic_location (diagnostic);
+ gcc_assert (LOCATION_COLUMN (loc) == 0);
+
+ /* We're done testing; chain up to original finalizer. */
+ gcc_assert (original_finalizer);
+ original_finalizer (context, diagnostic);
+}
+
+int
+plugin_init (struct plugin_name_args *plugin_info,
+ struct plugin_gcc_version */*version*/)
+{
+ /* Read VALUE from -fplugin-arg-location_overflow_plugin-value=<VALUE>
+ in hexadecimal form into base_location. */
+ for (int i = 0; i < plugin_info->argc; i++)
+ {
+ if (0 == strcmp (plugin_info->argv[i].key, "value"))
+ base_location = strtol (plugin_info->argv[i].value, NULL, 16);
+ }
+
+ if (!base_location)
+ error_at (UNKNOWN_LOCATION, "missing plugin argument");
+
+ register_callback (plugin_info->base_name,
+ PLUGIN_START_UNIT,
+ on_start_unit,
+ NULL); /* void *user_data */
+
+ /* Hack in additional testing, based on the exact value supplied. */
+ original_finalizer = diagnostic_finalizer (global_dc);
+ switch (base_location)
+ {
+ case 0x50000001:
+ diagnostic_finalizer (global_dc) = verify_unpacked_ranges;
+ break;
+
+ case 0x60000001:
+ diagnostic_finalizer (global_dc) = verify_no_columns;
+ break;
+
+ default:
+ error_at (UNKNOWN_LOCATION, "unrecognized value for plugin argument");
+ }
+
+ return 0;
+}
diff --git a/gcc/testsuite/gcc.dg/plugin/plugin.exp
b/gcc/testsuite/gcc.dg/plugin/plugin.exp
index 06080cc..78b062b 100644
--- a/gcc/testsuite/gcc.dg/plugin/plugin.exp
+++ b/gcc/testsuite/gcc.dg/plugin/plugin.exp
@@ -71,6 +71,9 @@ set plugin_test_list [list \
{ diagnostic_plugin_show_trees.c \
diagnostic-test-show-trees-1.c } \
{ levenshtein_plugin.c levenshtein-test-1.c } \
+ { location_overflow_plugin.c \
+ location-overflow-test-1.c \
+ location-overflow-test-2.c } \
]
foreach plugin_test $plugin_test_list {
diff --git a/libcpp/line-map.c b/libcpp/line-map.c
index c20a32b..e3eeff3 100644
--- a/libcpp/line-map.c
+++ b/libcpp/line-map.c
@@ -31,7 +31,16 @@ along with this program; see the file COPYING3. If not see
disabled). */
const unsigned int LINE_MAP_MAX_COLUMN_NUMBER = (1U << 12);
-/* Do not track column numbers if locations get higher than this. */
+/* Do not pack ranges if locations get higher than this.
+ If you change this, update:
+ gcc.dg/plugin/location_overflow_plugin.c
+ gcc.dg/plugin/location-overflow-test-*.c. */
+const source_location LINE_MAP_MAX_LOCATION_WITH_PACKED_RANGES = 0x50000000;
+
+/* Do not track column numbers if locations get higher than this.
+ If you change this, update:
+ gcc.dg/plugin/location_overflow_plugin.c
+ gcc.dg/plugin/location-overflow-test-*.c. */
const source_location LINE_MAP_MAX_LOCATION_WITH_COLS = 0x60000000;
/* Highest possible source location encoded within an ordinary or
@@ -138,7 +147,7 @@ can_be_stored_compactly_p (struct line_maps *set,
if (src_range.m_start < RESERVED_LOCATION_COUNT)
return false;
- if (locus >= LINE_MAP_MAX_LOCATION_WITH_COLS)
+ if (locus >= LINE_MAP_MAX_LOCATION_WITH_PACKED_RANGES)
return false;
/* All 3 locations must be within ordinary maps, typically, the same
@@ -175,7 +184,7 @@ get_combined_adhoc_loc (struct line_maps *set,
/* Any ordinary locations ought to be "pure" at this point: no
compressed ranges. */
linemap_assert (locus < RESERVED_LOCATION_COUNT
- || locus >= LINE_MAP_MAX_LOCATION_WITH_COLS
+ || locus >= LINE_MAP_MAX_LOCATION_WITH_PACKED_RANGES
|| locus >= LINEMAPS_MACRO_LOWEST_LOCATION (set)
|| pure_location_p (set, locus));
@@ -284,7 +293,7 @@ get_range_from_loc (struct line_maps *set,
/* For ordinary maps, extract packed range. */
if (loc >= RESERVED_LOCATION_COUNT
&& loc < LINEMAPS_MACRO_LOWEST_LOCATION (set)
- && loc <= LINE_MAP_MAX_LOCATION_WITH_COLS)
+ && loc <= LINE_MAP_MAX_LOCATION_WITH_PACKED_RANGES)
{
const line_map *map = linemap_lookup (set, loc);
const line_map_ordinary *ordmap = linemap_check_ordinary (map);
@@ -715,6 +724,8 @@ linemap_line_start (struct line_maps *set, linenum_type
to_line,
&& line_delta * map->m_column_and_range_bits > 1000)
|| (max_column_hint >= (1U << effective_column_bits))
|| (max_column_hint <= 80 && effective_column_bits >= 10)
+ || (highest > LINE_MAP_MAX_LOCATION_WITH_PACKED_RANGES
+ && map->m_range_bits > 0)
|| (highest > LINE_MAP_MAX_LOCATION_WITH_COLS
&& (set->max_column_hint || highest >= LINE_MAP_MAX_SOURCE_LOCATION)))
add_map = true;
@@ -739,7 +750,10 @@ linemap_line_start (struct line_maps *set, linenum_type
to_line,
else
{
column_bits = 7;
- range_bits = set->default_range_bits;
+ if (highest <= LINE_MAP_MAX_LOCATION_WITH_PACKED_RANGES)
+ range_bits = set->default_range_bits;
+ else
+ range_bits = 0;
while (max_column_hint >= (1U << column_bits))
column_bits++;
max_column_hint = 1U << column_bits;
@@ -749,7 +763,8 @@ linemap_line_start (struct line_maps *set, linenum_type
to_line,
single line we can sometimes just increase its column_bits instead. */
if (line_delta < 0
|| last_line != ORDINARY_MAP_STARTING_LINE_NUMBER (map)
- || SOURCE_COLUMN (map, highest) >= (1U << column_bits))
+ || SOURCE_COLUMN (map, highest) >= (1U << column_bits)
+ || range_bits < map->m_range_bits)
map = linemap_check_ordinary
(const_cast <line_map *>
(linemap_add (set, LC_RENAME,
--
1.8.5.3