Tamar Christina <tamar.christ...@arm.com> writes:
> Hi All,
>
> This patch fixes a couple of issues in AArch64's -mcpu=native processing:
>
> The buffer used to read the lines from /proc/cpuinfo is 128 bytes long.  While
> this was enough in the past with the increase in architecture extensions it is
> no longer enough.   It results in two bugs:
>
> 1) No option string longer than 127 characters is correctly parsed.  Features
>    that are supported are silently ignored.
>
> 2) It incorrectly enables features that are not present on the machine:
>   a) It checks for substring matching instead of full word matching.  This 
> makes
>      it incorrectly detect sb support when ssbs is provided instead.
>   b) Due to the truncation at the 127 char border it also incorrectly enables
>      features due to the full feature being cut off and the part that is left
>      accidentally enables something else.
>
> This breaks -mcpu=native detection on some of our newer system.
>
> The patch fixes these issues by reading full lines up to the \n in a string.
> This gives us the full feature line.  Secondly it creates a set from this 
> string
> to:
>
>  1) Reduce matching complexity from O(n*m) to O(n*logm).
>  2) Perform whole word matching instead of substring matching.
>
> To make this code somewhat cleaner I also changed from using char* to using
> std::string and std::set.
>
> Note that I have intentionally avoided the use of ifstream and stringstream
> to make it easier to backport.  I have also not change the substring matching
> for the initial line classification as I cannot find a documented cpuinfo 
> format
> which leads me to believe there may be kernels out there that require this 
> which
> may be why the original code does this.
>
> I also do not want this to break if the kernel adds a new line that is long 
> and
> indents the file by two tabs to keep everything aligned.  In short I think an
> imprecise match is the right thing here.
>
> Test for this is added as the last thing in this series as it requires some
> changes to be made to be able to test this.
>
> Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.

Sorry to be awkward.  I know Kyrill's already approved this,
but I kind-of object.

I think we should split this into two: fixing the limit bug,
and fixing the complexity.  It's easier to justify backporting
the fix for the limit bug than the fix for the complexity.

For fixing the limit bug, it seems better to use an obstack to read the
file instead.  This should actually be much easier than what the patch
does. :-)  It should also be more efficient.

For fixing the complexity: does this actually make things noticeably
faster in practice?  The “m” in the O(n*m) is a fairly small constant.
The cost of making it O(n*logm) this way involves many more memory
allocations, so it isn't obvious that it's actually more efficient
in practice.  Do you have any timings?

If search time is a problem, building a hash_map might be better.

Thanks,
Richard

Reply via email to