On Wed, 23 Sep 2020 at 17:33, Martin Sebor <mse...@gmail.com> wrote:
>
> On 9/23/20 2:54 AM, Christophe Lyon wrote:
> > On Wed, 23 Sep 2020 at 01:47, Martin Sebor <mse...@gmail.com> wrote:
> >>
> >> On 9/22/20 9:15 AM, Christophe Lyon wrote:
> >>> On Tue, 22 Sep 2020 at 17:02, Martin Sebor <mse...@gmail.com> wrote:
> >>>>
> >>>> Hi Christophe,
> >>>>
> >>>> While checking recent test results I noticed many posts with results
> >>>> for various flavors of arm that at high level seem like duplicates
> >>>> of one another.
> >>>>
> >>>> For example, the batch below all have the same title, but not all
> >>>> of the contents are the same.  The details (such as test failures)
> >>>> on some of the pages are different.
> >>>>
> >>>> Can you help explain the differences?  Is there a way to avoid
> >>>> the duplication?
> >>>>
> >>>
> >>> Sure, I am aware that many results look the same...
> >>>
> >>>
> >>> If you look at the top of the report (~line 5), you'll see:
> >>> Running target myarm-sim
> >>> Running target myarm-sim/-mthumb/-mcpu=cortex-m3/-mfloat-abi=soft/-march=armv7-m
> >>> Running target myarm-sim/-mthumb/-mcpu=cortex-m0/-mfloat-abi=soft/-march=armv6s-m
> >>> Running target myarm-sim/-mcpu=cortex-a7/-mfloat-abi=hard/-march=armv7ve+simd
> >>> Running target myarm-sim/-mthumb/-mcpu=cortex-m7/-mfloat-abi=hard/-march=armv7e-m+fp.dp
> >>> Running target myarm-sim/-mthumb/-mcpu=cortex-m4/-mfloat-abi=hard/-march=armv7e-m+fp
> >>> Running target myarm-sim/-mthumb/-mcpu=cortex-m33/-mfloat-abi=hard/-march=armv8-m.main+fp+dsp
> >>> Running target myarm-sim/-mcpu=cortex-a7/-mfloat-abi=soft/-march=armv7ve+simd
> >>> Running target myarm-sim/-mthumb/-mcpu=cortex-a7/-mfloat-abi=hard/-march=armv7ve+simd
> >>>
> >>> For all of these, the first line of the report is:
> >>> LAST_UPDATED: Tue Sep 22 09:39:18 UTC 2020 (revision
> >>> r11-3343-g44135373fcdbe4019c5524ec3dff8e93d9ef113c)
> >>> TARGET=arm-none-eabi CPU=default FPU=default MODE=default
> >>>
> >>> I have other combinations where I override the configure flags, eg:
> >>> LAST_UPDATED: Tue Sep 22 11:25:12 UTC 2020 (revision
> >>> r9-8928-gb3043e490896ea37cd0273e6e149c3eeb3298720)
> >>> TARGET=arm-none-linux-gnueabihf CPU=cortex-a9 FPU=neon-fp16 MODE=thumb
> >>>
> >>> I tried to see if I could fit something in the subject line, but that
> >>> didn't seem convenient: it would be too long, and I'd rather not
> >>> modify the awk script...
> >>
> >> Without some indication of a difference in the title there's no way
> >> to know what result to look at, and checking all of them isn't really
> >> practical.  The duplication (and the sheer number of results) also
> >> make it more difficult to find results for targets other than arm-*.
> >> There are about 13,000 results for September and over 10,000 of those
> >> for arm-* alone.  It's good to have data but when there's this much
> >> of it, and when the only form of presentation is as a running list,
> >> it's too cumbersome to work with.
> >>
> >
> > To help me track & report regressions, I build higher-level reports like:
> > https://people.linaro.org/~christophe.lyon/cross-validation/gcc/trunk/0latest/report-build-info.html
> > where it's more obvious what configurations are tested.
>
> That looks awesome!  The regression indicator looks especially
> helpful.  I really wish we had an overview like this for all
> results.  For years I've been thinking about writing a script to
> scrape gcc-testresults and format an HTML table kind of like this.
> With that, the number of posts sent to the list wouldn't be
> a problem (at least not for those using the page).  But it would
> require settling on a standard format for the basic parameters of
> each run.
>

It's probably easier to detect regressions and format reports from the
.sum files directly rather than extracting the results from the
mailing-list archives.  But your approach has the advantage that you
can detect regressions from reports sent by other people, not only
your own.


> >
> > Each line of such reports can send a message to gcc-testresults.
> >
> > I can control when such emails are sent, independently for each line:
> > - never
> > - for daily bump
> > - for each validation
> >
> > So, I can easily reduce the number of emails (by disabling them for
> > some configurations), but that won't make the subjects more
> > informative.
> > I included the short revision (rXX-YYYY) in the title to make it clearer.
> >
> > The number of configurations has grown over time because we
> > regularly found regressions in configurations not tested previously.
> >
> > I can probably easily add the values of --with-cpu, --with-fpu,
> > --with-mode and RUNTESTFLAGS as part of the
> > [<branch> revision rXX-YYYY-ZZZZZ] string in the title.  Would that
> > help?  I fear that's going to make for very long subject lines.
> >
> > It would probably be cleaner to update test_summary so that it adds
> > more info as part of $host (as in "... testsuite on $host"), grabbing
> > useful configure parameters and runtestflags; however, this would be
> > more controversial.
>
> Until a way to present summaries is available, would grouping
> the results of multiple runs in the same "basic configuration"
> (for some definition of basic) in the same post work for you?
>

That's not convenient for me at the moment: each build+make check runs
on a different server in a scratch area.  It sends its results, saves
the logs, and everything else is discarded.
After that I have a pass that computes regressions once all the .sum
files are available, and that's when I build the HTML reports you saw.
It's not terribly hard to reorganize, but it does require some work
and probably some disruption.  I try to make sure the reports and
results are still generated while I make changes to the scripts :-)
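For the curious, that final aggregation step could be sketched roughly
like this.  The input shape, the field names and the OK/REGRESSED
indicator are assumptions made for the example, not the actual
report-build-info.html generator:

```python
# Hypothetical sketch: once every configuration's regressions have been
# computed from its .sum files, emit one HTML table row per
# configuration with a regression indicator, roughly in the spirit of
# the report-build-info.html overview.

def report_rows(configs):
    """configs: iterable of dicts with 'target', 'flags' and
    'regressions' (list of newly failing tests) keys."""
    rows = []
    for cfg in configs:
        if cfg["regressions"]:
            status = "REGRESSED ({})".format(len(cfg["regressions"]))
        else:
            status = "OK"
        rows.append("<tr><td>{}</td><td>{}</td><td>{}</td></tr>".format(
            cfg["target"], cfg["flags"], status))
    return "\n".join(rows)

if __name__ == "__main__":
    print(report_rows([
        {"target": "arm-none-eabi", "flags": "-mcpu=cortex-m3",
         "regressions": []},
        {"target": "arm-none-eabi", "flags": "-mcpu=cortex-m7",
         "regressions": ["gcc.dg/b.c execution test"]},
    ]))
```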

In the meantime, I am updating the title format following the
suggestions from Richard & Jakub. Hopefully this will be in place
quite soon, after the currently-running validations have completed.

Thanks,

Christophe

> Martin
>
> >
> > Christophe
> >
> >> Martin
> >>
> >>>
> >>> I think HJ generates several "running targets" in the same log; I run
> >>> them separately to benefit from the compute farm I have access to.
> >>>
> >>> Christophe
> >>>
> >>>> Thanks
> >>>> Martin
> >>>>
> >>>> Results for 11.0.0 20200922 (experimental) [master revision
> >>>> r11-3343-g44135373fcdbe4019c5524ec3dff8e93d9ef113c] (GCC) testsuite on
> >>>> arm-none-eabi   Christophe LYON
> >>>>
> >>>>        [... eight more posts with the identical title snipped ...]
> >>
>
