Yes! This really looks bad!

The best move in the list (LLJLCAAECIAEAAAAAAAA) is actually running 24/15.
I evaluate that as the worst move, since it makes the backgammon a certainty.
Could there be a bug in the rollout program used for generating the benchmarks?
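
One way to check whether the stored numbers are at least internally consistent is to recompute the move errors from the rollout records quoted below. This is only a sketch: it assumes each #R record lists the win, gammon and backgammon probabilities followed by the two losing counterparts, all from the point of view of the side on roll after the move (O), and that the usual cubeless equity formula applies. That field layout is an assumption, not something checked against the benchmark tools, and the names cubeless_equity and move_errors are made up for the illustration.

```python
# Assumed layout of a "#R" record (not confirmed against the benchmark tools):
#   win, win-gammon, win-backgammon, lose-gammon, lose-backgammon,
# from the point of view of the side on roll after the move (O).
ROLLOUTS = {
    "HLDHABAADEAEAAAAAAAA": (0.999958, 0.696718, 0.000878771, 0.0, 0.0),
    "HHDHACAADEAEAAAAAAAA": (0.999994, 0.695403, 0.00295781, 0.0, 0.0),
    "LLJLCAACDAAEAAAAAAAA": (0.999805, 0.970848, 0.0, 0.0, 0.0),
    "LLJLABEADAAEAAAAAAAA": (0.999858, 0.969998, 0.0, 0.0, 0.0),
    "LLJLACCADAAEAAAAAAAA": (0.999771, 0.978238, 0.00154321, 0.0, 0.0),
    "LLJLCAAECIAEAAAAAAAA": (1.0, 0.444444, 0.0, 0.0, 0.0),
    "LLJLCACACBAEAAAAAAAA": (1.0, 0.666667, 0.0, 0.0, 0.0),
}


def cubeless_equity(win, win_g, win_bg, lose_g, lose_bg):
    """Cubeless money equity for the side the probabilities refer to."""
    return (2.0 * win - 1.0) + (win_g - lose_g) + (win_bg - lose_bg)


def move_errors(rollouts):
    """Return (equities, errors) from the mover's (X's) point of view."""
    # Negate because the records are assumed to describe the opponent.
    equities = {pid: -cubeless_equity(*p) for pid, p in rollouts.items()}
    best = max(equities.values())
    errors = {pid: best - eq for pid, eq in equities.items()}
    return equities, errors


equities, errors = move_errors(ROLLOUTS)
for pid in sorted(errors, key=errors.get):
    print(f"{pid}  equity {equities[pid]:+.6f}  error {errors[pid]:.6f}")
```

Under those assumptions the recomputed errors agree with the "m" record to within about 1e-5, so the error column does follow from the stored probabilities. That says nothing about whether the probabilities themselves are right.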

-Øystein

On Tue, Jun 5, 2012 at 12:33 AM, Philippe Michel <[email protected]> wrote:

> The benchmark database for the crashed positions seems seriously corrupted.
>
> Offhand, it seems that many positions with significant backgammons in play
> are very inaccurate.
>
> For instance:
>
> #R HLDHABAADEAEAAAAAAAA 0.999958 0.696718 0.000878771 0 0
> #R HHDHACAADEAEAAAAAAAA 0.999994 0.695403 0.00295781 0 0
> #R LLJLCAACDAAEAAAAAAAA 0.999805 0.970848 0 0 0
> #R LLJLABEADAAEAAAAAAAA 0.999858 0.969998 0 0 0
> #R LLJLACCADAAEAAAAAAAA 0.999771 0.978238 0.00154321 0 0
> #R LLJLCAAECIAEAAAAAAAA 1 0.444444 0 0 0
> #R LLJLCACACBAEAAAAAAAA 1 0.666667 0 0 0
> m AEAAAAOMGOICAANAAAAA 5 4 LLJLCAAECIAEAAAAAAAA -1.444444
> LLJLCACACBAEAAAAAAAA 0.222223 HLDHABAADEAEAAAAAAAA 0.253066
> HHDHACAADEAEAAAAAAAA 0.253902372 LLJLABEADAAEAAAAAAAA 0.525268
> LLJLCAACDAAEAAAAAAAA 0.52601 LLJLACCADAAEAAAAAAAA 0.534875
>
> The position is:
>
>    +24-23-22-21-20-19------18-17-16-15-14-13-+  O: GNUbg
> OOO | X  X  O          |   |                  |  0 points
> OOO | X                |   |                  |
> OOO |                  |   |                  |
> OOO |                  |   |                  |
>  OO |                  |   |                  |
>    |                  |BAR|                  |v
>    |                  |   |                  |
>    |                  |   |                  |
>    |    X  X          |   |                  |
>    | X  X  X  X       |   |                  |  Rolled 54
>    | X  X  X  X     X |   |             X    |  0 points
>    +-1--2--3--4--5--6-------7--8--9-10-11-12-+  X: You
>
> What ID corresponds to what move doesn't really matter: it is obvious
> that X loses a lot of backgammons, and more than 44, 67 or 70% gammons.
>
> The position is worth about -2.96 (cubeless), not -1.44, and the possible
> errors are small, not random 0.2 or 0.5 blunders.
>
> This is a rather extreme case, but I think the aggregate effect is
> important enough to significantly impair the usefulness of this benchmark.
>
> I didn't look at the other databases. It seems they were done later, with a
> more recent version of the rollout tool.
>
> _______________________________________________
> Bug-gnubg mailing list
> [email protected]
> https://lists.gnu.org/mailman/listinfo/bug-gnubg
>