Strategies for Testing with UHO Openings and Bullet Time Controls

Modern Times
Posts: 3693
Joined: Thu Jun 07, 2012 11:02 pm

Re: Strategies for Testing with UHO Openings and Bullet Time Controls

Post by Modern Times »

supernova wrote: Sat Apr 26, 2025 4:35 am I respectfully disagree with your perspective. I believe that Torch is quite similar to Stockfish when utilizing UHO. Additionally, it's important to recognize that Chess.com has a business goal to monetize, and Torch is part of that strategy.
Well, Stockfish is nearly +60 Elo over Torch on the SPCC UHO Top 15:

  Program                    Celo    +    - Games    Score   Av.Op. Draws
   1 Stockfish 250418 a512    : 3857    4    4 15000    69.5%   3709   48.0%
   2 Stockfish 17.1 250330    : 3854    4    4 15000    69.1%   3709   48.5%
   3 Torch 3.1 a512           : 3800    4    4 15000    61.8%   3713   48.4%
https://www.sp-cc.de/index.htm
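
As a rough cross-check of the list above, the Score and Av.Op. columns can be turned back into an approximate rating with the standard logistic Elo formula. This is only a sketch: Ordo fits all games jointly, so it will not reproduce the Celo column exactly, and the function name elo_from_score is made up for illustration.

```python
import math

def elo_from_score(score: float) -> float:
    """Elo difference implied by a score fraction under the logistic Elo model."""
    if not 0.0 < score < 1.0:
        raise ValueError("score must be strictly between 0 and 1")
    return 400.0 * math.log10(score / (1.0 - score))

# (name, score fraction, average opponent rating), copied from the SPCC list
rows = [
    ("Stockfish 250418 a512", 0.695, 3709),
    ("Stockfish 17.1 250330", 0.691, 3709),
    ("Torch 3.1 a512", 0.618, 3713),
]

# Approximate rating = average opponent rating + Elo implied by the score
estimates = {name: av_op + elo_from_score(score) for name, score, av_op in rows}
for name, est in estimates.items():
    print(f"{name:24s} ~{est:.0f}")

gap = estimates["Stockfish 250418 a512"] - estimates["Torch 3.1 a512"]
print(f"Stockfish 250418 minus Torch 3.1: ~{gap:.0f} Elo")
```

The estimates come out within a few Elo of the Celo column, so the gap between Stockfish and Torch on this list follows fairly directly from the raw scores against near-identical opposition.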
pohl4711
Posts: 2686
Joined: Sat Sep 03, 2011 7:25 am
Location: Berlin, Germany
Full name: Stefan Pohl

Re: Strategies for Testing with UHO Openings and Bullet Time Controls

Post by pohl4711 »

Modern Times wrote: Fri Apr 25, 2025 9:17 pm
supernova wrote: Fri Apr 25, 2025 8:39 pm Irrelevant for Human Chess: In top-level human chess, most games begin with balanced positions where both sides have equal chances. UHO positions, however, frequently generate starting positions that are far removed from real-world scenarios. UHO positions rarely, if ever, occur in human games because they are specifically constructed to test engines in unbalanced or irregular setups.
I thought the "H" in UHO meant human, so these openings do come from actual human games. I may be wrong there - initially that was the case I think but maybe things have changed since then, in particular the need for a bigger volume of openings.
All the UHO openings that I built were filtered out of the Megabase, so they come from human games only. The Stockfish devs have built a new (much bigger) UHO file, mostly taken from Lichess game databases. So, AFAIK, the UHO openings in use should contain only openings from human games.
supernova
Posts: 85
Joined: Mon Apr 15, 2024 8:30 pm
Full name: Arthur Matheus

Re: Strategies for Testing with UHO Openings and Bullet Time Controls

Post by supernova »

Modern Times wrote: Sat Apr 26, 2025 4:45 am
supernova wrote: Sat Apr 26, 2025 4:35 am I respectfully disagree with your perspective. I believe that Torch is quite similar to Stockfish when utilizing UHO. Additionally, it's important to recognize that Chess.com has a business goal to monetize, and Torch is part of that strategy.
Well, Stockfish is nearly +60 Elo over Torch on the SPCC UHO Top 15:

  Program                    Celo    +    - Games    Score   Av.Op. Draws
   1 Stockfish 250418 a512    : 3857    4    4 15000    69.5%   3709   48.0%
   2 Stockfish 17.1 250330    : 3854    4    4 15000    69.1%   3709   48.5%
   3 Torch 3.1 a512           : 3800    4    4 15000    61.8%   3713   48.4%
https://www.sp-cc.de/index.htm
What about CCRL? https://computerchess.org.uk/ccrl/404/ With less biased opening lines, the difference there is nowhere near 50 Elo.

Rank  Name                         Elo     +     −  Score  Avg. Opp.  Draws  Games    LOS
   1  Stockfish 17.1 64-bit 8CPU  3816   +17   −17  62.7%      −79.5  74.0%   1312  97.3%
   2  Torch v2 64-bit 8CPU        3802   +14   −14  61.2%      −70.6  76.4%   4184
Modern Times
Posts: 3693
Joined: Thu Jun 07, 2012 11:02 pm

Re: Strategies for Testing with UHO Openings and Bullet Time Controls

Post by Modern Times »

supernova wrote: Sat Apr 26, 2025 3:44 pm
What about CCRL? https://computerchess.org.uk/ccrl/404/ With less biased opening lines, the difference there is nowhere near 50 Elo.

Rank  Name                         Elo     +     −  Score  Avg. Opp.  Draws  Games    LOS
   1  Stockfish 17.1 64-bit 8CPU  3816   +17   −17  62.7%      −79.5  74.0%   1312  97.3%
   2  Torch v2 64-bit 8CPU        3802   +14   −14  61.2%      −70.6  76.4%   4184
That shows that testing with balanced lines is pointless these days, especially with high draw rates amongst strong engines. There is your argument in favour of UHO or unbalanced openings in general right there.
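
One rough way to read the two lists side by side is to check whether the quoted error bars of the two engines overlap. The sketch below uses only the numbers posted in this thread; the helper names are made up for illustration, and interval overlap is a crude check, not a proper significance test (CCRL's own LOS column is the better indicator). Note also that the two lists rate different Torch versions.

```python
def interval(rating: float, plus: float, minus: float) -> tuple[float, float]:
    """Return (low, high) bounds from a rating and its +/- error margins."""
    return (rating - minus, rating + plus)

def overlaps(a: tuple[float, float], b: tuple[float, float]) -> bool:
    """True if two closed intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]

# SPCC UHO Top 15 (Ordo), values from the list quoted earlier in the thread
spcc_stockfish = interval(3854, 4, 4)    # Stockfish 17.1 250330
spcc_torch = interval(3800, 4, 4)        # Torch 3.1 a512

# CCRL 40/4 (BayesElo), values from the list quoted above
ccrl_stockfish = interval(3816, 17, 17)  # Stockfish 17.1 64-bit 8CPU
ccrl_torch = interval(3802, 14, 14)      # Torch v2 64-bit 8CPU

spcc_overlap = overlaps(spcc_stockfish, spcc_torch)
ccrl_overlap = overlaps(ccrl_stockfish, ccrl_torch)

print("SPCC error bars overlap:", spcc_overlap)
print("CCRL error bars overlap:", ccrl_overlap)
```

On the posted numbers, the SPCC intervals are clearly separated while the CCRL intervals overlap, which fits the point that high-draw-rate testing needs far more games to separate the top engines.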
Modern Times
Posts: 3693
Joined: Thu Jun 07, 2012 11:02 pm

Re: Strategies for Testing with UHO Openings and Bullet Time Controls

Post by Modern Times »

BayesElo (used by CCRL) also gives a much tighter spread than Ordo (used by SPCC) on the top engines. Making different choices on the parameters for each can also give different answers. Which one is the best measurement tool, I don't know. I don't particularly like Elo numbers for this reason. What do you believe? The only absolute for me is tournament points and rankings.
AndrewGrant
Posts: 1952
Joined: Tue Apr 19, 2016 6:08 am
Location: U.S.A
Full name: Andrew Grant

Re: Strategies for Testing with UHO Openings and Bullet Time Controls

Post by AndrewGrant »

supernova wrote: Sat Apr 26, 2025 3:44 pm
Modern Times wrote: Sat Apr 26, 2025 4:45 am
supernova wrote: Sat Apr 26, 2025 4:35 am I respectfully disagree with your perspective. I believe that Torch is quite similar to Stockfish when utilizing UHO. Additionally, it's important to recognize that Chess.com has a business goal to monetize, and Torch is part of that strategy.
Well, Stockfish is nearly +60 Elo over Torch on the SPCC UHO Top 15:

  Program                    Celo    +    - Games    Score   Av.Op. Draws
   1 Stockfish 250418 a512    : 3857    4    4 15000    69.5%   3709   48.0%
   2 Stockfish 17.1 250330    : 3854    4    4 15000    69.1%   3709   48.5%
   3 Torch 3.1 a512           : 3800    4    4 15000    61.8%   3713   48.4%
https://www.sp-cc.de/index.htm
What about CCRL? https://computerchess.org.uk/ccrl/404/ With less biased opening lines, the difference there is nowhere near 50 Elo.

Rank  Name                         Elo     +     −  Score  Avg. Opp.  Draws  Games    LOS
   1  Stockfish 17.1 64-bit 8CPU  3816   +17   −17  62.7%      −79.5  74.0%   1312  97.3%
   2  Torch v2 64-bit 8CPU        3802   +14   −14  61.2%      −70.6  76.4%   4184
You are observing many things at once, not just a difference in openings, when comparing SPCC and CCRL.