YATT - Yet Another Turing Test

Discussion of anything and everything relating to chess playing software and machines.

Moderator: Ras

smatovic
Posts: 3784
Joined: Wed Mar 10, 2010 10:18 pm
Location: Hamburg, Germany
Full name: Srdja Matovic

Re: YATT - Yet Another Turing Test

Post by smatovic »

Let me cross-post this from another thread:

Re: ChatGPT usage in computer chess?
viewtopic.php?p=981757#p981757
j.t. wrote: Wed Aug 06, 2025 2:23 pm
j.t. wrote: Mon Jul 21, 2025 6:38 pm
smatovic wrote: Mon Jul 21, 2025 2:22 pm - Can it contribute to Stockfish?
No, it would need to load a mental model of the Stockfish chess engine, then ponder, then come up with ideas, then implement them, then test them (with ~10K self-play games), then decide whether to commit the changes or update the mental model. <- This pipeline is currently not present, especially the testing, but I think it is doable in general. Of course, the whole process of mental model and ideas can also be done by brute force: just try every possible permutation of code generation, driven by some NN heuristic.

I'm working a bit on this currently. No LTC passer as of yet, unfortunately; however, there have been a few STC passers (out of roughly 500 potential LLM patches):
- https://tests.stockfishchess.org/tests/ ... 4f6388c891
- https://tests.stockfishchess.org/tests/ ... 2d74b172ae
- https://tests.stockfishchess.org/tests/ ... 2d74b1559d
Some of the patches I read, and the reasoning behind them, are quite reasonable.

Of course, the git and OpenBench submission part of the pipeline is not done by the LLM itself (yet?); that is all implemented in a series of Python scripts. Submitting successful OpenBench tests to fishtest is done fully manually by me.
Finally, a test also passed LTC: https://github.com/official-stockfish/S ... /pull/6210
Hugging Face offers a new tool called Hugging Face Skills:

We Got Claude to Fine-Tune an Open Source LLM
https://huggingface.co/blog/hf-skills-training

You can fully automate LLM jobs with it; maybe an idea or blueprint for an LLM-driven Stockfish dev cycle?
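To make the quoted pipeline a bit more concrete, here is a minimal sketch of such a dev cycle in Python. Everything in it is an assumption for illustration: the names `Patch`, `DevCycle`, `propose` and `selfplay_score` are hypothetical, the LLM step is replaced by a random stub, the self-play match is a coin-flip simulation instead of real engine games, and the fixed pass threshold is a simplification (fishtest, for example, uses sequential statistical tests rather than a fixed score cutoff). It is not j.t.'s actual set of scripts.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Patch:
    """A candidate change; stands in for LLM-generated code (hypothetical)."""
    description: str
    true_gain: float  # simulated change in win probability, unknown in real life


@dataclass
class DevCycle:
    games_per_test: int = 10_000   # "~10K games" from the quoted post
    pass_threshold: float = 0.505  # assumed score a patch must reach to be kept
    notes: list = field(default_factory=list)      # stand-in for the "mental model"
    committed: list = field(default_factory=list)  # accepted patches

    def propose(self, rng):
        # Placeholder for "ponder, come up with ideas, implement ideas".
        index = len(self.notes) + len(self.committed)
        return Patch(f"idea-{index}", rng.uniform(-0.02, 0.02))

    def selfplay_score(self, patch, rng):
        # Placeholder for a real self-play match: each game is a simulated
        # win or loss (no draws) with win probability 0.5 + true_gain.
        p_win = 0.5 + patch.true_gain
        wins = sum(rng.random() < p_win for _ in range(self.games_per_test))
        return wins / self.games_per_test

    def step(self, rng):
        # One pass through the loop: propose, test, then commit or take notes.
        patch = self.propose(rng)
        score = self.selfplay_score(patch, rng)
        if score >= self.pass_threshold:
            self.committed.append(patch)
        else:
            self.notes.append(f"{patch.description} failed with score {score:.3f}")
        return patch, score


if __name__ == "__main__":
    rng = random.Random(0)
    cycle = DevCycle(games_per_test=2_000)
    for _ in range(10):
        cycle.step(rng)
    print(len(cycle.committed), "committed,", len(cycle.notes), "rejected")
```

In a real setup, `propose` would call the LLM and apply its patch to a checkout, and `selfplay_score` would be replaced by a submission to a testing framework; the loop structure is the only part this sketch tries to show.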

--
Srdja

Re: YATT - Yet Another Turing Test

Post by smatovic »

I have to acknowledge that we have by now entered the Centaur phase of programming...

Sable 1.6 - from-scratch C++ engine with a self-trained NNUE (~3200 CCRL, AI-assisted build)
Post by Dylan » Sun Jun 14, 2026 11:08 am
viewtopic.php?p=994057#p994057

A ~3200 CCRL Elo engine, vibe-coded from scratch over a week.

The question remains open how long the Centaur phase will last, and when fully autonomous AI agents, able to crack the YATT, will take over.

--
Srdja
Eelco de Groot
Posts: 4724
Joined: Sun Mar 12, 2006 2:40 am
Full name: Eelco de Groot

Re: YATT - Yet Another Turing Test

Post by Eelco de Groot »

In the same vein, Srdja, take a look at Ed's main computer chess forum. I do not think Mark Young is a programmer by profession, but that does not matter anymore: as I understand it, he knows how to give the right directions, corrections and parameters for what he wants to achieve. With AI he creates not only beautiful promotion pictures but also tools that, although I have not really tested them, I think stand up to professional tools, certainly compared to what you would get from Chessbase (not meant as criticism of Chessbase; they certainly cater to a lot of computer chess users). Mark, Ed and Werner from CEGT have been just about the only posters there for some time, which gives Ed (and Chris too, as moderator) a little rest from trying to moderate us :) (solely my personal interpretation of the situation).
Debugging is twice as hard as writing the code in the first
place. Therefore, if you write the code as cleverly as possible, you
are, by definition, not smart enough to debug it.
-- Brian W. Kernighan