YATT - Yet Another Turing Test

smatovic · Post by **smatovic** » Tue Dec 09, 2025 12:50 pm

Let me cross post this from another thread:

Re: ChatGPT usage in computer chess?
viewtopic.php?p=981757#p981757

j.t. wrote: ↑Wed Aug 06, 2025 2:23 pm
j.t. wrote: ↑Mon Jul 21, 2025 6:38 pm
smatovic wrote: ↑Mon Jul 21, 2025 2:22 pm - Can it contribute to Stockfish?
No, it would need to load a mental model of Stockfish chess engine, then ponder, then come up with ideas, then implement ideas, then test ideas (with ~10K games selfplay) then decide to commit changes or update mental model. <- This pipeline is currently not present, especially testing, but I think doable in general. Ofc, the whole process of mental model and ideas can be done by brute force method, just try every possible permutation of code generation, driven by some NN heuristic.

I'm working a bit on this currently. No LTC passer as of yet unfortunately, however, there have been a few STC passers (of roughly 500 potential LLM patches):
- https://tests.stockfishchess.org/tests/ ... 4f6388c891
- https://tests.stockfishchess.org/tests/ ... 2d74b172ae
- https://tests.stockfishchess.org/tests/ ... 2d74b1559d
Some of the patches I read and the reasoning how they came up with them are quite reasonable.

Of course all the git and OpenBench submitting part in the pipeline is not done by the LLM itself (yet?), that's all implemented in a series of python scripts. The fishtest submitting of successful OpenBench tests is done fully manually by me.
Finally a test also passed LTC: https://github.com/official-stockfish/S ... /pull/6210

Hugging Face offers a new tool called Hugging Face Skills:

We Got Claude to Fine-Tune an Open Source LLM
https://huggingface.co/blog/hf-skills-training

It lets you fully automate LLM jobs; maybe an idea or blueprint for an LLM-driven Stockfish dev cycle?
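To make the idea a bit more concrete, here is a minimal sketch of the propose / self-play test / commit-or-update loop described in the quote above, in Python since the existing glue is a series of Python scripts. Everything in it is hypothetical: `propose` stands in for the LLM call, `selfplay` for the match runner, and `notes` for the "mental model". A plain Elo threshold replaces the SPRT that fishtest and OpenBench actually use, so treat it as an illustration, not as how any real pipeline works.

```python
import math
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class MatchResult:
    """Outcome of a self-play match, counted from the patched engine's side."""
    wins: int
    draws: int
    losses: int

    @property
    def games(self) -> int:
        return self.wins + self.draws + self.losses

    @property
    def score(self) -> float:
        return (self.wins + 0.5 * self.draws) / self.games


def elo_diff(result: MatchResult) -> float:
    """Logistic Elo difference implied by the match score."""
    s = min(max(result.score, 1e-6), 1 - 1e-6)  # clamp to avoid log(0)
    return -400.0 * math.log10(1.0 / s - 1.0)


@dataclass
class DevCycle:
    # Hypothetical hooks: an LLM wrapper and a self-play runner.
    propose: Callable[[List[str]], str]            # notes -> patch
    selfplay: Callable[[str, int], MatchResult]    # (patch, games) -> result
    games: int = 10_000                            # ~10K games, as in the quote
    min_elo: float = 0.0                           # assumed acceptance threshold
    notes: List[str] = field(default_factory=list)      # the "mental model"
    committed: List[str] = field(default_factory=list)  # accepted patches

    def step(self) -> bool:
        """Run one propose/test/decide iteration; return True if committed."""
        patch = self.propose(self.notes)
        result = self.selfplay(patch, self.games)
        gain = elo_diff(result)
        passed = gain > self.min_elo
        if passed:
            self.committed.append(patch)
        # Either way, feed the outcome back into the notes for the next idea.
        verdict = "kept" if passed else "rejected"
        self.notes.append(f"{patch}: {gain:+.1f} Elo, {verdict}")
        return passed
```

In a real setup the decision step would be a sequential test at STC and then LTC, and the notes would have to be summarised so they still fit into the model's context.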

--
Srdja

smatovic · Post by **smatovic** » Mon Jun 15, 2026 8:47 am

I have to acknowledge that we have meanwhile entered the Centaur phase of programming...

Sable 1.6 - from-scratch C++ engine with a self-trained NNUE (~3200 CCRL, AI-assisted build)
Post by Dylan » Sun Jun 14, 2026 11:08 am
viewtopic.php?p=994057#p994057

A ~3200 CCRL Elo engine, vibe-coded from scratch over a week.

The question remains open how long the Centaur phase will last, and when fully autonomous AI agents, able to crack the YATT, will take over.

--
Srdja

Eelco de Groot · Post by **Eelco de Groot** » Tue Jun 16, 2026 4:01 am

In the same vein, Srdja, take a look at the main computerchess forum from Ed. I do not think Mark Young is a programmer by profession, but that no longer matters: as I understand it, he knows how to give the right directions, corrections and parameters for what he wants to achieve. With AI he creates not only beautiful promotion pictures but also tools that, although I have not really tested them, I think stand up to professional tools, certainly compared to what you would get from Chessbase (not meant as criticism of Chessbase, they certainly cater to a lot of computerchess users). Mark, Ed and Werner from CEGT have been just about the only posters there for some time, which gives Ed (and Chris too, as moderator) a little rest from trying to moderate us (solely my personal interpretation of the situation).
