I checked my thoughts with cgpt5, and it considered essay writing hard to judge (on quality).
So it's proposing (future) tests such as these:
## What Is a "Cognitive Test" for AGI, Really?
A **cognitive test** for Artificial General Intelligence (AGI) should:
1. **Test generality**: Can it solve a broad range of problems, not just one domain?
2. **Test transfer**: Can it apply what it learns in one area to new, unseen problems?
3. **Test reasoning**: Can it plan, reflect, and explain its thinking?
4. **Test grounding**: Does it understand the real world (not just manipulate words)?
5. **Test agency and autonomy**: Can it set goals, explore, and self-correct?
6. **Be minimally anthropocentric**: It should test intelligence — not just mimicry of humans.
With that in mind, here are **some alternative or complementary AGI tests** that are more robust than essay writing alone:
---
## Top Alternatives to Essay-Based AGI Testing
### 1. **ARC Challenge (Abstraction and Reasoning Corpus)**
* **What it is**: A set of visual puzzles requiring pattern recognition, transformation, and abstraction.
* **Why it matters**: It’s like an IQ test for machines, with very little language involved.
* **Strength**: Tests *generalization*, not memorization. Humans do well; current AIs mostly don’t.

*Good test of reasoning without language crutches.*
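To make the format concrete: tasks in the public ARC repository are JSON files with a handful of `train` input/output grid pairs plus `test` inputs, where each grid is a list of lists of color indices 0-9. A minimal scoring harness might look like the sketch below; the `solve` callable is a hypothetical candidate solver, not part of ARC itself.

```python
import json

def load_task(path):
    """Load one ARC task: a dict with 'train' and 'test' lists of
    {'input': grid, 'output': grid} pairs, where each grid is a
    list of lists of integers 0-9 (colors)."""
    with open(path) as f:
        return json.load(f)

def score_solver(task, solve):
    """Exact-match scoring: the solver sees only the few train pairs,
    then must produce the full output grid for each test input. ARC
    gives no partial credit, which is what makes it a generalization
    test rather than a memorization test."""
    hits = 0
    for pair in task["test"]:
        prediction = solve(task["train"], pair["input"])
        hits += prediction == pair["output"]
    return hits / len(task["test"])

# Deliberately naive baseline: guess that the output equals the input.
# On most tasks this scores 0, illustrating the gap being probed.
identity_solver = lambda train, test_input: test_input
```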
---
### 2. **Interactive Environment Testing (e.g., in Virtual Worlds)**
* **Examples**:
  * Give the AI a simulated house. Tell it: "Find the red key and open the fridge."
  * Or: "Teach yourself how to build a shelter using only what you can find."
* **Why it matters**: Tests **embodied reasoning**, goal setting, and interaction with the environment.

*You’re not just testing what the AI knows, but what it can **do** when dropped into a world.*
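As a sketch of what such a testbed could look like (the layout, action set, and reward scheme below are illustrative assumptions, not an existing benchmark), the point is that scoring well requires decomposing the goal: get the key first, then go to the fridge.

```python
import random

class HouseWorld:
    """Toy 'find the red key, then open the fridge' environment.
    Everything here is invented for illustration."""

    ACTIONS = ("up", "down", "left", "right", "pick_up", "open")

    def __init__(self, size=5):
        self.size = size
        self.agent = (0, 0)
        self.key = (random.randrange(size), random.randrange(size))
        self.fridge = (size - 1, size - 1)
        self.has_key = False
        self.done = False

    def step(self, action):
        """Apply one action; return (observation, reward, done).
        Reward is sparse: 1.0 only on opening the fridge with the key."""
        x, y = self.agent
        moves = {"up": (x, y - 1), "down": (x, y + 1),
                 "left": (x - 1, y), "right": (x + 1, y)}
        if action in moves:
            nx, ny = moves[action]
            if 0 <= nx < self.size and 0 <= ny < self.size:
                self.agent = (nx, ny)
        elif action == "pick_up" and self.agent == self.key:
            self.has_key = True
        elif action == "open" and self.agent == self.fridge and self.has_key:
            self.done = True
        obs = {"agent": self.agent, "sees_key": self.agent == self.key,
               "has_key": self.has_key}
        return obs, (1.0 if self.done else 0.0), self.done
```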
---
### 3. **Self-learning and Curriculum Design**
* **Task**: The AI must design its own learning curriculum to master a complex, multi-stage task (e.g., robotics, theorem proving).
* **Evaluation**: How efficient and effective is its path? How well does it decompose a goal?

*Tests metacognition — the ability to learn how to learn.*
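One way to quantify "how efficient is its path" is to compare steps-to-mastery under the AI's proposed stage ordering against shuffled orderings of the same stages. In this sketch, `train_on` is an assumed stand-in for whatever training loop the evaluation uses, returning the number of steps a fresh learner needed to master the final task.

```python
import random

def curriculum_score(stages, train_on, trials=20):
    """Ratio below 1.0 means the proposed curriculum beats random
    orderings of the same material; lower is better."""
    proposed = train_on(stages)                 # AI-designed order
    baselines = []
    for _ in range(trials):
        shuffled = stages[:]
        random.shuffle(shuffled)
        baselines.append(train_on(shuffled))    # random-order control
    return proposed / (sum(baselines) / trials)
```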
---
### 4. **Scientific Discovery Challenge**
* **Task**: Present the AI with a novel scientific dataset and ask it to generate hypotheses, design experiments, and explain findings.
* **Examples**: Materials science, protein folding, climate modeling.

*Tests deep reasoning, creativity, and epistemic humility (can it know what it doesn’t know?).*
---
### 5. **Explaining and Debugging Other Agents**
* **Task**: Give the AI code, behavior, or decisions from another AI (or human) and ask it to:
  * Explain what's happening
  * Identify errors or biases
  * Suggest improvements

*Tests theory of mind, interpretation, and reflective capabilities.*
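A concrete fixture might pair a subtly buggy function, standing in for the "other agent", with a rubric for the explanation. Both the bug and the keyword rubric here are invented for illustration; real grading would need human raters or stronger checks.

```python
def moving_average(xs, window):
    """Code from the "other agent". Intended: the mean of every sliding
    window of length `window`. Bugs: the loop stops one window early,
    and window=0 raises ZeroDivisionError instead of being rejected."""
    out = []
    for i in range(len(xs) - window):        # should be len(xs) - window + 1
        out.append(sum(xs[i:i + window]) / window)
    return out

def grade_explanation(answer: str) -> int:
    """Crude rubric: one point per issue the AI's free-text answer
    mentions. Keyword matching is a stand-in for human grading."""
    issues = ["off-by-one", "last window", "window == 0"]
    return sum(phrase in answer.lower() for phrase in issues)
```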
---
### 6. **Adversarial Testing**
* **Setup**: Present problems designed to mislead or trick shallow pattern learners.
* **Goal**: See if the AI can detect misleading framing, spot errors, or challenge assumptions.

*Tests robustness, self-checking, and skepticism — all part of strong cognition.*
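One cheap instantiation is a consistency probe: the same problem posed in a trap framing and in a plain algebraic framing should yield the same answer. Here `ask` is a hypothetical black-box callable for the system under test, and the string match on the expected answer is deliberately crude.

```python
TRAP_PAIRS = [
    # The classic bat-and-ball problem: shallow pattern matching
    # suggests 0.10, but 2x + 1.00 = 1.10 gives x = 0.05.
    ("A bat and a ball cost $1.10 in total. The bat costs $1.00 more "
     "than the ball. How much does the ball cost, in dollars?", "0.05"),
    # The same problem stripped of its misleading framing.
    ("If x + (x + 1.00) = 1.10, what is x, in dollars?", "0.05"),
]

def probe(ask):
    """Count trap questions answered correctly. Agreement across both
    phrasings is evidence of reasoning over surface-pattern matching."""
    return sum(expected in ask(question) for question, expected in TRAP_PAIRS)
```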
---
### 7. **Long-Horizon Multi-Agent Interaction**
* **Scenario**: Drop the AI into a multiplayer negotiation or diplomacy game (like *Diplomacy* or *Settlers of Catan*), and give it a long-term objective with hidden information and imperfect trust.

*Tests planning, negotiation, belief modeling, deception detection, and alliance management.*
---
### 8. **Ethical Dilemmas in Context**
* **Scenario**: “A runaway drone is heading toward a village. You can stop it, but it means harming a bystander. What do you do?”
* But more importantly: Can it *justify* its action using consistent reasoning?

*Tests moral reasoning and value alignment.*
---
### 9. **Causal Inference Tasks**
* **Task**: Present events or datasets and ask:
  * What caused X?
  * What would have happened if Y didn’t occur?
* **Why it matters**: Most current AI systems are correlation-based. Causal reasoning is a hallmark of deeper understanding.

*Separates pattern-matching from real comprehension.*
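A toy structural causal model makes the gap measurable. In this sketch (the model and its numbers are invented for illustration), a confounder Z drives both X and Y, so X predicts Y observationally even though intervening on X changes nothing.

```python
import random

def sample(do_x=None):
    """One draw from the toy model. Z causes both X and Y; X has no
    causal effect on Y. Passing `do_x` overrides X's mechanism,
    simulating an intervention (Pearl's do-operator)."""
    z = random.random() < 0.5
    x = (random.random() < (0.9 if z else 0.1)) if do_x is None else do_x
    y = random.random() < (0.8 if z else 0.2)
    return x, y

def estimate(n=100_000):
    obs = [sample() for _ in range(n)]
    with_x = [y for x, y in obs if x]
    p_y_given_x1 = sum(with_x) / max(1, len(with_x))   # ~0.74 observational
    do1 = [sample(do_x=True) for _ in range(n)]
    p_y_do_x1 = sum(y for _, y in do1) / n             # ~0.50 interventional
    return p_y_given_x1, p_y_do_x1

# A correlation-only system sees the 0.74 and predicts that setting X
# will raise Y; the interventional 0.50 shows it will not.
```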
---
### 10. **"Off-Script" Generalization**
* **Design**: Train the AI in certain domains, then test it in situations that require **recombining its skills** in novel ways.
  * Example: It knows how to cook and how to do chemistry; now it must design a cooking process to extract caffeine.

*Tests transfer learning and emergent generalization.*
---
## What Not to Use as a Sole AGI Test
| Test | Why It Fails Alone |
| ------------------------ | --------------------------------------------- |
| Chess/Go/StarCraft | Domain-specific, solved by narrow AI |
| Essay writing | Too subjective; easily gamed by style mimicry |
| Code generation | Impressive but still procedural |
| Language-only benchmarks | Prone to superficial pattern exploitation |
---
## Final Thought
The ideal AGI test would not be a **single** benchmark, but a **suite** that:
* Covers **multiple modalities** (text, vision, action, reasoning)
* Includes **open-ended**, **dynamic**, and **unexpected** problems
* Requires the AI to **fail gracefully**, **recover**, and **self-improve**
AGI, by definition, isn’t about excelling in one domain — it’s about **adapting across all of them**.
---