I have the following output from a cutechess session https://pastebin.com/raw/JpHAX8tJ
Interestingly, the first 7951 games had no problems, and neither did the next ~6000.
My #1 worry is that this is the fault of Ethereal, though I find that highly doubtful. Outside input would be appreciated.
Assuming it is not Ethereal, the issue must either be with Cutechess or with my actual machine.
To further confuse things, I have played 2,008,484 games on my testing framework without having a time loss. Those are played in batches of 250.
Any thoughts? Similar experience?
Thanks
EDIT: Both blocks of time losses are 30 games each. I was playing with concurrency 30, meaning 60 engines running at once. This suggests that at some point every single engine hung. Then cutechess restarted the first 30. Then the second set of 30 crashed, were restarted, and after that everything went on smoothly.
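For anyone who wants to check a log for the same pattern, here is a rough sketch that pulls the time-loss games out of cutechess-cli output and groups them into runs. It assumes result lines of the form `Finished game N (White vs Black): 0-1 {White loses on time}`; the function names and the `max_gap` parameter are made up for illustration, and the regex may need adjusting to your actual log.

```python
import re

# Assumed line shape (adjust if your log differs), e.g.:
#   Finished game 7952 (Ethereal vs Opponent): 0-1 {White loses on time}
FINISHED = re.compile(r"Finished game (\d+) \((.+?) vs (.+?)\): (\S+) \{(.*)\}")


def time_loss_games(lines):
    """Return sorted game numbers whose termination reason mentions a time loss."""
    games = []
    for line in lines:
        m = FINISHED.search(line)
        if m and "loses on time" in m.group(5):
            games.append(int(m.group(1)))
    return sorted(games)


def clusters(game_numbers, max_gap=1):
    """Group sorted game numbers into runs whose neighbours differ by at most max_gap."""
    runs = []
    for n in game_numbers:
        if runs and n - runs[-1][-1] <= max_gap:
            runs[-1].append(n)
        else:
            runs.append([n])
    return runs
```

With concurrency 30, games finish out of order, which is why the game numbers are sorted before grouping. A run whose length matches the concurrency would point at a machine-wide stall rather than at one engine's time management.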
Strings of timelosses under cutechess-cli
-
- Posts: 1777
- Joined: Tue Apr 19, 2016 6:08 am
- Location: U.S.A
- Full name: Andrew Grant
Strings of timelosses under cutechess-cli
#WeAreAllDraude #JusticeForDraude #RememberDraude #LeptirBigUltra
"Those who can't do, clone instead" - Eduard ( A real life friend, not this forum's Eduard )
-
- Posts: 4368
- Joined: Fri Mar 10, 2006 5:23 am
- Location: http://www.arasanchess.org
Re: Strings of timelosses under cutechess-cli
I don't know. I don't use concurrency myself, I just start multiple cutechess-cli instances.
I see zero time losses with the engines I use routinely, except Nemorino 4, which regularly loses about 5-6 games out of an 8000-game match (very fast time control).
It is essential to use a high-resolution timer to avoid losses in fast games, but most engines have that now. If they are losing on time I suspect that is faulty time management logic, but it is hard to be sure, especially with very rare time losses.
(Note I use Linux for testing - I have almost no experience with cutechess-cli on other platforms).
--Jon
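To illustrate the timer point, here is a minimal sketch of a per-move deadline built on a monotonic nanosecond clock, with a fixed move overhead subtracted from the budget. The allocation formula, the `moves_to_go=30` default, and all names are assumptions for illustration only, not how any particular engine does it.

```python
import time


def allocate_ms(remaining_ms, increment_ms, overhead_ms=250, moves_to_go=30):
    """Hypothetical budget: a slice of the clock plus the increment, minus overhead."""
    budget = remaining_ms / moves_to_go + increment_ms - overhead_ms
    # Never plan to use more than what is left after overhead, and never less than 1 ms.
    return max(1, min(budget, remaining_ms - overhead_ms))


class Deadline:
    """Tracks a per-move budget with time.monotonic_ns(), which cannot jump backwards."""

    def __init__(self, budget_ms):
        self.start_ns = time.monotonic_ns()
        self.budget_ns = int(budget_ms * 1_000_000)

    def elapsed_ms(self):
        return (time.monotonic_ns() - self.start_ns) / 1_000_000

    def expired(self):
        return time.monotonic_ns() - self.start_ns >= self.budget_ns
```

A search loop would create `Deadline(allocate_ms(remaining, increment))` when it receives the go command and poll `expired()` every few thousand nodes. Note that no timer helps if the whole process is stalled by the OS.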
-
- Posts: 1777
- Joined: Tue Apr 19, 2016 6:08 am
- Location: U.S.A
- Full name: Andrew Grant
Re: Strings of timelosses under cutechess-cli
These were 60+0.6s games with 250 ms move overhead, and I have 30 million games played at 10+0.1s without a single time loss.
The main point here was that all the time losses occurred on top of each other...
-
- Posts: 2495
- Joined: Tue Aug 30, 2016 8:19 pm
- Full name: Rasmus Althoff
Re: Strings of timelosses under cutechess-cli
The interesting part is that the engine failed to respond to ping. Looks like either the engine is hanging or the I/O has gone south.
Rasmus Althoff
https://www.ct800.net
-
- Posts: 1777
- Joined: Tue Apr 19, 2016 6:08 am
- Location: U.S.A
- Full name: Andrew Grant
Re: Strings of timelosses under cutechess-cli
It was suggested that the machine overheats and that, due to a kernel bug, the CPU idles the threads instead of downclocking them. This would cause engines to lose on time despite never crashing.
It seems like a reasonable idea, so I've upgraded the kernel and opened up the chassis to increase airflow.
The problem only happened once over the course of 12 hours, so testing is hard.
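Since the problem is so rare, one option is to leave a small logger running next to the match and compare timestamps afterwards. The sketch below assumes a Linux machine that exposes the usual sysfs files (`thermal_zone*/temp` in millidegrees Celsius, `scaling_cur_freq` in kHz); the function names and the 5 second interval are made up, and which sensors exist will vary by hardware.

```python
import glob
import time

# Standard Linux sysfs locations; on other platforms these simply match nothing.
THERMAL_GLOB = "/sys/class/thermal/thermal_zone*/temp"  # millidegrees Celsius
FREQ_GLOB = "/sys/devices/system/cpu/cpu*/cpufreq/scaling_cur_freq"  # kHz


def read_ints(pattern):
    """Read one integer from every readable file matching the glob pattern."""
    values = []
    for path in sorted(glob.glob(pattern)):
        try:
            with open(path) as f:
                values.append(int(f.read().strip()))
        except (OSError, ValueError):
            continue  # sensor missing or unreadable: skip it
    return values


def snapshot(thermal_glob=THERMAL_GLOB, freq_glob=FREQ_GLOB):
    """Return the hottest zone and the slowest core at this moment."""
    temps = read_ints(thermal_glob)
    freqs = read_ints(freq_glob)
    return {
        "time": time.time(),
        "max_temp_c": max(temps) / 1000 if temps else None,
        "min_freq_mhz": min(freqs) / 1000 if freqs else None,
    }


def log_forever(interval_s=5):
    """Print one snapshot per interval until interrupted."""
    while True:
        s = snapshot()
        print(f"{s['time']:.0f} temp_c={s['max_temp_c']} min_mhz={s['min_freq_mhz']}", flush=True)
        time.sleep(interval_s)
```

Calling `log_forever()` with its output redirected to a file gives a timeline to line up against the block of time losses. A temperature spike or a frequency drop at the same moment would support the overheating theory, though it would not prove it.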