Chess For Android, DroidFish, and my issues

AndrewGrant
Posts: 1754
Joined: Tue Apr 19, 2016 6:08 am
Location: U.S.A
Full name: Andrew Grant

Post by AndrewGrant »

Okay,

For a long time now, Ethereal has been built and run on Android. I had not had any issues until today, when I received an email from someone for whom I had recently built Ethereal10.29-android-armv8. His output showed a few crashes over the course of 10 games. He was using Chess For Android.

So I downloaded it and ran Ethereal 10.00, 10.15, and 10.29. When using the built-in "auto-play" option, all engines had issues with hanging or crashing. I then used the "Engine Tournament" option and played 10.00, ... 10.29 against the built-in Chess For Android engine. No crashes reported.

Now I am playing 10.29 vs 10.15 in another tourney, and thus far no crashes or issues.
At this point, I might be content to say, "Well clearly Chess For Android has some issues that I cannot know of"...
So I downloaded DroidFish. Here, I again got the occasional hang/crash (output from both GUIs is lacking).

I'm very confused because I have played upwards of 8 million games since 10.00, and the only crashes I have seen are from one computer in my network which has memory issues. There is exactly ZERO difference in what I do to build for Android, aside from the version of gcc of course. Builds from both my Windows box and my Linux box appear to have the same issue (different versions of gcc for Android).

Does anyone have experience here, or know of a GUI with enough output to help me track down the issue?

Thanks,
Andrew Grant

EDIT: After many more games, I got a crash between 10.29 and 10.15 :(
#WeAreAllDraude #JusticeForDraude #RememberDraude #LeptirBigUltra
"Those who can't do, clone instead" - Eduard ( A real life friend, not this forum's Eduard )
Ras
Posts: 2487
Joined: Tue Aug 30, 2016 8:19 pm
Full name: Rasmus Althoff

Post by Ras »

I am a bit astonished that you are using GCC for the Android builds. GCC has been deprecated in the NDK for a long time and will be removed with r18. Clang is the common way to go for Android cross-compiles. Or are you using a toolchain other than the NDK?
Rasmus Althoff
https://www.ct800.net

Post by AndrewGrant »

I am using something named arm-linux-gnueabi-gcc, which has worked thus far.
I've not changed anything too drastic, so I would expect if the GCC builds worked before, that they continue to.
However, I'll look into the clang builds. How would you recommend building from a Linux machine?

Post by Ras »

I don't exactly know about building from Linux, but the NDK is available also for Linux (64 bit only), so the usage should be just the same. In the NDK download, there is a directory build/tools/ that contains a script make_standalone_toolchain.py . This is for generating the desired toolchain.

The important thing to remember is that you need position independent code to be generated for modern Android (4.1 or higher for 32 bit, 5.0 or higher for 64 bit). This position independent code, however, requires at least API level 16 for 32 bit and API level 21 for 64 bit. Assuming you want to generate a toolchain that targets ARM 64 bit Android, you can call this script like this:

make_standalone_toolchain.py --arch arm64 --api 21 --install-dir /some/path/
Or like this for ARM 32 bit:

make_standalone_toolchain.py --arch arm --api 16 --install-dir /some/path/
Then in /some/path/bin/ , you should see a "clang" wrapper script, and you can call that directly from your Ethereal build system. I'm using the "one compiler call for everything at once" method and call clang like this - note the pie for "position independent executable":

/some/path/bin/clang -m64 -march=armv8-a -pie -fPIE -Wl,-pie -Wall -Wextra -Werror -O2 -std=c99 -fno-strict-aliasing -fno-strict-overflow -ffunction-sections -fdata-sections -Wl,--gc-sections -Wl,-s -o my_engine_executable all_my_c_files.c
Or for ARM 32 bit, with a couple of additional parameters that I got from the NDK website:

/some/path/bin/clang -m32 -march=armv7-a -pie -fPIE -Wl,-pie -mfloat-abi=softfp -mfpu=vfpv3-d16 -mthumb -Wl,--fix-cortex-a8 -Wall -Wextra -Werror -O2 -std=c99 -fno-strict-aliasing -fno-strict-overflow -ffunction-sections -fdata-sections -Wl,--gc-sections -Wl,-s -o my_engine_executable all_my_c_files.c
I don't recommend static linkage of libc because the static build (libc.a) is usually outdated in the NDK.

Pthread works like that without special command line parameters for Clang. There is one small issue for 32 bit Android builds with API level 16, targeting older devices: monotonic clock doesn't work in pthread conditions, you'll have to use real time. The 64 bit builds don't have this issue because that feature was added with API level 21.
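
As a rough illustration of the real-time workaround, here is a minimal sketch of a timed wait whose deadline is computed from CLOCK_REALTIME, which is the clock a default-initialised pthread condition variable uses. The helper names deadline_in_ms() and wait_for_flag() are made up for this example and are not taken from Ethereal or the NDK.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <pthread.h>
#include <time.h>

/* Hypothetical helper: absolute CLOCK_REALTIME deadline `ms` milliseconds from now. */
struct timespec deadline_in_ms(long ms)
{
    struct timespec ts;
    clock_gettime(CLOCK_REALTIME, &ts);
    ts.tv_sec  += ms / 1000;
    ts.tv_nsec += (ms % 1000) * 1000000L;
    if (ts.tv_nsec >= 1000000000L) {   /* carry nanoseconds into seconds */
        ts.tv_sec  += 1;
        ts.tv_nsec -= 1000000000L;
    }
    return ts;
}

/* Hypothetical helper: waits until *flag is nonzero or the timeout expires.
   Returns 0 if the flag was set, otherwise the error from the timed wait
   (ETIMEDOUT on a plain timeout). */
int wait_for_flag(pthread_mutex_t *mtx, pthread_cond_t *cond,
                  const int *flag, long timeout_ms)
{
    struct timespec deadline = deadline_in_ms(timeout_ms);
    int rc = 0;

    pthread_mutex_lock(mtx);
    while (!*flag && rc == 0)
        rc = pthread_cond_timedwait(cond, mtx, &deadline);
    if (*flag)
        rc = 0;                        /* flag wins even if the wait timed out */
    pthread_mutex_unlock(mtx);
    return rc;
}
```

The trade-off is that a real-time deadline moves if the system clock is adjusted during the wait, which is why the monotonic clock is usually preferred where it is available.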

Link to the NDK download: https://developer.android.com/ndk/downloads/

Post by AndrewGrant »

Followed your steps, able to produce a binary.
Everything exits on opening when trying to add it to Chess For Android.
Barring a trivial fix, I think I'll stop looking for a solution. This is already too convoluted.

Post by Ras »

Ok, so I have taken a closer look at it. Several aspects that may or may not be relevant:

- ABORT_SIGNAL is just plain volatile. This may not be enough for getting the variable update across threads because it only instructs the compiler, not the CPU. For a "one writer, many readers" thing, a mutex isn't necessary, but a memory barrier is (which a mutex would also contain). Calling __sync_synchronize() after every write and before every read would be the portable standard solution.

- The original makefile for Android (from the GitHub repo) uses static linkage, which is usually discouraged under Linux, and therefore also under Android. In particular, position independent executables have to be shared object ELFs under Linux, which statically linked executables aren't. You can run the Linux command "file" on an ELF; the output has to contain something like "LSB shared object", not "LSB executable". You can also check the Android ELFs under a regular Linux.

- the search grabs a "ready lock" because it cannot be ready until the search is finished. "isready", however, has to be answered also during search as per the UCI specification. If the engine doesn't answer to "isready" during search, the GUI may kill the engine.
In particular, the following sequence will lock up the engine:
1) call search with "go infinite"
2) issue an "isready"
3) since the parser is now blocking for the "ready lock", but the search will never return by itself because the time is infinite, the search cannot be stopped with a "stop" command anymore. The engine has to be killed.

- The consecutive whitespace filtering is only implemented for the move list, and then only for ' ', but not '\t'. Any amount of whitespace, be it space or tab, is allowed at any point of all UCI commands between tokens.

- The IDs for the setoption command are handled as case sensitive while the UCI spec says they aren't.

- Some GUIs may have difficulties with UCI option values greater than 32767, I have seen that e.g. with the Shredder GUI.

- For the parameters that come in via UCI or command line, a range check validation would enhance robustness, especially hash size and thread count.

- malloc() and the string parsing calls are often not checked for their return value - could be a NULL pointer for unexpected input with crashes resulting.
In particular, using short-hand FEN formats like "1k//K4R w" crashes the engine, but I also didn't have luck with "position fen 1k6/8/K4R2/8/8/8/8/8 w". However, "position fen 1k6/8/K4R2/8/8/8/8/8 w - - 0 1" did work as expected.

- Probably not relevant for Android, but the maximum setting of 2048 threads might overflow the allocation in the malloc call in thread.c, line 32. Using calloc would prevent that. The desired thread number should also be checked against PTHREAD_THREADS_MAX, ideally already when printing the response for the "uci" command.
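
The parsing points in the list above (whitespace between tokens, case-insensitive option names, range checks) could be sketched roughly as follows. This is only an illustration and not Ethereal's code; split_tokens(), equals_ignore_case() and clamp_option() are hypothetical names, and the sketch ignores that UCI option names may themselves contain spaces.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Case-insensitive string equality, written out by hand so it does not
   depend on the non-standard strcasecmp(). */
int equals_ignore_case(const char *a, const char *b)
{
    while (*a && *b) {
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
            return 0;
        a++;
        b++;
    }
    return *a == *b;   /* equal only if both strings ended together */
}

static int is_uci_space(char c)
{
    return c == ' ' || c == '\t' || c == '\r' || c == '\n';
}

/* Splits `line` in place on runs of spaces, tabs, CR and LF.
   Stores up to max_tokens pointers in tokens[] and returns the count. */
size_t split_tokens(char *line, char *tokens[], size_t max_tokens)
{
    size_t count = 0;
    char *p = line;

    while (*p && count < max_tokens) {
        while (is_uci_space(*p))
            p++;                       /* skip any amount of whitespace */
        if (!*p)
            break;
        tokens[count++] = p;           /* start of the next token */
        while (*p && !is_uci_space(*p))
            p++;
        if (*p)
            *p++ = '\0';               /* terminate the token */
    }
    return count;
}

/* Range check for numeric options such as hash size or thread count. */
long clamp_option(long value, long lo, long hi)
{
    if (value < lo) return lo;
    if (value > hi) return hi;
    return value;
}
```

With these helpers, a line such as "setoption \t name   HASH\tvalue  64" splits into five tokens, and the name token compares equal to "Hash".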

Post by AndrewGrant »

Okay ... If you are willing to give me even more of your time :)

I've created an 'android' branch with a commit found here

You made the following points, which I will number here just for readability ....
  • 1) ABORT_SIGNAL not guaranteeing atomic operations
  • 2) Static linkage issues
  • 3) Ready lock
  • 4) White space reading
  • 5) Case insensitive UCI options
  • 6) Large values for options
  • 7) Range checking
  • 8) Malloc and string parsing issues
  • 9) Short hand / incomplete FENS
  • 10) PTHREAD_THREADS_MAX
For 1) My branch attempts to resolve this by using <stdatomic.h>, which is part of C11. I think this is correct, but you have more understanding of the topic than I do. Only ~5 lines are changed. After adding this, I seem unable to run at all on Android (non-static).

For 2), I have absolutely zero clue what you are talking about. I'm woefully unaware of most everything you said. If I am not to build with -static, I do not know how to get the engine to run on an android phone.

For 3) I see now that UCI uses isready as a pinging mechanism, as opposed to "Are you ready for a new search", so I have dropped the locking code from the isready response. However, in uci.c uciGo(), I still have the ready lock. Spawning two searches on top of each other would be a very bad idea. Is there at least some guarantee that the GUI does not attempt this? If so, the locks should be both unneeded and non-problematic to the pings.

For 4) I have added a few things to parse better for the move list, but I imagine this is not my issue

For 5) I have not fixed this (nor will I). It's never been an issue. And I can see Chess For Android setting things correctly.

For 6) Again, never been an issue. Good to note, however.

For 7) I have added range checking to the inputs.

For 8) Yes, the mallocs are unguarded. But they are only called twice, and since the engine starts running these are not the issue. As for the string parsing examples, can you be more specific? (Unless you are talking about the case for 9.)

For 9) Yes, a crash is expected. As far as I know, I am not expected to handle anything other than a full-fledged FEN.

For 10) Interesting. I will do this at some point in the future.
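
To make the isready discussion in point 3 concrete, here is a minimal sketch of a command handler in which "isready" is answered without touching any search lock, while a second "go" during a running search is simply ignored. EngineState and handle_command() are hypothetical names for this illustration and do not reflect how uci.c is actually organised; a real engine would start and stop a search thread where the comments indicate.

```c
#include <string.h>

/* Hypothetical engine state: only tracks whether a search is running. */
typedef struct {
    int searching;
} EngineState;

/* Returns 1 if `cmd` is exactly `word` or starts with `word` followed by a space. */
static int starts_with_word(const char *cmd, const char *word)
{
    size_t n = strlen(word);
    return strncmp(cmd, word, n) == 0 && (cmd[n] == '\0' || cmd[n] == ' ');
}

/* Handles one UCI command and returns the immediate reply ("" if none). */
const char *handle_command(EngineState *st, const char *cmd)
{
    if (starts_with_word(cmd, "isready"))
        return "readyok";              /* a ping: answered even while searching */

    if (starts_with_word(cmd, "go")) {
        if (!st->searching)
            st->searching = 1;         /* a real engine would launch the search thread here */
        return "";                     /* a second "go" during a search is ignored */
    }

    if (starts_with_word(cmd, "stop")) {
        st->searching = 0;             /* a real engine would set its abort flag here and
                                          let the search thread print "bestmove" */
        return "";
    }

    return "";                         /* unknown commands are ignored */
}
```

With this shape, the "go infinite" followed by "isready" sequence gets its "readyok" immediately, and a later "stop" still reaches the engine.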

Post by Ras »

1) I'm not 100% sure about the atomics in C11, but I think they have the memory fences in the wrong order for this use case. For a store, they insert the barrier before the store, not after it; for a load, they insert it after the load, not before it. Their point is ordering of access, not ensuring visibility.

C11 also offers explicit fences, but they are also designed with that different use case in mind, so that an acquire fence will only guard loads, not writes. I suggest leaving ABORT_SIGNAL as a plain volatile int and inserting

atomic_thread_fence(memory_order_acq_rel);
after the write and before the read. On ARM, this should give the desired "DMB ISH" instruction, and no instruction on x86, if the mapping table at https://www.cl.cam.ac.uk/~pes20/cpp/cpp0xmappings.html is still correct.

The access to the int itself is always atomic anyway, except on 8 bit microcontrollers.
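
For comparison, here is a sketch of a different approach from the volatile-plus-fence suggestion: making the flag itself an atomic_int and using a release store with an acquire load. This assumes a C11 compiler with <stdatomic.h>; the names abort_signal, request_abort(), clear_abort() and abort_requested() are hypothetical and are not the identifiers used in Ethereal.

```c
#include <stdatomic.h>

/* Hypothetical abort flag; static storage, so it starts out as zero. */
static atomic_int abort_signal;

/* Called by the UCI thread when "stop" arrives. */
void request_abort(void)
{
    atomic_store_explicit(&abort_signal, 1, memory_order_release);
}

/* Called before a new search is started. */
void clear_abort(void)
{
    atomic_store_explicit(&abort_signal, 0, memory_order_release);
}

/* Polled by the search threads, for example once every few thousand nodes. */
int abort_requested(void)
{
    return atomic_load_explicit(&abort_signal, memory_order_acquire);
}
```

Whether this or the explicit fence is the better fit for a one-writer, many-readers stop flag is exactly the open question in this thread; the sketch only shows what the atomic variant looks like.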

2) The short form is not to use static linkage with regard to the C standard library. For my engine, I'm just linking dynamically for the Android version, and it works. In fact, the reason why the NDK has a totally outdated C standard library for static linkage is that nobody uses it.

3) GUIs should not issue several "go" commands if there is no "stop" command or a "bestmove" reply from the engine, so I think that's safe.

4) The checks for '\r' and '\n' are probably not necessary because that's already eliminated in getInput().

9) That's a design philosophy - I expect malformed input and don't allow my engine to crash. This avoids potential security issues, plus that debugging and bug reports are easier.
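
As an illustration of that philosophy, a shape check like the following could reject a short or malformed FEN before the real parser sees it. It is a minimal sketch under the assumption that the engine wants all six FEN fields; fen_board_ok() and fen_looks_complete() are hypothetical names, and the check looks only at the layout of the string, not at chess legality.

```c
#include <stddef.h>
#include <string.h>

/* Returns 1 if the piece-placement field (first `len` chars of `s`)
   has exactly 8 ranks of exactly 8 squares each. */
int fen_board_ok(const char *s, size_t len)
{
    int ranks = 1, files = 0;

    for (size_t i = 0; i < len; i++) {
        char c = s[i];
        if (c == '/') {
            if (files != 8)
                return 0;              /* rank too short, e.g. "1k/" */
            ranks++;
            files = 0;
        } else if (c >= '1' && c <= '8') {
            files += c - '0';          /* run of empty squares */
        } else if (c != '\0' && strchr("pnbrqkPNBRQK", c) != NULL) {
            files += 1;                /* one piece */
        } else {
            return 0;                  /* unexpected character */
        }
        if (files > 8)
            return 0;                  /* rank too long */
    }
    return ranks == 8 && files == 8;
}

/* Returns 1 if `fen` has a well-formed board and six whitespace-separated fields. */
int fen_looks_complete(const char *fen)
{
    if (fen == NULL)
        return 0;
    if (!fen_board_ok(fen, strcspn(fen, " \t")))
        return 0;

    int fields = 0;
    const char *p = fen;
    while (*p) {
        p += strspn(p, " \t");         /* skip whitespace between fields */
        if (!*p)
            break;
        fields++;
        p += strcspn(p, " \t");        /* skip over the field itself */
    }
    return fields == 6;
}
```

With this check in front of the parser, the three strings from the earlier test would give a clean rejection for "1k//K4R w" and "1k6/8/K4R2/8/8/8/8/8 w", and acceptance for "1k6/8/K4R2/8/8/8/8/8 w - - 0 1".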

There is another potential issue that could be worth examining, the hash tables. There are no locks and no (obvious) lockless algorithm that would guard concurrent access. If one thread writes while the other has half-read an entry, maybe the engine reacts strangely. That would be a bigger issue under Android for two reasons: first, the hash tables are much smaller than on the PC so that more access collisions can happen, and second, ARM has a weaker memory ordering than x86.

That should be easy to test - if things work with Ethereal set to 1 thread, but not with several, then this is worth closer examination.
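
For reference, one well-known lockless scheme for shared hash tables is the XOR trick: the key is stored XORed with the data word, so an entry torn by a concurrent write fails verification on probe instead of being used. The sketch below assumes a two-word entry and a C11 compiler; TTEntry, tt_store() and tt_probe() are hypothetical names, and nothing here is a claim about what Ethereal's hash table does or needs. On 32-bit ARM, 64-bit atomics may additionally require libatomic.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical two-word entry: the key is never stored on its own. */
typedef struct {
    _Atomic uint64_t key_xor_data;
    _Atomic uint64_t data;
} TTEntry;

/* Writes the data word and the key XORed with that data word. */
void tt_store(TTEntry *e, uint64_t key, uint64_t data)
{
    atomic_store_explicit(&e->data, data, memory_order_relaxed);
    atomic_store_explicit(&e->key_xor_data, key ^ data, memory_order_relaxed);
}

/* Returns 1 and fills *data_out only if both words belong to the same store
   for this key; a half-written entry makes the XOR check fail. */
int tt_probe(TTEntry *e, uint64_t key, uint64_t *data_out)
{
    uint64_t data  = atomic_load_explicit(&e->data, memory_order_relaxed);
    uint64_t check = atomic_load_explicit(&e->key_xor_data, memory_order_relaxed);

    if ((check ^ data) != key)
        return 0;                      /* different position or torn entry */
    *data_out = data;
    return 1;
}
```

A torn entry then just looks like a hash miss, which costs a little search efficiency but never feeds inconsistent data into the search.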

Post by AndrewGrant »

Hash tables have no issues. Ethereal runs on TCEC hardware with zero issues.
Everything I have tried here is fruitless.
I don't think caveats in the UCI spec have any role. Nor do I think that the volatile signal is even an issue.
This issue HAS to be Android specific. I've played tens of millions of games through cutechess and never had a single issue.
I really don't know where to go from here since I have no android dev environment, and these GUIs have no real logging as far as I can tell.

Post by AndrewGrant »

So apparently even V9.00 has the issue now. But I know a user who has played a thousand games with no issue since V9.00 was released. Therefore, the only conclusion I can draw is that an update to Chess For Android, Android itself, or the gcc-arm-linux toolchain has broken the process.