Help request for debugging ICS

hgm
Posts: 25049
Joined: Fri Mar 10, 2006 9:06 am
Location: Amsterdam
Full name: H G Muller

Help request for debugging ICS

Post by hgm » Sat Jan 07, 2017 4:46 pm

After many months of flawless running, the ICS on which I organize the monthly blitz tournaments suddenly became crash-prone in October, and it now typically crashes within the hour. Nothing was changed, neither in the ICS nor in the OS of the VPS it is running on. My experience with Linux is quite limited, and I am not sure how to figure out what causes these crashes.

The ICS itself is so friendly that it catches segfaults, through the following handler:

Code:

/*
  give a decent backtrace on segv
*/
static void segv_handler(int sig)
{
	char cmd[100];
	snprintf(cmd, sizeof(cmd), "/home/mics/bin/backtrace %d > /home/mics/chessd/segv_%d 2>&1", 
		 (int)getpid(), (int)getpid());
	system(cmd);
	_exit(1);
}

Indeed, many segv_* files were created in the home directory, at the times when we suffered crashes:

Code:

mics@www:~/chessd$ ls -lt
total 140
-rw-r--r--  1 mics mics    40 2016-12-24 14:13 segv_9841
-rw-------  1 mics mics  8192 2016-12-23 21:30 config.tdb
-rw-------  1 mics mics   696 2016-12-23 21:30 news.tdb
-rw-r--r--  1 mics mics    40 2016-12-03 12:32 segv_15471
-rw-r--r--  1 mics mics    40 2016-11-19 23:09 segv_1092
drwxr-xr-x  2 mics mics 20480 2016-11-19 23:05 spool
-rw-------  1 mics mics 27360 2016-11-19 21:10 admin.log
-rw-r--r--  1 mics mics    40 2016-11-19 20:38 segv_628
-rw-r--r--  1 mics mics    40 2016-11-19 19:24 segv_15291
-rw-r--r--  1 mics mics    40 2016-11-19 18:27 segv_11847
-rw-r--r--  1 mics mics    40 2016-11-19 15:43 segv_5887
-rw-r--r--  1 mics mics    40 2016-11-16 23:07 segv_28079
-rw-r--r--  1 mics mics    40 2016-10-23 08:35 segv_19347
-rw-r--r--  1 mics mics    40 2015-07-16 03:33 segv_28315
-rw-r--r--  1 mics mics    40 2015-05-08 12:56 segv_27428
-rw-r--r--  1 mics mics    40 2015-05-08 10:55 segv_27290
-rw-r--r--  1 mics mics    40 2015-05-08 10:43 segv_26709
-rw-r--r--  1 mics mics    40 2015-05-06 17:33 segv_13933
-rw-r--r--  1 mics mics    40 2015-02-24 02:38 segv_28503
drwxr-xr-x  2 mics mics  4096 2014-04-29 12:24 bin
drwxr-xr-x 10 mics mics  4096 2014-04-29 12:18 data
drwxr-xr-x  5 mics mics  4096 2014-04-29 12:18 games
drwxr-xr-x  2 mics mics  4096 2014-04-29 12:18 lib
drwxr-xr-x 28 mics mics  4096 2014-04-29 12:18 players

(Note that in the crash-free periods since October the ICS was simply not running. According to the change log, I fixed a buffer overrun on May 8, 2015.) Unfortunately the segv_* files do not contain any useful information; they all contain the single line

Code:

sh: /home/mics/bin/backtrace: not found

Apparently I am missing a tool called 'backtrace' that earlier operators of the ICS still had, but which is no longer part of the project.
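
One way around the missing external tool, sketched here under assumptions, is to let the handler write the stack trace itself with glibc's backtrace() and backtrace_symbols_fd() from <execinfo.h>. This is not how the original ICS code works: write_backtrace() and install_segv_backtrace() are hypothetical names, and only the output path is taken from the handler above. Function names appear in the output only for symbols in the dynamic symbol table (linking with -rdynamic helps); otherwise the raw addresses can be translated afterwards with addr2line on a binary built with -g.

```c
#include <execinfo.h>  /* backtrace(), backtrace_symbols_fd(): glibc-specific */
#include <fcntl.h>
#include <signal.h>
#include <stdio.h>
#include <unistd.h>

#define MAX_FRAMES 64

/* Hypothetical helper: write the caller's stack trace to `path`.
   Returns the number of frames written, or -1 if the file cannot be opened. */
int write_backtrace(const char *path)
{
	void *frames[MAX_FRAMES];
	int fd, n;

	fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
	if (fd < 0)
		return -1;
	n = backtrace(frames, MAX_FRAMES);
	/* Writes one line per frame straight to fd, without calling malloc(). */
	backtrace_symbols_fd(frames, n, fd);
	close(fd);
	return n;
}

/* Same output file name as the original handler, but no external tool. */
static void segv_handler(int sig)
{
	char path[100];

	(void)sig;
	snprintf(path, sizeof(path), "/home/mics/chessd/segv_%d", (int)getpid());
	write_backtrace(path);
	_exit(1);
}

/* Hypothetical helper: call once at startup. */
void install_segv_backtrace(void)
{
	void *warmup[1];

	/* The first backtrace() call may load libgcc; do that outside the handler. */
	backtrace(warmup, 1);
	signal(SIGSEGV, segv_handler);
}
```

Calling install_segv_backtrace() early in main() would replace the existing signal(SIGSEGV, ...) registration. The trace is only as reliable as the stack at the moment of the crash, so a badly corrupted stack can still yield a short or misleading file.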

How could I figure out where the ICS is crashing? I suppose I could run it under gdb. Unfortunately, gdb is not installed on the VPS where the ICS is running. "apt-get install" does not work on that machine, because the repositories for the OS (Ubuntu 10.04) no longer exist.

Are there other ways to get a working gdb on that system? Or other ways to get a stack trace? Would it be possible to force a core dump from the segfault handler, which could be taken to another machine for post-mortem debugging? If so, how?

tttony
Posts: 263
Joined: Sat Apr 23, 2011 10:33 pm

Re: Help request for debugging ICS

Post by tttony » Sat Jan 07, 2017 5:14 pm

You can install gdb adding this PPA: https://code.launchpad.net/~nitrof22/+a ... ubuntu/ppa (I got it from here: http://askubuntu.com/questions/100296/i ... or-gdb-7-4)

Add it to /etc/apt/sources.list:

Code:

deb http://ppa.launchpad.net/nitrof22/ppa/ubuntu lucid main
deb-src http://ppa.launchpad.net/nitrof22/ppa/ubuntu lucid main

Then install gdb:

Code:

sudo apt-get install gdb

jdart
Posts: 4014
Joined: Fri Mar 10, 2006 4:23 am
Location: http://www.arasanchess.org

Re: Help request for debugging ICS

Post by jdart » Sat Jan 07, 2017 5:44 pm

1. Compile the code with -g. Consider not optimizing, or, if you do have optimization flags on, add -fno-omit-frame-pointer.

2. You can start gdb and attach to the process once it is running. Just type "attach <pid>" on the gdb command line, where "<pid>" is the process id obtained from "ps". attach automatically halts the program at its current location.

3. Breakpoint on the segv handler. "break handler_file.cpp:line_no".

4. Type "continue" to resume program execution.

5. When the segv happens you will hit the breakpoint. Type "where" to see the stack trace. You can also examine variables, etc.

For buffer overflow issues, also consider using a tool like valgrind or the gcc6 flags -fsanitize=bounds and -fsanitize=address (you must pass these to both the compiler and the linker). These will detect issues at runtime. You can also submit the code to Coverity Scan (https://scan.coverity.com/), which will do a static analysis for you.

--Jon

hgm

Re: Help request for debugging ICS

Post by hgm » Sat Jan 07, 2017 6:55 pm

tttony wrote:You can install gdb adding this PPA: https://code.launchpad.net/~nitrof22/+a ... ubuntu/ppa (I got it from here: http://askubuntu.com/questions/100296/i ... or-gdb-7-4)

Add it to /etc/apt/sources.list:

Code:

deb http://ppa.launchpad.net/nitrof22/ppa/ubuntu lucid main
deb-src http://ppa.launchpad.net/nitrof22/ppa/ubuntu lucid main

Then install gdb:

Code:

sudo apt-get install gdb

OK, after an extra "apt-get update" this worked. So I managed to obtain a gdb on that machine now. Thanks!

Fortunately, -g is among the standard compile flags in the Makefile. I also removed the catching of SIGSEGV, and set the core dump size limit to unlimited.

Now it is just a matter of waiting for it to segfault again. You are all invited to log in and start games, in order to stress it!

Sven
Posts: 3883
Joined: Thu May 15, 2008 7:57 pm
Location: Berlin, Germany
Full name: Sven Schüle

Re: Help request for debugging ICS

Post by Sven » Sat Jan 07, 2017 8:11 pm

I get "connection closed". Is the ICS already online?

Sven

Re: Help request for debugging ICS

Post by Sven » Sat Jan 07, 2017 8:16 pm

Sven Schüle wrote:I get "connection closed". Is the ICS already online?
I could also rephrase: did it crash already?

flok

Re: Help request for debugging ICS

Post by flok » Sat Jan 07, 2017 8:17 pm

If you need more detailed debug info, try -ggdb3 instead of -g

hgm

Re: Help request for debugging ICS

Post by hgm » Sat Jan 07, 2017 9:42 pm

Yes, it crashed already. Unfortunately I cannot find any core dump, even though I commented out the catching of SIGSEGV. I had hoped to do a post-mortem debug.

I have now restarted it interactively under gdb. The problem will be keeping the ssh connection to the VPS, through which I operate gdb, open until the crash occurs, as it tends to time out when I just wait.

Volker Annuss
Posts: 173
Joined: Mon Sep 03, 2007 7:15 am

Re: Help request for debugging ICS

Post by Volker Annuss » Sat Jan 07, 2017 9:46 pm

Try to compile the ICS with -g and run it under valgrind with

Code:

valgrind --log-file=ics.log <your ICS start command>

and you'll get your stack trace and more. See the valgrind manual for more options. Expect the ICS to run about 30 times slower than normal.

You can download valgrind from http://valgrind.org and install it with the triple jump

Code:

./configure
make
make install
No need to get it from the old Ubuntu repositories.

Evert
Posts: 2929
Joined: Fri Jan 21, 2011 11:42 pm
Location: NL

Re: Help request for debugging ICS

Post by Evert » Sat Jan 07, 2017 9:56 pm

hgm wrote:Yes, it crashed already. Unfortunately I cannot find any core dump, even though I commented out the catching of SIGSEGV. I had hoped to do a post-mortem debug.
Make sure core dumps are enabled (type "ulimit -c unlimited").
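
Note that "ulimit -c" only affects the shell in which it is typed and the processes started from that shell, so it has to be set in the same session that launches the ICS. When no core file turns up, it can also help to check what the process itself sees. The following is a small diagnostic sketch, assuming Linux; read_core_pattern() and report_core_settings() are hypothetical names. It prints the soft core-size limit and the kernel's core_pattern, which decides the name and location of the core file, or names a program the dump is piped to (a line starting with '|').

```c
#include <stdio.h>
#include <string.h>
#include <sys/resource.h>

/* Read the kernel's core_pattern setting into buf (Linux-specific path).
   Returns 0 on success, -1 if it cannot be read. */
int read_core_pattern(char *buf, size_t len)
{
	FILE *f = fopen("/proc/sys/kernel/core_pattern", "r");

	if (f == NULL)
		return -1;
	if (fgets(buf, (int)len, f) == NULL) {
		fclose(f);
		return -1;
	}
	fclose(f);
	buf[strcspn(buf, "\n")] = '\0';  /* strip the trailing newline */
	return 0;
}

/* Print the settings that decide whether and where a core file is written.
   Returns 0 on success, -1 if the limit cannot be queried. */
int report_core_settings(FILE *out)
{
	struct rlimit rl;
	char pattern[256];

	if (getrlimit(RLIMIT_CORE, &rl) != 0)
		return -1;
	if (rl.rlim_cur == RLIM_INFINITY)
		fprintf(out, "soft core limit: unlimited\n");
	else
		fprintf(out, "soft core limit: %llu bytes\n",
			(unsigned long long)rl.rlim_cur);
	if (rl.rlim_cur == 0)
		fprintf(out, "core dumps are disabled for this process\n");
	if (read_core_pattern(pattern, sizeof(pattern)) == 0)
		fprintf(out, "core_pattern: %s\n", pattern);
	else
		fprintf(out, "core_pattern: unreadable\n");
	return 0;
}
```

Calling report_core_settings(stderr) at ICS startup would show whether the limit set in the shell actually reached the server process. With a plain relative pattern such as "core", the file is written to the current working directory of the crashing process, which must be writable by it.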
