Help request for debugging ICS

hgm
Posts: 27787
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Help request for debugging ICS

Post by hgm »

After many months of flawless running, the ICS on which I organize the monthly blitz tournaments suddenly became crash-prone in October, and now typically crashes within the hour. Nothing was changed, neither in the ICS nor in the OS of the VPS it is running on. My experience with Linux is quite limited, and I am not sure how to figure out what causes these crashes.

The ICS itself is so friendly that it catches segfaults, through the following handler:

Code:

/*
  give a decent backtrace on segv
*/
static void segv_handler(int sig)
{
	char cmd[100];
	snprintf(cmd, sizeof(cmd), "/home/mics/bin/backtrace %d > /home/mics/chessd/segv_%d 2>&1", 
		 (int)getpid(), (int)getpid());
	system(cmd);
	_exit(1);
}
Indeed, many segv_* files were created in the home directory, corresponding to the times when we suffered crashes:

Code:

mics@www:~/chessd$ ls -lt
total 140
-rw-r--r--  1 mics mics    40 2016-12-24 14:13 segv_9841
-rw-------  1 mics mics  8192 2016-12-23 21:30 config.tdb
-rw-------  1 mics mics   696 2016-12-23 21:30 news.tdb
-rw-r--r--  1 mics mics    40 2016-12-03 12:32 segv_15471
-rw-r--r--  1 mics mics    40 2016-11-19 23:09 segv_1092
drwxr-xr-x  2 mics mics 20480 2016-11-19 23:05 spool
-rw-------  1 mics mics 27360 2016-11-19 21:10 admin.log
-rw-r--r--  1 mics mics    40 2016-11-19 20:38 segv_628
-rw-r--r--  1 mics mics    40 2016-11-19 19:24 segv_15291
-rw-r--r--  1 mics mics    40 2016-11-19 18:27 segv_11847
-rw-r--r--  1 mics mics    40 2016-11-19 15:43 segv_5887
-rw-r--r--  1 mics mics    40 2016-11-16 23:07 segv_28079
-rw-r--r--  1 mics mics    40 2016-10-23 08:35 segv_19347
-rw-r--r--  1 mics mics    40 2015-07-16 03:33 segv_28315
-rw-r--r--  1 mics mics    40 2015-05-08 12:56 segv_27428
-rw-r--r--  1 mics mics    40 2015-05-08 10:55 segv_27290
-rw-r--r--  1 mics mics    40 2015-05-08 10:43 segv_26709
-rw-r--r--  1 mics mics    40 2015-05-06 17:33 segv_13933
-rw-r--r--  1 mics mics    40 2015-02-24 02:38 segv_28503
drwxr-xr-x  2 mics mics  4096 2014-04-29 12:24 bin
drwxr-xr-x 10 mics mics  4096 2014-04-29 12:18 data
drwxr-xr-x  5 mics mics  4096 2014-04-29 12:18 games
drwxr-xr-x  2 mics mics  4096 2014-04-29 12:18 lib
drwxr-xr-x 28 mics mics  4096 2014-04-29 12:18 players
(Note that in the crash-free periods since October the ICS was simply not running. According to the change log, I fixed a buffer overrun on May 8, 2015.) Unfortunately the segv_* files do not contain any useful information; they all contain the single line

Code:

sh: /home/mics/bin/backtrace: not found
Apparently I am missing a tool called 'backtrace' that ancient operators of the ICS still had, but that is no longer part of the project.
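
One possible stand-in for the missing external tool is to let the process write its own stack trace. The sketch below assumes the ICS is linked against glibc, which provides backtrace() and backtrace_symbols_fd() in <execinfo.h>. The names dump_backtrace and segv_backtrace_handler and the output file name are hypothetical and not part of the ICS sources.

```c
#include <execinfo.h>
#include <fcntl.h>
#include <unistd.h>

#define MAX_FRAMES 64

/* Write the current call stack to the file at `path`.
 * Returns the number of frames captured, or -1 if the file cannot be opened. */
int dump_backtrace(const char *path)
{
    void *frames[MAX_FRAMES];
    int count;
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);

    if (fd < 0)
        return -1;
    count = backtrace(frames, MAX_FRAMES);
    /* backtrace_symbols_fd() writes one line per frame straight to fd
     * and does not allocate memory. */
    backtrace_symbols_fd(frames, count, fd);
    close(fd);
    return count;
}

/* Hypothetical replacement for segv_handler: dump the stack, then exit.
 * A fixed file name (an assumption) avoids formatting inside the handler. */
void segv_backtrace_handler(int sig)
{
    (void)sig;
    dump_backtrace("segv_backtrace.txt");
    _exit(1);
}
```

It would be installed with signal(SIGSEGV, segv_backtrace_handler) in place of the existing handler. The output may consist mostly of raw addresses; linking with -rdynamic generally helps glibc print function names, and addresses can otherwise be mapped to source lines with addr2line on a binary built with -g.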

How could I figure out where the ICS is crashing? I suppose I could run it under gdb. Unfortunately, gdb is not installed on the VPS where the ICS is running. "apt-get install" does not work on that machine, because the repositories for the OS (Ubuntu 10.04) no longer exist.

Are there other ways to get a working gdb on that system? Or other ways to get a stack trace? Would it be possible to force a core dump from the segfault handler, which could be taken to another machine for post-mortem debugging? If so, how?
tttony
Posts: 268
Joined: Sun Apr 24, 2011 12:33 am

Re: Help request for debugging ICS

Post by tttony »

You can install gdb adding this PPA: https://code.launchpad.net/~nitrof22/+a ... ubuntu/ppa (I got it from here: http://askubuntu.com/questions/100296/i ... or-gdb-7-4)

Add it to: /etc/apt/sources.list

Code:

deb http://ppa.launchpad.net/nitrof22/ppa/ubuntu lucid main
deb-src http://ppa.launchpad.net/nitrof22/ppa/ubuntu lucid main
Then install gdb:

Code:

sudo apt-get install gdb
jdart
Posts: 4366
Joined: Fri Mar 10, 2006 5:23 am
Location: http://www.arasanchess.org

Re: Help request for debugging ICS

Post by jdart »

1. Compile the code with -g. Consider not optimizing or, if you do have optimization flags on, add -fno-omit-frame-pointer.

2. You can start gdb and attach to the process once it is running. Just type "attach <pid>" on the gdb command line, where "<pid>" is the process id obtained from "ps". attach automatically halts the program at its current location.

3. Breakpoint on the segv handler. "break handler_file.cpp:line_no".

4. Type "continue" to resume program execution

5. When the segv happens you will hit the breakpoint. Type "where" to see the stack trace. You can also examine variables, etc.

For buffer overflow issues, also consider using a tool like valgrind or the gcc6 flags -fsanitize=bounds and -fsanitize=address (you must pass these to both the compiler and the linker). These will detect issues at runtime. You can also submit the code to Coverity Scan (https://scan.coverity.com/), which will do a static analysis for you.

--Jon
hgm
Posts: 27787
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Help request for debugging ICS

Post by hgm »

tttony wrote:You can install gdb adding this PPA: https://code.launchpad.net/~nitrof22/+a ... ubuntu/ppa (I got it from here: http://askubuntu.com/questions/100296/i ... or-gdb-7-4)

Add it to: /etc/apt/sources.list

Code:

deb http://ppa.launchpad.net/nitrof22/ppa/ubuntu lucid main
deb-src http://ppa.launchpad.net/nitrof22/ppa/ubuntu lucid main
Then install gdb:

Code:

sudo apt-get install gdb
OK, after an extra "apt-get update" this worked. So I managed to obtain a gdb on that machine now. Thanks!

Fortunately -g is amongst the standard flags for compiling, in the Makefile. I also removed the catching of SIGSEGV, and set unlimited core dumps.

Now it is just a matter of waiting for it to segfault again. You are all invited to log in and start games, in order to stress it!
Sven
Posts: 4052
Joined: Thu May 15, 2008 9:57 pm
Location: Berlin, Germany
Full name: Sven Schüle

Re: Help request for debugging ICS

Post by Sven »

I get "connection closed", is ICS already online?
Sven
Posts: 4052
Joined: Thu May 15, 2008 9:57 pm
Location: Berlin, Germany
Full name: Sven Schüle

Re: Help request for debugging ICS

Post by Sven »

Sven Schüle wrote:I get "connection closed", is ICS already online?
I could also rephrase: did it crash already?
flok

Re: Help request for debugging ICS

Post by flok »

If you need more detailed debug info, try -ggdb3 instead of -g
hgm
Posts: 27787
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Help request for debugging ICS

Post by hgm »

Yes, it crashed already. Unfortunately I cannot find any core dump, even though I commented out the catching of SIGSEGV. I had hoped to do a post-mortem debug.

I have restarted it now interactively under gdb. The problem will be to keep the ssh connection to the VPS through which I operate gdb open until the crash occurs, as it tends to time out when I just wait.
Volker Annuss
Posts: 180
Joined: Mon Sep 03, 2007 9:15 am

Re: Help request for debugging ICS

Post by Volker Annuss »

Try to compile the ICS with -g and run it under valgrind with

Code:

valgrind --log-file=ics.log <your ICS start command>
and you'll get your stack trace and more. See the valgrind manual for more options. Expect the ICS to run about 30 times slower than normal.

You can download valgrind from http://valgrind.org and install it with the triple jump

Code:

./configure
make
make install
No need to get it from the old Ubuntu repositories.
Evert
Posts: 2929
Joined: Sat Jan 22, 2011 12:42 am
Location: NL

Re: Help request for debugging ICS

Post by Evert »

hgm wrote:Yes, it crashed already. Unfortunately I cannot find any core dump, even though I commented out the catching of SIGSEGV. I had hoped to do a post-mortem debug.
Make sure core dumps are enabled (type "ulimit -c unlimited").
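
The ulimit setting only applies to the shell it is typed in and to processes started from that shell. If the ICS is launched some other way, the limit can also be raised from inside the program at startup. This is a rough sketch using the POSIX getrlimit/setrlimit calls; the function names are hypothetical.

```c
#include <sys/resource.h>

/* Raise the soft core-file size limit to the hard limit, which is the
 * most an unprivileged process may ask for.
 * Returns 0 on success, -1 on failure. */
int raise_core_limit(void)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_CORE, &rl) != 0)
        return -1;
    rl.rlim_cur = rl.rlim_max;
    if (setrlimit(RLIMIT_CORE, &rl) != 0)
        return -1;
    return 0;
}

/* Returns 1 if the current soft limit permits a core file at all,
 * 0 if core dumps are disabled, -1 on error. */
int core_dumps_permitted(void)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_CORE, &rl) != 0)
        return -1;
    return rl.rlim_cur != 0;
}
```

Calling raise_core_limit() early in main() and logging the result of core_dumps_permitted() would show whether the hard limit itself is zero. If the limit is fine and there is still no core file, it may be worth checking that the working directory of the process is writable and what /proc/sys/kernel/core_pattern contains, since on some Ubuntu installations cores are piped to a crash-reporting helper rather than written as a file.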