[arch-general] GDB does not work, sometimes

Rodrigo Rivas rodrigorivascosta at gmail.com
Mon Nov 25 14:17:42 EST 2013


On Mon, Nov 25, 2013 at 7:30 AM, jb <jb.1234abcd at gmail.com> wrote:

> Rodrigo Rivas <rodrigorivascosta <at> gmail.com> writes:
>
> > ...
> > The problem is in the "signal mask". It looks like some process masks the
> > signals in the early boot, and then the signal mask is inherited by all the
> > processes in my session. And, as it seems, `gdb` needs a lot of signals to
> > work properly, but it assumes that they are not masked at the beginning. I
> > don't know if this should be considered a gdb bug or not, but the real
> > problem is elsewhere.
> > ...
> > So I am now pretty sure that some process in the session is corrupting the
> > signal mask. The only thing left is to know which one...
>
> Review it:
> $ pstree -p
>
> Your signal blocking could come from:
> - GUI Login Manager, DE session, ...
>   Check for gnome-session PID and for its parent PID:
>   $ grep SigBlk /proc/$pid/status
> - gnome-terminal (I guess) in which you run gdb
>   Check as above.
>
> This entry gives you an overview of signal states;
> it will help you match processes to e.g. SigBlk pattern:
> $ ps axs | grep fffffffe7ffbfeff
>
> Btw, a similar problem occurred there:
> http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=499569


I think I've finally solved it!
The first process in my session that gets the wrong signal mask is
cinnamon-session, so that's where I started looking.
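
One way to repeat that search is sketched below. It is only an illustration, assuming a Linux /proc filesystem: it walks from a process up through its ancestors and prints the signals each one has blocked, using the same SigBlk field mentioned in the quoted advice. The function names (parse_status, blocked_signals, walk_ancestors) are made up for this example and are not from any tool used in this thread.

```python
import os
import signal


def parse_status(text):
    """Return (ppid, blocked_mask) parsed from the text of /proc/<pid>/status."""
    ppid, mask = None, None
    for line in text.splitlines():
        key, _, value = line.partition(":")
        if key == "PPid":
            ppid = int(value.strip())
        elif key == "SigBlk":
            mask = int(value.strip(), 16)  # SigBlk is a hexadecimal bitmask
    return ppid, mask


def blocked_signals(mask):
    """Decode a SigBlk bitmask: bit N-1 set means signal number N is blocked."""
    return [n for n in range(1, 65) if mask & (1 << (n - 1))]


def signal_name(n):
    """Best-effort name for a signal number (falls back to 'SIG<n>')."""
    try:
        return signal.Signals(n).name
    except ValueError:
        return f"SIG{n}"


def walk_ancestors(pid):
    """Yield (pid, blocked signal numbers) from pid up toward PID 1."""
    while pid and pid > 0:
        try:
            with open(f"/proc/{pid}/status") as f:
                ppid, mask = parse_status(f.read())
        except OSError:
            break  # process vanished or /proc is not readable
        yield pid, blocked_signals(mask or 0)
        pid = ppid


if __name__ == "__main__":
    # Start from this script's own process; pass another PID to inspect it.
    for pid, blocked in walk_ancestors(os.getpid()):
        names = ", ".join(signal_name(n) for n in blocked) or "(none)"
        print(f"{pid:>7}  {names}")
```

Run from a terminal inside the affected session, the first ancestor (reading upward) whose list changes is a reasonable place to start looking.
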
I managed to trace the bug as far as the connection to logind via D-Bus. That
connection is made using glib/gio, so I looked there next.
Then I traced the glib code down to a deeply buried call to `pthread_create()`.
That is in glibc! So I got the sources and debugged again.
There things get complicated... The bug happens somewhere between glib
calling `pthread_create()` and glibc's implementation of that very same
function. I got a few stack traces and they all pointed to one suspect...

    /usr/lib/libnvidia-tls.so.331.20

Alas, the source of that file is not available, so my investigation ends
here.
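
As background on why a `pthread_create()` call matters for signal masks: under POSIX, a new thread starts with a copy of the signal mask of the thread that created it. The sketch below only illustrates that inheritance, using Python's standard signal and threading modules on a POSIX system; it does not reproduce the problem in libnvidia-tls, whose internals are not known here. The helper names are hypothetical.

```python
import signal
import threading


def current_mask():
    """Return the calling thread's blocked-signal set without changing it."""
    return signal.pthread_sigmask(signal.SIG_BLOCK, [])


def mask_seen_by_new_thread():
    """Start a thread and return the signal mask it begins with."""
    seen = []
    t = threading.Thread(target=lambda: seen.append(current_mask()))
    t.start()
    t.join()
    return seen[0]


def demo():
    """Block SIGUSR1, create a thread, and report whether it inherited the block."""
    old = signal.pthread_sigmask(signal.SIG_BLOCK, [signal.SIGUSR1])
    try:
        inherited = mask_seen_by_new_thread()
    finally:
        # Restore the original mask so the caller is left unchanged.
        signal.pthread_sigmask(signal.SIG_SETMASK, old)
    return signal.SIGUSR1 in inherited
```

Calling demo() reports whether the new thread saw SIGUSR1 as blocked. Processes inherit their mask across fork and exec in a similar way, which is how one wrong mask early in a session can reach every program started later, gdb included.
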

I rolled back to nvidia-325.15-11 and
{nvidia-libgl,nvidia-utils,opencl-nvidia}-325.15-1, and everything is back to
normal 8-).

A quick web search shows that this has happened before [1] [2] [3].

I'm reporting my findings out there.

To anybody who is still reading, thank you for your attention.
-- 
Rodrigo.


[1] https://devtalk.nvidia.com/default/topic/638521/linux/gnome-terminal-problems-ctrl-c-and-exit/
[2] https://bugzilla.redhat.com/show_bug.cgi?id=1028272
[3] https://bbs.archlinux.org/viewtopic.php?pid=1350302

