
Using /proc to get a process' current stack trace

190 points| cirowrc | 7 years ago |ops.tips | reply

66 comments

[+] drewg123|7 years ago|reply
FWIW, the equivalent in FreeBSD is 'procstat -kk $PID' Eg:

  % procstat -kk 5592
  PID    TID COMM                TDNAME              KSTACK
 5592 103222 less                -                   mi_switch+0xe1 sleepq_catch_signals+0x405 sleepq_wait_sig+0xf _cv_wait_sig+0x154 tty_wait+0x1c ttydisc_read+0x1f2 ttydev_read+0x64 devfs_read_f+0xdc dofileread+0x95 sys_read+0xc3 amd64_syscall+0x369 fast_syscall_common+0x101

procstat can also do interesting things, like show current rusage state:

  % procstat -r 5592
  PID COMM             RESOURCE                          VALUE
 5592 less             user time                    00:00:00.010805
 5592 less             system time                  00:00:00.002444
 5592 less             maximum RSS                             3172 KB
 5592 less             integral shared memory                   192 KB
 5592 less             integral unshared data                    80 KB
 5592 less             integral unshared stack                  256 KB
 5592 less             page reclaims                            199
 5592 less             page faults                                0
 5592 less             swaps                                      0
 5592 less             block reads                                4
 5592 less             block writes                               0
 5592 less             messages sent                              0
 5592 less             messages received                          0
 5592 less             signals received                           0
 5592 less             voluntary context switches                59
 5592 less             involuntary context switches               0
[+] lelf|7 years ago|reply
Also you can easily get both kernel- and user-level stack trace with dtrace.
[+] drewg123|7 years ago|reply
BTW, I totally suck at formatting stuff in various forums. Is there a way to post pre-formatted text here? Eg, like triple back-ticks will do in slack? ``` stuff... ```
[+] lixtra|7 years ago|reply
If you are running Java instead of a C program, the proc stacktrace shows you just the virtual machine state. You can still get a stacktrace of your Java threads[1].

How about other languages? Python, Ruby?

[1] https://stackoverflow.com/questions/4876274/kill-3-to-get-ja...
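
For Python, one option (a minimal sketch, not taken from the linked answer) is the standard-library `faulthandler` module, which can dump every thread's Python-level traceback when the process receives a signal, much like `kill -3` does for a JVM. The helper name `install_stack_dump_handler` and the choice of `SIGUSR1` are assumptions made for illustration, and `faulthandler.register` is only available on Unix-like systems.

```python
import faulthandler
import signal
import sys


def install_stack_dump_handler(signum=signal.SIGUSR1, file=None):
    """Dump the Python traceback of every thread to `file` when `signum` arrives.

    Hypothetical helper for illustration. The process keeps running after the
    dump. `file` must be a real file object with a file descriptor, because
    faulthandler writes to the descriptor directly from the signal handler.
    """
    target = sys.stderr if file is None else file
    faulthandler.register(signum, file=target, all_threads=True)
    return target
```

With the handler installed early in the program, `kill -USR1 <PID>` from another shell should make the process write its tracebacks to stderr and carry on. This is a userspace, interpreter-level trace, so it complements rather than replaces what /proc shows.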

[+] scottlamb|7 years ago|reply
> If you are running Java instead of a C program, the proc stacktrace shows you just the virtual machine state.

No, this is a _kernel_ backtrace: what is happening in kernelspace on behalf of your process. If the work is being done in userspace (that is, in state R; the thread isn't in a syscall or page fault handler), you'll see essentially nothing here. I just tried it on a userspace busylooper and got this:

    [<0000000000000000>] exit_to_usermode_loop+0x57/0xb0
    [<0000000000000000>] prepare_exit_to_usermode+0x20/0x30
    [<0000000000000000>] 0xfffffffffffffff

Java, Python, and C++ do-nothing programs all look pretty similar.

If you want a userspace stack trace, you need a different tool. If you're using an interpreted (or perhaps JITted) language, yes, you probably want something language-specific.

Also note the current stack trace is a per-thread concept, not a per-process one. If you're looking at a multithreaded program, you want to target the thread(s) of interest with "/proc/<PID>/task/<TID>/stack".
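
The per-thread lookup can be sketched roughly as follows, assuming a Linux /proc layout. The function name `read_thread_stacks` and the `proc_root` parameter are hypothetical, added here only so the code can be pointed at a test directory. Reading the `stack` files typically requires root, so the sketch records a placeholder instead of failing when a file cannot be read.

```python
import sys
from pathlib import Path


def read_thread_stacks(pid, proc_root="/proc"):
    """Return {tid: kernel stack text} for every thread of `pid`.

    Hypothetical helper: walks <proc_root>/<pid>/task/<tid>/stack.
    """
    task_dir = Path(proc_root) / str(pid) / "task"
    stacks = {}
    for entry in sorted(task_dir.iterdir(), key=lambda p: int(p.name)):
        tid = int(entry.name)
        try:
            stacks[tid] = (entry / "stack").read_text()
        except OSError as exc:
            # Usually a permission problem, or the thread exited mid-scan.
            stacks[tid] = f"<unavailable: {exc.__class__.__name__}>"
    return stacks


if __name__ == "__main__":
    if len(sys.argv) == 2 and sys.argv[1].isdigit():
        for tid, stack in read_thread_stacks(int(sys.argv[1])).items():
            print(f"--- TID {tid} ---")
            print(stack.rstrip())
    else:
        print("usage: pass a single numeric PID")
```

Run as root with a PID as the only argument, it prints one kernel stack per thread. Threads that are busy in userspace will show little or nothing, as described above.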

[+] monocasa|7 years ago|reply
This is about the kernel's stack trace during a system call, not the user space stack trace.
[+] ktpsns|7 years ago|reply
Note, the GNU debugger (gdb) can attach to running processes. This should give you a stack trace with readable addresses, cf. https://stackoverflow.com/questions/2308653/can-i-use-gdb-to...
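
A rough sketch of scripting that, assuming `gdb` is installed and you are allowed to ptrace the target (same user or root; some distributions restrict this further). The helper names `gdb_backtrace_command` and `userspace_backtrace` are made up for illustration; `-p`, `-batch`, and `-ex` are regular gdb command-line options.

```python
import shutil
import subprocess


def gdb_backtrace_command(pid):
    """Build a non-interactive gdb invocation that prints every thread's backtrace."""
    return ["gdb", "-p", str(pid), "-batch", "-ex", "thread apply all bt"]


def userspace_backtrace(pid, timeout=30):
    """Attach to `pid`, dump all thread backtraces, detach, and return gdb's output.

    The target is paused while gdb is attached and resumes once gdb exits.
    """
    if shutil.which("gdb") is None:
        raise RuntimeError("gdb not found on PATH")
    result = subprocess.run(
        gdb_backtrace_command(pid),
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return result.stdout
```

How readable the output is depends on whether symbols for the binary and its libraries are available.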
[+] drewg123|7 years ago|reply
The /proc interface tells you what the process is doing in the kernel (eg, after a system call), which is orthogonal to getting a user-space stack trace. Both are helpful. The userspace trace is arguably more helpful. The nice thing about the kernel trace is that you're likely to have symbols (or be able to download them); that is unlikely to be true for userspace if you're running a commercial binary, for example.
[+] jhallenworld|7 years ago|reply
I want this capability for embedded ARM systems. I should be able to call a function to have the current stack trace printed:

https://communities.mentor.com/thread/16468

[+] stefan_|7 years ago|reply
The most widely distributed embedded ARM system software in the world, Android, offers this. They use mini debug info (normal debug info sections compressed, IIRC) and then you can signal a daemon in the background to do a stacktrace on your application.

Not on production images, though.

[+] monocasa|7 years ago|reply
I've done it, but it increases binary size by a whole lot.

What worked better was a last chance hardware fault handler that wrote a partial core dump out to flash, and a tool to create an ELF core dump file from that that GDB can accept.

[+] AlphaWeaver|7 years ago|reply
This is a very well written and formatted article... I found it easy to read!
[+] Annatar|7 years ago|reply
Instead of teaching people how to do this portably across all UNIX-like systems, by sending SIGABRT to the process, the article is steeping them in the GNU/Linux-only way of doing things. This feels exactly like the '90s of the past century, when a lot of people with computer-related careers had no idea that there were other operating systems and other (better) ways of doing things: an Intel-based PC tin bucket with Windows was the one and only truth for them. Now it's exactly the same, except Windows has been replaced with GNU/Linux. 28 years later, the only advancement some people have made is running the proverbial sed 's/Windows/Linux/g'.
[+] seanhunter|7 years ago|reply
That's a very strange response. This is an article about Linux specifically. The author tags the article with Linux and mentions that it's part of a series of articles about Linux. It's not in and of itself a bad idea to have articles that specialise in a specific unix flavour.

Secondly, SIGABRT will cause the process to abort and dump core, will it not? That would give you a userspace stack trace (if you load the core into a debugger) whereas this is how you get the kernel-side stack trace of a still-running process.

I don't know of a way to get the kernel stack trace of a process in a cross-platform way. Is there such a thing?

[+] cthalupa|7 years ago|reply
SIGABRT in most places is not going to both get the stack trace and keep the process running. In fact, I would argue that if you are running something that continues working after it receives a signal telling it to /abort/, it's a bug. What does the word abort mean to you?

As for SmartOS, as someone who ran it at home and in production for years: Keep flying that flag, I guess. I liked it. But I also realized that Joyent, even with the Samsung acquisition, does not have the resources to keep it going in any meaningful way for anyone beyond themselves and people who have the exact same use case as them. Things like lx branded zones are clever. I miss SMF, and am not a fan of systemd. But bpf is better than dtrace. Container management is easier than zone management. I've got fewer bugs dealing with KVM on Linux than I ever did on SmartOS. I spend less time compiling things from source and having to find random patch files to make things work. I know plenty about Solaris and SmartOS and HP-UX and AIX and the BSDs, and I don't think anyone is making the incorrect choice in deciding to learn Linux over any other UNIX-like.

That ship has sailed, man. And there's no compelling reason that it shouldn't have.

[+] mfukar|7 years ago|reply
Using SIGABRT gets you a stack trace in the same manner as a burnt house excuses you from sweeping the floors.
[+] geofft|7 years ago|reply
> Instead of teaching people how to do this portably across all UNIX-like systems, by sending SIGABRT to the process

Sending SIGABRT doesn't do what the article is talking about, on any UNIX. Perhaps you would learn something from listening to the kids these days, like the distinction between a kernel stack and a userspace stack.

[+] monocasa|7 years ago|reply
This is mainly about grabbing the kernel stack trace. There's not a portable way to do that.

Also, doesn't SIGABRT generally kill off the process?
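
A quick way to check this is to start a throwaway child and send it the signal. This is only a sketch: the helper name `abort_child` is made up, and it assumes the child leaves SIGABRT at its default disposition (a program that installs its own handler can behave differently).

```python
import signal
import subprocess
import sys


def abort_child():
    """Start a sleeping child, send it SIGABRT, and return its exit status."""
    child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
    child.send_signal(signal.SIGABRT)
    # subprocess reports death-by-signal as the negative signal number.
    # Depending on ulimit settings, the child may also leave a core file.
    return child.wait(timeout=10)


if __name__ == "__main__":
    status = abort_child()
    print("terminated by SIGABRT:", status == -signal.SIGABRT)
```

The child does not survive the signal, so at best you get a core file to inspect afterwards, not a trace of a process that is still running.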

[+] zerkten|7 years ago|reply
Could you write a blog post in response that describes the better way to achieve this? There are possibly related items which you could throw in for others on achieving better portability.
[+] sctb|7 years ago|reply
Sounds like a good post for Hacker News! If you find it or write it, please go ahead and submit.
[+] pjmlp|7 years ago|reply
Yes, it feels strange to hear millennials talk about UNIX when they are actually talking about GNU/Linux, and many things don't apply to e.g. OS X, AIX or many other variants.
[+] jstanley|7 years ago|reply
Even if that were the case, a free software hegemony is surely better than a proprietary Microsoft hegemony.
[+] someguydave|7 years ago|reply
Umm, what is the alternative O/S that has complete hardware support?