/* Ajith - Syntax Higlighter - End ----------------------------------------------- */

6.30.2010

strace - diagnostic, debugging and reverse engineering tool

Many times we come across hopeless situations where a program when compiled and installed in GNU/Linux just fails to run. Then we have to trace the output of the misbehaving program. But tracing the output of a program throws up a lot of data and it is a daunting task to go through volumes of data. Still there are cases where we are not fruitful in pin pointing the cause of error.

In this situation strace also known as system-call tracer comes for rescue. It is a debugging tool that monitors the system calls used by a program and all the signals it receives.

A system call is the most common way programs communicate with the kernel. System calls include reading and writing data, opening and closing files and all kinds of network communication. Under Linux, a system call is done by calling a special interrupt with the number of the system call and its parameters stored in the CPU's registers.

Using strace is quite simple. There are two ways to let strace monitor a program.


Method 1:

To start strace along with a program, just run the executable with strace as shown below.
strace program-name
For example let us trace ls command.
$ strace ls
execve("/bin/ls", ["ls"], [/* 39 vars */]) = 0
brk(0)                                  = 0x82d4000
access("/etc/ld.so.nohwcap", F_OK)      = -1 ENOENT (No such file or directory)
mmap2(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7787000
access("/etc/ld.so.preload", R_OK)      = -1 ENOENT (No such file or directory)
open("/etc/ld.so.cache", O_RDONLY)      = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=76503, ...}) = 0
mmap2(NULL, 76503, PROT_READ, MAP_PRIVATE, 3, 0) = 0xb7774000
close(3)                                = 0
access("/etc/ld.so.nohwcap", F_OK)      = -1 ENOENT (No such file or directory)
access("/etc/ld.so.nohwcap", F_OK)      = -1 ENOENT (No such file or directory)
open("/lib/libselinux.so.1", O_RDONLY)  = 3
read(3, "177ELF111���������3�3�1���@G��004���"..., 512) = 512
fstat64(3, {st_mode=S_IFREG|0644, st_size=104148, ...}) = 0
mmap2(NULL, 109432, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x41d000
mmap2(0x436000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x18) = 0x436000
close(3)                                = 0
.
.
fstat64(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 0), ...}) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7613000
write(1, "01.c  a.outn", 1201.c  a.out
)           = 12
close(1)                                = 0
munmap(0xb7613000, 4096)                = 0
close(2)                                = 0
exit_group(0)                           = ?
In the above example we are not displaying the complete output of strace command. Even though output from strace looks very complicated, this is only due to many system calls made when loading shared libraries. However, once we have found which system calls are the important ones (mainly open, read, write and the like), the results will look fairly intuitive to us.

Method 2:

If we want to monitor a process which is currently running we can attach to the process using –p option. Thus we can even debug a daemon process.
strace –p <pid-of-the-application>
For e.g
#include <stdio.h>
#include <unistd.h>

int main()
{
   sleep(20);
   return 0;
}
We will compile the above code and run it as a background process. Then we try to monitor the program using its process id as shown below.
$ gcc main.c

$ ./a.out &
[1] 1885

$ strace -p 1885
Process 1885 attached - interrupt to quit
restart_syscall(<... resuming interrupted call ...>) = 0
exit_group(0)                           = ?
Process 1885 detached
[1]+  Done                    ./a.out
In contrast to a debugger, strace does not need a program's source code to produce human-readable output.

Some handy options

Below example is used in the discussion of other important options supported by strace.
#include <stdio.h>

int main(void)
{
  FILE *fd = NULL;

  if(fd = fopen("test","rw"))
    {   
      printf("TEST file openedn");
      fclose(fd);
    }   
  else
    {   
      printf("Failed to open the filen");
    }   

  return 0;
}

Providing the time taken by multiple system calls in a program


Using –c option strace provides summary information on executing a program.

It provides information like number of times a system call is used, time spent executing various system calls, number of times errors returned as shown below.
$ strace -c ./a.out 
Failed to open the file
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 91.47    0.004000        4000         1           execve
  8.53    0.000373         124         3         3 access
  0.00    0.000000           0         1           read
  0.00    0.000000           0         1           write
  0.00    0.000000           0         3         1 open
  0.00    0.000000           0         2           close
  0.00    0.000000           0         3           brk
  0.00    0.000000           0         1           munmap
  0.00    0.000000           0         3           mprotect
  0.00    0.000000           0         7           mmap2
  0.00    0.000000           0         3           fstat64
  0.00    0.000000           0         1           set_thread_area
------ ----------- ----------- --------- --------- ----------------
100.00    0.004373                    29         4 total

Redirecting the output to a file 


Using -o option we can redirect the complex output of strace into a file.
$ strace -o <output-file-name> <program-name>

Time spent per system call 


Using –T option we can get time spent per system call. In the below example we can see time spent per system call is printed at the end of the line.
$ strace -T ./a.out 
execve("./a.out", ["./a.out"], [/* 39 vars */]) = 0 <0.003256>
.
brk(0x9db0000)                          = 0x9db0000 <0.000123>
open("test", O_RDONLY)                  = -1 ENOENT (No such file or directory) <0.000154>
fstat64(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 0), ...}) = 0 <0.000125>
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb77d4000 <0.000121>
write(1, "Failed to open the filen", 24Failed to open the file
) = 24 <0.000258>
exit_group(0)                           = ?

Prefixing time of the day for every line in trace 


It is useful sometimes to track at what time a particular is triggered. By using -t option strace will prefix each line of the trace with the time of day, which will be really helpful to find out at particular time at which call is the process blocked.
$ strace -t ./a.out 
execve("./a.out", ["./a.out"], [/* 39 vars */]) = 0 <0.003256>
.
brk(0x9db0000)                          = 0x9db0000 <0.000123>
open("test", O_RDONLY)                  = -1 ENOENT (No such file or directory) <0.000154>
fstat64(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 0), ...}) = 0 <0.000125>
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb77d4000 <0.000121>
write(1, "Failed to open the filen", 24Failed to open the file
) = 24 <0.000258>
exit_group(0)                           = ?

Tracing only specific system calls 


Using –e option we can also specify which system calls to be traced. To trace only open() and close() system calls use the following command:
$ strace –e trace=’open,close’ <program-name>
Similarly we can also use negation option to not trace specific system calls. If we don’t want to trace open() system call in previous example we can give the below command
$ strace -e trace='!open,close' ./a.out
Check the man page of strace for other options.

Static Functions in C

By default all functions are implicitly declared as extern, which means they're visible across translation units. But when we use static it restricts visibility of the function to the translation unit in which it's defined. So we can say
Functions that are visible only to other functions in the same file are known as static functions.

Let use try out some code about static functions.
main.c
#include "header.h"

int main()
{
hello();
return 0;
}
func.c
#include "header.h"

void hello()
{
printf("HELLO WORLD\n");
}
header.h
#include <stdio.h>

static void hello();
If we compile above code it fails as shown below
$gcc main.c func.c
header.h:4: warning: "hello" used but never defined
/tmp/ccaHx5Ic.o: In function `main':
main.c:(.text+0x12): undefined reference to `hello'
collect2: ld returned 1 exit status
It fails in Linking since function hello() is declared as static and its definition is accessible only within func.c file but not for main.c file. All the functions within func.c can access hello() function but not by functions outside func.c file.

Using this concept we can restrict others from accessing the internal functions which we want to hide from outside world. Now we don't need to create private header files for internal functions.

Note:
For some reason, static has different meanings in in different contexts.

1. When specified on a function declaration, it makes the function local to the file.
2. When specified with a variable inside a function, it allows the variable to retain its value between calls to the function. See static variables.

It seems a little strange that the same keyword has such different meanings...