codingfreak

7.11.2012

GIT - Adding diffmerge as visual merge in git

GIT is one of the popular distributed code repositories in opensource community, especially with developers working on opensource projects like linux kernel and others.

Operations like Merge and Diff are the most irritating & tricky tasks via command line when working on large code changes. So let us see how can we add some graphical stuff for these operations when using on GIT so that we can do merging and other options very easily.

TCPDUMP cheat sheet

Decoding hardware information of a PC

Checking out machine hardware information is no more a geeky thing of olden days where you need to go through the hardware specification documents or to open up a physical machine to find out the hardware details if the specification documents are missing. Now we have some really cool handy software tools to help us out.

I thought to make a page with some important tools which help us in decoding the hardware related information on a Linux Box. Feel free to drop a comment with the tool names I missed out in this post.

NOTE: All of them are ordered in alphabetical order and I am trying these tools on a Ubuntu machine running in virtual box. So some of the tool outputs might be displaying names likes VirtualBox and Oracle Corporation.

Uli Drepper - Basics you should know about buffer overflow

Part 1: Buffer Overflows

Connecting via SSH without password prompt

SSH is the common way to connect remote machines these days. Almost all kinds of applications use SSH in background for communicating with end machines.

But sometimes typing password repeatedly for establishing a SSH connection with the trusted end machine is quite daunting task.

Let us see how to automate the things in 3 simple steps where we can ssh to "user@host" without asking a password every time. Replace user with the username and host with the remote machine ip-address or hostname. For E.g. john@192.168.245.129

NOTE: Only do this with trusted machines.

Daemon-izing a Process in Linux

A Linux process works either in foreground or background.

A process running in foreground can interact with the user in front of the terminal. To run a.out in foreground we execute as shown below.

./a.out

When a process runs as background process then it runs by itself without any user interaction. The user can check its status but he doesn't (need to) know what it is doing. To run a.out in background we execute as shown below.

$ ./a.out &
[1] 3665

As shown above when we run a process with & at the end then the process runs in background and returns the process id (3665 in above example).

what is a DAEMON Process?
A 'daemon' process is a process that runs in background, begins execution at startup
(not neccessarily), runs forever, usually do not die or get restarted, waits for requests to arrive and respond to them and frequently spawn other processes to handle these requests.

So running a process in BACKGROUND with a while loop logic in code to loop forever makes a Daemon ? Yes and also No. But there are certain things to be considered when we create a daemon process. We follow a step-by-step procedure as shown below to create a daemon process.

1. Create a separate child process - fork() it.
Using fork() system call create a copy of our process(child), then let the parent process exit. Once the parent process exits the Orphaned child process will become the child of init process (this is the initial system process, in other words the parent of all processes). As a result our process will be completely detached from its parent and start operating in background.

pid=fork();

if (pid<0) exit(1); /* fork error */

if (pid>0) exit(0); /* parent exits */

/* child (daemon) continues */

Memory Layout of a C program - Part 2

Continuation of PART-1

As we have seen so much theory in the PART-1 now let us see a real-time example to understand about these segments. we will use size(1) command to list various section sizes in a C code.

A simple C program is given below

#include <stdio.h>

int main()
{
    return 0;
}

$ gcc test.c 
$ size a.out 
   text    data     bss     dec     hex filename
    836     260       8    1104     450 a.out

Now add a global variable as shown below

#include <stdio.h>

int global; /* Uninitialized variable stored in bss*/

int main()
{
    return 0;
}

$ gcc test.c 
$ size a.out 
   text    data     bss     dec     hex filename
    836     260      12    1108     454 a.out

As you can see BSS is incremented by 4 bytes.

Memory Layout of a C program - Part 1

A running program is called a process.

When we run a program, its executable image is loaded into memory area that normally called a process address space in an organized manner. It is organized into following areas of memory, called segments:

text segment
data segment
stack segment
heap segment

Figure 1: Memory layout

text segment

It is also called the code segment.

This is the area where the compiled code of the program itself resides. This is the machine language representation of the program steps to be carried out, including all functions making up the program, both user defined and system.

For example, Linux/Unix arranges things so that multiple running instances of the same program share their code if possible. Only one copy of the instructions for the same program resides in memory at any time and also it is often read-only, to prevent a program from accidentally modifying its instructions.

The portion of the executable file containing the text segment is the text section.

Deletion of a Node from Doubly Linked List

This article is part of article series - "Datastructures".

Previous Article: Inserting a Node in Doubly Linked List.

Deletion of a Node

Let us say our current Doubly Linked List is as shown in Figure-1.

Figure 1: Current Doubly Linked List

Deleting First Node of the List

In HEAD

In NODE-2

Decrement LENGTH variable in HEAD so that it maintains proper count of Nodes in the List.

Pseudocode:

HEAD->FIRST = HEAD->FIRST->NEXT

NODE-2->PREV = NULL

decrement(HEAD->LENGTH)

Output:

Figure 2: After deleting the First Node in the List.

Inserting a Node in Doubly Linked List

This article is part of article series - "Datastructures"

Previous Article: Doubly Linked List Next Article: Deletion of a Node from Doubly Linked List.

Insertion of a Node

Before we discuss about how to insert a NODE let us discuss few rules to follow at the time of insertion.

Check the location into which the user want to insert a new NODE. The possible locations where an user can insert a new node is in the range of 1 <= loc <= (length of list)+1. Let us say the length of the list is 10 & the user want to insert at location 12 (sounds stupid).

As we know we can traverse Bi-Directional in case of Doubly Linked Lists so we have to take care of PREV and NEXT variables in the NODE structure. We should also update the neighboring Nodes which are affected by this operation. If not we might break up the List somewhere or the other by creating a BROKEN LIST.

We have following scenarios in the case of insertion of a NODE.

Adding a Node at the start of the Empty List

Figure 1: Empty List and the newNode we want to add

NULL

In HEAD

In newNode

NULL

Increment the LENGTH variable in HEAD once insertion is successful to maintain the count of number of Nodes in the List.

Pseudocode:

HEAD->FIRST = newNode

newNode->PREV = NULL

newNode->NEXT = NULL

increment(HEAD->LENGTH)

Output:

Figure 2: After adding newNode in Empty List. (Changes in BLUE)

Implementation of Doubly Linked List in C

In computer science, a doubly linked list is a linked data structure that consists of a set of sequentially linked records called Nodes. Each Node contains two fields, called Links, that are references to the Previous and to the Next Node in the sequence of Nodes as well as field named Data.
For every Linked List we have something called Head which marks the starting of a list. So we have two main structures namely

Node
Head

Why we need a new structure for HEAD variable ?
Just for convenience I decided to have HEAD its own structure. You can even use the Node structure.

Node
Every Node in a Doubly Linked List has three main members namely

PREV
DATA
NEXT

As their names say

PREV - holds the memory location of the Previous Node in the List. If there are none we point towards NULL. For the First Node in the List PREV points to NULL.
NEXT - holds the memory location of the Next Node in the List. If there are none we point towards NULL. For the Last Node in the List NEXT points to NULL.
DATA - In simple words it holds Data. In our case it holds the memory location to the actual data to be held by the Node.

typedef struct node
{
    struct node *prev;
    void        *data;
    struct node *next;
}NODE;

Head

Head acts as the "head" of the List. Head structure has two members namely

LENGTH - holds the count of number of Nodes in the List.
FIRST - hold the memory location of the first Node in the List. If the List is EMPTY it points to NULL.

typedef struct head 
{
    unsigned int length;
    struct node  *first;
}HEAD;

NOTE: Our Head structure doesn't contain any pointer to the Tail of the List. Eventhough its a best way to include a pointer to Tail Node we decided to cover that implementation in Circular Doubly Linked List.

Notification Chains in Linux Kernel - Part 03

Continuation after PART-2.

Notifying Events on a Chain
Notifications are generated with notifier_call_chain. This function simply invokes, in order of priority, all the callback routines registered against the chain. Note that callback routines are executed in the context of the process that calls notifier_call_chain. A callback routine could, however, be implemented so that it queues the notification somewhere and wakes up a process that will look at it.

NOTE: Similar to register and unregister functions we don't directly call notifier_call_chain function as we have wrapper functions for respective chains.

 <kernel/notifier.c>

 58 static int __kprobes notifier_call_chain(struct notifier_block **nl,
 59                     unsigned long val, void *v,
 60                     int nr_to_call, int *nr_calls)
 61 {
 62     int ret = NOTIFY_DONE;
 63     struct notifier_block *nb, *next_nb;
 64 
 65     nb = rcu_dereference(*nl);
 66 
 67     while (nb && nr_to_call) {
 68         next_nb = rcu_dereference(nb->next);
 69         ret = nb->notifier_call(nb, val, v);
 .
 76         nb = next_nb;
 77         nr_to_call--;
 78     }
 79     return ret;
 80 }

nl
Notification chain.

val
Event type. The chain itself identifies a class of events; val unequivocally identifies an event type (i.e., NETDEV_REGISTER).

v
Input parameter that can be used by the handlers registered by the various clients. This can be used in different ways under different circumstances. For instance, when a new network device is registered with the kernel, the associated notification uses v to identify the net_device data structure.

nr_to_call
Number of notifier functions to be called. Don't care value of this parameter is -1.

nr_calls
Records the number of notifications sent. Don't care value of this field is NULL.

Notification Chains in Linux Kernel - Part 02

Continuation after PART-1.

Check the PART-3.

Blocking Notifier chains
A blocking notifier chain runs in the process context. The calls in the notification list could be blocked as it runs in the process context. Notifications that are not highly time critical could use blocking notifier chains.

Linux modules use blocking notifier chains to inform the modules on a change in QOS value or the addition of a new device.

<kernel/notifier.c>

186 int blocking_notifier_chain_register(struct blocking_notifier_head *nh,
187         struct notifier_block *n)
188 {
.
199     down_write(&nh->rwsem);
200     ret = notifier_chain_register(&nh->head, n);
201     up_write(&nh->rwsem);
202     return ret;
203 }
204 EXPORT_SYMBOL_GPL(blocking_notifier_chain_register)
.
216 int blocking_notifier_chain_unregister(struct blocking_notifier_head *nh,
217         struct notifier_block *n)
218 {
.
229     down_write(&nh->rwsem);
230     ret = notifier_chain_unregister(&nh->head, n);
231     up_write(&nh->rwsem);
232     return ret;
233 }
234 EXPORT_SYMBOL_GPL(blocking_notifier_chain_unregister);

Notification Chains in Linux Kernel - Part 01

Linux is a monolithic kernel. Its subsystems or modules help to keep the kernel light by being flexible enough to load and unload at runtime. In most cases, the kernel modules are interconnected to one another. An event captured by a certain module might be of interest to another module.

Typically, communication systems implement request-reply messaging, or polling. In such models, a program that receives a request will have to send the data available since the last transaction. Such methods sometimes require high bandwidth or they waste polling cycles.

To fulfill the need for interaction, Linux uses so called notification chains. These notifier chains work in a Publish-Subscribe model. This model is more effective when compared to polling or the request-reply model.

For each notification chain there is a passive side (the notified) and an active side (the notifier), as in the so-called publish-and-subscribe model:

The notified are the subsystems that ask to be notified about the event and that provide a callback function to invoke.

The notifier is the subsystem that experiences an event and calls the callback function.

NOTE: All the code samples are taken from Linux 2.6.24 kernel.

struct notifier_block
The elements of the notification chain's list are of type notifier_block:

<include/linux/notifier.h>

 50 struct notifier_block {
 51     int (*notifier_call)
(struct notifier_block *, unsigned long, void *);
 52     struct notifier_block *next;
 53     int priority;
 54 };

notifier_call - function to execute.

next - used to link together the elements of the list.

priority - the priority of the function. Functions with higher priority are executed first. But in practice, almost all registrations leave the priority out of the notifier_block definition, which means it gets the default value of 0 and execution order ends up depending only on the registration order (i.e., it is a semirandom order).

The notifier_block data structure is a simple linked list of function pointers. The function pointers are registered with ‘functions’ that are to be called when an event occurs. Each module needs to maintain a notifier list. The functions are registered to this notification list. The notification module (publisher) maintains a list head that is used to manage and traverse the notifier block list. The function that subscribes to a module is added to the head of the module’s list by using the register_xxxxxx_notifier API and deletion from the list is done using unregister_xxxxxx_notifier.

Printing logs based on log levels in C

LOG LEVELS ??
As per my definition LOG LEVEL means a way to differentiate the importance of logs in our application. We can divide the logs into categories based on their importance and effect for e.g. ERROR logs are more important than DEBUG logs.

Why do we need to print logs based on LOG LEVEL ??
It is really helpful in projects with millions of lines of source code where the user can't use #defines or #ifdef's in order to maintain DEBUG prints. It is really tiresome to maintain #defines and #ifdef atleast for printing logs.

printk which is a part of LINUX KERNEL supports printing logs based on LOG LEVEL and it is really helpful in debugging kernel.

printf or any of its brothers & sisters don't support the option to print logs depending upon the log levels.

Implementation of Stack using Singly Linked Lists

Stacks are linear data structures which means the data is stored in what looks like a line (although vertically). In simple words we can say

A stack is a last in, first out (LIFO) abstract data type and data structure.

Basic usage of stack at the Architecture level is as a means of allocating and accessing memory.

We can only perform two fundamental operations on a stack: push and pop.

The push operation adds to the top of the list, hiding any items already on the stack, or initializing the stack if it is empty. The pop operation removes an item from the top of the list, and returns this value to the caller. A pop either reveals previously concealed items, or results in an empty list.

A stack is a restricted data structure, because only a small number of operations are performed on it.

strace - diagnostic, debugging and reverse engineering tool

Many times we come across hopeless situations where a program when compiled and installed in GNU/Linux just fails to run. Then we have to trace the output of the misbehaving program. But tracing the output of a program throws up a lot of data and it is a daunting task to go through volumes of data. Still there are cases where we are not fruitful in pin pointing the cause of error.

In this situation strace also known as system-call tracer comes for rescue. It is a debugging tool that monitors the system calls used by a program and all the signals it receives.

A system call is the most common way programs communicate with the kernel. System calls include reading and writing data, opening and closing files and all kinds of network communication. Under Linux, a system call is done by calling a special interrupt with the number of the system call and its parameters stored in the CPU's registers.

Using strace is quite simple. There are two ways to let strace monitor a program.

Method 1:

To start strace along with a program, just run the executable with strace as shown below.

strace program-name

For example let us trace ls command.

$ strace ls
execve("/bin/ls", ["ls"], [/* 39 vars */]) = 0
brk(0)                                  = 0x82d4000
access("/etc/ld.so.nohwcap", F_OK)      = -1 ENOENT (No such file or directory)
mmap2(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7787000
access("/etc/ld.so.preload", R_OK)      = -1 ENOENT (No such file or directory)
open("/etc/ld.so.cache", O_RDONLY)      = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=76503, ...}) = 0
mmap2(NULL, 76503, PROT_READ, MAP_PRIVATE, 3, 0) = 0xb7774000
close(3)                                = 0
access("/etc/ld.so.nohwcap", F_OK)      = -1 ENOENT (No such file or directory)
access("/etc/ld.so.nohwcap", F_OK)      = -1 ENOENT (No such file or directory)
open("/lib/libselinux.so.1", O_RDONLY)  = 3
read(3, "177ELF111���������3�3�1���@G��004���"..., 512) = 512
fstat64(3, {st_mode=S_IFREG|0644, st_size=104148, ...}) = 0
mmap2(NULL, 109432, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x41d000
mmap2(0x436000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x18) = 0x436000
close(3)                                = 0
.
.
fstat64(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 0), ...}) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7613000
write(1, "01.c  a.outn", 1201.c  a.out
)           = 12
close(1)                                = 0
munmap(0xb7613000, 4096)                = 0
close(2)                                = 0
exit_group(0)                           = ?

In the above example we are not displaying the complete output of strace command. Even though output from strace looks very complicated, this is only due to many system calls made when loading shared libraries. However, once we have found which system calls are the important ones (mainly open, read, write and the like), the results will look fairly intuitive to us.

Method 2:

If we want to monitor a process which is currently running we can attach to the process using –p option. Thus we can even debug a daemon process.

strace –p <pid-of-the-application>

For e.g

#include <stdio.h>
#include <unistd.h>

int main()
{
   sleep(20);
   return 0;
}

We will compile the above code and run it as a background process. Then we try to monitor the program using its process id as shown below.

$ gcc main.c

$ ./a.out &
[1] 1885

$ strace -p 1885
Process 1885 attached - interrupt to quit
restart_syscall(<... resuming interrupted call ...>) = 0
exit_group(0)                           = ?
Process 1885 detached
[1]+  Done                    ./a.out

In contrast to a debugger, strace does not need a program's source code to produce human-readable output.

Some handy options

Below example is used in the discussion of other important options supported by strace.

#include <stdio.h>

int main(void)
{
  FILE *fd = NULL;

  if(fd = fopen("test","rw"))
    {   
      printf("TEST file openedn");
      fclose(fd);
    }   
  else
    {   
      printf("Failed to open the filen");
    }   

  return 0;
}

Providing the time taken by multiple system calls in a program

Using –c option strace provides summary information on executing a program.

It provides information like number of times a system call is used, time spent executing various system calls, number of times errors returned as shown below.

$ strace -c ./a.out 
Failed to open the file
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 91.47    0.004000        4000         1           execve
  8.53    0.000373         124         3         3 access
  0.00    0.000000           0         1           read
  0.00    0.000000           0         1           write
  0.00    0.000000           0         3         1 open
  0.00    0.000000           0         2           close
  0.00    0.000000           0         3           brk
  0.00    0.000000           0         1           munmap
  0.00    0.000000           0         3           mprotect
  0.00    0.000000           0         7           mmap2
  0.00    0.000000           0         3           fstat64
  0.00    0.000000           0         1           set_thread_area
------ ----------- ----------- --------- --------- ----------------
100.00    0.004373                    29         4 total

Redirecting the output to a file

Using -o option we can redirect the complex output of strace into a file.

$ strace -o <output-file-name> <program-name>

Time spent per system call

Using –T option we can get time spent per system call. In the below example we can see time spent per system call is printed at the end of the line.

$ strace -T ./a.out 
execve("./a.out", ["./a.out"], [/* 39 vars */]) = 0 <0.003256>
.
brk(0x9db0000)                          = 0x9db0000 <0.000123>
open("test", O_RDONLY)                  = -1 ENOENT (No such file or directory) <0.000154>
fstat64(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 0), ...}) = 0 <0.000125>
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb77d4000 <0.000121>
write(1, "Failed to open the filen", 24Failed to open the file
) = 24 <0.000258>
exit_group(0)                           = ?

Prefixing time of the day for every line in trace

It is useful sometimes to track at what time a particular is triggered. By using -t option strace will prefix each line of the trace with the time of day, which will be really helpful to find out at particular time at which call is the process blocked.

$ strace -t ./a.out 
execve("./a.out", ["./a.out"], [/* 39 vars */]) = 0 <0.003256>
.
brk(0x9db0000)                          = 0x9db0000 <0.000123>
open("test", O_RDONLY)                  = -1 ENOENT (No such file or directory) <0.000154>
fstat64(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 0), ...}) = 0 <0.000125>
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb77d4000 <0.000121>
write(1, "Failed to open the filen", 24Failed to open the file
) = 24 <0.000258>
exit_group(0)                           = ?

Tracing only specific system calls

Using –e option we can also specify which system calls to be traced. To trace only open() and close() system calls use the following command:

$ strace –e trace=’open,close’ <program-name>

Similarly we can also use negation option to not trace specific system calls. If we don’t want to trace open() system call in previous example we can give the below command

$ strace -e trace='!open,close' ./a.out

Check the man page of strace for other options.

Static Functions in C

By default all functions are implicitly declared as extern, which means they're visible across translation units. But when we use static it restricts visibility of the function to the translation unit in which it's defined. So we can say

Functions that are visible only to other functions in the same file are known as static functions.

Let use try out some code about static functions.
main.c

#include "header.h"

int main()
{
hello();
return 0;
}

func.c

#include "header.h"

void hello()
{
printf("HELLO WORLD\n");
}

header.h

#include <stdio.h>

static void hello();

If we compile above code it fails as shown below

$gcc main.c func.c
header.h:4: warning: "hello" used but never defined
/tmp/ccaHx5Ic.o: In function `main':
main.c:(.text+0x12): undefined reference to `hello'
collect2: ld returned 1 exit status

It fails in Linking since function hello() is declared as static and its definition is accessible only within func.c file but not for main.c file. All the functions within func.c can access hello() function but not by functions outside func.c file.

Using this concept we can restrict others from accessing the internal functions which we want to hide from outside world. Now we don't need to create private header files for internal functions.

Note:
For some reason, static has different meanings in in different contexts.

1. When specified on a function declaration, it makes the function local to the file.
2. When specified with a variable inside a function, it allows the variable to retain its value between calls to the function. See static variables.

It seems a little strange that the same keyword has such different meanings...

Subscribe to: Posts ( Atom )

codingfreak

Pages

7.11.2012

GIT - Adding diffmerge as visual merge in git

6.14.2012

TCPDUMP cheat sheet

5.02.2012

Decoding hardware information of a PC

4.19.2012

Uli Drepper - Basics you should know about buffer overflow

4.05.2012

Connecting via SSH without password prompt

3.13.2012

Daemon-izing a Process in Linux

3.04.2012

Memory Layout of a C program - Part 2

Memory Layout of a C program - Part 1

3.01.2012

Deletion of a Node from Doubly Linked List

2.27.2012