/* Ajith - Syntax Higlighter - End ----------------------------------------------- */

12.16.2009

ctags - vim with steroids

Basically C/C++ projects have thousands of lines of code divided into hundreds in some cases thousands of files. Inorder to access various function definitions within the sourcecode repository effectively using a VIM editor there is a great need for using addons like ctags which provides easy code go through.

Eventhough there are many effective GUI based code editors like eclipse e.t.c I prefer to use VIM editor as my primary code editor. I am not much into GUI funda so still prefer basic editor like VIM.

1. Installing ctags package
Almost all the linux flavours with 2.6.X kernel might have ctags installed by default. If not download the appropriate .deb or rpm file. Sorry I am not dealing with installing of ctags as I havent came across this stage as I am using FEDORA 9.

2. Generating ctags on your source code.
We have to generate a file named 'tags' for all the source code files, use the following command:
ctags *.c *.h
When we have many files in many directories then we have to create a tags file in each of the directories. But VIM will only be able to jump to tags within the same directory. To find more tags files, we have to set the 'tags' option in VIM to include all the relevant tags files. Just set the following command in ~/.vimrc file.
:set tags=./tags,./../tags,./*/tags
This finds a tags file in the same directory as the current file, one directory level higher and in all subdirectories.

But VIM searching many places for tags files is not really robust enough as it may get a bit slow. It's better to generate one big tags file offcourse it takes more time to create a single tags files for whole project. Just give the following command to recursively add all files in your project.
$ cd /proj 
$ ctags -R *
-R is to recursively go across all the directories, and a ‘tags’ file will be created including all the files in the sub-directories also.

Now we can simply include this single tags files as shown below
:set tags=~/proj/tags
3. Start using ctags with VIM editor
We have 3 different ways to use ctags in VIM editor.

From Shell:
We can invoke directly the file or the files containing the definition for a particular function from shell.
vi -t function_name
This will find the correct file or list of files having function_name definition.

OR

VIM command line:
We can invoke from VIM commandline (in command mode) the definition of a function.
:tag function_name
or
:ta function_name
This will jump automatically from the current file to the file containing the function_name definition.

OR

By cursor position:
This option is more user-friendly as we use a key combination instead of giving commands.
ctrl + ]
Place the cursor on the first character of the function name and press ctrl-]. This will jump to the file containing the definition of function_name.
ctrl + t
This command will jump back to previous location.
ctrl + i
To travel forward through the tag history.
ctrl + o
To travel backward through the tag history.

History
To display the list of tags that we have traversed in past give the following command.
:tags
Shows tag stack (history).

Divide and Conquer
As we saw already 'ctrl + ]' replaces the file in the current window with the one containing the new function. But suppose if we want to see not only the old function but also the new one?

We can split the window using the ":split" command followed by the ":tag" command.
:stag function_name
Cursor command for above feature is
ctrl + w + ]
Auto Complete
VIM supports tag name completion. Start typing the tag name (i.e. function name) and then hit TAB key and name completion will be done automatically if there is a tag name.
:tag start-of-tag-name_TAB
Jump to a tag name found by a search.
:tag /search-string
When multiple entries exist in the tags file, such as a function declaration in a header file and a function definition (the function itself), the operator can choose by issuing this command. The user will be presented with all the references to the function and the user will be prompted to enter the number associated with the appropriate one.
:tselect function-name
Jumps to next matching tag.
:tnext
or
:tn
Jump to previous matching tag.
:tprevious
or
:tp
Jump to first matching tag.
:tfirst
or
:tf
or
:trewindor:tr
Jump to last matching tag.
:tlast
or
:tl
Finding global identifiers
You are editing a C program and wonder if a variable is declared as "int" or "unsigned". A quick way to find this is with the "[I" command. Suppose the cursor is on the word "column" then type
[ + shiftkey + i
Vim will list the matching lines it can find. Not only in the current file,
but also in all included files (and files included in them, etc.). The result
looks like this:

structs.h
1: 29 unsigned column; /* column number */
The advantage over using tags or the preview window is that included files are
searched. In most cases this results in the right declaration to be found.
Also when the tags file is out of date. Also when you don't have tags for the
included files.

However, a few things must be right for "[I" to do its work. First of all,
the 'include' option must specify how a file is included. The default value
works for C and C++. For other languages you will have to change it.

Reference:
ctags documentation

Creating Shared Libraries in Linux - Part 2

Check the PART1 of this article.

4. Making the library available at run-time

Using LD_LIBRARY_PATH
We have to create or set the environment variable "LD_LIBRARY_PATH" to the directory containing the shared libraries.
export LD_LIBRARY_PATH=/home/cf/lib
If in current directory you can give the following command
export LD_LIBRARY_PATH=.
If we have to append a new directory to the existing paths then add the directories separated by colons to environment variable "LD_LIBRARY_PATH".
export LD_LIBRARY_PATH=/opt/lib:$LD_LIBRARY_PATH
Now recompile the main program
gcc -o test main.c -lcalc_mean -L/home/cf/slib
Now check the ldd command output
$ ldd test
 linux-gate.so.1 =>  (0x007ad000)
 libcalc_mean.so => ./libcalc_mean.so (0x0081e000)
 libc.so.6 => /lib/tls/i686/cmov/libc.so.6 (0x005ff000)
 /lib/ld-linux.so.2 (0x00e19000)
It seems now linker is able to locate our shared library as we can see in above output. Now run the program
$./test
LD_LIBRARY_PATH is good for quick tests and for systems on which you don’t have admin privileges. As a downside, however, exporting the LD_LIBRARY_PATH environment variable  might screw up with other programs you run that also rely on LD_LIBRARY_PATH if you don’t reset it to its previous state when you’re done.

12.09.2009

Creating and using shared libraries in Linux

What is a Shared Library ??
A library that is loaded into physical memory only once and reused by multiple processes via virtual memory. 
Generally Shared libraries are .so (or in windows .dll) files.

Why shared libraries ??
  • They reduce memory consumption if used by more than one process, and they reduce the size of the executable.
  • They make developing applications easier: a small change in the implementation of a function in the library don't need the user to recompile and relink his application code every time. You need to only relink if you make incompatible changes, such as adding arguments to a call or changing the size of a struct.
NOTE: Debugging using a shared library is slightly more difficult when compared with static libraries, because the debugger usually used on Linux, gdb, has some problems with shared libraries.

Let us see how to create a shared library on Linux. We use following source code files for this post.
calc_mean.h
#ifndef calc_mean_h__
#define calc_mean_h__
double mean(double, double);
#endif  // calc_mean_h__
calc_mean.c
double mean(double a, double b)
{
return (a+b) / 2;
}
main.c - We are including our shared library in this application.
#include <stdio.h>
#include "calc_mean.h"

int main(int argc, char* argv[]) {

double v1, v2, m;
v1 = 5.2;
v2 = 7.9;

m  = mean(v1, v2);

printf("The mean of %3.2f and %3.2f is %3.2f\n", v1, v2, m);

return 0;
}

1. Creating Object File with Position Independent Code
All the code that goes into a shared library needs to be position independent. We can make gcc emit position-independent code by passing it one of the command-line switches -fpic or -fPIC (the former is preferred, unless the modules have grown so large that the relocatable code table is simply too small in which case the compiler will emit an error message, and you have to use -fPIC).

First we will create object files for all .c files that goes into a shared library.
gcc -c -fPIC calc_mean.c -o calc_mean.o
Above we are compiling calc_mean.c with -fPIC option and generating calc_mean.o object file.

2. Creating Shared library with the Object File
Every shared library has a prefix "lib", the name of the library, the phrase ".so", followed by a period and a version number that is incremented whenever the interface changes (as a special exception, the lowest-level C libraries don't start with "lib").
gcc -shared -o libcalc_mean.so calc_mean.o
Above command on successful produces a shared library named "libcalc_mean.so".
  • -shared: Produces a shared object which can then be linked with other objects to form an executable.

3. Using the Shared Library
Now let us link the created shared library with our application. Compile main.c as shown below
$ gcc -o test main.c -lcalc_mean
/usr/bin/ld: cannot find -lcalc_mean
collect2: ld returned 1 exit status
The linker doesn’t know where to find libcalc_mean. But why ?
GCC has a list of places to look by default for shared libraries, but our directory is not in that list. Bingo that's the reason compilation failed at linking level.

Now we need to tell GCC where to find libcalc_mean.so. We will do that with the -L option.
gcc -o test main.c -lcalc_mean -L/home/cf/slib
  •  -l option tells the compiler to look for a file named libsomething.so The something is specified by the argument immediately following the “-l”. i.e. -lmean
  • -L option tells the compiler where to find the library. The path to the directory containing the shared libraries is followed by "-L". If no “-L” is specified, the compiler will search the usual locations. "-L." means looking for the shared libraries in the current directory and "-L/home/cf/lib" means looking for the shared libraries at "/opt/lib" path. You can specify as many “-l” and “-L” options as you like.
NOTE: It would be a better idea to move all personal shared libraries in one directory rather in the current directory. For easy understanding I am moving "libcalc_mean.so" to "/home/cf/slib".
mv libcalc_mean.so /home/cf/slib
Now compile main.c. It would be successful and creates an executable named "test".
Let us check if the path to our shared library is included successfully into the executable by linker as shown below.

ldd executablename
$ ldd test 
 linux-gate.so.1 =>  (0x00332000)
 libcalc_mean.so => not found
 libc.so.6 => /lib/tls/i686/cmov/libc.so.6 (0x006aa000)
 /lib/ld-linux.so.2 (0x00db9000)
You can see that linker cannot find our shared library libcalc_mean.so.

Basically libraries are present in /usr/lib amongst other places. Static libraries (.a suffix) are incorporated into the binary at link time, whereas dynamic ones (.so suffix) are referenced by location.

Check the PART2 of this article to understand further

11.03.2009

Using 'find' command

'find' is one of the useful commands available in Unix/Linux systems. In this article I am trying to use 'find' in a effective way.

Basic syntax of find command as per man page is - find [path...] [expression]
codingfreak find
NOTE: There are many options available for find command which are available in man page for find. Only some of those options are shown in this tutorial.

Finding a particular file in your system
$ find / -name 'filename' 2>/dev/null
$ find / -name 'filename' 2>errors.txt
/ - Start searching from the root directory (i.e / directory)

-name - Given search text is the filename rather than any other attribute of a file 'filename'. Always enclose the filename in single quotes...

NOTE: 2>/dev/null is not related to find tool as such. 2 indicates the error stream in Linux, and /dev/null is the device where anything you send simply disappears. So 2>/dev/null in this case means that while finding for the files, in case any error messages pop up simply send them to /dev/null i.e. simply discard all error messages.

Alternatively you could use 2>error.txt where after the search is completed you would have a file named error.txt in the current directory with all the error messages in it.
$ find -name 'met*'
The above command would start searching for the files that begin with the letters 'met' within the current directory and the directories that are present within the current directory.

If no paths are given, the current directory is used. If no expression is given, the expression ‘-print’ is used.

Searching with respect to type of the file (-type)
$find . -name 'temp'
./keepout/temp
./temp

$find . -name 'temp' -type d
./temp
Where d - directory, p - named pipe (FIFO), f - regular file and so on

Ignoring case-sensitivity (-iname)
$ find /home/temp -iname 'index*'
This command searchs for a file starting with string 'index' without considering the case of the filename. So all files starting with any combination of letters in upper and lower case such as INDEX or indEX or index would be returned.

Searching for a file based on size and time
$ find /home/songs -name '*.mp3' -size -5000k

$ find / -size +10000k
First command finds within a directory called /home/songs, only those mp3 files that have a size less than 5000 Kilobytes.
$ find /home/temp -amin -10 -name '*.c'
$ find /home/temp -atime -2 -name '*.c'
$ find /home/temp -mmin -10 -name '*.c'
$ find /home/temp -mtime -2 -name '*.c'
The 1st command searches for those files that are present in the directory /home/temp and its subdirectories which end in .c and which have been accessed in the last 10 minutes.

The 2nd command does the same but searches for those files that have been accessed in the last 10 hours.

The 3rd and the 4th commands do the same as the 1st and 2nd commands but they search for modified files rather than accessed files. Only if the contents of the files have been modified, would their names be returned in the search results.
$ find / -mount -name 'win*'
This command searches for files starting with the letters 'win' in their filenames. The only difference is that the mounted filesystems would not be searched for this time. This is useful when you have your Windows partitions mounted by default. And a search for 'win' might return many files on those partitions, which you may not be really interested in. This is only one use of -mount parameter.
$ find /home/songs -name 'Metallica*' -and -size +10000k
$ find /home/songs -size +10000k ! -name "Metallica*"
$ find /home/songs -name 'Metallica*' -or -size +10000k
Boolean operators such as AND, OR and NOT make find an extremely useful tool.

The 1st command searches within the directory /songs for files that have their names beginning with 'Metallica' and whose size is greater than 10000 kilobytes (> 10 MB).

The 2nd command searches in the same directory as above case but only for files that are greater than 10MB, but they should not have 'Metallica' as the starting of their filenames.

The 3rd command searches in the same directory for files that begin with 'Metallica' in their names or all the files that are greater than 10 MB in size.

How to apply a unix command to a set of files (-exec) ?
$ find . -name '*.sh' -exec chmod o+r '{}' \; -print
This command will search in the current directory and all sub directories. All files ending with .sh extension will be processed by the chmod -o+r command. The argument '{}' inserts each found file into the chmod command line. The \; argument indicates the exec command line has ended.

The end results of this command is all .sh files have the other permissions set to read access (if the operator is the owner of the file).

Searching for a string in a selection of files (-exec grep ...).
$ find . -exec grep "hello" '{}' \; -print
Prints all files that contain the string 'hello' will have their path printed to standard output.

If you want to just find each file then pass it on for processing use the -q grep option. This finds the first occurrance of the search string. It then signals success to find and find continues searching for more files.
find . -exec grep -q "hello" '{}' \; -print
Finding Empty files (-empty)
$find . -empty
To delete empty files in the current directory:
$ find . -empty -maxdepth 1 -exec rm '{}' \;
For more examples try out

1. linux.ie
2. Devdaily
3. hccfl.edu

10.30.2009

Are you a Hacker ??

from ReDragon on IRC, handed to newbies...

Are You a Hacker?

Take a little quiz for me today. Tell me if you fit this description.
You got your net account several months ago. You have been surfing the
net, and you laugh at those media reports of the information superhighway.
You have a red box, you don't have to pay for phone calls. You have
crackerjack, and you have run it on the password file at a unix you got
an account on. Everyone at your school is impressed by your computer
knowledge, you are the one the teachers ask for help. Does this sound
like you? You are not a hacker.

There are thousands of you out there. You buy 2600 and you ask
questions. You read phrack and you ask questions. You join #hack and
you ask questions. You ask all of these questions, and you ask what is
wrong with that? After all, to be a hacker is to question things, is
it not? But, you do not want knowledge. You want answers. You do not
want to learn how things work. You want answers. You do not want to
explore. All you want to know is the answer to your damn questions.
You are not a hacker.

Hacking is not about answers. Hacking is about the path you take to
find the answers. If you want help, don't ask for answers, ask for
a pointer to the path you need to take to find out those answers for
yourself. Because it is not the people with the answers that are the
hackers, it is the people that are travelling along the path.

-ReDragon

10.04.2009

Signals in Linux - Blocking Signals

Blocking a signal means telling the operating system to hold it and deliver it later when it is unblocked. Between the time when it is generated and when it is delivered a signal is said to be pending.

Generally, a program does not block signals indefinitely - it might as well ignore them by setting their actions to SIG_IGN. But it is useful to block signals briefly, to prevent them from interrupting sensitive operations.

Is Blocking a signal similar to Ignoring a signal ?

No, blocking a signal is different from ignoring a signal. When a process blocks a signal, the operating system does not deliver the signal until the process unblocks the signal. A process blocks a signal by modifying its signal mask with sigprocmask. But when a process ignores a signal, the signal is delivered and the process handles it by throwing it away.

How Blocking Signals is Useful ?

Temporary blocking of signals with sigprocmask gives you a way to prevent interrupts during critical parts of your code. If signals arrive in that part of the program, they are delivered later, after you unblock them.

One example where this is useful is for sharing data between a signal handler and the rest of the program. If the type of the data is not sig_atomic_t, then the signal handler could run when the rest of the program has only half finished reading or writing the data. This would lead to confusing consequences.

To make the program reliable, you can prevent the signal handler from running while the rest of the program is examining or modifying that data - by blocking the appropriate signal around the parts of the program that touch the data. Blocking signals is also necessary when you want to perform a certain action only if a signal has not arrived.

All signal blocking functions use a data structure called a signal set to specify what signals are affected. Thus, every activity involves two stages: creating the signal set, and then passing it as an argument to a library function. These facilities are declared in the header file signal.h.

The sigset_t data type is used to represent a signal set. Internally, it may be implemented as either an integer or structure type. For portability, use only the functions described below to initialize, change, and retrieve information from sigset_t objects - don't try to manipulate them directly.

#include <signal.h>

int sigemptyset(sigset_t *set);

int sigfillset(sigset_t *set);

int sigaddset(sigset_t *set, int signum);

int sigdelset(sigset_t *set, int signum);

int sigismember(const sigset_t *set, int signum);

sigemptyset function initializes the signal set given by set to empty, with all signals excluded from the set.

sigfillset function initializes set to full, including all signals.

sigaddset and sigdelset functions add and delete respectively signal signum from set.

sigismember function tests whether signum is a member of set.

Objects of type sigset_t must be initialized by a call to either sigemptyset or sigfillset before being passed to the functions sigaddset, sigdelset and sigismember.

For more information checkout: man 3 sigsetops

The collection of signals that are currently blocked is called the signal mask. Each process has its own signal mask. When you create a new process, it inherits its parent's mask. You can block or unblock signals with total flexibility by modifying the signal mask.

#include <signal.h>

int sigprocmask(int how, const sigset_t *set, sigset_t *oldset);

In a traditional single-threaded application, sigprocmask system call can be used to fetch and/or manipulate the signal mask of the calling thread.

how determines what operation to be performed on the signal mask.
If oldset is non-null, the previous value of the signal mask is stored in oldset.
set determines list of signals to be set in blocking state.

Signals, such as SIGSTOP and SIGKILL, cannot be blocked. If an attempt is made to block these signals, the system ignores the request without reporting an error.

NOTE: Do not use sigprocmask in multi-threaded processes, because each thread has its own signal mask and there is no single process signal mask. According to POSIX, the behavior of sigprocmask in a multi-threaded process is "unspecified". Instead, use pthread_sigmask.

For more information checkout: man 2 sigprocmask

In the below example we try to block and unblock the SIGINT signal continually in a loop. If a user enters Ctrl-C while SIGINT is blocked, then the program terminates only after it is unblocked. If a user types Ctrl-C while SIGINT is unblocked, the program terminates immediately.
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(int argc, char *argv[]) {
int i;
sigset_t intmask;
int repeatfactor;
double y = 0.0;

if ((sigemptyset(&intmask) == -1) || (sigaddset(&intmask, SIGINT) == -1)){
perror("Failed to initialize the signal mask");
return 1;
}

for ( ; ; ) {
printf("Entering BLOCK state\n");
if (sigprocmask(SIG_BLOCK, &intmask, NULL) == -1)
break;
fprintf(stderr, "SIGINT signal blocked\n");
sleep(2);

printf("Leaving Blocking State & Entering UNBLOCK state\n");
if (sigprocmask(SIG_UNBLOCK, &intmask, NULL) == -1)
break;
fprintf(stderr, "SIGINT signal unblocked\n");
sleep(2);
}
perror("Failed to change signal mask");
return 1;
}


Output:

$ ./a.out
Entering BLOCK state
SIGINT signal blocked
Leaving Blocking State & Entering UNBLOCK state
SIGINT signal unblocked
^C

$ ./a.out
Entering BLOCK state
SIGINT signal blocked
^CLeaving Blocking State & Entering UNBLOCK state

$

9.22.2009

Signals in Linux - Catching and Ignoring Signals - sigaction

The sigaction system call has the same basic effect as signal system call: to specify how a signal should be handled by the process.

But sigaction offers more control over the signal mechanism, at the expense of more complexity. In particular, sigaction allows you to specify additional flags to control when the signal is generated and how the handler is invoked.

Hence for reference we can say that sigaction is more like opensource Linux OS flavours more options, more control and complex for normal users than the proprietary stuff.
#include <signal.h>

int sigaction(int signum, const struct sigaction *action,struct sigaction *oldaction);

signum specifies the signal and can be any valid signal except SIGKILL and SIGSTOP.

The action argument is used to set up a new action for the signal signum, while the oldaction argument is used to return information about the action previously associated with this symbol. If action is non-null, the new action for signal signum is installed from action. If oldaction is non-null, the previous action is saved in oldaction.

For more information checkout: man 2 sigaction
struct sigaction {
void (*sa_handler)(int); /*SIG_DFL, SIG_IGN, or
a function pointer*/
void (*sa_sigaction)(int, siginfo_t *, void *);
sigset_t sa_mask;
int sa_flags;
void (*sa_restorer)(void);
};
Portability Note: The basic signal function is a feature of ISO C, while sigaction is part of the POSIX.1 standard. If you are concerned about portability to non-POSIX systems, then you should use the signal function instead.
static volatile sig_atomic_t doneflag = 0;

int main (void) {
struct sigaction act;

act.sa_handler = setdoneflag;/* set up signal handler */
act.sa_flags = 0;
if ((sigemptyset(&act.sa_mask) == -1) ||
(sigaction(SIGINT, &act, NULL) == -1)) {
perror("Failed to set SIGINT handler");
return 1;
}

while (!doneflag) {
printf("press CTRL+C to kill the loop\n");
sleep(1);
}

printf("Program terminating ...\n");
return 0;
}

Output:
$ ./a.out
press CTRL+C to kill the Loop
press CTRL+C to kill the Loop
^C

In SignalHandler - setdoneflag
Program terminating ...
sig_atomic_t is an integer type that is guaranteed by the standard to not be partially written or partially read in the presence of asynchronous interrupts.

If we set act.sa_handler as SIG_DFL then the program simply terminates. If we set act.sa_handler as SIG_IGN then the program simply keeps on looping as we are ignoring the SIG_INT signal.

In the POSIX base standard, a signal handler is an ordinary function that returns void and has one integer parameter. When the operating system delivers the signal, it sets this parameter to the number of the signal that was delivered. Most signal handlers ignore this value, but it is possible to have a single signal handler for many signals. The usefulness of signal handlers is limited by the inability to pass values to them. This capability has been added to the POSIX:RTS and POSIX:XSI Extensions, which can use the alternative sa_sigaction field of the struct sigaction structure to specify a handler.

If SA_SIGINFO is specified in sa_flags, then sa_sigaction (instead of sa_handler) specifies the signal-handling function for signum. This function receives the signal number as its first argument, a pointer to a siginfo_t as its second argument and a pointer to ucontext_t (cast to void *) as its third argument. The siginfo_t argument to sa_sigaction is a struct with the following elements:
siginfo_t {
int si_signo; /* Signal number */
int si_errno; /* An errno value */
int si_code; /* Signal code */
int si_trapno; /* Trap number that caused
hardware-generated signal
(unused on most architectures)*/
pid_t si_pid; /* Sending process ID */
uid_t si_uid; /* Real user ID of sending
process */
int si_status; /* Exit value or signal */
clock_t si_utime; /* User time consumed */
clock_t si_stime; /* System time consumed */
sigval_t si_value; /* Signal value */
int si_int; /* POSIX.1b signal */
void *si_ptr; /* POSIX.1b signal */
int si_overrun; /* Timer overrun count;
POSIX.1b timers */
int si_timerid; /* Timer ID; POSIX.1b timers */
void *si_addr; /* Memory location which
caused fault */
int si_band; /* Band event */
int si_fd; /* File descriptor */
}
NOTE: si_signo, si_errno and si_code are defined for all signals (si_errno is generally unused on Linux). The rest of the struct may be a union, so that one should only read the fields that are meaningful for the given signal.
static volatile sig_atomic_t doneflag = 0;

static void setdoneflag(int sig, siginfo_t *siginfo, void *context)
{
printf ("Signo - %d\n",siginfo->si_signo);
printf ("SigCode - %d\n",siginfo->si_code);
doneflag = 1;
}

int main (int argc, char *argv[])
{
struct sigaction act;

memset (&act, '\0', sizeof(act));

act.sa_sigaction = &setdoneflag;

act.sa_flags = SA_SIGINFO;

if (sigaction(SIGINT, &act, NULL) < 0) {
perror ("sigaction");
return 1;
}

while (!doneflag) {
printf("press CTRL+C to kill the Loop\n");
sleep(1);
}

printf("Program terminating ...\n");
return 0;
}

Output:

$ ./a.out
press CTRL+C to kill the Loop
press CTRL+C to kill the Loop
^C
Signo - 2
SigCode - 128
Program terminating ...
$

9.15.2009

Signals in Linux - Catching and Ignoring Signals - signal

The simplest way to change the action for a signal is to use the signal system call. You can specify a built-in action (such as to ignore the signal), or you can establish a handler.
#include <signal.h>

typedef void (*sighandler_t)(int);

sighandler_t signal(int signum, sighandler_t handler)

signal sets the disposition of the signal signum to handler, which is either SIG_IGN, SIG_DFL, or the address of a programmer-defined function (a "signal handler").

NOTE: The signals SIGKILL and SIGSTOP cannot be caught or ignored. The effects of signal system call in a multithreaded process are unspecified.

For more information checkout: man 2 signal

NOTE: The behavior of signal system call varies across different UNIX versions, and has also varied historically across different versions of Linux. Avoid its use: use sigaction system call instead. Check man 2 signal for detailed information about various portability issues.

#include <signal.h>
#include <stdio.h>
#include <stdlib.h>

static volatile sig_atomic_t doneflag = 0;

static void setdoneflag(int signo) {
printf("\nIn SignalHandler - setdoneflag\n");
doneflag = 1;
}

int main (void) {

signal(SIGINT, setdoneflag);

while (!doneflag) {
printf("press CTRL+C to kill the Loop\n");
sleep(1);
}

printf("Program terminating ...\n");
return 0;
}

Output:

$ ./a.out
press CTRL+C to kill the Loop
press CTRL+C to kill the Loop
press CTRL+C to kill the Loop
^C
In SignalHandler - setdoneflag
Program terminating ...
$

9.08.2009

Reading a string of length 'n' from Standard Input [STDIN]

We know how to read a string from STDIN in C by using library functions like scanf, fgets and so on. By using these functions there is a chance for memory corruption and strange behaviour. For example while using scanf if we try to save a string of length more than the variable size there is a chance of memory corruption.

So here in this post I am just trying to implement a function capable to read a string of length 'n' from STDIN without memory corruption and other bugs.

Do help me by checking the code if there is a chance for further improvements.

#include <stdio.h>
#include <string.h>

#define BUF_SIZE 6
#define STRING_SIZE 4

/*
* void getStringStdin(char *, int , int );
*
* 1: BUF :Pointer to the array of characters where input string
is to be stored.
* 2: BUF_LEN :Is the length of the array of characters where the
string is stored.buffer where we save the string.
* 3: STRING_LEN :Is the length of the string.
*
* NOTE: STRING_LEN < BUF_LEN
*
*/

getStringStdin(char *buf, int buf_len, int str_len)
{
int ch, len;
char *s;

if(str_len>=buf_len)
len=buf_len-1;
else
len=str_len;

printf ("\nEnter string of length %d(Remaining is ignored): ",len);

if( (fgets(buf, len+1, stdin)) != NULL )
{
s=my_strchr(buf,'\n');

if(s!=NULL)
{
*s='\0';
}
else
{
while ((ch = getchar()) != '\n' && ch != EOF);
}
}
}

int main(void)
{
int i=0;
char buf[BUF_SIZE];

do
{
getString(buf, BUF_SIZE, STRING_SIZE);
printf ("\nString : %s\n", buf);
i++;
}while(i<2);

return 0;
}

How to: Listing all users in a Linux machine

TO list all the users who can access a Linux machine we have to access the /etc/passwd file, which stores information about all registered users of that machine. But it is not really so easy as told above since the file contains many other fields & machine trust accounts & inbuilt accounts.

We'll start by
cat /etc/passwd 

As we all know that by default all the users created will have their home directories in /home share so we'll modify our command a bit by using grep. Now it'll be
cat /etc/passwd | grep "/home"

Now we'll get all the user accounts which have their home share in /home.But the only output we need is the list of users & nothing else. So we'll modify our command again
cat /etc/passwd | grep "/home" |cut -d: -f1
Now what we have done is that we have piped the output of previous command to another variable "cut"

What we have done here is we have added cut -d: -f1
-d: means delimiter :
-f1 means display first field of line i.e. username.

So final command is
cat /etc/passwd | grep "/home" |cut -d: -f1
This works until all your users have their home share in /home. If you have defined their home share to some other destination. Modify the above command accordingly.

9.02.2009

Signals in Linux - Generating Signals

Besides signals that are generated as a result of a hardware trap or interrupt, your program can explicitly send signals to itself or to another process.

The kill system call can be used to send any signal to any process group or process.
#include <sys/types.h>
#include <signal.h>

int kill(pid_t pid, int sig);

For more information checkout: man 2 kill

There are restrictions that prevent you from using kill to send signals to any random process. These are intended to prevent antisocial behavior such as arbitrarily killing off processes belonging to another user. In typical use, kill is used to pass signals between parent, child, and sibling processes, and in these situations you normally do have permission to send signals. The only common exception is when you run a setuid program in a child process; if the program changes its real UID as well as its effective UID, you may not have permission to send a signal. The su program does this.

A process or thread can send a signal to itself with the raise function. The raise function takes just one parameter, a signal number.

In a single-threaded program it is equivalent to kill(getpid(), sig). In a multithreaded program it is equivalent to pthread_kill(pthread_self(), sig). If the signal causes a handler to be called, raise will only return after the signal handler has returned.
#include <signal.h>

int raise(int sig);

For more information checkout: man 3 raise
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>

static volatile sig_atomic_t doneflag = 10;

static void setdoneflag(int signo) {
printf("\nIn SignalHandler - setdoneflag\n");
doneflag=0;
}

int main (void) {

signal(SIGINT, setdoneflag);

while(doneflag--)
{
printf("In While loop - %d\n",doneflag);
if(doneflag==5)
raise(2);
else
sleep(1);
}

printf("Program terminating ...\n");
return 0;
}

8.26.2009

Implementation of Singly Linked List

This article is part of article series - "Datastructures"

Generally a Linked List means "Singly Linked List". It is a chain of records known as Nodes. Each node has at least two members, one of which points to the next Node in the list and the other holds the data.

Figure 1: Singly Linked List
Basically Single Linked Lists are uni-directional as they can only point to the next Node in the list but not to the previous. We use below structure for a Node in our example.
 struct Node
 {
   int Data;
   struct Node *Next;
 }; 
Variable Data holds the data in the Node (It can be a pointer variable pointing to the dynamically allocated memory) while Next holds the address to the next Node in the list.

Figure 2: Node in a Singly Linked List
Head is a pointer variable of type struct Node which acts as the Head to the list. Initially we set 'Head' as NULL which means list is empty.

8.25.2009

Signals in Linux - Standard Signals

Every signal has a unique signal name, an abbreviation that begins with SIG (SIGINT for interrupt signal, for example). Each signal name is a macro which stands for a positive integer - the signal number for that kind of signal. Your programs should never make assumptions about the numeric code for a particular kind of signal, but rather refer to them always by the names defined. This is because the number for a given kind of signal can vary from system to system, but the meanings of the names are standardized and fairly uniform.

The signal names are defined in signal.h (/usr/include/bits/signum.h), which must be included by any C program that uses signals.

Several signal numbers are architecture-dependent, as indicated in the "Value" column. (Where three values are given, the first one is usually valid for alpha and sparc, the middle one for ix86, ia64, ppc, s390, arm and sh, and the last one for mips. A - denotes that a signal is absent on the corresponding architecture.)



The signals SIGKILL and SIGSTOP cannot be caught, blocked, or ignored.

Next the signals not in the POSIX.1-1990 standard but described in SUSv2 and POSIX.1-2001.


Up to and including Linux 2.2, the default behavior for SIGSYS, SIGXCPU, SIGXFSZ, and (on architectures other than SPARC and MIPS) SIGBUS was to terminate the process (without a core dump). Linux 2.4 conforms to the POSIX.1-2001 requirements for these signals, terminating the process with a core dump.

Next various other signals.


For detailed information about the side-effects and reasons causing these signals checout libc manual.

8.23.2009

Signals in Linux - Basics

What is a Signal ?

A signal is a software interrupt delivered to notify a process or thread of a particular event. The operating system uses signals to report exceptional situations to an executing program. Some signals report errors such as references to invalid memory addresses; others report asynchronous events, such as disconnection of a phone line.

The lifetime of a signal is the interval between its generation and its delivery. A signal that has been generated but not yet delivered is said to be pending. There may be considerable time between signal generation and signal delivery. The process must be running on a processor at the time of signal delivery.
Many computer science researchers compare signals with hardware interrupts, which occur when a hardware subsystem, such as a disk I/O (input/output) interface, generates an interrupt to a processor when the I/O completes. This event in turn causes the processor to enter an interrupt handler, so subsequent processing can be done in the operating system based on the source and cause of the interrupt. When a signal is sent to a process or thread, a signal handler may be entered (depending on the current disposition of the signal), which is similar to the system entering an interrupt handler as the result of receiving an interrupt.

What causes a Signal ?

Let us see some of the events that can cause (or generate, or raise) a signal:
  • A program error such as dividing by zero or issuing an address outside the valid range.
  • A user request to interrupt or terminate the program. Most environments are set up to let a user suspend the program by typing Ctrl-z, or terminate it with Ctrl-c. Whatever key sequence is used, the operating system sends the proper signal to interrupt the process.
  • The termination of a child process.
  • Expiration of a timer or alarm.
  • A call to kill or raise system calls by the same process.
  • A call to kill from another process. Signals are a limited but useful form of interprocess communication.
  • An attempt to perform an I/O operation that cannot be done. Examples are reading from a pipe that has no writer, and reading or writing to a terminal in certain situations.
Each of these kinds of events (excepting explicit calls to kill and raise) generates its own particular kind of signal.

Signals may be generated synchronously or asynchronously. A synchronous signal pertains to a specific action in the program, and is delivered (unless blocked) during that action. Most errors generate signals synchronously, and so do explicit requests by a process to generate a signal for that same process. On some machines, certain kinds of hardware errors (usually floating-point exceptions) are not reported completely synchronously, but may arrive a few instructions later.

Asynchronous signals are generated by events outside the control of the process that receives them. These signals arrive at unpredictable times during execution. External events generate signals asynchronously, and so do explicit requests that apply to some other process. One obvious example would be the sending of a signal to a process from another process or thread via a kill system call. Asynchronous signals are also aptly referred to as interrupts.

A given type of signal is either typically synchronous or typically asynchronous. For example, signals for errors are typically synchronous because errors generate signals synchronously. But any type of signal can be generated synchronously or asynchronously with an explicit request.

How Signals Are Delivered ?

When a signal is generated, it becomes pending. Normally it remains pending for just a short period of time and then is delivered to the process that was signaled. However, if that kind of signal is currently blocked, it may remain pending indefinitely—until signals of that kind are unblocked. Once unblocked, it will be delivered immediately.

When the signal is delivered, whether right away or after a long delay, the specified action for that signal is taken. For certain signals, such as SIGKILL and SIGSTOP, the action is fixed, but for most signals, the program has a choice. Possible default dispositions are
Term   Default action is to terminate the process.
Ign Default action is to ignore the signal.
Core Default action is to terminate the process
and dump core.
Stop Default action is to stop the process.
Cont Default action is to continue the process
if it is currently stopped.
The program can also specify its own way of handling the signals (signal handlers). We sometimes say that a handler catches the signal. While the handler is running, that particular signal is normally blocked.

References:

1. Libc Manual

7.29.2009

Mysterious C program

Never thought in my life that one can write such a mysterious 'C' program which compiles without any errors and provides a strange output.

#include <stdio.h>

main(t,_,a)
char *a;
{return!0<t?t<3?main(-79,-13,a+main(-87,1-_,
main(-86, 0, a+1 )+a)):1,t<_?main(t+1, _, a ):3,main ( -94, -27+t, a
)&&t == 2 ?_<13 ?main ( 2, _+1, "%s %d %d\n" ):9:16:t<0?t<-72?main(_,
t,"@n'+,#'/*{}w+/w#cdnr/+,{}r/*de}+,/*{*+,/w{%+,/w#q#n+,/#{l,+,/n{n+\
,/+#n+,/#;#q#n+,/+k#;*+,/'r :'d*'3,}{w+K w'K:'+}e#';dq#'l q#'+d'K#!/\
+k#;q#'r}eKK#}w'r}eKK{nl]'/#;#q#n'){)#}w'){){nl]'/+#n';d}rw' i;# ){n\
l]!/n{n#'; r{#w'r nc{nl]'/#{l,+'K {rw' iK{;[{nl]'/w#q#\
n'wk nw' iwk{KK{nl]!/w{%'l##w#' i; :{nl]'/*{q#'ld;r'}{nlwb!/*de}'c \
;;{nl'-{}rw]'/+,}##'*}#nc,',#nw]'/+kd'+e}+;\
#'rdq#w! nr'/ ') }+}{rl#'{n' ')# }'+}##(!!/")
:t<-50?_==*a ?putchar(a[31]):main(-65,_,a+1):main((*a == '/')+t,_,a\
+1 ):0<t?main ( 2, 2 , "%s"):*a=='/'||main(0,main(-61,*a, "!ek;dc \
i@bK'(q)-[w]*%n+r3#l,{}:\nuwloca-O;m .vpbks,fxntdCeghiry"),a+1);}
Any Clue of the output ???

Output for the above code is

On the first day of Christmas my true love gave to me
a partridge in a pear tree.

On the second day of Christmas my true love gave to me
two turtle doves
and a partridge in a pear tree.

On the third day of Christmas my true love gave to me
three french hens, two turtle doves
and a partridge in a pear tree.

On the fourth day of Christmas my true love gave to me
four calling birds, three french hens, two turtle doves
and a partridge in a pear tree.

On the fifth day of Christmas my true love gave to me
five gold rings;
four calling birds, three french hens, two turtle doves
and a partridge in a pear tree.

On the sixth day of Christmas my true love gave to me
six geese a-laying, five gold rings;
four calling birds, three french hens, two turtle doves
and a partridge in a pear tree.

On the seventh day of Christmas my true love gave to me
seven swans a-swimming,
six geese a-laying, five gold rings;
four calling birds, three french hens, two turtle doves
and a partridge in a pear tree.

On the eighth day of Christmas my true love gave to me
eight maids a-milking, seven swans a-swimming,
six geese a-laying, five gold rings;
four calling birds, three french hens, two turtle doves
and a partridge in a pear tree.

On the ninth day of Christmas my true love gave to me
nine ladies dancing, eight maids a-milking, seven swans a-swimming,
six geese a-laying, five gold rings;
four calling birds, three french hens, two turtle doves
and a partridge in a pear tree.

On the tenth day of Christmas my true love gave to me
ten lords a-leaping,
nine ladies dancing, eight maids a-milking, seven swans a-swimming,
six geese a-laying, five gold rings;
four calling birds, three french hens, two turtle doves
and a partridge in a pear tree.

On the eleventh day of Christmas my true love gave to me
eleven pipers piping, ten lords a-leaping,
nine ladies dancing, eight maids a-milking, seven swans a-swimming,
six geese a-laying, five gold rings;
four calling birds, three french hens, two turtle doves
and a partridge in a pear tree.

On the twelfth day of Christmas my true love gave to me
twelve drummers drumming, eleven pipers piping, ten lords a-leaping,
nine ladies dancing, eight maids a-milking, seven swans a-swimming,
six geese a-laying, five gold rings;
four calling birds, three french hens, two turtle doves
and a partridge in a pear tree.
Can anyone explain me step by step way the process to get above output ....

4.29.2009

Beware of NVIDIA Graphic cards in Laptops

NVIDIA famously known for its high end graphics cards which makes everyone to own a machine with NVIDIA graphics card. But beware of NVIDIA graphics card as they suffer OVER HEATING issue because of weak die/packaging material. It seems NVIDIA G84 & G86 graphic chipsets had this problem.
According to our sources, the failures are caused by a solder bump that connects the I/O termination of the silicon chip to the pad on the substrate. In Nvidia’s GPUs, this solder bump is created using high-lead. A thermal mismatch between the chip and the substrate has substantially grown in recent chip generations, apparently leading to fatigue cracking. Add into the equation a growing chip size (double the chip dimension, quadruple the stress on the bump) as well as generally hotter chips and you may have the perfect storm to take high lead beyond its limits. Apparently, problems arise at what Nvidia claims to be "extreme temperatures" and what we hear may be temperatures not too much above 70 degrees Celsius.

What supports the theory that a high-lead solder bump in fact is at fault is the fact that Nvidia ordered an immediate switch to use eutectic solders instead of high-lead versions in the last week of July.
source
Major laptop manufacturers like DELL, HP has accepted that there are some issues with laptops shipped with a NVIDIA graphics cards.

According to NVIDIA
, these affected GPUs are experiencing higher than expected failure rates causing video problems and it is because of weak die/packaging material set, which may fail with higher GPU temperature fluctuations.

If your NVIDIA GPU fails, you may see intermittent symptoms during early stages of failure that include:

* Multiple images
* Random characters on the screen
* Lines on the screen
* No video
* Black Screen

DELL agreed that some of its laptop models namely Inspiron 1420, Latitude D630, Latitude D630c, Dell Precision M2300, Vostro Notebook 1310, Vostro Notebook 1400, Vostro Notebook 1510, Vostro Notebook 1710, XPS M1330, XPS M1530 are facing the issue and released a BIOS upgradation to minimize the effect of the GPU failure problem and new systems are being shipped with the updated BIOS revisions.
DELL is offering an additional 12-month limited warranty enhancement specific to this GPU failure issue. For all customers worldwide, DELL plans to add 12 months of coverage for this issue to the existing limited warranty up to 60 months from the date of purchase for the given list of systems.

DELL's BIOS fix is simply turning on the laptop fan much more than usual as a result laptop battery is drained. The 'fix' keeps the fan on much more and destroys battery life.

HP also identified a hardware issue with certain HP Pavilion dv2000/dv6000/dv9000 and Compaq Presario V3000/V6000 series notebook PCs, and has also released a new BIOS for these notebook PCs.
HP says"This service enhancement program is available in North America for 24 months after the start of your original standard limited warranty for issues listed below; otherwise your current standard limited warranty applies. Customers who already have a 24 month or longer warranty period will be covered under their existing standard HP Limited Warranty."
Apple is also facing the same problem with the macbook pro's.
Apple says, NVIDIA assured Apple that Mac computers with these graphics processors were not affected. However, after an Apple-led investigation, Apple has determined that some MacBook Pro computers with the NVIDIA GeForce 8600M GT graphics processor may be affected. If the NVIDIA graphics processor in your MacBook Pro has failed, or fails within two years of the original date of purchase, a repair will be done free of charge, even if your MacBook Pro is out of warranty.
Customers are really angry as the OEM's are not providing a permanent solution to the GPU problem and also not including some of the laptop models that are facing the same issue. Even some of the customers sued NVIDIA to repair their laptops with faulty graphics cards.

We can see still DELL is shipping laptops with defective NVIDIA graphics card.

3.22.2009

Cache Memory - Part2

Interaction Policies with Main Memory
Basically READ operations dominate processor cache accesses since many of the instruction accesses are READ operation’s and most instructions do not WRITE into memory. When the address of the block to be READ is available then the tag is read and if it is a HIT then READ from it.

In case of a miss the READ policies are:
  • Read Through - Reading a block directly from main memory.
  • No Read Through - Reading a block from main memory into cache and then from cache to CPU. So we even update the cache memory.

Basically a Miss is comparatively slow because they require the data to be transferred from main memory to CPU which incurs a delay since main memory is much slower than cache memory, and also incurs the overhead for recording the new data in the cache before it is delivered to the processor. To take advantage of Locality of Reference, the CPU copies data into the cache whenever it accesses an address not present in the cache. Since it is likely the system will access that same location shortly, the system will save wait states by having that data in the cache. Thus cache memory handles the temporal aspects of memory access, but not the spatial aspects.

Accessing Caching memory locations won't speed up the program execution if we constantly access consecutive memory locations (Spatial Locality of Reference). To solve this problem, most caching systems read several consecutive bytes from memory when a cache miss occurs. 80x86 CPUs, for example, read between 16 and 64 bytes at a shot (depending upon the CPU) upon a cache miss. If you read 16 bytes, why read them in blocks rather than as you need them? As it turns out, most memory chips available today have special modes which let you quickly access several consecutive memory locations on the chip. The cache exploits this capability to reduce the average number of wait states needed to access memory.
Cache READ operation
Cache READ operation

It is not the same with Cache WRITE operation. Modifying a block cannot begin until the tag is checked to see if the address is a hit. Also the processor specifies the size of the write, usually between 1 and 8 bytes; only that portion of the block can be changed. In contrast, reads can access more bytes than necessary without a problem.

The Cache WRITE policies on write hit often distinguish cache designs:
  • Write Through - the modified data is written back to both the block in the cache memory and in the main memory.

    Advantage:
    1. READ miss never results in writes to main memory.
    2. Easy to implement
    3. Main Memory always has the most current copy of the data (consistent)

    Disadvantage:
    1. WRITE operation is slower as we have to update both Main Memory and Cache Memory.
    2. Every write needs a main memory access as a result uses more memory bandwidth
Write Through in cache
  • Write Back - the modified data is first written only to the block in the cache memory. The modified cache block is written to main memory only when it is replaced. In order to reduce the frequency of writing back blocks on replacement, a dirty bit (a status bit) is commonly used to indicate whether the block is dirty (modified while in the cache) or clean (not modified). If it is clean the block is not written on a miss.

    Advantage:
    1. WRITE’s occur at the speed of the cache memory.
    2. Multiple WRITE’s within a block require only one WRITE to main memory as a result uses less memory bandwidth

    Disadvantage:
    1. Harder to implement
    2. Main Memory is not always consistent with cache reads that result in replacement may cause writes of dirty blocks to main memory.
Incase of Cache Write MISS we have to options.
  • Write Allocate - the memory block is first loaded into cache memory from main memory on a write miss, followed by the write-hit action.
  • No Write Allocate - the block is directly modified in the main memory and not loaded into the cache memory.
Although either write-miss policy could be used with write through or write back, write-back caches generally use write allocate (hoping that subsequent writes to that block will be captured by the cache) and write-through caches often use no-write allocate (since subsequent writes to that block will still have to go to memory).

The data in main memory being cached may be changed by other entities, in which case the copy in the cache may become out-of-date or stale. Alternatively, when the CPU updates the data in the cache, copies of data in other caches will become stale. Communication protocols between the cache managers which keep the data consistent are known as cache coherence protocols.

3.18.2009

Principle of Locality

When a program executes on a computer, most of the memory references are not made uniformly to a small number of locations. Here the Locality of the reference does matter.

Locality of Reference, also known as the Principle of Locality, the phenomenon of the same value or related storage locations being frequently accessed. Locality occurs in time(temporal locality) and in space (spatial locality).
  • Temporal Locality refers to the reuse of specific data and/or resources within relatively small time durations.
  • Spatial Locality refers to the use of data elements within relatively close storage locations. Sequential locality, a special case of spatial locality, occurs when data elements are arranged and accessed linearly, eg, traversing the elements in a one-dimensional array.
To be very simple when exhibiting spatial locality, a program accesses consecutive memory locations and during temporal locality of reference a program repeatedly accesses the same memory location during a short time period. Both forms of locality occur in the following Pascal code segment:
  for i := 0 to 10 do
A [i] := 0;

In the above Pascal code, the variable 'i' is referenced several times in for loop where 'i' is compared against 10 to see if the loop is complete and also incremented by one at the end of the loop. This shows temporal locality of reference in action since the CPU accesses 'i' at different points in a short time period.

This program also exhibits spatial locality of reference. The loop itself zeros out the elements of array A by writing a zero to the first location in A, then to the second location in A, and so on. Assume Pascal stores elements of A into consecutive memory locations then on each loop iteration it accesses adjacent memory locations.

Cache Memory - Set Associative Mapped Cache

Set Associative mapping scheme combines the simplicity of Direct mapping with the flexibility of Fully Associative mapping. It is more practical than Fully Associative mapping because the associative portion is limited to just a few slots that make up a set.

In this mapping mechanism, the cache memory is divided into 'v' sets, each consisting of 'n' cache lines. A block from Main memory is first mapped onto a specific cache set, and then it can be placed anywhere within that set. This type of mapping has very efficient ratio between implementation and efficiency. The set is usually chosen by

Cache set number = (Main memory block number) MOD (Number of sets in the cache memory)

If there are 'n' cache lines in a set, the cache placement is called n-way set associative i.e. if there are two blocks or cache lines per set, then it is a 2-way set associative cache mapping and four blocks or cache lines per set, then it is a 4-way set associative cache mapping.

Let us assume we have a Main Memory of size 4GB (232), with each byte directly addressable by a 32-bit address. We will divide Main memory into blocks of each 32 bytes (25). Thus there are 128M (i.e. 232/25 = 227) blocks in Main memory.

We have a Cache memory of 512KB (i.e. 219), divided into blocks of each 32 bytes (25). Thus there are 16K (i.e. 219/25 = 214) blocks also known as Cache slots or Cache lines in cache memory. It is clear from above numbers that there are more Main memory blocks than Cache slots.

NOTE: The Main memory is not physically partitioned in the given way, but this is the view of Main memory that the cache sees.

NOTE: We are dividing both Main Memory and cache memory into blocks of same size i.e. 32 bytes.

Let us try 2-way set associative cache mapping i.e. 2 cache lines per set. We will divide 16K cache lines into sets of 2 and hence there are 8K (214/2 = 213) sets in the Cache memory.

Cache Size = (Number of Sets) * (Size of each set) * (Cache line size)

So even using the above formula we can find out number of sets in the Cache memory i.e.

219 = (Number of Sets) * 2 * 25

Number of Sets = 219 / (2 * 25) = 213.

Set Associative Mapped CacheWhen an address is mapped to a set, the direct mapping scheme is used, and then associative mapping is used within a set.

The format for an address has 13 bits in the set field, which identifies the set in which the addressed word will be found if it is in the cache. There are five bits for the word field as before and there is 14-bit tag field that together make up the remaining 32 bits of the address as shown below:
Set Associative Mapped CacheAs an example of how the set associative cache views a Main memory address, consider again the address (A035F014)16. The leftmost 14 bits form the tag field, followed by 13 bits for the set field, followed by five bits for the word field as shown below:
cache memory - Set Associative mapped
In the below example we have chosen the block 14 from Main memory and compared it with the different block replacement algorithms. In Direct Mapped cache it can be placed in Frame 6 since 14 mod 8 = 6. In Set associative cache it can be placed in set 2.
cache memory - Set Associative mappedCheckout one more solved problem below.
cache memory - Set Associative mapped
References


1. Computer Architecture Tutorial - By Gurpur M. Prabhu.

2. Computer Architecture And Organization: An Integrated Approach - By Murdocca & Vincent Heuring

Cache Memory - Fully Associative Mapped Cache

If a Main memory block can be placed in any of the Cache slots, then the cache is said to be mapped in fully associative.

Let us assume we have a Main Memory of size 4GB (232), with each byte directly addressable by a 32-bit address. We will divide Main memory into blocks of each 32 bytes (25). Thus there are 128M (i.e. 232/25 = 227) blocks in Main memory.

We have a Cache memory of 512KB (i.e. 219), divided into blocks of each 32 bytes (25). Thus there are 16K (i.e. 219/25 = 214) blocks also known as Cache slots or Cache lines in cache memory. It is clear from above numbers that there are more Main memory blocks than Cache slots.

NOTE: The Main memory is not physically partitioned in the given way, but this is the view of Main memory that the cache sees.

NOTE: We are dividing both Main Memory and cache memory into blocks of same size i.e. 32 bytes.

In fully associative mapping any one of the 128M (i.e. 227) Main memory blocks can be mapped into any of the single Cache slot. To keep track of which one of the 227 possible blocks is in each slot, a 27-bit tag field is added to each slot which holds an identifier in the range from 0 to 227 – 1. The tag field is the most significant 27 bits of the 32-bit memory address presented to the cache.
Fully Associative Mapped Cache
In an associative mapped cache, each Main memory block can be mapped to any slot. The mapping from main memory blocks to cache slots is performed by partitioning an address into fields for the tag and the word (also known as the “byte” field) as shown below:
Fully Associative Mapped Cache
When a reference is made to a Main memory address, the cache hardware intercepts the reference and searches the cache tag memory to see if the requested block is in the cache. For each slot, if the valid bit is 1, then the tag field of the referenced address is compared with the tag field of the slot. All of the tags are searched in parallel, using an associative memory. If any tag in the cache tag memory matches the tag field of the memory reference, then the word is taken from the position in the slot specified by the word field. If the referenced word is not found in the cache, then the main memory block that contains the word is brought into the cache and the referenced word is then taken from the cache. The tag, valid, and dirty fields are updated, and the program resumes execution.

Associative mapped cache has the advantage of placing any main memory block into any cache line. This means that regardless of how irregular the data and program references are, if a slot is available for the block, it can be stored in the cache. This results in considerable hardware overhead needed for cache bookkeeping.

Although this mapping scheme is powerful enough to satisfy a wide range of memory access situations, there are two implementation problems that limit performance.
  • The process of deciding which slot should be freed when a new block is brought into the cache can be complex. This process requires a significant amount of hardware and introduces delays in memory accesses.
  • When the cache is searched, the tag field of the referenced address must be compared with all 214 tag fields in the cache.

Cache Memory - Direct Mapped Cache

If each block from main memory has only one place it can appear in the cache, the cache is said to be Direct Mapped. Inorder to determine to which Cache line a main memory block is mapped we can use the formula shown below

Cache Line Number = (Main memory Block number) MOD (Number of Cache lines)

Let us assume we have a Main Memory of size 4GB (232), with each byte directly addressable by a 32-bit address. We will divide Main memory into blocks of each 32 bytes (25). Thus there are 128M (i.e. 232/25 = 227) blocks in Main memory.

We have a Cache memory of 512KB (i.e. 219), divided into blocks of each 32 bytes (25). Thus there are 16K (i.e. 219/25 = 214) blocks also known as Cache slots or Cache lines in cache memory. It is clear from above numbers that there are more Main memory blocks than Cache slots.

NOTE: The Main memory is not physically partitioned in the given way, but this is the view of Main memory that the cache sees.

NOTE: We are dividing both Main Memory and cache memory into blocks of same size i.e. 32 bytes.

A set of 8k (i.e. 227/214 = 213) Main memory blocks are mapped onto a single Cache slot. In order to keep track of which of the 213 possible Main memory blocks are in each Cache slot, a 13-bit tag field is added to each Cache slot which holds an identifier in the range from 0 to 213 – 1.

All the tags are stored in a special tag memory where they can be searched in parallel. Whenever a new block is stored in the cache, its tag is stored in the corresponding tag memory location.
Direct Mapped Cache
When a program is first loaded into Main memory, the Cache is cleared, and so while a program is executing, a valid bit is needed to indicate whether or not the slot holds a block that belongs to the program being executed. There is also a dirty bit that keeps track of whether or not a block has been modified while it is in the cache. A slot that is modified must be written back to the main memory before the slot is reused for another block. When a program is initially loaded into memory, the valid bits are all set to 0. The first instruction that is executed in the program will therefore cause a miss, since none of the program is in the cache at this point. The block that causes the miss is located in the main memory and is loaded into the cache.

This scheme is called "direct mapping" because each cache slot corresponds to an explicit set of main memory blocks. For a direct mapped cache, each main memory block can be mapped to only one slot, but each slot can receive more than one block.

The mapping from main memory blocks to cache slots is performed by partitioning an main memory address into fields for the tag, the slot, and the word as shown below:
Direct Mapped Cache
The 32-bit main memory address is partitioned into a 13-bit tag field, followed by a 14-bit slot field, followed by a 5-bit word field. When a reference is made to a main memory address, the slot field identifies in which of the 214 cache slots the block will be found if it is in the cache.

If the valid bit is 1, then the tag field of the referenced address is compared with the tag field of the cache slot. If the tag fields are the same, then the word is taken from the position in the slot specified by the word field. If the valid bit is 1 but the tag fields are not the same, then the slot is written back to main memory if the dirty bit is set, and the corresponding main memory block is then read into the slot. For a program that has just started execution, the valid bit will be 0, and so the block is simply written to the slot. The valid bit for the block is then set to 1, and the program resumes execution.

Check out one more solved problem below


References


1. Computer Architecture Tutorial - By Gurpur M. Prabhu.