codingfreak

3.18.2009

Cache Memory - Direct Mapped Cache

If each block from main memory has only one place it can appear in the cache, the cache is said to be direct mapped. To determine which cache line a main memory block maps to, we can use the formula below:

Cache Line Number = (Main memory Block number) MOD (Number of Cache lines)

Let us assume we have a Main memory of size 4 GB (2³² bytes), with each byte directly addressable by a 32-bit address. We divide Main memory into blocks of 32 bytes (2⁵) each. Thus there are 128M (i.e. 2³²/2⁵ = 2²⁷) blocks in Main memory.

We have a Cache memory of 512 KB (i.e. 2¹⁹ bytes), divided into blocks of 32 bytes (2⁵) each. Thus there are 16K (i.e. 2¹⁹/2⁵ = 2¹⁴) blocks, also known as Cache slots or Cache lines, in Cache memory. It is clear from these numbers that there are far more Main memory blocks than Cache slots.

NOTE: The Main memory is not physically partitioned in the given way, but this is the view of Main memory that the cache sees.

NOTE: We are dividing both Main memory and Cache memory into blocks of the same size, i.e. 32 bytes.

A set of 8K (i.e. 2²⁷/2¹⁴ = 2¹³) Main memory blocks is mapped onto each Cache slot. To keep track of which of the 2¹³ possible Main memory blocks is in each Cache slot, a 13-bit tag field is added to each Cache slot; it holds an identifier in the range from 0 to 2¹³ – 1.
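The arithmetic above can be checked with a short sketch. The sizes come from the example in the text; the constant and function names are illustrative assumptions.

```python
# Sizes taken from the example in the text.
MAIN_MEMORY_BYTES = 2**32   # 4 GB, byte addressable
CACHE_BYTES = 2**19         # 512 KB
BLOCK_BYTES = 2**5          # 32-byte blocks

num_memory_blocks = MAIN_MEMORY_BYTES // BLOCK_BYTES     # 2**27 blocks
num_cache_lines = CACHE_BYTES // BLOCK_BYTES             # 2**14 lines
blocks_per_line = num_memory_blocks // num_cache_lines   # 2**13 blocks share a line
tag_bits = blocks_per_line.bit_length() - 1              # log2 of a power of two


def cache_line_number(block_number):
    """Cache Line Number = (Main memory Block number) MOD (Number of Cache lines)."""
    return block_number % num_cache_lines


if __name__ == "__main__":
    print(num_memory_blocks, num_cache_lines, blocks_per_line, tag_bits)
    # Blocks that are num_cache_lines apart compete for the same line.
    print([cache_line_number(5 + k * num_cache_lines) for k in range(3)])
```

Blocks 5, 5 + 2¹⁴ and 5 + 2·2¹⁴ all land in line 5, which is why a tag is needed to tell them apart.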

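The tags are stored in a tag memory alongside the cache slots. Because a direct mapped cache has exactly one candidate slot for each address, only the tag of that slot needs to be compared on a lookup. Whenever a new block is stored in the cache, its tag is stored in the corresponding tag memory location.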

When a program is first loaded into Main memory, the Cache is cleared, and so while a program is executing, a valid bit is needed to indicate whether or not the slot holds a block that belongs to the program being executed. There is also a dirty bit that keeps track of whether or not a block has been modified while it is in the cache. A slot that is modified must be written back to the main memory before the slot is reused for another block. When a program is initially loaded into memory, the valid bits are all set to 0. The first instruction that is executed in the program will therefore cause a miss, since none of the program is in the cache at this point. The block that causes the miss is located in the main memory and is loaded into the cache.

This scheme is called "direct mapping" because each cache slot corresponds to an explicit set of main memory blocks. For a direct mapped cache, each main memory block can be mapped to only one slot, but each slot can receive more than one block.

The mapping from main memory blocks to cache slots is performed by partitioning a main memory address into fields for the tag, the slot, and the word:

The 32-bit main memory address is partitioned into a 13-bit tag field, followed by a 14-bit slot field, followed by a 5-bit word field. When a reference is made to a main memory address, the slot field identifies in which of the 2¹⁴ cache slots the block will be found if it is in the cache.
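As a minimal sketch of this partitioning, the function below splits an address using shifts and masks. The name `split_address` and the sample address are assumptions made for illustration; only the field widths come from the text.

```python
TAG_BITS, SLOT_BITS, WORD_BITS = 13, 14, 5  # field widths from the text


def split_address(address):
    """Split a 32-bit address into its (tag, slot, word) fields."""
    word = address & ((1 << WORD_BITS) - 1)                  # lowest 5 bits
    slot = (address >> WORD_BITS) & ((1 << SLOT_BITS) - 1)   # next 14 bits
    tag = address >> (WORD_BITS + SLOT_BITS)                 # top 13 bits
    return tag, slot, word


if __name__ == "__main__":
    tag, slot, word = split_address(0x12345678)
    print(f"tag={tag:#x} slot={slot:#x} word={word:#x}")
```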

If the valid bit is 1, then the tag field of the referenced address is compared with the tag field of the cache slot. If the tag fields are the same, then the word is taken from the position in the slot specified by the word field. If the valid bit is 1 but the tag fields are not the same, then the slot is written back to main memory if the dirty bit is set, and the corresponding main memory block is then read into the slot. For a program that has just started execution, the valid bit will be 0, and so the block is simply written to the slot. The valid bit for the block is then set to 1, and the program resumes execution.
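The lookup steps above can be sketched as a small simulation. This is only an illustration under stated assumptions: it tracks tags and status bits but no block data, it allocates a slot on a write miss, and the names `Slot`, `DirectMappedCache`, and `write_backs` are hypothetical.

```python
from dataclasses import dataclass

SLOT_BITS, WORD_BITS = 14, 5  # field widths from the text


@dataclass
class Slot:
    valid: bool = False
    dirty: bool = False
    tag: int = 0


class DirectMappedCache:
    """Tracks only tags and status bits; block data is omitted for brevity."""

    def __init__(self):
        self.slots = [Slot() for _ in range(1 << SLOT_BITS)]
        self.write_backs = 0  # number of dirty blocks written to main memory

    def access(self, address, write=False):
        """Return True on a hit, False on a miss (after loading the block)."""
        slot = self.slots[(address >> WORD_BITS) & ((1 << SLOT_BITS) - 1)]
        tag = address >> (WORD_BITS + SLOT_BITS)
        hit = slot.valid and slot.tag == tag
        if not hit:
            if slot.valid and slot.dirty:
                self.write_backs += 1  # old block goes back to main memory
            # Read the new block into the slot and mark it valid and clean.
            slot.valid, slot.dirty, slot.tag = True, False, tag
        if write:
            slot.dirty = True  # the cached copy now differs from main memory
        return hit


if __name__ == "__main__":
    cache = DirectMappedCache()
    print(cache.access(0x0))              # cold miss: valid bit was 0
    print(cache.access(0x4))              # same block, so a hit
    print(cache.access(0x0, write=True))  # hit that sets the dirty bit
    print(cache.access(1 << 19))          # same slot, different tag: miss
    print(cache.write_backs)
```

The last access maps to slot 0 with a different tag, so the dirty block is written back before the new block replaces it.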


References

1. Computer Architecture Tutorial - By Gurpur M. Prabhu.

Cache Memory - Part1

Cache memory is a small (in size) and very fast (zero wait state) memory which sits between the CPU and main memory. Unlike normal memory, the bytes appearing within a cache do not have fixed addresses. Instead, cache memory can reassign the address of a data object. This allows the system to keep recently accessed values in the cache.


Using the Principle of Locality to improve performance while keeping the memory system affordable, we can pose four questions about any level of the memory hierarchy. We will answer them for one level of the hierarchy, the cache in our case.

Block Placement - Where should a block be placed in the cache?
Block Identification - How do we confirm whether a block is in the cache or not?
Block Replacement - Which block frame in the cache should be replaced upon a miss?
Interaction Policies with Main Memory - What happens when reads and writes are done in the cache?

Block Placement
A number of hardware schemes have been developed for translating main memory addresses to cache memory addresses. The user does not need to know much about the address translation circuitry, which has the advantage that cache memory enhancements can be introduced into a computer without modifying application software.

The number of cache lines is much smaller than the number of main memory blocks. As a result, an algorithm is needed for mapping main memory blocks into cache lines, along with a means of determining which main memory block currently occupies a cache line.

The choice of cache mapping scheme affects cost and performance, and there is no single best method that is appropriate for all situations. There are three methods of block placement: direct mapped, fully associative, and set associative.

Block Identification
In general a cache has two important parts: the cache data lines and the cache tags. In more detail, each cache line holds the following fields:

Valid Bit : set to 1 when the line holds valid data.
Dirty Bit : set to 1 when the data has been changed in the cache but not yet written back to main memory.
Tag : identifies which main memory address is held in the line.
Data : the data fetched from main memory.

Since a cache is typically smaller than an entire address space, there is a possibility that any particular requested data is not present in the cache. Therefore there must be some mechanism to determine whether any requested data is present in the cache or not. The tags fill this purpose.

Each block frame in the cache has an address tag that gives the block address. The tag of every cache block that might contain the requested data is checked to see if it matches the block address from the CPU. As a rule, all possible tags are searched in parallel because speed is critical.

There must be a way to know whether a cache block holds valid information. The most common procedure is to add a valid bit to the tag to say whether or not this entry contains a valid address. If the bit is not set, there cannot be a match on this address. Accordingly, an address generated by the CPU (a main memory address) is divided as shown below:

Figure: The parts into which a main memory address is divided.

The first division is between the block address and the block offset. The block address can be further divided into the tag field and the index field. The block-offset field selects the desired data from the block, the index field selects the set, and the tag field is compared against the stored tag for a hit. Although the comparison could be made on more of the address than the tag, there is no need: the index has already been used to select the set, so checking it again would be redundant, and the offset is irrelevant because an entire block is either present or not.

If the total cache size is kept the same, increasing associativity increases the number of blocks per set, thereby decreasing the size of the index and increasing the size of the tag.
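A short sketch makes this trade-off concrete. It assumes a 32-bit address and power-of-two sizes, and borrows the 512 KB cache with 32-byte blocks from the direct mapped example; the function name `field_widths` is hypothetical.

```python
def field_widths(cache_bytes, block_bytes, ways, address_bits=32):
    """Return (tag_bits, index_bits, offset_bits) for a set-associative cache.

    Assumes cache_bytes, block_bytes, and ways are all powers of two.
    """
    offset_bits = (block_bytes - 1).bit_length()
    num_sets = cache_bytes // (block_bytes * ways)
    index_bits = (num_sets - 1).bit_length()
    tag_bits = address_bits - index_bits - offset_bits
    return tag_bits, index_bits, offset_bits


if __name__ == "__main__":
    # Same total cache size, increasing associativity.
    for ways in (1, 2, 4, 2**14):
        tag, index, offset = field_widths(2**19, 32, ways)
        print(f"{ways:>5}-way: tag={tag} index={index} offset={offset}")
```

With one way this is the direct mapped case (13-bit tag, 14-bit index); with 2¹⁴ ways there is a single set, so the cache is fully associative and the index disappears entirely.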

Block Replacement
When a miss occurs, the cache controller must select a block to be replaced with the desired data. A benefit of direct-mapped placement is that hardware decisions are simplified - in fact, so simple that there is no choice: Only one block frame is checked for a hit, and only that block can be replaced. With fully associative or set-associative placement, there are many blocks to choose from on a miss. There are three primary strategies employed for selecting which block to replace:

Random - To spread allocation uniformly, candidate blocks are randomly selected. Some systems generate pseudorandom block numbers to get reproducible behavior, which is particularly useful when debugging hardware.

Advantage : simple to implement in hardware
Disadvantage : ignores Principle of Locality

Least-recently used (LRU) - To reduce the chance of throwing out information that will be needed soon, accesses to blocks are recorded. Relying on the past to predict the future, the block replaced is the one that has been unused for the longest time. LRU relies on a corollary of locality: If recently used blocks are likely to be used again, then a good candidate for disposal is the least-recently used block.

Advantage : takes locality into account
Disadvantage : as the number of blocks to keep track of increases, LRU becomes more expensive (harder to implement, slower and often just approximated).

First In First Out (FIFO) - Because LRU can be complicated to calculate, this approximates LRU by determining the oldest block rather than the LRU.
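To see how LRU and FIFO can differ, here is a minimal sketch that counts misses for a single set holding a fixed number of block frames. The function names and the reference trace are made up for illustration; real hardware typically approximates these policies rather than tracking them exactly.

```python
from collections import OrderedDict, deque


def lru_misses(references, num_frames):
    """Count misses for one set under LRU replacement."""
    frames = OrderedDict()  # least recently used block comes first
    misses = 0
    for block in references:
        if block in frames:
            frames.move_to_end(block)  # a hit refreshes recency
        else:
            misses += 1
            if len(frames) == num_frames:
                frames.popitem(last=False)  # evict the least recently used
            frames[block] = True
    return misses


def fifo_misses(references, num_frames):
    """Count misses under FIFO replacement: hits do not change eviction order."""
    queue = deque()   # arrival order, oldest block first
    resident = set()
    misses = 0
    for block in references:
        if block not in resident:
            misses += 1
            if len(queue) == num_frames:
                resident.discard(queue.popleft())  # evict the oldest arrival
            queue.append(block)
            resident.add(block)
    return misses


if __name__ == "__main__":
    trace = [1, 2, 3, 1, 4, 1, 5, 1]  # block 1 is reused often
    print("LRU misses: ", lru_misses(trace, 3))
    print("FIFO misses:", fifo_misses(trace, 3))
```

On this trace FIFO evicts block 1 even though it was just used, because it was the first to arrive, while LRU keeps it.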

Check out Part 2.

References
1. Computer Architecture - A Quantitative Approach, Third Edition.

2.17.2009

Blogger: Adding syntax highlighter to Blogger

As we know, it is hard to post source code on Blogger because it offers no syntax highlighting option by default (a big deficiency).

After some web searching I first came across a JavaScript tool named syntaxhighlighter, and then Heisencoder's post on how to add syntaxhighlighter to a Blogger template.

NOTE: We are about to tweak the HTML code of the Blogger template. To learn how to edit the HTML code of the Blogger template, check this post.

NOTE: As a safety precaution, click the "Download Full Template" link to save a copy of your blog's current template before editing.

Select the "Expand Widget Templates" option to see the full HTML code in the editor. Now we have to get our hands dirty, as we are about to add some code to the Blogger template.

1. Go to http://syntaxhighlighter.googlecode.com/svn/trunk/Styles/SyntaxHighlighter.css, select all, copy the whole stylesheet, and paste it at the end of the CSS section of your Blogger HTML template (i.e., before ]]></b:skin>).

2. Before the </head> tag, paste the following code:


<!-- Add-in scripts for syntax highlighting -->
<script src='http://syntaxhighlighter.googlecode.com/svn/trunk/Scripts/shCore.js' type='text/javascript'></script>
<script src='http://syntaxhighlighter.googlecode.com/svn/trunk/Scripts/shBrushCpp.js' type='text/javascript'></script>
<script src='http://syntaxhighlighter.googlecode.com/svn/trunk/Scripts/shBrushCSharp.js' type='text/javascript'></script>
<script src='http://syntaxhighlighter.googlecode.com/svn/trunk/Scripts/shBrushCss.js' type='text/javascript'></script>
<script src='http://syntaxhighlighter.googlecode.com/svn/trunk/Scripts/shBrushDelphi.js' type='text/javascript'></script>
<script src='http://syntaxhighlighter.googlecode.com/svn/trunk/Scripts/shBrushJava.js' type='text/javascript'></script>
<script src='http://syntaxhighlighter.googlecode.com/svn/trunk/Scripts/shBrushJScript.js' type='text/javascript'></script>
<script src='http://syntaxhighlighter.googlecode.com/svn/trunk/Scripts/shBrushPhp.js' type='text/javascript'></script>
<script src='http://syntaxhighlighter.googlecode.com/svn/trunk/Scripts/shBrushPython.js' type='text/javascript'></script>
<script src='http://syntaxhighlighter.googlecode.com/svn/trunk/Scripts/shBrushRuby.js' type='text/javascript'></script>
<script src='http://syntaxhighlighter.googlecode.com/svn/trunk/Scripts/shBrushSql.js' type='text/javascript'></script>
<script src='http://syntaxhighlighter.googlecode.com/svn/trunk/Scripts/shBrushVb.js' type='text/javascript'></script>
<script src='http://syntaxhighlighter.googlecode.com/svn/trunk/Scripts/shBrushXml.js' type='text/javascript'></script>

NOTE: Remove the lines for languages you will never use (for example, Delphi); this saves some loading time for the blog.

3. Before the </body> tag, insert the following:

<!-- Add-in Script for syntax highlighting -->
<script type='text/javascript'>
dp.SyntaxHighlighter.BloggerMode();
dp.SyntaxHighlighter.HighlightAll('code');
</script>

NOTE: Tweaking of the Blogger HTML template is now complete. Before you save the template, click the "Preview" button to check that the blog still renders and the code works.

4. When writing a post that contains source code, click the "Edit Html" tab and place the source code between pre tags as shown below:

<pre name="code" class="cpp">
...Your html-escaped code goes here...
</pre>

In the above code substitute "cpp" with whatever language you're using. Choices: cpp, c, c++, c#, c-sharp, csharp, css, delphi, pascal, java, js, jscript, javascript, php, py, python, rb, ruby, rails, ror, sql, vb, vb.net, xml, html, xhtml, xslt. Full list can be accessed at Supported languages.

NOTE: Instead of remembering this code every time, we can add it to the post template so that it appears whenever we create a new post. Click the "Settings" tab, then the "Formatting" sub-tab, and paste the HTML code into the "Post Template" box. From then on it is shown under "Edit Html" whenever we create a new post.

The source code must be HTML-escaped, which can be done on sites such as Centricle or Accessify.
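If you would rather escape the code locally than use a website, a few lines of Python can do it. This is a sketch using the standard library's `html.escape`; the helper name `wrap_for_highlighter` and the sample C snippet are assumptions made for illustration.

```python
import html


def wrap_for_highlighter(code, language="cpp"):
    """Escape &, < and > in code and wrap it in the pre tags used above."""
    escaped = html.escape(code, quote=False)
    return f'<pre name="code" class="{language}">\n{escaped}\n</pre>'


if __name__ == "__main__":
    source = "#include <stdio.h>\nint main(void) { return 1 < 2 && 3 > 2; }"
    print(wrap_for_highlighter(source))
```

The printed result can be pasted directly into the "Edit Html" tab.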

Reference

[1] Heisencoder - Link

2.16.2009

Gcov - analyzing code produced with GCC

How can a programmer know exactly which parts of the code are executed frequently and which parts are never executed? That is where CODE COVERAGE comes to the rescue ...

Code Coverage is a measure used in software testing to describe the degree to which the source code of a program has been tested. It is a form of testing that inspects the code directly and checks the active and non-active parts of the source-code.

There are a number of code coverage criteria, the main ones being:

Function coverage

Statement coverage

Decision coverage (also known as Branch coverage)

Condition coverage

Modified Condition/Decision Coverage (MC/DC)

Path coverage

Entry/exit coverage

NPTEL - Web Video courses from IIT's and IISc

NPTEL - National Programme on Technology Enhanced Learning, a project funded by the Ministry of Human Resource Development (MHRD) of India.

The main objective of the NPTEL program is "To enhance the quality of engineering education in the country by developing curriculum based video and web courses".

Seven IITs and IISc Bangalore together created web courses on different engineering subjects and made them available for all other students around the world.

In the first phase of the project, supplementary content for 129 web courses in engineering/science and humanities has been developed. Each course contains materials that can be covered in depth in 40 or more lecture hours. In addition, 110 courses have been developed in video format, with each course comprising approximately 40 or more one-hour lectures. In the next phase other premier institutions are also likely to participate in content creation.

The courses can be accessed at http://nptel.iitm.ac.in/

The video lectures of various courses can be accessed directly on YouTube at http://youtube.com/iit
