Computer Science Study Guide: 2010

Tuesday, December 7, 2010

Logical and Vector Timestamps in Distributed Systems - Example

Logical Time Stamp

Example 1

Vector Timestamps
Example 1

Example 2

Friday, December 3, 2010

Simple MapReduce Program Example in Python

Given a set of integers, compute the sum of their square values.
(Assumption: You already know how to run a map reduce program using an input file and how to check the output)

mapper.py

#!/usr/bin/env python

import sys

# input comes from STDIN (standard input)
for line in sys.stdin:
    # remove leading and trailing whitespace
    line = line.strip()
    # split the line into numbers
    numbers = line.split()
    for number in numbers:
        num = int(number)
        square = num*num
        print '%s\t%s' % (1,square)

reducer.py

#!/usr/bin/env python

import sys

#sum initialized to zero
sum=0
# input comes from STDIN
for line in sys.stdin:
    # remove leading and trailing whitespace
    line = line.strip()
    # parse the input we got from mapper.py
    word, square = line.split('\t', 1)
    # convert square(currently a string) to int
    try:
        square = int(square)
        sum=sum+square
    except ValueError:
        # count was not a number, so silently
        # ignore/discard this line
        pass
#print sum of squares
print '%s'% (sum)
___________________________________________________________________
input.txt
1 2 3 4

output/part-0000
30

NP completeness exam questions

Consider a reduction of problem A to problem B. What is the most precise claim you
can make about problem B for each of the following situations?

a) A is NP-complete and the reduction is in polynomial time.
NP-hard. (At least as hard as NP-complete problem.)

b) A is in polynomial time and the reduction is also in polynomial time.
B could be anything.

c) A is NP-complete and the reduction is in Pspace.
B could be anything.

d) A is in nondeterministic polynomial time and the reduction is in polynomial time.
B is at least as hard as A, but nothing more can be said.

e) A requires exponential time and the reduction is in polynomial time.
B must requires polynomial time for deciding.

f) A is Pspace complete and the reduction is in Pspace.
B could be anything.
_______________________________________________________________________
Suppose you could reduce an NP complete problem to a polynomial time problem in
polynomial time. What would be the consequence?
What if the reduction required exponential time?

If we could reduce an NP-complete problem to a problem in P, then NP will
be equal to P. If the reduction required exponential time, then there is not
special consequence. (In fact any NP-complete problem can be reduced to
a polynomial time solvable problem using exponential time reduction.)
_______________________________________________________________________

Wednesday, December 1, 2010

True-Sat problem is NP-complete

TRUE-SAT = {boolean expressions E in conjunctive normal form that
                   (1) are true when all variables are set true, AND ALSO
                   (2) have some other truth assignment that makes E true}
            This can also be stated (in Garey-Johnson notation) as:
             Instance: Set of clauses C₁,C₂,....C_k over the boolean variables u₁,u₂,....u_m
             Question:
                     (1) are all C_i true when u₁,u₂,....u_m are set to true, and
                     (2) is there another satisfying truth assignment (not all variables set to true)?

First, Show TRUE_SAT is in NP. (This is left as an exercise)
Then show TRUE-SAT is NP-hard.
If an expression is not true when all variables are true, then it is surely not in TRUE-SAT.

To show TRUE-SAT is NP-complete, we reduce SAT to it. Suppose we are given an expression E with variables a,b,c.

Convert E to E' as follows:
Test if E is true when a = T, b=T and c=T.
If so, we know E is satisfiable, then add (a U a') to E.
                 E' = E U ( a U a' )
         // E satisfies first condition. a U a' satisfies the second condition.
Otherwise
                E' = E U ( a U b U c )
                //Adding a U b U c satisfies first condition. If E is in SAT, that satisfies the second condition.
Therefore, TRUE_SAT accepts only when E is in SAT. Hence TRUE-SAT is reducible to SAT and TRUE-SAT is NP-complete.

Monday, November 29, 2010

PageRank Algorithm Using Mapreduce

Description - http://en.wikipedia.org/wiki/PageRank

Examples of pagerank calculation - http://pagerank.suchmaschinen-doktor.de/index/examples.html

Formula for pagerank calculation:

PR(A) = (1 - d) + d * SUM ((PR(I->A)/C(I))

Where:

PR(A) is the PageRank of your page A.
d is the damping factor, usually set to 0,85.
PR(I->A) is the PageRank of page I containing a link to page A.
C(I) is the number of links off page I.
PR(I->A)/C(I) is a PR-value page A receives from page I.
SUM (PR(I->A)/C(I)) is the sum of all PR-values page A receives from pages with links to page A..

Inputs and Outputs of Mapper and Reducer - https://wiki.umiacs.umd.edu/ccc/images/e/ea/CLuE-Jagannathan.pdf

Video lecture

Thursday, November 25, 2010

Factorization of a Number is in NP. Why?

Factoring a number seems to be trivial. The fact is that factorization when done in binary takes non-polynomial time.

For more information check http://en.wikipedia.org/wiki/Integer_factorization

Wednesday, November 17, 2010

Polynomial Reduction Example

Show that WRITE_ONE_TM = {M : M is a Turing machine that writes a 1 on its tape } is undecidable.

Let us assume that WRITE_ONE_TM is decidable. Then there exists a Turing machine WRITE_ONE_TM, which can tell you if a given Turing Machine M will write a 1 on its tape on input z.

Let us look at how WRITE_ONE_TM works.

M is the input to WRITE_ONE_{TM .}

If M writes a 1 on its tape , WRITE_ONE_TM accepts. – Yes instance

If M does not write a 1 on its tape, WRITE_ONE_TM rejects. – No instance.

If M writes a 1 on its tape at any time of its computation, WRITE_ONE_TM decides yes. It does not matter what M does before it writes 1 on its tape.

M might have 10000 transitions before it writes 1 on its tape. Still WRITE_ONE_TMshould be able to decide whether M will write 1 on its tape.

M might compute what is 5+7 before it writes 1 on its tape. Still WRITE_ONE_TMshould be able to decide whether M will write 1 on its tape.

M can watch a youtube video and then write a 1 on its tape. Still WRITE_ONE_TMshould be able to decide whether M will write 1 on its tape.

( just kidding)

M might simulate another Turing Machine and then write a 1 on its tape. Still WRITE_ONE_TMshould be able to decide whether M will write a1 on its tape.

Suppose M is simulating another machine T on input w on M’s tapes and if T accepts w, M is writing a 1 on its tape. Still WRITE_ONE_TMshould be able to decide whether M will write a1 on its tape.

Did you hear that?

Isn’t that a solution to decidability of A_TM. Whenever you have to decide if a TM T will accept or reject w, all you have to do is:

Construct a new TM, M such that

· It simulates T on input w on M’s tape

· If T accepts w, write a 1 on M’s tape.

· Feed M to WRITE_ONE_TM

· If WRITE_ONE_TMdecides yes, T accepts w.

But we already know that A_TM is undecidable. So it is not possible that WRITE_ONE_TM can be decidable if it can solve A_TM.

Hence WRITE_ONE_TMis undecidable.

Tuesday, November 2, 2010

Number of full nodes in a binary tree c++

A FULL node in a binary tree is a node that has exactly two non-null children. Write a RECURSIVE function that returns the number of full nodes in a binary tree.

# include <iostream >
#include <stdio.h >

using namespace std;

struct TreeNode{
    int value;
    TreeNode* left;
    TreeNode* right;
};

int FullNodes(TreeNode* root){


    if(root == NULL)        //if tree is empty
        return 0;
    if(root->left == NULL && root->right == NULL)         //leaf nodes
        return 0;
    if(root->left!=NULL && root->right != NULL)             // Full Nodes
        return 1 + FullNodes(root->left) + FullNodes(root->right);

    if(root->left==NULL && root->right != NULL)           //Nodes with no left child
        return FullNodes(root->right);

    if(root->left!=NULL && root->right == NULL)          // Nodes with no right child
        return FullNodes(root->left);

}
/*---------------------------------TEST---------------------------------*/

//Function to insert into binary tree
void insert(TreeNode*& root, int num){

    if(root == NULL){
        root = new TreeNode();
        root->value = num;
        root->left = NULL;
        root->right = NULL;

    }
    else if(num < root->value){
        insert(root->left,num);
    }
    else
        insert(root->right,num);
}

int main(){

//creates a tree
    TreeNode* tree = NULL;
    insert(tree, 8);
    insert(tree, 5);
    insert(tree,10);
    insert(tree, 3);
    insert(tree, 9);
    insert(tree, 6);


//Full Node count test
    cout << FullNodes(tree); //8 and 5 are full nodes . So output is 2

    return 0;
}

Monday, October 18, 2010

MPI Programming Exam Questions

Feel free to give answers to unanswered questions as comments

1. Sum of prime factors :

http://people.sc.fsu.edu/%7Ejburkardt/presentations/fdi_2008_lecture8.pdf#page=16

Add up the prime numbers from 2 to N.
Each of P processors will simply take about 1/P of the range of
numbers to check, and add up the primes locally.
When it's done, it will send the partial result to processor 0.

2. Deadlock avoidance when using blocking send and receive in order.

http://www.cs.ucsb.edu/~hnielsen/cs140/mpi-deadlocks.html

In MPI, when does a non-blocking recv message return?

What is an MPI communicator? What is MPI_COMM_WORLD?

Two processors must be in a common "communicator group" in order to communicate. This is simply a way for the user to organize processors into sub-groups. All processors can communicate in the
shared group known as MPI_COMM_WORLD.

In MPI, what is a process rank?

True or false: In MPI you set the number of processes when you write the source code.

No. Number of processes are given on execution.

Give a short piece of pseudocode that illustrates the master/slave programming model.

Explain if the following MPI code segment is correct or not, and why:
Process 0 executes:

MPI_Recv(&yourdata, 1, MPI_FLOAT, 1, tag, MPI_COMM_WORLD, &status);
MPI_Send(&mydata, 1, MPI_FLOAT, 1, tag, MPI_COMM_WORLD);

Process 1 executes:

MPI_Recv(&yourdata, 1, MPI_FLOAT, 0, tag,MPI_COMM_WORLD, &status);
MPI_Send(&mydata, 1, MPI_FLOAT, 0, tag, MPI_COMM_WORLD);

Both are blocking receives waiting for each other to send. System is deadlocked.

Suppose that process 0 has variable A, and process 1 also has a variable A. Write MPI-like pseudocode to exchange these values between the processes.

          P0:
send(P1,A)
          recieve(P0,A)

          P1:
recieve(P1,A)
          send(P0,A)

Explain the purpose of each of the library calls listed.

· MPI_Init

· MPI_Finalize

· MPI_Comm_rank

· MPI_Comm_size

· MPI_Send

· MPI_Recv

· MPI_Barrier

· MPI_Bcast

· MPI_Scatter

· MPI_Gather

· MPI_Reduce

What is an MPI derived datatype and when would you use one? Give an example.
In MPI, when does a blocking recv message return?
Blocks until it receives message
True or false: You can write a program using MPI that will run across all of the cores of your multicore computer in parallel. Also, if this is possible, indicate if you think this is a good way to write the program. You must justify your answer to receive credit.
Discuss marshalling in MPI.
When does a blocking send return.

MPI uses buffering by default. Send returns when message is saved in receivers buffer. Process calling MPI_Send can continue even if destination has not received the message.

16. Programming question 1 : http://www.cs.usfca.edu/~peter/cs220/mt1_old_key#page=6.pdf
17. Programming question2 : http://www.cs.usfca.edu/~peter/cs220/mt1_key#page=4.pdf

Wednesday, October 6, 2010

3 bit up counter using D flipflops

Wednesday, September 29, 2010

Memory management questions

Assume you have a small virtual address space of size 64 KB. Further assume that this is a system
that uses paging and that each page is of size 8 KB.
(a) How many bits are in a virtual address in this system?

16 (1-KB of address space needs 10 bits, and 64 needs 6; thus 16).

(b) Recall that with paging, a virtual address is usually split into two components: a virtual page number (VPN) and an offset. How many bits are in the VPN?

3. Only eight 8-KB pages in a 64-KB address space.

(c) How many bits are in the offset?

16 (VA) - 3 (VPN) = 13.
Alternately: an 8KB page of course requires 13 bits to address each byte (213 = 8192).

(d) Now assume that the OS is using a linear page table, as discussed in class. How many entries does this linear page table contain?

One entry per virtual page. Thus, 8.

Now assume you again have a small virtual address space of size 64 KB, that the system again uses paging, but
that each page is of size 4 bytes (note: not KB!).

(a) How many bits are in a virtual address in this system?

Still 16. The address space is the same size.

(b) How many bits are in the VPN?

14.

(c) How many bits are in the offset?

Just 2 (4 bytes).

(d) Again assume that the OS is using a linear page table. How many entries does this linear page table
contain?
2**14, or 16,384.
____________________________________________________________________________________

Consider the following segment table:

Segment  Base  Length
    0   219    600
    1  2300     14
    2    90    100
    3  1327    580
    4  1952     96

What are the physical addressed for the following logical addresses?

(a) 0,430

(b) 1,10

(d) 3,400

(e) 4,112

(a) 219 + 430 = 649
(b) 2300 + 10 = 2310
(c) illegal reference; traps to operating system
(d) 1327 + 400 = 1727
(e) illegal reference; traps to operating system

___________________________________________________________________________

Consider a paging system with the page table stored in memory.

(a) If a memory reference takes 200 nanoseconds, how long does a paged memory reference take?

(b) If we add associative registers, and 75% of all page-table references are found in the associative registers, what is the effective memory reference time? (Assume that finding a page-table entry in the associative registers takes zero time, if the entry is there.)

400 nanoseconds. 200 ns to access the page table plus 200 ns to access the word in memory.
250 nanoseconds. 75% of the time it's 200 ns, and the other 25% of the time it's 400ns, so the equation is:
```
e.a. = (.75*200)+(.25*400)
```
________________________________________________________________________

A certain computer provides its users with a virtual memory space of 2**32 bytes. The computer has 2**18 bytes of physical memory. The virtual memory is implemented by paging, and the page size is 4K bytes. A user process generated the virtual address 11123456. Explain how the system establishes the corresponding physical location.

* The virtual address in binary form is

0001 0001 0001 0010 0011 0100 0101 0110

Since the page size is 2**12, the page table size is 2**20. Therefore, the low-order 12 bits (0100 0101 0110) are used as the displacement into the page, while the remaining 20 bits (0001 0001 0001 0010 0011) are used as the displacement in the page table.

__________________________________________________________________________________

Round robin scheduling questions

1. Consider N processes sharing the CPU in a round-robin fashion (N>=2). Assume that each context switch takes S msec and that each time quantum is Q msec. For simplicity, assume that processes never block on any event and simply switch between the CPU and the ready queue.

In the following your answers should be functions of N, S and T.

a) Find the maximum value of Q such that no process will ever go more than T msec

Time taken for one process per quantum = quantum,Q+context switch,S

Max wait time, T = N(Q+S)

T = NQ+NS
Q = (T-NS)/N

b) Find the maximum value of Q such that no process will ever go more than T msecs between executing instructions on the CPU?

Max wait time, T = N(Q+S) - Q ie.last instruction just before context switch executes at the end of the quantum of the first time when process executes..
T = NQ+NS-Q
T = Q(N-1)+NS
Q = (T-NS)/(N-1)

2. Suppose that there are two processes, PH and PL, running in a system. Each process is single-threaded. The operating system’s scheduler is preemptive and uses round-robin scheduling with a quantum of q time units.

The scheduler supports two priority levels, HIGH and LOW. Processes at LOW priority will run only if there are no runnable HIGH priority processes. Process PH is a HIGH priority process.
It behaves as described in the following pseudo-code:

while (TRUE) do
compute for tc time units
block for tb time units to wait for a resource
end while

That is, if this process were the only one running in the system, it would alternate between running for tc units of time and blocking for tb units of time. Assume that tc is less than q.

Process PL is a low priority process. This process runs forever, doing nothing but computation. That is, it never blocks waiting for a resource.

a. For what percentage of the time will the low priority process PL be running in this system?Express your answer in terms of tb and tc.

tb/(tb + tc)

b. Repeat part (a), but this time under the assumption that there are two HIGH priority processes (PH1 and PH2) and one LOW priority process (PL). Assume that each HIGH priority process waits for a different resource. Again, express your answer in terms of tb and tc. Your answer should be correct for all tb greater than 0 and all 0 less than tc less than q.

(tb−tc)/(tb+tc) if tc is less than tb

0 if tc greater than or equal to tb
------------------------------------------------------------------------------------------------------------
Suppose a processor uses a prioritized round robin scheduling policy. New processes are assigned an initial quantum of length q. Whenever a process uses its entire quantum without blocking, its new quantum is set to twice its current quantum. If a process blocks before its quantum expires, its new quantum is reset to q. For the purposes of this question, assume that
every process requires a finite total amount of CPU time.
(a) Suppose the scheduler gives higher priority to processes that have larger quanta. Is starvation possible in this system? Why or why not?
No, starvation is not possible. Because we assume that a process will terminate, the worst
that can happen is that a CPU bound process will continue to execute until it completes.
When it finishes, one of the lower priority processes will execute. Because the I/O bound
processes will sit on the low priority queue, they will eventually make it to the head of the
queue and will not starve.
(b) Suppose instead that the scheduler gives higher priority to processes that have smaller quanta. Is starvation possible in this system? Why or why not?
Yes, starvation is possible. Suppose a CPU bound process runs on the processor, uses its
entire quantum, and has its quantum doubled. Suppose a steady stream of I/O bound processes
enter the system. Since they will always have a lower quantum and will be selected for
execution before the process with the doubled quantum, they will starve the original process.
_______________________________________________________________
Assume that 3 processes all with requirements of 1 second of CPU time each and
no I/O arrive at the same time.

a)What will be the average response time (i.e., average time to
completion) for the processes Round Robin (RR) scheduling assuming a timeslice of 0.1 sec and no overhead for context switches (i.e., context switches are free).

Answer: 2.9 seconds

Explanation:

Time for completion for process A =0.28
Time for completion for process B =0.29
Time for completion for process C = 0.30

Average time for completion =0. 29
________________________________________________________________________________________________________

Suppose that the operating system is running a round-robin scheduler with a 50 msec time quantum. There are three processes with the following characteristics:

* Process A runs for 60 msec, blocks for 100 msec, runs for 10 msec and terminates.
* Process B runs for 70 msec, blocks for 40 msec, runs for 20 msec, and terminates.
* Process C runs for 20 msec, blocks for 80 msec, runs for 60 msec, and terminates.

Process A enters the system at time 0. Process B enters at time 10 msec. Process C enters at time 20 msec. Trace the evolution of the system. You should ignore the time required for a context switch. The time required for process P to block is the actual clock time between the time that P blocks and the time that it unblocks, regardless of anything else that is happening.
Answer:

Time Running process Events
0-50 A B enters at time 10. C enters at time 20.
50-100 B
100-120 C C blocks until time 200.
120-130 A A blocks until time 230.
130-150 B B blocks until time 190
150-190 Idle B unblocks at time 190
190-210 B C unblocks at time 200. B terminates at time 210
210-260 C A unblocks at time 230.
260-270 A A terminates at time 270.
270-280 C C terminates at time 280.

_______________________________________________________________________________________________
Consider a variant of the round-robin scheduling algorithm where the entries in the ready queue are pointers to process-control-blocks.

1. What would be the effect of putting two pointers to the same process in the ready queue?

Doubling the time given to that process.
2. What would be the major advantages and disadvantages of this scheme?

Simple scheme which would provide some priority work with minimal modification to scheduler. But, overhead for managing pointers is a nuisance -- what if the process is io waiting or done? Have to remove BOTH pointers from ready queue, etc. Also may increase overhead if same process runs back-to-back -- it was not necessary to switch contexts.
3. How would you modify the basic round-robin algorithm to achieve the same effect without duplicate pointers?

Add a simple quantum indicator to PCB.
_______________________________________________________________________________________________

For each of the following statements, indicate whether you think it is probably true (T) or probably false (F). Then give a brief (one sentence) reason. There is not necessarily a single correct answer to each question, so your one sentence explanation is the most important part of your answer.

1. Small time slices always improve the average completion time of a system.

Probably false: Small time slices will sometimes improve the average response of the system. If the slice is too small, the context switching time will start to dominate the useful computation time and everything (including response time) will suffer.

2. Using a round robin scheduler, a large time slice is bad for interactive users.

Probably true: Large time slices can allow non-interactive processes keep control of the CPU for longer periods of time, causing the interactive processes to be less responsive.

3. Shortest Job First (SJF) or Shortest Completion Time First (SCTF) scheduling is difficult to build on a real operating system.

Probably true: SCTF scheduling requires knowledge of how much time a process is going to take. This requires future knowledge. You might require a user to specify the maximum amount of time that a process could run (and kill it if it exceeds this amount), then use a variant on SCTF.

________________________________________________________________________________

Wednesday, September 22, 2010

5 Variable Karnaugh Map Solution

Simplify the Boolean function
F(A,B,C,D,E) = Σ (0,2,4,6,9,11,13,15,17,21,25,27,29,31)

Writing decimals in binary,

Decimal   A   B   C   D   E
     0   0   0     0   0   0
     2   0   0    0   1   0
     4   0   0   1   0   0
     6   0   0   1   1   0
     9   0   1   0   0   1
     11   0   1   0   1   1
     13   0   1   1   0   1
     15   0   1   1   1   1
     17   1   0   0   0   1
     21   1   0   1   0   1
     25   1   1   0   0   1
     27   1   1   0   1   1
     29   1   1   1   0   1
     31   1   1   1   1   1

Construct two Karnaugh maps for variables A,B,C and D when E=0 and E=1

From Karnaugh map E=0, F0 = A'B'E'

From Karnaugh map E=1, F1 = BE+ AD'E

F = F0+ F1

F = A'B'E' + BE + AD'E

Sunday, September 19, 2010

Computer Architecture - Counters

Overview - Types of counters

Wednesday, September 8, 2010

Recurrence relations examples

Solve the recurrence relation T(n) = T(n/2) + lg(n) where T(1) = 1 and n = 2k for a nonnegative integer k. Your answer should be a precise function of n in closed form. Note that lg represents the log function base 2.

T(n) = T(n/2) + lg(n)

T(n/2) = T(n/4) + lg(n/2)

T(n/4) = T(n/8) + lg(n/4)

T(n/8) = T(n/16) + lg(n/8)

Substituting for T(n)

T(n) = T(n/2) + lg(n)

= T(n/4) + lg(n/2)+lg(n)

= T(n/8)+ lg(n/4)+ lg(n/2)+lg(n)

= T(n/16)+lg(n/8)+ lg(n/4)+ lg(n/2)+lg(n)

----------------------------------------------------------

----------------------------------------------------------

Suppose n= 2**k

T(n) = 1 + lg (n. n/2 . n/4. . . . . n/2**k-1)

= 1+ lg(2**k . 2**k-1 . 2**k-2 . …… . 2)

= 1+ lg( 2 **(1+2+3+ - - - +k))

= 1 + (1+2+3+ ----+k)

= 1+ k(k+1)/2

______________________________________________________________________________

Solve the recurrence T(n) = T(n-1) + n

T(n)    = T(n-1) + n
T(n-1) = T(n-2) + n-1
T(n-2) = T(n-3) + n-2
T(n-3) = T(n-4) + n -3

T(n) = T(n-1) + n
   = T(n-2) + n-1+n
   =   T(n-3) + n-2+n-1+n
   =   T(n-4) + n-3+n-2+n-1+n
-----------------------------------------------

        Consider n= 2**k
T(n) = T(n-k) + n+n-1+n-2+n-3+ -------------+n-(k-1)
         = T(n-k) + n(n-1) - k(k-1)/2
         = T(n-k) + n**2 - n -lgn(lgn-1)/2

Therefore, asymptotic complexity is n**2

Tuesday, August 31, 2010

Delete a linked list using recursion in C++

#include < iostream >
using namespace std;

struct Node{
    int num;
    Node* next;
};

//Here address of pointer to the list is passed so that head of the list will not be a local variable.

void DeleteListRecursion(Node*& list){
    if(list == NULL)
        return;
    else
    DeleteListRecursion(list->next);
    delete(list);
    list = NULL;
}

Node* Insert(int number, Node* list){
    Node* node = new Node;
    node->num = number;
    node->next = list;
    return node;
}

//TEST

int main(){

    Node* list = NULL;

    list = Insert(5,list);
    list = Insert(7,list);
    list = Insert(9,list);
    list = Insert(4,list);
    list = Insert(6,list);
    list = Insert(3,list);

    DeleteListRecursion(list);

    return 0;
}

Saturday, August 28, 2010

Data Structures You Should Know

Big Oh of common algorithms : http://www.mcs.csueastbay.edu/~billard/cs3240/big-oh.html
Binary Search

UnSorted List Using Arrays
UnSorted List Using Linked List
Sorted List Using Arrays
Sorted List Using Linked List

Doubly Linked List
Circular List

Stack Using Arrays
Stack Using LinkedList

Queue Using arrays
Queue Using LinkedList

Recursion Problems

Binary Tree

Graph

Minimum Spanning Tree

Kruskals Algorithm Pseudocode Example
Prim's Algorithm Pseudocode Example

Sorting

Bubble Sort Code Example
Selection Sort Code Example
Insertion Sort Code Example
QuickSort Code Example
MergeSort Code Example
HeapSort Code Example
Radix Sort Code Example

Hashing

Sunday, August 15, 2010

Computer Architecture

Tuesday, June 15, 2010

Chomsky Hierarchy Diagram for Languages.

The diagram shows how different classes of languages such as regular, context free, context sensitive, P, NP, PSAPACE, NPSPACE, EXPTIME, NEXPTIME, EXPSPACE, ACCEPTABLE, DECIDABLE, CO-ACCEPTABLE etc are related to each other.

Half-Sat Problem With Example

Half-SAT = {F | F is a CNF formula with 2n variables and there is a satisfying assignment in which n variables are set to True and n variables are set to False}.
Show that Half-SAT is NP-Hard.

We can show that half-sat is NP-hard by reducing sat to half-sat. We can show that if half-sat has a solution sat also has a solution.

Let T be a turing machine(transducer) which reduces sat to half-sat. T takes F as input.
Let F has variables x1,x2,......xn.
T(F) = Construct a new CNF formula F' as follows.
            F' will have variables x1,x2,.......xn from F and new variables y1,y2,...........yn such that xi = ‾yi
            If a clause is of the form (x1 U x2), convert it into (x1 U x2) ^( ‾y1 U ‾y2).
Output(F')


Example 1
Let F = (x1 U x2 U ‾x3) and x1=True, x2=False, x3= True
F = T U F U T = T
Introduce three new variables y1,y2 and y3 such that
x1 = ‾y1
x2 = ‾y2
‾x3 = y3

F' = (x1 U x2 U ‾x3) ^ ( ‾y1 U ‾y2 U y3).
x1 = T    y1 =F

x2 = F    y2 =T
x3 = T    y3 =F

Here three variables are assigned True and other three variables are assigned False. Therefore the expression is in half-sat.
Assigning values to F', F' = ( T U F U T) ^ (T U F U T) = True.

Hence F' evaluates to true iff F evaluates to true.

Let x1=False, x2 = False, x3=False
F= false U false U false = False
F' = (F U F U F) ^ (F U F U F) = F
Hence F' evaluates to false iff F evaluates to false.

Friday, June 11, 2010

NP- complete Problems with variations

K-clique video

half-clique
quarter-clique
sub-graph isomorphism
Independent set video
vertex cover video
Exact Cover video

Hamiltonian Path video lecture

Does a hamiltonian path exist in G.
Does a hamiltonian path exist from node s to t
k-link path
Hamiltonian Circuit video

SAT Problems video

TRUE-SAT
DOUBLE-SAT
3SAT
HALF-SAT
not equal to 3SAT

Subset Sum Problem video

Partition video
Multiset video

3-COLOR

Thursday, June 10, 2010

Half Clique Problem with examples ( Complexity theory )

Half-clique problem is described as follows. Given a graph G with n number of vertices, does there exist a clique of G consisting of exactly half the nodes of G? Show that HALF-CLIQUE is NP-hard by reducing CLIQUE to HALF-CLIQUE.

Let f be a transducer which does the reduction.

f(G,k) = Construct G' as follows
              Add n more vertices to G.
              Select n-k of the newly added vertices, connect them to each other and to all original vertices in G.

Output (G')

Here adding n-k vertices to G can be done in polynomial time. So reduction is done in polynomial time.
Yes Instance

In the following example G has 5 nodes. Let k=3. This is a yes instance since G has a clique of k=3( nodes A,B and E).

Construct G' by adding 5 more nodes (F,G,H,I,J) to graph G. Connect 5-3=2 of them to each other and to original nodes of G.

G' has five-clique (A,B,E,F and G).
G' will not have 5-clique unless G already has a 3 clique.

No Instance

Try yourself.:

Remove any of the edges forming 3 clique and try to construct G'. Check whether G' can have a 5-clique.

Clique can be reduced to any fractionof clique using the same logic. To solve (1/m)th clique, add (m-1)n vertices to graph G and connect n-k vertices.
For example, to solve quarter-clique, add 3n vertices to G and fully connect n-k new vertices.

Thursday, May 20, 2010

Oracle Trouble shooting

When I am trying to sign in to Oracle 11g in windows xp , I am getting the error 'ORA-12560: TNS:protocol adapter error'

Solution:
Try1:
set local=databasename
For example, if name of database is orcl, command will be
C:\ set local = ORCL

Then log in to sqlplus
C:\ sqlplus / as sysdba
___________________________________________________________________________________
Try 2

If try1 does not work,

In command prompt do the following:
C:\ set ORACLE_SID = databasename
Important: The database name has to be in uppercase letters.

For example, if name of database is orcl, command will be
C:\ set ORACLE_SID = ORCL

Then log in to sqlplus
C:\ sqlplus / as sysdba

SQL*Plus: Release 11.1.0.6.0 - Production on Sun Jun 20 00:06:58 2010

Copyright (c) 1982, 2007, Oracle. All rights reserved.

Connected to:
Oracle Database 11g Enterprise Edition Release 11.1.0.6.0 - Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

Now you are connected to the database.
____________________________________________________________________________________

Sunday, May 16, 2010

Reducibility , NP completeness - Slides and Video Lectures

Video Lectures
http://web.cs.wpi.edu/~kal/courses/cs503/module9/

CMU
http://www.cs.cmu.edu/~ggordon/780/slides/04-sat.pdf

NP - completeness

Vertex Cover
http://cs.nmu.edu/~mkowalcz/cs422w09/35/lecNotes.pdf

http://www.cs.nthu.edu.tw/~wkhon/toc07-lectures/lecture22.pdf

Friday, April 16, 2010

Turing Machine ( True or False )

1. A Universal Turing Machine can compute anything that any other Turing Machine could possibly compute.
True

2.The Turing Test is a test of whether a problem can be solved by a Turing Machine.
True

3. Every acceptable language is also decidable.
False

4. Decidability is a special case of decidability
True

5. Regular languages are decidable
True

6. Context free languages are not decidable
False

Friday, April 2, 2010

Useful Websites

Countability Proofs
http://mathrefresher.blogspot.com/2006/09/countability.html

Theory of Automata video lectures
http://web.cs.wpi.edu/~kal/courses/cs503/

Decidability Slides
http://www.cs.virginia.edu/~robins/cs6160/slides/Deciders,%20Recognizers,%20Rice%27s%20Theorem%20(parts%2013%20and%2014).pps

List of Automata theorems
http://web.njit.edu/~marvin/cs341/listofthms.pdf

Proof for REGTM
http://www.cs.tut.fi/~elomaa/teach/iTCS-8.pdf

Saturday, March 13, 2010

Interesting Algorithms

1. Let A be a list of n elements in non-decreasing order where some elements are replicated.
The problem is to find the element that appear most frequently. Give an O(k log n) time
algorithm to solve the problem where k denotes the number of distinct elements in A.

2. You are given a list A = [a1; a2; - - - ; an] of n distinct numbers in an arbitrary order.
You are supposed to find two numbers _i and _j from A such that (i) i < j, (ii) _j > _i, and
(iii) satisfying (i) and (ii), aj - ai is minimum over all possible such pairs. Give an O(n) time
algorithm.

3. A carpenter has a piece of wood of a certain length that must be cut at positions a₁, a₂ ..., a_n where a_i is the distance from the left end of the original piece of wood. Notice that after making the first cut, the carpenter now has two pieces of wood; after making the second cut, the carpenter has three pieces of wood, etc. Assume that the cost of making a cut in a piece of wood of length l is equal to l, and is the same no matter which position in that piece of wood is being cut. Let L be the length of the original piece of wood.
Derive the recurrence relation which could be used to design a recursive algorithm to find the minimum total cost for making all the cuts.

4. Describe a Θ(n lg n)-time algorithm, that given a set of n integers and another
integer x, determines whether or not there exist two elements in S whose sum is exactly x. Write
pseudocode.
Hint: Sort using mergesort and iterate once after sorting. Θ(n lg n+n) = Θ(n lg n)

5. Suppose you are given an array A containing n sorted elements followed by lgn unsorted elements. Thus, entire array contains N = n+lgn elements. Can the entire array sorted in O(n) time.

6. You are given an array of n numbers. Write an algorithm with O(n)=nlogn that returns the number of distinct numbers in the array.

Friday, March 12, 2010

Minimum Spanning Tree - True or False

1. In an undirected graph, the shortest path between two nodes lies on some minimum spanning
tree.
A: False.

2. If the edges in a graph have different weights, then the minimum spanning tree is unique.
A: True.

3. If the edge with maximum weight belongs to a cycle, then there exists some MST that
does not contain this edge.
A: True

4. Adding a constant to every edge weight does not change the minimum spanning tree.
A: True

5. There can be more than one minimum spanning trees if the weight of the edges are all distinct
A.False

6. If all weights are the same, every spanning tree is minimum.
A. True

7. Greedy algorithm runs in exponential time
A. False

8. A minimum spanning tree should contain all edges of the graph
A. False

9. A minimum spanning tree should contain all vertices of the graph
A.True

10.Adding an edge to a spanning tree of a graph G always creates a cycle.
A.False

11. Adding an edge to a spanning tree connecting two existing vertices of a graph G always creates a cycle.
A. True

12. For any cycle in a graph, the cheapest edge in the cycle is in a minimum spanning tree.
A. False

Big O, Big Omega, Big Theta

For each of the following pairs of functions f(n) and g(n), state whether f(n) =
O(g(n)), f(n) = Ω(g(n)), or f(n) = Θ(g(n)), or none of the above:

(a) f(n) = n2 + 3n + 4, g(n) = 6n + 7
Sol: f(n) = Ω(g(n).

(b) f(n) = n √n , g(n) = n2 - n
Sol: f(n) = O(g(n)).

(c) f(n) = 2**n - n**2, g(n) = n**4 + n**2
Sol: f(n) = Ω(g(n).

(d) Assume you have five algorithms with the running times
listed below (these are the exact running times). How much slower do each of these
algorithms get when you double the input size
i) n²
Sol: 4T(n)

ii) n³
Sol: 8T(n)

iii) 100n²
Sol: 4T(n)

iv) nlgn
Sol: 2n(lg2n) = 2n(1+lgn) ~ 2n lgn
T(n) = 2T(n)

v) 2ⁿ
Sol: [T(n)]²

(e) Write a big-O expression for 1+2+3+...+n?
Sol: O(n²)

True or false:
(f) An algorithm with worst case time behavior of 3n takes at least 30 operations for every input of size n=10.
Sol : False

(g) n! = O(nⁿ)
Sol: True

(h) n^O(1) = O(n²)
Sol: False. O(1) can be any large constant .

(i) All of the following are true.

f(n) = O(f(n))
c * O(f(n)) = O(f(n)), if c is constant
O(f(n)) + O(f(n)) = O(f(n))
O(O(f(n))) = O(f(n))
O(f(n)) * O(g(n)) = O(f(n)g(n))
O(f(n)g(n)) = f(n) * O(g(n))

Wednesday, February 17, 2010

merge sort code in c++ to count number of comparisons

#include < iostream >
#include < stdio.h >

using namespace std;

int count = 0; //count of comparisons
int n = 0;
const int MAX_ITEMS = 100;
void merge(int values[], int leftFirst, int leftLast, int rightFirst, int rightLast);
void printarray( int a[], int n);
void mergesort(int a[], int start, int end){ //no significant comparisons are done during splitting

    if(start < end){
        int mid = (start+end)/2;
        mergesort(a,start, mid);
        mergesort(a,mid+1,end);
        merge(a, start,mid, mid+1, end);
    }
}
void merge(int values[], int leftFirst, int leftLast, int rightFirst, int rightLast){
        int temparray[MAX_ITEMS];
        int index = leftFirst;
        int saveFirst = leftFirst;

        while((leftFirst <= leftLast) && ( rightFirst <= rightLast)){//compare and select smallest from two subarrays

            if(values[leftFirst] < values[rightFirst]){
                temparray[index] = values[leftFirst]; //smallest assigned to temp
                leftFirst++;
            }
            else
            {
                temparray[index] = values[rightFirst];
                rightFirst++;
            }
            index++;
            count++; //count of comaparisons done during merge. One comparison is done per iteration of while loop.
        }

        while(leftFirst <= leftLast){

            temparray[index] = values[leftFirst];
            leftFirst++;
            index++;

        }
        while(rightFirst <= rightLast){
            temparray[index] = values[rightFirst];
            rightFirst++;
            index++;

        }

        for(index = saveFirst; index <= rightLast; index++)//copies from temp array to values array
            values[index] = temparray[index];
        printarray(values,n);
        cout << endl;

    }

void printarray( int a[], int n){
    for (int i=0; i < n; i++)
        cout << a[i] << " ";
}

int main(){

    cout << "Enter number of elements to be sorted : ";
    cin >>n;

    int a[MAX_ITEMS];
    for (int i=0; i < n; i++){
        if(i==0)
            cout << "Enter the first element: ";
        else
            cout << "Enter the next element: ";
        cin >>     a[i];
    }

    int start = 0;
    int end = n-1;
    mergesort(a, start, end);
    printarray(a, n);
    cout << endl;
    cout << "Number of comparisons : "<< count << endl;
}

Friday, February 12, 2010

systems programming (UNIX, C)

How to toggle lettercases.

struct termios info;
    int   rv = tcgetattr(0, &info);
    if(info.c_oflag & OLCUC)
            info.c_oflag &= ~OLCUC;
    else
        info.c_oflag |= OLCUC;
    tcsetattr(0, TCSANOW, &info);

______________________________________________________________

How can we measure CPU time ( user + kernel) of a program in UNIX.

1. Use clock() system call to measure CPU time.
The clock() function returns the amount of CPU time (in
     microseconds) used since the first call to clock() in the
     calling process.
      example
          time_t time = clock();
          int i=0;
          while(i<10000){
              printf("%d\n", i);
              i++;
          }
         time = clock();
   printf( "\nCPU time : %ld\n" , time);// prints CPU time taken to print numbers from 0 to 999

2. Use time command
   example
     bash$ time gcc file.c
3. Use getitimer and setitimer with ITIMER_REALPROF

Friday, February 5, 2010

IMac

How can I access(SSH) a remote machine from my iMac.

Access the terminal from application -> Utilities -> Terminal
On prompt type ssh -l uername host
On prompt for password , type pasword

How can I switch between two terminal windows in iMac
command- ~

Wednesday, February 3, 2010

c programmin errors, warnings and solutions

Warnings:

warning: incompatible implicit declaration of built-in function 'malloc'
solution - add #include < stdlib.h>
warning: incompatible implicit declaration of built-in function 'strlen'
solution- add #include < string.h>

Friday, January 8, 2010

C programming

Code to deep copy a string in C

char* deepcopy(char* str1){
        char *copy = malloc (strlen(str1));
        sprintf(copy, str1);
        return copy;
}

1. When I compile, I am getting the error: 'EOF' undeclared (first use in this function)

Solution: #include < stdio.h >

2. In visual studio output window just flashes and disappears. What can I do to make it stay.

Run the program using Ctrl F5 instead of just F5

3. When I compile a simple program I am getting the error "Cannot open include file: 'iostream.h': No such file or directory". How can I fix this.

Check if iostream is stored in C:\Program Files\Microsoft Visual Studio 8\VC\include folder. Sometimes it is stored just as iostream instead of iostream.h . If that is the case add #include < iostream> omitting .h

4.LINK : fatal error LNK1104: cannot open file 'C:\Documents and Settings/...............

Usual reason is that the same program is running and a terminal window is waiting for input.