Friday, December 3, 2010

Simple MapReduce Program Example in Python

Given a set of integers, compute the sum of their square values.
(Assumption: You already know how to run a map reduce program using an input file and how to check the output)

mapper.py

#!/usr/bin/env python

import sys

# input comes from STDIN (standard input)
for line in sys.stdin:
    # remove leading and trailing whitespace
    line = line.strip()
    # split the line into numbers
    numbers = line.split()
    for number in numbers:
        num = int(number)
        square = num*num
        print '%s\t%s' % (1,square)

reducer.py

#!/usr/bin/env python

import sys

#sum initialized to zero
sum=0
# input comes from STDIN
for line in sys.stdin:
    # remove leading and trailing whitespace
    line = line.strip()
    # parse the input we got from mapper.py
    word, square = line.split('\t', 1)
    # convert square(currently a string) to int
    try:
        square = int(square)
        sum=sum+square
    except ValueError:
        # count was not a number, so silently
        # ignore/discard this line
        pass
#print sum of squares
print '%s'% (sum)
___________________________________________________________________
input.txt
1 2 3 4


output/part-0000
30

NP completeness exam questions

Consider a reduction of problem A to problem B. What is the most precise claim you
can make about problem B for each of the following situations?

a) A is NP-complete and the reduction is in polynomial time.
 NP-hard. (At least as hard as NP-complete problem.)

b) A is in polynomial time and the reduction is also in polynomial time.
B could be anything.

c) A is NP-complete and the reduction is in Pspace.
B could be anything.

d) A is in nondeterministic polynomial time and the reduction is in polynomial time.
B is at least as hard as A, but nothing more can be said.

e) A requires exponential time and the reduction is in polynomial time.
B must requires polynomial time for deciding.

f) A is Pspace complete and the reduction is in Pspace.
B could be anything.
_______________________________________________________________________
Suppose you could reduce an NP complete problem to a polynomial time problem in
polynomial time. What would be the consequence?
What if the reduction required exponential time?

If we could reduce an NP-complete problem to a problem in P, then NP will
be equal to P. If the reduction required exponential time, then there is not
special consequence. (In fact any NP-complete problem can be reduced to
a polynomial time solvable problem using exponential time reduction.)
_______________________________________________________________________

Wednesday, December 1, 2010

True-Sat problem is NP-complete

TRUE-SAT = {boolean expressions E in conjunctive normal form that
                   (1) are true when all variables are set true, AND ALSO
                   (2) have some other truth assignment that makes E true}
            This can also be stated (in Garey-Johnson notation) as:  
             Instance: Set of clauses C1,C2,....Ck over the boolean variables u1,u2,....um  
             Question:
                     (1) are all Ci true when u1,u2,....um are set to true, and
                     (2) is there another satisfying truth assignment (not all variables set to true)?



First, Show TRUE_SAT is in NP. (This is left as an exercise)
Then show TRUE-SAT is NP-hard.
If an expression is not true when all variables are true, then it is surely not in TRUE-SAT.

To show TRUE-SAT is NP-complete, we reduce SAT to it. Suppose we are given an expression E with variables a,b,c.

Convert E to E' as follows:
Test if E is true when a = T, b=T and c=T.
If so, we know E is satisfiable, then add (a U a') to E.
                 E' = E  U ( a U a' )
                 // E satisfies first condition. a U a' satisfies the second condition.
Otherwise
                E' = E U (  a U b U c )
                //Adding  a U b U c satisfies first condition. If E is in SAT, that satisfies the second condition.
 Therefore, TRUE_SAT accepts only when E is in SAT. Hence TRUE-SAT is reducible to SAT and TRUE-SAT is NP-complete.

Monday, November 29, 2010

PageRank Algorithm Using Mapreduce

Description  -  http://en.wikipedia.org/wiki/PageRank


Examples of pagerank calculation - http://pagerank.suchmaschinen-doktor.de/index/examples.html

Formula for pagerank calculation:

PR(A) = (1 - d) + d * SUM ((PR(I->A)/C(I))
Where:
  • PR(A) is the PageRank of your page A.
  • d is the damping factor, usually set to 0,85.
  • PR(I->A) is the PageRank of page I containing a link to page A.
  • C(I) is the number of links off page I.
  • PR(I->A)/C(I) is a PR-value page A receives from page I.
  • SUM (PR(I->A)/C(I)) is the sum of all PR-values page A receives from pages with links to page A..



 Inputs and Outputs of Mapper and Reducer - https://wiki.umiacs.umd.edu/ccc/images/e/ea/CLuE-Jagannathan.pdf 







Video lecture


Thursday, November 25, 2010

Factorization of a Number is in NP. Why?

Factoring a number seems to be trivial. The fact is that factorization when done in binary takes non-polynomial time.

For more information check http://en.wikipedia.org/wiki/Integer_factorization

Wednesday, November 17, 2010

Polynomial Reduction Example

Show that WRITE_ONETM = {M : M is a Turing machine that writes a 1 on its tape } is undecidable.

Let us assume that WRITE_ONETM is decidable. Then there exists a Turing machine WRITE_ONETM,  which can tell you if a given Turing Machine M will write a 1 on its tape on input z.

Let us look at how WRITE_ONETM  works.
M is the input to WRITE_ONETM .
If M writes a 1 on its tape , WRITE_ONETM  accepts.  – Yes instance
If M does not write a 1 on its tape, WRITE_ONETM  rejects. – No instance.

If M writes a 1 on its tape at any time of its computation, WRITE_ONETM decides yes. It does not matter what M does before it writes 1 on its tape.

 M might have 10000 transitions before it writes 1 on its tape. Still WRITE_ONETM should be able to decide whether M will write 1 on its tape.

M might compute what is 5+7 before it writes 1 on its tape. Still WRITE_ONETM should be able to decide whether M will write 1 on its tape.

M can watch a youtube video and then write a 1 on its tape. Still WRITE_ONETM should be able to decide whether M will write 1 on its tape.
( just kidding)

M might simulate another Turing Machine and then write a 1 on its tape. Still WRITE_ONETM should be able to decide whether M will write a1 on its tape.

Suppose M is simulating another machine T on input w on M’s tapes and if T accepts w, M is writing a 1 on its tape. Still WRITE_ONETM should be able to decide whether M will write a1 on its tape.

Did you hear that?
Isn’t that a solution to decidability of ATM. Whenever you have to decide if a TM T will accept or reject w, all you have to do is:
Construct a new TM, M such that
·        It simulates T on input w on M’s tape
·        If T accepts w, write a 1 on M’s tape.
·        Feed M to WRITE_ONETM
·        If WRITE_ONETM decides yes, T accepts w.
    
But we already know that ATM is undecidable. So it is not possible that WRITE_ONETM  can be decidable if it can solve ATM.

Hence WRITE_ONETM  is undecidable.



    





  

Tuesday, November 2, 2010

Number of full nodes in a binary tree c++

A FULL node in a binary tree is a node that has exactly two non-null children. Write a RECURSIVE function that returns the number of full nodes in a binary tree.



# include <iostream >
#include  <stdio.h >

using namespace std;

struct TreeNode{
    int value;
    TreeNode* left;
    TreeNode* right;
};


int FullNodes(TreeNode* root){
   

    if(root == NULL)        //if tree is empty
        return 0;
    if(root->left == NULL && root->right == NULL)         //leaf nodes
        return 0;
    if(root->left!=NULL && root->right != NULL)             // Full Nodes
        return 1 + FullNodes(root->left) + FullNodes(root->right);

    if(root->left==NULL && root->right != NULL)           //Nodes with no left child
        return FullNodes(root->right);

    if(root->left!=NULL && root->right == NULL)          // Nodes with no right child
        return FullNodes(root->left);

}

/*---------------------------------TEST---------------------------------*/

//Function to insert into binary tree
void insert(TreeNode*& root, int num){
       
    if(root == NULL){
        root = new TreeNode();
        root->value = num;
        root->left = NULL;
        root->right = NULL;   
       
    }
    else if(num < root->value){
        insert(root->left,num);
    }
    else
        insert(root->right,num);
}


int main(){


//creates a tree
    TreeNode* tree = NULL;
    insert(tree, 8);
    insert(tree, 5);
    insert(tree,10);
    insert(tree, 3);
    insert(tree, 9);
    insert(tree, 6);
   


//Full Node count test
    cout << FullNodes(tree); //8 and 5 are full nodes . So output is 2

    return 0;
}

Monday, October 18, 2010

MPI Programming Exam Questions

Feel free to give answers to unanswered questions as comments


1.  Sum of prime factors :

http://people.sc.fsu.edu/%7Ejburkardt/presentations/fdi_2008_lecture8.pdf#page=16

Add up the prime numbers from 2 to N.
Each of P processors will simply take about 1/P of the range of
numbers to check, and add up the primes locally.
When it's done, it will send the partial result to processor 0.

2.  Deadlock avoidance when using blocking send and receive in order.

http://www.cs.ucsb.edu/~hnielsen/cs140/mpi-deadlocks.html


  1.  In MPI, when does a non-blocking recv message return?
          
  1.  What is an MPI communicator?   What is MPI_COMM_WORLD?
            Two processors must be in a common "communicator group" in order to communicate. This is simply  a way for the user to organize  processors into sub-groups. All processors can communicate in the
shared group known as MPI_COMM_WORLD.
  1.  In MPI, what is a process rank?
          
  1.  True or false:  In MPI you set the number of processes when you write the source code.
           No. Number of processes are given on execution.
  1.  Give a short piece of pseudocode that illustrates the master/slave programming model.

  1.  Explain if the following MPI code segment is correct or not, and why:
    Process 0 executes:
MPI_Recv(&yourdata, 1, MPI_FLOAT, 1, tag, MPI_COMM_WORLD, &status);
MPI_Send(&mydata, 1, MPI_FLOAT, 1, tag, MPI_COMM_WORLD);
Process 1 executes:
MPI_Recv(&yourdata, 1, MPI_FLOAT, 0, tag,MPI_COMM_WORLD, &status);
MPI_Send(&mydata, 1, MPI_FLOAT, 0, tag, MPI_COMM_WORLD);

Both are blocking receives waiting for each other to send. System is deadlocked.

  1.  Suppose that process 0 has variable A, and process 1 also has a variable A. Write MPI-like pseudocode to exchange these values between the processes.
          P0:
          send(P1,A)
          recieve(P0,A)

          P1:
          recieve(P1,A)
          send(P0,A)


  1.  Explain the purpose of each of the library calls listed.
·         MPI_Init
·         MPI_Finalize
·         MPI_Comm_rank
·         MPI_Comm_size
·         MPI_Send
·         MPI_Recv
·         MPI_Barrier
·         MPI_Bcast
·         MPI_Scatter
·         MPI_Gather
·         MPI_Reduce
  1.  What is an MPI derived datatype and when would you use one?  Give an example. 
  2. Derived datatypes are datatypes that are built from the basic MPI datatypes.
  3. In MPI, when does a blocking recv message return?
               Blocks until it receives message
  4. True or false:  You can write a program using MPI that will run across all of the cores of your multicore computer in parallel.  Also, if this is possible, indicate if you think this is a good way to write the program.  You must justify your answer to receive credit.  
  5. Discuss marshalling in MPI.
  6. http://books.google.com/books?id=LLdekoUxmr0C&pg=PA86&lpg=PA86&dq=MPI+marshalling&source=bl&ots=aLn5ivDP2i&sig=ElkL0CwR55hVES-tQJjoAKBIIaI&hl=en&ei=U3nITNq7NYrAsAPqytW9DQ&sa=X&oi=book_result&ct=result&resnum=10&ved=0CE0Q6AEwCQ#v=onepage&q=MPI%20marshalling&f=false
  7. When does a blocking send return.
           MPI uses buffering by default. Send returns when message is saved in receivers buffer. Process calling  MPI_Send can continue even if destination has not received the message.

16. Programming question 1 : http://www.cs.usfca.edu/~peter/cs220/mt1_old_key#page=6.pdf
17.  Programming question2 :  http://www.cs.usfca.edu/~peter/cs220/mt1_key#page=4.pdf









    Wednesday, September 29, 2010

    Memory management questions

    Assume you have a small virtual address space of size 64 KB. Further assume that this is a system
    that uses paging and that each page is of size 8 KB.
    (a) How many bits are in a virtual address in this system?

    16 (1-KB of address space needs 10 bits, and 64 needs 6; thus 16).

    (b) Recall that with paging, a virtual address is usually split into two components: a virtual page number (VPN) and an offset. How many bits are in the VPN?

    3. Only eight 8-KB pages in a 64-KB address space.

    (c) How many bits are in the offset?

    16 (VA) - 3 (VPN) = 13.
    Alternately: an 8KB page of course requires 13 bits to address each byte (213 = 8192).

    (d) Now assume that the OS is using a linear page table, as discussed in class. How many entries does this linear page table contain?

    One entry per virtual page. Thus, 8.

    Now assume you again have a small virtual address space of size 64 KB, that the system again uses paging, but
    that each page is of size 4 bytes (note: not KB!).


    (a) How many bits are in a virtual address in this system?

    Still 16. The address space is the same size.

    (b) How many bits are in the VPN?

    14.

    (c) How many bits are in the offset?

    Just 2 (4 bytes).

    (d) Again assume that the OS is using a linear page table. How many entries does this linear page table
    contain?

    2**14, or 16,384.
    ____________________________________________________________________________________

     Consider the following segment table:
    Segment  Base  Length
        0   219    600
        1  2300     14
        2    90    100
        3  1327    580
        4  1952     96
    What are the physical addressed for the following logical addresses?
    (a) 0,430
    (b) 1,10
    (c) 2,500
    (d) 3,400
    (e) 4,112
    • (a) 219 + 430 = 649
    • (b) 2300 + 10 = 2310
    • (c) illegal reference; traps to operating system
    • (d) 1327 + 400 = 1727
    • (e) illegal reference; traps to operating system 

    ___________________________________________________________________________


    Consider a paging system with the page table stored in memory.
    (a) If a memory reference takes 200 nanoseconds, how long does a paged memory reference take?
    (b) If we add associative registers, and 75% of all page-table references are found in the associative registers, what is the effective memory reference time? (Assume that finding a page-table entry in the associative registers takes zero time, if the entry is there.)
    • 400 nanoseconds. 200 ns to access the page table plus 200 ns to access the word in memory.
    • 250 nanoseconds. 75% of the time it's 200 ns, and the other 25% of the time it's 400ns, so the equation is:
      e.a. = (.75*200)+(.25*400)
      ________________________________________________________________________

      A certain computer provides its users with a virtual memory space of 2**32 bytes. The computer has 2**18 bytes of physical memory. The virtual memory is implemented by paging, and the page size is 4K bytes. A user process generated the virtual address 11123456. Explain how the system establishes the corresponding physical location.

      * The virtual address in binary form is

      0001 0001 0001 0010 0011 0100 0101 0110

      Since the page size is 2**12, the page table size is 2**20. Therefore, the low-order 12 bits (0100 0101 0110) are used as the displacement into the page, while the remaining 20 bits (0001 0001 0001 0010 0011) are used as the displacement in the page table.

      __________________________________________________________________________________

    Round robin scheduling questions

    1. Consider N processes sharing the CPU in a round-robin fashion (N>=2). Assume that each context switch takes S msec and that each time quantum is Q msec. For simplicity, assume that processes never block on any event and simply switch between the CPU and the ready queue.

    In the following your answers should be functions of N, S and T.

    a) Find the maximum value of Q such that no process will ever go more than T msec

    Time taken for one process per quantum = quantum,Q+context switch,S

    Max wait time, T = N(Q+S)

    T = NQ+NS
    Q = (T-NS)/N

    b) Find the maximum value of Q such that no process will ever go more than T msecs between executing instructions on the CPU?

    Max wait time, T = N(Q+S) - Q ie.last instruction just before context switch executes at the end of the quantum of the first time when process executes..
    T = NQ+NS-Q
    T = Q(N-1)+NS
    Q = (T-NS)/(N-1)

    2. Suppose that there are two processes, PH and PL, running in a system. Each process is single-threaded. The operating system’s scheduler is preemptive and uses round-robin scheduling with a quantum of q time units.

    The scheduler supports two priority levels, HIGH and LOW. Processes at LOW priority will run only if there are no runnable HIGH priority processes. Process PH is a HIGH priority process.
    It behaves as described in the following pseudo-code:

    while (TRUE) do
    compute for tc time units
    block for tb time units to wait for a resource
    end while

    That is, if this process were the only one running in the system, it would alternate between running for tc units of time and blocking for tb units of time. Assume that tc is less than q.

    Process PL is a low priority process. This process runs forever, doing nothing but computation. That is, it never blocks waiting for a resource.

    a. For what percentage of the time will the low priority process PL be running in this system?Express your answer in terms of tb and tc.

    tb/(tb + tc)

    b. Repeat part (a), but this time under the assumption that there are two HIGH priority processes (PH1 and PH2) and one LOW priority process (PL). Assume that each HIGH priority process waits for a different resource. Again, express your answer in terms of tb and tc. Your answer should be correct for all tb greater than 0 and all 0 less than tc less than q.

    (tb−tc)/(tb+tc)       if tc is less than tb

    0                          if tc greater than or equal to tb
    ------------------------------------------------------------------------------------------------------------
    Suppose a processor uses a prioritized round robin scheduling policy. New processes are assigned an initial quantum of length q. Whenever a process uses its entire quantum without blocking, its new quantum is set to twice its current quantum. If a process blocks before its quantum expires, its new quantum is reset to q. For the purposes of this question, assume that
    every process requires a finite total amount of CPU time.
    (a) Suppose the scheduler gives higher priority to processes that have larger quanta. Is starvation possible in this system? Why or why not?
    No, starvation is not possible. Because we assume that a process will terminate, the worst
    that can happen is that a CPU bound process will continue to execute until it completes.
    When it finishes, one of the lower priority processes will execute. Because the I/O bound
    processes will sit on the low priority queue, they will eventually make it to the head of the
    queue and will not starve.
    (b) Suppose instead that the scheduler gives higher priority to processes that have smaller quanta. Is starvation possible in this system? Why or why not?
    Yes, starvation is possible. Suppose a CPU bound process runs on the processor, uses its
    entire quantum, and has its quantum doubled. Suppose a steady stream of I/O bound processes
    enter the system. Since they will always have a lower quantum and will be selected for
    execution before the process with the doubled quantum, they will starve the original process.
    _______________________________________________________________
    Assume that 3 processes all with requirements of 1 second of CPU time each and
    no I/O arrive at the same time.

    a)What will be the average response time (i.e., average time to
    completion) for the processes Round Robin (RR) scheduling assuming a timeslice of 0.1 sec and no overhead for context switches (i.e., context switches are free).


    Answer: 2.9 seconds

    Explanation:

    Time for completion for process A =0.28
    Time for completion for process B =0.29
    Time for completion for process C = 0.30

    Average time for completion =0. 29
    ________________________________________________________________________________________________________

    Suppose that the operating system is running a round-robin scheduler with a 50 msec time quantum. There are three processes with the following characteristics:

    * Process A runs for 60 msec, blocks for 100 msec, runs for 10 msec and terminates.
    * Process B runs for 70 msec, blocks for 40 msec, runs for 20 msec, and terminates.
    * Process C runs for 20 msec, blocks for 80 msec, runs for 60 msec, and terminates.

    Process A enters the system at time 0. Process B enters at time 10 msec. Process C enters at time 20 msec. Trace the evolution of the system. You should ignore the time required for a context switch. The time required for process P to block is the actual clock time between the time that P blocks and the time that it unblocks, regardless of anything else that is happening.
    Answer:

    Time Running process Events
    0-50 A B enters at time 10. C enters at time 20.
    50-100 B
    100-120 C C blocks until time 200.
    120-130 A A blocks until time 230.
    130-150 B B blocks until time 190
    150-190 Idle B unblocks at time 190
    190-210 B C unblocks at time 200. B terminates at time 210
    210-260 C A unblocks at time 230.
    260-270 A A terminates at time 270.
    270-280 C C terminates at time 280.



    _______________________________________________________________________________________________
    Consider a variant of the round-robin scheduling algorithm where the entries in the ready queue are pointers to process-control-blocks.

    1. What would be the effect of putting two pointers to the same process in the ready queue?

    Doubling the time given to that process.
    2. What would be the major advantages and disadvantages of this scheme?

    Simple scheme which would provide some priority work with minimal modification to scheduler. But, overhead for managing pointers is a nuisance -- what if the process is io waiting or done? Have to remove BOTH pointers from ready queue, etc. Also may increase overhead if same process runs back-to-back -- it was not necessary to switch contexts.
    3. How would you modify the basic round-robin algorithm to achieve the same effect without duplicate pointers?

    Add a simple quantum indicator to PCB.
    _______________________________________________________________________________________________

    For each of the following statements, indicate whether you think it is probably true (T) or probably false (F). Then give a brief (one sentence) reason. There is not necessarily a single correct answer to each question, so your one sentence explanation is the most important part of your answer.

    1. Small time slices always improve the average completion time of a system.

    Probably false: Small time slices will sometimes improve the average response of the system. If the slice is too small, the context switching time will start to dominate the useful computation time and everything (including response time) will suffer.

    2. Using a round robin scheduler, a large time slice is bad for interactive users.

    Probably true: Large time slices can allow non-interactive processes keep control of the CPU for longer periods of time, causing the interactive processes to be less responsive.

    3. Shortest Job First (SJF) or Shortest Completion Time First (SCTF) scheduling is difficult to build on a real operating system.

    Probably true: SCTF scheduling requires knowledge of how much time a process is going to take. This requires future knowledge. You might require a user to specify the maximum amount of time that a process could run (and kill it if it exceeds this amount), then use a variant on SCTF.

    ________________________________________________________________________________

    Wednesday, September 22, 2010

    5 Variable Karnaugh Map Solution

    Simplify the Boolean function
    F(A,B,C,D,E)  = Σ (0,2,4,6,9,11,13,15,17,21,25,27,29,31)

    Writing decimals in binary,

     Decimal    A    B    C    D    E
         0           0    0     0    0    0
         2           0    0     0    1    0
         4           0    0     1    0    0
         6           0    0     1    1    0
         9           0    1     0    0    1
         11         0    1     0    1    1  
         13         0    1     1    0    1
         15         0    1     1    1    1
         17         1    0     0    0    1
         21         1    0     1    0    1
         25         1    1     0    0    1
         27         1    1     0    1    1
         29         1    1     1    0    1
         31         1    1     1    1    1

    Construct two Karnaugh maps for variables A,B,C and D when  E=0 and E=1




    From Karnaugh map E=0,  F0 = A'B'E'

    From Karnaugh map E=1,  F1 = BE+ AD'E

    F  =   F0+  F1

    F   =    A'B'E'   +  BE  +   AD'E

    Wednesday, September 8, 2010

    Recurrence relations examples


    Solve the recurrence relation T(n) = T(n/2) + lg(n) where T(1) = 1 and n = 2k for a nonnegative integer k. Your answer should be a precise function of n in closed form. Note that lg represents the log function base 2.

    T(n) = T(n/2) + lg(n)

    T(n/2) = T(n/4) + lg(n/2)

    T(n/4) = T(n/8) + lg(n/4)

    T(n/8) = T(n/16) + lg(n/8)


    Substituting for T(n)

    T(n) = T(n/2) + lg(n)

    = T(n/4) + lg(n/2)+lg(n)

    = T(n/8)+ lg(n/4)+ lg(n/2)+lg(n)

    = T(n/16)+lg(n/8)+ lg(n/4)+ lg(n/2)+lg(n)


    ----------------------------------------------------------

    ----------------------------------------------------------

    Suppose n= 2**k

    T(n) = 1 + lg (n. n/2 . n/4. . . . . n/2**k-1)

    = 1+ lg(2**k  . 2**k-1 .   2**k-2 .  …… . 2)

    = 1+ lg( 2 **(1+2+3+ - - - +k))

    = 1 + (1+2+3+ ----+k)

    = 1+ k(k+1)/2

    ______________________________________________________________________________

    Solve the recurrence T(n) = T(n-1) + n

    T(n)    = T(n-1) + n
    T(n-1) = T(n-2) + n-1
    T(n-2) = T(n-3) + n-2
    T(n-3) = T(n-4) + n -3

     T(n)  =  T(n-1) + n
             =   T(n-2) + n-1+n
             =   T(n-3) + n-2+n-1+n
             =   T(n-4) + n-3+n-2+n-1+n
    -----------------------------------------------

            Consider n= 2**k
     T(n)  =  T(n-k) + n+n-1+n-2+n-3+ -------------+n-(k-1)
             =  T(n-k) +  n(n-1) - k(k-1)/2
             = T(n-k) +  n**2 - n -lgn(lgn-1)/2
           
    Therefore, asymptotic complexity is n**2



    Tuesday, August 31, 2010

    Delete a linked list using recursion in C++

    #include < iostream >
    using namespace std;

    struct Node{
        int num;
        Node* next;
    };

    //Here address of pointer to the list is passed so that head of the  list  will not be a local variable.

    void DeleteListRecursion(Node*& list){
        if(list == NULL)
            return;
        else
         DeleteListRecursion(list->next);
         delete(list);
         list = NULL;
    }



    Node* Insert(int number, Node* list){
        Node* node = new Node;
        node->num = number;
        node->next = list;
        return node;
    }

    //TEST

    int main(){
           
        Node* list = NULL;

        list = Insert(5,list);
        list = Insert(7,list);
        list = Insert(9,list);
        list = Insert(4,list);
        list = Insert(6,list);
        list = Insert(3,list);
       
        DeleteListRecursion(list);

        return 0;
    }

    Saturday, August 28, 2010

    Data Structures You Should Know

    • UnSorted List Using Arrays
    • UnSorted List Using Linked List
    • Sorted List Using Arrays
    • Sorted List Using Linked List
    • Doubly Linked List
    • Circular List
    • Stack Using Arrays
    • Stack Using LinkedList
    • Queue Using arrays
    • Queue Using LinkedList
    • Recursion Problems
    • Binary Tree
    • Sorting
      • Bubble Sort           Code          Example
      • Selection Sort        Code          Example
      • Insertion Sort         Code          Example
      • QuickSort              Code          Example
      • MergeSort             Code          Example
      • HeapSort             Code          Example
      • Radix Sort             Code          Example
      • Hashing

      Tuesday, June 15, 2010

      Chomsky Hierarchy Diagram for Languages.


      The diagram shows how different classes of languages such as regular, context free, context sensitive, P, NP, PSAPACE, NPSPACE, EXPTIME, NEXPTIME, EXPSPACE, ACCEPTABLE, DECIDABLE, CO-ACCEPTABLE etc are related to each other.

      Half-Sat Problem With Example

      Half-SAT = {F | F is a CNF formula with 2n variables and there is a satisfying assignment in which n variables are set to True and n variables are set to False}.
      Show that Half-SAT is NP-Hard.


      We can show that half-sat is NP-hard by reducing sat to half-sat. We can show that if half-sat has a solution sat also has a solution.

      Let T be a turing machine(transducer) which reduces sat to half-sat. T takes F as input.
      Let F has variables x1,x2,......xn.
      T(F) =  Construct a new CNF formula F' as follows.
                  F' will have variables x1,x2,.......xn from F and new variables y1,y2,...........yn such that xi = ‾yi
                  If a clause is of the form (x1 U x2), convert it into (x1 U x2) ^( ‾y1  U  ‾y2).
      Output(F')
          

      Example 1
      Let F = (x1 U x2 U ‾x3) and  x1=True,  x2=False, x3= True
      F = T U F U T = T
      Introduce three new variables y1,y2 and y3 such that
      x1 =  ‾y1
      x2 =  ‾y2
      ‾x3 =  y3


      F' = (x1 U x2 U ‾x3) ^ (  ‾y1 U ‾y2 U y3).
       x1  = T    y1  =F

      x2  = F    y2  =T
      x3  = T    y3  =F

      Here three variables are assigned True and other three variables are assigned False. Therefore the expression is in half-sat.
      Assigning values to F',  F' = ( T U F U T) ^ (T U F U T) =  True.

      Hence F' evaluates to  true iff F evaluates to  true.

      Let x1=False, x2 = False, x3=False
      F=  false U false U false = False
      F' = (F U F U F) ^ (F U F U F) = F
      Hence F' evaluates to  false iff F evaluates to  false.

      Friday, June 11, 2010

      NP- complete Problems with variations

      K-clique                           video
      Hamiltonian Path                   video lecture
      • Does a hamiltonian path exist in G.
      • Does a hamiltonian path exist from node s to t
      • k-link path 
      • Hamiltonian Circuit       video
      SAT Problems                    video
      Subset Sum Problem              video
      • Partition                       video
      • Multiset                        video
      3-COLOR

      Thursday, June 10, 2010

      Half Clique Problem with examples ( Complexity theory )

      Half-clique problem is described as follows. Given a graph G with n number of vertices, does there exist a clique of G consisting of exactly  half the nodes of G? Show that HALF-CLIQUE is NP-hard by reducing CLIQUE to  HALF-CLIQUE.

       Let f be a transducer which does the reduction.

      f(G,k) =  Construct G' as follows
                    Add n more vertices to G.
                    Select n-k of the newly added vertices, connect them to each other and to all  original vertices in G.
                   
      Output (G')

      Here adding n-k vertices to G can be done in polynomial time. So reduction is done in polynomial time.
      Yes Instance

      In the following example G has 5 nodes. Let k=3. This is a yes instance since G has a clique of k=3( nodes A,B and E).










      Construct G' by adding 5 more nodes (F,G,H,I,J)  to graph G. Connect 5-3=2 of them to each other and to original nodes of G.












      G' has five-clique (A,B,E,F and G). 
      G' will not have 5-clique unless G already has a 3 clique.

      No Instance

      Try yourself.:
      Remove any of the edges forming 3 clique and try to construct G'. Check whether G' can have a 5-clique.


      Clique can be reduced to any fractionof clique using the same logic. To solve (1/m)th clique, add (m-1)n vertices to graph G and connect n-k vertices.
      For example, to solve quarter-clique, add 3n vertices to G and fully connect n-k new vertices.

      Thursday, May 20, 2010

      Oracle Trouble shooting

       When I am trying to sign in to Oracle 11g  in windows xp , I am getting the error  'ORA-12560: TNS:protocol adapter error'

      Solution:
      Try1:
      set local=databasename
      For example, if name of database is orcl, command will be
      C:\ set local = ORCL

      Then log in to sqlplus 
      C:\ sqlplus / as sysdba
      ___________________________________________________________________________________
      Try 2

      If try1 does not work,

      In command prompt do the following:
      C:\ set ORACLE_SID =  databasename
       Important:  The database name has to be in uppercase letters. 

      For example, if name of database is orcl, command will be
      C:\ set ORACLE_SID = ORCL

      Then log in to sqlplus 
      C:\ sqlplus / as sysdba

      SQL*Plus: Release 11.1.0.6.0 - Production on Sun Jun 20 00:06:58 2010


      Copyright (c) 1982, 2007, Oracle.  All rights reserved.


      Connected to:
      Oracle Database 11g Enterprise Edition Release 11.1.0.6.0 - Production
      With the Partitioning, OLAP, Data Mining and Real Application Testing options

      Now you are connected to the database.
      ____________________________________________________________________________________

      Friday, April 16, 2010

      Turing Machine ( True or False )

      1. A Universal Turing Machine can compute anything that any other Turing Machine could possibly compute.
      True

      2.The Turing Test is a test of whether a problem can be solved by a Turing Machine.
      True

      3. Every acceptable language is also decidable.
      False

      4. Decidability is a special case of decidability
      True

      5. Regular languages are decidable
      True

      6. Context free languages are not decidable
      False

      Saturday, March 13, 2010

      Interesting Algorithms

      1. Let A be a list of n elements in non-decreasing order where some elements are replicated.
      The problem is to find the element that appear most frequently. Give an O(k log n) time
      algorithm to solve the problem where k denotes the number of distinct elements in A.


      2. You are given a list A = [a1; a2; - - - ; an] of n distinct numbers in an arbitrary order.
      You are supposed to find two numbers i and j from A such that (i) i < j, (ii) j > i, and
      (iii) satisfying (i) and (ii), aj - ai is minimum over all possible such pairs. Give an O(n) time
      algorithm.


      3. A carpenter has a piece of wood of a certain length that must be cut at positions a1, a2 ..., an where ai is the distance from the left end of the original piece of wood. Notice that after making the first cut, the carpenter now has two pieces of wood; after making the second cut, the carpenter has three pieces of wood, etc. Assume that the cost of making a cut in a piece of wood of length l is equal to l, and is the same no matter which position in that piece of wood is being cut. Let L be the length of the original piece of wood.
      Derive the recurrence relation which could be used to design a recursive algorithm to find the minimum total cost for making all the cuts.


      4. Describe a Θ(n lg n)-time algorithm, that given a set of n integers and another
      integer x, determines whether or not there exist two elements in S whose sum is exactly x. Write
      pseudocode.
      Hint: Sort using mergesort and iterate once after sorting. Θ(n lg n+n) = Θ(n lg n)


      5. Suppose you are given an array A containing n sorted elements followed by lgn unsorted elements. Thus, entire array contains N = n+lgn elements. Can the entire array sorted in O(n) time.

      6.  You are given an array of n numbers. Write an algorithm with O(n)=nlogn  that returns the number of distinct numbers in the array.   

      Friday, March 12, 2010

      Minimum Spanning Tree - True or False

      1. In an undirected graph, the shortest path between two nodes lies on some minimum spanning
      tree.
      A: False.

      2. If the edges in a graph have different weights, then the minimum spanning tree is unique.
      A: True.

      3. If the edge with maximum weight belongs to a cycle, then there exists some MST that
      does not contain this edge.
      A: True

      4. Adding a constant to every edge weight does not change the minimum spanning tree.
      A: True

      5. There can be more than one minimum spanning trees if the weight of the edges are all distinct
      A.False

      6. If all weights are the same, every spanning tree is minimum.
      A. True

      7. Greedy algorithm runs in exponential time
      A. False

      8. A minimum spanning tree should contain all edges of the graph
      A. False

      9. A minimum spanning tree should contain all vertices of the graph
      A.True

      10.Adding an edge to a spanning tree of a graph G always creates a cycle.
      A.False

      11. Adding an edge to a spanning tree connecting two existing vertices of a graph G always creates a cycle.
      A. True

      12. For any cycle in a graph, the cheapest edge in the cycle is in a minimum spanning tree.
      A. False

      Big O, Big Omega, Big Theta

      For each of the following pairs of functions f(n) and g(n), state whether f(n) =
      O(g(n)), f(n) = ­Î©(g(n)), or f(n) = Θ(g(n)), or none of the above:


      (a) f(n) = n2 + 3n + 4, g(n) = 6n + 7
      Sol: f(n) = ­Î©(g(n).


      (b) f(n) = n √n , g(n) = n2 - n
      Sol: f(n) = O(g(n)).


      (c) f(n) = 2**n - n**2, g(n) = n**4 + n**2
      Sol: f(n) = ­­Î©(g(n).


      (d) Assume you have five algorithms with the running times
      listed below (these are the exact running times). How much slower do each of these
      algorithms get when you double the input size
      i) n2
      Sol: 4T(n)

      ii) n3
      Sol: 8T(n)

      iii) 100n2
      Sol: 4T(n)

      iv) nlgn
      Sol: 2n(lg2n) = 2n(1+lgn) ~ 2n lgn
      T(n) = 2T(n)

      v) 2n
      Sol:  [T(n)]


      (e)  Write a big-O expression for 1+2+3+...+n?
      Sol:  O(n2)


      True or false:
      (f) An algorithm with worst case time behavior of 3n takes at least 30 operations for every input of size n=10.
      Sol : False


      (g)  n! = O(nn)
      Sol: True


      (h) nO(1) = O(n2)
      Sol: False.  O(1) can be any large constant .


      (i) All of the following are true.
      • f(n) = O(f(n))
      • c * O(f(n)) = O(f(n)), if c is constant
      • O(f(n)) + O(f(n)) = O(f(n))
      • O(O(f(n))) = O(f(n))
      • O(f(n)) * O(g(n)) = O(f(n)g(n))
      • O(f(n)g(n)) = f(n) * O(g(n))

      Wednesday, February 17, 2010

      merge sort code in c++ to count number of comparisons

      #include < iostream >
      #include < stdio.h >

      using namespace std;

      int count = 0; //count of comparisons
      int n = 0;
      const int MAX_ITEMS = 100;
      void merge(int values[], int leftFirst, int leftLast, int rightFirst, int rightLast);
      void printarray( int a[], int n);
      void mergesort(int a[], int start, int end){  //no significant comparisons are done during splitting
         
          if(start < end){
              int mid = (start+end)/2;   
              mergesort(a,start, mid);
              mergesort(a,mid+1,end);
              merge(a, start,mid, mid+1, end);
          }
      }
      void merge(int values[], int leftFirst, int leftLast, int rightFirst, int rightLast){
              int temparray[MAX_ITEMS];
              int index = leftFirst;
              int saveFirst = leftFirst;

              while((leftFirst <= leftLast)  && ( rightFirst <= rightLast)){//compare and select smallest from two subarrays

                  if(values[leftFirst] < values[rightFirst]){
                      temparray[index]  = values[leftFirst]; //smallest assigned to temp
                      leftFirst++;
                  }
                  else
                  {
                      temparray[index]  = values[rightFirst];
                      rightFirst++;
                  }
                  index++;
                  count++;  //count of comaparisons done during merge. One comparison is done per iteration of while loop. 
              }
             
              while(leftFirst <= leftLast){ 

                  temparray[index] = values[leftFirst];
                  leftFirst++;
                  index++;
                 
              }
              while(rightFirst <= rightLast){
                  temparray[index] = values[rightFirst];
                  rightFirst++;
                  index++;
                 
              }
             
              for(index = saveFirst; index <= rightLast; index++)//copies from temp array to values array
                  values[index] = temparray[index];
              printarray(values,n);
              cout << endl;

          }

      void printarray( int a[], int n){
          for (int i=0; i < n; i++)
              cout << a[i] << "  ";
      }

      int main(){
         
          cout << "Enter number of  elements to be sorted : ";
          cin >>n;

          int a[MAX_ITEMS];
         
      for (int i=0; i < n; i++){
              if(i==0)
                  cout << "Enter the first element: ";
              else
                  cout << "Enter the next element: ";
              cin >>     a[i];
          }
         
          int start = 0;
          int end = n-1;
            mergesort(a, start, end);
          printarray(a, n);

          cout << endl;
          cout  << "Number of comparisons : "<< count << endl;
      }

      Friday, February 12, 2010

      systems programming (UNIX, C)

      How to toggle lettercases.

      struct termios info;
          int   rv = tcgetattr(0, &info);
          if(info.c_oflag & OLCUC)
                  info.c_oflag &=  ~OLCUC;
          else
              info.c_oflag |= OLCUC;
          tcsetattr(0, TCSANOW, &info);

      ______________________________________________________________

      How can we measure CPU time ( user + kernel) of a program in UNIX.

      1. Use clock() system call to measure CPU time.
            The clock() function returns the  amount  of  CPU  time  (in
           microseconds)  used  since  the first call to clock() in the
           calling process.
            example
                time_t time = clock();
                int i=0;
                while(i<10000){
                    printf("%d\n", i);
                    i++;
                }
               time = clock();
               printf( "\nCPU time : %ld\n" , time);// prints CPU time  taken  to print numbers from 0 to 999


      2.  Use time command
           example
           bash$     time gcc file.c
      3.  Use getitimer and setitimer with ITIMER_REALPROF

      Friday, February 5, 2010

      IMac

      How can I access(SSH) a remote machine from my iMac.

      Access the terminal from application -> Utilities -> Terminal
      On prompt type ssh -l uername host 
      On prompt for password ,  type pasword

      How can I switch between two terminal windows in iMac
      command- ~

      Wednesday, February 3, 2010

      c programmin errors, warnings and solutions

      Warnings:
      1.  warning: incompatible implicit declaration of built-in function 'malloc'
        solution - add #include < stdlib.h> 
      2.  warning: incompatible implicit declaration of built-in function 'strlen'
        solution-   add #include < string.h>

      Friday, January 8, 2010

      C programming

      Code to deep copy a string in C

      char*  deepcopy(char* str1){
              char *copy  = malloc (strlen(str1));
              sprintf(copy, str1);
              return copy;
      }




      1. When I compile, I am getting the error: 'EOF' undeclared (first use in this function)

      Solution: #include < stdio.h >

      2. In visual studio output window just flashes and disappears. What can I do to make it stay.

      Run the program using Ctrl F5 instead of just F5


      3. When I compile a simple program I am getting the error "Cannot open include file: 'iostream.h': No such file or directory". How can I fix this.

      Check if iostream is stored in C:\Program Files\Microsoft Visual Studio 8\VC\include folder. Sometimes it is stored just as iostream instead of iostream.h . If that is the case add #include < iostream> omitting .h

      4.LINK : fatal error LNK1104: cannot open file 'C:\Documents and Settings/...............

      Usual reason is that the same program is running and a terminal window is waiting for input.