Friday, December 3, 2010

Simple MapReduce Program Example in Python

Given a set of integers, compute the sum of their square values.
(Assumption: You already know how to run a map reduce program using an input file and how to check the output)

mapper.py

#!/usr/bin/env python

import sys

# input comes from STDIN (standard input)
for line in sys.stdin:
    # remove leading and trailing whitespace
    line = line.strip()
    # split the line into numbers
    numbers = line.split()
    for number in numbers:
        num = int(number)
        square = num*num
        print '%s\t%s' % (1,square)

reducer.py

#!/usr/bin/env python

import sys

#sum initialized to zero
sum=0
# input comes from STDIN
for line in sys.stdin:
    # remove leading and trailing whitespace
    line = line.strip()
    # parse the input we got from mapper.py
    word, square = line.split('\t', 1)
    # convert square(currently a string) to int
    try:
        square = int(square)
        sum=sum+square
    except ValueError:
        # count was not a number, so silently
        # ignore/discard this line
        pass
#print sum of squares
print '%s'% (sum)
___________________________________________________________________
input.txt
1 2 3 4


output/part-0000
30

NP completeness exam questions

Consider a reduction of problem A to problem B. What is the most precise claim you
can make about problem B for each of the following situations?

a) A is NP-complete and the reduction is in polynomial time.
 NP-hard. (At least as hard as NP-complete problem.)

b) A is in polynomial time and the reduction is also in polynomial time.
B could be anything.

c) A is NP-complete and the reduction is in Pspace.
B could be anything.

d) A is in nondeterministic polynomial time and the reduction is in polynomial time.
B is at least as hard as A, but nothing more can be said.

e) A requires exponential time and the reduction is in polynomial time.
B must requires polynomial time for deciding.

f) A is Pspace complete and the reduction is in Pspace.
B could be anything.
_______________________________________________________________________
Suppose you could reduce an NP complete problem to a polynomial time problem in
polynomial time. What would be the consequence?
What if the reduction required exponential time?

If we could reduce an NP-complete problem to a problem in P, then NP will
be equal to P. If the reduction required exponential time, then there is not
special consequence. (In fact any NP-complete problem can be reduced to
a polynomial time solvable problem using exponential time reduction.)
_______________________________________________________________________

Wednesday, December 1, 2010

True-Sat problem is NP-complete

TRUE-SAT = {boolean expressions E in conjunctive normal form that
                   (1) are true when all variables are set true, AND ALSO
                   (2) have some other truth assignment that makes E true}
            This can also be stated (in Garey-Johnson notation) as:  
             Instance: Set of clauses C1,C2,....Ck over the boolean variables u1,u2,....um  
             Question:
                     (1) are all Ci true when u1,u2,....um are set to true, and
                     (2) is there another satisfying truth assignment (not all variables set to true)?



First, Show TRUE_SAT is in NP. (This is left as an exercise)
Then show TRUE-SAT is NP-hard.
If an expression is not true when all variables are true, then it is surely not in TRUE-SAT.

To show TRUE-SAT is NP-complete, we reduce SAT to it. Suppose we are given an expression E with variables a,b,c.

Convert E to E' as follows:
Test if E is true when a = T, b=T and c=T.
If so, we know E is satisfiable, then add (a U a') to E.
                 E' = E  U ( a U a' )
                 // E satisfies first condition. a U a' satisfies the second condition.
Otherwise
                E' = E U (  a U b U c )
                //Adding  a U b U c satisfies first condition. If E is in SAT, that satisfies the second condition.
 Therefore, TRUE_SAT accepts only when E is in SAT. Hence TRUE-SAT is reducible to SAT and TRUE-SAT is NP-complete.