Difference between revisions of "CompSciWeek4"

From Predictive Chemistry
Jump to: navigation, search
(Homework 3 - due Monday, Sept. 22)
m (Class 1: Formal Proofs)
 
(3 intermediate revisions by the same user not shown)
Line 12: Line 12:
 
A: The Pope and I make 2, but 1+1 = 1, so I and the Pope are
 
A: The Pope and I make 2, but 1+1 = 1, so I and the Pope are
 
one and the same.
 
one and the same.
  +
** This is a famous argument from the Cambridge mathematician G. H. Hardy, written to satisfy J. E. McTaggart's doubt that anything can be proven from a false proposition. It was cited in the first chapter of Sir Harold Jeffreys' "Scientific Inference" and in a discussion discounting set theory in appendix B to Jaynes "Probability Theory, the Logic of Science."
 
** Anything that proves a false statement is false.
 
** Anything that proves a false statement is false.
 
** If a then False => ~a
 
** If a then False => ~a
 
* Induction
 
* Induction
   
= Class 2: Using HPC Resources =
+
= Class 2: Code Profiling Examples =
* accessing binaries and libraries, using modules
+
* Creating and loading a module
* using scratch space
+
* formatting the results as a flat text file
* submitting a job script
+
* reading text files into arrays (numpy) and plotting (pylab)
* managing queued jobs
 
* advanced scripting tips and tricks
 
   
 
== Profiling and Plotting Code ==
 
== Profiling and Plotting Code ==
Line 82: Line 83:
 
Both the shell and python scripts above have been modified from the version in class to show how to read command-line arguments. In the shell, the number of arguments is in the variable '''$#''', and the '''if''' statement checks for the right usage. '''$0''' is the program name and ''is not'' counted in '''$#'''. In python, the arguments are in the list, sys.argv, sys.argv[0] is the program name, and ''is'' counted in len(sys.argv). The '''assert''' statement checks for the right usage, and immediately terminates with an error (raises an AssertionError exception) if the assertion fails.
 
Both the shell and python scripts above have been modified from the version in class to show how to read command-line arguments. In the shell, the number of arguments is in the variable '''$#''', and the '''if''' statement checks for the right usage. '''$0''' is the program name and ''is not'' counted in '''$#'''. In python, the arguments are in the list, sys.argv, sys.argv[0] is the program name, and ''is'' counted in len(sys.argv). The '''assert''' statement checks for the right usage, and immediately terminates with an error (raises an AssertionError exception) if the assertion fails.
   
= Homework 3 - due Monday, Sept. 22 =
+
= Homework 3 - due Monday, Sept. 29 =
 
* Coding Problems
 
* Coding Problems
** Write a code to build a graph representation of what's inside the /usr/share/X11 directory on circe. Use (os.walk and/or os.listdir, os.path.join). see example graph representation above!
+
** Write a code to build a graph representation of what's inside the /usr/share/X11 directory on circe. Use (os.walk and/or os.listdir, os.path.join). see [[Code:networkx|example graph representation]]
 
**# How many directories are there total?
 
**# How many directories are there total?
 
**# What is the maximum depth, counting files and dirs so the depth of a dir with no files/subdirs = 0, and with any files/subdirs = 1?
 
**# What is the maximum depth, counting files and dirs so the depth of a dir with no files/subdirs = 0, and with any files/subdirs = 1?
Line 92: Line 93:
 
* Problems from Algorithms in Python:
 
* Problems from Algorithms in Python:
 
** 2-4 (read the text definition of big-O notation)
 
** 2-4 (read the text definition of big-O notation)
** 2-7 (write and turn in a test code for each case mentioned -- see above)
 
  +
** 2-7 (write and turn in a test code for each case mentioned -- see above) You should show the test function you used and write the answer to the question in big-O (or theta) notation, e.g. list append was tested with "def lappend(n): l = [1]*n; l.append(1);" and the append operation takes constant time, O(1).
 
** 2-10
 
** 2-10
 
*** The question describes a binary tree. Internal nodes are nodes with two children. Leaf nodes have no children. Try drawing some trees to get a sense of the answer for this one.
 
*** The question describes a binary tree. Internal nodes are nodes with two children. Leaf nodes have no children. Try drawing some trees to get a sense of the answer for this one.

Latest revision as of 15:50, 13 October 2014

Reading Assignment

  • Algorithms, Chapter 2-3

Class 1: Formal Proofs

  • Complexity Notation
  • Rules - state assumptions, use logical operations on assumptions
  • Modus Ponens - (a, a =>b) => b
  • Modus Tollens - (a => b, ~b) => ~a
  • Disjunctive Syllogism - ((a v b), ~a) => b
  • Contradiction - (a, ~a) => Yikes!
  ex. from 1+1 = 1, prove that I am the Pope.
  A: The Pope and I make 2, but 1+1 = 1, so I and the Pope are
     one and the same.
    • This is a famous argument from the Cambridge mathematician G. H. Hardy, written to satisfy J. E. McTaggart's doubt that anything can be proven from a false proposition. It was cited in the first chapter of Sir Harold Jeffreys' "Scientific Inference" and in a discussion discounting set theory in appendix B to Jaynes "Probability Theory, the Logic of Science."
    • Anything that proves a false statement is false.
    • If a then False => ~a
  • Induction

Class 2: Code Profiling Examples

  • Creating and loading a module
  • formatting the results as a flat text file
  • reading text files into arrays (numpy) and plotting (pylab)

Profiling and Plotting Code

lib/ltime.py <source lang="python">

  1. Functions to time

from random import random

def lappend(n): l = [1]*n l.append(random())

def lmod(n): l = [1]*n l[0] = random()

def make_list(n): l = [1]*n

def rand(n): random() </source>

Make sure that PYTHONPATH includes your lib directory (where ltime.py lives), e.g.

 PYTHONPATH=/usr/lib/python2.7/site-packages:$HOME/scicomp/lib

time.sh <source lang="bash">

  1. !/bin/bash

if [ $# -ne 1 ]; then

 echo "Usage: $0 <function name from ltime.py>"
 exit 1

fi

cmd=$1

k=1 for((n=1;n<8;n++)); do

 echo -n "$k "
 python -m timeit -s"import ltime as m;" "m.$cmd($k)" | sed 's/.*[^0-9]\([0-9]*\.\)/\1/; s/ usec.*//;'
 k=$(($k*2))

done </source>

analyze.py <source lang="python"> from numpy import * import sys, pylab

assert len(sys.argv) == 2, "Usage: %s <function name>"%sys.argv[0]

name = sys.argv[1] x = fromfile(name+".dat", sep=' ') x = reshape(x, (-1,2)) pylab.plot(x[:,0], x[:,1]) pylab.savefig(name+".png") </source>

Both the shell and python scripts above have been modified from the version in class to show how to read command-line arguments. In the shell, the number of arguments is in the variable $#, and the if statement checks for the right usage. $0 is the program name and is not counted in $#. In python, the arguments are in the list, sys.argv, sys.argv[0] is the program name, and is counted in len(sys.argv). The assert statement checks for the right usage, and immediately terminates with an error (raises an AssertionError exception) if the assertion fails.

Homework 3 - due Monday, Sept. 29

  • Coding Problems
    • Write a code to build a graph representation of what's inside the /usr/share/X11 directory on circe. Use (os.walk and/or os.listdir, os.path.join). see example graph representation
      1. How many directories are there total?
      2. What is the maximum depth, counting files and dirs so the depth of a dir with no files/subdirs = 0, and with any files/subdirs = 1?
      3. How many references are there to each directory -- node in-degree? (ignore symlinks)
      4. What is the maximum number of references from a directory (i.e. maximum out-

degree)?

  • Problems from Algorithms in Python:
    • 2-4 (read the text definition of big-O notation)
    • 2-7 (write and turn in a test code for each case mentioned -- see above) You should show the test function you used and write the answer to the question in big-O (or theta) notation, e.g. list append was tested with "def lappend(n): l = [1]*n; l.append(1);" and the append operation takes constant time, O(1).
    • 2-10
      • The question describes a binary tree. Internal nodes are nodes with two children. Leaf nodes have no children. Try drawing some trees to get a sense of the answer for this one.
    • 2-12 The idea of weighted edges is to add information to the graph, e.g. 42 steps between node u and node v.
      • Specifically, explain how looking up children of a given node would work in this representation (how complicated is finding all children of node 'u').