Difference between revisions of "CompSciSpring2018"

From Predictive Chemistry
Jump to: navigation, search
(In-Class Problems)
m (In-Class Problems)
 
(14 intermediate revisions by the same user not shown)
Line 159: Line 159:
 
* Midterm Project Description [[Media:Midterm2018.pdf]] (2 Mar., 2018)
 
* Midterm Project Description [[Media:Midterm2018.pdf]] (2 Mar., 2018)
 
** [[Agile|Agile Development Tips & Tricks]]
 
** [[Agile|Agile Development Tips & Tricks]]
<!--* Final Project Description [[Media:Final2018.pdf]] (25 Apr., 2018)-->
 
 
* Midterm Projects:
  +
<!--
 
** Luis & Matteo [https://drive.google.com/file/d/1tTkwRuJts2SnZcEkVlHO51ItacFfHYL3/view?usp=sharing_eil&ts=5a9d6cb2]
 
** Misty & Cherinee [https://docs.google.com/presentation/d/17UTtjv-2d19mTtYt9oVwrWI5eOuo7l2qg4zYsOGXTxw/edit?ts=5a9d7601#slide=id.gc6f90357f_0_0]
 
** Pauline [http://predictivestatmech.org/code/pauline.py]
 
** Nasha [http://predictivestatmech.org/code/nasha.py]
  +
** Daniel [https://drive.google.com/drive/folders/1NNQEG7O8Pi1jGeMaPwcWYEBVI62BhYgD?usp=sharing]
 
** Melissa, Casidy & Paul [https://docs.google.com/presentation/d/10U8loA8uPMypoEWO163e6smZ9JnrPXjyQl97Duz8Hg8/edit?usp=sharing_eip&ts=5a9d6081]
  +
** Kimberly [http://predictivestatmech.org/code/kimberly.py]
  +
** Eric
  +
** Jovina & Jesunifemi [http://predictivestatmech.org/code/jovina-nifemi.py]
  +
** Jakin [https://docs.google.com/presentation/d/1LFh-_n6J2jC6FBEl-X4WTJqYyhwjnVucGdMBKsUFsjU/edit?usp=sharing_eip&ts=5a9d8fa1]
  +
** Bryan [http://predictivestatmech.org/code/bryan.py]
  +
-->
 
* [[Final2018|Final Project Description]] (25 Apr., 2018)
   
 
== Textbooks and Resources ==
 
== Textbooks and Resources ==
Line 196: Line 210:
 
* [[Media:Week2.pdf|Function Syntax]] and [[Media:Week3.pdf|For-Loop Syntax]] cheat-sheets.
 
* [[Media:Week2.pdf|Function Syntax]] and [[Media:Week3.pdf|For-Loop Syntax]] cheat-sheets.
 
* [[Media:PlottingMaps.pdf|Plotting Maps]]
 
* [[Media:PlottingMaps.pdf|Plotting Maps]]
  +
* [https://m3challenge.siam.org/sites/default/files/uploads/siam-guidebook-final-press.pdf SIAM Notes on Mathematical Modeling]
  +
* [[Code:libxdrfile|libxdrfile]]
  +
* The simple, yet amazing harmonic oscillator
  +
** [http://hyperphysics.phy-astr.gsu.edu/hbase/shm.html Classical Oscillator]
  +
** [http://hyperphysics.phy-astr.gsu.edu/hbase/quantum/hosc.html Quantum Oscillator]
  +
** [http://hyperphysics.phy-astr.gsu.edu/hbase/quantum/hosc6.html#c2 Probabilities]
  +
* [[Code:MDIntro]]
 
<!--
 
<!--
 
** [[Turing Exercise]]
 
** [[Turing Exercise]]
 
** [https://predictivestatmech.org/code/example.dcd Example DCD File]
 
** [https://predictivestatmech.org/code/example.dcd Example DCD File]
 
* [[Media:Week-5.2017.pdf|Week 5 Notes]]
 
* [[Media:Week-5.2017.pdf|Week 5 Notes]]
* [[Media:PlottingMaps.pdf|Week 6 Notes]]
 
 
** [[Code:PyTesting]]
 
** [[Code:PyTesting]]
 
* [[Media:NumpyArrays.pdf|Week 7 Notes]]
 
* [[Media:NumpyArrays.pdf|Week 7 Notes]]
Line 216: Line 236:
 
* [[Media:Matrix.pdf|Matrix Algorithms]]
 
* [[Media:Matrix.pdf|Matrix Algorithms]]
 
-->
 
-->
* Midterm Projects:
 
** Luis & Matteo [https://drive.google.com/file/d/1tTkwRuJts2SnZcEkVlHO51ItacFfHYL3/view?usp=sharing_eil&ts=5a9d6cb2]
 
** Misty & Cherinee [https://docs.google.com/presentation/d/17UTtjv-2d19mTtYt9oVwrWI5eOuo7l2qg4zYsOGXTxw/edit?ts=5a9d7601#slide=id.gc6f90357f_0_0]
 
** Pauline [http://predictivestatmech.org/code/pauline.py]
 
** Daniel
 
** Nasha [http://predictivestatmech.org/code/nasha.py]
 
** Melissa, Casidy & Paul [https://docs.google.com/presentation/d/10U8loA8uPMypoEWO163e6smZ9JnrPXjyQl97Duz8Hg8/edit?usp=sharing_eip&ts=5a9d6081]
 
 
=== In-Class Problems ===
 
 
* Week 1: [[Media:CalculatorProbs.pdf|Calculator Problems]]
 
** Solutions: [http://predictivestatmech.org/terms/term.html]
 
** This solution uses python list comprehensions, and shows that the slowest step is actually unit conversion!
 
* Week 2: [[Media:Pseudocode.pdf|Pseudocode Problems]]
 
** We will work these in class. For practice, be sure you have python list ideas down!
 
** Solutions: [https://docs.google.com/document/d/1HRkSdVNsTUydY4cVx67jRHRyhi5gL6nneEj9IX89wlE/edit?usp=sharing]
 
* Week 3: [[Media:Plotting1.pdf|Plotting 1]]
 
** [https://stackoverflow.com/questions/11815538/how-to-import-matplotlib-in-python#11830055 Notes on installing Matplotlib on your own computer]
 
** Challenge: actions vs. functions -- what does the following code do: "[ print(1) for i in range(10) ]" (python3 only)
 
* Week 4: [[Media:Plotting2.pdf|Plotting 2]]
 
** Solutions: [https://docs.google.com/document/d/1AanEFQmxLxoNM-662MHxsxToY3wsqovNx6SwcPQuBIg/edit?usp=sharing]
 
* Week 5: Read pseudocode and plotting, focusing on types and syntax. Also, re-read the end of chapter summaries for Ch. 3, 5 and 6.
 
* Week 6: Read the following pieces on first order logic [http://bayes.wustl.edu/etj/prob/book.pdf Chapter 1 of Jaynes' Book] [http://web.cecs.pdx.edu/~cs410aph/Lectures/Smalltalk%20II/Dijkstra%20on%20Hamming%27s%20Problem.pdf Hamming's Problem], and pages 71-80 in Chapter 4 of Hetland's Algorithms book (on induction). Just skip directly to chapter 4.
 
** Solutions [https://docs.google.com/document/d/14QZxh2SAWl3JE-SzVyemkAv4p9TBChO-jzZ5YpT8BZs/edit?usp=sharing]
 
* Week 7: Review Hetland, Chapters 3-7. We'll do a series of parsing problems and flex our collective induction and recursion muscles.
 
** [https://docs.google.com/document/d/1i04oL7zJsD3KzvsPIaHBgwaOZ78UZvz5ReRDF-PlyF0/edit?usp=sharing Week 7 Problems]
 
* Week 8: [[Media:Minimization1.pdf|Minimization Problems!]]
 
   
 
=== Review Notes on Sequences, list comprehensions and arrays ===
 
=== Review Notes on Sequences, list comprehensions and arrays ===
Line 334: Line 327:
   
 
</source>
 
</source>
  +
  +
 
=== In-Class Problems ===
  +
 
* Week 1: [[Media:CalculatorProbs.pdf|Calculator Problems]]
 
** Solutions: [http://predictivestatmech.org/terms/term.html]
 
** This solution uses python list comprehensions, and shows that the slowest step is actually unit conversion!
 
* Week 2: [[Media:Pseudocode.pdf|Pseudocode Problems]]
 
** We will work these in class. For practice, be sure you have python list ideas down!
 
** Solutions: [https://docs.google.com/document/d/1HRkSdVNsTUydY4cVx67jRHRyhi5gL6nneEj9IX89wlE/edit?usp=sharing]
 
* Week 3: [[Media:Plotting1.pdf|Plotting 1]]
 
** [https://stackoverflow.com/questions/11815538/how-to-import-matplotlib-in-python#11830055 Notes on installing Matplotlib on your own computer]
 
** Challenge: actions vs. functions -- what does the following code do: "[ print(1) for i in range(10) ]" (python3 only)
 
* Week 4: [[Media:Plotting2.pdf|Plotting 2]]
 
** Solutions: [https://docs.google.com/document/d/1AanEFQmxLxoNM-662MHxsxToY3wsqovNx6SwcPQuBIg/edit?usp=sharing]
 
* Week 5: Read pseudocode and plotting, focusing on types and syntax. Also, re-read the end of chapter summaries for Ch. 3, 5 and 6.
 
* Week 6: Read the following pieces on first order logic [http://bayes.wustl.edu/etj/prob/book.pdf Chapter 1 of Jaynes' Book] [http://web.cecs.pdx.edu/~cs410aph/Lectures/Smalltalk%20II/Dijkstra%20on%20Hamming%27s%20Problem.pdf Hamming's Problem], and pages 71-80 in Chapter 4 of Hetland's Algorithms book (on induction). Just skip directly to chapter 4.
 
** Solutions [https://docs.google.com/document/d/14QZxh2SAWl3JE-SzVyemkAv4p9TBChO-jzZ5YpT8BZs/edit?usp=sharing]
 
* Week 7: Review Hetland, Chapters 3-7. We'll do a series of parsing problems and flex our collective induction and recursion muscles.
 
** [https://docs.google.com/document/d/1i04oL7zJsD3KzvsPIaHBgwaOZ78UZvz5ReRDF-PlyF0/edit?usp=sharing Week 7 Problems]
 
* Week 8: [[Media:Minimization1.pdf|Minimization Problems!]]
  +
* Week 10: [[Media:Minimization2.pdf|Molecular Optimization]]
  +
* Week 11: [[Media:Dynamics.pdf|Dynamical Problems]]
  +
* Week 12: [[Media:InvPend.pdf|Inverse Pendulum]]
  +
** [[Simulating Rule 110]]
  +
* Week 13: No reading, simulate dynamics in the final project, scale up Rule 110 simulation and calculate histograms of local densities.
  +
* Week 15: [[Media:Stats.pdf|Elementary Statistics]]
   
 
=== Homework ===
 
=== Homework ===
   
 
* Homework will be assigned through Repl.IT
 
* Homework will be assigned through Repl.IT
  +
** Homework 6 is described below
 
* Project presentations will be on Mon.-Wed., Mar 5-7
 
* Project presentations will be on Mon.-Wed., Mar 5-7
* Final Project presentations will be during exam time, Wed., May 2
 
  +
* Final Projects are due by the end of the last week of classes.
  +
  +
'''HW6''' (hand in on Wednesday, April 18)
  +
<blockquote>
  +
After reading through the SIAM guide on mathematical modeling, formulate an interesting question about the chemical behavior of mixed solvent systems. Don't be too constrained by what can and can't be simulated with MD, but do create a well-designed problem statement, list some possible assumptions that could help narrow down the problem scope, and clearly explain any quantitative variables that can be defined. It will help to reference chemical literature, including textbooks from your past courses.
  +
</blockquote>
  +
  +
'''Homework 7''' (hand in on Monday, April 23)
  +
<pre>
  +
Create pseudocode and identify relevant python/numpy/scipy
  +
data structures and functions for carrying out
  +
each of the following MD data analysis tasks.
  +
Be explicit about data types you choose for each variable,
  +
as well as the inputs and outputs of each function you define.
  +
Your pseudocode should be returned neatly handwritten or typed
  +
in outline form and should present a complete, workable solution.
  +
  +
1. How do I determine the best solvent for reaction chemistry?
  +
  +
Chemical reactions at an industrial scale are always carried
  +
out under optimized conditions. Solvent composition (i.e. the
  +
mole fraction of each solvent molecule) are one factor
  +
that can be tuned in order to control reaction rate, yield, selectivity,
  +
and ease of purification. One factor that can be modeled
  +
is the difference in solubility between reactant and product molecules.
  +
The solubility product constant for a single molecule, K, is defined as
  +
the maximum concentration the molecule can be present in solution
  +
before further additions of the molecule `crash out' and form a solid
  +
rather than dissolving.
  +
  +
Let's test the linear solvation free energy model,
  +
  +
K(M) = K0 exp( sum_j n_j(M) u_j ),
  +
  +
where K0 and k_j are constants for all molecules (M),
  +
the sum runs over the 'j' functional groups
  +
present in molecule M, with n_j(M) the number of functional
  +
groups of type 'j' in that molecule.
  +
  +
Write pseudocode for a "guess and check" method that
  +
tries out possible parameters, K0 and u_j, to decide
  +
how well those parameters perform.
  +
Assume you have a table of K(M) and n_j(M) values available,
  +
and can call the function "generate_params" to generate a new
  +
parameter guess.
  +
  +
2. How fast is a 'diffusion-limited' reaction?
  +
  +
Reaction rates depend on how many times the individual
  +
molecules involved in each step of the reaction collide.
  +
For a bimolecular reaction like an enzyme-substrate binding
  +
event, E + S -> (ES), the reaction rate can be limited
  +
by diffusion when binding is very strong (so that every collision
  +
leads to "irreversible" binding). MD can't simulate chemical
  +
reactions, so the analysis will be limited to determining whether
  +
the enzyme-substrate becomes small, indicating binding.
  +
  +
Assume you have been given a trajectory of an enzyme in solution
  +
with 100 substrate molecules. Define a (distance-based) criteria
  +
for determining if a complex has formed when given the positions of
  +
12 atoms on a key residue of the enzyme and all 10 atoms of the substrate.
  +
This criteria should be stated in terms of a function, with well-typed
  +
inputs and outputs.
  +
Then write pseudocode that uses the entire trajectory (and the function)
  +
to determine: i) the exact time-point (frame) at which an
  +
enzyme-substrate complex forms, and ii) whether that complex
  +
ever unbinds during the remainder of the simulation trajectory.
  +
  +
For this exercise, you will need to use steps that define and use labels
  +
for substrates and/or atoms like "loop through substrates,"
  +
"label that substrate as 'k'", and "check substrate 'k'", etc.
  +
Like adding main characters to a story, it's important to know when to
  +
give certain pieces of data individual names and types.
  +
</pre>
  +
  +
'''Homework 8''' - Due Friday, April 27. (hand in on Wednesday or email by Friday COB).
  +
<pre>
  +
Revise your midterm project code using what you've learned
  +
in this class to make it more structured and maintainable.
  +
In some cases, this may simply require adding documentation
  +
and in other cases, this may be simplest to accomplish
  +
with a complete re-write.
  +
  +
The completed code must contain: 1) at least two functions,
  +
documented with input and output types and explanations,
  +
as well as example input -> output pairs, 2) at least one
  +
loop, and 3) at least one new data structure, with
  +
a documented type and an example instance that matches the given type.
  +
One way to satisfy the last requirement is opting to change
  +
the shape of an existing array in the code or to
  +
regroup data into an array, list, or dictionary.
  +
</pre>

Latest revision as of 13:31, 22 April 2018

Introduction to Scientific Computing

Course Info

  • Course Numbers CHM 4932-004/6938-010, CRN: 20438/20439
  • Credit Hours: 3
  • Meeting Dates: January 8 - April 25, 2018
    • No Class Jan. 15 or Mar. 12-18
  • Meeting Times: Mon. and Wed., 12:30-13:45 in SCA 222
    • Midterm Project Presentations: Mon.-Wed., Mar 5 and 7.
    • Final Project Presentations: Wed., May 2, 10:00am – 12:00pm (scheduled final date)
    • Lab Session & Office Hours: Fri. 12:30-13:45 in IDR 214 (or SCA 222 by announcement)
$ for((i=1;i<=4;i++)); do cal $i 2018; done	
	
    ____ January 2018 ____      _____ March 2018 ____
   | Su Mo Tu We Th Fr Sa |    | Su Mo Tu We Th Fr Sa
   |     1  2  3  4  5  6 |    |              1 (2) 3
 1 |  7  8  9*10 11 12 13 |  9 |  4 (5) 6 (7) 8  9 10
 2 | 14  x 16*17 18 19 20 |    |     x     x         
 3 | 21*22 23 24 25 26 27 | 10 | 18*19 20 21 22 23 24
 4 | 28*29 30 31          | 11 | 25*26 27 28 29 30 31

    ___ February 2018 ____      _____ April 2018 ____
   | Su Mo Tu We Th Fr Sa |    | Su Mo Tu We Th Fr Sa
   |              1  2  3 | 12 |  1 *2  3  4  5  6  7
 5 |  4 *5  6  7  8  9 10 | 13 |  8 *9 10 11 12 13 14
 6 | 11*12 13 14 15 16 17 | 14 | 15*16 17 18 19 20 21
 7 | 18*19 20 21 22 23 24 | 15 | 22*23 24(25)26 27 28
 8 | 25*26 27 28          |    | 29  x

Week  Topic
 1    Calculator Form
 2    Pseudocode
 3    Plotting 1 (Excel)
 4    Plotting 2 (Matplotlib and linear spaces)
 5    Abstraction 1 (advanced functions and oo method)
 6    Abstraction 2 (Booleans and logic)
 7    2D transforms and iterated maps
 8    Optimization 1 (Newton-Rhapson)
 9     midterm project presentations
10    Optimization 2 (MD Energy)
11    ODE and SDE algorithms
12    Statistics
13    Complexity
14    Circe and Linux crash course
15    Compile & install process, ctypes

Overview

Mathematical models in the natural sciences take many shapes and forms. The moment they become more complicated than tracking a few variables over time, we have to put away our Excel spreadsheets and make use of a solid foundation in computational science. The course establishes this foundation by introducing data structures and algorithms used in everyday scientific computing using examples in the python scripting language. By the end of the Semester, students will be able to navigate the Linux command-line to go from a mathematical model to a numerical solution. Best practices for working with computer code and visualizing data will also be covered.

Topics

  1. Languages
    • bash (shell)
    • python (scripting)
    • stack & register-based machine languages: overview of C and Fortran syntax & compiling with gcc, ‘thread safety’
    • Shared libraries & language inter-operability (FFI)
  2. Algorithms
    • Horner’s algorithm for polynomials
    • Newton’s root finding algo.
    • Overview of general optimization algo-s (example: linear & nonlin. least-squares)
    • Numerical integration (scripted, plus gnu ODE)
    • Communication (how HTTP get/put works)
  3. Data Structures
    • Linked Lists
    • Trees (file system hierarchies)
    • Graphs
    • Arrays (dense & sparse) example: vectors & rotations / transpositions, densities and difference operators
  4. Presentation & visualization
    • Flat CSV files (and excel)
    • Working with binary data
    • Python matplotlib (2D images)
    • Binning histograms & weighted averages
  5. Best Practices
    • High-level code design (modularity, interface specifications, unit testing, etc.)
    • Code audit: open-source libraries, ex. Gnu Scientific Library, qsort in libc, Boost
    • Source code versioning (example: git or mercurial)
    • gdb & execution profiling
    • documentation
    • Makefiles

Grading

Grading Scale (out of 400 points possible)

Grade Undergraduate Graduate
A+ 269 323
A 258 310
A- 250 300
B+ 241 290
B 230 276
B- 222 266
C+ 213 256
C 202 243
C- 194 233
D+ 186 223
D 175 210
D- 166 200
  • (280 U / 330 G) Points Possible:
    • 9 homework assignments (10 points per homework)
    • 14 in-class assignments / quizzes (10 points per week)
    • 2 projects / competitions (50 points per project)
    • Undergraduates must complete 1 project, graduates must complete 2
  • Important Rules
    • For maximum benefit, homework should be done on-time. However, homework will be allowed to be re-submitted at any time for full credit, but must present a unique solution to previous work from yourself or others.
    • All work must cite co-group members (if any) and the source of any code snippets used online.

Grading Rationale

Homework points will only be awarded for correct, working code and correct algorithmic analysis, while the homework problems will become progressively tougher. Undergraduates should be able to pass the course with an A if they have correctly answered most of the homework, appear at class prepared and on-time, and made a strong attempt at one of the projects. Graduate students should both perform well on the homework and successfully complete one or two projects. The re-grading for homework allows making up for missed assignments, and encourages returning to core concepts at a later point in the class. Students who have put in the studying effort, completed significant portions of the homework and projects both in- and out of class have thus demonstrated introductory programming proficiency.

Projects

Textbooks and Resources

Other (possibly useful) materials:

  • The Jargon File
  • Chapters 7&8 of Modelling and Simulation, Birta and Arbez, Springer 2007 (eBook avail. from USF Library).
  • Chapters 4-6 of Models and Algorithms for Intelligent Data Analysis, Runkler, Springer 2012 (eBook avail. from USF Library).
  • Guide to Scientific Computing in C++, Francis and Whiteley, Springer, 2012. (eBook avail. from USF Library)
  • Extra info. On the basics:
    • Chapters 3 and 7 of Mathematics in Computing, Regan, Springer 2013. (eBook avail. from USF Library)
    • Instant feedback graphical programming with Khan Academy
  • Local PyBulls Meetings

Code:

Interactive Interpreters:

  • CodeWars Recommended for building experience with practice problems.
  • CodeCombat A very visual way to interact with your python code.
  • Exercism A command-line client with a similar learn-through-challenge philosophy.
  • repl.it A read-eval-print loop for several computer languages.
  • CppReference Documentation with an online evaluator for C++ code.

Notes

Review Notes on Sequences, list comprehensions and arrays

<source lang="python">

  1. Say you want to make a plot of the LJ function on the domain x \in [0.9, 3]
  2. 1. You could use the following two list comprehensions to be specific:

x = [ i*(3-0.9)/100.0 + 0.9 for i in range(101) ] y = [ 4*(xi**-12 - xi**-6) for xi in x]

  1. 2. You could implement the LJ function as a python function and use MAP,

def LJ(x):

   ix6 = x**-6 # the factoring makes this calculation more efficient
   return 4*ix6*(ix6 + 1.0)

y = map(LJ, x)

  1. 3. You could work on arrays, where addition, multiplication, and powers are elementwise:

ax = arange(101)*(3-0.9)/100 + 0.9 ay = LJ(ax)

  1. Mind the difference in type!

type(x) type(ax)

  1. 4. You could totally cheat, and use numpy's builtin functions.

ax = linspace(0.9, 3.0, 101)

  1. Exercise: replicate the steps above to test the finite difference computation
  2. of the derivative against the analytical derivative of LJ.

[ (y[i+1]-y[i])/(x[i+1]-x[i]) for i in ...

import matplotlib.pyplot as plt plt.plot(x, y) plt.show() </source>

Review Notes on Functions with Function arguments (functionals)

<source lang="python">

  1. map : (a -> b), [ a ] -> [ b ]
  2. map: map(str, [1,2,3]) --> [ "1", "2", "3" ]
  3. ^ function (a -> b)
  4. map(int, map(str, [1,2,3]) )
  5. - helpful for parsing
  6. x = "1 2 3 4 5"
  7. x.split() # ["1", "2", "3", "4", "5"]
  8. map(int, x.split()) # [1,2,3,4,5]

str([1,2,3]) # "[1, 2, 3]" (bad) map(str, [1,2,3]) # ["1", "2", "3"] (good)

  1. filter: (a -> Bool), [a] -> [a]
  2. Example filter:
  3. '1' -> no
  4. ' ' -> no
  5. 'a-z' -> yes
  6. 'A-Z' -> yes

def f(x): # str -> Bool

   if x >= 'a' and x <= 'z':
       return True
   if x >= 'A' and x <= 'Z':
       return True
   return False

filter(f, " 1 a b 2 \n")

  1. reduce: (a,b -> a), [b], a -> a

sum([1,2,3]) # -> 6 def add2(x,y): # x : int, y : int

 return x+y

reduce(add2, [1,2,3], 0)

  1. For an example where a != b:

def append(x,y): # x : [int], y : int

 return x + [y]

reduce(append, [1,2,3], []) # start with an empty list

  1. Newton-Rhapson

def nrhap(f, fp, x0, tol = 1e-6):

   y = f(x0)
   iter = 0 # guard against infinity
   while abs(y) > tol:
       iter += 1
       if iter > 1000:
         raise ValueError, "Unable to converge!"
       x0 -= y/fp(x0)
       y = f(x0)
   return x0
   
  1. solve x^3 - 2 = 0

def f(x):

 return x**3 - 2

def fp(x):

 return 3*x**2

print( nrhap(f, fp, 1.0)**3 )

</source>


In-Class Problems

Homework

  • Homework will be assigned through Repl.IT
    • Homework 6 is described below
  • Project presentations will be on Mon.-Wed., Mar 5-7
  • Final Projects are due by the end of the last week of classes.

HW6 (hand in on Wednesday, April 18)

After reading through the SIAM guide on mathematical modeling, formulate an interesting question about the chemical behavior of mixed solvent systems. Don't be too constrained by what can and can't be simulated with MD, but do create a well-designed problem statement, list some possible assumptions that could help narrow down the problem scope, and clearly explain any quantitative variables that can be defined. It will help to reference chemical literature, including textbooks from your past courses.

Homework 7 (hand in on Monday, April 23)

Create pseudocode and identify relevant python/numpy/scipy
data structures and functions for carrying out
each of the following MD data analysis tasks.
Be explicit about data types you choose for each variable,
as well as the inputs and outputs of each function you define.
Your pseudocode should be returned neatly handwritten or typed
in outline form and should present a complete, workable solution.

1. How do I determine the best solvent for reaction chemistry?

Chemical reactions at an industrial scale are always carried
out under optimized conditions.  Solvent composition (i.e. the
mole fraction of each solvent molecule) are one factor
that can be tuned in order to control reaction rate, yield, selectivity,
and ease of purification.  One factor that can be modeled
is the difference in solubility between reactant and product molecules.
The solubility product constant for a single molecule, K, is defined as
the maximum concentration the molecule can be present in solution
before further additions of the molecule `crash out' and form a solid
rather than dissolving.

Let's test the linear solvation free energy model,

  K(M) = K0 exp( sum_j n_j(M) u_j ),

where K0 and k_j are constants for all molecules (M),
the sum runs over the 'j' functional groups
present in molecule M, with n_j(M) the number of functional
groups of type 'j' in that molecule.

Write pseudocode for a "guess and check" method that
tries out possible parameters, K0 and u_j, to decide
how well those parameters perform.
Assume you have a table of K(M) and n_j(M) values available,
and can call the function "generate_params" to generate a new
parameter guess.

2. How fast is a 'diffusion-limited' reaction?

Reaction rates depend on how many times the individual
molecules involved in each step of the reaction collide.
For a bimolecular reaction like an enzyme-substrate binding
event, E + S -> (ES), the reaction rate can be limited
by diffusion when binding is very strong (so that every collision
leads to "irreversible" binding).  MD can't simulate chemical
reactions, so the analysis will be limited to determining whether
the enzyme-substrate becomes small, indicating binding.

Assume you have been given a trajectory of an enzyme in solution
with 100 substrate molecules.  Define a (distance-based) criteria
for determining if a complex has formed when given the positions of
12 atoms on a key residue of the enzyme and all 10 atoms of the substrate.
This criteria should be stated in terms of a function, with well-typed
inputs and outputs.
Then write pseudocode that uses the entire trajectory (and the function)
to determine: i) the exact time-point (frame) at which an
enzyme-substrate complex forms, and ii) whether that complex
ever unbinds during the remainder of the simulation trajectory.

For this exercise, you will need to use steps that define and use labels
for substrates and/or atoms like "loop through substrates,"
"label that substrate as 'k'", and "check substrate 'k'", etc.
Like adding main characters to a story, it's important to know when to
give certain pieces of data individual names and types.

Homework 8 - Due Friday, April 27. (hand in on Wednesday or email by Friday COB).

Revise your midterm project code using what you've learned
in this class to make it more structured and maintainable.
In some cases, this may simply require adding documentation
and in other cases, this may be simplest to accomplish
with a complete re-write.

The completed code must contain: 1) at least two functions,
documented with input and output types and explanations,
as well as example input -> output pairs, 2) at least one
loop, and 3) at least one new data structure, with
a documented type and an example instance that matches the given type.
One way to satisfy the last requirement is opting to change 
the shape of an existing array in the code or to
regroup data into an array, list, or dictionary.