10 MB pdf-file here - NTNU

Contents 

From: http:// ... TKP4106 Modelling Course 

(Automatic HTML etc. to PDF Conversion) 

Creator: Tore Haug-Warberg 

Department of Chemical Engineering 

NTNU (Norway) 

Created: Tue Oct 16 09:57:50 +0200 2012 

PDF name: 2012 10 16 09 57 50.pdf 

1 Homepage
2 Tore Haug-Warberg (Programming)
2.1 Real Programmers use FORTRAN
2.2 Emacs (all platforms)
2.3 Emacs quick reference
2.4 Vim (UNIX)
2.5 Vim quick reference
2.6 TextPad (Windows)
2.7 TextPad quick reference
2.8 LaTeX (Cambridge University)
2.9 LaTeX in Norwegian (Hanche-Olsen)
2.10 High-quality portable PDF (Schatz)
2.11 Regex (Stephen Ramsay)
2.12 Regex quick reference
2.13 BNF and EBNF (L. M. Garshol)
2.14 Windows shortcut keys (Jonah Probell)
2.15 Keyboard shortcuts (Windows/Linux)
2.16 Mac keyboard shortcuts (Dan Rodney)
2.17 The Transparent Language Popularity Index
2.18 The Hows and Whys of Commenting (C)
2.19 99 bottles of beer (1000++ languages)
2.20 Programming paradigms (Kurt Normark)
2.21 Real Programmers (Ed Post), see also Sec. 2.1
2.22 The story of Mel (Ed Nather)
2.23 The Tao of programming (Kragen Sitaker)
2.24 Computer languages (E. Levenez)
2.25 Shoot yourself in the foot (WWW)
2.26 Lord of the Rings (D. Pritchard)
2.27 About spell checkers (WWW)
2.28 Foobar etymology (Jargon File)
2.29 2000 languages
2.30 A Beginner’s Python Tutorial (Steven Thurlow)
2.31 Epytext markup (sourceforge)
2.32 Epydoc fields (sourceforge)
2.33 Python Docstrings (Sourceforge)
2.34 Regex in Python (McCormack)
2.35 Unit Testing in Python (William Blum)
2.36 Python best practise (Well House)
2.37 Numerical Python (scipy.org)
2.38 Plotting with Python (matplotlib)
2.39 Scientific Python (scipy.org), see also Sec. 2.37
2.40 Symbolic Python (sympy.org)
2.41 Functional Python (Moka)
2.42 The Transparent Language Popularity Index, see also Sec. 2.17

3 Heinz A. Preisig (Modelling)

4 Frequently Asked Questions (FAQ)
4.1 use epydoc

5 Syllabus
5.1 Getting started
5.1.1 Ken Olsen, founder of DEC (1977)
5.1.2 A Smalltalk about Modelling
5.1.3 Regular Expressions, see also Sec. 2.11
5.2 Topology
5.2.1 Reference ???
5.3 Documentation
5.3.1 The real programmer, see also Sec. 2.1
5.3.2 epydoc
5.3.3 Verbatim: “atoms.py”
5.3.4 epytext, see also Sec. 2.31
5.3.5 Verbatim: “morse.py”
5.3.6 Verbatim: “antimorse.py”
5.3.7 Python strings
5.3.8 docstring, see also Sec. 2.33
5.3.9 Epydoc output file
5.4 Mass balance
5.4.1 Reference ???, see also Sec. 5.2.1
5.5 Molecular formula parser
5.5.1 Alan J. Perlis (1982), see also Sec. 2.29
5.5.2 atoms.py, see also Sec. 5.3.3
5.5.3 Python dictionaries
5.5.4 Backus-Naur Formalism, see also Sec. 2.13
5.5.5 Regular Expressions
5.6 Energy balance
5.7 The atom matrix
5.7.1 Spell Check Song, see also Sec. 2.27
5.7.2 Verbatim: “atom matrix.py”
5.7.3 Verbatim: “molecular weight.py”
5.7.4 Python sets, see also Sec. 5.5.3
5.7.5 List comprehension, see also Sec. 5.5.3
5.8 Steady state
5.9 Independent reactions
5.9.1 Computers are male
5.9.2 Verbatim: “rref.py”
5.9.3 Verbatim: “null.py”
5.9.4 The mass balance
5.10 Physical events
5.10.1 Reference ???, see also Sec. 5.2.1
5.11 Root solvers
5.11.1 Computers are female
5.11.2 Verbatim: “sqrt.py”
5.11.3 Verbatim: “pv.py”
5.11.4 The energy balance
5.11.5 Verbatim: “for lc rc.py”
5.12 Matrix theory
5.13 A thermodynamic equation solver
5.13.1 Robert Firth, see also Sec. 2.29
5.13.2 Verbatim: “solve.py”
5.13.3 Verbatim: “hpn.py”
5.13.4 Verbatim: “mprod.py”
5.13.5 The energy balance, see also Sec. 5.11.4
5.14 ODE
5.15 The reactor model
5.15.1 General Motors vs. Bill Gates
5.15.2 Verbatim: “srk ammonia.py”
5.15.3 Verbatim: “flowsheet.py”
5.15.4 Verbatim: “ammonia reactor.py”
5.15.5 Verbatim: “tkp4106.py”
5.15.6 ammonia reactor.py, see also Sec. 5.15.4
5.15.7 srk ammonia.py, see also Sec. 5.15.2
5.15.8 Modelling issues
5.16 PID
5.17 Integration
5.17.1 Verbatim: “We don’t need no...”
5.17.2 flowsheet.py, see also Sec. 5.15.3
5.17.4 flowsheet.py, see also Sec. 5.15.3
5.17.6 Modelling issues, see also Sec. 5.15.8
5.18 AAA
5.19 Unit testing
5.19.1 The Origin of Faeces
5.20 BBB
5.21 Putting the model to work
5.21.1 Verbatim: “graph.plt”
5.21.2 Verbatim: “graph.dat”
5.21.3 graph.pdf
5.21.5 graph.plt, see also Sec. 5.21.1
5.22 CCC

TKP4106 Process Modelling 

Lecturers' home pages:

1. Tore Haug-Warberg (Programming) 

2. Heinz A. Preisig (Modelling) 

Common parts: 

1. Frequently Asked Questions (FAQ) 

2. Syllabus 

Process modelling builds on the basic conservation principles, transport
phenomena, thermodynamics and mathematical physics. We teach how these
models are built systematically so that we have precisely the knowledge
required, neither more nor less. The models we establish implicitly formulate
different mathematical problems that need to be solved in order to get an
overall solution. We learn how to approach and solve these problems
effectively using mathematical and computer-based numerical tools.
Programming is seen as a core activity for achieving this latter goal. Examples
taken from the different corners of our discipline are the subject of our
discussions.

Learning outcome: 

1. Get a bird's-eye view of the modelling process.

2. Establish an integration of the different involved subjects. 

3. Programming as part of solving technical problems. 

4. Abstraction of the plant. 

5. Formulation of complete process models. 

6. Solving simple mathematical and numerical problems using computers. 

7. Programming methods and a programming language. 

8. Have a systematic approach to problem solving. 

9. Know how to generate models. 

Last updated: 28 August 2012. © THW+EHW

Programming sessions in TKP4106 

Tore Haug-Warberg 

Department of Chemical Engineering, NTNU 

email: haugwarb@nt.ntnu.no 

phone: +47-7359-4108 

"Talking, you can only hope that somebody is listening. Writing, you can only hope that someone will be 

reading. When doing programming, however, you can tell the computer what to do, how to do it and 

when it should be done. That makes a heck of a difference to the scientist. 

Corollary: In speech and writing it does not matter how wrong you are if you are a little right. In 

programming it does not matter how right you are if you are a little wrong." 

Introductory words to TKP4106, Tore Haug-Warberg (2011) 

"The easiest way to tell a Real Programmer from the crowd is by the programming language he (or she) 

uses. Real Programmers use Fortran. Quiche Eaters use Pascal. Nicklaus Wirth, the designer of Pascal, 

gave a talk once at which he was asked, "How do you pronounce your name?". He replied, "You can 

either call me by name, pronouncing it 'Veert', or call me by value, 'Worth'." One can tell immediately by 

this comment that Nicklaus Wirth is a Quiche Eater. The only parameter passing mechanism endorsed by 

Real Programmers is call-by-value-return, as implemented in the IBM/370 Fortran G and H compilers. 

Real Programmers don't need all these abstract concepts to get their jobs done-- they are perfectly happy 

with a keypunch, a Fortran IV compiler, and a beer." 

Real Programmers use FORTRAN 

This page is the index to the programming sessions of Process Modelling
TKP4106. For easy off-line browsing you can download the entire 10 MB PDF
file here. There is also a FAQ list and a Syllabus available. All subjects are
taught (chronologically) in a top-down manner. The Goals give an overview of
where we are heading. We will be using Python for the programming, and the
entire course adds up to 1200 lines of carefully written and fully documented
Python code, including methods for: formula parsing, atom matrix and matrix
product calculation, row-reduced echelon form, nullspace, linear and non-linear
equation solving, Euler and Runge-Kutta integration, a thermodynamic equation
of state and an object-oriented flowsheet module with stream and reactor
objects. To increase the learning effect you are not given working programs out
of the box. Instead you are asked to turn stub programs into working code as a
compulsory part of the course.

My goal is to take you all the way from algorithmic parsing of chemical formulas
to matrix theory, and finally to chemical reactor simulation. Our value chain
looks something like this:

    [ 'H2', 'N2', 'NH3' ]

    =>

        | 2  0  3 |
    A = |         |
        | 0  2  1 |

    =>

        | 3/2 |
    N = | 1/2 |
        | -1  |

    =>

    | dh/dT  dh/dv  dh/dc |   | grad(T) |   |  0  |
    | dp/dT  dp/dv  dp/dc | * | grad(v) | = |  0  |
    |   0      0      I   |   | grad(c) |   | N*r |

Here A is the so-called atom or formula matrix, N = null(A) is the nullspace 

of A, function h(T,v,c) is called enthalpy and r(T,v,c,x,t) is the rate of 

reaction (chemical kinetics). It will be our pride to learn how the grand picture 

evolves from basic physical principles and a few pages of computer code. 
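The first two steps of the chain are easy to check numerically. The following is a minimal sketch and not the course code: the composition table `FORMULAS` is hard-coded here (an assumption; the course derives it with a formula parser), and the helper names `atom_matrix` and `matvec` are hypothetical.

```python
from fractions import Fraction

# Hard-coded compositions (assumption: normally produced by a formula parser).
FORMULAS = {
    'H2':  {'H': 2},
    'N2':  {'N': 2},
    'NH3': {'N': 1, 'H': 3},
}

def atom_matrix(species, elements):
    """Rows are elements, columns are species."""
    return [[FORMULAS[s].get(e, 0) for s in species] for e in elements]

def matvec(A, x):
    """Plain matrix-vector product on nested lists."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]

species = ['H2', 'N2', 'NH3']
A = atom_matrix(species, ['H', 'N'])

# The nullspace vector quoted in the text, written as exact fractions.
N = [Fraction(3, 2), Fraction(1, 2), Fraction(-1)]

print(A)             # [[2, 0, 3], [0, 2, 1]]
print(matvec(A, N))  # both entries are zero, so A*N = 0
```

Since A*N = 0, the vector N conserves both hydrogen and nitrogen atoms. Up to sign convention it encodes the stoichiometry 3/2 H2 + 1/2 N2 = NH3.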

However: 

Why? 

The understanding and use of physically based models is becoming 

increasingly important in industry, teaching and academia. 

What? 

Algorithmic description of dynamics, events and static processes. Conservation 

of mass and energy (not so much momentum in our case). The models can be 

simple yet complex (networks). 

How? 

Linear algebra (ODE and DAE), root solvers (NR), syntax (regex and BNF 

parsers), code structure (OOP and FP), containers (tuple, list, hash, struct and 

array), code design (epydoc, patterns and exceptions). 

Our goals are obviously quite widespread, and it is worthwhile reflecting a
little on what we actually need to understand of mathematics, physics and
programming:

Goals (programming):

1. Formula parser dict = atoms(str)
2. Algebra mw = molecular_weight(str)
3. Formula matrix A = amat([str1, str2, ...])
4. Row-reduced-echelon-form B = rref(A)
5. Nullspace N = null(A)
6. Linear equations X = solve(A, B)
7. Matrix product C = mprod(A, B)

Goals (paradigms):

1. Backus-Naur formalism
2. Regular expressions
3. Strings
4. Lists (arrays)
5. Tuples
6. Dictionaries (hashes)
7. Lambda functions
8. Modules
9. Classes
10. Objects
11. Exceptions

Goals (modelling):

1. Applying energy, momentum and mass conservation
2. Chemical reactions and nullspace
3. Linear and non-linear system descriptions
4. Linearization of models
5. Solving linear equations
6. Newton-Raphson iteration
7. Systems of ordinary differential equations
8. Dynamic versus steady state approximation
9. Numerical integration using Euler's method
10. The needs for an equation of state
11. Thermodynamic Jacobian transformations
12. Hand calculations of (1 x 1) up to (3 x 6) matrices
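To make the first two programming goals concrete, here is a hedged sketch of what `atoms(str)` and `molecular_weight(str)` could look like when built on a regular expression. It is not the course's `atoms.py`. The table `ATOMIC_WEIGHT` and the pattern `TOKEN` are illustrative assumptions, and the parser handles only flat formulas (no parentheses, charges or hydrates).

```python
import re

# Rounded standard atomic weights in g/mol; only a few elements are listed.
ATOMIC_WEIGHT = {'H': 1.008, 'C': 12.011, 'N': 14.007, 'O': 15.999}

# One element symbol (capital plus optional lowercase) and an optional count.
TOKEN = re.compile(r'([A-Z][a-z]?)(\d*)')

def atoms(formula):
    """Return {element: count} for a flat formula such as 'NH3' or 'C2H5OH'."""
    if not re.fullmatch(r'(?:[A-Z][a-z]?\d*)+', formula):
        raise ValueError('cannot parse formula: %r' % formula)
    result = {}
    for symbol, count in TOKEN.findall(formula):
        # A missing count means one atom; repeated symbols are accumulated.
        result[symbol] = result.get(symbol, 0) + int(count or 1)
    return result

def molecular_weight(formula):
    """Sum the atomic weights of all atoms in the formula."""
    return sum(ATOMIC_WEIGHT[s] * n for s, n in atoms(formula).items())

print(atoms('C2H5OH'))   # {'C': 2, 'H': 6, 'O': 1}
print(round(molecular_weight('NH3'), 3))
```

The dictionary returned by `atoms` is the natural input for one column of the formula matrix in goal 3.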

To do all this work the editor will be your most valuable asset. Forget about
the fancy GUIs and IDEs used for large-scale programming. Dispose of the
mouse, learn the shortcut keys and teach yourself TextPad, Vim, Emacs or …
That's it. And yes, while programming you shall document your code. Always.
Coding is about syntax; documentation is about semantics. Remember that. You
shall also test the code. Always. Unit testing is a Good Thing. Finally, you
ought to have some fun, especially when programming late at night. A little
humor helps a lot when you do code wrangling.
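As an illustration of "document always, test always", the sketch below combines an epytext docstring with a small unit test. The function name `mprod` is taken from the goals list, but its body here is an assumption and not the course's `mprod.py`. The field tags `@param`, `@type`, `@return`, `@rtype` and `@raise` are standard epydoc fields.

```python
import unittest

def mprod(a, b):
    """
    Multiply two matrices stored as nested lists.

    @param a: Left factor, an (m x k) matrix.
    @type  a: list of lists
    @param b: Right factor, a (k x n) matrix.
    @type  b: list of lists
    @return:  The (m x n) product a*b.
    @rtype:   list of lists
    @raise ValueError: If the inner dimensions disagree.
    """
    if any(len(row) != len(b) for row in a):
        raise ValueError('inner dimensions disagree')
    # zip(*b) iterates over the columns of b.
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

class TestMprod(unittest.TestCase):
    def test_identity(self):
        a = [[1, 2], [3, 4]]
        self.assertEqual(mprod(a, [[1, 0], [0, 1]]), a)

    def test_shape_mismatch(self):
        with self.assertRaises(ValueError):
            mprod([[1, 2]], [[1, 2]])
```

Saved in a file, the tests can be run from the terminal with `python -m unittest <file>`.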

About Python as a language I am not religious. Not at all, since I have only
coded a few projects in Python. The syntax is not very juicy but the language
seems to offer a good compromise between stringency and sloppiness, and it
has tons of useful libraries. It also enforces very strict indentation rules upon
the source code, which definitely is a Good Thing for newbies. For this reason
alone Python stands out as a good learning platform, besides being one of the
more popular scripting languages available today (far more so than Matlab for
instance).

Editors:

1. Emacs (all platforms)
2. Emacs quick reference
3. Vim (UNIX)
4. Vim quick reference
5. TextPad (Windows)
6. TextPad quick reference

Text processing:

1. LaTeX (Cambridge University)
2. LaTeX in Norwegian (Hanche-Olsen)
3. LaTeX professional math (Voss)
4. High-quality portable PDF (Schatz)
5. Regex (Stephen Ramsay)
6. Regex quick reference
7. BNF and EBNF (L. M. Garshol)

Programming en masse:

1. Windows shortcut keys (Jonah Probell)
2. Keyboard shortcuts (Windows/Linux)
3. Mac keyboard shortcuts (Dan Rodney)
4. The Transparent Language Popularity Index
5. The Hows and Whys of Commenting (C)
6. 99 bottles of beer (1000++ languages)
7. Programming paradigms (Kurt Normark)

Mostly fun:

1. Real Programmers (Ed Post)
2. The story of Mel (Ed Nather)
3. The Tao of programming (Kragen Sitaker)
4. Computer languages (E. Levenez)
5. Shoot yourself in the foot (WWW)
6. Lord of the Rings (D. Pritchard)
7. About spell checkers (WWW)
8. Foobar etymology (Jargon File)

Occasionally, there are matter-of-programming-fact discussions going on in 

the corridor and my colleagues may wonder whether the choice of a computer 

language really matters (which of course it does because there are more than 

2000 languages "out there"), why a switch-case test is better than if-elseif-else 

(a compelling thought indeed), why Object Oriented Programming (OOP) is 

better than Imperative Programming (IP) (which is not always the case), why 

Python is better than Matlab (which is maybe true), and so on. My personal 

attitude to a few of these questions is collected in a list of inFrequently Asked 

Questions (iFAQ) at the bottom of this page. 

It is said that Python is an Object Oriented Programming language. So what
does OOP mean in contrast to IP then? Let me try to explain the difference in
terms of how NTNU organizes its exams. Assume for the moment that NTNU is a
central Python module and that you (the student) are a data object floating
around in cyberspace. In Python jargon we can then state the following:

    ...
    ...
    # A list of all courses at NTNU.
    courses = [..., TKP4106, ...]
    ...
    ...
    # It's time for arranging exams.
    for course in courses:
        arrange_exam(course)
    ...
    ...
    # Make sure all students do their exams.
    def arrange_exam(course):
        for student in course.students():
            answer = student.do_exam(course)
            if answer is None:
                mark = 'Failed'
            else:
                mark = evaluate_exam(course, answer)
            print(student, course, mark)
    ...
    ...

The big difference is how the methods arrange_exam() and do_exam() are
implemented. NTNU is the official authority and knows exactly why, what, who,
when and where to examine. NTNU's function arrange_exam() is therefore
implemented as a global function which is part of an imperative schedule called
a study program, i.e. NTNU tells you what to do at each level of your study. But
whenever NTNU summons you to sit an exam it invokes do_exam(), which is an
object method installed on you (and on all other student objects). It is in
fact a singleton since it is installed on a one-to-one basis and will be different
for each student. For that reason NTNU cannot rely fully on your scientific
integrity and it therefore invokes another global function called
evaluate_exam() which marks your answer. The rest of the story you all
know… I hope this little allegory helps you understand the difference between
OOP and IP.
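For readers who want to run the allegory, here is a self-contained sketch. The classes `Student` and `Course`, their attributes, and the marking rule inside `evaluate_exam()` are all hypothetical and invented for illustration. Only the function names `arrange_exam()`, `do_exam()`, `evaluate_exam()` and `students()` come from the text.

```python
class Student:
    """A data object that carries its own do_exam() method."""

    def __init__(self, name, knowledge):
        self.name = name
        self.knowledge = knowledge  # hypothetical: {course code: answer}

    def do_exam(self, course):
        # Object method: the answer depends on this student's private state.
        return self.knowledge.get(course.code)  # None if unprepared


class Course:
    def __init__(self, code, solution, students):
        self.code = code
        self.solution = solution
        self._students = students

    def students(self):
        return list(self._students)


def evaluate_exam(course, answer):
    """Global function: the authority marks the answer."""
    return 'Passed' if answer == course.solution else 'Failed'


def arrange_exam(course):
    """Global function: drives the schedule and calls each object's method."""
    marks = {}
    for student in course.students():
        answer = student.do_exam(course)
        if answer is None:
            marks[student.name] = 'Failed'
        else:
            marks[student.name] = evaluate_exam(course, answer)
    return marks


tkp4106 = Course('TKP4106', 'N = null(A)', [
    Student('Kari', {'TKP4106': 'N = null(A)'}),
    Student('Ola', {}),
])
for course in [tkp4106]:
    print(course.code, arrange_exam(course))
```

The imperative part is the loop in `arrange_exam()`, which decides when things happen. The object-oriented part is `student.do_exam(course)`, where the caller does not know or care how each student produces an answer.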

Getting started:

1. A Beginner's Python Tutorial (Steven Thurlow)
2. Epytext markup (sourceforge)
3. Epydoc fields (sourceforge)

Going a little further:

1. Python Docstrings (Sourceforge)
2. Regex in Python (McCormack)
3. Unit Testing in Python (William Blum)
4. Python best practise (Well House)

The full story:

1. Numerical Python (scipy.org)
2. Plotting with Python (matplotlib)
3. Scientific Python (scipy.org)
4. Symbolic Python (sympy.org)
5. Functional Python (Moka)

(in)Frequently Asked Questions (iFAQ):

Which language?

Use the language that is ideal for you and your task. Always. Switch to another language if you feel 

constrained. 

Why do I need an editor? 

The editor and the keyboard are your textual links to the computer. Forget about the mouse and 

fancy GUIs. Such things are only useful for graphics work and hyperlinks. Learn about the 

shortcut keys of your computer, learn to master one editor efficiently, learn to manipulate several files 

at once and learn to run scripts from the terminal (command) window. Use these tools for all your 

stuff afterwards. This is not about religion but about productivity and self-consciousness. 

Matlab or Python? 

Matlab stands for Matrix Laboratory while Python is a generic programming language. Matlab is
good at doing numbers while it sucks at doing strings. Python is good at handling strings and has
good numerics too. More important, however, Matlab is proprietary while Python is open source.
NTNU should not promote proprietary languages... Python also has a much bigger community than
Matlab (about 10 times higher activity according to The Transparent Language Popularity
Index). Actually, we should rather be using Ruby because it has a nice, rich and beautiful syntax!

OOP, IP or FP? 

Object oriented programming (OOP) is valuable for administrating calculations at a high level using 

the concept of a class. Imperative programming (IP) is, quite inevitably, what is used in the inner 

loops of calculation intensive algorithms like e.g. matrix calculations. Functional programming (FP) 

offers a beautiful way of doing recursive calculations on infinite lists and so-called higher order 

programming working with functors (akin to functionals in mathematics). In most program systems of 

reasonable size all three paradigms will be used. 
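A small sketch may help to see the three paradigms side by side. All names below (`squares`, `sum_squares_ip`, `sum_squares_fp`, `Summer`) are hypothetical. The "infinite list" is modelled with a lazy generator, which is one way Python can express the idea.

```python
from functools import reduce
from itertools import count, islice

def squares():
    """Lazy, conceptually infinite sequence 0, 1, 4, 9, ..."""
    return (n * n for n in count())

def sum_squares_ip(k):
    """Imperative: explicit loop and a mutable accumulator."""
    total = 0
    for n in range(k):
        total += n * n
    return total

def sum_squares_fp(k):
    """Functional: take k items from the infinite sequence and fold them."""
    return reduce(lambda acc, x: acc + x, islice(squares(), k), 0)

class Summer:
    """Object oriented: a class administrates which strategy is used."""

    def __init__(self, strategy):
        self.strategy = strategy  # any function taking k

    def __call__(self, k):
        return self.strategy(k)

print(sum_squares_ip(5), sum_squares_fp(5), Summer(sum_squares_fp)(5))  # 30 30 30
```

Here the class works at the high level, the loop does the inner arithmetic, and `reduce` with a lambda shows a function being passed to another function.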

IF-ELSEIF-ELSE or CASE? 

The answer is almost religious: Never use if-elseif-else; use only if-else and switch-case (or
case-when). The reason is that an if-elseif has to be evaluated one test at a time (you can be
comparing strings in one test and numbers in the next) while the switch-case is precompiled (you
compare one single object to a set of predefined matches). The if-elseif clutters the code because
you have to read every single statement in order to understand what is being tested. The scope of
the switch-case is, on the other hand, determined by one single line of code and it consequently
looks cleaner and more coherent to the human eye.
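Python had no switch-case statement when this page was written (a `match` statement was added in Python 3.10). A common substitute that keeps the "one object against a set of predefined matches" property is a dictionary lookup. The sketch below contrasts the two styles. The function names and the phase codes are made up for illustration.

```python
def phase_if(code):
    # if-elif chain: each test is evaluated in turn until one succeeds.
    if code == 's':
        return 'solid'
    elif code == 'l':
        return 'liquid'
    elif code == 'g':
        return 'gas'
    else:
        return 'unknown'

# Dictionary dispatch: one object is compared against a fixed set of keys.
PHASE = {'s': 'solid', 'l': 'liquid', 'g': 'gas'}

def phase_case(code):
    return PHASE.get(code, 'unknown')

# Both styles give the same answer for known and unknown codes.
for code in 'slgx':
    assert phase_if(code) == phase_case(code)
```

The table `PHASE` can be read at a glance, while the chain has to be read test by test, which is the point made above.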

TDT41100 vs TKP4106? 

Why are we going to have yet another introductory course in programming? Why is TDT41100 not
sufficient? The answer is simple: TDT41100 offers you an introduction to information technology
while TKP4106 focuses on writing beautiful code that stands the test of documentation standards,
unit testing and reusability.

Last updated: 03 September 2012. © THW+EHW

Real Programmers Don't Use Pascal 

[ A letter to the editor of Datamation, volume 29 number 7, July 1983. I've long ago lost 

my dog-eared photocopy, but I believe this was written (and is copyright) by Ed Post, 

Tektronix, Wilsonville OR USA. 

The story of Mel is a related article. ] 

Back in the good old days-- the "Golden Era" of computers-- it was easy to separate the 

men from the boys (sometimes called "Real Men" and "Quiche Eaters" in the literature). 

During this period, the Real Men were the ones who understood computer programming, 

and the Quiche Eaters were the ones who didn't. A real computer programmer said things 

like "DO 10 I=1,10" and "ABEND" (they actually talked in capital letters, you 

understand), and the rest of the world said things like "computers are too complicated for 

me" and "I can't relate to computers-- they're so impersonal". (A previous work [1] points 

out that Real Men don't "relate" to anything, and aren't afraid of being impersonal.) 

But, as usual, times change. We are faced today with a world in which little old ladies can 

get computers in their microwave ovens, 12 year old kids can blow Real Men out of the 

water playing Asteroids and Pac-Man, and anyone can buy and even understand their 

very own personal Computer. The Real Programmer is in danger of becoming extinct, of 

being replaced by high school students with TRASH-80s. 

There is a clear need to point out the differences between the typical high school junior 

Pac-Man player and a Real Programmer. If this difference is made clear, it will give these 

kids something to aspire to-- a role model, a Father Figure. It will also help explain to the 

employers of Real Programmers why it would be a mistake to replace the Real 

Programmers on their staff with 12 year old Pac-Man players (at a considerable salary 

savings). 

The easiest way to tell a Real Programmer from the crowd is by the programming 

language he (or she) uses. Real Programmers use Fortran. Quiche Eaters use Pascal. 

Nicklaus Wirth, the designer of Pascal, gave a talk once at which he was asked, "How do 

you pronounce your name?". He replied, "You can either call me by name, pronouncing it 

'Veert', or call me by value, 'Worth'." One can tell immediately by this comment that 

Nicklaus Wirth is a Quiche Eater. The only parameter passing mechanism endorsed by 

Real Programmers is call-by-value-return, as implemented in the IBM/370 Fortran G and 

H compilers. Real Programmers don't need all these abstract concepts to get their jobs 

done-- they are perfectly happy with a keypunch, a Fortran IV compiler, and a beer. 

Real Programmers do List Processing in Fortran. 

Real Programmers do String Manipulation in Fortran. 

Real Programmers do Accounting (if they do it at all) in Fortran. 

Real Programmers do Artificial Intelligence programs in Fortran. 

If you can't do it in Fortran, do it in assembly language. If you can't do it in assembly

language, it isn't worth doing. 

The academics in computer science have gotten into the "structured programming" rut 

over the past several years. They claim that programs are more easily understood if the 

programmer uses some special language constructs and techniques. They don't all agree 

on exactly which constructs, of course, and the example they use to show their particular 

point of view invariably fit on a single page of some obscure journal or another-- clearly 

not enough of an example to convince anyone. When I got out of school, I thought I was 

the best programmer in the world. I could write an unbeatable tic-tac-toe program, use 

five different computer languages, and create 1000 line programs that WORKED 

(Really!). Then I got out into the Real World. My first task in the Real World was to read 

and understand a 200,000 line Fortran program, then speed it up by a factor of two. Any 

Real Programmer will tell you that all the Structured Coding in the world won't help you 

solve a problem like that-- it takes actual talent. Some quick observations on Real 

Programmers and Structured Programming: 

Real Programmers aren't afraid to use GOTOs. 

Real Programmers can write five page long DO loops without getting confused. 

Real Programmers like Arithmetic IF statements-- they make the code more 

interesting. 

Real Programmers write self-modifying code, especially if they can save 20 

nanoseconds in the middle of a tight loop. 

Real Programmers don't need comments-- the code is obvious. 

Since Fortran doesn't have a structured IF, REPEAT ... UNTIL, or CASE 

statement, Real Programmers don't have to worry about not using them. Besides, 

they can be simulated when necessary using assigned GOTOs. 

Data structures have also gotten a lot of press lately. Abstract Data Types, Structures, 

Pointers, Lists, and Strings have become popular in certain circles. Wirth (the above 

mentioned Quiche Eater) actually wrote an entire book [2] contending that you could 

write a program based on data structures, instead of the other way around. As all Real 

Programmers know, the only useful data structure is the Array. Strings, Lists, Structures, 

Sets-- these are all special cases of arrays and can be treated that way just as easily 

without messing up your programming language with all sorts of complications. The 

worst thing about fancy data types is that you have to declare them, and Real 

Programming Languages, as we all know, have implicit typing based on the first letter of 

the (six character) variable name. 

What kind of operating system is used by a Real Programmer? CP/M? God forbid-- 

CP/M, after all, is basically a toy operating system. Even little old ladies and grade school 

students can understand and use CP/M. 

Unix is a lot more complicated of course-- the typical Unix hacker never can remember 

what the PRINT command is called this week-- but when it gets right down to it, Unix is 

a glorified video game. People don't do Serious Work on Unix systems: they send jokes 

around the world on UUCP-net and write Adventure games and research papers.

No, your Real Programmer uses OS/370. A good programmer can find and understand 

the description of the IJK305I error he just got in his JCL manual. A great programmer 

can write JCL without referring to the manual at all. A truly outstanding programmer can 

find bugs buried in a 6 megabyte core dump without using a hex calculator. (I have 

actually seen this done.) 

OS is a truly remarkable operating system. It's possible to destroy days of work with a 

single misplaced space, so alertness in the programming staff is encouraged. The best 

way to approach the system is through a keypunch. Some people claim there is a Time 

Sharing system that runs on OS/370, but after careful study I have come to the conclusion 

that they were mistaken. 

What kind of tools does a Real Programmer use? In theory, a Real Programmer could run 

his programs by keying them into the front panel of the computer. Back in the days when 

computers had front panels, this was actually done occasionally. Your typical Real 

Programmer knew the entire bootstrap loader by memory in hex, and toggled it in 

whenever it got destroyed by his program. (Back then, memory was memory-- it didn't go 

away when the power went off. Today, memory either forgets things when you don't want 

it to, or remembers things long after they're better forgotten.) Legend has it that Seymour 

Cray, inventor of the Cray I supercomputer and most of Control Data's computers, 

actually toggled the first operating system for the CDC7600 in on the front panel from 

memory when it was first powered on. Seymour, needless to say, is a Real Programmer. 

One of my favorite Real Programmers was a systems programmer for Texas Instruments. 

One day, he got a long distance call from a user whose system had crashed in the middle 

of saving some important work. Jim was able to repair the damage over the phone, 

getting the user to toggle in disk I/O instructions at the front panel, repairing system 

tables in hex, reading register contents back over the phone. The moral of this story: 

while a Real Programmer usually includes a keypunch and line printer in his toolkit, he 

can get along with just a front panel and a telephone in emergencies. 

In some companies, text editing no longer consists of ten engineers standing in line to use 

an 029 keypunch. In fact, the building I work in doesn't contain a single keypunch. The 

Real Programmer in this situation has to do his work with a "text editor" program. Most 

systems supply several text editors to select from, and the Real Programmer must be 

careful to pick one that reflects his personal style. Many people believe that the best text 

editors in the world were written at Xerox Palo Alto Research Center for use on their Alto 

and Dorado computers[3]. Unfortunately, no Real Programmer would ever use a 

computer whose operating system is called SmallTalk, and would certainly not talk to the 

computer with a mouse. 

Some of the concepts in these Xerox editors have been incorporated into editors running 

on more reasonably named operating systems-- EMACS and VI being two. The problem 

with these editors is that Real Programmers consider "what you see is what you get" to be 

just as bad a concept in Text Editors as it is in Women. No, the Real Programmer wants a 

"you asked for it, you got it" text editor-- complicated, cryptic, powerful, unforgiving, 

dangerous. TECO, to be precise. 

It has been observed that a TECO command sequence more closely resembles 

transmission line noise than readable text[4]. One of the more entertaining games to play

with TECO is to type your name in as a command line and try to guess what it does. Just 

about any possible typing error while talking with TECO will probably destroy your 

program, or even worse-- introduce subtle and mysterious bugs in a once working 

subroutine. 

For this reason, Real Programmers are reluctant to actually edit a program that is close to 

working. They find it much easier to just patch the binary object code directly, using a 

wonderful program called SUPERZAP (or its equivalent on non-IBM machines). This 

works so well that many working programs on IBM systems bear no relation to the 

original Fortran code. In many cases, the original source code is no longer available. 

When it comes time to fix a program like this, no manager would even think of sending 

anything less than a Real Programmer to do the job-- no Quiche Eating structured 

programmer would even know where to start. This is called "job security". 

Some programming tools NOT used by Real Programmers: 

Fortran preprocessors like MORTRAN and RATFOR. The Cuisinarts of 

programming-- great for making Quiche. See comments above on structured 

programming. 

Source language debuggers. Real Programmers can read core dumps. 

Compilers with array bounds checking. They stifle creativity, destroy most of the 

interesting uses for EQUIVALENCE, and make it impossible to modify the 

operating system code with negative subscripts. Worst of all, bounds checking is 

inefficient. 

Source code maintenance systems. A Real Programmer keeps his code locked up in 

a card file, because it implies that its owner cannot leave his important programs 

unguarded [5]. 

Where does the typical Real Programmer work? What kind of programs are worthy of the 

efforts of so talented an individual? You can be sure that no Real Programmer would be 

caught dead writing accounts-receivable programs in COBOL, or sorting mailing lists for 

People magazine. A Real Programmer wants tasks of earth-shaking importance 

(literally!). 

Real Programmers work for Los Alamos National Laboratory, writing atomic 

bomb simulations to run on Cray I supercomputers. 

Real Programmers work for the National Security Agency, decoding Russian 

transmissions. 

It was largely due to the efforts of thousands of Real Programmers working for 

NASA that our boys got to the moon and back before the Russkies. 

The computers in the Space Shuttle were programmed by Real Programmers. 

Real Programmers are at work for Boeing designing the operating systems for 

cruise missiles.

Some of the most awesome Real Programmers of all work at the Jet Propulsion 

Laboratory in California. Many of them know the entire operating system of the Pioneer 

and Voyager spacecraft by heart. With a combination of large ground-based Fortran 

programs and small spacecraft-based assembly language programs, they are able to do 

incredible feats of navigation and improvisation-- hitting ten-kilometer wide windows at 

Saturn after six years in space, repairing or bypassing damaged sensor platforms, radios, 

and batteries. Allegedly, one Real Programmer managed to tuck a pattern matching 

program into a few hundred bytes of unused memory in a Voyager spacecraft that 

searched for, located, and photographed a new moon of Jupiter. 

The current plan for the Galileo spacecraft is to use a gravity assist trajectory past Mars 

on the way to Jupiter. This trajectory passes within 80 +/- 3 kilometers of the surface of 

Mars. Nobody is going to trust a Pascal program (or Pascal programmer) for navigation 

to these tolerances. 

As you can tell, many of the world's Real Programmers work for the U.S. Government-- mainly 

the Defense Department. This is as it should be. Recently, however, a black cloud 

has formed on the Real Programmer horizon. It seems that some highly placed Quiche 

Eaters at the Defense Department decided that all Defense programs should be written in 

some grand unified language called "ADA" ((C), DoD). For a while, it seemed that ADA 

was destined to become a language that went against all the precepts of Real 

Programming-- a language with structure, a language with data types, strong typing, and 

semicolons. In short, a language designed to cripple the creativity of the typical Real 

Programmer. Fortunately, the language adopted by DoD had enough interesting features 

to make it approachable-- it's incredibly complex, includes methods for messing with the 

operating system and rearranging memory, and Edsger Dijkstra doesn't like it [6]. 

(Dijkstra, as I'm sure you know, was the author of "GOTOs Considered Harmful"-- a 

landmark work in programming methodology, applauded by Pascal Programmers and 

Quiche Eaters alike.) Besides, the determined Real Programmer can write Fortran 

programs in any language. 

The Real Programmer might compromise his principles and work on something slightly 

more trivial than the destruction of life as we know it. Providing there's enough money in 

it. There are several Real Programmers building video games at Atari, for example. (But 

not playing them-- a Real Programmer knows how to beat the machine every time: no 

challenge in that.) Everyone working at LucasFilm is a Real Programmer. (It would be 

crazy to turn down the money of fifty million Star Trek fans.) The proportion of Real 

Programmers in Computer Graphics is somewhat lower than the norm, mostly because 

nobody has found a use for Computer Graphics yet. On the other hand, all Computer 

Graphics is done in Fortran, so there are a fair number of people doing Graphics in order 

to avoid having to write COBOL programs. 

Generally, the Real Programmer plays the same way he works-- with computers. He is 

constantly amazed that his employer actually pays him to do what he would be doing for 

fun anyway (although he is careful not to express this opinion out loud). Occasionally, the 

Real Programmer does step out of the office for a breath of fresh air and a beer or two. 

Some tips on recognizing Real Programmers away from the computer room: 

At a party, the Real Programmers are the ones in the corner talking about operating 

system security and how to get around it.

At a football game, the Real Programmer is the one comparing the plays against his 

simulations printed on 11 by 14 fanfold paper. 

At the beach, the Real Programmer is the one drawing flowcharts in the sand. 

At a funeral, the Real Programmer is the one saying "Poor George. And he almost 

had the sort routine working before the coronary." 

In a grocery store, the Real Programmer is the one who insists on running the cans 

past the laser checkout scanner himself, because he never could trust keypunch 

operators to get it right the first time. 

What sort of environment does the Real Programmer function best in? This is an 

important question for the managers of Real Programmers. Considering the amount of 

money it costs to keep one on the staff, it's best to put him (or her) in an environment 

where he can get his work done. 

The typical Real Programmer lives in front of a computer terminal. Surrounding this 

terminal are: 

Listings of all programs the Real Programmer has ever worked on, piled in roughly 

chronological order on every flat surface in the office. 

Some half-dozen or so partly filled cups of cold coffee. Occasionally, there will be 

cigarette butts floating in the coffee. In some cases, the cups will contain Orange 

Crush. 

Unless he is very good, there will be copies of the OS JCL manual and the 

Principles of Operation open to some particularly interesting pages. 

Taped to the wall is a line-printer Snoopy calendar for the year 1969. 

Strewn about the floor are several wrappers for peanut butter filled cheese bars-- the 

type that are made pre-stale at the bakery so they can't get any worse while 

waiting in the vending machine. 

Hiding in the top left-hand drawer of the desk is a stash of double-stuff Oreos for 

special occasions. 

Underneath the Oreos is a flow-charting template, left there by the previous 

occupant of the office. (Real Programmers write programs, not documentation. 

Leave that to the maintenance people.) 

The Real Programmer is capable of working 30, 40, even 50 hours at a stretch, under 

intense pressure. In fact, he prefers it that way. Bad response time doesn't bother the Real 

Programmer-- it gives him a chance to catch a little sleep between compiles. If there is 

not enough schedule pressure on the Real Programmer, he tends to make things more 

challenging by working on some small but interesting part of the problem for the first 

nine weeks, then finishing the rest in the last week, in two or three 50-hour marathons. 

This not only impresses the hell out of his manager, who was despairing of ever getting 

the project done on time, but creates a convenient excuse for not doing the

documentation. In general: 

No Real Programmer works 9 to 5. (Unless it's the ones at night.) 

Real Programmers don't wear neckties. 

Real Programmers don't wear high heeled shoes. 

Real Programmers arrive at work in time for lunch. 

A Real Programmer might or might not know his wife's name. He does, however, 

know the entire ASCII (or EBCDIC) code table. 

Real Programmers don't know how to cook. Grocery stores aren't open at three in 

the morning. Real Programmers survive on Twinkies and coffee. 

What of the future? It is a matter of some concern to Real Programmers that the latest 

generation of computer programmers are not being brought up with the same outlook on 

life as their elders. Many of them have never seen a computer with a front panel. Hardly 

anyone graduating from school these days can do hex arithmetic without a calculator. 

College graduates these days are soft-- protected from the realities of programming by 

source level debuggers, text editors that count parentheses, and "user friendly" operating 

systems. Worst of all, some of these alleged "computer scientists" manage to get degrees 

without ever learning Fortran! Are we destined to become an industry of Unix hackers 

and Pascal programmers? 

From my experience, I can only report that the future is bright for Real Programmers 

everywhere. Neither OS/370 nor Fortran show any signs of dying out, despite all the 

efforts of Pascal programmers the world over. Even more subtle tricks, like adding 

structured coding constructs to Fortran, have failed. Oh sure, some computer vendors 

have come out with Fortran 77 compilers, but every one of them has a way of converting 

itself back into a Fortran 66 compiler at the drop of an option card-- to compile DO loops 

like God meant them to be. 

Even Unix might not be as bad on Real Programmers as it once was. The latest release of 

Unix has the potential of an operating system worthy of any Real Programmer-- two 

different and subtly incompatible user interfaces, an arcane and complicated teletype 

driver, virtual memory. If you ignore the fact that it's "structured", even 'C' programming 

can be appreciated by the Real Programmer: after all, there's no type checking, variable 

names are seven (ten? eight?) characters long, and the added bonus of the Pointer data 

type is thrown in-- like having the best parts of Fortran and assembly language in one 

place. (Not to mention some of the more creative uses for #define.) 

No, the future isn't all that bad. Why, in the past few years, the popular press has even 

commented on the bright new crop of computer nerds and hackers ([7] and [8]) leaving 

places like Stanford and MIT for the Real World. From all evidence, the spirit of Real 

Programming lives on in these young men and women. As long as there are ill-defined 

goals, bizarre bugs, and unrealistic schedules, there will be Real Programmers willing to 

jump in and Solve The Problem, saving the documentation for later. Long live Fortran! 

References:

[1] Feirstein, B., "Real Men don't Eat Quiche", New York, Pocket Books, 1982. 

[2] Wirth, N., "Algorithms + Data Structures = Programs", Prentice Hall, 1976. 

[3] Ilson, R., "Recent Research in Text Processing", IEEE Trans. Prof. Commun., Vol. 

PC-23, No. 4, Dec. 4, 1980. 

[4] Finseth, C., "Theory and Practice of Text Editors - or - a Cookbook for an EMACS", 

B.S. Thesis, MIT/LCS/TM-165, Massachusetts Institute of Technology, May 1980. 

[5] Weinberg, G., "The Psychology of Computer Programming", New York, Van 

Nostrand Reinhold, 1971, p. 110. 

[6] Dijkstra, E., "On the GREEN language submitted to the DoD", Sigplan notices, Vol. 

3, No. 10, Oct 1978. 

[7] Rose, Frank, "Joy of Hacking", Science 82, Vol. 3, No. 9, Nov 82, pp. 58-66. 

[8] "The Hacker Papers", Psychology Today, August 1980. 

ACKNOWLEDGEMENT 

--------------------------------- 

I would like to thank Jan E., Dave S., Rich G., Rich E. for their help in characterizing the 

Real Programmer, Heather B. for the illustration, Kathy E. for putting up with it, and 

atd!avsdS:mark for the initial inspiration. 

Webbed by Greg Lindahl (lindahl@pbm.com)

GNU Emacs 

GNU Emacs is an extensible, customizable 

text editor—and more. At its core is an 

interpreter for Emacs Lisp, a dialect of the 

Lisp programming language with extensions 

to support text editing. The features of GNU 

Emacs include: 

Content-sensitive editing modes, 

including syntax coloring, for a variety of 

file types including plain text, source 

code, and HTML. 

Complete built-in documentation, 

including a tutorial for new users. 

Full Unicode support for nearly all human 

languages and their scripts. 

Highly customizable, using Emacs Lisp code or a graphical interface. 

A large number of extensions that add other functionality, including a project 

planner, mail and news reader, debugger interface, calendar, and more. 

Many of these extensions are distributed with GNU Emacs; others are 

available separately. 

Releases 

The current stable release is 24.2. To obtain it, visit the obtaining section. 

Emacs 24 has a wide variety of new features, including: 

A packaging system and interface (M-x list-packages) for downloading and 

installing extensions. A default package archive is hosted by GNU and 

maintained by the Emacs developers. 

Support for displaying and editing bidirectional text, including right-to-left 

scripts such as Arabic and Hebrew. 

Support for lexical scoping in Emacs Lisp. 

Improvements to the Custom Themes system (M-x customize-themes). 

Unified and improved completion system in many modes and packages. 

Built-in support for GnuTLS, GTK+ 3, ImageMagick, SELinux, and Libxml2. 

For more information, read the News file. 

Release History 

August 27, 2012 - Emacs 24.2 released 

June 10, 2012 - Emacs 24.1 released 

January 29, 2012 - Emacs 23.4 released 

March 10, 2011 - Emacs 23.3 released 

May 8, 2010 - Emacs 23.2 released 

July 29, 2009 - Emacs 23.1 released 

September 5, 2008 - Emacs 22.3 released 

March 26, 2008 - Emacs 22.2 released 

June 2, 2007 - Emacs 22.1 released 

Feb 6, 2005 - Emacs 21.4 released 



October 28, 2001 - Emacs 21.1 released 

Supported Platforms 

Emacs 24 runs on these operating systems regardless of the machine type: 

GNU 

GNU/Linux 

GNU/kFreeBSD 

FreeBSD 

NetBSD 

OpenBSD 

Solaris 

Mac OS X 

AIX 

MS Windows 

MS DOS 

GNU Emacs contains code for supporting several other operating systems and 

machine types; however, in many cases we don't know whether they still work. 

The definitive reference for this is the MACHINES file, which is also distributed with 

GNU Emacs; this file also lists the special requirements for compiling GNU Emacs 

on these systems.

Obtaining/Downloading GNU Emacs 

GNU Emacs can be downloaded from http://ftp.gnu.org/pub/gnu/emacs/, or from a 

GNU mirror. 

GNU Emacs development is hosted on savannah.gnu.org. See the Emacs project 

page on Savannah, where the latest development sources are publicly available 

from our Bazaar repository. 

Documentation 

Two Emacs manuals, the GNU Emacs manual and An Introduction to 

Programming in Emacs Lisp, can be purchased in printed form from the FSF store. 

These manuals, along with the Emacs Lisp Reference Manual and several other 

manuals documenting major modes and other optional features, can also be read 

online. They are also distributed with Emacs in Info format; type C-h i in Emacs to 

view them. 

GNU Emacs manual Read Online Purchase 

An Introduction to Programming in Emacs Lisp Read Online Purchase 

Emacs Lisp Reference Manual Read Online (out of print) 

Other Emacs manuals Read Online 

The Emacs distribution includes the full source code for the manuals, as well as 

the Emacs Reference Card in several languages. 

The Emacs FAQ can be read online as HTML or plain text. The Emacs on 

Windows FAQ is available here. The source code for these FAQs is also part of 

the Emacs distribution. 

Support 

To ask for help with GNU Emacs, use the mailing list help-gnu-emacs@gnu.org 

or the newsgroup gnu.emacs.help. The mailing list and 

newsgroup are linked: messages posted on one appear on the other as well. 

To report bugs, or to contribute fixes and improvements, use the built-in 

Emacs bug reporter (M-x report-emacs-bug) or send email to bug-gnu-emacs@gnu.org. 

You can browse our bug database at debbugs.gnu.org. For

more information on contributing, see the CONTRIBUTE file (also distributed 

with Emacs). 

For all other queries, consult the list of Emacs-related mailing lists on 

savannah.gnu.org and the complete list of GNU mailing lists on lists.gnu.org. 

See Get Help with GNU Software for help with GNU software in general. 

Further Information 

The Emacs FAQ (html, plain text) contains information about Emacs history, 

common problems, and how to obtain optional extensions. 

Emacs 24 includes a built-in package manager, which you can use to download 

additional Emacs extensions. Type M-x list-packages to view a list of available 

packages. The default package archive is hosted by the GNU project; more 

archives can be added by customizing the variable package-archives. 

The Emacs Wiki is a community website about using and programming Emacs, 

including information about optional extensions; complete manuals or 

documentation fragments; comments on the different Emacs versions, flavors, and 

ports; and references to other Emacs related information on the Web. 

The Savannah Emacs page has additional information about Emacs, including 

access to the Emacs development sources. 

For those curious about Emacs history: Emacs was originally implemented in 

1976 on the MIT AI Lab's Incompatible Timesharing System (ITS), as a collection 

of TECO macros. The name “Emacs” was originally chosen as an abbreviation of 

“Editor MACroS”. This version of Emacs, GNU Emacs, was originally written in 

1984. For more information, see the 1981 paper by Richard Stallman, describing 

the design of the original Emacs and the lessons to be learned from it, and a 

transcript of his 2002 speech at the International Lisp Conference, My Lisp 

Experiences and the Development of GNU Emacs. 

GNU Emacs Fun 

April Fool Mail - emacs rewrite 

More humor related to GNU Emacs and others 

Here is the cover of the original Emacs Manual for ITS; the cover of the 

original Emacs Manual for Twenex; and (the only cartoon RMS has ever 

drawn) the Self-Documenting Extensible Editor.

The Free Software Foundation is the principal organizational sponsor of the GNU Operating System. 

Our mission is to preserve, protect and promote the freedom to use, study, copy, modify, and 

redistribute computer software, and to defend the rights of Free Software users. Support GNU 

and the FSF by buying manuals and gear, joining the FSF as an associate member or by making a 

donation, either directly to the FSF or via Flattr. 

Please send FSF & GNU inquiries & questions to gnu@gnu.org. There are also other ways 

to contact the FSF. 

Please send comments on these web pages to bug-emacs@gnu.org. 

We thank Greg Harvey for writing this page. 

Copyright © 1998, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2009 Free Software 

Foundation, Inc., 

51 Franklin St, Fifth Floor, Boston, MA 02110, USA 

Verbatim copying and distribution of this entire article is permitted in any medium, provided 

this notice is preserved. 

Updated: $Date: 2012/08/27 08:44:50 $ 

XEmacs Reference Card 

(for version 21.0+) 

Starting Emacs 

To enter XEmacs, just type its name: xemacs 

To read in a file to edit, see Files, below. 

Leaving Emacs 

suspend Emacs (or iconify frame under X) C-z 

exit Emacs permanently C-x C-c 

Files 

read a file into Emacs C-x C-f 

save a file back to disk C-x C-s 

save all files C-x s 

insert contents of another file into this buffer C-x i 

replace this file with the file you really want C-x C-v 

write buffer to a specified file C-x C-w 

Getting Help 

The Help system is simple. Type C-h and follow the directions. 

If you are a first-time user, type C-h t for a tutorial. 

quit Help window q 

scroll Help window space 

apropos: show commands matching a string C-h a 

show the function a key runs C-h c 

describe a function C-h f 

get mode-specific information C-h m 

Error Recovery 

abort partially typed or executing command C-g 

recover a file lost by a system crash M-x recover-file 

recover files from a previous Emacs session M-x recover-session 

undo an unwanted change C-x u or C-_ 

restore a buffer to its original contents M-x revert-buffer 

redraw garbaged screen C-l 

Motion 

entity to move over backward forward 

character C-b C-f 

word M-b M-f 

line C-p C-n 

go to line beginning (or end) C-a C-e 

sentence M-a M-e 

paragraph M-{ M-} 

page C-x [ C-x ] 

sexp C-M-b C-M-f 

function C-M-a C-M-e 

go to buffer beginning (or end) M-< M-> 

scroll to next screen C-v 

scroll to previous screen M-v 

scroll left C-x < 

scroll right C-x > 

scroll current line to center of screen C-u C-l 

Killing and Deleting 

entity to kill backward forward 

character (delete, not kill) DEL C-d 

word M-DEL M-d 

line (to end of) M-0 C-k C-k 

sentence C-x DEL M-k 

sexp M-- C-M-k C-M-k 

kill region C-w 

copy region to kill ring M-w 

kill through next occurrence of char M-z char 

yank back last thing killed C-y 

replace last yank with previous kill M-y 

Marking 

set mark here C-@ or C-SPC 

exchange point and mark C-x C-x 

set mark arg words away M-@ 

mark paragraph M-h 

mark page C-x C-p 

mark sexp C-M-@ 

mark function C-M-h 

mark entire buffer C-x h 

Multiple Windows 

delete all other windows C-x 1 

delete this window C-x 0 

split window in two vertically C-x 2 

split window in two horizontally C-x 3 

scroll other window C-M-v 

switch cursor to another window C-x o 

shrink window shorter M-x shrink-window 

grow window taller C-x ^ 

shrink window narrower C-x { 

grow window wider C-x } 

select buffer in other window C-x 4 b 

display buffer in other window C-x 4 C-o 

find file in other window C-x 4 f 

find file read-only in other window C-x 4 r 

run Dired in other window C-x 4 d 

find tag in other window C-x 4 . 

Formatting 

indent current line (mode-dependent) TAB 

indent region (mode-dependent) C-M-\ 

indent sexp (mode-dependent) C-M-q 

indent region rigidly arg columns C-x TAB 

insert newline after point C-o 

move rest of line vertically down C-M-o 

delete blank lines around point C-x C-o 

join line with previous (with arg, next) M-^ 

delete all white space around point M-\ 

put exactly one space at point M-SPC 

fill paragraph M-q 

set fill column C-x f 

set prefix each line starts with C-x . 

Case Change 

uppercase word M-u 

lowercase word M-l 

capitalize word M-c 

uppercase region C-x C-u 

lowercase region C-x C-l 

capitalize region M-x capitalize-region 

Incremental Search 

search forward C-s 

search backward C-r 

regular expression search C-M-s 

reverse regular expression search C-M-r 

select previous search string M-p 

select next later search string M-n 

exit incremental search RET 

undo effect of last character DEL 

abort current search C-g 

Use C-s or C-r again to repeat the search in either direction. 

If Emacs is still searching, C-g cancels only the part not done. 

Query Replace 

interactively replace a text string M-% 

using regular expressions M-x query-replace-regexp 

Valid responses in query-replace mode are 

replace this one, go on to next SPC or y 

replace this one, don’t move , 

skip to next without replacing DEL or n 

replace all remaining matches ! 

back up to the previous match ^ 

exit query-replace ESC 

enter recursive edit (C-M-c to exit) C-r 

delete match and enter recursive edit C-w 

The Minibuffer 

The following keys are defined in the minibuffer. 

complete as much as possible TAB 

complete up to one word SPC 

complete and execute RET 

show possible completions ? 

fetch previous minibuffer input M-p 

fetch next later minibuffer input M-n 

regexp search backward through history M-r 

regexp search forward through history M-s 

abort command C-g 

Type C-x ESC ESC to edit and repeat the last command that 

used the minibuffer. The following keys are then defined. 

previous minibuffer command M-p 

next minibuffer command M-n 

© 1998 Free Software Foundation, Inc. Permissions on back. v2.0 XEmacs 

Buffers 

select another buffer C-x b 

list all buffers C-x C-b 

kill a buffer C-x k 

Transposing 

transpose characters C-t 

transpose words M-t 

transpose lines C-x C-t 

transpose sexps C-M-t 

Spelling Check 

check spelling of current word M-$ 

check spelling of all words in region M-x ispell-region 

check spelling of entire buffer M-x ispell-buffer 

Tags 

find a tag (a definition) M-. 

find next occurrence of tag C-u M-. 

specify a new tags file M-x visit-tags-table 

regexp search on all files in tags table M-x tags-search 

run query-replace on all the files M-x tags-query-replace 

continue last tags search or query-replace M-, 

Shells 

execute a shell command M-! 

run a shell command on the region M-| 

filter region through a shell command C-u M-| 

start a shell in window *shell* M-x shell 

Rectangles 

copy rectangle to register C-x r r 

kill rectangle C-x r k 

yank rectangle C-x r y 

open rectangle, shifting text right C-x r o 

blank out rectangle M-x clear-rectangle 

prefix each line with a string M-x string-rectangle 

select rectangle with mouse M-button1 

Abbrevs 

add global abbrev C-x a g 

add mode-local abbrev C-x a l 

add global expansion for this abbrev C-x a i g 

add mode-local expansion for this abbrev C-x a i l 

explicitly expand abbrev C-x a e 

expand previous word dynamically M-/ 

Regular Expressions 

any single character except a newline . (dot) 

zero or more repeats * 

one or more repeats + 

zero or one repeat ? 

any character in the set [ . . . ] 

any character not in the set [^ . . . ] 

beginning of line ^ 

end of line $ 

quote a special character c \c 

alternative (“or”) \| 

grouping \( . . . \)

nth group \n 

beginning of buffer \`

end of buffer \'

word break \b 

not beginning or end of word \B 

beginning of word \< 

end of word \> 

any word-syntax character \w 

any non-word-syntax character \W 

character with syntax c \sc 

character with syntax not c \Sc 
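
As a rough illustration of a few of the operators listed above, the sketch below uses Python's `re` module as a stand-in. Python is an assumption here and not part of the card, and its syntax differs: Emacs writes grouping as \( . . . \) and alternation as \|, whereas Python uses bare ( . . . ) and |. The sample strings are made up.

```python
import re

# Emacs regexp  \(foo\|bar\)+  corresponds to Python  (foo|bar)+
pattern = re.compile(r"(foo|bar)+")
match = pattern.search("xx foobarfoo yy")

whole = match.group(0)  # text matched by the entire pattern
last = match.group(1)   # "nth group": the last repetition captured by group 1

# \b is a word break in both dialects and [^ ...] negates a character set.
words = re.findall(r"\b[^\s,]+\b", "alpha, beta gamma")

print(whole, last, words)
```

Emacs-only operators such as \sc (character with syntax c) and the buffer anchors \` and \' have no direct counterpart in this sketch.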

Registers 

save region in register C-x r s 

insert register contents into buffer C-x r i 

save value of point in register C-x r SPC 

jump to point saved in register C-x r j 

Info 

enter the Info documentation reader C-h i 

Moving within a node: 

scroll forward SPC 

scroll reverse DEL 

beginning of node . (dot) 

Moving between nodes: 

next node n 

previous node p 

move up u 

select menu item by name m 

select nth menu item by number (1–5) n 

follow cross reference (return with l) f 

return to last node you saw l 

return to directory node d 

go to any node by name g 

Other: 

run Info tutorial h 

list Info commands ? 

quit Info q 

search nodes for regexp s 

Keyboard Macros 

start defining a keyboard macro C-x ( 

end keyboard macro definition C-x ) 

execute last-defined keyboard macro C-x e 

edit keyboard macro C-x C-k 

append to last keyboard macro C-u C-x ( 

name last keyboard macro M-x name-last-kbd-macro 

insert Lisp definition in buffer M-x insert-kbd-macro 

Commands Dealing with Emacs Lisp 

eval sexp before point C-x C-e 

eval current defun C-M-x 

eval region M-x eval-region 

eval entire buffer M-x eval-current-buffer 

read and eval minibuffer M-ESC 

re-execute last minibuffer command C-x ESC ESC 

read and eval Emacs Lisp file M-x load-file 

load from standard system directory M-x load-library 

Simple Customization 

Here are some examples of binding global keys in Emacs Lisp. 

(global-set-key [(control c) g] 'goto-line)
(global-set-key [(control x) (control k)] 'kill-region)
(global-set-key [(meta #)] 'query-replace-regexp)

An example of setting a variable in Emacs Lisp: 

(setq backup-by-copying-when-linked t) 

Writing Commands 

(defun command-name (args)
  "documentation"
  (interactive "template")
  body)

An example: 

(defun this-line-to-top-of-window (line)
  "Reposition line point is on to top of window.
With ARG, put point on line ARG.
Negative counts from bottom."
  (interactive "P")
  (recenter (if (null line)
                0
              (prefix-numeric-value line))))

The argument to interactive is a string specifying how to get the arguments when the function is called interactively. Type C-h f interactive for more information.

Copyright © 1998 Free Software Foundation, Inc.

designed by Stephen Gildea, April 1998 v2.0 XEmacs

for GNU Emacs version 19 on Unix systems

Updated for XEmacs in February 1995 by Ben Wing

Permission is granted to make and distribute copies of this card provided the copyright notice and this permission notice are preserved on all copies.

For copies of the GNU Emacs manual, write to the Free Software Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.



What is Vim?

Vim is a highly configurable text editor built to enable efficient text editing. It is an improved version of the vi editor distributed with most UNIX systems. Vim is distributed free as charityware. If you find Vim a useful addition to your life please consider helping needy children in Uganda.

What is Vim online?

Vim online is a central place for the Vim community to store useful Vim tips and tools. Vim has a scripting language that allows for plugin-like extensions to enable IDE behavior, syntax highlighting, colorization as well as other advanced features. These scripts can be uploaded and maintained using Vim online.

News

Vim 7.3.659 is the current version

Two decades of productivity: Vim's 20th anniversary

[2011-11-26] Ryan Paul wrote a nice article after figuring out that Vim was born 20 years ago. That is the day Vim was first sent out to the world. I have actually been working on it a bit longer, let's consider that a pregnancy (without the side effects :-). You can find the full article here. (Bram Moolenaar)

Vim charity update

[2011-04-28] Vim users are encouraged to support needy children in Uganda, as a "thank you" for all the work. I have recently visited the project to see what they are doing with our donations. They are doing very well! Read my visit report, with lots of pictures; you can find it here. (Bram Moolenaar)


Recent Script Updates


4,148 scripts, 7,129,574 downloads 

[2012-09-07] Gist.vim : vimscript for gist (6.9) This is an upgrade for Gist.vim: fixed few bugs. - Yasuhiro Matsumoto

[2012-09-06] Python-mode-klen : python mode (0.6.8) Add PEP8 indentation ":help 'pymode_indent'" - Kirill Klenov

[2012-09-06] ConflictMotions : Motions to and inside SCM conflict markers. (1.10) The [z / ]z mappings disable the built-in mappings for moving over the current open fold. Oops! Change default to [= / ]= / i= / a=. (= as for the characters in the separator between our and their change). - Ingo Karkat

[2012-09-06] GrepHere : List occurrences in the current buffer in the quickfix window. (1.10) Make default flags for an empty :GrepHere command configurable via g:GrepHere_EmptyCommandGrepFlags. Default to 'g': List all occurrences, jump to first occurrence. - Ingo Karkat

[2012-09-05] Lucius : Light and dark color scheme for GUI and 256 color terminal. (8.1.2) Fixed some issues that arise from setting Normal at different times in the file. This basically always caused the "background" option to be set to "light". - Jonathan Filip

Vim Tips

The tips are located on the Vim Tips wiki. This is a platform to exchange tips and tricks from and for Vim users.

Vim Patches

A list of patches available for Vim can be found on the vim_dev maillist pages. These add new or improved features, at the cost of having to rebuild Vim.


VIM QUICK REFERENCE CARD 

Basic movement 

h l k j . . . . . . . . . . . . character left, right; line up, down 

b w . . . . . . . . . . . . . . . . . . . . . . . . . . . . . word/token left, right 

ge e . . . . . . . . . . . . . . . . . . . . . end of word/token left, right 

{ } . . . . . . . . . . . . . beginning of previous, next paragraph 

( ). . . . . . . . . . . . . . .beginning of previous, next sentence 

0 gm . . . . . . . . . . . . . . . . . . . . . . . . . beginning, middle of line 

^ $ . . . . . . . . . . . . . . . . . . . . . . . . . first, last character of line 

nG ngg . . . . . . . . . . . . . . . . . . . line n, default the last, first 

n%. . . . . . . .percentage n of the file (n must be provided) 

n| . . . . . . . . . . . . . . . . . . . . . . . . . . . . column n of current line 

%. . . . .match of next brace, bracket, comment, #define 

nH nL . . . . . . . . . . . . line n from start, bottom of window 

M . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . middle line of window 

Insertion & replace → insert mode 

i a . . . . . . . . . . . . . . . . . . . . . . . . . insert before, after cursor 

I A . . . . . . . . . . . . . . . . . . . . insert at beginning, end of line 

gI . . . . . . . . . . . . . . . . . . . . . . . . . . insert text in first column 

o O. . . . . .open a new line below, above the current line 

rc . . . . . . . . . . . . . . . replace character under cursor with c 

grc . . . . . . . . . . . . . . . . like r, but without affecting layout 

R . . . . . . . . . . . . . replace characters starting at the cursor 

gR . . . . . . . . . . . . . . . . . like R, but without affecting layout 

cm . . . . . . . . . . . . . change text of movement command m 

cc or S . . . . . . . . . . . . . . . . . . . . . . . . . . . . . change current line 

C . . . . . . . . . . . . . . . . . . . . . . . . . . . . change to the end of line 

s . . . . . . . . . . . . . . . . . . . . . change one character and insert 

~ . . . . . . . . . . . . . . . . . . . . . . switch case and advance cursor 

g~m . . . . . . . . . . . . switch case of movement command m 

gum gUm . . . lowercase, uppercase text of movement m 

<m >m . . . . . . . . . . shift left, right text of movement m

n<< n>> . . . . . . . . . . . . . . . . . . . . . . . shift n lines left, right

Deletion 

x X . . . . . . . . . . . . . . delete character under, before cursor 

dm . . . . . . . . . . . . . . delete text of movement command m 

dd D . . . . . . . . . . . . . delete current line, to the end of line 

J gJ . . . . . . . . join current line with next, without space 

:rd←↪ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . delete range r lines 

:rdx←↪ . . . . . . . . . . . . . delete range r lines into register x 

Insert mode 

^Vc ^Vn . . . . . . . . . insert char c literally, decimal value n

^A . . . . . . . . . . . . . . . . . . . . . . insert previously inserted text

^@ . . . . . . . same as ^A and stop insert → command mode

^Rx ^R^Rx . . . . . . . . . insert content of register x, literally

^N ^P . . . . . . . . . . . . . . text completion before, after cursor

^W . . . . . . . . . . . . . . . . . . . . . . . . . . . delete word before cursor

^U . . . . . . . . . . delete all inserted characters in current line

^D ^T . . . . . . . . . . . . . . . . . . . shift left, right one shift width

^Kc1c2 or c1←c2 . . . . . . . . . . . . . . . . . . enter digraph {c1, c2}

^Oc . . . . . . . . . . . . execute c in temporary command mode

^X^E ^X^Y . . . . . . . . . . . . . . . . . . . . . . . . . . . . . scroll up, down

〈esc〉 or ^[ . . . . . . . . . abandon editing → command mode

Copying 

"x . . . . . . . . . . . . use register x for next delete, yank, put 

:reg←↪ . . . . . . . . . . . . . . . show the content of all registers 

:reg x←↪ . . . . . . . . . . . . . . show the content of registers x 

ym . . . . . . . . . . . yank the text of movement command m 

yy or Y . . . . . . . . . . . . . . . . . . .yank current line into register 

p P . . . . . . . . . . . put register after, before cursor position 

]p [p . . . . . . . . . . . . . . . . . . . like p, P with indent adjusted 

gp gP . . . . . . . . . . . like p, P leaving cursor after new text 

Advanced insertion 

g?m . . . . . . . . . . perform rot13 encoding on movement m 

n^A n^X . . . . . . . . . . . . . . +n, −n to number under cursor

gqm . . . . . . . format lines of movement m to fixed width 

:rce w←↪ . . . . . . . . . . . center lines in range r to width w 

:rle i←↪ . . . . . . . left align lines in range r with indent i 

:rri w←↪ . . . . . . right align lines in range r to width w 

!mc←↪ . filter lines of movement m through command c 

n!!c←↪ . . . . . . . . . . . . . . filter n lines through command c 

:r!c←↪ . . . . . . . . . filter range r lines through command c 

Visual mode 

v V ^V . . start/stop highlighting characters, lines, block

o . . . exchange cursor position with start of highlighting 

gv . . . . . . . . . . . start highlighting on previous visual area 

aw as ap . . . . . . . select a word, a sentence, a paragraph 

ab aB . . . . . . . . . . . . . . . . . . . select a block ( ), a block { } 

Undoing, repeating & registers 

u U . . . . . . undo last command, restore last changed line 

. ^R . . . . . . . . . . . . . . . . repeat last changes, redo last undo

n. . . . . . . repeat last changes with count replaced by n 

qc qC . . . .record, append typed characters in register c 

q . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .stop recording 

@c . . . . . . . . . . . . . . . . . . . . execute the content of register c 

@@ . . . . . . . . . . . . . . . . . . . . . . . . repeat previous @ command 

:@c←↪ . . . . . . . . . . . execute register c as an Ex command 

:rg/p/c←↪. . . . . . . . . .execute Ex command c on range r 

⌊ where pattern p matches 

Complex movement 

- + . . . . . . . . . line up, down on first non-blank character 

B W . . . . . . . . . . . . . . . . . . . space-separated word left, right 

gE E . . . . . . . . . . . end of space-separated word left, right 

n_ . . . . . . . . down n − 1 line on first non-blank character

g0 . . . . . . . . . . . . . . . . . . . . . . . . . . . . beginning of screen line 

g^ g$. . . . . . . . . . . . . . . .first, last character of screen line 

gk gj . . . . . . . . . . . . . . . . . . . . . . . . . . . . screen line up, down 

fc Fc . . . . . . . . . . next, previous occurrence of character c

tc Tc . . . . . . . . . . . . . before next, previous occurrence of c

; , . . . . . . . . . . . . . repeat last fFtT, in opposite direction 

[[ ]] . . . . . . . . . . . . . . start of section backward, forward 

[] ][ . . . . . . . . . . . . . . . end of section backward, forward 

[( ]) . . . . . . . . . . . . . . . . . unclosed (, ) backward, forward 

[{ ]} . . . . . . . . . . . . . . . . unclosed {, } backward, forward 

[m ]m . . . . . . . . start of backward, forward Java method 

[# ]#.unclosed #if, #else, #endif backward, forward 

[* ]* . . . . . . . . . . start, end of /* */ backward, forward 

Search & substitution 

/s←↪ ?s←↪ . . . . . . . . . . . . . search forward, backward for s 

/s/o←↪ ?s?o←↪ . . . . . search fwd, bwd for s with offset o 

n or /←↪ . . . . . . . . . . . . . . . . . . . . . repeat forward last search 

N or ?←↪ . . . . . . . . . . . . . . . . . . . repeat backward last search 

# * . . . search backward, forward for word under cursor 

g# g* . . . . . . . . . . . . . same, but also find partial matches 

gd gD . . . local, global definition of symbol under cursor 

:rs/f/t/x←↪ . . . . . . . . . . . . . . substitute f by t in range r 

⌊ x : g—all occurrences, c—confirm changes 

:rs x←↪. . . . . . . . . . .repeat substitution with new r & x
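
The difference between a plain substitution and one with the g flag can be sketched as follows. This is only an illustration in Python, which is an assumption and not part of the card: the helper name `substitute` is hypothetical, Python's pattern syntax is not Vim's, the range r is modeled simply as the list of lines passed in, and the c (confirm) flag is not modeled.

```python
import re

def substitute(lines, pattern, replacement, all_occurrences=False):
    """Mimic :rs/f/t/ (first match on each line) or :rs/f/t/g (every match)."""
    # For re.sub, count=0 means "replace every match"; count=1 stops after the first.
    count = 0 if all_occurrences else 1
    return [re.sub(pattern, replacement, line, count=count) for line in lines]

buffer_lines = ["foo foo", "bar foo"]          # made-up buffer contents
first_only = substitute(buffer_lines, "foo", "X")
every = substitute(buffer_lines, "foo", "X", all_occurrences=True)

print(first_only)
print(every)
```

Without g only the first match on each line in the range changes; with g every match on those lines changes.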

Special characters in search patterns 

. ^ $ . . . . . . . . . . . any single character, start, end of line

\< \> . . . . . . . . . . . . . . . . . . . . . . . . . . . . . start, end of word 

[c1-c2] . . . . . . . . . . . . . . a single character in range c1..c2 

[^c1-c2] . . . . . . . . . . . . . . . . a single character not in range

\i \k \I \K . . . . . . . an identifier, keyword; excl. digits 

\f \p \F \P . . a file name, printable char.; excl. digits 

\s \S . . . . . . . . . . . . . . . . a white space, a non-white space 

\e \t \r \b . . . . . . . . . . . . . . . . . . . 〈esc〉, 〈tab〉, 〈←↪〉, 〈←〉 

\= * \+ . . . . match 0..1, 0..∞, 1..∞ of preceding atoms 

\| . . . . . . . . . . . . . . . . . . . . . . . separate two branches (≡ or) 

\( \) . . . . . . . . . . . . . . . . . . . . group patterns into an atom

\& \n . . . . . . . the whole matched pattern, nth () group

\u \l . . . . . . . . . . . next character made upper, lowercase 

\c \C. . . . . . . . . . . . . .ignore, match case on next pattern 

Offsets in search commands 

n or +n . . . . . . . . . . . . . . . . . . . n line downward in column 1 

-n . . . . . . . . . . . . . . . . . . . . . . . . . n line upward in column 1 

e+n e-n . . . . . . . n characters right, left to end of match 

s+n s-n . . . . . . n characters right, left to start of match 

;sc . . . . . . . . . . . . . . . . . . execute search command sc next 

Marks and motions 

mc . . . . . . . . . mark current position with mark c ∈ [a..Z] 

`c `C . . . . . . . . . . . go to mark c in current, C in any file

`0..9 . . . . . . . . . . . . . . . . . . . . . . . . . . . go to last exit position

`` `" . . . . . . . . . . go to position before jump, at last edit

`[ `] . . . . . go to start, end of previously operated text

:marks←↪. . . . . . . . . . . . . . . . . . .print the active marks list 

:jumps←↪ . . . . . . . . . . . . . . . . . . . . . . . . . . print the jump list 

n^O . . . . . . . . . . . . . . . go to nth older position in jump list

n^I . . . . . . . . . . . . . . go to nth newer position in jump list

Key mapping & abbreviations 

:map c e←↪. . . . . . .map c ↦→ e in normal & visual mode 

:map! c e←↪ . . . . map c ↦→ e in insert & cmd-line mode 

:unmap c←↪ :unmap! c←↪ . . . . . . . . . . remove mapping c 

:mk f←↪ . . . write current mappings, settings... to file f 

:ab c e←↪ . . . . . . . . . . . . . . . . . add abbreviation for c ↦→ e 

:ab c←↪ . . . . . . . . . . . .show abbreviations starting with c 

:una c←↪ . . . . . . . . . . . . . . . . . . . . . . . remove abbreviation c 

Tags 

:ta t←↪. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .jump to tag t 

:nta←↪ . . . . . . . . . . . . . . . . . . jump to nth newer tag in list

^] ^T . . . jump to the tag under cursor, return from tag

:ts t←↪ . . . . list matching tags and select one for jump 

:tj t←↪. .jump to tag or select one if multiple matches 

:tags←↪ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . print tag list 

:npo←↪ :n^T←↪ . . . . . . jump back from, to nth older tag

:tl←↪ . . . . . . . . . . . . . . . . . . . . . . jump to last matching tag

^W} :pt t←↪ . . . . . . . . . . . preview tag under cursor, tag t

^W] . . . . . . . . . . . split window and show tag under cursor

^Wz or :pc←↪ . . . . . . . . . . . . . . . . . close tag preview window

Scrolling & multi-windowing 

^E ^Y . . . . . . . . . . . . . . . . . . . . . . . . . . . . . scroll line up, down

^D ^U . . . . . . . . . . . . . . . . . . . . . . scroll half a page up, down

^F ^B . . . . . . . . . . . . . . . . . . . . . . . . . . . . scroll page up, down

zt or z←↪ . . . . . . . . . . . . . set current line at top of window 

zz or z. . . . . . . . . . . .set current line at center of window 

zb or z-. . . . . . . . . . .set current line at bottom of window 

zh zl . . . . . . . . . . . . scroll one character to the right, left 

zH zL . . . . . . . . . . . . . scroll half a screen to the right, left 

^Ws or :split←↪ . . . . . . . . . . . . . . . . . . . split window in two

^Wn or :new←↪ . . . . . . . . . . . . . . . . create new empty window

^Wo or :on←↪ . . . . . . . make current window one on screen

^Wj ^Wk . . . . . . . . . . . . . . . . . move to window below, above

^Ww ^W^W . . . . . . . . . move to window below, above (wrap)

Ex commands (←↪) 

:e f . . . . . . . edit file f, unless changes have been made 

:e! f . . . . edit file f always (by default reload current) 

:wn :wN . . . . . . . . . write file and edit next, previous one 

:n :N. . . . . . . . . . . . . . . . . . . .edit next, previous file in list 

:rw . . . . . . . . . . . . . . . . . . . . . . . write range r to current file 

:rw f . . . . . . . . . . . . . . . . . . . . . . . . . . .write range r to file f 

:rw>>f . . . . . . . . . . . . . . . . . . . . . . .append range r to file f 

:q :q!. . . . .quit and confirm, quit and discard changes 

:wq or :x or ZZ . . . . . . . . . . . . . write to current file and exit 

〈up〉 〈down〉 . . . . recall commands starting with current 

:r f . . . . . . . . . . . . . . insert content of file f below cursor 

:r! c. . . . . . . .insert output of command c below cursor 

:args . . . . . . . . . . . . . . . . . . . . . . . display the argument list 

:rco a :rm a. . . . . . . . .copy, move range r below line a 

Ex ranges 

, ; . . . . . . separates two lines numbers, set to first line 

n . . . . . . . . . . . . . . . . . . . . . . . . . . . an absolute line number n 

. $ . . . . . . . . . . . . . . . . the current line, the last line in file 

% * . . . . . . . . . . . . . . . . . . . . . . . . . . . . . entire file, visual area 

't . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . position of mark t

/p/ ?p?. . . . . . .the next, previous line where p matches 

+n -n . . . . . . . . . . . +n, −n to the preceding line number 

Folding 

zfm . . . . . . . . . . . . . . . . . . . . . . . create fold of movement m 

:rfo. . . . . . . . . . . . . . . . . . . . . . . . . . . .create fold for range r 

zd zE . . . . . . . . . . . . . . delete fold at cursor, all in window 

zo zc zO zC . . . . . . . . . . open, close one fold; recursively 

[z ]z. . . . . . . . . .move to start, end of current open fold 

zj zk . . . . . . . . move down, up to start, end of next fold 

Miscellaneous 

:sh←↪ :!c←↪. . .start shell, execute command c in shell 

K. . . . . . . . . . . . . . .lookup keyword under cursor with man 

:make←↪ . . . . . . start make, read errors and jump to first 

:cn←↪ :cp←↪ . . . . . . . . . . display the next, previous error 

:cl←↪ :cf←↪ . . . . . . . list all errors, read errors from file 

^L ^G . . . . . . . redraw screen, show filename and position

g^G . . . show cursor column, line, and character position

ga . . . . . . . . . show ASCII value of character under cursor

gf . . . . . . . . . . . . . open file whose name is under cursor

:redir>f←↪ . . . . . . . . . . . . . . . . . . redirect output to file f 

:mkview [f] . . . . . . . . . save view configuration [to file f] 

:loadview [f] . . . . load view configuration [from file f] 

^@ ^K ^_ \ Fn ^Fn . . . . . . . . . . . . . . . . . . . . unmapped keys

This card may be freely distributed under the terms of the GNU general public licence — Copyright © 2003 by Laurent Grégoire 〈laurent.gregoire@icam.fr〉 — v1.7 — The author assumes no responsibility for any errors on this card. The latest version can be found at http://tnerual.eriogerg.free.fr/


TextPad ® 6.1 is a powerful, general purpose editor for plain text files. Easy to use, with all the features a power user requires.

Supported platforms for all products include Windows 7, Vista, XP, and Server 2003 and 2008.

WildEdit ® 2.0 is an interactive tool for power users to make the same changes to a set of text files in a folder hierarchy.

International editions for TextPad in Dutch, English, French, German, Italian, Japanese, Polish, Portuguese (Brazilian) and Spanish.

Copyright © 2012 Helios Software Solutions. All rights reserved.

TextPad Quick Reference Card 

version 0.03 – editor: John Bokma – freelance programmer 

Cursor Movement 

Cursor left one character ← 

Cursor left one word c-← 

Cursor right one character → 

Cursor right one word c-→ 

Cursor down one line ↓ 

Cursor down to the start of the next paragraph a-↓ 

Cursor up one line ↑ 

Cursor up to the start of the previous paragraph a-↑ 

Move cursor forward to start of word c-W 

Move the cursor back to start of word c-B 

Move cursor back to end of word c-D 

Cursor to start of line, press twice to go to the left margin Home 

Cursor to end of line End 

Cursor to start of document c-Home 

Cursor to end of document c-End 

Cursor to the first visible line, in the current column, if possible a-Home

Cursor to the last visible line, in the current column, if possible a-End

Move cursor to the next tab stop, or indent selected lines Tab

Move cursor to the previous tab stop, or reduce indentation of selected lines s-Tab

Go to line c-G 

Find matching { [ ( < or > ) ] } c-M 

Deleting 

Delete selection, or character before the cursor (replace it with a space in overtype mode) Backspace

Delete back to the last start of word c-Backspace 

Delete selection, or character after the cursor Delete 

Delete forward to the next start of word c-Delete 

Delete to the end of the line c-s-Delete 

Delete all lines in the document a-Delete 

Undo and Redo 

Undo last edit c-Z 

Undo all edits c-s-Z 

Redo last undo c-Y 

Redo all undos c-s-Y 

Selection and Clipboard 

Select all c-A 

Cancel any existing selection Escape 

Select left one character s-← 

Select left one word c-s-← 

Select right one character s-→

Select right one word c-s-→

Select down one line s-↓

Select to the start of the next paragraph a-s-↓

Select up one line s-↑

Select to the start of the previous paragraph a-s-↑

Select forward to start of word c-s-W

Select back to start of word c-s-B

Select back to end of word c-s-D

Select to start of line, press twice to select to the left margin s-Home

Select to end of line s-End 

Select to start of document c-s-Home 

Select to end of document c-s-End 

Select to matching { [ ( < or > ) ] } c-s-M 

Switch in and out of selection mode c-Q-S 

Copy selection to clipboard c-C 

Append selection to clipboard c-s-C 

Cut the selection to the clipboard c-X 

Cut and append the selection to the clipboard c-s-X 

Paste text from the clipboard c-V 

Indent selected lines Tab 

Reduce indentation of selected lines s-Tab 

Delete selection Backspace 

Delete selection, or character after the cursor Delete 

Invert case of selection c-K 

Convert first character of selection to upper case and the rest to lower case c-s-U

Check the spelling of the selection F7 

Formatting 

Start a new line Enter 

Insert new line after current line c-Enter 

Insert new line before current line c-s-Enter 

Increase indentation c-I 

Reduce indentation c-s-I 

Join selected lines c-J 

Reformat selected lines c-s-J 

Split word-wrapped lines c-a-J 

Center text c-E 

Right align text c-s-E 

Insert a page break c-s-L 

Display/hide visible spaces, tabs and paragraphs c-Q-I 

Display/hide line numbers c-Q-L 

Set the right margin at the cursor position c-Q-R 

Switch in and out of word-wrap mode c-Q-W 

Case Change and Transposing 

Convert selection to lower case c-L 

Convert selection to upper case c-U 

Convert first character of selection to upper case and the rest to lower case c-s-U

Invert case of selection c-K 

Transpose the lines or characters either side of the cursor c-T 

Transpose the words either side of the cursor c-s-T 

Search and Replace 

Invoke the Replace dialog box F8 

Replace next instance of search pattern c-F8 

Invoke the Find dialog box F5 

Invoke the Find in Files dialog box c-F5 

Find next instance of search pattern c-F 

Find previous instance of search pattern c-s-F 

Hypertext jump in Search Results window Enter 

Hypertext jump to next item in Search Results window F4 

Hypertext jump to previous item in Search Results window s-F4 

Activate the Search Results window s-F11 

Bookmarks 

Set or clear a bookmark on the current line c-F2 

Go to next bookmark F2 

Go to previous bookmark s-F2 

Edit Modes 

Switch between insert and overtype mode Insert 

Switch in and out of block select mode c-Q-B 

Switch between read-only and edit modes c-Q-E 

Switch in and out of word-wrap mode c-Q-W 

Macros 

Record a new macro c-s-R 

Playback the scratch macro c-R 

Invoke the Playback Macro dialog box c-F7 

Documents 

Create a new document c-N 

Save the active document c-S 

Save all documents c-s-S 

Save as F12 

Open a document using the Open File dialog box c-O 

Open a document by typing its name c-s-O 

Insert the contents of a file at the cursor position c-s-V 

Delete all lines in the document a-Delete 

Next window c-Tab or c-F6 

Previous window c-s-Tab or c-s-F6 

Close the active window c-F4 

Display in-context properties dialog box a-Enter 

Display document statistics on status bar c-F1 

Invoke the Manage Files dialog box F3 

Invoke Windows File Manager or Explorer a-F3 

Print active document c-P 

Preview the active document as it will print c-s-P 

Check the spelling of the active document F7 

Sort F9 

Compare c-F9 

Invoke the document selector F11 

Scrolling and Scroll Bars 

Scroll the view up one line, without moving the cursor c-↓ 

Scroll the view down one line, without moving the cursor c-↑ 

Locks cursor position when scrolling with page up/down keys Scroll Lock

Display/hide the horizontal scroll bar c-Q-H 

Display/hide the vertical scroll bar c-Q-V 

Switch in and out of synchronized scrolling mode c-Q-Y

Command Results 

Stop the tool running in the command window c-Break 

Hypertext jump in Command Results window Enter 

Hypertext jump to next item in Command Results window F4 

Hypertext jump to previous item in Command Results window s-F4 

Activate the Command Results window c-F11 

Views 

Activate next view F6 

Activate previous view s-F6 

Help 

In-context help F1 

Invoke in-context help cursor s-F1 

Miscellaneous 

Activate the Clip Library a-0 

Show or hide the Clip Library c-F3 

Display in-context properties dialog box a-Enter 

Activate the main menu F10 

Popup the in-context document menu s-F10 or right mouse 

Popup the insert date/time menu c-F10 or c-right mouse 

Display the Preferences dialog box c-Q-P 

Regular Expressions (POSIX) 

. Any single character.

[ ] Any one of the characters in the brackets, or any of a range of characters separated by a hyphen (-), or a character class operator (see below).

[^] Any characters except for those after the caret "^".

^ The start of a line (column 1).

$ The end of a line (not the line break characters).

\< The start of a word.

\> The end of a word.

\t The tab character.

\f The page break (form feed) character.

\n A new line character, for matching expressions that span line boundaries. This cannot be followed by operators '*', '+' or {}. Do not use this for constraining matches to the end of a line. It's much more efficient to use "$".

\xdd "dd" is the two-digit hexadecimal code for any character.

\( \) Groups a tagged expression to use in replacement expressions. An RE can have up to 9 such expressions.

\| Matches either the expression to its left or its right.

* Matches zero or more preceding characters/expressions.

? Matches zero or one preceding characters/expressions.

+ Matches one or more preceding characters/expressions.

{count} Matches the specified number of the preceding characters or expressions.

{min,} Matches at least the specified number of the preceding characters or expressions.

{min,max} Matches between min and max of the preceding characters or expressions.

\ "Escapes" the special meaning of the above expressions, so that they can be matched as literal characters.

[:alpha:] Any letter.

[:lower:] Any lower case letter.

[:upper:] Any upper case letter.

[:alnum:] Any digit or letter.

[:digit:] Any digit.

[:xdigit:] Any hexadecimal digit (0-9, a-f or A-F).

[:blank:] Space or tab.

[:space:] Space, tab, vertical tab, return, line feed, form feed.

[:cntrl:] Control characters (Delete and ASCII codes less than space).

[:print:] Printable characters, including space.

[:graph:] Printable characters, excluding space.

[:punct:] Anything that is not a control or alphanumeric character.

[:word:] Letters, hyphens and apostrophes.

[:token:] Any of the characters defined on the Syntax page for the document class, or in the syntax definition file if syntax highlighting is enabled for the document class.

Replacement Expressions 

& Substitute the text matching the entire search pattern. 

\0 to \9 Substitute the text matching tagged expression 0 through 

9. \0 is equivalent to &. 

\f Substitute a page break (form feed). 

\i Substitute a sequence number. 

\n Substitute a newline. 

\p Substitute the contents of the clipboard. 

\t Substitute a tab. 

\xdd Substitute the character with hex code dd (must be 2 hex 

digits, excluding 00). 

\u Force the next substituted character to be in upper case. 

\l Force the next substituted character to be in lower case. 

\U Force all subsequent substituted characters to be in 

upper case. 

\L Force all subsequent substituted characters to be in 

lower case. 

\E or \e Turns off previous \U or \L. 

Tool Parameter Macros 

$File The fully qualified filename of the current 

document. 

$DOSFile Same as $File, except that DOS aliases are 

substituted for any long names in the path, and 

characters are converted to the DOS (OEM) code 

set. 

$UNIXFile Same as $File, except any '\' characters are 

changed to '/'. 

$FileName The simple filename of the current document. 

$BaseName $FileName stripped of any extension. 

$DOSBaseName Same as $BaseName, except that the DOS alias 

is substituted for a long file name, and characters 

are converted to the DOS (OEM) code set. 

$WspBaseName The workspace filename, stripped of any path 

and extension. 

$FileDir The drive and directory of the current document. 

$WspDir The drive and directory of the current workspace 

file. 

$FilePath The directory of the current document, stripped 

of the drive. 

$UnixPath Same as $FilePath, except any '\' characters are 

changed to '/'. 

$Dir The current working drive and directory. 

$UNIXDir Same as $Dir, except any '\' characters are 

changed to '/'. 

$Line The cursor line within the current document. 

$Col The cursor column within the current document. 

$Prompt Prompt for a value to substitute for $Prompt. If it 

is followed by a string in brackets, that string 

will be displayed in the prompt dialog box. 

$Password Prompt for a value to substitute for $Password. 

The value will not be echoed as it is typed. If it is 

followed by a string in brackets, that string will 

be displayed in the prompt dialog box. 

$Sel Selected text in the active document. This is 

limited to the first line in a multi-line selection. 

$SelLine The text on the line containing the cursor. This 

has the side effect of selecting that line. 

$SelWord The word containing the cursor. This has the side 

effect of selecting that word. 

$Clip Selected text in the active document, or the 

whole document if nothing is selected, is copied 

to the clipboard before running the tool. 

$AppWnd The handle of the main application window. This 

is a decimal number. 

$DocWnd The handle of the active document's window. 

This is a decimal number. 

$Encoding The character encoding of the active document. 

This takes one of the forms windows-ddd (or cpddd for 

DOS), UTF-8, UTF-16LE or UTF-16BE, where 

ddd is a code page number. 

Page Header/Footer Macros 

&n The normal font for subsequent text 

&b A bold font for subsequent text 

&i An italic font for subsequent text 

&I A bold italic font for subsequent text 

&l Subsequent text to be left justified 

&c Subsequent text to be centered (this is the default) 

&r Subsequent text to be right justified 

&d The current date in Windows short form 

&D The current date in Windows long form 

&t The current time in Windows format 

&f The filename, excluding its path 

&F The full filename, including its path 

&p The page number 

&P The total number of pages 

Based on the TextPad help file. Edited by John Bokma (freelance 

programmer). For the latest version: http://johnbokma.com/textpad/

Department of Engineering 

IT Services 

University of Cambridge Department of Engineering Computing Help 


Local updates (last changes May 2011) 

Text Processing using LaTeX 

TeX is a powerful text processing language and is now the required format for some 
periodicals. TeX has many macros, to which you can eventually add your own. LaTeX 
is a macro package which sits on top of TeX and provides all the structuring facilities 
to help with writing large documents. Automated chapter and section macros are 
provided, together with cross referencing and bibliography macros. LaTeX tends to 
take over the style decisions, but all the benefits of plain TeX are still present when it 
comes to doing maths. The Why LaTeX? page discusses LaTeX's strengths/weaknesses. 

On CUED's central system you can run LaTeX from the command line using latex or 
pdflatex. We also have Kile and LyX. 

Introductions 

LaTeX: An introduction, Advanced LaTeX (full of examples) and LaTeX Maths and 

Graphics contain all you'll need to know for writing most documents - the "how" 

rather than the "why". 

LaTeX workshop exercise for beginners 

The Not So Short Introduction to LaTeX2e is a 141 page introduction to LaTeX2e 
by Tobias Oetiker et al. Worth a read. There are versions in German, French, 
Italian, etc. 

The very short guide to typesetting with LATEX (4 pages) 

LaTeX and Friends (M.R.C. van Dongen) (250+ pages) 

LaTeX for Complete Novices (Nicola L. C. Talbot) 

Introduzione al Mondo di LaTeX is a guide (PDF slides) in Italian 

online tutorials (Andy Roberts) 

A Simplified Introduction to LaTeX (by H.J. Greenberg) 

TeX Resources (A.J. Hildebrand) 

LaTeX for Word Processor Users 

The Indian TeX Users Group has tutorials on several subjects. 

The LaTeX Wikibook 

Making Friends with Latex 

LaTeX course (University of Cambridge Computing Service) 

Packages 

There are numerous "add-ons" for LaTeX. Some (like caption, enumerate, and 

fancyhdr) slightly enhance existing features, others provide extensive new 

functionality. The TeX and LaTeX Catalogue describes packages available elsewhere. 

See the Configuring LaTeX document if you intend to install many packages. 

Bibliographies, Graphics and Maths 

Front/Back matter 

See the bibliographies page. 

Bibliographies with biblatex 

Natural Science Citations - provides many options. See also the reference sheet 

CTAN has many bibliography styles in its bibtex section. 

Using Makeindex. How to add an index to your document 

Simple LaTeX Glossaries and Acronyms using the glossaries package 

The glossaries documentation 

The nomencl package How to add nomenclature sections 

Graphics 

Using Imported Graphics in LaTeX and PDFLaTeX (by Keith Reckdahl) explains all 
there is to know about putting graphics into LaTeX documents. The Hints about 
tables and figures in LaTeX and Hints on adding figures to multicolumn 
environments documents deal with common problems. See also Klaus Hoeppner's 
Strategies for including graphics in LaTeX documents. 

Graphics for Inclusion in Electronic Documents (Ian Hutchinson) 

The xfig graphics editor. 

Gnuplot displays data graphically. Use its "set term postscript eps color" to 
produce a PostScript file which can be added to your LaTeX document in the usual 
way. Matlab may be preferable. 

The pstricks tutorial shows how to use the pstricks package to produce line 
drawings. 

Matlab graphics with LaTeX 

The psfrag handout addresses the common problem of how to add LaTeX maths to 
a PostScript file. 

Maths 

Part of Math into LaTeX (by G. Grätzer) is online 

AMS-LaTeX provides specialist support. 

The Short Math Guide for LaTeX comes from the American Mathematical Society 

mathmode (133 pages) by Herbert Voß is useful. 

Matlab has some support for LaTeX production. Type "help latex" inside matlab for 

details. 

Effective Scientific Electronic Publishing (by Markus G. Kuhn) and AcroTeX by 

D.P.Story cover PDF production. 

Maths cheat sheet (Martin Jansche) 

Math Tutorial for mimeTeX 

A Survey of Free Math Fonts for TeX and LaTeX (Stephen G. Hartke) 

Detexify - LaTeX symbol classifier lets you draw a symbol and will give you the 

corresponding LaTeX 

Tables 

Tables in LaTeX: packages and methods 

Guides to writing various types of documents 

Posters and booklets 

Creating Technical Posters With LaTeX (by Nicola Talbot ) 

Reports (the squeezing space in LaTeX notes may also be useful) 

Using LaTeX to Write a PhD Thesis (Nicola L. C. Talbot) 

LaTeX IIB project report classes 

Harish Bhanderi's CUED PhD/MPhil Thesis Style 

Presentations and OHP slides 

HTML or PDF from LaTeX 

Creating a PDF document using PDFlatex (by Nicola Talbot) 

Producing PDF 

Multi-column output 

For collaborative or multi-draft documents, latexdiff might be useful. Doing 

latexdiff -CCHANGEBAR old.tex new.tex > diff.tex 

pdflatex diff.tex

should produce a document that compares and contrasts the 2 versions of the file. 

CUED users can access the current university identifiers (crests) using 

\includegraphics{BWUni3.eps} or \includegraphics{CUni3.eps} on our linux servers. 

These should only be used in their original sizes. 

Other sources of information 

General 

You can do a keyword search of the LaTeX documents on this server. 

LaTeX Matters (a blog) 

See the Frequently Asked Questions (or the Engineering Department's LaTeX FAQ) 

for more information. 

The UK archive of TeX-related material, CTAN contains everything to do with 

LaTeX. Use the CTAN search to search your nearest CTAN archive. 

TeX Live documentation 

Hypertext Help with LaTeX (an extensive indexed reference) 

The TeX Users Group (TUG) keeps lists of TeX resources and packages (free and 

commercial), etc. The LaTeX project site is useful too. 

References for TeX and Friends from mixie.org offers material in several formats. 

LaTeX cheat sheet 

The comp.text.tex newsgroup covers LaTeX issues. 

tex.stackexchange.com is a forum for questions and answers 

The PracTeX Journal includes low-tech articles like \begin{here} % getting started 

etc. 

texdoctk is often installed with LaTeX. It's an easy way to access installed 

documentation 

Distributions 

Note that the "front-end" (the program with 

an editor, buttons and menus) and the LaTeX files may well be separately distributed. 

If you install texmaker, for example, it will assume that you've already downloaded 

the latex system. 

Distributions for many machine types are available in CTAN's systems directory. 

For MS Windows 95/98/NT/2000 machines, proTeXt (based on MiKTeX) is worth a 
look. See LaTeX using MikTeX and WinEdt for information about using MiKTeX and 
WinEdt on Windows. BaKoMa TeX might also be useful. 

TeX Live has binaries for most flavors of Unix, including GNU/Linux, and also 

Windows 

MacTeX for Macs includes support for using Mac fonts. 

The Macintosh TeX/LaTeX Web Site is very informative. 

Converters 

We have a site licence for tex2word. Contact Peter Benie (pjb1008) for help with it 
(the demo licence fails to convert some files that the full licence handles). 
In addition: 

wvLaTeX is installed (Word to LaTeX). 

OpenOffice has an option to export Word files as LaTeX 

There's a list of RTF/Word/WP - LaTeX - converters online. 

Excel2Latex may be useful to Windows users 

Fonts and Characters 

Using common PostScript fonts with LaTeX 

The Comprehensive LaTeX Symbol List 

LaTeX and fonts 

The Font Installation Guide (Philipp Lehman) 

character sets 

Typesetting 

The memoir package has very extensive documentation about design.

The CUED library page has sections on writing style guides and bibliography 

production. 

Editors/Front-ends 

With Kile (installed on our local system - type kile in the Terminal window to start it) 
you still need to type LaTeX code, but Kile has many facilities (templates, wizards, 
etc) to make it easier. You should be able to find what you want in the menus 
(for example, the File->Statistic option gives a word-count, etc). You can print the 
LaTeX file directly from Kile. To print the output file you need to use another 
program. For example, if you want to create a PDF file you can produce the DVI 
file, use the Build->Convert->DVItoPDF option, then the Build->View->ViewPDF 
option to view the file. The viewer has a Print option. 

LyX is a WYSIWYG front-end for LaTeX that's getting better all the time. It's 
installed on our teaching system. Warning: it may not always be easy to convert 
between LaTeX and LyX formats - use at your own risk! 

Texmaker (not installed) is a free cross-platform LaTeX editor. 

LEd is a free integrated development environment (IDE) for use with Windows 
95/98/Me/NT4/2000/XP/2003/Vista operating systems. 

The emacs editor offers extra menus when a LaTeX file is loaded. 

Miscellaneous 

Configuring LaTeX 

Extending LaTeX 

Travels in TeX Land: Tweaking LaTeX (David Walden) 

Printing PDF from LaTeX onto A4 

LaTeX tips (Volker Koch) 

PostScript, PDF and LaTeX versions of local documentation are online.

Updates 

July 2012 - TeXLive 2011 installed 

May 2011 - biblatex installed 

May 2009 - LaTeX removed from gate. Use one of the Linux servers 

May 2009 - IIB project classes (also for LyX users) 

February 2009 - latexdiff program installed - to determine and mark up 

differences between two latex files. Type man latexdiff for details. 

January 2009 - glossaries package installed, to supersede glossary. See the 

glossaries documentation for details. 

September 2008 - The TeX Live distribution has replaced the teTeX distribution. 

Users shouldn't notice any difference. 

September 2007 - nomencl (nomenclature package) updated to version 4.2. It's 

incompatible with the old version - use \usepackage[compatible]{nomencl} if you 

want the old behaviour. See the documentation for details 

August 2007 - Metapost (mpost) and purifyeps installed 

July 2007 - TeTeX 3.0 installed on the teaching system 

23/10/06 - Harish Bhanderi's CUED PhD/MPhil Thesis Style 

Example 

One way to get started with LaTeX is to look at a simple example. A short document is 

reproduced below. Engineering Department users can find a file with a similar 

structure in /export/Examples/LaTeX/demo0.tex. Further examples (a letter, a CV, 

etc) are in the same directory. 

\documentclass{article} 

\begin{document} 

\section{Simple Text} % THIS COMMAND MAKES A SECTION TITLE. 

Words are separated by one or more spaces. Paragraphs are separated by 

one or more blank lines. The output is not affected by adding extra 

spaces or extra blank lines to the input file. 

Double quotes are typed like this: ``quoted text''. 

Single quotes are typed like this: `single-quoted text'. 

Long dashes are typed as three dash characters---like this. 

Italic text is typed like this: \textit{this is italic text}. 

Bold text is typed like this: \textbf{this is bold text}. 

\subsection{A Warning or Two} % THIS COMMAND MAKES A SUBSECTION TITLE. 

If you get too much space after a mid-sentence period---abbreviations 

like etc.\ are the common culprits---then type a backslash followed by 

a space after the period, as in this sentence. 

Remember, don't type the 10 special characters (such as dollar sign and 

backslash) except as directed! The following seven are printed by 

typing a backslash in front of them: \$ \& \# \% \_ \{ and \}. 

The manual tells how to make other symbols. 

\end{document} % THE INPUT FILE ENDS WITH THIS COMMAND. 

Once you have created a LaTeX source file it must be processed by LaTeX before it 
can be printed out. The command 

latex myfile.tex 

will produce a number of files including myfile.log, myfile.aux and myfile.dvi. If 
you are using various sorts of cross referencing then you may have to run LaTeX more 
than once. If you want an automated bibliography you will also have to run bibtex. 
When this procedure is complete you will have a file myfile.dvi to print out. This is a 
device independent representation of your document which can be displayed by 
clicking on the icon or using the xdvi program. 

© Cambridge University, Engineering Department, Trumpington Street, Cambridge CB2 1PZ, UK 

Tel: +44 1223 332600, Fax: +44 1223 332662 

Contact: tl136 (with help from jpmg, etc) 

LaTeX for Advanced Users 

Harald Hanche-Olsen 

2005–05–18 

LATEX vk 2005–05–18

LaTeX abuse 

Avoid explicit layout in the text! 

For example, frequent use of \\, \\[4mm], explicit \vspace and \hspace, etc. 

Better: global definitions and declarations, environments. 

Keep form and content separate! (As far as you can.) 

Feel free to use \smallskip, \medskip, \bigskip for explicit vertical space, and \enspace, 
\quad and \qquad for horizontal space. 

Do not use $$...$$. Use \[...\] instead. 

(You get more correct spacing around the formulas, among other things.) 

– But $...$ is OK, and is recommended over \(...\). 

Avoid {\em ...} and {\it ...}. Use \emph{...} and \textit{...} instead. 

Compare the two typeset results for the sample phrase «vold i hjemmet». 

Many more: read l2tabuen! (texdoc l2tabuen.) 
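As a minimal sketch of the advice above (the sample text and the formula are made up for illustration), the following document uses a named skip, \[...\] and \emph in place of explicit layout commands:

```latex
\documentclass[a4paper]{article}
\begin{document}
An \emph{emphasised} word, and \textit{explicitly italic} text.

\medskip % a named vertical skip rather than \\[4mm] or \vspace{4mm}

An inline formula $a^2+b^2=c^2$ and a displayed one:
\[
  a^2 + b^2 = c^2 % \[...\] rather than $$...$$
\]
\end{document}
```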


Document declarations 

Put all options that might conceivably be used by several packages in the class declaration. 
Example: norsk, a4paper, draft. 

But some packages need private options. Example: fontenc, inputenc, geometry. 

Packages that are not given private options can be listed in the same \usepackage. 

\documentclass[a4paper,12pt,norsk]{article} 

\usepackage[latin1]{inputenc} 

\usepackage[hscale=0.7,vscale=0.85,heightrounded]{geometry} 

\usepackage{babel,amsmath,graphicx} 

A well-structured document will now continue with metadata such as \author, \title, etc., followed by 
private definitions of commands and environments, etc. 

If you have many, it can be wise to write your own packagename.sty and include it with 
\usepackage{packagename}. 
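A private package file might look like the sketch below. The package name mydefs and the macro \abs are hypothetical, chosen only for illustration; the filecontents environment is used here just so that the example is one self-contained file, whereas in practice mydefs.sty would be a separate file.

```latex
% Writes the hypothetical package file mydefs.sty next to the document.
\begin{filecontents}{mydefs.sty}
\NeedsTeXFormat{LaTeX2e}
\ProvidesPackage{mydefs}[2005/05/18 private definitions (illustrative)]
\RequirePackage{amsmath}
\newcommand{\abs}[1]{\lvert#1\rvert} % hypothetical helper macro
\end{filecontents}

\documentclass[a4paper,12pt]{article}
\usepackage{mydefs} % loads the private definitions
\title{A sample title}
\author{A. N. Author}
\begin{document}
\maketitle
The absolute value $\abs{x}$ comes from the private package.
\end{document}
```

Keeping the definitions in one .sty file means several documents can share them with a single \usepackage line.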


Page layout 

As a general rule, let the document class decide the layout. Specify the paper size: 

\documentclass[a4paper,...]{class} 

Avoid packages such as a4, a4wide, etc.; they exist in many variants, so you never know what you will get. 

The geometry package gives you good control. Example: 

\usepackage[hscale=0.7,vscale=0.85,heightrounded]{geometry} 

makes the text fill 70% of the page width and 85% of the page height. 

The heightrounded option rounds the text height to a whole number of lines (\topskip plus n − 1 
times \baselineskip for n lines). 

The package has many other options and is well documented. 

Beware! If the text lines get long, the line spacing should be increased somewhat; otherwise the text 
becomes heavy to read. 

Another option is to use alternative document classes. There are many: I have not tried the so-called 
«KOMA-script bundle», nor the memoir class. 

Personally, I like to set text on A5 paper and then generate a PDF with two A5 sheets per A4 page. 
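The heightrounded rule can be written out as a short worked calculation. The numbers below are assumptions for illustration: A4 paper (height 297 mm, about 845.05 pt) and the 10pt article class, where \topskip is 10 pt and \baselineskip is 12 pt. Here H stands for \textheight, t for \topskip and b for \baselineskip; the exact rounding done by geometry may differ in detail.

```latex
\begin{align*}
  H &= t + (n-1)\,b, \\
  0.85 \times 845.05\,\mathrm{pt} &\approx 718.3\,\mathrm{pt}, \\
  n - 1 &= \operatorname{round}\!\left(\frac{718.3 - 10}{12}\right) = 59, \\
  H &= 10\,\mathrm{pt} + 59 \times 12\,\mathrm{pt} = 718\,\mathrm{pt}
  \quad (n = 60 \text{ lines}).
\end{align*}
```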


[Sample text: two paragraphs of lorem ipsum set in a narrow column] 

Paragraph layout 

The paragraph layout in the sample text is the usual one: indentation is suppressed in the first 
paragraph, and otherwise every paragraph is indented, with no space between paragraphs. 

\parindent is a length giving the normal paragraph indentation. 

\parskip is a vertical length inserted before each new paragraph. 

Setting these variables yourself is normally not recommended! But \usepackage{parskip} handles 
the worst side effects of adjusting these variables, and sets \parindent to zero and 
\parskip to 0.5\baselineskip plus 2 pt of stretchability. 

After loading the package you can adjust further if you wish. 

(We also have \leftskip, \rightskip and \parfillskip.) 
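A minimal sketch of the parskip approach (the sample text is made up): loading the package is all that is needed to switch from indented paragraphs to paragraphs separated by vertical space.

```latex
\documentclass[a4paper]{article}
\usepackage{parskip} % zero \parindent and a stretchable \parskip
\begin{document}
First paragraph of sample text.

Second paragraph: separated from the first by vertical space, not indented.
\end{document}
```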


Problem: Overfull and underfull boxes 

[Figure: the letters g h i J] 

A letter is a box. 

All TeX does is stack boxes next to each other (letters in lines) and on top of each other (lines in 
paragraphs). 

Between the boxes there can be stretchable and shrinkable space («glue»): 

– between the words in a paragraph 

– between paragraphs (sometimes) 

– around figures and displayed formulas 

This is an extremely underfull hbox. 

Whereas this box is overfull, because it contains much more text than there is room to squeeze into 

The line above is the natural width here (\textwidth=\hsize). 
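The two situations can be reproduced with explicit boxes. This is a sketch with made-up text and an arbitrary 3 cm width; running it should put one underfull and one overfull \hbox warning in the log file.

```latex
\documentclass{article}
\begin{document}
% Too little material: the interword glue cannot stretch to \textwidth.
\noindent\hbox to \textwidth{This is an extremely underfull hbox.}

% Too much material: the interword glue cannot shrink to 3cm.
\noindent\hbox to 3cm{This box is overfull because it holds more text than fits.}
\end{document}
```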





[Sample text: lorem ipsum set in a narrow column] 

Line and page breaking 

TeX checks all possible choices of line breaks, computes a badness for each of them, and chooses 
the set of line breaks that gives the least total badness for the paragraph as a whole. 

(Dijkstra's algorithm for the shortest path in a graph.) 

Badness for a line: $100 \cdot |\text{stretch or shrink}/\text{allowed}|^3$, so 
$100^{1/3} \approx 4.6$ times the allowed stretch in a single line is infinitely bad. 

(But shrinking by more than 100% is also counted as infinitely bad.) 

In addition to badness there are penalties for other things that spoil the appearance, such as 
hyphenated words (\hyphenpenalty). 

Page breaking is done by similar algorithms, but here the algorithm is «greedy» rather than global: 
TeX computes each page optimally and ships it out, without regard to possible consequences for 
the next page. 

The page-breaking algorithm is complicated a great deal by footnotes and floats. 
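The badness rule can be written as a formula with a few worked values. The symbols b and r are introduced here for illustration, and TeX itself evaluates the expression only approximately; the cap at 10000 is TeX's value for «infinitely bad».

```latex
\[
  b = \min\bigl(10000,\; 100\,r^{3}\bigr),
  \qquad
  r = \frac{\text{stretch (or shrink) used}}{\text{stretch (or shrink) available}}
\]
\[
  r = 1 \Rightarrow b = 100, \qquad
  r = 2 \Rightarrow b = 800, \qquad
  r = 100^{1/3} \approx 4.64 \Rightarrow b = 10000 .
\]
```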





[Sample text: lorem ipsum set in a narrow column] 

Line breaking 

TeX first tries to set the paragraph without hyphenating any words. If that does not give a good 
enough result, it tries again, with hyphenation. For each hyphenation, badness is increased by 
\hyphenpenalty. (I have set \hyphenpenalty=10000 in the sample text.) 

\pretolerance: Threshold for «good enough», without hyphenation. Default value 100. 

\tolerance: Threshold for «good enough», with hyphenation. Default value 200. 

\emergencystretch: Extra stretchability per line. Used only if the parameter is positive and 
setting with hyphenation did not give a result better than \tolerance. 




[Sample text: lorem ipsum set in a narrow column] 

Line breaking 

Here we still have \hyphenpenalty=10000, but also \emergencystretch=1em. 

The result is not good, and \emergencystretch really should be used only in an emergency. 

Recommendations (see l2tabuen): 

\pretolerance=1414 

\tolerance=1414 

\hbadness=1414 

\hfuzz=0.3pt 

\widowpenalty=10000 

\vfuzz=\hfuzz 

\raggedbottom (but preferably not?) 

If you still get underfull and overfull boxes, investigate! Prefer rewriting the text to get rid of 
the problem. (Perhaps TeX just needs help hyphenating a long word?) 

Do not use \emergencystretch globally. As a last resort, end a paragraph with 
{\emergencystretch 1em\par}. 
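Put together, the recommendations might look like this in a document. This is a sketch: the sample text is made up, and whether these values suit a given document is a judgment call.

```latex
\documentclass[a4paper]{article}
% Recommended line-breaking parameters, set once in the preamble.
\pretolerance=1414
\tolerance=1414
\hbadness=1414
\hfuzz=0.3pt
\vfuzz=\hfuzz
\widowpenalty=10000
\begin{document}
Ordinary paragraphs are broken with the settings above.

A stubborn paragraph that cannot be reworded gets extra stretch for this
one paragraph only, because the group ends together with the paragraph.
{\emergencystretch=1em\par}
\end{document}
```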


Line breaking: Help with hyphenation 

You can explicitly declare, once and for all, how a given word is to be hyphenated: 

\hyphenation{saue-øye-eier over-retts-sak-fører} 

With \usepackage[norsk]{babel} you can also mark the boundaries between the parts of a compound 
word in the text, like this: 

over"-buljong"-terning"-pakk"-mester"-assistent. 

The advantage is that TeX can also break this as overbul-jongterningpakkmesterassistent if that is 
otherwise permitted by the hyphenation patterns in use. (The standard mechanism \- suppresses 
hyphenation elsewhere in the word.) 

Norwegian babel has more tricks up its sleeve: 

o"ppasser becomes oppasser or opp-passer. (This works for other consonants too.) 

hoff"|intriger can be broken as hoff-intriger, but otherwise becomes hoffintriger (compare with 
hoffintriger, set with the ffi ligature). 

You can write tabloid"=journalistikk to get tabloid-journalistikk, always with a hyphen, while 
allowing hyphenation elsewhere as well. 

And i"~går becomes igår, or can be broken without a hyphen after the i. 

You can use "< and "> instead of « and » in case you cannot find the latter on your keyboard. 
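The declaration mechanism does not depend on babel, so it can be tried with any language. In this sketch the English words are hypothetical examples, and \showhyphens writes the break points TeX would use to the log file.

```latex
\documentclass{article}
% List each permitted break point with a hyphen (example words only).
\hyphenation{type-set-ting data-base}
\begin{document}
% Reports the hyphenation of these words in the log file.
\showhyphens{typesetting database}
A break point can also be given in place: type\-setting.
\end{document}
```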


Debugging 

LaTeX is implemented as macros in TeX: this complicates debugging because LaTeX operates at a 
much higher level of abstraction than TeX. 

\errorcontextlines=99 gives you more context. There can be many nested macros in the process of 
being expanded, and they will now all be shown, with the input line that caused the error at the 
bottom. If you look too far up the list you will get lost in LaTeX's internal routines, but the 
bottom two or three levels can often give a hint about where the error lies. 

Look for missing braces and other syntax errors near where the error occurred. 

When all else fails: binary search! 

\iffalse 

suspect code 

\fi 

As soon as you have isolated the error, narrow the search by halving the search region. 

Note! You must watch out for environments! 

Matching \begin/\end pairs must be either both inside or both outside \iffalse...\fi. 
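A sketch of one step of such a binary search (the document content is made up): the second half is disabled as a whole, and the environment inside it is kept complete.

```latex
\documentclass{article}
\errorcontextlines=99 % show the full macro context on errors
\begin{document}
First half of the document, known to compile.

\iffalse
Second half, temporarily disabled while hunting the error.
\begin{itemize}
\item The whole environment sits inside \iffalse...\fi.
\end{itemize}
\fi
\end{document}
```

If the error disappears, it lies in the disabled half; move the \iffalse and \fi lines to halve that region again.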


Footnote issues 

Remember: \footnote{text} is essentially the same as \footnotemark followed by 
\footnotetext{text}. 

\footnotemark updates the footnote counter and places a mark in the text, while \footnotetext 
adds text to the list of footnotes to go on the page. 

Because of TeX's asynchronous nature the two operations often have to be separated, for example if 
the footnote mark is to go inside a box of some kind. 

It is worse if the footnote is to go in a float, for example in a table. It is beyond LaTeX's 
reach to make a footnote mark in a float and get the footnote on the same page. 

Solution: 

\begin{table} 

\begin{minipage}{\textwidth} 

... \footnote{A footnote} ... 

... \footnote{Another footnote} ... 

\end{minipage} 

\caption{Table with footnotes in it.} 

\end{table} 
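The split into \footnotemark and \footnotetext for a mark inside a box can be sketched as follows; \fbox is used here only as an example of «a box of some kind», and the text is made up.

```latex
\documentclass{article}
\begin{document}
% The mark goes inside the box; the text is supplied after the box.
\fbox{Boxed text\footnotemark} continues here.%
\footnotetext{The footnote text, placed outside the box.}
\end{document}
```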


Numerology 

. . . or how to handle counters. 

Basic theory: 

Numbered objects have counters with the same name as the object itself: chapter, section, figure, 
equation, and so on. 

Associated with each counter is a command \thecounter (where counter is the counter's name) that 
prints the current value of the counter. There is nothing to stop that command from using other 
counters. 

For example, if you want the figures in chapter 3 to be numbered 3.1, 3.2, 3.3 and so on: 

\renewcommand{\thefigure}{\thechapter.\arabic{figure}} 

But this is not enough: we also need to make sure that the figure counter is reset to zero every 
time we start a new chapter, that is, when the chapter counter is incremented. The author of the 
document class we use could have arranged this with \newcounter{figure}[chapter], but if that has 
not been done we can do it ourselves: 

\@addtoreset{figure}{chapter} 

(Watch out for the @ sign!) 

If you load the remreset package you can do the opposite, namely 
\@removefromreset{figure}{chapter}, in case the author of the document class has arranged an 
automatic reset of counters that you do not want. 
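A runnable sketch of the same technique, with one substitution: it numbers figures within sections rather than chapters, because the article class used here has no chapters. The figure contents are placeholders.

```latex
\documentclass{article}
\renewcommand{\thefigure}{\thesection.\arabic{figure}}
\makeatletter % make @ a letter so that \@addtoreset can be typed
\@addtoreset{figure}{section}
\makeatother
\begin{document}
\section{First}
\begin{figure}[h]\centering (placeholder)\caption{Numbered 1.1}\end{figure}
\section{Second}
\begin{figure}[h]\centering (placeholder)\caption{Numbered 2.1}\end{figure}
\end{document}
```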


Numerology 

Sometimes one wants subfigures to be numbered as figure 2a, 2b, 2c, etc. There are a couple of 
solutions for this: 

The simplest is \usepackage{subfloat}, with the environments subfigures and subfloats. 

Alternatively \subfiguresbegin . . . \subfiguresend, which need not be properly nested with respect 
to other environments! (Also \subtablesbegin . . . \subtablesend.) 

Another alternative is \usepackage{subfig}, which confusingly defines a command \subfloat. It 
takes care not only of the numbering, but also of placement and even variations on the captions 
(because it also imports the caption package). I have not tested it. 

See the LaTeX Companion. 


Mathematics 

\usepackage{amsmath} is (should be) mandatory for everyone who writes anything mathematical. 

Documentation: read amsldoc (texdoc amsldoc). 

Avoid eqnarray; use align instead. Or gather to collect equations without mutual alignment. 

With eqnarray* (to be avoided): 

\begin{eqnarray*} 
x&=&a+b\\ 
y&=&a-b\\ 
z&=&\xi+\eta\\ 
&&+\zeta-\omega 
\end{eqnarray*} 

With align*: 

\begin{align*} 
x&=a+b\\ 
y&=a-b\\ 
z&=\xi+\eta\\ 
&\quad+\zeta-\omega 
\end{align*} 

With multline*: 

\begin{multline*} 
f(x)=f(0)+f'(0)x+\frac{1}{2}f''(0)x^2\\ 
+\frac{1}{6}f'''(0)x^3 
+\dotsb+\frac{1}{n!}f^{(n)}(\xi)x^n 
\end{multline*} 


Mathematics 

With amsmath you can make your own operators: after \DeclareMathOperator{\sgn}{sgn} you can 
write $\sgn\sigma$ and get an upright, properly spaced "sgn σ", rather than writing $sgn \sigma$ 
and getting the letters s, g, n set as italic variables. 

In a display you can use \quad to separate parallel parts, \qquad to separate a formula from a 
condition, and \text{...} to insert text: 

\[ 

x_{n+1}=x_n+y_n,\quad y_{n+1}=x_n,\qquad\text{for } n=1,2,\dotsc 

\] 

gives 

x_{n+1} = x_n + y_n,  y_{n+1} = x_n,   for n = 1, 2, ... 
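For completeness, a self-contained document that combines the operator declaration and the display from this page (a sketch; only the surrounding sentences are added):

```latex
\documentclass{article}
\usepackage{amsmath}
\DeclareMathOperator{\sgn}{sgn} % upright operator name with operator spacing
\begin{document}
With the operator: $\sgn\sigma$. Without it: $sgn \sigma$.
\[
  x_{n+1}=x_n+y_n,\quad y_{n+1}=x_n,\qquad\text{for } n=1,2,\dotsc
\]
\end{document}
```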


Mathematics 

Good to know: TeX works with eight different types of so-called atoms in mathematics: Ordinary 
(a, b, α, etc.), (large) Operator (\sum, \int, etc.), Binary operation (+, −, ×, etc.), Relation 
(=, ≈, ≤, etc.), Open (left parentheses), Close (right parentheses), Punctuation (comma, 
semicolon), Inner. 

Any part of a formula can be made into an Ord by enclosing it in {...}. 

There are also commands \mathbin, \mathrel, \mathopen, \mathclose, \mathpunct and 
\mathinner that force the following atom into another class. 

The spacing varies between these different types. Compare, for example, a = b ($a=b$) 
with a=b ($a{=}b$). 

(And compare the last line of the eqnarray* and align* examples shown earlier.) 

Decimal comma? Compare 3, 14 ($3,14$) and 3,14 ($3{,}14$). 

Easier handling of the decimal comma: \usepackage{icomma}. The comma then becomes an Ordinary 
atom in math mode, unless you type a space after it. 
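The comparisons can be tried out with a small document. The last line is an added illustration, not from the slides: it uses \mathrel to give an arbitrary symbol relation spacing.

```latex
\documentclass{article}
\begin{document}
Relation spacing: $a=b$; as an ordinary atom: $a{=}b$.

Punctuation spacing: $3,14$; decimal comma as an ordinary atom: $3{,}14$.

Forcing a class: $a \mathrel{\#} b$ sets the symbol with relation spacing.
\end{document}
```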


pdfLaTeX 

Standard TeX/LaTeX: file.tex → file.dvi → file.ps → file.pdf 

by means of dvips, ps2pdf or similar. Alternatively, directly from dvi to pdf with dvipdf. 

With pdfTeX/pdfLaTeX: file.tex → file.pdf in one operation! 

– PDF is increasingly becoming the universal language for page description. 

– PostScript is primarily for printers. 

– Print shops want PDF, not PS. 

But watch your fonts. 

– TeX itself only needs to know the font metrics, described in *.tfm (and *.vf). 

– Traditional TeX systems use bitmapped fonts (*pk). 

– But most fonts now also exist as PostScript Type 1 (*.pfb), or as TrueType (*.ttf). 

Bitmapped fonts become unreadable on screen. Make sure you have the fonts available in a vector 
format. 

Modern TeX systems now have the classic CM fonts as Type 1. 

The EC fonts (\usepackage[T1]{fontenc}) exist as Type 1, in the very comprehensive cm-super. 

But Latin Modern (\usepackage{lmodern}) is preferable. 

This talk uses Utopia and Fourier (\usepackage{fourierx}). 


Typefaces and fonts

Low level: a font is specified by the following attributes:

– Encoding: OT1 (old 7-bit), T1 (modern 8-bit)
  Low level: \fontencoding{encoding}
– Font family: cmr, cmss, cmtt, others
  Low level: \fontfamily{family}
– Series (weight and width in one): m (medium), bx (bold extended)
  Low level: \fontseries{series}
– Shape: n (normal), it (italic)
  Low level: \fontshape{shape}
– Size: design size
  Low level: \fontsize{font size}{baselineskip}

Note that changing an attribute does not by itself select a new font: set all the attributes you want to change, then follow with \selectfont.

A handy shorthand for setting the first four attributes:

\usefont{encoding}{family}{series}{shape}

This one performs \selectfont by itself afterwards, so you don't have to. If needed, run \fontsize first.

Handy values to use: \encodingdefault, \familydefault, \seriesdefault, \shapedefault.

See also \DeclareFixedFont.


Typefaces and fonts

High level:

The high-level commands change one or more attributes and perform \selectfont, so you don't have to. There is no high-level command for changing the font encoding.

– Family:
  \textrm{...} or {\rmfamily ...}
  \textsf{...} or {\sffamily ...}
  \texttt{...} or {\ttfamily ...}
– Series:
  \textmd{...} or {\mdseries ...}
  \textbf{...} or {\bfseries ...}
– Shape:
  \textup{...} or {\upshape ...}
  \textit{...} or {\itshape ...}
  \textsl{...} or {\slshape ...}
  \textsc{...} or {\scshape ...}
  \emph{...} usually means \textit or \textup, depending on the context.
– Size: \tiny, \scriptsize, \footnotesize, \small, \normalsize, \large, \Large, \LARGE, \huge, \Huge.

What the sizes and families mean in practice depends on the class files and packages in use.


Some popular font choices

Times:

\usepackage{mathptmx}
\usepackage[scaled=.90]{helvet}
\usepackage{courier}

Palatino:

\usepackage{mathpazo}
\usepackage[scaled=.95]{helvet}
\usepackage{courier}

Fourier and Utopia:

\usepackage{fourierx}

This may require a little homework: fetch the fourier package from CTAN, and the improvements to it (including fourierx.sty) from http://home2.vr-web.de/~was/putx.html.

Latin Modern:

\usepackage{lmodern}

Recommended as the default choice over the CM or EC fonts.


But what about the graphics?

– LaTeX together with dvips: EPS only.
– pdfLaTeX: JPEG, PNG, PDF.
– Convert EPS to PDF with epstopdf (on Unix).

The packages epsfig, psfig, etc. are obsolete. Use instead:

\usepackage{graphicx}
\includegraphics[options]{filename}

Do not include the extension in the filename. Ordinary LaTeX will try the extensions .eps and .ps. pdfLaTeX tries .png, .pdf, .jpg. That way the same input file works equally well with pdfLaTeX and ordinary LaTeX.

\includegraphics[width=0.4\textwidth]{filename} gives a figure that is 40% of the text width.

\includegraphics[height=50mm]{filename} gives a figure as tall as a matchbox is long.

\includegraphics has many other options. See grfguide (texdoc grfguide) for the details.


Floats

Figures and tables (the figure and table environments) are called floats because they float to wherever it suits LaTeX to place them.

This is very useful, but it also causes a lot of headaches!

But first, a little about the contents of a float:

Typesetting text inside a float is no different from typesetting text anywhere else: TeX starts in vertical mode, with an empty list to put things in, and a text width equal to that of the surroundings. For example:

\begin{figure}
\centering
\includegraphics[width=0.7\textwidth]{bilde}
\smallskip
\caption{This is a beautiful picture.}
\end{figure}

The only difference between figure and table is what the \caption command does inside them: it uses either the figure counter or the table counter, and starts the text with "Figure x" or "Table y".

And while I remember: you can use \caption several times inside the same float.

But a figure and a table in the same float is unfortunately not possible.


Figures side by side

Figure 1: A bumblebee.    Figure 2: A flock of geese.

\begin{figure}[ht]
\makebox[\textwidth][s]{\hfil
\parbox[t]{0.3\textwidth}{\centering\includegraphics[width=\hsize]{humle}
\caption{A bumblebee.}}\hfil
\parbox[t]{0.45\textwidth}{\centering\includegraphics[width=\hsize]{gjess}
\caption{A flock of geese.}}\hfil}
\end{figure}

(Use \centering rather than the center environment. The latter adds vertical space.)


Tables

How to align on the decimal mark.

1/2    1,5
 π     3,14159
10e   27,1828

\usepackage{dcolumn}
...
\begin{tabular}{cD{.}{,}{5}}
$1/2$ & 1.5 \\
$\pi$ & 3.14159 \\
$10e$ & 27.1828
\end{tabular}


Tables

We can also turn tables sideways!

1/2    1,5
 π     3,14159
10e   27,1828

\usepackage{rotating}
...
\begin{sideways}
\begin{tabular}{cD{.}{,}{5}}
$1/2$ & 1.5 \\
$\pi$ & 3.14159 \\
$10e$ & 27.1828
\end{tabular}
\end{sideways}

The rotating package also provides the environments sidewaystable and sidewaysfigure. (I ran into problems in my attempts with sidewaystable, and have not had time to investigate further.)


Placement of floats

\begin{figure}[hptb] (the default is [ptb])

[h] Is there room here? If so, place it here; otherwise it has to float.

[t] The figure may float to the top of a page.

[b] The figure may float to the bottom of a page.

[p] The figure may float to a page reserved for floats.

If LaTeX cannot place a figure, it floats to the end of the document. You may also get the dreaded error message Too many unprocessed floats.


Placement of floats

LaTeX has three counters (set with \setcounter) that control float placement:

topnumber (default: 2) Maximum number of floats at the top of a text page.

bottomnumber (default: 1) Maximum number of floats at the bottom of a text page.

totalnumber (default: 3) Maximum number of floats on a text page.

LaTeX has four commands (changed with \renewcommand) that control float placement:

\topfraction (default: 0.7) Maximum fraction of a text page that may be used for floats at the top.

\bottomfraction (default: 0.3) Maximum fraction of a text page that may be used for floats at the bottom.

\textfraction (default: 0.2) Minimum fraction of a text page that must be text.

\floatpagefraction (default: 0.5) Minimum fill level for a dedicated float page.


Several documents into one

Problem: four articles and two reports, plus an introduction, are to become a doctoral thesis.

Solution: there are several options.

Combine PDF files with pdfLaTeX: use the pdfpages package to import single pages or entire PDF documents. (texdoc pdfpages gives a very detailed explanation.)

It can help to give the individual documents as similar a layout as possible first.

Combine LaTeX sources: \documentclass[...]{combine}

Here all the documents must be similar enough that it is feasible to collect all special commands and environments in one place.

I have not tried combine.cls, and it is not part of the standard distribution. Find it on CTAN, together with whatever documentation exists.


Several documents into one

Combine PDF files with plain pdfTeX:

\input pdf-1up
\includepdf{file-1}
\includepdf{file-2}
...
\bye

– where pdf-1up.tex is the file

\pdfhorigin=0pt 

\pdfvorigin=0pt 

\countdef \fileno=1 

\def\includepdf#1{ 

\pageno 0 

\advance \fileno 1 

\loop 

\advance\pageno 1 

\setbox0\vbox{\pdfximage page \pageno{#1.pdf}\pdfrefximage\pdflastximage} 

\shipout\box0 

\ifnum\pageno

Index

An index is easy to make: if you want the word subspace to appear in the index, with a reference to this page, you simply write \index{subspace} in the text.

In addition, put the command \makeindex in the preamble.

LaTeX will now build a file filename.idx in which all the index entries appear in the order they occur in the document.

Then you run makeindex filename, and you have an alphabetically sorted file filename.ind.

Finally, put \input{\jobname.ind} at the end of the document, where the index should appear.

You can do much more with this. The makeindex program is well documented.


Modifying classes

Summary: Don't do it.

Bad idea: copy, for example, article.cls and edit it.

Smart idea: write your own class file that loads article.cls and changes selected definitions in it.

%% This is artikkel.cls
\ProvidesClass{artikkel}[2005/05/18 Class file for my articles.]
\DeclareOption{lur}{... do something clever ...}
\DeclareOption*{\PassOptionsToClass{\CurrentOption}{article}}
\PassOptionsToClass{twoside}{article}
\ProcessOptions
\LoadClass{article}
\RequirePackage{amsmath}
\RequirePackage{graphicx}

After this, you follow up with your own redefinitions of the things you don't like in article.cls. It is fine to cut and paste from the original in order to modify them, but only to a limited extent; otherwise the risk of future incompatibility becomes too great.


Modifying packages

Summary: Don't do it.

Bad idea: copy, for example, icomma.sty and edit it.

Smart idea: write your own package file that loads icomma.sty and changes selected definitions in it.

%% This is ikomma.sty
\ProvidesPackage{ikomma}[2005/05/18 Smarter than icomma.]
\DeclareOption{lur}{... do something clever ...}
\DeclareOption*{\PassOptionsToPackage{\CurrentOption}{icomma}}
\ProcessOptions
\RequirePackage{icomma}

After this, you follow up with your own redefinitions of the things you don't like in icomma.sty.

(This is a somewhat poor example, because icomma.sty is so short that there is hardly anything to modify.)


Information sources

Books:

– L. Lamport: LaTeX: A Document Preparation System
– F. Mittelbach, M. Goossens et al.: The LaTeX Companion, second edition
– D. E. Knuth: The TeXbook
– V. Eijkhout: TeX by Topic http://www.eijkhout.net/tbt/

(The last two are mostly for those who really want to dig deep into the subject.)

In addition, a lot of documentation comes with teTeX (Unix) and MiKTeX (Windows); read it with texdoc if you know the filename of the documentation file. In particular: l2tabuen, grfguide, amsldoc. Many packages (fortunately) have documentation with the same name as the package. The documentation for babel, however, is called user!

On the web:

– TeX Users Group (TUG): http://www.tug.org/
– Comprehensive TeX Archive Network (CTAN): http://ctan.unik.no/
– Frequently Asked Questions (FAQ): http://www.tex.ac.uk/cgi-bin/texfaq2html?introduction=yes

And don't forget: for almost every problem, someone has written a package.


Generating high-quality portable PDF files 

The usual way to compile a TeX source file is to generate a .dvi file with the tex or 

latex command and then convert it into a PostScript file with dvips. If a PDF file is 

required, it can be generated from the PostScript by ps2pdf. This can be problematic in 

two respects: the quality of images may degrade for no apparent reason, and the resulting 

PDF file may not display correctly on other systems. 

A long time ago, I found on someone else's home page a ps2pdf command line which 

prevents both problems. (It seems to be extremely well hidden, as I did not manage to 

find it again. However, I made a note of it.) Here it is: 

ps2pdf -sPAPERSIZE=a4 -dCompatibilityLevel=1.3 \
-dEmbedAllFonts=true -dSubsetFonts=true -dMaxSubsetPct=100 \
-dAutoFilterColorImages=false -dColorImageFilter=/FlateEncode \
-dAutoFilterGrayImages=false -dGrayImageFilter=/FlateEncode \
-dAutoFilterMonoImages=false -dMonoImageFilter=/CCITTFaxEncode \
document.ps document.pdf

I have since learned to understand the options. They are named, but hardly explained in 

the ps2pdf documentation which consists of the file Ps2pdf.htm in the Ghostscript 

documentation directory (use locate Ps2pdf.htm to find it). 

The important thing for the image quality is AutoFilter...Images=false and 

...ImageFilter=/FlateEncode. The first disables the automatic determination by 

Ghostscript of the "best" compression format, which tends to favour /DCTEncode, lossy 

JPEG encoding. The second set of options manually sets the compression method to the 

lossless (de)flate encoding for colour and greyscale images and to CCITT encoding for 

monochrome images. 

The other options are for maximum compatibility of the generated PDF file. 

CompatibilityLevel sets the PDF version. The remaining options concern embedding of 

fonts into the generated PDF. EmbedAllFonts=true is self-explanatory and causes the 

output file to be readable even on systems which lack some of the fonts used. 

SubsetFonts=true together with MaxSubsetPct=100 causes the fonts to be embedded 

only as subsets, no matter how many of their characters are used. This protects you from 

lawsuits if you use copyrighted fonts, as embedding a font in full amounts to an illegal 

copy. Last, the option -sPAPERSIZE=a4 doesn't seem necessary unless you convert from 

some other size; replace a4 by letter if that is the paper size you use. 
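
When the conversion is scripted, it can help to keep the long option list in one place. The following is only a sketch: the function name ps2pdf_command and its parameters are invented for illustration, and the function merely assembles the argument list from the command line above without running anything.

```python
def ps2pdf_command(source, target, paper="a4"):
    """Assemble the ps2pdf argument list from the text (nothing is executed)."""
    options = [
        f"-sPAPERSIZE={paper}",       # replace a4 by letter if that is your paper size
        "-dCompatibilityLevel=1.3",   # PDF version of the output
        "-dEmbedAllFonts=true",       # embed every font ...
        "-dSubsetFonts=true",         # ... but only as subsets
        "-dMaxSubsetPct=100",
    ]
    # Disable automatic filter selection and force the lossless encodings.
    for kind, image_filter in [("Color", "FlateEncode"),
                               ("Gray", "FlateEncode"),
                               ("Mono", "CCITTFaxEncode")]:
        options.append(f"-dAutoFilter{kind}Images=false")
        options.append(f"-d{kind}ImageFilter=/{image_filter}")
    return ["ps2pdf", *options, source, target]


command = ps2pdf_command("document.ps", "document.pdf")
```

Assuming Ghostscript is installed, the resulting list could be handed to subprocess.run(command, check=True).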

An alternative way to arrive at a PDF file, if you do not require a PostScript file, is to use 

pdftex or pdflatex instead of tex or latex. In my experience, pdflatex embeds all 

fonts by default, as subsets, so you are safe on both the compatibility and the copyright 

issue. However, to be able to use pdflatex, you have to convert graphics into PDF 

format (or PNG for pixel graphics). To avoid any loss of quality, this should be done with 

the same ps2pdf command line shown above. The options relating to font embedding 

should not be omitted, as vector graphics can contain text which requires fonts. The paper

size option should be omitted. 

As an aside, the options of ps2pdf above can be required in different contexts as well. That is because ps2pdf is just a script that calls the Ghostscript interpreter (gs) and passes its options to it unchanged. gs can be used for tasks as diverse as concatenating PDF files, with the command line

gs -dBATCH -dNOPAUSE -dSAFER -sDEVICE=pdfwrite -sOUTPUTFILE=output.pdf \
<options> source1.pdf source2.pdf ...

where <options> stands for the options given above. The options conserving 

image quality are especially useful when putting the scanned pages of a document 

together (even the large copier at my office outputs single-page PDF files unless you can 

put a stack of loose pages into its automatic feed). You can use gs with the same 

command line and only one source file to embed fonts into a PDF document without 

regenerating it, provided the fonts are available on the system where you do it. 

Unfortunately the resulting document can be significantly larger, not because of the 

embedded fonts, but because gs is inefficient at re-encoding the images (you can see that 

it is not due to the fonts by trying -dEmbedAllFonts=false). 

You can use the pdffonts command to find out which of the fonts used in a PDF 

document are embedded, and whether they are embedded as subsets.


Using Regular Expressions 

Stephen Ramsay 

Electronic Text Center 

University of Virginia

What are regular expressions? 

If you've ever typed "cp *.html ../" at the UNIX command prompt, or entered "garden?" into a web-based 

search engine, you've already used a simple regular expression. Regular expressions ("regex's" for short) are sets 

of symbols and syntactic elements used to match patterns of text. 

Even these simple examples testify to the power of regular expressions. In the first instance, you've copied all the 

files which end in ".html" (as opposed to copying them one by one); in the second, you've conducted a search not 

only for "garden," but for "garden, gardening, gardens, and gardeners" all at once. 

For a tool with full regex support, metacharacters like "*" and "?" (or "wildcard operators," as they are sometimes 

called) are only the tip of the iceberg. Using a good regex engine and a well-crafted regular expression, one can 

easily search through a text file (or a hundred text files) for words that have the suffix ".html" (but only if 

the word begins with a capital letter and occurs at the beginning of the line), replace the .html suffix with a .sgml 

suffix, and then change all the lower case characters to upper case. With the right tools, this series of regular 

expressions would do just that: 

s/(^[A-Z]{1})([a-z]+)\.html/\1\2.sgml/g 

tr/a-z/A-Z/ 

As you might guess from this example, concision is everything when it comes to crafting regular expressions, and 

while this syntax won't win any beauty prizes, it follows a logical and fairly standardized format which you can learn 

to read and write easily with just a little bit of practice. 
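
The same two steps can be sketched in Python, whose re module uses a very similar syntax. The sample lines are made up for illustration, and str.upper() stands in for the tr/a-z/A-Z/ step; this is one possible rendering, not the only way to do it.

```python
import re

# Hypothetical sample lines; only the first starts with a capitalised word ending in ".html".
lines = ["Index.html is the start page", "see notes.html", "Readme.txt"]

# A capital letter at the start of the line, lowercase letters, then a literal ".html".
pattern = re.compile(r"^([A-Z])([a-z]+)\.html")

# Swap the suffix (like the s/.../.../ step), then upper-case everything (like tr/a-z/A-Z/).
converted = [pattern.sub(r"\1\2.sgml", line).upper() for line in lines]
```

Only the first line has its suffix changed; the second does not begin with a capital letter, so the substitution leaves it alone before it is upper-cased.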

What sort of things can I do with regular expressions? 

Regular expressions figure into all kinds of text-manipulation tasks. Searching and search-and-replace are among 

the more common uses, but regular expressions can also be used to test for certain conditions in a text file or data 

stream. You might use regular expressions, for example, as the basis for a short program that separates incoming 

mail from incoming spam. In this case, the program might use a regular expression to determine whether the name 

of a known spammer appeared in the "From:" line of the email. Email filtering programs, in fact, very often use 

regular expressions for exactly this type of operation. 
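
A minimal sketch of such a filter in Python might look as follows. The sender names and the function is_spam are invented for illustration; a real mail filter would do considerably more.

```python
import re

# Hypothetical list of known spam senders (an assumption for this example).
KNOWN_SPAMMERS = ["cheap-pills.example", "lottery-winner.example"]

# One alternation of the escaped names, searched only on a line starting with "From:".
spam_from = re.compile(
    r"^From:.*(" + "|".join(re.escape(s) for s in KNOWN_SPAMMERS) + r")",
    re.IGNORECASE | re.MULTILINE,
)


def is_spam(message: str) -> bool:
    """Return True if a From: line mentions a known spammer."""
    return spam_from.search(message) is not None
```

Because "." does not match a newline by default, a spammer's name that appears only in the subject or body does not trigger the filter.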

And the drawbacks? 

Regular expressions tend to be easier to write than they are to read. This is less of a problem if you are the only one 

who ever needs to maintain the program (or sed routine, or shell script, or what have you), but if several people need 

to watch over it, the syntax can turn into more of a hindrance than an aid. 

Ordinary macros (in particular, editable macros such as those generated by the major word processors and editors) 

tend not to be as fast, as flexible, as portable, as concise, or as fault-tolerant as regular expressions, but they have 

the advantage of being much more readable; even people with no programming background whatsoever can usually 

make enough sense of a macro script to change it if the need arises. For some jobs, such readability will outweigh 

all other concerns. As with all things in computing, it's largely a question of fitting the tool to the job. 

What do I need in order to use regular expressions? 

Actually, you probably already have everything you need to start using regular expressions to get your work done. 

Regular expressions don't constitute a "language" in the way that C or Perl are languages or a tool in the way that 

sed or grep are tools; instead, regular expressions constitute a syntax which many languages and tools (including 

these) support. 

Several languages, in fact, support regular expressions--Perl, Tcl, Python, awk, and the various shells naturally, but 

also many other popular languages (including C/C++, Java, and Visual Basic) with a little coaxing from libraries and 

whatnot. You don't need to be a programmer, however, to use regular expressions to the fullest. Several editors 

(including Nisus Writer, BBEdit, and every flavor of Emacs and vi you care to mention) and a great many text-manipulation 

tools used in UNIX (including sed and every flavor of grep) support regular expressions. grep, in fact, 

stands for global regular expression print. 

Why are they called "regular expressions?" 

Regular expressions trace back to the work of an American mathematician by the name of Stephen Kleene (one of 

the most influential figures in the development of theoretical computer science) who developed regular expressions 

as a notation for describing what he called "the algebra of regular sets." His work eventually found its way into some 

early efforts with computational search algorithms, and from there to some of the earliest text-manipulation tools on 

the Unix platform (including ed and grep). In the context of computer searches, the "*" is formally known as a 

"Kleene star." 

How do I write a simple search pattern using a regular expression? 

In a regular expression, everything is a generalized pattern. If I type the word "serendipitous" into my editor, I've 

created one instance of the word "serendipitous." If, however, I indicate to my tool (or compiler, or editor, or what 

have you) that I'm now typing a regular expression, I am in effect creating a template that matches all instances of 

the characters "s," "e," "r," "e," "n," "d," "i," "p," "i," "t," "o," "u," and "s" all in a row. The standard way to find 

"serendipitous" (the word) in a file is to use "serendipitous" (the regular expression) with a tool like egrep (or 

extended grep):

$ egrep "serendipitous" foobar >hits 

This line, as you might guess, asks egrep to find instances of the pattern "serendipitous" in the file "foobar" 

and write the results to a file called "hits". 
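
For comparison, here is a rough Python counterpart of that egrep call. The helper name grep and the sample lines are assumptions made for this sketch; it works on a list of strings rather than on a file.

```python
import re


def grep(pattern: str, lines):
    """Return the lines that contain a match for pattern, much as egrep does."""
    regex = re.compile(pattern)
    return [line for line in lines if regex.search(line)]


sample = ["a serendipitous find", "nothing here", "serendipitously so"]
hits = grep("serendipitous", sample)
```

Note that the third line is a hit as well: the pattern matches the characters wherever they occur, not only as a whole word.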

How do I write a simple search-and-replace using regular expressions? 

The process here is quite similar, and the general pattern tends to be the same from tool to tool. Suppose we 

wanted to find all instances of "serendipitous" in the file "foobar" and replace them with the word "fortuitous." 

You might use sed (which stands for stream editor) like so: 

$ sed 's/serendipitous/fortuitous/g' foobar >hits 

In most regular expression "environments," the "s" operator (for "substitute") at the beginning tells the interpreter to 

substitute one pattern for another; "g" (for global) tells it to do so as many times as possible on a line. 
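
In Python the corresponding call is re.sub. The sample text below is made up; the point of the sketch is that re.sub replaces every match by default (like the "g" flag), while the count argument limits the number of replacements.

```python
import re

text = "a serendipitous day, a serendipitous night\nserendipity is left alone"

# Replace every occurrence, as sed does with the "g" flag.
replaced = re.sub("serendipitous", "fortuitous", text)

# Replace only the first occurrence in the whole string.
first_only = re.sub("serendipitous", "fortuitous", text, count=1)
```

One difference worth keeping in mind: sed without "g" replaces the first match on each line, whereas count=1 here applies to the entire string.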

How do I construct complex patterns? 

In the preceding examples, we have been using regular expressions that adhere to the first rule of regular 

expressions: namely, that all alphanumeric characters match themselves. There are other characters, however, that 

match in a more generalized fashion. These are usually referred to as the metacharacters. 

Single-Character Metacharacters 

Some metacharacters match single characters. This includes the following symbols: 

. Matches any one character 

[...] Matches any character listed between the brackets 

[^...] Matches any character except those listed between the brackets 

Suppose we have a number of filenames listed out in a file called "Important.files." We want to "grep out" those 

filenames which follow the pattern "blurfle1", "blurfle2", "blurfle3," and so on, but exclude files of the form 

"1blurfle", "2blurfle", "3blurfle". The following regex would do the trick: 

$ egrep "blurfle." Important.files >blurfles 

The important thing to realize here is that this line will not match merely the string "blurfle." (that is, "blurfle" 

followed by a period). In a regular expression, the dot is a reserved symbol (we'll get to matching periods a little 

further on). 

This is fine if we aren't particular about the character we match (whether it's a "1," a "2," or even a letter, a space, or 

an underscore). Narrowing the field of choices for a single character match, however, requires that we use a 

character class. 

Character classes match any character listed within that class and are separated off using square brackets. So, for 

example, if we wanted to match on "blurfle" but only when it is followed immediately by a number (including 

"blurfle1" but not "blurflez") we would use something like this: 

$ egrep "blurfle[0123456789]" Important.files >blurfles 

The syntax here is exactly as it seems: "Find 'blurfle' followed by a zero, a one, a two, a three, a four, a five, a six, a 

seven, an eight, or a nine." Such classes are usually abbreviated using the range operator ("-"): 

$ egrep "blurfle[0-9]" Important.files >blurfles 

The following regex would find "blurfle" followed by any alphanumeric character (upper or lower case). 

$ egrep "blurfle[0-9A-Za-z]" Important.files >blurfles 

(Notice that we didn't write blurfle[0-9 A-Z a-z] for that last one. The spaces might make it easier to read, but 

we'd be matching on anything between zero and nine, anything between a and z, anything between A and Z, or a 

space.) 

A caret at the beginning of the character class negates that class. In other words, if you wanted to find all instances 

of blurfle except those which end in a number, you'd use the following: 

$ egrep "blurfle[^0-9]" Important.files >blurfles 

Many regex implementations have "macros" for various character classes. In Perl, for example, \d matches any digit 

([0-9]) and \w matches any "word character" ([a-zA-Z0-9_]). Grep uses a slightly different notation for the same 

thing: [[:digit:]] for digits and [[:alnum:]] for alphanumeric characters (the class name, such as [:digit:], goes inside a bracket expression). The man page (or other documentation) 

for the particular tool should list all the regex macros available for that tool. 
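
The character classes above can be tried out in Python, which supports the dot, bracket classes, negation, and the \d shorthand, but not the POSIX [[:digit:]] notation. The list of names and the helper matching are invented for this sketch.

```python
import re

# Hypothetical file names to test against.
names = ["blurfle1", "blurflez", "blurfle_", "1blurfle", "blurfle"]


def matching(pattern):
    """Return the names in which the pattern is found."""
    return [n for n in names if re.search(pattern, n)]


any_char = matching(r"blurfle.")           # any one character after "blurfle"
digit = matching(r"blurfle[0-9]")          # a digit after "blurfle"
alnum = matching(r"blurfle[0-9A-Za-z]")    # a letter or digit after "blurfle"
not_digit = matching(r"blurfle[^0-9]")     # some character that is not a digit
shorthand = matching(r"blurfle\d")         # \d as a shorthand for a digit
```

Neither "1blurfle" nor plain "blurfle" appears in any result, because every pattern requires one more character after "blurfle".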

Quantifiers 

The regular expression syntax also provides metacharacters which specify the number of times a particular 

character should match.

? Matches the preceding element zero or one times 

* Matches the preceding element zero or more times 

+ Matches the preceding element one or more times 

{num} Matches the preceding element num times 

{min,max} Matches the preceding element at least min times, but not more than max times 

These metacharacters allow you to match on a single-character pattern, but then continue to match on it until the 

pattern changes. In the last example, we were trying to search for patterns that contain "blurfle" followed by a 

number between zero and nine. The regex we came up with would match on blurfle1, blurfle2, blurfle3, 

etc. If, however, you had a programmer who mistakenly thought that "blurfle" was supposed to be spelled "blurffle," 

our regex wouldn't be able to catch it. We could fix it, though, with a quantifier. 

$ egrep "blur[f]+le[0-9]" Important.files >blurfles 

Here we have "Find 'b', 'l', 'u,' 'r' (in a row) followed by one or more instances of an 'f' followed by 'l' and 'e' and then 

any single digit character between zero and nine." 

There's always more than one way to do it with regular expressions, and in fact, if we use single-character 

metacharacters and quantifiers in conjunction with one another, we can search for almost all the variant spellings of 

"blurfle" ("bllurfle," "bllurrfle", "bbluuuuurrrfffllle", and so on). One way, for example, might employ the ubiquitous (and 

exceedingly powerful) .* combination: 

$ egrep "b.*e" Important.files >blurfles 

If we work this out, we come out with something like: "find a 'b' followed by any character any number of times 

(including zero times) followed by an 'e'." 

It's tempting to use ".*" with abandon. However, bear in mind that the preceding example would match on words like 

"blue" and "baritone" as well as "blurfle." 

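Suppose the filenames in Important.files are numbered up to 12324, but we only care about those numbered with three digits: 

$ egrep "blurfle[0-9]{3}" Important.files >blurfles 

This regex tells egrep to match any digit between zero and nine exactly three times in a row. Because the pattern is not anchored, it also matches the first three digits of a longer number such as blurfle12324; excluding those requires an anchor, described in the next section. Similarly, "blurfle[0-9]{3,5}" matches any digit between zero and nine at least three times but not more than five times in a row. 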
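
The quantifiers can be checked against a few made-up names in Python. The list and the helper matching are assumptions for this sketch; the last pattern uses the "$" anchor from the next section to show how it changes the result.

```python
import re

# Hypothetical names, including one misspelling and one unrelated word.
names = ["blurfle7", "blurffle7", "blurfle123", "blurfle12324", "blue"]


def matching(pattern):
    """Return the names in which the pattern is found."""
    return [n for n in names if re.search(pattern, n)]


one_or_more_f = matching(r"blurf+le[0-9]")        # one or more "f"
dot_star = matching(r"b.*e")                      # "b", anything, "e"
three_digits = matching(r"blurfle[0-9]{3}")       # three digits, unanchored
three_only = matching(r"blurfle[0-9]{3}$")        # exactly three digits at the end
```

The unanchored {3} pattern still finds "blurfle12324", because its first three digits satisfy the match; only the anchored version excludes it.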

Anchors 

Often, you need to specify the position at which a particular pattern occurs. This is often referred to as "anchoring" 

the pattern: 

^ Matches at the start of the line 

$ Matches at the end of the line 

\< Matches at the beginning of a word 

\> Matches at the end of a word 

\b Matches at the beginning or the end of a word 

\B Matches at any position that is not the beginning or end of a word 

"^" and "$" are some of the most useful metacharacters in the regex arsenal--particularly when you need to run a 

search-and-replace on a list of strings. Suppose, for example, that we want to take the "blurfle" files listed in 

Important.files, list them out separately, run a program called "fraggelate" on each one, and then append each 

successive output to a file called "fraggled_files." We could write a full-blown shell script (or Perl script) that would do 

this, but often, the job is faster and easier if we build a very simple shell script with a series of regular expressions. 

We'd begin by grepping the files we want to operate on and writing the output to a file. 

$ egrep "blurfle[0-9]" Important.files >script.sh 

This would give us a list of files in script.sh that looked something like this: 

blurfle1 

blurfle2 

blurfle3 

blurfle4 

. 

. 

.

Now we use sed (or the ":%s" command in vi, or the "query-replace-regexp" command in emacs) to put "fraggelate" in 

front of each filename and ">>fraggled_files" after each filename. This requires two separate search-and-replace 

operations (though not necessarily, as I'll explain when we get to backreferences). With sed, you have the ability to 

put both substitution lines into a file, and then use that file to iterate through another making each substitution in turn. 

In other words, we create a file called "fraggle.sed" which contains the following lines: 

s/^/fraggelate / 

s/$/ >>fraggled_files/ 

Then run the following "sed routine" on script.sh like so: 

$ sed -f fraggle.sed script.sh >script2.sh 

Our script would then look like this: 

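fraggelate blurfle1 >>fraggled_files 

fraggelate blurfle2 >>fraggled_files 

fraggelate blurfle3 >>fraggled_files 

fraggelate blurfle4 >>fraggled_files 

. 

. 

. 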

Chmod it, run it, and you're done. 

Of course, this is a somewhat trivial example ("Why wouldn't you just run "fraggelate blurfle* >>fraggled_files" from 

the command line?"). Still, one can easily imagine instances where the criteria for the file name list are too 

complicated to express using [filename]* on the command line. In fact, you can probably see from this sed-routine 

example that we have the makings of an automatic shell-script generator or file filter. 
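
The two anchored substitutions from fraggle.sed can be sketched in Python as well. The file names are hard-coded here as an assumption; in practice they would come from the grep step.

```python
import re

# Hypothetical result of the earlier grep step.
filenames = ["blurfle1", "blurfle2", "blurfle3"]

script_lines = []
for name in filenames:
    # "^" matches the empty position at the start, so this prepends the command.
    line = re.sub(r"^", "fraggelate ", name)
    # "$" matches the empty position at the end, so this appends the redirection.
    line = re.sub(r"$", " >>fraggled_files", line)
    script_lines.append(line)
```

Neither substitution consumes any characters; both anchors match positions rather than text, which is why nothing in the original name is lost.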

You may also have noticed something odd about that caret in our sed routine. Why doesn't it mean "except" as in 

our previous example? The answer has to do with the sometimes radical difference between what an operator 

means inside the range operator and what it means outside it. The rules change from tool to tool, but generally 

speaking, you should use metacharacters inside range operators with caution. Some tools don't allow them at all, 

and others change the meaning. To pick but one example, most tools would interpret [A-Za-z.] as "Any character 

between A and Z, a and z or a period." 
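To see the two meanings of the caret side by side, here is a small sketch using Python's re module (one tool among many; other tools may behave differently, as noted above). The sample strings are invented for the demonstration.

```python
import re

# Outside brackets, ^ anchors the match to the start of the string.
at_start = re.findall(r"^b", "bob")

# Inside brackets, a leading ^ negates the character class.
negated = re.findall(r"[^b]", "bob")

# Inside brackets, the period loses its "any character" meaning.
in_class = re.findall(r"[A-Za-z.]", "a.1")

print(at_start, negated, in_class)
```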

Most tools provide some way to anchor a match on a word boundary. In some versions of grep, for example, you are 

allowed to write: 

$ grep "fle\>" Important.files >blurfles 

This says: "Find the characters "f", "l", "e", but only when they come at the end of a word." In many other tools, \b
matches any word boundary (whether it's at the beginning or the end of a word) and \B matches any position that isn't a
word boundary. This again can vary considerably from tool to tool. Some tools don't support word boundaries at all,
and others support them using a slightly different syntax. The tools that do support word boundaries generally
consider words to be bounded by spaces or punctuation, and consider numerals to be legitimate parts of words, but
there are some variations on these rules that can affect the accuracy of your matches. The man page or other
documentation should resolve the matter.
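As a rough illustration of \b and \B, the sketch below uses Python's re module, which supports these two anchors but not the \> form shown for grep. The sample text is made up; the lists hold the positions at which each pattern matches.

```python
import re

text = "blurfle fledgling reflex"

# "fle" at the end of a word: only the one in "blurfle".
at_word_end = [m.start() for m in re.finditer(r"fle\b", text)]

# "fle" at the start of a word: only the one in "fledgling".
at_word_start = [m.start() for m in re.finditer(r"\bfle", text)]

# "fle" that does NOT start a word: "blurfle" and "reflex".
not_at_start = [m.start() for m in re.finditer(r"\Bfle", text)]

print(at_word_end, at_word_start, not_at_start)
```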

Escape Characters 

By now, you're probably wondering how you go about searching for one of the special characters (asterisks, periods,
slashes, and so on). The answer lies in the use of the escape character--for most tools, the backslash ("\"). To
reverse the meaning of a special character (in other words, to treat it as a normal character instead of as a
metacharacter), we simply put a backslash before that character. So, we know that a regex like ".*" finds any
character any number of times. But suppose we're searching for ellipses of various lengths and we just want to find
runs of periods. Because the period is normally a special character, we'd need to escape it with a
backslash (and ask for at least one period, since "\.*" on its own also matches zero periods and therefore every line):

$ grep "\.\.*" Important.Files >ellipses.files

Unfortunately, this contributes to the legendary ugliness of regular expressions more than any other element of the
syntax. Add a few escape characters, and a simple sed routine designed to replace a couple of URLs quickly
degenerates into confusion:

sed 's/http:\/\/etext\.lib\.virginia\.edu\//http:\/\/www\.etext\.virginia\.edu/g'

To make matters worse, the list of what needs to be escaped differs from tool to tool. Some tools, for example,
treat "+" as an ordinary plus sign until it is escaped, and only then as a quantifier. If you're having
trouble with a regex (a sed routine that won't parse or a grep pattern that won't match even though you're certain the
pattern exists), try playing around with the escapes. Or better yet, read the man page.
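Some languages can add the backslashes for you. The sketch below uses Python's re.escape as one such helper; it is an illustration in a different tool, not a replacement for the sed routine, and the sample strings are invented.

```python
import re

# An escaped period followed by + finds runs of literal periods.
ellipses = re.findall(r"\.+", "Wait... what. No....")

# An unescaped period matches any character, which is easy to overlook.
too_loose = re.search(r"etext.lib", "etextXlib") is not None

# re.escape inserts the backslashes, so the URL does not have to be escaped by hand.
old_url = "http://etext.lib.virginia.edu/"
pattern = re.compile(re.escape(old_url))
rewritten = pattern.sub("http://www.etext.virginia.edu/",
                        "see http://etext.lib.virginia.edu/index.html")

print(ellipses, too_loose, rewritten)
```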

Alternation 

Alternation refers to the use of the "|" symbol to indicate logical OR. In a previous example, we used "blur[f]+le" to 

catch those instances of "blurfle" that were misspelled with two "f's". Using alternation, we could have written: 

$ egrep "blurfle|blurffle" Important.files >blurfles 

This means simply "Find either blurfle OR blurffle." 

The power of this becomes more evident when we use parentheses to limit the scope of the alternative matches.

Consider the following regex, which accounts for both the American and British spellings of the word "gray": 

$ egrep "gr(a|e)y" Important.files >hazy.shades 

Or perhaps a mail-filtering program that uses the following regex to single out past correspondence between you 

and the boss: 

/(^To:|^From:) (Seaman|Ramsay)/ 

This says, "Find a 'To:' or a 'From:' line followed by a space and then either the word 'Seaman' or the word 'Ramsay'."
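The two alternation patterns can be tried out directly. This is a small sketch in Python, with made-up sample lines, showing how the parentheses limit what each "|" applies to.

```python
import re

spelling = re.compile(r"gr(a|e)y")
header = re.compile(r"(^To:|^From:) (Seaman|Ramsay)")

words = ["gray", "grey", "groy"]
accepted_words = [w for w in words if spelling.fullmatch(w)]

lines = ["To: Seaman", "From: Ramsay", "Cc: Seaman", "From: Smith"]
matching_lines = [line for line in lines if header.search(line)]

print(accepted_words, matching_lines)
```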

This can make your regexes extremely flexible, but be careful! Parentheses are also metacharacters which figure

prominently in the use of . . . 

Backreferences 

Perhaps the most powerful element of the regular expression syntax, backreferences allow you to load the results of 

a matched pattern into a buffer and then reuse it later in the expression. 

In a previous example, we used two separate regular expressions to put something before and after a filename in a 

list of files. I mentioned at that point that it wasn't entirely necessary that we use two lines. This is because 

backreferences allow us to get it down to one line. Here's how: 

s/\(blurfle[0-9]+\)/fraggelate \1 >>fraggled_files/

The key elements in this example are the parentheses and the "\1". Earlier we noted that parentheses can be used 

to limit the scope of a match. They can also be used to save a particular pattern into a temporary buffer. In this 

example, everything in the "search" half of the sed routine (the "blurfle" part) is saved into a buffer. In the "replace" 

half we recall the contents of that buffer back into the string by referring to its buffer number. In this case, buffer "\1". 

So, this sed routine will do precisely what the earlier one did: find each instance of blurfle followed by one or more
digits and replace it with "fraggelate blurfle[some number] >>fraggled_files".
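For comparison, here is the same single-substitution idea as a sketch in Python. Note that Python writes capturing parentheses without backslashes, unlike the sed form; the function name wrap is invented for the example.

```python
import re

def wrap(line):
    # Group 1 captures the filename; \1 in the replacement re-inserts it.
    return re.sub(r"(blurfle[0-9]+)", r"fraggelate \1 >>fraggled_files", line)

print(wrap("blurfle7"))
```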

Backreferences allow for something that very few ordinary search engines can manage; namely, strings of data that
change slightly from instance to instance. Page numbering schemes provide a perfect example of this. Suppose we
had a document that numbered each page with the notation <page n="[some number]" id="[some chapter name]">. The number and the chapter name change from page to page, but the rest of the string stays the same.

We can easily write a regular expression that matches on this string, but what if we wanted to match on it and then
replace everything but the number and the chapter name?

s/<page n="\([0-9]+\)" id="\([A-Za-z]+\)">/Page \1, Chapter \2/

Buffer number one ("\1") holds the first matched sequence, ([0-9]+); buffer number two ("\2") holds the second,
([A-Za-z]+).
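A sketch of a two-buffer substitution in Python follows. The page-marker format used here (a tag with a numeric n attribute and an alphabetic id attribute) is an assumption made for illustration, as are the names PAGE and relabel.

```python
import re

# Assumed marker format, for illustration only: <page n="12" id="Intro">
PAGE = re.compile(r'<page n="([0-9]+)" id="([A-Za-z]+)">')

def relabel(text):
    # \1 is the page number, \2 is the chapter name.
    return PAGE.sub(r"Page \1, Chapter \2", text)

print(relabel('<page n="12" id="Intro">'))
```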

Tools vary in the number of backreferences they can hold. The more common tools (like sed and grep) hold nine, but

Python can hold up to ninety-nine. Perl is limited only by the amount of physical memory (which, for all practical 

purposes, means you can have as many as you want). Perl also lets you assign the buffer number to an ordinary 

scalar variable ($1, $2, etc.) so you can use it later on in the code block. 

Perl and Regular Expressions 

Perl has evolved over the years into a flexible and sophisticated language capable of just about any programming 

task, including such "low-level language jobs" as large-scale application development and graphical user interface

design. Still, there's no denying that it continues to dominate the field in the task for which it was originally designed: 

text manipulation. (Perl, as you may know, stands for "Practical Extraction and Report Language"). Part of the 

reason it's so good at text manipulation comes from the fact that it has the most extensive support for regular 

expressions of any tool out there. 

If you're a programmer who's new to regular expressions, you can probably imagine the advantage of using Perl as
a regex "wrapper." As a full-blown programming language, Perl allows you to embed regular expressions in file tests,
control loops, output formats, and everything else. Even if you're not a programmer, you can still use Perl to
enhance the capability of your regular expressions considerably.

Let me end with a brief code fragment which illustrates how one might use Perl to automate a text-manipulation task. 

This code uncompresses a file specified on the command line, runs a search-and-replace on the file, and then recompresses 

it. 

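#!/usr/bin/perl -w

# File to process, given as the first command-line argument.
$file = $ARGV[0];

system( "uncompress $file" );

open( CURRENTFILE, "$file");
open( OUTFILE, ">outfile" );

# Read the input line by line; each line arrives in $_.
while ( <CURRENTFILE> ) {
    $_ =~ s/ he / she /g;
    print OUTFILE $_;
}

close( CURRENTFILE );
close( OUTFILE );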

This program, like Perl itself, combines the strengths of the shell with the power of regular expressions. The heart of
the program is the while ( <CURRENTFILE> ) loop, which tells the Perl interpreter to iterate through the file
represented by the CURRENTFILE filehandle, making the specified substitution of "she" for "he" on each line.
Outside the loop, we use the system() function to pass a command string to the shell.

A simple example, but one which gains significant utility when we expand the number of shell commands and the 

number of potential files. We might, for example, read an entire directory using readdir(), test for the presence of 

the ".Z" suffix (using a regex, of course), load those files into an array, and then iterate through each file in the array. 
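The suffix test mentioned above is a one-line regex. The sketch below shows it in Python rather than Perl; compressed_files is a hypothetical helper, and in practice the names might come from a directory listing such as os.listdir().

```python
import re

def compressed_files(names):
    """Return only the names that end in the literal suffix .Z"""
    return [name for name in names if re.search(r"\.Z$", name)]

print(compressed_files(["a.txt.Z", "b.txt", "cZ", "d.Z.bak"]))
```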

Perl also allows you to match on a string, save it into a buffer, evaluate the contents of that buffer, and perform a
computation upon it. So for example, you might match on "page n", save the contents of n into a buffer as $1, and
then use an expression like "$newnumber = $1 + 1" to increment the value of the page number by one.
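The same match-then-compute idea can be sketched in Python, where the replacement may be a function that receives each match. The name bump_pages and the sample sentence are invented for the example.

```python
import re

def bump_pages(text):
    # group(1) plays the role of Perl's $1; the function returns the new text.
    return re.sub(r"page ([0-9]+)",
                  lambda m: f"page {int(m.group(1)) + 1}",
                  text)

print(bump_pages("see page 9 and page 41"))
```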

Where can I get more information on regular expressions? 

If you're looking for a book to read, you want Mastering Regular Expressions by Jeffrey E. F. Friedl (published by

O'Reilly & Associates, Inc.). Friedl's book serves both as an extremely detailed tutorial and as an extremely detailed 

reference work on regular expression syntax. Get through this book, and you can consider yourself a serious expert 

on text manipulation in Unix. 

Man pages and other forms of documentation abound for the tools which support regular expressions. The regex
documentation for Perl is included with the distribution and can be found in "perlre.pod," but there are also versions
of the documentation in TeX, HTML, PDF, and ASCII format (visit CPAN, the Comprehensive Perl Archive Network, for
details).

If you're interested in regex libraries, you may want to check out GNU's regex package, available via ftp at 

ftp.gnu.org. 

There are also a number of introductions to and summaries of regular expression syntax on the web. A search for 

"regular expressions" through any of the major web-based search engines should turn up dozens of them. 

Digital Scholarship Services 

University of Virginia Library • PO Box 400148 

Charlottesville VA 22904 

Maintained by: etextcenter@virginia.edu 

Last Modified: Monday, January 17, 2005 

© The Rector and Visitors of the University of Virginia

Regular Expressions - Quick Reference Guide 

Anchors

^     start of line
$     end of line
\b    word boundary
\B    not at word boundary
\A    start of subject
\G    first match in subject
\z    end of subject
\Z    end of subject or before newline at end

Non-printing characters

\a          alarm (BEL, hex 07)
\cx         "control-x"
\e          escape (hex 1B)
\f          formfeed (hex 0C)
\n          newline (hex 0A)
\r          carriage return (hex 0D)
\t          tab (hex 09)
\ddd        octal code ddd
\xhh        hex code hh
\x{hhh..}   hex code hhh..

Generic character types

\d    decimal digit
\D    not a decimal digit
\s    whitespace character
\S    not a whitespace char
\w    "word" character
\W    "non-word" character

POSIX character classes

alnum     letters and digits
alpha     letters
ascii     character codes 0-127
blank     space or tab only
cntrl     control characters
digit     decimal digits
graph     printing chars -space
lower     lower case letters
print     printing chars +space
punct     printing chars -alnum
space     white space
upper     upper case letters
word      "word" characters
xdigit    hexadecimal digits

Literal Characters

a x B 7 0      Letters and digits match exactly
@ - = %        Some special characters match exactly
\. \\ \$ \[    Escape other specials with backslash

Character Groups

.       Almost any character (usually not newline)
[ ]     Lists and ranges of characters
[^ ]    Any character except those listed

Counts (add ? for non-greedy)

*            0 or more ("perhaps some")
?            0 or 1 ("perhaps a")
+            1 or more ("some")
{n,m}        Between "n" and "m" of
{n}, {n,}    Exactly "n", "n" or more

Alternation

|    Either/or

Lookahead and Lookbehind

(?= )     Followed by
(?! )     NOT followed by
(?<= )    Following
(?<! )    NOT following

Grouping

For capture and counts
Non-capturing
Named captures
Alternation

Back references

Numbered
Relative
Named

Character group contents

x             individual chars
x-y           character range
[:class:]     posix char class
[^:class:]    negated class

Examples

[a-zA-Z0-9_]
[[:alnum:]_]

Comments

(?#comment)

Replacements

$n    reference capture

Case foldings

\u    upper case next char
\U    upper case following
\l    lower case next char
\L    lower case following
\E    end case folding

Conditional subpatterns

(?(condition)yes-pattern)
(?(condition)yes|no-pattern)

Recursive patterns

(?n)         Numbered
(?0) (?R)    Entire regex
(?&name)     Named

Conditional insertions

(?n:insertion)
(?n:insertion:otherwise)

http://www.e-texteditor.com
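A few of the entries above can be tried out in any engine that supports them. The sketch below uses Python's re module, whose flavor differs from the one this card documents in places: for instance, Python spells named captures (?P<name>...). The sample string is invented.

```python
import re

prices = "10 USD, 20 EUR, $30"

# Lookahead: digits that are followed by " USD".
before_usd = re.findall(r"[0-9]+(?= USD)", prices)

# Negative lookahead: whole numbers that are NOT followed by " USD".
not_before_usd = re.findall(r"\b[0-9]+\b(?! USD)", prices)

# Lookbehind: digits that follow a literal dollar sign.
after_dollar = re.findall(r"(?<=\$)[0-9]+", prices)

# Named captures, in Python's spelling.
m = re.match(r"(?P<amount>[0-9]+) (?P<unit>[A-Z]+)", prices)
amount, unit = m.group("amount"), m.group("unit")

print(before_usd, not_before_usd, after_dollar, amount, unit)
```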

BNF and EBNF: What are they and how do they work?

By: Lars Marius Garshol

Contents

Introduction 

What is this? 

What is BNF? 

How it works 

The principles 

A real example 

EBNF: What is it, and why do we need it? 

An EBNF sample grammar 

Uses of BNF and EBNF 

Common uses 

How to use a formal grammar 

Parsing 

The easiest way 

Top-down parsing (LL) 

An LL analysis example 

An LL transformation example 

The slightly harder way 

Bottom-up parsing (LR) 

LL or LR? 

More information 

Appendices 

Acknowledgements 

Introduction 

What is this? 

This is a short article that attempts to explain what BNF is, based on
a message posted to comp.text.sgml on 16.Jun.98. Because of this it is a
little rough, so if it leaves you with any unanswered questions, email me
and I'll try to explain as best I can.

It has been filled out substantially since then and has grown quite large. 

However, you needn't fear. The article gets more and more detailed as 

you read on, so if you don't want to dig really deep into this, just stop 

reading when the questions you are interested in have been answered 

and things start getting boring. 

What is BNF? 

Backus-Naur notation (more commonly known as BNF or Backus-Naur 

Form) is a formal mathematical way to describe a language, which was 

developed by John Backus (and possibly Peter Naur as well) to describe 

the syntax of the Algol 60 programming language. 

(Legend has it that it was primarily developed by John Backus (based on 

earlier work by the mathematician Emil Post), but adopted and slightly 

improved by Peter Naur for Algol 60, which made it well-known. Because 

of this Naur calls BNF Backus Normal Form, while everyone else calls it 

Backus-Naur Form.) 

It is used to formally define the grammar of a language, so that there is no 

disagreement or ambiguity as to what is allowed and what is not. In fact, 

BNF is so unambiguous that there is a lot of mathematical theory around 

these kinds of grammars, and one can actually mechanically construct a 

parser for a language given a BNF grammar for it. (There are some kinds 

of grammars for which this isn't possible, but they can usually be 

transformed manually into ones that can be used.) 

Programs that do this are commonly called "compiler compilers". The most 

famous of these is YACC, but there are many more. 

How it works 

The principles 

BNF is sort of like a mathematical game: you start with a symbol (called 

the start symbol and by convention usually named S in examples) and are 

then given rules for what you can replace this symbol with. The language 

defined by the BNF grammar is just the set of all strings you can produce 

by following these rules. 

The rules are called production rules, and look like this: 

symbol := alternative1 | alternative2 ... 

A production rule simply states that the symbol on the left-hand side of the 

:= must be replaced by one of the alternatives on the right hand side. The 

alternatives are separated by |s. (One variation on this is to use ::= instead

of :=, but the meaning is the same.) Alternatives usually consist of both 

symbols and something called terminals. Terminals are simply pieces of 

the final string that are not symbols. They are called terminals because 

there are no production rules for them: they terminate the production 

process. (Symbols are often called non-terminals.) 

Another variation on BNF grammars is to enclose terminals in quotes to 

distinguish them from symbols. Some BNF grammars explicitly show 

where whitespace is allowed by having a symbol for it, while other 

grammars leave this for the reader to infer. 

There is one special symbol in BNF: @, which simply means that the 

symbol can be removed. If you replace a symbol by @, you do it by just 

removing the symbol. This is useful because in some cases it is difficult to 

end the replacement process without using this trick. 

So, the language described by a grammar is the set of all strings you can 

produce with the production rules. If a string cannot in any way be 

produced by using the rules the string is not allowed in the language. 

A real example 

Below is a sample BNF grammar: 

S  := '-' FN |
      FN

FN := DL |
      DL '.' DL

DL := D |
      D DL

D  := '0' | '1' | '2' | '3' | '4' | '5' | '6' | '7' | '8' | '9'

The different symbols here are all abbreviations: S is the start symbol, FN 

produces a fractional number, DL is a digit list, while D is a digit. 

Valid sentences in the language described by this grammar are all 

numbers, possibly fractional, and possibly negative. To produce a number, 

start with the start symbol S: 

S 

Then replace the S symbol with one of its productions. In this case we 

choose not to put a '-' in front of the number, so we use the plain FN 

production and replace S by FN: 

FN 

The next step is then to replace the FN symbol with one of its productions.
We want a fractional number, so we choose the production that creates
two digit lists with a '.' between them. After that we keep
replacing one symbol with one of its productions, once per line, in the
example below:

DL . DL 

D . DL

3 . DL 

3 . D DL 

3 . D D 

3 . 1 D 

3 . 1 4 

Here we've produced the fractional number 3.14. How to produce the 

number -5 is left as an exercise for the reader. To make sure you 

understand this you should also study the grammar until you understand 

why the string 3..14 cannot be produced with these production rules. 
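The replacement game can be replayed mechanically. The following Python sketch encodes the sample grammar as a dictionary and applies the same sequence of replacements as the derivation of 3.14 above; the names GRAMMAR, replace and STEPS are invented for the illustration.

```python
# The sample grammar: each symbol maps to its list of productions.
GRAMMAR = {
    "S": [["-", "FN"], ["FN"]],
    "FN": [["DL"], ["DL", ".", "DL"]],
    "DL": [["D"], ["D", "DL"]],
    "D": [[d] for d in "0123456789"],
}

def replace(form, position, production):
    """Replace the symbol at `position` with one of its productions."""
    symbol = form[position]
    if production not in GRAMMAR.get(symbol, []):
        raise ValueError(f"{symbol} has no production {production}")
    return form[:position] + production + form[position + 1:]

# (position, production) pairs reproducing the derivation in the text.
STEPS = [
    (0, ["FN"]), (0, ["DL", ".", "DL"]), (0, ["D"]), (0, ["3"]),
    (2, ["D", "DL"]), (3, ["D"]), (2, ["1"]), (3, ["4"]),
]

form = ["S"]
for position, production in STEPS:
    form = replace(form, position, production)
    print(" ".join(form))
```

Since replace only accepts productions that the grammar lists, there is no sequence of steps that puts two '.' terminals next to each other, which is why 3..14 cannot be produced.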

EBNF: What is it, and why do we need it? 

In DL I had to use recursion (i.e., DL can produce new DLs) to express the
fact that there can be any number of Ds. This is a bit awkward and makes
the BNF harder to read. Extended BNF (EBNF, of course) solves this
problem by adding three operators:

? : which means that the symbol (or group of symbols in parentheses)
    to the left of the operator is optional (it can appear zero or one times)

* : which means that something can be repeated any number of
    times (and possibly be skipped altogether)

+ : which means that something can appear one or more times

An EBNF sample grammar 

So in extended BNF the above grammar can be written as: 

S := '-'? D+ ('.' D+)? 

D := '0' | '1' | '2' | '3' | '4' | '5' | '6' | '7' | '8' | '9' 

which is rather nicer. :) 

Just for the record: EBNF is not more powerful than BNF in terms of what 

languages it can define, just more convenient. Any EBNF production can 

be translated into an equivalent set of BNF productions. 
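The EBNF operators ?, * and + have the same meaning as in regular expressions, and this particular grammar is simple enough to be written as one. The Python sketch below is an illustration of that correspondence only (most grammars cannot be reduced to a regex); the name is_number is invented.

```python
import re

# S := '-'? D+ ('.' D+)?   written as a regular expression
NUMBER = re.compile(r"-?[0-9]+(\.[0-9]+)?")

def is_number(text):
    """True if the whole string is a sentence of the sample grammar."""
    return NUMBER.fullmatch(text) is not None

for candidate in ["3.14", "-5", "3..14", ".5"]:
    print(candidate, is_number(candidate))
```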

Uses of BNF and EBNF 

Common uses 

Most programming language standards use some variant of EBNF to
define the grammar of the language. This has two advantages: there can
be no disagreement on what the syntax of the language is, and it makes it
much easier to make compilers, because the parser for the compiler can
be generated automatically with a compiler-compiler like YACC.

EBNF is also used in many other standards, such as definitions of protocol 

formats, data formats and markup languages such as XML and SGML. 

(HTML is not defined with a grammar, instead it is defined with an SGML 

DTD, which is sort of a higher-level grammar.) 

You can see a collection of BNF grammars at the BNF web club.

How to use a formal grammar 

OK. Now you know what BNF and EBNF are and what they are used for, but
perhaps not why they are useful or how you can take advantage of them.

The most obvious way of using a formal grammar has already been 

mentioned in passing: once you've given a formal grammar for your 

language you have completely defined it. There can be no further 

disagreement on what is allowed in the language and what is not. This is 

extremely useful because a syntax description in ordinary prose is much 

more verbose and open to different interpretations. 

Another benefit is this: formal grammars are mathematical creatures and 

can be "understood" by computers. There are actually lots of programs 

that can be given (E)BNF grammars as input and automatically produce 

code for parsers for the given grammar. In fact, this is the most common 

way to produce a compiler: by using a so-called compiler-compiler that 

takes a grammar as input and produces parser code in some 

programming language. 

Of course, compilers do much more checking than just grammar checking 

(such as type checking) and they also produce code. None of these things 

are described in an (E)BNF grammar, so compiler-compilers usually have 

a special syntax for associating code snippets (called actions) with the 

different productions in the grammar. 

The best-known compiler-compiler is YACC (Yet Another Compiler 

Compiler), which produces C code, but others exist for C++, Java, Python 

as well as many other languages. 

Parsing 

The easiest way 

Top-down parsing (LL) 

The easiest way of parsing something according to a grammar in use
today is called LL parsing (or top-down parsing). It works like this: for each
production find out which terminals the production can start with. (This
is called the start set.)

Then, when parsing, you just start with the start symbol and compare the 

start sets of the different productions against the first piece of input to see 

which of the productions have been used. Of course, this can only be 

done if no two start sets for one symbol both contain the same terminal. If 

they do there is no way to determine which production to choose by 

looking at the first terminal on the input. 

LL grammars are often classified by numbers, such as LL(1), LL(0) and so 

on. The number in the parentheses tells you the maximum number of

terminals you may have to look at at a time to choose the right production 

at any point in the grammar. So for LL(0) you don't have to look at any 

terminals at all, you can always choose the right production. This is only 

possible if all symbols have only one production, and if they only have one 

production the language can only have one string. In other words: LL(0) 

grammars are not interesting. 

The most common (and useful) kind of LL grammar is LL(1) where you 

can always choose the right production by looking at only the first terminal 

on the input at any given time. With LL(2) you have to look at two terminals,

and so on. There exist grammars that are not LL(k) grammars for any 

fixed value of k at all, and they are sadly quite common. 

An LL analysis example 

As a demonstration, let's do a start set analysis of the sample grammar 

above. For the symbol D this is easy: all productions have a single digit as 

their start set (the one they produce) and the D symbol has the set of all 

ten digits as its start set. This means that we have at best an LL(1) 

grammar, since in this case we need to look at one terminal to choose the 

right production. 

With DL we run into trouble. Both productions start with D and thus both 

have the same start set. This means that one cannot see which production 

to choose by looking at just the first terminal of the input. However, we can 

easily get round this problem by cheating: if the second terminal on input 

is not a digit we must have used the first production, but if they both are 

digits we must have used the second one. In other words, this means that 

this is at best an LL(2) grammar. 

I actually simplified things a little here. The productions for DL alone don't
tell us which terminals are allowed after the first terminal in the DL := D
production, because we need to know which terminals are allowed after a
DL symbol. This set of terminals is called the follow set of the symbol, and
in this case it is '.' and the end of input.

The FN symbol turns out to be even worse, since both productions have 

all digits as their start set. Looking at the second terminal doesn't help 

since we need to look at the first terminal after the last digit in the digit list 

(DL) and we don't know how many digits there are until we've read them 

all. And since there is no limit on the number of digits there can be, this 

isn't an LL(k) grammar for any value of k at all (there can always be more 

digits than k, no matter which value of k you choose).

Somewhat surprisingly perhaps, the S symbol is easy. The first production 

has '-' as its start set, the second one has all digits. In other words, when 

you start parsing you'll start with the S symbol and look at the input to

decide which production was used. If the first terminal is '-' you know that 

the first production was used. If not, the second one was used. It's only the 

FN and DL productions that cause problems. 
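The start set analysis above can be automated for this small grammar. The sketch below is a simplified Python version that works only because the sample grammar has no empty productions and no left recursion; a general implementation needs more care. The names GRAMMAR, start_set and conflicts are invented.

```python
GRAMMAR = {
    "S": [["-", "FN"], ["FN"]],
    "FN": [["DL"], ["DL", ".", "DL"]],
    "DL": [["D"], ["D", "DL"]],
    "D": [[d] for d in "0123456789"],
}

def start_set(production):
    """Terminals that a production can begin with."""
    head = production[0]
    if head not in GRAMMAR:          # a terminal starts with itself
        return {head}
    result = set()
    for alternative in GRAMMAR[head]:
        result |= start_set(alternative)
    return result

def conflicts():
    """Symbols that have two productions with overlapping start sets."""
    clashing = set()
    for symbol, productions in GRAMMAR.items():
        sets = [start_set(p) for p in productions]
        for i in range(len(sets)):
            for j in range(i + 1, len(sets)):
                if sets[i] & sets[j]:
                    clashing.add(symbol)
    return clashing

print(sorted(conflicts()))
```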

An LL transformation example 

However, there is no need to despair. Most grammars that are not LL(k) 

can fairly easily be converted to LL(1) grammars. In this case we'll need to 

change two symbols: FN and DL. 

The problem with FN is that both productions begin with DL, but the 

second one continues with a '.' and another DL after the initial DL. This is 

easily solved: we change FN to have just one production that starts with 

DL followed by FP (fractional part), where FP can be nothing or '.' followed 

by a DL, like this: 

FN := DL FP 

FP := @ | '.' DL 

Now there are no problems with FN anymore, since there's just one
production, and FP is unproblematic because the two productions have
different start sets: end of input and '.', respectively.

The DL is a tougher nut to crack, since the problem is the recursion and 

it's compounded by the fact that we need at least one D to result from the 

DL. The solution is to give DL a single production, a D followed by DR 

(digits rest). DR then has two productions: D DR (more digits) or @ (no 

more digits). The first production has a start set of all digits, while the 

second has '.' and end of input as its start set, so this solves the problem. 

This is the complete LL(1) grammar as we've now transformed it: 

S := '-' FN | FN 

FN := DL FP 

FP := @ | '.' DL 

DL := D DR 

DR := D DR | @ 

D := '0' | '1' | '2' | '3' | '4' | '5' | '6' | '7' | '8' | '9' 
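An LL(1) grammar like this one translates almost line by line into a recursive-descent parser: one function per symbol, each choosing its production by looking at a single terminal. The Python below is a hand-written sketch of that idea, not generated code; the function names are invented, and the tail recursion in DR is written as a loop.

```python
DIGITS = set("0123456789")

def accepts(text):
    """Return True if `text` is a sentence of the LL(1) grammar."""
    pos = 0

    def peek():
        # The next terminal, or None at end of input.
        return text[pos] if pos < len(text) else None

    def parse_d():                      # D := '0' | ... | '9'
        nonlocal pos
        if peek() not in DIGITS:
            raise ValueError(f"digit expected at position {pos}")
        pos += 1

    def parse_dr():                     # DR := D DR | @
        while peek() in DIGITS:
            parse_d()

    def parse_dl():                     # DL := D DR
        parse_d()
        parse_dr()

    def parse_fp():                     # FP := @ | '.' DL
        nonlocal pos
        if peek() == ".":
            pos += 1
            parse_dl()

    def parse_fn():                     # FN := DL FP
        parse_dl()
        parse_fp()

    def parse_s():                      # S := '-' FN | FN
        nonlocal pos
        if peek() == "-":
            pos += 1
        parse_fn()

    try:
        parse_s()
    except ValueError:
        return False
    return pos == len(text)             # all input must be consumed

print(accepts("3.14"), accepts("-5"), accepts("3..14"))
```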

The slightly harder way 

Bottom-up parsing (LR) 

A harder way to parse is the one known as shift-reduce or bottom-up
parsing. This technique collects input until it finds that it can reduce an
input sequence to a symbol. This may sound difficult, so I'll give an
example to clarify. We'll parse the string '3.14' and see how it was
produced from the grammar. We start by reading 3 from the input:

3 

and then we look to see if we can reduce it to the symbol it was produced
from. And indeed we can: it was produced from the D symbol, which we
replace the 3 with. Then we note that we can produce the D from DL and
replace the D with DL. (At this point more than one reduction is possible: we
could reduce further to FN, which would be wrong. For simplicity we just
skip the wrong steps here; a real parser has to rule out these wrong
choices, typically by looking at the next piece of input.) After that we read
the . from the input and try to reduce it, but fail:

D 

DL 

DL . 

This can't be reduced to anything, so we read the next character from the 

input: 1. We then reduce that to a D and read the next character, which is 

4. 4 can be reduced to D, then to DL, and then the "D DL" sequence can 

be further reduced to a DL. 

DL . 

DL . 1 

DL . D 

DL . D 4 

DL . D D 

DL . D DL 

DL . DL 

Looking at the grammar we quickly note that FN can produce just this "DL 

. DL" sequence and do a reduction. We then note that FN can be 

produced from S and reduce the FN to S and then stop, as we've 

completed the parse. 

DL . DL 

FN 

S 

As you may have noted we could often choose whether to do a reduction 

now or wait until we had more symbols and then do a different reduction. 

There are more complex variations on this shift-reduce parsing algorithm, 

in increasing complexity and power: LR(0), SLR, LALR and LR(1). LR(1) 

usually needs impractically large parse tables, so LALR is the most

commonly used algorithm, since SLR and LR(0) are not powerful enough 

for most programming languages. 

LALR and LR(1) are too complex for me to cover here, but you get the 

basic idea. 
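To make the shift-reduce walk-through concrete, here is a small Python sketch for the original sample grammar. It is not an LR table construction: the decisions about when to reduce are hand-picked rules for this one grammar, using one terminal of lookahead, and the name shift_reduce is invented.

```python
DIGITS = set("0123456789")

def shift_reduce(text):
    """Parse `text` bottom-up; return (accepted, list of stack snapshots)."""
    stack, rest, trace = [], list(text), []
    while True:
        look = rest[0] if rest else None       # next terminal, None at end
        top = stack[-1] if stack else None
        if top in DIGITS:                      # D := digit
            stack[-1] = "D"
        elif top == "D" and look not in DIGITS:
            stack[-1] = "DL"                   # DL := D (no more digits follow)
        elif stack[-2:] == ["D", "DL"]:
            stack[-2:] = ["DL"]                # DL := D DL
        elif look is None and stack[-3:] == ["DL", ".", "DL"]:
            stack[-3:] = ["FN"]                # FN := DL '.' DL
        elif look is None and top == "DL":
            stack[-1] = "FN"                   # FN := DL
        elif stack == ["-", "FN"] or stack == ["FN"]:
            stack = ["S"]                      # S := '-' FN | FN
        elif rest:
            stack.append(rest.pop(0))          # shift the next terminal
        else:
            break                              # nothing left to do
        trace.append(" ".join(stack))
    return stack == ["S"], trace

accepted, trace = shift_reduce("3.14")
print(accepted)
print("\n".join(trace))
```

The lookahead tests are what keep this sketch from reducing the first DL to FN too early, which is the wrong step the walk-through skips.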

LL or LR? 

This question has already been answered much better by someone else, 

so I'm just quoting his news message in full here: 

I hope this doesn't start a war...

First -- Frank, if you see this, don't shoot me. (My boss is Frank

DeRemer, the creator of LALR parsing...) 

(I borrowed this summary from Fischer&LeBlanc's "Crafting a Compiler") 

Simplicity -- LL
Generality -- LALR
Actions -- LL
Error repair -- LL
Table sizes -- LL
Parsing speed -- comparable (me: and tool-dependent)

Simplicity -- LL wins

========== 

The workings of an LL parser are much simpler. And, if you have to 

debug a parser, looking at a recursive-descent parser (a common way to 

program an LL parser) is much simpler than the tables of a LALR parser. 

Generality -- LALR wins

========== 

For ease of specification, LALR wins hands down. The big 

difference here between LL and (LA)LR is that in an LL grammar you must 

left-factor rules and remove left recursion. 

Left factoring is necessary because LL parsing requires selecting an 

alternative based on a fixed number of input tokens. 

Left recursion is problematic because a lookahead token of a rule is 

always in the lookahead token on that same rule. (Everything in set A 

is in set A...) This causes the rule to recurse forever and ever and 

ever and ever... 

To see ways to convert LALR grammars to LL grammars, take a look at my 

page on it: 

http://www.jguru.com/thetick/articles/lalrtoll.html 

Many languages already have LALR grammars available, so you'd have to 

translate. If the language _doesn't_ have a grammar available, then I'd 

say it's not really any harder to write a LL grammar from scratch. (You 

just have to be in the right "LL" mindset, which usually involves 

watching 8 hours of Dr. Who before writing the grammar... I actually 

prefer LL if you didn't know...) 

Actions -- LL wins

======= 

In an LL parser you can place actions anywhere you want without 

introducing a conflict 

Error repair -- LL wins

============ 

LL parsers have much better context information (they are top-down 

parsers) and therefore can help much more in repairing an error, not to 

mention reporting errors. 

Table sizes -- LL

=========== 

Assuming you write a table-driven LL parser, its tables are nearly half 

the size. (To be fair, there are ways to optimize LALR tables to make 

them smaller, so I think this one washes...) 

Parsing speed -- comparable (me: and tool-dependent)

--Scott Stanchfield, in an article on comp.lang.java.softwaretools, Mon, 07 Jul 1997.

More information 

John Aycock has developed an unusually nice and simple to use parsing 

framework in Python called SPARK, which is described in his very 

readable paper. 

The definitive work on parsing and compilers is 'The Dragon Book', or 

Compilers : Principles, Techniques, and Tools, by Aho, Sethi and 

Ullman. Beware, though, that this is a rather advanced and mathematical 

book. 

A free online alternative, which looks rather good, is this book, but I can't 

comment on the quality, since I haven't read it yet. 

Henry Baker has written an article about parsing in Common Lisp, 

which presents a simple, high-performance and very convenient framework 

for parsing. The approach is similar to that of compiler-compilers, but 

instead relies on the very powerful macro system of Common Lisp. 

One syntax for specifying BNF grammars can be found in RFC 2234. 

Another can be found in the ISO 14977 standard. 

Appendices 

Acknowledgements 

Thanks to: 

Jelks Cabaniss, for encouraging me to turn the news article into a 

web article, and for providing very useful criticism of the article once 

it appeared in web form. 

C. M. Sperberg-McQueen for extra historical information about the 

name of BNF. 

Scott Stanchfield for writing the great comparison of LALR and LL. I 

have asked for permission to quote this, but have received no reply, 

unfortunately. 

James Huddleston for correcting me on John Backus' name. 

Dave Pawson for correcting a bad link. 

Last update 2008-08-22, by Lars M. Garshol.

Jonah Probell 

Windows Shortcut Key Quick Reference 

This is a list of some of the shortcut key combinations recognized in Microsoft 

Windows. These are largely universal across different versions of Windows and 

across different Windows programs. Many of these keyboard shortcuts have 

been adopted by Linux window managers. Macintosh and Sun's Common 

Desktop Environment have some shortcuts in common, but not many. 

The more of these keyboard shortcuts that you learn, the more efficient you 

will be at using your computer. 

These shortcuts are organized with the most frequently used shortcuts at the 

top of each category list. 

Universal Controls 

note: not all keyboards have a Menu Icon key 

Ctrl & V Paste 

Ctrl & C Copy 

Ctrl & X Cut 

Ctrl & Z Undo 

Ctrl & A Highlight all 

Ctrl & B Bold 

Ctrl & U Underline 

Ctrl & I Italic 

RightClick Open the context sensitive menu for the current cursor location 

Shift & F10 Open the context sensitive menu for the current cursor location 

MenuIcon Open the context sensitive menu for the current cursor location 

F1 Activate the help system for the foreground program 

F5 Refresh the current view 

Delete Delete an item in a program or move a file to the Recycle Bin 

Shift & Delete Delete an item permanently, without putting it in the Recycle Bin 

Dialog Box and Menu Control 

Space In a dialog box, clicks the outlined button, toggles the outlined check box, or selects the outlined option 

Enter In a dialog box, click the shadowed button 

Esc In a dialog box, the same as clicking Cancel

Alt & letter In a dialog box or menu, selects the choice with the letter underlined 

letter In a menu, selects the choice with the letter underlined 

Tab Outline the next control in a dialog box or the next window pane in the program control order 

Shift & Tab Outline the previous control in a dialog box or the previous window pane in the program control order 

Ctrl & PageDown Switch to the tab to the right in a tabbed dialog box 

Ctrl & PageUp Switch to the tab to the left in a tabbed dialog box 

Program Control 

For all running programs, the Z-order represents the order in which they are 

displayed. The program at the top of the Z-order is displayed in the 

foreground and the program at the bottom of the Z-order is displayed 

furthest in the background behind all others. 

note: Microsoft Windows Vista changed the standard Z-order behavior and 

broke the shortcuts described below. The standard behavior can be restored 

by editing the registry. To the key 

HKEY_CURRENT_USER\Software\Microsoft\Windows\CurrentVersion\Explorer 

add a REG_DWORD named AltTabSettings with value 1. 

If you do not know how to edit the registry then ask an expert or disregard 

this section of Program Control shortcuts. 
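
As a hedged illustration of the registry change described in the note above, the following Python sketch builds the text of a `.reg` file that sets the `AltTabSettings` value. The function name `alt_tab_reg_text` is an assumption made for this example; the key path, value name, type and value come from the note. The sketch only produces text and does not touch the registry itself.

```python
def alt_tab_reg_text(value=1):
    """Return .reg file text that sets AltTabSettings (REG_DWORD) under Explorer."""
    key = r"HKEY_CURRENT_USER\Software\Microsoft\Windows\CurrentVersion\Explorer"
    lines = [
        "Windows Registry Editor Version 5.00",  # standard .reg file header
        "",
        "[%s]" % key,
        '"AltTabSettings"=dword:%08x' % value,  # DWORD as 8 hex digits
        "",
    ]
    return "\r\n".join(lines)
```

Saving the returned text to a file with a `.reg` extension and importing it would be one way to apply the setting; editing the key by hand in the registry editor is equivalent.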

Alt & Tab Brings the second program in the Z-order to the foreground (top of the Z-order) 

Alt & (Tab & Tab) Brings the third program in the Z-order to the foreground (top of the Z-order) 

Alt & (Tab & Tab & Tab) etc Brings the fourth program in the Z-order to the foreground (top of the Z-order) 

Alt & F4 Closes the foreground program 

Alt & Space Display the foreground program system menu 

Alt & Space N Minimize the foreground program and put it at the bottom of the Z-order 

Alt & Shift & Tab Brings the program at the bottom of the Z-order to the foreground (top of the Z-order) 

Alt & Shift & (Tab & Tab) etc Brings the program second to the bottom of the Z-order to the foreground 

Alt & Esc Put the foreground program at the bottom of the Z-order 

Alt & Shift & Esc Bring the program at the bottom of the Z-order to the foreground 

Alt Activate menu bar options 

F10 Activate menu bar options 
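
The Z-order shortcuts in this section can be pictured as operations on a list whose first element is the foreground program. The sketch below is only a model of the behavior described in the table, not how Windows implements it; the function names are assumptions, and the list is assumed to be non-empty.

```python
def alt_tab(z_order, presses=1):
    """Alt & Tab pressed `presses` times: bring that program to the top."""
    programs = list(z_order)
    # Wrapping around past the end of the list is an assumption of this model.
    chosen = programs.pop(presses % len(programs))
    return [chosen] + programs


def alt_esc(z_order):
    """Alt & Esc: put the foreground program at the bottom of the Z-order."""
    programs = list(z_order)
    return programs[1:] + programs[:1]


def alt_shift_esc(z_order):
    """Alt & Shift & Esc: bring the bottom program to the foreground."""
    programs = list(z_order)
    return programs[-1:] + programs[:-1]
```

For example, with `["editor", "browser", "mail"]` as the Z-order, one Alt & Tab press swaps in `"browser"`, while two presses bring `"mail"` to the front and leave the other two in their previous relative order.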

Document and Window Control 

Alt & - Display the Multiple Document Interface (MDI) child window's system menu 

Alt & - N Minimize the foreground MDI child window 

Ctrl & Tab Switch to the next MDI child window 

Ctrl & F4 Closes the foreground MDI child window 

Alt & F6 Switches between multiple windows in the same program (for example the Find dialog box and the main program window) 

List and Explorer Control 

arrow keys Move the cursor within a field or the highlight within a list 

Shift & arrows Move the cursor within a field or the highlight within a list and highlight every element traversed 

Ctrl & arrow Move the cursor within a field by the next larger jump size, such as by words rather than by characters 

Ctrl & Shift & arrow Move the cursor within a field by the next larger jump size, such as by words rather than by characters, and highlight every element traversed 

Alt & Enter View properties of the currently highlighted item 

Alt & DoubleClick Display properties of a file 

Shift & Click Extend highlight from previous cursor location 

Ctrl & Click Add or remove the clicked item from the set of highlighted items 

F2 Rename highlighted item 

F3 Find files 

DragAndDrop Move or Copy the highlighted item 

Ctrl & DragAndDrop Copy or Move the highlighted item 

Ctrl & Shift & DragAndDrop Create a shortcut to the highlighted item 

Backspace Switch to the parent folder 

NumberPad+ Expand the currently highlighted folder 

RightArrow Expand the currently highlighted folder if collapsed, otherwise go to the first child 

NumberPad* Expand full tree under the currently highlighted folder 

NumberPad- Collapse the currently highlighted folder 

LeftArrow Collapse the currently highlighted folder if expanded, otherwise go to the parent 

Operating System Control 

note: not all keyboards have a Windows Logo key 

Ctrl & Esc Opens the start menu 

WindowsLogo Opens the start menu 

Ctrl & Esc Tab Activate the task bar 

WindowsLogo Tab Activate the task bar 

Ctrl & Esc P Opens the list of installed programs 

Ctrl & Esc S C Opens the Control Panel 

Ctrl & Esc U Shuts down the computer 

WindowsLogo & L Log off Windows 

WindowsLogo & E Opens the Windows Explorer for browsing local and mapped network drives 

WindowsLogo & E Windows Explorer 

WindowsLogo & R Run dialog box 

WindowsLogo & M Minimize all (this may change the order of the recently used programs list) 

Shift & WindowsLogo & M Undo minimize all (this may change the order of the recently used programs list) 

WindowsLogo & D Minimize all open windows and display the desktop (this may change the order of the recently used programs list) 

WindowsLogo & F Find files or folders 

Ctrl & WindowsLogo & F Find computer 

Ctrl & WindowsLogo & Tab Move focus from Start, to the Quick Launch toolbar, to the system tray 

WindowsLogo & Tab Cycle through task bar buttons 

WindowsLogo & Break System Properties dialog box 

Shift & InsertCD Bypass the CD automatic run feature 

© Copyright 2004-2006 Jonah Probell

Warning 

This topic is a draft and may contain wrong information. 

Keypress Command 

Ctrl + X Delete line 

Ctrl + ↩ Insert line after 

Ctrl + ⇧ + ↩ Insert line before 

Ctrl + ⇧ + ↑ Move line/selection up 

Ctrl + ⇧ + ↓ Move line/selection down 

Ctrl + L Select line - Repeat to select next lines 

Ctrl + D Select word - Repeat to select other occurrences 

Ctrl + M Jump to closing parentheses - Repeat to jump to opening parentheses 

Ctrl + ⇧ + M Select all contents of the current parentheses 

Ctrl + KK Delete from cursor to end of line 

Ctrl + K + ⌫ Delete from cursor to start of line 

Ctrl + ] Indent current line(s) 

Ctrl + [ Un-indent current line(s) 

Ctrl + ⇧ + D Duplicate line(s) 

Ctrl + J Join line below to the end of the current line 

Ctrl + / Comment/un-comment current line 

Ctrl + ⇧ + / Block comment current selection 

Ctrl + Y Redo, or repeat last keyboard shortcut command 

Ctrl + ⇧ + V Paste and indent correctly 

Keyboard Shortcuts - Windows/Linux 

Editing

Alt + [NUM] Switch to tab number [NUM] where [NUM]

© Copyright 2012, Sublime Text Community. 

Menu Symbols 

Menu Symbol Key on Keyboard 

⌘ Command/Apple Key (like Control on a PC). Also written as Cmd 

⌥ Option (like Alt on a PC) 

⇧ Shift 

⌃ Control (Control-click = Right-click) 

⇥ Tab 

↩ Return 

⌤ Enter (on Number Pad) 

⏏ Eject 

⎋ Escape 

⇞ Page Up 

⇟ Page Down 

↖ Home 

↘ End 

← → ↑ ↓ Arrow Keys 

⌫ Delete Left (like Backspace on a PC) 

⌦ Delete Right (also called Forward Delete) 

App Switcher 

Action Keystroke 

Quickly switch between 2 apps (like InDesign & Photoshop) Press Cmd-Tab to switch to last used app. Press Cmd-Tab again to switch back. NOTE: Press keys quickly and do NOT hold. 

Switch between apps Press Cmd-Tab & continue holding Cmd. While holding Cmd, to choose which app you want to switch to you can: 

press Tab (several times if needed) to scroll right 

press tilde(~) or Shift-Tab to scroll left 

use the left/right arrow keys 

aim with the mouse 

use end/home key to go to first/last app 

Quit an app in the app switcher When in the app switcher you’re already holding Cmd, so hit Q to quit selected app. 

Hide an app in the app switcher In the app switcher you’re already holding Cmd, so hit H to hide selected app. 

Cancel the app switcher In the app switcher you're already holding Cmd, so hit Esc or period(.) 

Mac Keyboard Shortcuts 

I like to figure out the fastest way to do things. I hope these keystrokes help you to become the power user that lies within. They should work on most versions of Mac OS (10.7 Lion, 10.6 Snow Leopard, 10.5 Leopard, and even 10.4 Tiger). I’ll be adding more 10.7 Lion keystrokes, so check back! 

Finder 

Open Sidebar item in a new window Cmd-Click 

Switch Finder views (Icon, List, Column, Cover Flow) Cmd-1, Cmd-2, Cmd-3, Cmd-4 

In List view, expand a folder Right Arrow 

In List view, collapse a folder Left Arrow 

Rename the selected file/folder Press Return (or Enter) 

Go into selected folder or open the selected file Cmd-Down Arrow 

Go to parent folder Cmd-Up Arrow 

Go Back Cmd-[ (that’s left bracket) 

Go Forward Cmd-] (that’s right bracket) 

Select the next icon in Icon and List views Tab (Shift-Tab reverses direction) 

Alternate columns in Column View Tab (Shift-Tab reverses direction) 

Instantly show long file name (for names condensed with a “...”) Hold Option while mousing over long filename 

Resize one column to fit the longest file name Double-Click column resize widget 

Resize all columns to fit their longest file names Option Double-Click resize widget 

Copy and Paste files Cmd-C, then Cmd-V 

Move a file instead of copying. (Copies to the destination and removes it from the original disk.) Cmd-Drag file to disk 

Move selected files to the Trash Cmd-Delete 

Empty the Trash (with warning) Cmd-Shift-Delete 

Empty the Trash (without warning) Cmd-Opt-Shift-Delete 

Cancel a drag-n-drop action while in the midst of dragging Esc 

Show Inspector (a single, live refreshing Info window) Cmd-Opt-I 

Undo the last action (such as rename file, copy file, etc.) Cmd-Z 

Hide/Show Sidebar (on the left) Cmd-Opt-T 

Move or Remove item in toolbar (at the top of the window). Works in most programs. Cmd-Drag 

Open Quick Look (Mac OS 10.5+) With file selected, tap Spacebar (or Cmd-Y) 

Zoom In/Out on a Quick Look Preview Cmd-Plus(+) or Cmd-Minus(-) 

Find by File Name (Mac OS 10.5+) Cmd-Shift-F 

Dock 

Hide all other applications (except the one you're clicking on) Command-Option click an App’s icon in Dock 

Reveal a Dock item’s location in the Finder Command Click on the icon in the Dock 

Move a Dock item to somewhere else on the hard drive Command Drag the icon from the Dock to new destination 

Force a file to open in a specific app While dragging the file onto an app’s icon in the Dock, hold Command-Option 

When in an app’s Dock menu, change the Quit to Force Quit Hold Option while in Dock menu 

Force the Dock to only resize to noninterpolated icon sizes Hold Option while dragging Dock separator 

Move Dock to left, bottom, right side of screen Hold Shift and drag Dock divider 

Change the icon size of a stack Cmd-plus(+) or Cmd-minus(-) 

Temporarily turn magnification on/off Hold Control-Shift (Mac OS 10.5+) 

Working with Text 

Some only work in Cocoa apps like Safari, Mail, TextEdit, etc. 


Go to end of line Cmd-right arrow 

Go to beginning of line Cmd-left arrow 

Go to end of all the text Cmd-down arrow 

Go to beginning of all the text Cmd-up arrow 

Go to end of current or next word Option-right arrow 

Go to beginning of current or previous word Option-left arrow 

Add Shift to the above keystrokes to make a selection to that point. 

On Laptops: Delete Text to the right of the cursor (like the Del key on a full keyboard) Function(fn)-Delete 

Non-touching (Discontinuous) text selections Command-drag 

Select non-linear areas Option-drag 

Delete entire word to the left Opt-Delete 

Look up word in dictionary Position mouse over a word and hold Cmd-Ctrl-D 

Auto-complete a word Start typing the word. Press Esc (or F5) to open suggested word list 

Switch to Outline Mode in TextEdit Press Option-Tab to convert the current line into a list item. Press Return to create another list item. Press Tab at the start of a blank list item to indent it, creating a sublist. Press Shift-Tab to remove a level of indentation. Press Return twice to decrease the indent, exiting the current sublist. 

Dashboard 


Open/Close Widget Dock Cmd-Plus(+) 

Cycle to next/previous “page” of widgets in widget dock Cmd-Right/Left Arrow 

Close a widget without having to open the widget dock Hold Option and hover over widget (close box will appear) 

Reload/Refresh a widget Cmd-R 

Screenshots 

Screenshots are saved to the Desktop as PNG in OS 10.4+ (PDF in 10.3 and prior). 


Take picture of the entire screen Cmd-Shift-3 

Take picture of a selected area Cmd-Shift-4 and Drag over an area 

New in Mac OS 10.5: While dragging: 

Hold Spacebar to move selected area. 

Hold Shift to change size in one 

direction only (horizontal or vertical) 

Hold Option for center-based resizing. 

Take picture of a specific window/object Cmd-Shift-4, then press Spacebar, then Click on the window/object 

Copy the screenshot to the clipboard instead of making a file Hold Control with the above keystrokes 

Managing Windows & Dialogs 


Switch to next window Cmd-tilde(~) 

Switch to previous window Cmd-Shift-tilde(~) 

See where the File/Folder is located (a menu will pop-up displaying the folder hierarchy). Works in most programs, including the Finder. Cmd-Click on name of the window in its titlebar 

Move a window in the background without switching to it. Cmd-Drag on the window’s titlebar 

Choose “Don’t Save” in a Dialog Cmd-D in most apps, but starting in Lion, some apps use Cmd-Delete (Cmd-D will change the location to the Desktop) 

Spotlight 


Open Spotlight Menu Cmd-Space 

Open Spotlight Window Cmd-Option-Space 

Launch Top Hit (in the Menu) Return (In Mac OS 10.4 it’s Cmd-Return) 

Reveal selected item in Finder In Spotlight Menu: 

Cmd-click item or press Cmd-Return 

In Spotlight Window: Press Cmd-R 

Skip to first result in a category Cmd up/down arrow 

Clear Spotlight’s search field Esc clears to do another search. 

Esc a second time closes spotlight menu.

Startup, Restart, Shutdown & Sleep 


Eject CD on boot Hold Mouse button down 

immediately after powering on 

OS X Safe boot Hold Shift during startup 

Start up in FireWire Target Disk mode Hold T during startup 

Startup from a CD, DVD Hold C during startup 

Bypass primary startup volume and seek a different startup volume (CD, etc.) Hold Cmd-Opt-Shift-Delete during startup 

Choose Startup disk before booting Hold Option during startup 

Start up in Verbose mode Hold Cmd-V during startup 

Start up in Single-User mode (command line) Hold Cmd-S during startup 

Force OS X startup Hold X during startup 

Shutdown immediately (no confirmation) Cmd-Opt-Ctrl-Eject 

Restart immediately (no confirmation) Cmd-Ctrl-Eject 

Sleep immediately (no confirmation) Cmd-Opt-Eject 

Show Dialog with Restart, Sleep & Shutdown Options Ctrl-Eject 

Put display to sleep Ctrl-Shift-Eject 

Miscellaneous 


Force Quit (displayed list of apps) Cmd-Opt-Esc 

Force Quit Frontmost App (no confirmation) Hold Cmd-Opt-Shift-Escape for several seconds 

Scroll using a Trackpad (like a mouse’s scroll wheel) Slide 2 fingers on the trackpad (Must be enabled in System Prefs and doesn’t work on older trackpads.) 

Right-click using a Trackpad (like on a 2 button mouse) Place 2 fingers on the trackpad and Click (Must be enabled in System Prefs and doesn’t work on older trackpads.) 

Quickly find any menu item and launch it. (Mac OS 10.5+) 

1. Press Cmd-? which is Cmd-Shift-/ 

2. In the Help menu Search that opens, start typing a few letters of your desired menu command. 

3. Arrow key down to the item you want and press Return to choose it. 

10.7 Lion: Quit & Discard Windows (Do not re-open windows) Cmd-Opt-Q 

10.7 Lion: Some apps re-open the windows that were open when you quit. To NOT have an app re-open the way it was... Hold Shift while launching an app 

Change system volume without the confirmation beeps Hold Shift while changing volume 

Completely smooth scrolling, one pixel at a time. (Only works in Cocoa apps.) Hold Option while dragging scrollbar 

Open System Preferences: These launch directly into a preference pane. Here are 2 examples. 

To open “Sound” Preferences: Hold Option & hit a Sound key (Mute, Volume Up or Down) 

To open “Displays” Preferences: Hold Option & hit a Brightness key 

Open Front Row Cmd-Esc 

Quickly Exit Front Row Press any F key, like F5. In OS 10.5+ other keys also work. 

Customize the toolbar at the top of a window. Works in the Finder, Apple Mail, Preview, etc. but not some apps, like Firefox. Cmd drag icons to rearrange. Cmd drag icon off toolbar to remove. Ctrl-click toolbar and choose Customize for more options. 

Spaces (Mac OS 10.5 and higher) 

Activate Spaces (birds-eye view of all spaces) F8 

Consolidate all windows into a Single Workspace After pressing F8, press C to consolidate (press C again to restore) 

Move to a neighboring space Ctrl-arrow key (left, right, up or down) 

Move to a specific space Ctrl-number of the space (1, 2, 3, etc.) 

Move all windows of an app to another space Cmd-Drag in Space’s birds-eye view (Control and Shift also work) 

Safari 

Switch to Next Tab Ctrl-Tab (or Cmd-Shift-Right Arrow) 

Switch to Previous Tab Ctrl-Shift-Tab (or Cmd-Shift-Left Arrow) 

Go to one of the first 9 bookmarks in the Bookmarks Bar (doesn’t work on folders) Cmd-1 through Cmd-9 

Move between found items Cmd-F, enter your search text and Press: Return to Move Forward, Shift-Return to Move Backward 

Cancel current Find Press Escape or Cmd-Period(.) 

Scroll by one full screen Scroll Down: Spacebar or Option-Down Arrow. Scroll Up: Shift-Spacebar or Option-Up Arrow 

Add to Reading List Shift-Click a link 

Apple Mail 

Go to next/previous email in a thread even if you aren’t viewing as threads Option-Up/Down Arrow 

Scroll the listing of emails at the top (not the actual contents of an email) Ctrl-Page Up/Down 

Reply to Message Cmd-R or Opt-Double Click Message 

Preview 

Choose the Scroll/Move tool Cmd-1 

Choose the Text tool Cmd-2 

Choose the Select tool Cmd-3 

Zoom In or Out Cmd-Plus(+) or Cmd-Minus(-) 

Zoom to Actual Size Cmd-0 

Scroll Large Images Hold Spacebar & drag on the image 

Emacs Key Bindings 

Only work in Cocoa apps like Safari, Mail, TextEdit, iChat, etc. 

Action Keystroke Remember As 

go to start of line (move cursor to start of line) Ctrl-A A = Start of alphabet 

go to end of line (move cursor to end of line) Ctrl-E E = End 

go up one line Ctrl-P P = Previous 

go down one line Ctrl-N N = Next 

go back a character (move cursor left) Ctrl-B B = Back 

go forward a character (move cursor right) Ctrl-F F = Forward 

delete the character to the right of the cursor Ctrl-D D = Delete 

delete the character to the left of the cursor Ctrl-H 

delete the selection or to the end of the line (acts like cutting the text) Ctrl-K K = Kill 

yank back the killed text (acts like pasting) Ctrl-Y Y = Yank 

scroll down Ctrl-V 

center the current line in the window Ctrl-L 

insert line break after the cursor without moving the cursor Ctrl-O 

transpose letters (swaps letters on left and right of cursor) Ctrl-T T = Transpose 

The Transparent Language Popularity Index 

Results: September 2012 update 

The tool 

The Language Popularity Index tool is a fully automatic, transparent, open-source and free tool to measure the popularity of programming 

languages on the Internet. 

The measurement of programming language popularity suffers from two common problems that are probably the same for most market studies: 

1. The study depends on arbitrary choices of what to measure: blogs, book sales, wikis, open-source projects, jobs or videos? Or which mix of all that? 

2. The study may depend on cumbersome semi-manual methods with their own problems: 

if the method is too complex, nobody will try to verify the results 

mistakes may remain undetected 

when a mistake becomes evident, one may either correct it but must admit and explain that the results were wrong for a long 

time, or postpone or smooth the correction. 

The first problem has no real solution since it is a question of definition. However, a fully parametrizable measurement tool may help in 

discussing the various aspects of that definition. 

We have a solution to the second problem: the tool behind the Language Popularity Index is fully automatic. 

Moreover, you can easily verify the results: 

all results, including intermediary ones, are published 

a detailed results grid lets you verify individual queries just by clicking the results 

you can download and build the automatic tool and run it yourself. 

Download 

Download the Language Popularity Index tool from the SourceForge project page. 

NB: you may also need to check the Subversion revisions to get the latest source changes and search engine configurations. 

The results 

1. The main result is the following ranking, updated from time to time on this page (but regularly in the maintainer's database): 

Language Popularity Index - Web queries done on: 2012/09/03 11:36 

Language category: any *) 

122 entries. 

Rank Name Share Last month's share Last year's share 

1 C 17.507% 16.769% 17.445% 

2 Java 16.987% 19.576% 15.002% 

3 Objective-C 10.333% 9.872% 2.582% 

4 C++ 7.730% 8.093% 9.025% 

5 Basic 7.283% 7.611% 6.302% 

6 Python 4.370% 3.759% 3.531% 

7 PHP 4.316% 4.188% 7.753% 

8 C# 4.296% 4.383% 4.914% 

9 Perl 2.312% 2.102% 6.025% 

10 Ruby 1.691% 1.656% 1.809% 

11 JavaScript 1.401% 1.402% 1.674% 

12 R 1.377% 1.281% 1.389% 

13 Pascal 1.119% 1.125% 0.993% 

14 D 1.058% 1.179% 0.843% 

15 Ada 0.955% 0.923% 0.999% 

16 Delphi 0.765% 0.749% 1.076% 

17 Go 0.733% 0.786% 0.714% 

18 Bourne shell 0.720% 0.745% 0.130% 

19 Logo 0.647% 0.673% 0.740% 

20 Fortran 0.607% 0.571% 0.454% 

21 Lua 0.586% 0.574% 0.659% 

22 COBOL 0.582% 0.547% 0.333% 

23 Lisp/Scheme 0.559% 0.555% 0.675% 

24 SAS 0.533% 0.486% 0.441% 

25 MATLAB 0.468% 0.445% 0.376% 

26 PL/SQL 0.432% 0.392% 0.543% 

27 Haskell 0.431% 0.407% 0.479% 

28 Prolog 0.405% 0.372% 0.351% 

29 Smalltalk 0.387% 0.361% 0.370% 

30 APL 0.368% 0.354% 0.410% 

31 NXT-G 0.347% 0.268% 0.089% 

32 Scratch 0.343% 0.315% 0.560% 

33 ABC 0.326% 0.309% 0.417% 

34 ML 0.323% 0.311% 0.438% 

35 Forth 0.319% 0.317% 0.495% 

36 Scala 0.304% 0.265% 0.269% 

37 Erlang 0.298% 0.277% 0.339% 

38 Awk 0.270% 0.291% 0.267% 

39 RPG (OS/400) 0.263% 0.281% 0.775% 

40 ABAP 0.230% 0.217% 0.342% 

41 Focus 0.230% 0.218% 0.466% 

42 J 0.225% 0.210% 0.414% 

43 Eiffel 0.222% 0.203% 0.213% 

44 PL/I 0.189% 0.160% 0.311% 

45 VBScript 0.183% 0.190% 0.053% 

46 Alice 0.182% 0.151% 0.551% 

47 ActionScript 0.165% 0.219% 0.202% 

48 Icon 0.157% 0.135% 0.265% 

49 LabView 0.153% 0.136% 0.143% 

50 MUMPS 0.145% 0.124% 0.122% 

51 Groovy 0.135% 0.133% 0.168% 

52 Caml/F# 0.134% 0.109% 0.259% 

53 Clojure 0.132% 0.117% 0.095% 

54 IDL 0.122% 0.103% 0.127% 

55 Dylan 0.118% 0.108% 0.113% 

56 Occam 0.118% 0.106% 0.108% 

57 Dart 0.117% 0.109% -- 

58 Oberon 0.115% 0.109% 0.122% 

59 PowerShell 0.110% 0.090% 0.132% 

60 Q 0.104% 0.082% 0.317% 

61 Oz 0.103% 0.088% 0.279% 

62 X10 0.102% 0.087% -- 

63 REXX 0.101% 0.077% 0.065% 

64 Ocaml 0.100% 0.084% 0.062% 

65 Modula-2 0.099% 0.087% 0.053% 

66 VHDL 0.099% 0.088% 0.066% 

67 Clipper 0.099% 0.076% 0.076% 

68 ColdFusion 0.098% 0.089% 0.195% 

69 PostScript 0.093% 0.069% 0.069% 

70 Factor 0.092% 0.079% 0.169% 

71 io 0.085% 0.070% 0.079% 

72 Lingo 0.083% 0.067% 0.051% 

73 Cg (Nvidia) 0.083% 0.054% -- 

74 Tcl/Tk 0.078% 0.069% 0.061% 

75 SIGNAL 0.077% 0.063% 0.142% 

76 Mathematica 0.075% 0.046% 0.046% 

77 S-lang 0.072% 0.062% 0.206% 

78 Boo 0.068% 0.061% 0.048% 

79 Natural 0.068% 0.055% 0.135% 

80 Vala 0.068% 0.060% 0.035% 

81 Limbo 0.065% 0.055% 0.068% 

82 Transact-SQL 0.062% 0.064% 0.071% 

83 Euphoria 0.062% 0.043% 0.046% 

84 Falcon 0.060% 0.041% 0.057% 

85 Verilog 0.055% 0.037% 0.021% 

86 PowerBuilder 0.054% 0.048% 0.022% 

87 Progress 0.054% 0.036% 0.207% 

88 Maple 0.051% 0.035% 0.092% 

89 CL (OS/400) 0.049% 0.051% 0.100% 

90 XSLT 0.049% 0.029% 0.018% 

91 AppleScript 0.048% 0.037% 0.039% 

92 MAX/MSP 0.047% 0.035% 0.072% 

93 Racket 0.045% 0.048% -- 

94 Fantom 0.043% 0.042% 0.031% 

95 Modula-3 0.037% 0.034% 0.042% 

96 SuperCollider 0.037% 0.033% 0.609% 

97 Genie 0.035% 0.021% 0.040% 

98 MAD 0.034% 0.027% 0.164% 

99 Rebol 0.029% 0.021% -- 

100 Cyclone 0.026% 0.023% -- 

101 Avenue 0.022% 0.016% -- 

102 Rust 0.021% 0.020% -- 

103 BlitzMax 0.021% 0.019% -- 

104 LabWindows/CVI 0.020% 0.009% 0.001% 

105 XQuery 0.018% 0.010% -- 

106 TeX / LaTeX 0.017% 0.009% 0.023% 

107 JavaFX Script 0.017% 0.008% 0.002% 

108 Coffeescript 0.015% 0.013% -- 

109 ATS 0.015% 0.011% -- 

110 SISAL 0.014% 0.013% 0.013% 

111 Csound 0.013% 0.010% -- 

112 YACC 0.012% 0.010% -- 

113 FoxPro/xBase 0.012% 0.009% 0.195% 

114 Scilab 0.011% 0.008% -- 

115 Informix/4GL 0.010% 0.007% 0.022% 

116 Parasail 0.009% 0.008% -- 

117 NetRexx 0.009% 0.008% -- 

118 Scriptol 0.008% 0.008% -- 

119 Nemerle 0.005% 0.003% 0.000% 

120 Harbour 0.004% 0.003% 0.000% 

121 Metafont 0.003% 0.003% 0.000% 

122 OpenEdge ABL 0.003% 0.002% -- 

Language category: general-purpose *) 

47 entries. 

Rank Name Share 

1 C 23.805% 

2 Java 23.098% 

3 Objective-C 14.049% 

4 C++ 10.510% 

5 Basic 9.903% 

6 C# 5.842% 

7 Pascal 1.521% 

8 D 1.439% 

9 Ada 1.299% 

10 Delphi 1.040% 

11 Go 0.997% 

12 Fortran 0.825% 

13 Haskell 0.585% 

14 Smalltalk 0.526% 

15 ML 0.439% 

16 Forth 0.434% 

17 Scala 0.414% 

18 Erlang 0.406% 

19 Eiffel 0.302% 

20 PL/I 0.256% 

21 Icon 0.214% 

22 Caml/F# 0.183% 

23 Dylan 0.161% 

24 Occam 0.160% 

25 Oberon 0.156% 

26 X10 0.139% 

27 Ocaml 0.137% 

28 Modula-2 0.134% 

29 Clipper 0.134% 

30 Factor 0.125% 

31 Boo 0.093% 

32 Vala 0.092% 

33 Limbo 0.088% 

34 Euphoria 0.085% 

35 Fantom 0.058% 

36 Modula-3 0.050% 

37 SuperCollider 0.050% 

38 Genie 0.047% 

39 MAD 0.047% 

40 Cyclone 0.035% 

41 Rust 0.029% 

42 BlitzMax 0.028% 

43 ATS 0.020% 

44 SISAL 0.019% 

45 Parasail 0.013% 

46 Nemerle 0.007% 

47 Harbour 0.006% 

Language category: script *) 

49 entries. 

Rank Name Share 

1 Python 19.699% 

2 PHP 19.459% 

3 Perl 10.424% 

4 Ruby 7.625% 

5 JavaScript 6.315% 

6 R 6.208% 

7 Bourne shell 3.246% 

8 Lua 2.643% 

9 Lisp/Scheme 2.522% 

10 MATLAB 2.111% 

11 APL 1.659% 

12 NXT-G 1.565% 

13 Scratch 1.545% 

14 ABC 1.468% 

15 Awk 1.218% 

16 J 1.013% 

17 VBScript 0.824% 

18 Alice 0.820% 

19 ActionScript 0.745% 

20 Groovy 0.609% 

21 Clojure 0.595% 

22 IDL 0.552% 

23 Dart 0.527% 

24 PowerShell 0.495% 

25 Q 0.470% 

26 Oz 0.463% 

27 REXX 0.455% 

28 ColdFusion 0.440% 

29 PostScript 0.418% 

30 io 0.385% 

31 Lingo 0.376% 

32 Tcl/Tk 0.354% 

33 Mathematica 0.338% 

34 S-lang 0.325% 

35 Falcon 0.270% 

36 PowerBuilder 0.245% 

37 Maple 0.228% 

38 CL (OS/400) 0.222% 

39 AppleScript 0.216% 

40 MAX/MSP 0.211% 

41 Racket 0.201% 

42 Rebol 0.132% 

43 TeX / LaTeX 0.076% 

44 JavaFX Script 0.075% 

45 Coffeescript 0.068% 

46 Scilab 0.048% 

47 NetRexx 0.041% 

48 Scriptol 0.038% 

49 Metafont 0.015% 
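
The page does not say how the per-category shares are derived, but the published numbers look consistent with a simple renormalization: each category share appears to be the overall share divided by the category's slice of the whole index. The Python sketch below checks that reading against a few values from the rankings above. The function names are assumptions, and the renormalization itself is an inference from the data, not a documented property of the tool.

```python
# Shares in percent, copied from the "any" and "general-purpose" rankings.
overall = {"C": 17.507, "Java": 16.987, "Objective-C": 10.333}
general = {"C": 23.805, "Java": 23.098, "Objective-C": 14.049}


def category_total(overall_shares, category_shares):
    """Estimate the category's slice of the overall index, in percent."""
    ratios = [overall_shares[name] / category_shares[name]
              for name in category_shares]
    return 100.0 * sum(ratios) / len(ratios)


def share_within_category(overall_share, total):
    """Renormalize an overall share (percent) to a category holding `total` percent."""
    return 100.0 * overall_share / total


total = category_total(overall, general)
cpp_in_category = share_within_category(7.730, total)  # C++ overall share
```

Under this reading the general-purpose category holds roughly 73.5% of the index, and renormalizing the overall C++ share of 7.730% gives a value close to the 10.510% listed in the general-purpose ranking.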

2. But wait: since we are transparent, we also provide all necessary data to reproduce the above results in the following table. Note that you can click on the count in each cell (language, engine) to see the results yourself. Depending on your location, browser, language settings, cookies, etc., the engine may return a slightly different search count, +/- a few percent. Note also that the time lag between now and the time at which the grid was obtained (displayed at the top) is crucial: the search engine data are very dynamic, especially, of course, those with a one-year filter. 
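
The grid's "Name in query" column lists some names in percent-encoded form, for example `C%23`, `C%2B%2B` and `F%20sharp`. As a small illustration, assuming these are ordinary URL-encoded query terms, Python's standard `urllib.parse` module produces and decodes the same strings. The helper name `name_in_query` is an assumption made for this sketch and is not part of the tool.

```python
from urllib.parse import quote, unquote


def name_in_query(search_term):
    """Percent-encode a search term so it can be placed in a query URL."""
    return quote(search_term, safe="")


encoded = [name_in_query(term) for term in ("C#", "C++", "F sharp")]
decoded = [unquote(term) for term in encoded]
```

Encoding matters here because characters such as `#` and `+` have their own meaning inside a URL, so an unencoded `C#` or `C++` would not reach the search engine as typed.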

Language Popularity Index - Web queries done on: 2012/09/03 11:36 

Language 

display name 

Name in query 

Search 

engine → 

Category's 

short 

name ↓ 

Google Yahoo! Bing 

1-year 

filter 

1-year 

filter 

1-year 

filter 

Google 

Blogs 

1-year 

filter 

Amazon YouTube Wikipedia 

no filter no filter no filter 

Weight → 

Normalized → 

Results Results Results Results Results Results Results Confidence ↓ 

ABAP ABAP other 14 800 293 3 690 6 840 9 39 6 100% 

ABC ABC script 14 700 1 400 7 520 3 480 3 32 210 23% 

ActionScript ActionScript script 6 450 522 3 350 1 430 13 88 8 100% 

Ada Ada general-purpose 5 680 440 4 700 13 500 215 234 108 100%

Alice Alice script 1 300 418 3 120 2 030 4 675 2 100% 

APL APL script 1 120 512 3 440 501 12 30 54 100% 

AppleScript AppleScript script 186 275 1 660 336 5 17 3 100% 

ATS ATS general-purpose 258 125 125 3 0 3 1 100%

Avenue Avenue other 79 292 1 700 2 0 2 3 55% 

Awk Awk script 2 140 551 4 090 12 100 20 7 20 100% 

Basic Basic general-purpose 323 000 4 930 28 400 121 000 2 519 2 560 355 97%

BlitzMax BlitzMax general-purpose 286 99 96 4 1 9 2 100%

Boo Boo general-purpose 266 242 1 670 5 0 113 5 100%

Bourne shell Bash script 3 120 441 3 350 96 000 6 39 0 100% 

C C general-purpose 826 000 21 200 124 000 509 000 3 295 9 730 1 018 86%

C# C%23 general-purpose 196 000 3 220 17 600 253 000 342 1 840 76 100%

C++ C%2B%2B general-purpose 527 000 5 640 29 700 352 000 795 3 810 251 85%

Cg (Nvidia) Cg other 629 373 2 750 3 470 2 53 6 80% 

Caml/F# F%20sharp general-purpose 627 1 470 1 460 2 750 1 23 5 100%

CL (OS/400) CL script 2 230 225 1 840 420 7 2 1 100% 

Clipper Clipper general-purpose 786 417 3 010 556 13 6 8 100%

Clojure Clojure script 3 490 532 3 820 3 580 3 14 6 100% 

COBOL COBOL other 11 300 1 090 5 760 4 180 372 89 25 100% 

ColdFusion ColdFusion script 2 140 335 2 810 784 2 39 7 100% 

Coffeescript Coffeescript script 297 111 111 37 1 3 1 100% 

Csound Csound other 30 124 123 4 0 1 1 100% 

Cyclone Cyclone general-purpose 56 154 154 2 0 1 3 100%

D D general-purpose 16 600 592 6 740 13 700 19 3 890 82 77%

Dart Dart script 2 540 391 2 820 2 920 1 38 7 100% 

Delphi Delphi general-purpose 45 500 1 020 5 650 20 700 145 178 18 100%

Dylan Dylan general-purpose 215 296 1 970 61 1 80 14 100%

Eiffel Eiffel general-purpose 828 484 3 240 418 4 90 28 100%

Erlang Erlang general-purpose 6 200 350 4 760 2 050 13 41 33 100%

Euphoria Euphoria general-purpose 670 390 2 950 6 0 8 4 100%

Factor Factor general-purpose 655 364 2 290 314 6 149 5 100%

Falcon Falcon script 455 246 1 640 119 0 13 6 100% 

Fantom Fantom general-purpose 432 142 143 7 0 61 4 100%

Focus Focus other 916 488 2 970 233 2 261 25 100% 

Forth Forth general-purpose 1 530 308 1 970 1 780 20 77 44 100%

Fortran Fortran general-purpose 5 890 504 4 120 3 380 340 40 49 100%

FoxPro/xBase Fox%20Pro other 64 172 171 3 1 10 0 100% 

Genie Genie general-purpose 114 233 1 390 3 1 23 2 100%

Go Go general-purpose 25 500 447 4 760 13 600 9 1 170 39 100%

Groovy Groovy script 2 800 358 2 420 608 4 73 12 100% 

Harbour Harbour general-purpose 53 74 74 2 0 0 0 100%

Haskell Haskell general-purpose 2 390 597 3 960 1 890 18 99 58 100%

Icon Icon general-purpose 1 390 584 4 080 629 14 77 13 100%

Informix/4GL Informix%2F4GL other 55 130 130 4 0 13 0 100% 

IDL IDL script 2 780 310 3 070 1 050 11 13 10 100% 

io io script 667 446 3 020 676 1 66 5 100% 

J J script 1 390 544 3 910 1 630 21 63 24 100% 

Java Java general-purpose 885 000 14 900 90 400 422 000 2 007 10 300 660 100%

JavaFX Script JavaFX%20Script script 145 182 1 340 1 0 0 0 100% 

JavaScript JavaScript script 61 600 2 020 12 500 35 600 341 664 42 100% 

LabView LabView other 5 090 520 4 140 1 060 17 221 3 100% 

LabWindows/CVI CVI other 261 218 1 320 3 1 0 0 100% 

Limbo Limbo general-purpose 198 238 1 330 72 0 0 8 100%

Lingo Lingo script 580 430 2 780 291 1 16 7 100% 

Lisp/Scheme Scheme script 5 050 417 4 800 3 120 23 78 77 100% 

Logo Logo other 6 480 600 4 410 2 040 43 1 310 53 100% 

Lua Lua script 16 200 488 4 900 8 340 14 582 47 100% 

MAD MAD general-purpose 460 364 2 280 308 0 15 7 45%

Maple Maple script 411 405 2 900 237 7 11 1 100% 

Mathematica Mathematica script 907 643 4 350 442 7 14 1 100% 

MATLAB MATLAB script 23 300 578 4 940 12 900 105 206 11 100% 

MAX/MSP MAX%2FMSP script 410 313 1 710 183 0 13 3 100% 

Metafont Metafont script 10 62 62 1 0 0 0 100% 

ML ML general-purpose 2 040 418 3 080 522 16 81 44 100%

Modula-2 Modula%2D2 general-purpose 183 276 1 870 3 55 4 7 100%

Modula-3 Modula%2D3 general-purpose 10 141 141 3 2 1 5 100%

MUMPS MUMPS other 403 440 3 510 211 8 8 17 100% 

Natural Natural other 1 610 615 5 850 850 0 360 3 45% 

Nemerle Nemerle general-purpose 48 85 83 4 0 0 0 100%

NXT-G NXT%2DG script 1 960 3 140 3 130 20 200 23 48 1 100% 

Oberon Oberon general-purpose 152 258 1 670 5 5 1 16 100%

Objective-C Objective%2DC general-purpose 24 500 4 590 4 600 1 440 000 21 384 17 100%

Ocaml Ocaml general-purpose 740 368 2 540 83 0 15 11 100%

Occam Occam general-purpose 225 303 1 830 7 3 5 16 100%

OpenEdge ABL OpenEdge other 15 48 47 0 0 0 0 100% 

Oz Oz script 811 330 3 140 187 0 49 10 100% 

Parasail Parasail general-purpose 112 54 54 3 0 0 1 100%

Pascal Pascal general-purpose 10 400 445 4 540 8 680 566 311 95 100%

Perl Perl script 150 000 2 020 11 200 39 500 145 243 123 100% 

PHP PHP script 173 000 2 730 16 700 230 000 130 2 530 146 100% 

PL/I PL%2FI general-purpose 602 475 3 290 140 23 3 23 100%

PL/SQL PL%2FSQL other 22 100 451 4 720 21 000 80 46 5 100% 

PostScript PostScript script 741 483 3 630 515 5 0 7 100% 

PowerBuilder PowerBuilder script 528 260 1 940 203 42 1 0 100% 

PowerShell PowerShell script 4 510 356 2 910 4 740 11 6 1 100% 

Progress Progress other 733 532 3 590 196 0 19 0 100% 

Prolog Prolog other 5 300 574 4 740 5 560 111 242 30 100% 

Python Python script 140 000 9 530 53 900 68 200 190 2 910 283 100% 

NetRexx NetRexx script 15 65 67 0 0 0 1 100% 

RPG (OS/400) RPG other 4 840 354 4 500 2 050 39 1 310 9 70% 

Q Q script 1 330 457 3 340 476 2 6 9 100% 

R R script 34 200 2 190 12 200 27 500 48 859 109 100% 

Racket Racket script 285 164 166 8 0 36 5 100% 

Rebol Rebol script 193 196 1 250 4 0 6 2 100% 

REXX REXX script 858 392 4 080 194 3 10 9 100% 

Ruby Ruby script 52 400 1 330 7 690 32 500 55 881 146 100% 

Rust Rust general-purpose 364 132 132 57 0 1 2 100%

SAS SAS other 29 100 1 620 8 820 12 100 113 91 6 100% 

Scala Scala general-purpose 9 700 451 3 400 2 180 7 159 27 100%

Scilab Scilab script 130 164 164 3 1 2 0 100% 

Scratch Scratch script 5 700 425 3 300 2 440 13 758 21 100% 

Scriptol Scriptol script 10 52 52 0 0 0 1 100% 

SIGNAL SIGNAL other 699 507 4 220 276 3 16 11 65% 

SISAL SISAL general-purpose 20 53 53 0 0 0 2 100%

S-lang S%2Dlang script 164 238 1 660 6 1 0 9 100% 

Smalltalk Smalltalk general-purpose 992 555 3 780 1 100 12 34 56 100%

SuperCollider SuperCollider general-purpose 308 166 167 141 0 15 4 100%

Tcl/Tk Tcl%2FTk script 1 010 354 2 160 2 240 15 45 2 100% 

TeX / LaTeX TeX script 213 191 1 120 4 0 3 0 100% 

Transact-SQL Transact%2DSQL other 1 860 199 1 540 1 950 4 29 2 100% 

Vala Vala general-purpose 296 200 1 050 2 400 0 6 6 100%

VBScript VBScript script 998 529 3 950 2 250 168 15 1 100% 

Verilog Verilog other 1 130 435 3 430 276 6 18 0 100% 

VHDL VHDL other 3 470 467 3 010 1 360 13 47 2 100% 

X10 X10 general-purpose 422 298 2 200 6 1 37 12 100%

XSLT XSLT other 396 443 3 140 374 0 2 1 100% 

XQuery XQuery other 33 113 113 2 0 0 2 100% 

YACC YACC other 51 118 119 4 0 1 1 100% 

Documentation of the parameters, resources, credits can be found in the "search.xls" file, in the project's lang-index-*.zip archive.

Some links about the subject: 


http://www.langpop.com/ 

http://www.blackducksoftware.com/oss/projects#languageos 

http://www.complang.tuwien.ac.at/anton/comp.lang-statistics/ 

http://www.tiobe.com/index.php/content/paperinfo/tpci/index.html 

http://groups.google.ch/groups/dir?sel=usenet%3Dcomp.lang,& 

http://www.altiusdirectory.com/Computers/programming-languages-list.html 

http://99-bottles-of-beer.net/ 

http://rosettacode.org/ 

Wikipedia article 

For bug reports, issues, wishes, patches, please go to the "develop" section on the project page, here. 

For any request concerning historical statistics (LPI monthly values) or specific analyses (trends, data mining, ...), please write an e-mail to the following address:

Some news can be found on this blog. 

Sponsoring is welcome (even the equivalent of one beer)! 

*) 

Some explanations about the choice of the categories "general-purpose", "script" and "other". The choices are subjective and cannot be perfect because the category definitions overlap somewhat.

The union of all categories, "Any", covers all queried languages. If you don't like the attempt at categorization at all, stick with the "Any" results column and just ignore the others!

"General-purpose" covers general-purpose languages, usually compiled or compilable. Variables don't exist before or after the program's execution. After compilation and linking, a standalone executable is available.

"Script" covers mostly interpreted languages (optionally compiled), often with dynamic typing. They are often interactive or have an interactive mode, where commands or scripts can be run in any order. Variables are set on the fly, may change type, need not be declared, and may exist before the program's execution. Some may persist after the program's execution and can be reused for the next execution, with different values. Most programs need a specific environment in which to run.

"Other" covers specialized languages, also called Domain-Specific Languages (DSLs). They are bound to a specific purpose such as education, a database system, an accounting system, administration software, a sound or graphics system, a document display, or hardware description. Using them outside their dedicated purpose may be theoretically possible, but is likely to be difficult and is definitely not appropriate.


The Transparent Language Popularity Index. Ada programming.


The Hows and Whys of Commenting C and C++ Code 

By Alex Allain

Programs are meant to be beautiful. If someone tells you otherwise, you'd probably do best not to listen to the rest of his advice. A good program is beautiful both in its concept (the algorithm used, the design, the flow of control) and in its readability. Good code is not replete with question mark colon operators and pointer arithmetic, or at least not when the code doesn't need to be optimized to save a few seconds every few months of operation. But readable code, while a nice ideal, often requires some help from the English (or some other) language. Sometimes the algorithm is too complex to be understood rapidly or completely without some explanation, or the code requires some esoteric function call from the C library that has a name both cryptic and misleading. And if you ever plan to code for a living, you will almost certainly have to go back several years later to modify the one section of code that you didn't feel like commenting -- or someone else will, and will curse your name. Finally, commenting code can also lead to a better understanding of the program and may help uncover bugs before testing. For both aesthetic and practical reasons, good commenting is an essential and often overlooked programming skill.

Before discussing how to comment, a few words of warning: it is possible to comment too much. Just as good writing is 

spare, so too is good coding. You do not want to include a comment telling the reader something obvious, or something 

that can be discerned from a single line of code. (You are probably not writing your code to teach someone how to 

program!) For instance, "int callCount = 0; //declares an integer variable," conveys no meaningful information. 

Comments should reveal the underlying structure of the code, not the surface details. A more meaningful comment 

would be "int callCount = 0; //holds number of calls to [function]." 

Comments should not be overly long. Comments should not give details for the sake of details; only when a fact is 

necessary or interesting should it be brought to the attention of the program's reader. If you were reading someone 

else's program, you would not want to be forced to pick through paragraphs of text describing the intricacies of a for 

loop's operation only to realize that you could have discovered the same information simply by having read the for loop. 

For the sake of future readers, you should generally include some header information at the top of your program. This 

information may include your name and contact information, the date the code was last modified, the purpose of the 

program, and if necessary, a brief exposition of the algorithm used or the design decisions made. You may also want to 

include a list of known bugs, inefficiencies, or suggestions for improvement. It is convenient to demarcate this section of 

the program file in a large comment block of the form 

/******** 

* * 

* * 

********/ 

This form of comment is best written once the program is finished, except perhaps for the overview of the algorithm, which you may find helps to crystallize your thoughts when you are dealing with new, confusing, or complex concepts.

Second, whenever you create a new class or a new function definition, you should add a comment explaining what the 

class or function does. For a class, you should explain the purpose of the class, what publicly accessible functions and 

variables are available, any limitations of the class, and information that the programmer may need if he or she wanted 

to inherit from the class. When defining a function, you should describe what the function does and whether or not it 

has side effects such as changing global variables, interacting with the user, or so forth. It is also convenient to describe 

what sort of arguments the function takes and what, if any, value it returns: e.g., if you have a function findTime(int 

distance, int speed), you would want to tell the user that the distance is in rods, and the speed is in furlongs per 

fortnight, and that the function returns the time taken by the travel in epochs. 

You could make the variable names more descriptive, but the increased descriptiveness is unnecessary inside the 

function because the relationship between distance and speed is the significant feature, not the relationship between 

distanceinrods and speedinfurlongsperfortnight. As you can see, long names are unnecessarily complex and can lead to 

hard-to-find typos. In general, variable names should be only descriptive enough to express the relationship between 

variables. Any implementation details should be handled by comments at points where the details may matter. For 

instance, in the above function, the function's caller must know the units, but those units do not matter inside the 

function (except when conversions are being made, but the programmer should make a note of this at the time of the 

conversion). In order to avoid confusion, you should eschew abbreviations; a word can be abbreviated many ways, and 

a single abbreviation can describe more than one word. Avoiding abbreviations avoids this problem. 
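As a rough illustration of the findTime advice, here is a minimal sketch written in Python rather than the article's C++. Only the function's name and the units of its two arguments come from the text. The body, the RODS_PER_FURLONG constant and the decision to return fortnights (where the article jokingly says "epochs") are assumptions made for the example.

```python
# Assumption for this sketch: one furlong is 40 rods (the customary definition).
RODS_PER_FURLONG = 40


def find_time(distance, speed):
    """Return the travel time in fortnights.

    distance -- length of the trip, in rods
    speed    -- travel speed, in furlongs per fortnight (must be positive)

    No side effects: nothing is printed and no global state is changed.
    """
    if speed <= 0:
        raise ValueError("speed must be positive")
    # The only place where units matter inside the function: rods -> furlongs.
    distance_in_furlongs = distance / RODS_PER_FURLONG
    return distance_in_furlongs / speed


print(find_time(80, 4))  # 80 rods at 4 furlongs per fortnight: prints 0.5
```

The parameter names stay short, as the article recommends. The units appear once in the header comment for the caller and once more at the point of conversion.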

Third, when adding comments directly to your code, be spare. The less you say, the better; if you force yourself to be 

precise, you will make the comments more helpful, and you will force yourself to synthesize the flow of your program 

rather than allow yourself to repeat what the code already tells the user. The code should fit seamlessly into the


algorithm rather than wrap entirely around it and smother the logic embedded in the C++. 

Fourth, good commenting can improve your programming. You can use comments as an organizational device by writing them before filling in the code -- for instance, when you have a long block of conditional statements, you may wish to

comment what should happen when each conditional is executed before you flesh out the code. In doing so, you save 

yourself the burden of remembering the details of the entire program, allowing you to concentrate on the 

implementation of one aspect at a time. Additionally, when you force yourself to comment during the programming 

process, you cannot get away with writing code that "you hope will work" if you make yourself explain why it works. 

(Keep in mind that if the code is so complex that you don't know how it works, then you probably should be 

commenting it for the sake of both yourself and others.) 

Finally, keep in mind that what seems obvious now may not seem obvious later. While you shouldn't excessively 

comment, do make sure to comment things that are nonstandard algorithms. You do not need to comment a 

programming idiom, but you do want to comment an algorithm you designed for the program, no matter how simple it 

may seem to you. No doubt it will seem foreign three weeks after you write the code, and if you plan (and even if you 

do not plan) to come back to the code, it will be immeasurably helpful. 

Comments are for yourself and others. You may be forced to work with uncommented code, and it helps to comment 

the code as you work through what it does. This can be as simple as renaming the variables from, say, r, x, and y to currentNumber, largestPrime, and currentDivisor. With any luck, after one or two of these experiences you will recognize the wisdom of commenting your code. Moreover, you will see the greater elegance of a well-commented, carefully written piece of code in comparison to a hack thrown together only to "work."
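The article does not show the code behind the names r, x, and y, so the following is a hypothetical reconstruction, sketched in Python rather than C++: a routine that finds the largest prime factor of a number, first as it might be inherited and then after renaming and commenting. The function names, and the assumption that the variables belong to such a routine, are illustrative only.

```python
def f(r):
    # Inherited version: correct, but the reader has to decode r, x and y.
    x = 1
    y = 2
    while y * y <= r:
        if r % y == 0:
            x = y
            r //= y
        else:
            y += 1
    if r > 1:
        x = r
    return x


def largest_prime_factor(current_number):
    """Return the largest prime factor of current_number (1 if there is none)."""
    largest_prime = 1
    current_divisor = 2
    # Trial division: strip out each factor as soon as it is found, so that
    # every divisor that succeeds is necessarily prime.
    while current_divisor * current_divisor <= current_number:
        if current_number % current_divisor == 0:
            largest_prime = current_divisor
            current_number //= current_divisor
        else:
            current_divisor += 1
    # Whatever remains above 1 is a prime larger than any divisor tried.
    if current_number > 1:
        largest_prime = current_number
    return largest_prime


print(largest_prime_factor(13195))  # 13195 = 5 * 7 * 13 * 29, prints 29
```

Both functions compute the same result. Only the second tells the reader what is being computed and why the loop may stop early.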


Copyright © 1997-2011 Cprogramming.com. All rights reserved.


Welcome to 99 Bottles of Beer 

99 Bottles of Beer 

one program in 1500 variations 

This website holds a collection of the song 99 Bottles of Beer programmed in different programming languages. The song is represented in 1500 different programming languages and variations. For more detailed information, refer to the historic information.

All these little programs generate the lyrics of the song 99 Bottles of Beer as output. In case you do not know the song, you will find the lyrics here.

Feel free to browse, comment on and rate the different programming languages. If your favourite programming language is missing, please submit your own piece of code. After a short review it will appear on the website.

For any comments, criticism or praise concerning this website, drop a message in our guestbook or contact one of the team members.

Have a lot of fun, 

Oliver, Gregor and Stefan 


Programming Paradigms

2. Overview of the four main programming paradigms

In this section we will characterize the four main programming paradigms, as identified 

in Section 1.2. 

As the main contribution of this exposition, we attempt to trace the basic discipline and 

the idea behind each of the main programming paradigms. 

With this introduction to the material, we will also be able to see how the functional 

programming paradigm corresponds to the other main programming paradigms. 

2.1 Overview of the imperative paradigm
2.2 Overview of the functional paradigm
2.3 Overview of the logic paradigm
2.4 Overview of the object-oriented paradigm

2.1. Overview of the imperative paradigm 


First do this and next do that

'First do this, next do that' is a short phrase that captures the spirit of the imperative paradigm in a nutshell. The basic idea is the command, which has a measurable effect on the program state. The phrase also reflects that the order of the commands is important. 'First do that, then do this' would be different from 'first do this, then do that'.

In the itemized list below we describe the main properties of the imperative paradigm. 

Characteristics:

- Discipline and idea
  - Digital hardware technology and the ideas of Von Neumann
- Incremental change of the program state as a function of time
- Execution of computational steps in an order governed by control structures
  - We call the steps commands
- Straightforward abstractions of the way a traditional Von Neumann computer works
- Similar to descriptions of everyday routines, such as food recipes and car repair
- Typical commands offered by imperative languages
  - Assignment, IO, procedure calls
- Language representatives
  - Fortran, Algol, Pascal, Basic, C
- The natural abstraction is the procedure
  - Abstracts one or more actions to a procedure, which can be called as a single command
  - "Procedural programming"

We use several names for the computational steps in an imperative language. The word statement is often used with the special computer science meaning 'an elementary instruction in a source language'. The word instruction is another possibility; we prefer to reserve this word for the computational steps performed at the machine level. We will use the word 'command' for the imperatives in a high-level imperative programming language.

A procedure abstracts one or more actions, which can then be activated as a single action.
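The points above can be sketched in a few lines of Python, used here only as a convenient imperative notation (the notes themselves work in Scheme). The account dictionary and the deposit procedure are made-up names for the illustration.

```python
def deposit(account, amount):
    """Procedure: bundles several commands so they can be issued as one."""
    account["balance"] = account["balance"] + amount  # assignment changes the state
    account["history"].append(amount)                 # so does this command


account = {"balance": 0, "history": []}  # the initial program state

deposit(account, 100)      # first do this ...
account["balance"] *= 2    # ... and next do that
print(account["balance"])  # prints 200

# The same two commands in the opposite order leave a different final state.
other = {"balance": 0, "history": []}
other["balance"] *= 2
deposit(other, 100)
print(other["balance"])    # prints 100
```

Each command changes the program state a little, and the final result depends on the order in which the commands are executed.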

2.2. Overview of the functional paradigm 


Here we introduce the functional paradigm at the same level as the imperative paradigm in Section 2.1.

Functional programming is in many respects a simpler and cleaner programming paradigm than the imperative one. The reason is that the paradigm originates from a purely mathematical discipline: the theory of functions. As described in Section 2.1, the imperative paradigm is rooted in the key technological ideas of the digital computer, which are more complicated and less 'clean' than mathematical function theory.

Below we characterize the most important overall properties of the functional programming paradigm. Needless to say, we will come back to most of them in the remaining chapters of this material.

Evaluate an expression and use the resulting value for something

- Mathematics and the theory of functions
- The values produced are non-mutable
  - Impossible to change any constituent of a composite value
  - As a remedy, it is possible to make a revised copy of a composite value
- Atemporal
  - Time only plays a minor role compared to the imperative paradigm
- Applicative
  - All computations are done by applying (calling) functions
- The natural abstraction is the function
  - Abstracts a single expression to a function which can be evaluated as an expression
- Functions are first class values
  - Functions are full-fledged data just like numbers, lists, ...
- Fits well with computations driven by needs
  - Opens a new world of possibilities
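A few of these properties can be sketched in Python, used here only for illustration in place of the Scheme the notes go on to use. All names (with_x, compose, square, increment) are invented for the example.

```python
from functools import reduce

point = (1, 2)  # a composite value; a tuple cannot be changed in place


def with_x(p, new_x):
    """Return a revised copy of p instead of modifying p."""
    return (new_x, p[1])


def compose(f, g):
    """Functions are ordinary values: take two of them, return a new one."""
    return lambda x: f(g(x))


def square(x):
    return x * x


def increment(x):
    return x + 1


moved = with_x(point, 10)
square_after_increment = compose(square, increment)
# The whole computation is expressed by applying functions to values.
total = reduce(lambda acc, x: acc + x, map(square, [1, 2, 3]), 0)

print(point, moved)               # prints (1, 2) (10, 2)
print(square_after_increment(3))  # prints 16
print(total)                      # prints 14
```

Nothing is ever overwritten: point is still (1, 2) after moved has been computed, and the new function returned by compose is passed around like any other value.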

2.3. Overview of the logic paradigm 


The logic paradigm is dramatically different from the other three main programming paradigms. The logic paradigm fits extremely well when applied in problem domains that deal with the extraction of knowledge from basic facts and relations. The logic paradigm seems less natural in the more general areas of computation.

Answer a question via search for a solution

Below we briefly characterize the main properties of the logic programming paradigm.

- Automatic proofs within artificial intelligence
- Based on axioms, inference rules, and queries
- Program execution becomes a systematic search in a set of facts, making use of a set of inference rules
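To make the idea of "execution as search" concrete, here is a minimal sketch in Python. A real logic language such as Prolog performs the search automatically from the facts and rules alone. In this sketch the search is written out by hand, and the facts, names and the ancestors function are all invented for the example.

```python
# Facts: (X, Y) means "X is a parent of Y". The people are made up.
PARENT = {("anna", "bob"), ("bob", "carl"), ("carl", "dora")}


def ancestors(person):
    """Answer the query 'who is an ancestor of person?' by searching the facts.

    Inference rules applied:
      ancestor(X, Y) if parent(X, Y)
      ancestor(X, Y) if parent(X, Z) and ancestor(Z, Y)
    """
    found = set()
    frontier = [person]
    while frontier:
        current = frontier.pop()
        for parent, child in PARENT:
            if child == current and parent not in found:
                found.add(parent)        # first rule: a parent is an ancestor
                frontier.append(parent)  # second rule: so are the parent's ancestors
    return found


print(sorted(ancestors("dora")))  # prints ['anna', 'bob', 'carl']
print(sorted(ancestors("anna")))  # prints []
```

The program states what is known and what follows from it. The answer to a query is whatever the search over those facts turns up.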

2.4. Overview of the object-oriented paradigm 


The object-oriented paradigm has gained great popularity over the past decade. The primary and most direct reason is undoubtedly its strong support for encapsulation and the logical grouping of program aspects. These properties become very important as programs grow larger and larger.

The underlying, and somewhat deeper, reason for the success of the object-oriented paradigm is probably its conceptual anchoring. An object-oriented program is constructed with its starting point in concepts that are important in the problem domain of interest. In that way, all the necessary technicalities of programming take second place.

Send messages between objects to simulate the temporal evolution of a set of real world phenomena

As for the other main programming paradigms, we will now describe the most important properties of object-oriented programming, seen as a school of thought in the area of computer programming.



- The theory of concepts, and models of human interaction with real world phenomena
- Data as well as operations are encapsulated in objects
- Information hiding is used to protect internal properties of an object
- Objects interact by means of message passing
  - A metaphor for applying an operation on an object
- In most object-oriented languages objects are grouped in classes
  - Objects in classes are similar enough to allow programming of the classes, as opposed to programming of the individual objects
  - Classes represent concepts whereas objects represent phenomena
- Classes are organized in inheritance hierarchies
  - Provides for class extension or specialization
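The properties listed above can be sketched with a small, made-up example in Python. The Account and SavingsAccount classes are hypothetical. Note also that Python hides information only by convention (the leading underscore), whereas some object-oriented languages enforce it.

```python
class Account:
    """A class represents a concept; each Account object is one phenomenon."""

    def __init__(self, owner):
        self.owner = owner
        self._balance = 0  # leading underscore: internal by convention

    def deposit(self, amount):
        """'Sending the deposit message' to an account means calling this method."""
        if amount <= 0:
            raise ValueError("amount must be positive")
        self._balance += amount

    def balance(self):
        return self._balance


class SavingsAccount(Account):
    """Specialization: inherits everything from Account and adds interest."""

    def __init__(self, owner, rate):
        super().__init__(owner)
        self.rate = rate

    def add_interest(self):
        self.deposit(self._balance * self.rate)


savings = SavingsAccount("Kim", 0.5)
savings.deposit(100)
savings.add_interest()
print(savings.balance())  # prints 150.0
```

The data (the balance) and the operations on it live together in the object, callers interact with it only through messages such as deposit, and SavingsAccount extends the Account concept without repeating it.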

This ends the overview of the four main programming paradigms. From now on the main 

focus will be functional programming in Scheme, with special emphasis on examples 

drawn from the domain of web program development. 

Generated: Wednesday July 7, 2010, 15:36:39

2.21 Real Programmers (Ed Post), see also Sec. 2.1 

First reference occurs in Real Programmers use FORTRAN, see Section 2.1 on page 12. 


The Story of Mel 


This was posted to Usenet by its author, Ed Nather, on May 21, 1983.

A recent article devoted to the macho side of programming 

made the bald and unvarnished statement: 

Real Programmers write in FORTRAN. 

Maybe they do now, 

in this decadent era of 

Lite beer, hand calculators, and “user-friendly” software 

but back in the Good Old Days, 

when the term “software” sounded funny 

and Real Computers were made out of drums and vacuum tubes, 

Real Programmers wrote in machine code. 

Not FORTRAN. Not RATFOR. Not, even, assembly language. 

Machine Code. 

Raw, unadorned, inscrutable hexadecimal numbers. 

Directly. 

Lest a whole new generation of programmers 

grow up in ignorance of this glorious past, 

I feel duty-bound to describe, 

as best I can through the generation gap, 

how a Real Programmer wrote code. 

I'll call him Mel, 

because that was his name. 

I first met Mel when I went to work for Royal McBee Computer Corp., 

a now-defunct subsidiary of the typewriter company. 

The firm manufactured the LGP-30, 

a small, cheap (by the standards of the day) 

drum-memory computer, 

and had just started to manufacture 

the RPC-4000, a much-improved, 

bigger, better, faster — drum-memory computer. 

Cores cost too much, 

and weren't here to stay, anyway. 

(That's why you haven't heard of the company, 

or the computer.) 

I had been hired to write a FORTRAN compiler


for this new marvel and Mel was my guide to its wonders. 

Mel didn't approve of compilers. 

“If a program can't rewrite its own code”, 

he asked, “what good is it?” 

Mel had written, 

in hexadecimal, 

the most popular computer program the company owned. 

It ran on the LGP-30 

and played blackjack with potential customers 

at computer shows. 

Its effect was always dramatic. 

The LGP-30 booth was packed at every show, 

and the IBM salesmen stood around 

talking to each other. 

Whether or not this actually sold computers 

was a question we never discussed. 

Mel's job was to re-write 

the blackjack program for the RPC-4000. 

(Port? What does that mean?) 

The new computer had a one-plus-one 

addressing scheme, 

in which each machine instruction, 

in addition to the operation code 

and the address of the needed operand, 

had a second address that indicated where, on the revolving drum, 

the next instruction was located. 

In modern parlance, 

every single instruction was followed by a GO TO! 

Put that in Pascal's pipe and smoke it. 

Mel loved the RPC-4000 

because he could optimize his code: 

that is, locate instructions on the drum 

so that just as one finished its job, 

the next would be just arriving at the “read head” 

and available for immediate execution. 

There was a program to do that job, 

an “optimizing assembler”, 

but Mel refused to use it. 

“You never know where it's going to put things”, 

he explained, “so you'd have to use separate constants”. 

It was a long time before I understood that remark. 

Since Mel knew the numerical value 

of every operation code,

and assigned his own drum addresses, 

every instruction he wrote could also be considered 

a numerical constant. 

He could pick up an earlier “add” instruction, say, 

and multiply by it, 

if it had the right numeric value. 

His code was not easy for someone else to modify. 

I compared Mel's hand-optimized programs 

with the same code massaged by the optimizing assembler program, 

and Mel's always ran faster. 

That was because the “top-down” method of program design 

hadn't been invented yet, 

and Mel wouldn't have used it anyway. 

He wrote the innermost parts of his program loops first, 

so they would get first choice 

of the optimum address locations on the drum. 

The optimizing assembler wasn't smart enough to do it that way. 

Mel never wrote time-delay loops, either, 

even when the balky Flexowriter 

required a delay between output characters to work right. 

He just located instructions on the drum 

so each successive one was just past the read head 

when it was needed; 

the drum had to execute another complete revolution 

to find the next instruction. 

He coined an unforgettable term for this procedure. 

Although “optimum” is an absolute term, 

like “unique”, it became common verbal practice 

to make it relative: 

“not quite optimum” or “less optimum” 

or “not very optimum”. 

Mel called the maximum time-delay locations 

the “most pessimum”. 

After he finished the blackjack program 

and got it to run 

(“Even the initializer is optimized”, 

he said proudly), 

he got a Change Request from the sales department. 

The program used an elegant (optimized) 

random number generator 

to shuffle the “cards” and deal from the “deck”, 

and some of the salesmen felt it was too fair, 

since sometimes the customers lost. 

They wanted Mel to modify the program 

so, at the setting of a sense switch on the console, 

they could change the odds and let the customer win.

Mel balked. 

He felt this was patently dishonest, 

which it was, 

and that it impinged on his personal integrity as a programmer, 

which it did, 

so he refused to do it. 

The Head Salesman talked to Mel, 

as did the Big Boss and, at the boss's urging, 

a few Fellow Programmers. 

Mel finally gave in and wrote the code, 

but he got the test backwards, 

and, when the sense switch was turned on, 

the program would cheat, winning every time. 

Mel was delighted with this, 

claiming his subconscious was uncontrollably ethical, 

and adamantly refused to fix it. 

After Mel had left the company for greener pa$ture$, 

the Big Boss asked me to look at the code 

and see if I could find the test and reverse it. 

Somewhat reluctantly, I agreed to look. 

Tracking Mel's code was a real adventure. 

I have often felt that programming is an art form, 

whose real value can only be appreciated 

by another versed in the same arcane art; 

there are lovely gems and brilliant coups 

hidden from human view and admiration, sometimes forever, 

by the very nature of the process. 

You can learn a lot about an individual 

just by reading through his code, 

even in hexadecimal. 

Mel was, I think, an unsung genius. 

Perhaps my greatest shock came 

when I found an innocent loop that had no test in it. 

No test. None. 

Common sense said it had to be a closed loop, 

where the program would circle, forever, endlessly. 

Program control passed right through it, however, 

and safely out the other side. 

It took me two weeks to figure it out. 

The RPC-4000 computer had a really modern facility 

called an index register. 

It allowed the programmer to write a program loop 

that used an indexed instruction inside; 

each time through, 

the number in the index register 

was added to the address of that instruction, 

so it would refer 

to the next datum in a series. 

He had only to increment the index register 

each time through. 

Mel never used it. 

Instead, he would pull the instruction into a machine register, 

add one to its address, 

and store it back. 

He would then execute the modified instruction 

right from the register. 

The loop was written so this additional execution time 

was taken into account — 

just as this instruction finished, 

the next one was right under the drum's read head, 

ready to go. 

But the loop had no test in it. 

The vital clue came when I noticed 

the index register bit, 

the bit that lay between the address 

and the operation code in the instruction word, 

was turned on — 

yet Mel never used the index register, 

leaving it zero all the time. 

When the light went on it nearly blinded me. 

He had located the data he was working on 

near the top of memory — 

the largest locations the instructions could address — 

so, after the last datum was handled, 

incrementing the instruction address 

would make it overflow. 

The carry would add one to the 

operation code, changing it to the next one in the instruction set: 

a jump instruction. 

Sure enough, the next program instruction was 

in address location zero, 

and the program went happily on its way. 

I haven't kept in touch with Mel, 

so I don't know if he ever gave in to the flood of 

change that has washed over programming techniques 

since those long-gone days. 

I like to think he didn't. 

In any event, 

I was impressed enough that I quit looking for the 

offending test, 

telling the Big Boss I couldn't find it. 

He didn't seem surprised. 

When I left the company, 

the blackjack program would still cheat 

if you turned on the right sense switch, 

and I think that's how it should be. 

I didn't feel comfortable 

hacking up the code of a Real Programmer. 

This is one of hackerdom's great heroic epics, free verse or no. In a few spare images it 

captures more about the esthetics and psychology of hacking than all the scholarly 

volumes on the subject put together. (But for an opposing point of view, see the entry for 

Real Programmer.) 

[1992 postscript — the author writes: “The original submission to the net was not in free 

verse, nor any approximation to it — it was straight prose style, in non-justified 

paragraphs. In bouncing around the net it apparently got modified into the ‘free verse’ 

form now popular. In other words, it got hacked on the net. That seems appropriate, 

somehow.” The author adds that he likes the ‘free-verse’ version better than his prose 

original...] 

[1999 update: Mel's last name is now known. The manual for the LGP-30 refers to “Mel 

Kaye of Royal McBee who did the bulk of the programming [...] of the ACT 1 system”.] 

[2001: The Royal McBee LGP-30 turns out to have one other claim to fame. It turns out 

that meteorologist Edward Lorenz was doing weather simulations on an LGP-30 when, in 

1961, he discovered the “Butterfly Effect” and computational chaos. This seems, 

somehow, appropriate.] 

[2002: A copy of the programming manual for the LGP-30 lives at http://edthelen.org/comp-hist/lgp-30-man.html] 
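
The overflow trick the story describes can be sketched in a few lines of Python. This is a toy model, not the RPC-4000: the field widths, the opcode values, and the names `run_loop`, `OP_LOAD`, and `OP_JUMP` are assumptions made for illustration. The only properties taken from the text are that the index bit lies between the address and the operation code, that Mel left it turned on, and that the jump is the next operation code after the one used in the loop.

```python
# Toy instruction word (hypothetical layout): [ opcode | index bit | address ]
ADDR_BITS = 4                        # assumed width; 16 addressable words
ADDR_MASK = (1 << ADDR_BITS) - 1
INDEX_BIT = 1 << ADDR_BITS           # the bit between address and opcode
OP_SHIFT = ADDR_BITS + 1

OP_LOAD = 0b0010                     # hypothetical opcode used inside the loop
OP_JUMP = OP_LOAD + 1                # "the next one in the instruction set"


def decode(word):
    """Split a word into (opcode, index_bit_set, address)."""
    return word >> OP_SHIFT, bool(word & INDEX_BIT), word & ADDR_MASK


def run_loop(first_addr):
    """Step a self-modifying instruction through memory.

    The simulated program contains no loop test. The only check here is
    the machine decoding the opcode, which it does for every instruction.
    """
    # Mel's instruction: opcode, index bit turned ON, starting address.
    word = (OP_LOAD << OP_SHIFT) | INDEX_BIT | first_addr
    touched = []
    while True:
        opcode, _, addr = decode(word)
        if opcode == OP_JUMP:        # the instruction has become a jump
            return touched, addr
        touched.append(addr)         # "handle" the datum at addr
        word += 1                    # add one to the instruction's address
```

Starting three words below the top of this toy memory, `run_loop(13)` handles addresses 13, 14, and 15. The next increment overflows the address field, the carry passes through the index bit that was already set, and the operation code becomes the jump, with an address field of zero, as in the story.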


THE TAO OF PROGRAMMING 

Translated by Geoffrey James 

Transcribed by Duke Hillard 

Transmitted by Anupam Trivedi, Sajitha Tampi, 

and Meghshyam Jagannath 

Re-html-ized and edited by Kragen Sittler 

Last modified 1996-04-10 or earlier 

TABLE OF CONTENTS 

1. The Silent Void 
2. The Ancient Masters 
3. Design 
4. Coding 
5. Maintenance 
6. Management 
7. Corporate Wisdom 
8. Hardware and Software 
9. Epilogue 

BOOK 1 - THE SILENT VOID 

Thus spake the master programmer: 

``When you have learned to snatch the error code 

from the trap frame, it will be time for you to 

leave.'' 

1.1 

Something mysterious is formed, born in the silent void. Waiting alone and unmoving, it is at once still and yet in constant motion. It is the source of all programs. I do not know its name, so I will call it the Tao of Programming. 

If the Tao is great, then the operating system is great. If the operating system is great, then the compiler is great. If the compiler is great, then the application is great. The user is pleased and there exists harmony in the world. 

The Tao of Programming flows far away and returns on the wind of morning. 

1.2 

The Tao gave birth to machine language. Machine language gave birth to the assembler. 

The assembler gave birth to the compiler. Now there are ten thousand languages. 

Each language has its purpose, however humble. Each language expresses the Yin and Yang of software. Each language has its place within the Tao. 

But do not program in COBOL if you can avoid it. 

1.3 

In the beginning was the Tao. The Tao gave birth to Space and Time. Therefore Space and Time are Yin and Yang of programming. 

Programmers that do not comprehend the Tao are always running out of time and space for their programs. Programmers that comprehend the Tao always have enough time and space to accomplish their goals. 

How could it be otherwise? 

1.4 

The wise programmer is told about Tao and follows it. The average programmer is told about Tao and searches for it. The foolish programmer is told about Tao and laughs at it. 

If it were not for laughter, there would be no Tao. 

The highest sounds are hardest to hear. 

Going forward is a way to retreat. 

Great talent shows itself late in life. 

Even a perfect program still has bugs. 

BOOK 2 - THE ANCIENT MASTERS 

``After three days without programming, life becomes meaningless.'' 

2.1 

The programmers of old were mysterious and profound. We cannot fathom their thoughts, so all we do is describe their appearance. 

Aware, like a fox crossing the water. Alert, like a general on the battlefield. Kind, like a hostess greeting her guests. Simple, like uncarved blocks of wood. Opaque, like black pools in darkened caves. 

Who can tell the secrets of their hearts and minds? 

The answer exists only in Tao. 

2.2 

Grand Master Turing once dreamed that he was a machine. When he awoke he exclaimed: 

``I don't know whether I am Turing dreaming that I am a machine, or a machine dreaming that I am Turing!'' 

2.3 

A programmer from a very large computer company went to a software conference and then returned to report to his manager, saying: ``What sort of programmers work for other companies? They behaved badly and were unconcerned with appearances. Their hair was long and unkempt and their clothes were wrinkled and old. They crashed our hospitality suite and they made rude noises during my presentation.'' 

The manager said: ``I should have never sent you to the conference. Those programmers live beyond the physical world. They consider life absurd, an accidental coincidence. They come and go without knowing limitations. Without a care, they live only for their programs. Why should they bother with social conventions? 

``They are alive within the Tao.'' 

2.4 

A novice asked the Master: ``Here is a programmer that never designs, documents or tests his programs. Yet all who know him consider him one of the best programmers in the world. Why is this?'' 

The Master replies: ``That programmer has mastered the Tao. He has gone beyond the need for design; he does not become angry when the system crashes, but accepts the universe without concern. He has gone beyond the need for documentation; he no longer cares if anyone else sees his code. He has gone beyond the need for testing; each of his programs are perfect within themselves, serene and elegant, their purpose self-evident. Truly, he has entered the mystery of Tao.'' 

BOOK 3 - DESIGN 

``When the program is being tested, it is too late 

to make design changes.'' 

3.1 

There once was a man who went to a computer trade show. Each day as he entered, the man told the guard at the door: 

``I am a great thief, renowned for my feats of shoplifting. Be forewarned, for this trade show shall not escape unplundered.'' 

This speech disturbed the guard greatly, because there were millions of dollars of computer equipment inside, so he watched the man carefully. But the man merely wandered from booth to booth, humming quietly to himself. 

When the man left, the guard took him aside and searched his clothes, but nothing was to be found. 

On the next day of the trade show, the man returned and chided the guard saying: ``I escaped with a vast booty yesterday, but today will be even better.'' So the guard watched him ever more closely, but to no avail. 

On the final day of the trade show, the guard could restrain his curiosity no longer. ``Sir Thief,'' he said, ``I am so perplexed, I cannot live in peace. Please enlighten me. What is it that you are stealing?'' 

The man smiled. ``I am stealing ideas,'' he said. 

3.2 

There once was a master programmer who wrote unstructured programs. A novice programmer, seeking to imitate him, also began to write unstructured programs. When the novice asked the master to evaluate his progress, the master criticized him for writing unstructured programs, saying, ``What is appropriate for the master is not appropriate for the novice. You must understand the Tao before transcending structure.'' 

3.3 

There was once a programmer who was attached to the court of the warlord of Wu. The warlord asked the programmer: ``Which is easier to design: an accounting package or an operating system?'' 

``An operating system,'' replied the programmer. 

The warlord uttered an exclamation of disbelief. ``Surely an accounting package is trivial next to the complexity of an operating system,'' he said. 

``Not so,'' said the programmer, ``when designing an accounting package, the programmer operates as a mediator between people having different ideas: how it must operate, how its reports must appear, and how it must conform to the tax laws. By contrast, an operating system is not limited by outside appearances. When designing an operating system, the programmer seeks the simplest harmony between machine and ideas. This is why an operating system is easier to design.'' 

The warlord of Wu nodded and smiled. ``That is all good and well, but which is easier to debug?'' 

The programmer made no reply. 

3.4 

A manager went to the master programmer and showed him the requirements document for a new application. The manager asked the master: ``How long will it take to design this system if I assign five programmers to it?'' 

``It will take one year,'' said the master promptly. 

``But we need this system immediately or even sooner! How long will it take if I assign ten programmers to it?'' 

The master programmer frowned. ``In that case, it will take two years.'' 

``And what if I assign a hundred programmers to it?'' 

The master programmer shrugged. ``Then the design will never be completed,'' he said. 

BOOK 4 - CODING 

``A well-written program is its own heaven; a poorly-written program is its own hell.'' 

4.1 

A program should be light and agile, its subroutines connected like a string of pearls. The spirit and intent of the program should be retained throughout. There should be neither too little nor too much, neither needless loops nor useless variables, neither lack of structure nor overwhelming rigidity. 

A program should follow the `Law of Least Astonishment'. What is this law? It is simply that the program should always respond to the user in the way that astonishes him least. 

A program, no matter how complex, should act as a single unit. The program should be directed by the logic within rather than by outward appearances. 

If the program fails in these requirements, it will be in a state of disorder and confusion. The only way to correct this is to rewrite the program. 

4.2 

A novice asked the master: ``I have a program that sometimes runs and sometimes aborts. I have followed the rules of programming, yet I am totally baffled. What is the reason for this?'' 

The master replied: ``You are confused because you do not understand Tao. Only a fool expects rational behavior from his fellow humans. Why do you expect it from a machine that humans have constructed? Computers simulate determinism; only Tao is perfect. 

``The rules of programming are transitory; only Tao is eternal. Therefore you must contemplate Tao before you receive enlightenment.'' 

``But how will I know when I have received enlightenment?'' asked the novice. 

``Your program will then run correctly,'' replied the master. 

4.3 

A master was explaining the nature of Tao to one of his novices. ``The Tao is embodied in all software - regardless of how insignificant,'' said the master. 

``Is the Tao in a hand-held calculator?'' asked the novice. 

``It is,'' came the reply. 

``Is the Tao in a video game?'' continued the novice. 

``It is even in a video game,'' said the master. 

``And is the Tao in the DOS for a personal computer?'' 

The master coughed and shifted his position slightly. ``The lesson is over for today,'' he said. 

4.4 

Prince Wang's programmer was coding software. His fingers danced upon the keyboard. The program compiled without an error message, and the program ran like a gentle wind. 

``Excellent!'' the Prince exclaimed, ``Your technique is faultless!'' 

``Technique?'' said the programmer turning from his terminal, ``What I follow is Tao - beyond all techniques! When I first began to program I would see before me the whole problem in one mass. After three years I no longer saw this mass. Instead, I used subroutines. But now I see nothing. My whole being exists in a formless void. My senses are idle. My spirit, free to work without plan, follows its own instinct. In short, my program writes itself. True, sometimes there are difficult problems. I see them coming, I slow down, I watch silently. Then I change a single line of code and the difficulties vanish like puffs of idle smoke. I then compile the program. I sit still and let the joy of the work fill my being. I close my eyes for a moment and then log off.'' 

Prince Wang said, ``Would that all of my programmers were as wise!'' 

BOOK 5 - MAINTENANCE 


``Though a program be but three lines long, 

someday it will have to be maintained.'' 

5.1 

A well-used door needs no oil on its hinges. 

A swift-flowing stream does not grow stagnant. 

Neither sound nor thoughts can travel through a vacuum. 

Software rots if not used. 

These are great mysteries. 

5.2 

A manager asked a programmer how long it would take him to finish the program on which he was working. ``It will be finished tomorrow,'' the programmer promptly replied. 

``I think you are being unrealistic,'' said the manager, ``Truthfully, how long will it take?'' 

The programmer thought for a moment. ``I have some features that I wish to add. This will take at least two weeks,'' he finally said. 

``Even that is too much to expect,'' insisted the manager, ``I will be satisfied if you simply tell me when the program is complete.'' 

The programmer agreed to this. 

Several years later, the manager retired. On the way to his retirement luncheon, he discovered the programmer asleep at his terminal. He had been programming all night. 

5.3 

A novice programmer was once assigned to code a simple financial package. 

The novice worked furiously for many days, but when his master reviewed his program, he discovered that it contained a screen editor, a set of generalized graphics routines, an artificial intelligence interface, but not the slightest mention of anything financial. 

When the master asked about this, the novice became indignant. ``Don't be so impatient,'' he said, ``I'll put in the financial stuff eventually.'' 

5.4 

Does a good farmer neglect a crop he has planted? 

Does a good teacher overlook even the most humble student? 

Does a good father allow a single child to starve? 

Does a good programmer refuse to maintain his code? 

BOOK 6 - MANAGEMENT 


``Let the programmers be many and the 

managers few - then all will be productive.'' 

6.1 

When managers hold endless meetings, the programmers write games. When accountants talk of quarterly profits, the development budget is about to be cut. When senior scientists talk blue sky, the clouds are about to roll in. 

Truly, this is not the Tao of Programming. 

When managers make commitments, game programs are ignored. When accountants make long-range plans, harmony and order are about to be restored. When senior scientists address the problems at hand, the problems will soon be solved. 

Truly, this is the Tao of Programming. 

6.2 

Why are programmers non-productive? 

Because their time is wasted in meetings. 

Why are programmers rebellious? 

Because the management interferes too much. 

Why are the programmers resigning one by one? 

Because they are burnt out. 

Having worked for poor management, they no longer value their jobs. 

6.3 

A manager was about to be fired, but a programmer who worked for him invented a new program that became popular and sold well. As a result, the manager retained his job. 

The manager tried to give the programmer a bonus, but the programmer refused it, saying, ``I wrote the program because I thought it was an interesting concept, and thus I expect no reward.'' 

The manager upon hearing this remarked, ``This programmer, though he holds a position of small esteem, understands well the proper duty of an employee. Let us promote him to the exalted position of management consultant!'' 

But when told this, the programmer once more refused, saying, ``I exist so that I can program. If I were promoted, I would do nothing but waste everyone's time. Can I go now? I have a program that I'm working on.'' 

6.4 

A manager went to his programmers and told them: ``As regards your work hours: you are going to have to come in at nine in the morning and leave at five in the afternoon.'' At this, all of them became angry and several resigned on the spot. 

So the manager said: ``All right, in that case you may set your own working hours, as long as you finish your projects on schedule.'' The programmers, now satisfied, began to come in at noon and work to the wee hours of the morning. 

BOOK 7 - CORPORATE WISDOM 


``You can demonstrate a program for a corporate 

executive, but you can't make him computer 

literate.'' 

7.1 

A novice asked the master: ``In the east there is a great tree-structure that men call `Corporate Headquarters'. It is bloated out of shape with vice presidents and accountants. It issues a multitude of memos, each saying `Go, Hence!' or `Go, Hither!' and nobody knows what is meant. Every year new names are put onto the branches, but all to no avail. How can such an unnatural entity be?'' 

The master replied: ``You perceive this immense structure and are disturbed that it has no rational purpose. Can you not take amusement from its endless gyrations? Do you not enjoy the untroubled ease of programming beneath its sheltering branches? Why are you bothered by its uselessness?'' 

7.2 

In the east there is a shark which is larger than all other fish. It changes into a bird whose wings are like clouds filling the sky. When this bird moves across the land, it brings a message from Corporate Headquarters. This message it drops into the midst of the programmers, like a seagull making its mark upon the beach. Then the bird mounts on the wind and, with the blue sky at its back, returns home. 

The novice programmer stares in wonder at the bird, for he understands it not. The average programmer dreads the coming of the bird, for he fears its message. The master programmer continues to work at his terminal, for he does not know that the bird has come and gone. 

7.3 

The Magician of the Ivory Tower brought his latest invention for the master programmer to examine. The magician wheeled a large black box into the master's office while the master waited in silence. 

``This is an integrated, distributed, general-purpose workstation,'' began the magician, ``ergonomically designed with a proprietary operating system, sixth generation languages, and multiple state of the art user interfaces. It took my assistants several hundred man years to construct. Is it not amazing?'' 

The master raised his eyebrows slightly. ``It is indeed amazing,'' he said. 

``Corporate Headquarters has commanded,'' continued the magician, ``that everyone use this workstation as a platform for new programs. Do you agree to this?'' 

``Certainly,'' replied the master, ``I will have it transported to the data center immediately!'' And the magician returned to his tower, well pleased. 

Several days later, a novice wandered into the office of the master programmer and said, ``I cannot find the listing for my new program. Do you know where it might be?'' 

``Yes,'' replied the master, ``the listings are stacked on the platform in the data center.'' 

7.4 

The master programmer moves from program to program without fear. No change in management can harm him. He will not be fired, even if the project is cancelled. Why is this? He is filled with Tao. 

BOOK 8 - HARDWARE AND SOFTWARE 


``Without the wind, the grass does not move. 

Without software, hardware is useless.'' 

8.1 

A novice asked the master: ``I perceive that one computer company is much larger than all others. It towers above its competition like a giant among dwarfs. Any one of its divisions could comprise an entire business. Why is this so?'' 

The master replied, ``Why do you ask such foolish questions? That company is large because it is large. If it only made hardware, nobody would buy it. If it only made software, nobody would use it. If it only maintained systems, people would treat it like a servant. But because it combines all of these things, people think it one of the gods! By not seeking to strive, it conquers without effort.'' 

8.2 

A master programmer passed a novice programmer one day. The master noted the novice's preoccupation with a hand-held computer game. ``Excuse me,'' he said, ``may I examine it?'' 

The novice bolted to attention and handed the device to the master. ``I see that the device claims to have three levels of play: Easy, Medium, and Hard,'' said the master. ``Yet every such device has another level of play, where the device seeks not to conquer the human, nor to be conquered by the human.'' 

``Pray, great master,'' implored the novice, ``how does one find this mysterious setting?'' 

The master dropped the device to the ground and crushed it underfoot. And suddenly the novice was enlightened. 

8.3 

There was once a programmer who worked upon microprocessors. ``Look at how well off I am here,'' he said to a mainframe programmer who came to visit, ``I have my own operating system and file storage device. I do not have to share my resources with anyone. The software is self-consistent and easy-to-use. Why do you not quit your present job and join me here?'' 

The mainframe programmer then began to describe his system to his friend, saying ``The mainframe sits like an ancient sage meditating in the midst of the data center. Its disk drives lie end-to-end like a great ocean of machinery. The software is as multifaceted as a diamond, and as convoluted as a primeval jungle. The programs, each unique, move through the system like a swift-flowing river. That is why I am happy where I am.'' 

The microcomputer programmer, upon hearing this, fell silent. But the two programmers remained friends until the end of their days. 

8.4 

Hardware met Software on the road to Changtse. Software said: ``You are Yin and I am Yang. If we travel together we will become famous and earn vast sums of money.'' And so they set forth together, thinking to conquer the world. 

Presently they met Firmware, who was dressed in tattered rags and hobbled along propped on a thorny stick. Firmware said to them: ``The Tao lies beyond Yin and Yang. It is silent and still as a pool of water. It does not seek fame, therefore nobody knows its presence. It does not seek fortune, for it is complete within itself. It exists beyond space and time.'' 

Software and Hardware, ashamed, returned to their homes. 

BOOK 9 - EPILOGUE 

``It is time for you to leave.'' 

Computer Languages History 

Computer Languages Timeline 


If you want to print this timeline, you can freely download one of the following PDF 

files: 

A4 Letter Plotter 

There are only 50 languages listed in my chart; if you don't find "your" language, see The Language List of Bill Kinnersley (he has listed more than 2500 languages). 

Here is the ChangeLog of this history. 

Note: I now have a page where I explain how I build this chart. 

Another chart on the wall 

If you have put this diagram on the wall of your office and have taken a photo of it, please send me a copy and I'll put it on this page. ;-) 

My other charts: UNIX History. Windows History. 


Some useful links 

ABC A Short Introduction to 

the ABC Language 

Ada Ada 95 

Ada Home Page 

AdaPower 

Special Interest Group on Ada 

Ada Information Clearinghouse 

History of programming 

languages on Wikipedia 

ALGOL 

The ALGOL Programming Language 

AWK The AWK Programming Language by 

Alfred V. Aho, Brian W. Kernighan, and 

Peter J. Weinberger 

APL Apl Language 

APL 

B The Programming Language B (abstract) 

Users' Reference to B by Ken Thompson 

BASIC 

The Basic Archives 

Visual Basic Instinct 

Visual Basic & Visual Basic .NET 

Resources 

True BASIC 

REALbasic 

BCPL BCPL Reference Manual by Martin 

Richards 

C 

The Development of the C Language by 

Dennis Ritchie 

Very early C compilers and language by 

Dennis Ritchie 

The C Programming Language (book) 

Programming languages - C ANSI by 

ISO/IEC (draft) 

C Programming Course

C++ The C++ Programming Language (book) 

C and C++: Siblings (pdf) by Bjarne 

Stroustrup 

C++0x - the next ISO C++ standard by 

Bjarne Stroustrup 

C# Visual C# Language by Microsoft. 

Caml The Caml language 

Objective Caml 

The Objective-Caml system 

CLU CLU Home Page 

COBOL 

IBM COBOL family 

COBOL Portal 

TinyCOBOL 

COBOL User Groups - COBUG 

CORAL 

Coral66 

Computer On-line Real-time Applications 

Language Coral 66 Specification for 

Compilers (pdf) 

CPL Combined Programming Language 

(Wikipedia) 

Delphi 

Delphi 2005 by Borland 

Pascal and Delphi 

A brief history of Borland's Delphi 

Delphi Treff: Delphi versions (german) 

Eiffel Eiffel 

EiffelStudio by Eiffel Software 

Visual Eiffel by Object Tools 

SmartEiffel 

EiffelZone 

Flow-Matic 

Flow-Matic and Cobol 

Forth Forth Interest Group Home Page 

colorForth by Chuck Moore

Fortran 

User notes on Fortran programming 

Fortran 2000 draft 

Fortran 2003 JTC1/SC22/WG5 

Haskell 

Haskell Home Page 

Icon 

The Icon Programming Language 

Icon 

History of the Icon programming language 

Unicon, the Unified Extended Dialect of 

Icon 

J 

J software 

A management perspective of the "J" 

programming language 

Java Java by Sun Microsystems 

Java Technology: an early history 

Programming Languages for the Java 

Virtual Machine 

James Gosling's home page 

JavaScript 

Cmm History by Nombas 

JavaScript Language Resources from 

Mozilla 

Standard ECMA-262 

Lisp The Association of Lisp Users 

An Introduction and Tutorial for Common 

Lisp 

Mainsail 

Mainsail from Xidak. 

Mainsail Implementation Overview by 

Stanford Computer Systems Laboratory. 

M (MUMPS) 

M technologies 

M[UMPS] Development Committee 

What is M Technology? 

ML Standard ML 

Standard ML '97 

Modula

Modula-2 

Modula-3 Home Page 

Modula-2 ISO/IEC 

Oberon 

A Brief History of Oberon 

A Description of the Oberon-2 Language 

The Programming Language Oberon-2 

Oberon Language Genealogy Tree 

The Oberon Community Platform 

Objective-C 

Objective-C 

Objective-C FAQ 

Introduction to The Objective-C 

Programming Language by Apple 

Objective-C: Links, Resources, Stuff 

Pascal 

ISO Pascal (document) 

Pascal and Delphi 

Perl 

Perl Home Page 

Perl 

Larry Wall's Very Own Home Page 

PHP PHP: Hypertext Preprocessor 

PL/I Multics PL/I 

IBM PL/I family by IBM 

Plankalkül 

Plankalkül 

PostScript 

PostScript level 3 by Adobe 

PostScript GhostScript PDF 

GhostScript Home Page 

Prolog 

Prolog Programming Language 

The Prolog Language 

Python 

Python Home Page 

Rexx

IBM REXX Family by IBM

The Rexx Language Association 

Ruby

Ruby Home Page 

Ruby programming language (Wikipedia) 

Ruby - doc 

Sail

Sail (Stanford Artificial Intelligence Language)

Sather 

Sather History 

Sather 

GNU Sather 

Scheme 

Scheme by MIT 

The Revised^5 Report on the Algorithmic Language Scheme (in PostScript)

Schemers Home Page 

SCM 

Self

Self Home Page from Sun

Sh

The Traditional Bourne Shell Family by Sven Mascheck

Korn Shell by David Korn 

Bash from GNU 

Zsh 

Simula 

Simula by Jan Rune Holmevik 

Smalltalk 

Smalltalk Home Page 

Smalltalk FAQ 

The Early History of Smalltalk 

The Smalltalk Industry Council web site 

VisualAge Smalltalk from IBM 

VisualWorks from Cincom 

The history of Squeak 

ANSI Smalltalk 

SNOBOL 

Snobol4 Resources by Phil Budne 

Introduction to SNOBOL Programming Language by Mohammad Noman Hameed

Snobol4 

Tcl/Tk 

Tcl/Tk Information

Other links on same subject 

The Language List (about 2500 computer languages) by Bill Kinnersley.

An interactive historical roster of computer languages by Diarmuid Pigott.

Programming languages by The Brighton University. 

Programming languages. 

Diagram of programming languages history. 

The Programming Languages Genealogy Project. 

History of Programming Languages. 

99 Bottles of Beer. 

Dictionary of Programming Languages. 

Wikipedia: Computer languages. 

Computer-Books.us: free computer books. 

Rosetta Code: a comparison of tasks in more than 150 languages. 

My other links 

UNIX History. 

Unix Hierarchy (an old paper). 

Windows History. 

NeXT History (in French).

Another Chart On The Wall. 

Statistics of this site. 

Other Unix Products. 

Last update : March 4, 2012 

Please send comments to Éric Lévénez 

You can freely use this diagram for non-commercial purpose. 


The original version of this list I got through e-mail and, at the moment, I don't know who the author 

was. (The person who used to be listed as the author here has informed me that he isn't.) Other 

contributions have been added at the end. 

Last update: January 12, 2001 

TASK :- To Shoot Yourself In The Foot 

+++++++++++++++++++++++++++++++++++++ 

C

You shoot yourself in the foot.

C++

You accidentally create a dozen instances of yourself and shoot them all in the foot. Providing 

emergency medical care is impossible since you can't tell which are bitwise copies and which are 

just pointing at others and saying, "That's me over there." 

FORTRAN 

You shoot yourself in each toe, iteratively, until you run out of toes, then you read in the next 

foot and repeat. If you run out of bullets, you continue anyway because you have no exception 

handling ability. 

Cobol

USE HANDGUN.COLT(45), AIM AT LEG.FOOT, THEN WITH ARM.HAND.FINGER ON

HANDGUN.COLT(TRIGGER) PERFORM.SQUEEZE RETURN HANDGUN.COLT(45) TO HIP.HOLSTER.

LISP 

You shoot yourself in the appendage which holds the gun with 

which you shoot yourself in the appendage which holds the gun with 




which you shoot yourself in the appendage which holds... 

Basic (interpreted) 

You shoot yourself in the foot with a water pistol until your foot is waterlogged and rots off. 

Basic (compiled) 

You shoot yourself in the foot with a BB using a SCUD missile launcher. 

FORTH 

Foot in yourself shoot. 

APL 

You shoot yourself in the foot, then spend all day figuring out how to do it in fewer characters. 

Pascal 

The compiler won't let you shoot yourself in the foot. 

SNOBOL 

If you succeed, shoot yourself in the left foot. If you fail, shoot yourself in the right foot. 

Concurrent Euclid 

You shoot yourself in somebody else's foot. 

HyperTalk 

Put the first bullet of the gun into the foot left of leg of you. Answer the result. 

Motif 

You spend days writing a UIL description of your foot, the trajectory, the bullet, and the intricate 

scrollwork on the ivory handles of the gun. When you finally get around to pulling the trigger, the 

gun jams. 

Unix 

% ls 

foot.c foot.h foot.o toe.c toe.o

% rm * .o 

rm: .o: No such file or directory 

% ls 

% 

XBase 

Shooting yourself is no problem. If you want to shoot yourself in the foot, you'll have to use 

Clipper. 

Paradox 

Not only can you shoot yourself in the foot, your users can, too. 

Revelation 

You'll be able to shoot yourself in the foot just as soon as you figure out what all these bullets are 

for. 

Visual Basic 

You'll really only appear to have shot yourself in the foot, but you'll have had so much fun doing 

it that you won't care. 

Prolog 

You tell your program that you want to be shot in the foot. The program figures out how to do it, 

but the syntax doesn't permit it to explain it to you. 

370 JCL 

You send your foot down to MIS and include a 400-page document explaining exactly how you 

want it to be shot. Three years later, your foot comes back deep-fried. 

Apple 

We'll let you shoot yourself, but it'll cost you a bundle. 

IBM 

You insert a clip into the gun, wait half an hour, and it goes off in random directions. If a bullet 

hits your foot, you're lucky. 

Microsoft 

Object "Foot" will be included in the next release. You can upgrade for $500. 

Cray 

I knew you were going to shoot yourself in the foot. 

Hewlett-Packard 

You can use this machine-gun to shoot yourself in the foot, but the firing pin is broken. 

NeXT 

We don't sell guns anymore, just ammunition. 

Sun 

Just as soon as Solaris gets here, you can shoot yourself anywhere you want. 

Ada 

After correctly packing your foot, you attempt to concurrently load the gun, pull the trigger, 

scream, and shoot yourself in the foot. When you try, however, you discover you can't because 

your foot is of the wrong type. 

Access 

You try to point the gun at your foot, but it shoots holes in all your Borland distribution diskettes 

instead. 

Assembler 

You try to shoot yourself in the foot, only to discover you must first invent the gun, the bullet, the 

trigger, and your foot. 

Modula2 

After realizing that you can't actually accomplish anything in this language, you shoot yourself in 

the head. 

csh 

After searching the manual until your foot falls asleep, you shoot the computer and switch to C. 

dBase 

You buy a gun. Bullets are only available from another company and are promised to work so 

you buy them. Then you find out that the next version of the gun is the one that is scheduled to 

actually shoot bullets. 

PL/1 

After consuming all system resources including bullets, the data processing department doubles

its size, acquires 2 new mainframes and drops the original on your foot. 

Smalltalk, Actor, et al 

After playing with the graphics for 3 weeks, the programming manager shoots you in the head. 

HTML 

Shoot 

here 

tv's Spatch 

Java 

The gun fires just fine, but your foot can't figure out what the bullets are and ignores them. 

MOO 

You ask a wizard for a pair of hands. After lovingly handcrafting the gun and each bullet, you tell 

everyone that you've shot yourself in the foot. 

Smalltalk 

You daydream repeatedly about shooting yourself in the foot. 

FTP 

Petréa Mitchell 

% ftp lower-body.me.org 

ftp> cd /foot 

ftp> put bullets 

Jim Gould 

DCL 

You manage to shoot yourself in the foot, but while doing so you also shoot yourself in the arm, 

stomach, and leg, plus you shoot your best friend in the chest, the neighbour's dog and your car. 

A month later you're not able to understand your program anymore when you read the source. 

Originator unknown 

Windows95 

d:\setup 

And lest we forget our roots 

>shoot self in foot 

I don't see any self here. 

>shoot me in foot 

There is no you in the foot. 

>shoot foot 

I don't know which foot you're talking about. 

>shoot left foot 

You don't have the gun. 

>get gun 

You take the gun. 

Your lantern just went out.

You are attacked by grues. 

* * * YOU HAVE DIED * * * 

Mikey "Dreamy" Sphar 

Petréa Mitchell 

pravn@m5p.com

Danny Yee >> Humour 

by Dave Pritchard 

The Lord of the Rings: 

an allegory of the PhD? 

The story starts with Frodo: a young hobbit, quite bright, a bit 

dissatisfied with what he's learnt so far and with his mates back 

home who just seem to want to get jobs and settle down and drink 

beer. He's also very much in awe of his tutor and mentor, the very 

senior professor Gandalf, so when Gandalf suggests he take on a 

short project for him (carrying the Ring to Rivendell), he agrees. 

Frodo very quickly encounters the shadowy forces of fear and 

despair which will haunt the rest of his journey and leave 

permanent scars on his psyche, but he also makes some useful 

friends. In particular, he spends an evening down at the pub with 

Aragorn, who has been wandering the world for many years as 

Gandalf's postdoc and becomes his adviser when Gandalf isn't 

around. 

After Frodo has completed his first project, Gandalf (along with 

head of department Elrond) proposes that the work should be 

extended. He assembles a large research group, including visiting 

students Gimli and Legolas, the foreign postdoc Boromir, and 

several of Frodo's own friends from his undergraduate days. Frodo 

agrees to tackle this larger project, though he has mixed feelings 

about it. ("'I will take the Ring', he said, 'although I do not know 

the way.'") 

Very rapidly, things go wrong. First, Gandalf disappears and has 

no more interaction with Frodo until everything is over. (Frodo 

assumes his supervisor is dead: in fact, he's simply found a more 

interesting topic and is working on that instead.) At his first 

international conference in Lorien, Frodo is cross-questioned 

terrifyingly by Galadriel, and betrayed by Boromir, who is anxious 

to get the credit for the work himself. Frodo cuts himself off from 

the rest of his team: from now on, he will only discuss his work 

with Sam, an old friend who doesn't really understand what it's all 

about, but in any case is prepared to give Frodo credit for being 

rather cleverer than he is. Then he sets out towards Mordor. 

The last and darkest period of Frodo's journey clearly represents 

the writing-up stage, as he struggles towards Mount Doom

(submission), finding his burden growing heavier and heavier yet 

more and more a part of himself; more and more terrified of 

failure; plagued by the figure of Gollum, the student who carried 

the Ring before him but never wrote up and still hangs around as a 

burnt-out, jealous shadow; talking less and less even to Sam. 

When he submits the Ring to the fire, it is in desperate confusion 

rather than with confidence, and for a while the world seems 

empty. 

Eventually it is over: the Ring is gone, everyone congratulates 

him, and for a few days he can convince himself that his troubles 

are over. But there is one more obstacle to overcome: months 

later, back in the Shire, he must confront the external examiner 

Saruman, an old enemy of Gandalf, who seeks to humiliate and 

destroy his rival's protege. With the help of his friends and 

colleagues, Frodo passes through this ordeal, but discovers at the 

end that victory has no value left for him. While his friends return 

to settling down and finding jobs and starting families, Frodo 

remains in limbo; finally, along with Gandalf, Elrond and many 

others, he joins the brain drain across the Western ocean to the 

new land beyond. 

Related humour: One OS to Rule Them All 

Review: The Monsters and the Critics (Tolkien) 

National Lampoon parody: Bored of the Rings (Amazon) 

"I dislike Allegory - the conscious and intentional 

allegory - yet any attempt to explain the purport of myth 

or fairytale must use allegorical language." 

J.R.R. Tolkien 


Ode to a Spell Checker 

I have a spelling checker 

I disk covered four my PC. 

It plane lee marks four my revue 

Miss steaks aye can knot see. 

Eye ran this poem threw it. 

Your sure real glad two no. 

Its very polished in its weigh, 

My checker tolled me sew. 

A checker is a blessing. 

It freeze yew lodes of thyme. 

It helps me right awl stiles two reed, 

And aides me when aye rime. 

Each frays comes posed up on my screen 

Eye trussed too bee a joule. 

The checker pours o'er every word 

To cheque sum spelling rule. 

Bee fore wee rote with checkers 

Hour spelling was inn deck line, 

Butt now when wee dew have a laps, 

Wee are not maid too wine. 

And now bee cause my spelling 

Is checked with such grate flare, 

There are know faults in awl this peace, 

Of nun eye am a wear. 

To rite with care is quite a feet 

Of witch won should be proud, 

And wee mussed dew the best wee can, 

Sew flaws are knot aloud. 

That's why eye brake in two averse 

Caws Eye dew want too please. 

Sow glad eye yam that aye did bye 

This soft wear four pea seas. 

--Author Unknown

wjames@usd.edu - - -

foo 


foo: /foo/ 

1. interj. Term of disgust. 

2. [very common] Used very generally as a sample name for absolutely anything, 

esp. programs and files (esp. scratch files). 

3. First on the standard list of metasyntactic variables used in syntax examples. See 

also bar, baz, qux, quux, garply, waldo, fred, plugh, xyzzy, thud. 

When ‘foo’ is used in connection with ‘bar’ it has generally been traced to the WWII-era

Army slang acronym FUBAR (‘Fucked Up Beyond All Repair’ or ‘Fucked Up

Beyond All Recognition’), later modified to foobar. Early versions of the Jargon

File interpreted this change as a post-war bowdlerization, but it now seems more

likely that FUBAR was itself a derivative of ‘foo’ perhaps influenced by German 

furchtbar (terrible) — ‘foobar’ may actually have been the original form. 

For, it seems, the word ‘foo’ itself had an immediate prewar history in comic strips 

and cartoons. The earliest documented uses were in the Smokey Stover comic strip 

published from about 1930 to about 1952. Bill Holman, the author of the strip, 

filled it with odd jokes and personal contrivances, including other nonsense phrases 

such as “Notary Sojac” and “1506 nix nix”. The word “foo” frequently appeared on 

license plates of cars, in nonsense sayings in the background of some frames (such 

as “He who foos last foos best” or “Many smoke but foo men chew”), and Holman 

had Smokey say “Where there's foo, there's fire”. 

According to the Warner Brothers Cartoon Companion Holman claimed to have 

found the word “foo” on the bottom of a Chinese figurine. This is plausible; 

Chinese statuettes often have apotropaic inscriptions, and this one was almost 

certainly the Mandarin Chinese word fu (sometimes transliterated foo), which can 

mean “happiness” or “prosperity” when spoken with the rising tone (the lion-dog 

guardians flanking the steps of many Chinese restaurants are properly called “fu 

dogs”). English speakers' reception of Holman's ‘foo’ nonsense word was 

undoubtedly influenced by Yiddish ‘feh’ and English ‘fooey’ and ‘fool’. 

Holman's strip featured a firetruck called the Foomobile that rode on two wheels. 

The comic strip was tremendously popular in the late 1930s, and legend has it that 

a manufacturer in Indiana even produced an operable version of Holman's 

Foomobile. According to the Encyclopedia of American Comics, ‘Foo’ fever swept 

the U.S., finding its way into popular songs and generating over 500 ‘Foo Clubs.’ 

The fad left ‘foo’ references embedded in popular culture (including a couple of 

appearances in Warner Brothers cartoons of 1938-39; notably in Robert Clampett's 

“Daffy Doc” of 1938, in which a very early version of Daffy Duck holds up a sign 

saying “SILENCE IS FOO!”). When the fad faded, the origin of “foo” was

forgotten.

One place “foo” is known to have remained live is in the U.S. military during the 

WWII years. In 1944-45, the term ‘foo fighters’ was in use by radar operators for 

the kind of mysterious or spurious trace that would later be called a UFO (the older 

term resurfaced in popular American usage in 1995 via the name of one of the 

better grunge-rock bands). Because informants connected the term directly to the 

Smokey Stover strip, the folk etymology that connects it to French “feu” (fire) can 

be gently dismissed. 

The U.S. and British militaries frequently swapped slang terms during the war (see 

kluge and kludge for another important example). Period sources reported that

‘FOO’ became a semi-legendary subject of WWII British-army graffiti more or 

less equivalent to the American Kilroy. Where British troops went, the graffito 

“FOO was here” or something similar showed up. Several slang dictionaries aver 

that FOO probably came from Forward Observation Officer, but this (like the 

contemporaneous “FUBAR”) was probably a backronym. Forty years later, Paul

Dickson's excellent book “Words” (Dell, 1982, ISBN 0-440-52260-7) traced “Foo” 

to an unspecified British naval magazine in 1946, quoting as follows: “Mr. Foo is a 

mysterious Second World War product, gifted with bitter omniscience and 

sarcasm.” 

Earlier versions of this entry suggested the possibility that hacker usage actually 

sprang from FOO, Lampoons and Parody, the title of a comic book first issued in 

September 1958, a joint project of Charles and Robert Crumb. Though Robert 

Crumb (then in his mid-teens) later became one of the most important and 

influential artists in underground comics, this venture was hardly a success; indeed, 

the brothers later burned most of the existing copies in disgust. The title FOO was 

featured in large letters on the front cover. However, very few copies of this comic 

actually circulated, and students of Crumb's oeuvre have established that this title 

was a reference to the earlier Smokey Stover comics. The Crumbs may also have 

been influenced by a short-lived Canadian parody magazine named ‘Foo’ published 

in 1951-52. 

An old-time member reports that in the 1959 Dictionary of the TMRC Language, 

compiled at TMRC, there was an entry that went something like this: 

FOO: The first syllable of the sacred chant phrase “FOO MANE 

PADME HUM.” Our first obligation is to keep the foo counters 

turning. 

(For more about the legendary foo counters, see TMRC.) This definition used Bill 

Holman's nonsense word, then only two decades old and demonstrably still live in 

popular culture and slang, to a ha ha only serious analogy with esoteric Tibetan 

Buddhism. Today's hackers would find it difficult to resist elaborating a joke like 

that, and it is not likely 1959's were any less susceptible. Almost the entire staff of 

what later became the MIT AI Lab was involved with TMRC, and the word spread 

from there. 



programming languages 


For those who think the world begins and ends with C++, or with

Java, here is a very incomplete list of programming languages: 

just the ones I've heard of, or been told about (not including 

assembly languages, or special purpose 'little languages' like yacc 

or nroff). 

19. A language that doesn't affect the 

way you think about programming, is not 

worth knowing. 

-- Alan J. Perlis. Epigrams on 

Programming. 

SIGPLAN Notices 17(9):7-13, September 

1982 

THE LANGUAGE LIST is much more comprehensive, with over 2000 entries! 

The Jargon File is a great source for computing terms. 

Remember, no matter which language you choose, you can always Shoot 

Yourself In The Foot. It's just so much easier in some than in others. 

And, of course, Real Programmers Don't Use PASCAL 

Ada -- after Ada, Countess Lovelace, a friend of Charles Babbage, and claimed 

by some to be the first computer programmer. 

Ada the language was commissioned by the US Department of Defense in 

the 1980s as the language to be used for all its software. Descended from 

Pascal, with support for structuring via the package. 

The PL/I of the 1980s. 

-- unknown 

package Stack is 

function Pop return INTEGER; -- returns a value, so it must be a function

procedure Push(x:INTEGER);

function IsEmpty return BOOLEAN;

end Stack; 
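As a rough illustration of the interface that the package specification above declares, here is a minimal Python sketch. The class name Stack, the list-backed storage and the choice to raise IndexError on an empty pop are assumptions made for the example; the Ada fragment specifies none of them.

```python
class Stack:
    """A stack of integers, loosely mirroring the Ada package specification."""

    def __init__(self):
        # Assumed storage: a plain Python list. The Ada spec hides its own.
        self._items = []

    def push(self, x):
        """Place x on top of the stack."""
        self._items.append(x)

    def pop(self):
        """Remove and return the top element; raise IndexError if empty."""
        if not self._items:
            raise IndexError("pop from empty stack")
        return self._items.pop()

    def is_empty(self):
        """Return True when the stack holds no elements."""
        return not self._items
```

Unlike the Ada package, which describes a single stack, this class can be instantiated several times; callers only ever touch push, pop and is_empty, which is the kind of structuring the package is meant to provide.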

The mistakes which have been made 

in the last twenty years [of designing 

large overly-complex languages like 

Ada] are being repeated today on an 

even grander scale. 

... 

Gadgets and glitter prevail over

fundamental concerns of safety and 

economy. 

-- C. A. R. Hoare, "The Emperor's Old 

Clothes", CACM 24(2), 1981 

Barnes • Programming in Ada 

Habermann, Perry • Ada for Experienced Programmers 

Ichbiah et al. • Rationale for the Design of the Ada Programming Language 

McGettrick • Program Verification Using Ada 

Sommerville, Morrison • Software Development with Ada 

Algol -- "Algorithmic Language" 

Algol-60. Algol-68W. Algol-68. A family of procedural languages. 

The more I ponder the principles of 

language design, and the techniques 

that put them into practice, the more is 

my amazement at and admiration of 

ALGOL 60. Here is a language so far 

ahead of its time that it was not only 

an improvement on its predecessors 

but also on nearly all its successors. 

-- C. A. R. Hoare, "Hints on 

Programming Language Design", 

1974 

THE EMPEROR'S OLD CLOTHES -- An extract from Tony Hoare's 1980 ACM 

Turing Award lecture, on the birth of the monster Algol 68 

I conclude that there are two ways of 

constructing a software design: One 

way is to make it so simple that there 

are obviously no deficiencies and the 

other way is to make it so complicated 

that there are no obvious deficiencies.

-- C. A. R. Hoare, "The Emperor's Old 

Clothes", CACM 24(2), 1981 

(on the design of ALGOL 68 v. Ada) 

Randall, Russell • Algol 60 Implementation 

APL -- "A Programming Language" 

Designer: Ken Iverson, in the late 1950s/early 1960s. 

There are three things a man must do 

Before his life is done; 

Write two lines in APL, 

And make the buggers run. 

-- Stan Kelly-Bootle, The Devil's DP

Dictionary, 1981 

Famous for its enormous character set, and for being able to write whole 

accounting packages or air traffic control systems with a few 

incomprehensible key strokes. 

APL, in which you can write a program 

to simulate shuffling a deck of cards 

and then dealing them out to several 

players in four characters, none of 

which appear on a standard keyboard. 

-- David Given 

Michael Gertelman has coded Conway's Game of Life in one line of APL: 

APL is a mistake, carried through to 

perfection. It is the language of the 

future for the programming techniques 

of the past: it creates a new 

generation of coding bums. 

-- Edsger W. Dijkstra, How do we tell 

truths that might hurt? EWD498, 1975 

Mason • Learning APL: An Array Processing Language 

Pommier • An Introduction to APL 

Thomson • APL Programs for the Mathematics Classroom 

awk -- after the initials of its inventors: Aho, Weinberger, Kernighan 

An interpreted language with pattern matching, associative arrays, no 

declarations, implicit type casting, and C-like syntax. Wonderful for quickly 

hacking small special-purpose Unix filters; a nightmare when grown into 

large programs 

BEGIN { FS = "\t" } 

{ total[$4] += $3 } 

END { for (name in total)

print name, total[name]

}

comp.lang.awk FAQ

Aho, Kernighan, Weinberger • The AWK Programming Language 

Dougherty, Robbins • sed and awk 

Robbins • Effective awk Programming
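The awk fragment in this entry uses an associative array to sum the third tab-separated field, grouped by the fourth. A minimal Python sketch of the same idea follows. The function name totals_by_name and the sample rows are assumptions for illustration, and unlike awk it skips short lines and does not coerce non-numeric text to zero.

```python
from collections import defaultdict


def totals_by_name(lines):
    """Sum field 3 grouped by field 4 of tab-separated lines (awk's $3 and $4)."""
    totals = defaultdict(float)
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 4:
            continue  # assumption: ignore short lines instead of treating them as empty
        totals[fields[3]] += float(fields[2])
    return dict(totals)


if __name__ == "__main__":
    # Hypothetical sample input: four tab-separated columns per line.
    sample = ["a\tb\t3\tx", "c\td\t4.5\ty", "e\tf\t1\tx"]
    for name, total in totals_by_name(sample).items():
        print(name, total)
```

The dictionary plays the role of awk's total array; the explicit split and the float conversion are the work awk does implicitly through FS and its type casting.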

B -- a revised version of BCPL 

Babbage -- after Charles Babbage, the inventor of the first (mechanical) 

computer 

On two occasions I have been 

asked [by members of 

Parliament], 'Pray, Mr. Babbage, 

if you put into the machine 

wrong figures, will the right 

answers come out?' I am not 

able rightly to apprehend the 

kind of confusion of ideas that 

could provoke such a question. 

-- Charles Babbage, 1791-1871

A rather different Babbage is the Language of the Future 

BASIC -- "Beginner's All-purpose Symbolic Instruction Code"

An interpreted procedural language, originally invented in the 1960s for 

teaching, which has spread out of control. 

It is practically impossible to teach 

good programming style to students 

that have had prior exposure to 

BASIC: as potential programmers they 

are mentally mutilated beyond hope of 

regeneration. 

-- Edsger W. Dijkstra, How do we tell

truths that might hurt? EWD498, 1975



80 INPUT N% 

90 IF (N% > M%) THEN 80 

100 FOR I% = 1 TO N%

110 X(I%) = RND 

120 NEXT I% 

130 GOSUB 6000 

BASIC is to computer languages what 

Roman numerals are to arithmetic 

-- unknown 

[that is, great for simple addition, a 

nightmare for more sophisticated long 

division] 

BBC BASIC -- designed for Acorn's BBC micro -- added control structures 

and procedures, and is a greatly improved language, but is still suitable 

only for small programs. 

BCPL -- "Basic CPL", a modified version of CPL

Designer: Martin Richards.

LET SWAP(X,Y) BE

$(

LET TEMP = !X

!X := !Y

!Y := TEMP

$)

Whitby-Strevens & Richards • BCPL: The Language and Its Compiler

Bliss

C -- a revised version of B 

Designer: Dennis Ritchie, Bell Labs in the early 1970s. A procedural 

language originally designed as a system programming language for the 

PDP, now out of control. 

A language that combines all the 

elegance and power of assembly 

language with all the readability and 

maintainability of assembly language. 

-- New Hacker's Dictionary 

Variants: K&R C (Kernighan and Ritchie C). ANSI C.

... one of the main causes of the fall of 

the Roman Empire was that, lacking 

zero, they had no way to indicate 

successful termination of their C 

programs. 

-- Robert Firth 

for( i=0; (c=getchar())!=EOF && c != '\n'; i++ ) 

s[i] = c!='\t' ? c : ' '; 

The above is everyday C code. (And some people who quite happily write 

this sort of stuff all day complain that Z is difficult "because of all those 

strange symbols"!) OBFUSCATED C, on the other hand, looks more like: 

/* 

* HELLO WORLD 

* by Jack Applin and Robert Heckendorn, 1985 

*/ 

main(v,c)char**c;{for(v[c++]="Hello, world!\n)"; 

(!!c)[*c]&&(v--||--c&&execlp(*c,*c,c[!!c]+!!c,!c)); 

**c=!c)write(!!*c,*c,!!**c);} 

-- Eric Raymond, ed. The New Hacker's Dictionary 

The last good thing written in C was 

Franz Schubert's Symphony No. 9 

-- Erwin Dieterich

Kernighan & Ritchie • The C Programming Language 

C++ -- C incremented 

Designer: Bjarne Stroustrup. C with object oriented extensions; even more 

out of control than C 

C++ : where friends have access to 

your private members 

-- Gavin Russell Baker 

void f() { 

olist ll; 

name nn; 

ll.insert(&nn); 

name* pn = (name*)ll.get(); 

} 

When your hammer is C++, everything 

begins to look like a thumb. 

-- Steve Haflich, comp.lang.lisp, 

December 1994 

> C++ has its place in the history of 

programming languages. 

Just as Caligula has his place in the 

history of the Roman Empire? 

-- Robert Firth (firth @ sei.cmu.edu) 

94/03/18 

as quoted by Dirk Craeynest at 

http://www.cs.kuleuven.ac.be/~dirk/quotes.html 

Actually I made up the term "object-oriented",

and I can tell you I did not

have C++ in mind.

-- Alan Kay

The Computer Revolution hasn't

happened yet: Keynote, OOPSLA,

1997

C++: "an octopus made by nailing 

extra legs onto a dog" 

-- Steve Taylor, 1998 

There are only two things wrong with 

C++: The initial concept and the 

implementation. 

-- Bertrand Meyer 

Chill -- CCITT High Level Language 

(where CCITT = Comité Consultatif International Télégraphique et

Téléphonique)

Clean

A lazy, purely functional language, with "almost-as-good-as-C" efficiency. 

/* sieve of Eratosthenes */ 

primes :: [Int] 

primes = sieve [2..] 

sieve :: [Int] -> [Int] 

sieve [prime:rest] = 

[prime: sieve [i \\ i

-- program fragment taken from A COMAL SUPPLIER SITE 

CORAL -- "Common Real-time Application Language" 

CPL -- "Combined Programming Language" 

Dynamo -- "Dynamic Models" 

A descendant of Simple, used for the Limits to Growth models.

Euclid

Eiffel

Designer: Bertrand Meyer. An elegant object oriented language, designed 

to support reuse, and including support for logical assertions. 

putIth(v: like first; i:INTEGER) is 

require 

indexLargeEnough: i >= 1; 

indexSmallEnough: i SWAP 2* 

+ @ 

EXECUTE 

; 

FORTRAN -- "Formula Translation" 

If a variable is not declared, it is implicitly given a type based on its first

letter (I to N being integers, the rest floats), which led to the famous story of

losing a spacecraft.

Consistently separating words by

spaces became a general custom

about the tenth century A.D., and

lasted until about 1957, when

FORTRAN abandoned the practice.

-- Sun FORTRAN Reference Manual

DO 70 I = 1,3

N = KEY(1,I)

DO 50 J = 1,N

IF (KARD(J) .NE. KEY(J+1,I)) GOTO 70

50 CONTINUE

GOTO 200

70 CONTINUE

200 END

The primary purpose of the DATA

statement is to give names to

constants; instead of referring to PI as

3.141592653589793 at every

appearance, the variable PI can be

given that value with a DATA

statement and used instead of the

longer form of the constant. This also

simplifies modifying the program,

should the value of PI change.

-- FORTRAN manual for Xerox

computers

[Some net-copies of this quote have

the last digit as a 7. But pi=3.14159

26535 89793 23846 ... Is it a

transcription error, or an error in the

original manual? Is the whole

quotation just a UL, or is it real?]

Variants: Fortran-II. Fortran-IV, roughly equal to Fortran-66. Fortran-77.

Fortran-90 (previously Fortran-8X). Watfor = Waterloo Fortran. Ratfor =

Rational Fortran (a preprocessor to add control structures to Fortran-66)

Gypsy

Handel-C

Designer: Ian Page. A language hiding an occam-like semantics 

underneath a C-like syntax, designed for compiling down to hardware, 

especially FPGAs. 

prialt { 

case louder ? any : 

volume = volume + 1 ; 

break ;

case softer ? any : 

volume = volume - 1 ; 

break ; 

} 

amplifier ! volume ; 

Celoxica Home Page 

Haskell -- after Haskell Curry, a logician 

A functional language.

Icon

Designers: Ralph Griswold, Dave Hanson, Tim Korb. A string processing

language, a descendant of Snobol, with structuring.

while line := read() do 

if line := line[upto(wchar,line):0] 

then return line[1:many(wchar,line)] 

Java -- slang for coffee, the programmer's staple diet 

Syntax like C++ "with all the nasty bits taken out". Compiles down to 

bytecode for a virtual machine, greatly increasing portability (if not 

performance). 

Java book reviews 

Lisp -- "List Processing language" 

(or... Lots of Irritating Superfluous Parentheses) 

Designer: John McCarthy, MIT, late 1950s. A functional language, used 

mainly for AI applications. 

55. A LISP programmer knows the 

value of everything, but the cost of 

nothing. 

-- Alan J. Perlis. Epigrams on 

Programming. 

SIGPLAN Notices 17(9):7-13, 

September 1982 

(DEFUN MEMBER (ITEM S) 

(COND ((NULL S) NIL) 

((EQUAL ITEM (CAR S)) S) 

(T (MEMBER ITEM (CDR S))))) 

If you learn Lisp correctly, you can 

grok all programming styles with it: 

procedural, OO, predicate, functional, 

pure or full of side-effects. Recursion 

will be your friend, function references

your allies, you will truly know what a 

closure is, and that an argument stack 

is actually a performance hack. You 

will see that the most elegant way to 

solve a problem is to create a custom 

language, solve the generic problem, 

and have your specific one fall out as 

a special case. You will learn to truly 

separate intent from the bare metal, 

and you will finally understand the two 

deepest secrets, which are really the 

same secret, which we tell all, but so 

few understand, that code and data 

are the same thing, but organize your 

data and your code will follow. 

-- Mark Atwood, rec.arts.sf.written, Jan 

2002 

Variants include: Scheme 

Logo -- from the Greek logos, meaning 'word' or 'thought' 

Designer: Seymour Papert. Turtle graphics. 

TO SQUARE 
REPEAT 4 
FORWARD 100 
RIGHT 90 

Lucid 

Designers: Ed Ashcroft and Bill Wadge, 1974. Lucid programs are intrinsically parallel and provable. 

Matlab 

A matrix-based language that lets you write maths how it wants to be written, with hardly a loop in sight. 

Example: take a 2D matrix M, perform a singular value decomposition, normalise the resulting vector of singular values s_i to treat them as a vector of probabilities p_i, and calculate the Shannon entropy H: 

S = svd(M); 
P = S/sum(S); 
H = - dot(P,log2(P)); 

Miranda 

A functional language. 

ML 

A functional language with modules, developed at the University of Edinburgh. 

fun reverse ([], ys) = ys 
  | reverse (x::xs, ys) = reverse(xs, x::ys); 

Modula -- "Modular Language" 

Designer: Niklaus Wirth. A descendant of Pascal that added modules for 

large-scale structuring. 

Variants: Modula, Modula-2, Modula-3. 

DEFINITION MODULE InOut; 

EXPORT QUALIFIED 

EOL, Done, termCH; 

CONST EOL = 36C; 

VAR Done: BOOLEAN; 

termCH: CHAR; 

PROCEDURE OpenInput(defext: ARRAY OF CHAR); 

PROCEDURE OpenOutput(defext: ARRAY OF CHAR); 

END InOut. 

Oberon -- after Oberon, a moon of Uranus (which was being passed by 

Voyager at the time) 

Designers: Niklaus Wirth and Jurg Gutknecht. 

An object oriented language descended from Pascal and Modula-2 

DEFINITION Texts; 

IMPORT Display, Files, Fonts; 

CONST 

replace = 0; insert = 1; delete = 2; 

TYPE 

Buffer = POINTER TO BufDesc; 

BufDesc = RECORD 

len: LONGINT 

END; 

PROCEDURE Append(T:Text; B:Buffer); 

END Texts. 

Objective C -- an object oriented C 

Designer: Brad Cox. A hybrid object oriented language containing all of C 

and some Smalltalk-like method syntax. 

float total = emptyWeight; 

int i, n = [self size]; 

for (i=0; i

Occam 

A parallel programming language, based on Hoare's formal language CSP 

(Communicating Sequential Processes), supported by the inmos 

Transputer. 

SEQ 

ALT 

louder ? any 

volume := volume + 1 

softer ? any 

volume := volume - 1 

amplifier ! volume 

OPS5 -- "Official Production System version 5" 

A rule based AI programming language 

Pascal -- after Blaise Pascal 

Designer: Niklaus Wirth, around 1970. A descendant of Algol, originally 

invented for teaching, which has spread out of control. (Uses semicolons to 

separate statements, rather than to terminate them, a cause of much grief.) 

while not eof(fn) do 

begin 

read(fn,next); 

sum := sum + next; 

end 

Perl -- "Practical Extraction and Report Language" 

(or... "Pathologically Eclectic Rubbish Lister", sometimes known as "the 

Swiss-Army chainsaw") 

Python is executable pseudocode. 

Perl is executable line noise. 

-- unknown 

Designer: Larry Wall. A descendant of awk, and lots of other things. 

Perl: The only language that looks the 

same before and after RSA 

encryption. 

-- Keith Bostic 

while ( <> ) { 

next unless s/^(.*?):\s*//; 

$HoL{$1} = [ split ]; 

} 

Perl is the only language where you 

can bang your head on the keyboard 

and it compiles. 

-- unknown

If you put a million monkeys at a 

million keyboards, one of them will 

eventually write a Java program. The 

rest of them will write Perl programs. 

-- anon 

Pilot -- "Programmed Inquiry Learning or Teaching" 

PL/I -- "Programming Language 1" 

Criticised for being large and complex. 

ON CONDSIGNAL (NEW) BEGIN 

LINECOUNT = 1; 

PAGECOUNT = PAGECOUNT + 1; 

WRITE FILE(REPORT) FROM (HEADLINE); 

END; 

PL/I —"the fatal disease"— belongs 

more to the problem set than to the 

solution set. 



Pop-2, Pop-11 -- "Pop-2 for the PDP-11" 

An AI programming language, originally developed at the University of 

Edinburgh, then the University of Sussex 

define doubleList(lst); 
    vars temp; 
    [] -> temp; 
    until lst = [] 
    do  temp <> [^(hd(lst)*2)] -> temp; 
        tl(lst) -> lst 
    enduntil; 
    temp 
enddefine; 

PostScript 

Designed by Adobe. A stack-based procedural language, designed for driving laser printers and graphics. 

currentpoint 

4 2 roll exch 4 -1 roll exch 

sub 3 1 roll sub 

exch atan rotate dup scale 

-1 2 rlineto 

7 -2 rlineto 

-7 -2 rlineto 

closepath fill 

Prolog -- "Programming in Logic"

A logic language, used mainly for AI applications. 

descendant(X,Y) :- offspring(X,Y). 

descendant(X,Y) :- offspring(X,Z), descendant(Z,Y). 

"How many Prolog programmers does 

it take to change a lightbulb?" 

"No." 

Clocksin & Mellish • Programming in Prolog 

Python -- after Monty Python's Flying Circus 

Python is executable pseudocode. 

Perl is executable line noise. 

-- unknown 

Ruby 

I always thought Smalltalk would beat 

Java, I just didn't know it would be 

called 'Ruby' when it did. 

-- Kent Beck 

Rick DeNatale -- Old Smalltalker’s 

perceptions of Ruby 

SAS 

Simple -- "Simulation of Industrial Management Problems with Lots of 

Equations" 

Simula-67 -- "Simulation language" 

Designers: Ole-Johan Dahl, Bjorn Myhrhaug, Kristen Nygaard. The first object oriented language, an extension of Algol 60. 

CLASS MEMBER; 
BEGIN REF(MEMBER)NEXT; 
   PROCEDURE PUSHDOWN(L);REF(CHAIN)L; 
   IF L=/=NONE THEN 
   BEGIN NEXT:-L.FIRST; 
      L.FIRST:-THIS MEMBER; 
   END***PUSHDOWN***; 
END***MEMBER*** 

Smalltalk 

Designed by Xerox-Parc. A pure object oriented, untyped language. 

^(self includes: aSymbol)

ifTrue: [self controlKeys at: aSymbol] 

ifFalse: [aBlock value] 

My absolute favorite programming language in the world, ever (with Matlab 

up there in the running, depending on the application). 

Snobol -- "String Oriented Symbolic Language" 

A string processing language, much used in the Humanities for textual analyses. 

MORE  LINE = INPUT    :F(END) 
      LINE PAT        :F(MORE) 
      OUTPUT = LINE   :(MORE) 
END 

Tcl 

TeX -- tau, epsilon, chi 

Donald Knuth's macro-based text formatting language, started in the late 

1970s. Included here because of its incredible complexity, and because 

someone has written Towers of Hanoi, and 8-queens, in it (presumably just 

because they could). 

\def\listoftables{\section*{List of Tables\@markboth 

{LIST OF TABLES}{LIST OF TABLES}}\@starttoc{lot}} 

Variants: LaTeX, AMSTeX 

COMPREHENSIVE TEX ARCHIVE 

Turing -- after Alan Turing 

Turing (and OOT) is a general purpose programming language designed 

specifically for teaching the concepts of computer science. 

% Roll a die until you get 6. 

var die : int 

loop 

rand int (die, 1, 6) 

exit when die = 6 

put "This roll is ", die 

end loop 

put "Stopping with roll of 6" 

TURING PROGRAMMING LANGUAGE HOME PAGE

A Beginner's Python Tutorial 

When Civilization™ IV (Firaxis Games, published by Take2) was announced, one of the most exciting features was that much of the scripting code will be in Python, and the game data in XML. This tutorial attempts to teach you the basics of Python programming that you could use with Civ IV. 

Of course, this tutorial is not limited to those who want to play a slow-paced turn-based strategy game. That is what it was written for, but it is perfectly useful to any person with no programming knowledge at all who wants to learn Python. What makes this tutorial unique is that it is written for beginners, by a beginner. 

Table 1 - Lessons 

Number Name 

Lesson 1 Installing Python 

Lesson 2 Very Simple Programs 

Lesson 3 Variables, and Programs in a Script 

Lesson 4 Loops and Conditionals 

Lesson 5 Functions 

Lesson 6 Tuples, Lists, and Dictionaries 

Lesson 7 The for loop 

Lesson 8 Classes 

Lesson 9 Importing Modules 

Lesson 10 File I/O 

Lesson 11 Error Handling 

Then we also have the (to be written) Civilization IV Python introduction. It will begin its release in early November. 

Table 2 - Lessons 

Number Name 

Lesson 1 Introduction to Civilization IV Python (not released yet) 

A Brief Introduction 

The Epytext Markup Language 

Epytext is a simple lightweight markup language that lets you add formatting and structure to docstrings. Epydoc 

uses that formatting and structure to produce nicely formatted API documentation. The following example (which 

has an unusually high ratio of documentation to code) illustrates some of the basic features of epytext: 

def x_intercept(m, b): 
    """ 
    Return the x intercept of the line M{y=m*x+b}.  The X{x intercept} 
    of a line is the point at which it crosses the x axis (M{y=0}). 

    This function can be used in conjunction with L{z_transform} to 
    find an arbitrary function's zeros. 

    @type  m: number 
    @param m: The slope of the line. 
    @type  b: number 
    @param b: The y intercept of the line.  The X{y intercept} of a 
              line is the point at which it crosses the y axis (M{x=0}). 
    @rtype:   number 
    @return:  the x intercept of the line M{y=m*x+b}. 
    """ 
    return -b/m 

You can compare this function definition with the API documentation generated by epydoc. Note that: 

Paragraphs are separated by blank lines. 

Inline markup has the form "x{...}", where "x" is a single capital letter. This example uses inline markup to 

mark mathematical expressions ("M{...}"); terms that should be indexed ("X{...}"); and links to the 

documentation of other objects ("L{...}"). 

Descriptions of parameters, return values, and types are marked with "@field:" or "@field arg:", where "field" 

identifies the kind of description, and "arg" specifies what object is described. 

Epytext is intentionally very lightweight. If you wish to use a more expressive markup language, I recommend 

reStructuredText. 

Epytext Language Overview 

Epytext is a lightweight markup language for Python docstrings. The epytext markup language is used by epydoc to 

parse docstrings and create structured API documentation. Epytext markup is broken up into the following 

categories: 

Block Structure divides the docstring into nested blocks of text, such as paragraphs and lists. 

o Basic Blocks are the basic unit of block structure. 

o Hierarchical blocks represent the nesting structure of the docstring. 

Inline Markup marks regions of text within a basic block with properties, such as italics and hyperlinks. 

Block Structure 

Block structure is encoded using indentation, blank lines, and a handful of special character sequences. 

Indentation is used to encode the nesting structure of hierarchical blocks. The indentation of a line is defined 

as the number of leading spaces on that line; and the indentation of a block is typically the indentation of its 

first line. 

Blank lines are used to separate blocks. A blank line is a line that only contains whitespace. 

Special character sequences are used to mark the beginnings of some blocks. For example, '-' is used as a 

bullet for unordered list items, and '>>>' is used to mark doctest blocks. 

The following sections describe how to use each type of block structure.
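The two definitions above (indentation as the number of leading spaces, a blank line as one that contains only whitespace) can be sketched in a few lines of Python. This is an illustration only, not epydoc's parser: the helper names `indentation`, `is_blank`, and `split_blocks` are hypothetical, and the special character sequences are not handled.

```python
def indentation(line):
    """Return the number of leading spaces on a line."""
    return len(line) - len(line.lstrip(" "))


def is_blank(line):
    """A blank line contains only whitespace (or nothing at all)."""
    return line.strip() == ""


def split_blocks(text):
    """Group consecutive non-blank lines into blocks.

    Each block is returned as (indentation_of_first_line, lines).
    """
    blocks, current = [], []
    for line in text.splitlines():
        if is_blank(line):
            if current:
                blocks.append((indentation(current[0]), current))
                current = []
        else:
            current.append(line)
    if current:
        blocks.append((indentation(current[0]), current))
    return blocks


sample = "First paragraph,\nsecond line.\n\n  - an indented list item\n"
for indent, lines in split_blocks(sample):
    print(indent, lines)
```

The sketch takes the indentation of a block from its first line, matching the "typically" in the description above; a real parser would also use indentation changes to build the nesting structure.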

Paragraphs 

A paragraph is the simplest type of basic block. It consists of one or more lines of text. Paragraphs must be left 

justified (i.e., every line must have the same indentation). The following example illustrates how paragraphs can be 

used: 

Docstring Input: 

def example(): 
    """ 
    This is a paragraph.  Paragraphs can 
    span multiple lines, and can contain 
    I{inline markup}. 

    This is another paragraph.  Paragraphs 
    are separated by blank lines. 
    """ 
    *[...]* 

Rendered Output: 

This is a paragraph. Paragraphs can span multiple lines, and contain inline markup. 

This is another paragraph. Paragraphs are separated from each other by blank lines. 

Lists 

Epytext supports both ordered and unordered lists. A list consists of one or more consecutive list items of the same 

type (ordered or unordered), with the same indentation. Each list item is marked by a bullet. The bullet for 

unordered list items is a single dash character (-). Bullets for ordered list items consist of a series of numbers 

followed by periods, such as 12. or 1.2.8.. 
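As a rough illustration of the bullet rules just described, the following sketch classifies a line as an unordered item, an ordered item, or plain text. The regular expression and the function name `classify` are assumptions made for this example; they are not taken from epydoc's source.

```python
import re

# Unordered bullet: a single dash. Ordered bullet: numbers separated by
# periods and ending in a period, e.g. "12." or "1.2.8.". Either bullet
# must be followed by whitespace or the end of the line.
# (Assumed pattern for illustration, not epydoc's own.)
BULLET = re.compile(r"^(?P<indent> *)(?P<bullet>-|\d+(?:\.\d+)*\.)(?:\s+|$)")


def classify(line):
    """Return ("unordered" | "ordered", indent), or None for a plain line."""
    m = BULLET.match(line)
    if m is None:
        return None
    kind = "unordered" if m.group("bullet") == "-" else "ordered"
    return kind, len(m.group("indent"))


for line in ["- item", "  12. item", "1.2.8. item", "plain text"]:
    print(repr(line), classify(line))
```

Requiring whitespace after the bullet keeps text such as `1.5 litres` or `-5 degrees` from being mistaken for a list item in this sketch.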

List items typically consist of a bullet followed by a space and a single paragraph. The paragraph may be indented 

more than the list item's bullet; often, the paragraph is indented two or three characters, so that its left margin lines 

up with the right side of the bullet. The following example illustrates a simple ordered list. 

Docstring Input: 

def example(): 
    """ 
    1. This is an ordered list item. 

    2. This is another ordered list 
    item. 

    3. This is a third list item.  Note that 
       the paragraph may be indented more 
       than the bullet. 
    """ 
    *[...]* 

Rendered Output: 

1. This is an ordered list item. 
2. This is another ordered list item. 
3. This is a third list item. Note that the paragraph may be indented more than the bullet. 

List items can contain more than one paragraph; and they can also contain sublists, literal blocks, and doctest 

blocks. All of the blocks contained by a list item must all have equal indentation, and that indentation must be 

greater than or equal to the indentation of the list item's bullet. If the first contained block is a paragraph, it may 

appear on the same line as the bullet, separated from the bullet by one or more spaces, as shown in the previous 

example. All other block types must follow on separate lines. 

Every list must be separated from surrounding blocks by indentation: 

Docstring Input: 

def example(): 
    """ 
    This is a paragraph. 
      1. This is a list item. 
      2. This is a second list 
         item. 
           - This is a sublist 
    """ 
    [...] 

Rendered Output: 

This is a paragraph. 
  1. This is a list item. 
  2. This is a second list item. 
       - This is a sublist. 

Note that sublists must be separated from the blocks in their parent list item by indentation. In particular, the 

following docstring generates an error, since the sublist is not separated from the paragraph in its parent list item by

indentation: 


Docstring Input: 

def example(): 
    """ 
        1. This is a list item.  Its 
           paragraph is indented 7 spaces. 
           - This is a sublist.  It is 
             indented 7 spaces. 
    """ 
    #[...] 

Rendered Output: 

L5: Error: Lists must be indented. 

The following example illustrates how lists can be used: 

Docstring Input: 

def example(): 
    """ 
        - This is a sublist. 
        - The sublist contains two 
          items. 
            - The second item of the 
              sublist has its own sublist. 

      2. This list item contains two 
         paragraphs and a doctest block. 

         >>> print 'This is a doctest block' 
         This is a doctest block 

         This is the second paragraph. 
    """ 
    #[...] 

Rendered Output: 

    - This is a sublist. 
    - The sublist contains two items. 
        - The second item of the sublist has its own sublist. 
  2. This list item contains two paragraphs and a doctest block. 

     >>> print 'This is a doctest block' 
     This is a doctest block 

     This is the second paragraph. 

Epytext will treat any line that begins with a bullet as a list item. If you want to include bullet-like text in a 

paragraph, then you must either ensure that it is not at the beginning of the line, or use escaping to prevent epytext 

from treating it as markup: 


Docstring Input: 

def example(): 
    """ 
    This sentence ends with the number 
    1.  Epytext can't tell if the "1." 
    is a bullet or part of the paragraph, 
    so it generates an error. 
    """ 
    #[...] 

Rendered Output: 

L4: Error: Lists must be indented. 

Docstring Input: 

def example(): 
    """ 
    This sentence ends with the number 1. 

    This sentence ends with the number 
    E{1}. 
    """ 
    #[...] 

Sections 

A section consists of a heading followed by one or more child blocks. 



The heading is a single underlined line of text. Top-level section headings are underlined with the '=' 

character; subsection headings are underlined with the '-' character; and subsubsection headings are 

underlined with the '~' character. The length of the underline must exactly match the length of the heading. 

The child blocks can be paragraphs, lists, literal blocks, doctest blocks, or sections. Each child must have

equal indentation, and that indentation must be greater than or equal to the heading's indentation. 
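The heading rule above (underline character selects the level, underline length must equal heading length) is easy to check mechanically. The following is a minimal sketch under that reading; `heading_level` and `UNDERLINE_LEVELS` are hypothetical names, and the sketch ignores the indentation of the two lines.

```python
# Underline character -> section depth, as described in the text above.
UNDERLINE_LEVELS = {"=": 1, "-": 2, "~": 3}


def heading_level(title_line, underline_line):
    """Return 1, 2 or 3 if the two lines form a heading, else None."""
    title = title_line.strip()
    underline = underline_line.strip()
    if not title or not underline:
        return None
    char = underline[0]
    # The underline must consist of one repeated, recognised character.
    if char not in UNDERLINE_LEVELS or underline != char * len(underline):
        return None
    # The underline must be exactly as long as the heading text.
    if len(underline) != len(title):
        return None
    return UNDERLINE_LEVELS[char]


print(heading_level("Section 1", "========="))
print(heading_level("Section 1.1", "-----------"))
print(heading_level("Section 1", "===="))
```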

The following example illustrates how sections can be used: 

Docstring Input: 

def example(): 
    """ 
    This paragraph is not in any section. 

    Section 1 
    ========= 
      This is a paragraph in section 1. 

      Section 1.1 
      ----------- 
      This is a paragraph in section 1.1. 

    Section 2 
    ========= 
    """ 
    #[...] 

Rendered Output: 

Section 1 

Section 1.1 

This is a paragraph in section 1.1. 

Section 2 

Literal Blocks 

Literal blocks are used to represent "preformatted" text. Everything within a literal block should be displayed 

exactly as it appears in plaintext. In particular: 

Spaces and newlines are preserved. 

Text is shown in a monospaced font. 

Inline markup is not detected. 

Literal blocks are introduced by paragraphs ending in the special sequence "::". Literal blocks end at the first line 

whose indentation is equal to or less than that of the paragraph that introduces them. The following example shows 

how literal blocks can be used: 


Docstring Input: 

def example(): 
    """ 
    The following is a literal block:: 

        Literal / 
               / Block 

    This is a paragraph following the 
    literal block. 
    """ 
    #[...] 

Rendered Output: 

The following is a literal block: 

    Literal / 
           / Block 

This is a paragraph following the literal block. 

Literal blocks are indented relative to the paragraphs that introduce them; for example, in the previous example, the 

word "Literal" is displayed with four leading spaces, not eight. Also, note that the double colon ("::") that 

introduces the literal block is rendered as a single colon. 
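The three rules for literal blocks (introduced by "::", ended by the first line indented no more than the introducing paragraph, shown relative to that paragraph with "::" reduced to ":") can be sketched as follows. The function `split_literal` is a hypothetical helper written for this illustration; it assumes the first line passed in is the last line of the introducing paragraph.

```python
def indentation(line):
    """Number of leading spaces on a line."""
    return len(line) - len(line.lstrip(" "))


def split_literal(lines):
    """Split lines into (rendered_intro, literal_lines, remaining_lines).

    Returns None if lines[0] does not end with the '::' marker.
    """
    intro = lines[0].rstrip()
    if not intro.endswith("::"):
        return None
    base = indentation(lines[0])
    end = len(lines)
    for i in range(1, len(lines)):
        # The block ends at the first non-blank line whose indentation
        # is equal to or less than that of the introducing paragraph.
        if lines[i].strip() and indentation(lines[i]) <= base:
            end = i
            break
    # Keep indentation relative to the introducing paragraph.
    literal = [line[base:] for line in lines[1:end]]
    # Drop blank lines that only separate the block from its neighbours.
    while literal and not literal[0].strip():
        literal.pop(0)
    while literal and not literal[-1].strip():
        literal.pop()
    # The double colon is rendered as a single colon.
    return intro[:-1].strip(), literal, lines[end:]


doc = [
    " " * 4 + "The following is a literal block::",
    "",
    " " * 8 + "Literal /",
    " " * 15 + "/ Block",
    "",
    " " * 4 + "This is a paragraph following the",
    " " * 4 + "literal block.",
]
intro, literal, rest = split_literal(doc)
print(intro)
print("\n".join(literal))
```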

Doctest Blocks 

Doctest blocks contain examples consisting of Python expressions and their output. Doctest blocks can be used by 

the doctest module to test the documented object. Doctest blocks begin with the special sequence ">>>". Doctest 

blocks are delimited from surrounding blocks by blank lines. Doctest blocks may not contain blank lines. The 

following example shows how doctest blocks can be used: 


Docstring Input: 

def example(): 
    """ 
    The following is a doctest block: 

        >>> print (1+3, 
        ...        3+5) 
        (4, 8) 
        >>> 'a-b-c-d-e'.split('-') 
        ['a', 'b', 'c', 'd', 'e'] 

    This is a paragraph following the 
    doctest block. 
    """ 
    #[...] 

Rendered Output: 

The following is a doctest block: 

    >>> print (1+3, 
    ...        3+5) 
    (4, 8) 
    >>> 'a-b-c-d-e'.split('-') 
    ['a', 'b', 'c', 'd', 'e'] 

This is a paragraph following the doctest block. 

Fields 

Fields are used to describe specific properties of a documented object. For example, fields can be used to define the 

parameters and return value of a function; the instance variables of a class; and the author of a module. Each field is 

marked by a field tag, which consists of an at sign ('@') followed by a field name, optionally followed by a space and 

a field argument, followed by a colon (':'). For example, '@return:' and '@param x:' are field tags. 

Fields can contain paragraphs, lists, literal blocks, and doctest blocks. All of the blocks contained by a field must all 

have equal indentation, and that indentation must be greater than or equal to the indentation of the field's tag. If the 

first contained block is a paragraph, it may appear on the same line as the field tag, separated from the field tag by 

one or more spaces. All other block types must follow on separate lines. 

Fields must be placed at the end of the docstring, after the description of the object. Fields may be included in any 

order. 

Fields do not need to be separated from other blocks by a blank line. Any line that begins with a field tag followed 

by a space or newline is considered a field. 
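The field tag syntax described above ('@', a field name, an optional space and argument, a colon, then a space or newline) maps naturally onto a regular expression. The pattern below is an assumption made for illustration, not the one epydoc uses, and `parse_field` is a hypothetical helper.

```python
import re

# '@', a field name, an optional space-separated argument, then ':'.
# The tag must be followed by a space or by the end of the line.
FIELD_TAG = re.compile(
    r"^\s*@(?P<name>\w+)(?: (?P<arg>[^:\s]+))?:(?: |$)(?P<body>.*)$"
)


def parse_field(line):
    """Return (name, arg, body) for a field line, or None otherwise."""
    m = FIELD_TAG.match(line)
    if m is None:
        return None
    return m.group("name"), m.group("arg"), m.group("body").strip()


for line in ["@return: the x intercept",
             "@param x: The slope.",
             "Not a field at all"]:
    print(parse_field(line))
```

Only the first line of a field is handled here; continuation lines and nested blocks would need the indentation rules from the preceding paragraphs.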

The following example illustrates how fields can be used: 

Docstring Input: 

def example(): 
    """ 
    @param x: This is a description of 
        the parameter x to a function. 
        Note that the description is 
        indented four spaces. 
    @type x: This is a description of 
        x's type. 
    @return: This is a description of 
        the function's return value. 

        It contains two paragraphs. 
    """ 
    #[...] 

Rendered Output: 

Parameters: 
    x - This is a description of the parameter x to a function. Note that the description is indented four spaces. 
        (type=This is a description of x's type.) 

Returns: 
    This is a description of the function's return value. 

    It contains two paragraphs. 

For a list of the fields that are supported by epydoc, see the epydoc fields chapter. 

Inline Markup 

Inline markup has the form 'x{...}', where x is a single capital letter that specifies how the text between the braces 

should be rendered. Inline markup is recognized within paragraphs and section headings. It is not recognized within 

literal and doctest blocks. Inline markup can contain multiple words, and can span multiple lines. Inline markup 

may be nested. 

A matching pair of curly braces is only interpreted as inline markup if the left brace is immediately preceded by a 

capital letter. So in most cases, you can use curly braces in your text without any form of escaping. However, you 

do need to escape curly braces when: 

1. You want to include a single (un-matched) curly brace. 

2. You want to precede a matched pair of curly braces with a capital letter. 

Note that there is no valid Python expression where a pair of matched curly braces is immediately preceded by a 

capital letter (except within string literals). In particular, you never need to escape braces when writing Python 

dictionaries. See also escaping.
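The rule that only a capital letter immediately before the left brace triggers inline markup can be demonstrated with a short regular expression. This is a sketch of the rule as stated above, not epydoc's tokenizer; `markup_tags` is a hypothetical helper that reports only where markup opens and does not match closing braces.

```python
import re

# A capital letter immediately followed by '{' opens inline markup.
MARKUP_OPEN = re.compile(r"[A-Z]\{")


def markup_tags(text):
    """Return the tag letter of every inline-markup opening in text."""
    return [m.group(0)[0] for m in MARKUP_OPEN.finditer(text)]


print(markup_tags("I{B{nested} text}"))
print(markup_tags("my_dict = {1: 2, 3: 4}"))
print(markup_tags("C{my_dict={1:2, 3:4}}"))
```

The dictionary literal yields no tags because its brace follows `=` or a space, which is why Python dictionaries never need escaping.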

Basic Inline Markup 

Epytext defines four types of inline markup that specify how text should be displayed: 

I{...}: Italicized text. 

B{...}: Bold-faced text. 

C{...}: Source code or a Python identifier. 

M{...}: A mathematical expression. 

By default, source code is rendered in a fixed width font; and mathematical expressions are rendered in italics. But 

those defaults may be changed by modifying the CSS stylesheet. The following example illustrates how the four 

basic markup types can be used: 

Docstring Input: 

def example(): 
    """ 
    I{B{Inline markup} may be nested; and 
    it may span} multiple lines. 

      - I{Italicized text} 
      - B{Bold-faced text} 
      - C{Source code} 
      - M{Math} 

    Without the capital letter, matching 
    braces are not interpreted as markup: 
    C{my_dict={1:2, 3:4}}. 
    """ 
    #[...] 

Rendered Output: 

Inline markup may be nested; and it may span multiple lines. 

  - Italicized text 
  - Bold-faced text 
  - Source code 
  - Math: m*x+b 

Without the capital letter, matching braces are not interpreted as markup: my_dict={1:2, 3:4}. 

URLs 

The inline markup construct U{text<url>} is used to create links to external URLs and URIs. 'text' is the text that 

should be displayed for the link, and 'url' is the target of the link. If you wish to use the URL as the text for the link, 

you can simply write "U{url}". Whitespace within URL targets is ignored. In particular, URL targets may be split 

over multiple lines. The following example illustrates how URLs can be used: 

Docstring Input: 

def example(): 
    """ 
    - U{www.python.org} 
    - U{http://www.python.org} 
    - U{The epydoc homepage} 
    - U{The B{Python} homepage 
      } 
    - U{Edward Loper} 
    """ 
    #[...] 

Rendered Output: 

  - www.python.org 
  - http://www.python.org 
  - The epydoc homepage 
  - The Python homepage 
  - Edward Loper 

Documentation Crossreference Links 

The inline markup construct 'L{text<object>}' is used to create links to the documentation for other Python objects. 

'text' is the text that should be displayed for the link, and 'object' is the name of the Python object that should be 

linked to. If you wish to use the name of the Python object as the text for the link, you can simply write L{object}. 

Whitespace within object names is ignored. In particular, object names may be split over multiple lines. The 

following example illustrates how documentation crossreference links can be used: 

Docstring Input: 

def example(): 
    """ 
    - L{x_transform} 
    - L{search} 
    - L{The I{x-transform} function 
      } 
    """ 
    #[...] 

Rendered Output: 

  - x_transform 
  - search 
  - The x-transform function 

In order to find the object that corresponds to a given name, epydoc checks the following locations, in order: 

1. If the link is made from a class or method docstring, then epydoc checks for a method, instance variable, or 

class variable with the given name. 

2. Next, epydoc looks for an object with the given name in the current module. 

3. Epydoc then tries to import the given name as a module. If the current module is contained in a package, then 

epydoc will also try importing the given name from all packages containing the current module. 

4. Epydoc then tries to divide the given name into a module name and an object name, and to import the object 

from the module. If the current module is contained in a package, then epydoc will also try importing the 

module name from all packages containing the current module. 

5. Finally, epydoc looks for a class name in any module with the given name. This is only returned if there is a 

single class with such name. 

If no object is found that corresponds with the given name, then epydoc issues a warning. 
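The lookup order can be approximated in a few lines of Python. The sketch below is a loose simplification, not epydoc's implementation: `resolve` is a hypothetical function, the two dictionaries stand in for the members of the enclosing class and of the current module, and the package-relative imports of steps 3 and 4 as well as step 5 are left out.

```python
import importlib
import os


def resolve(name, local_names=None, module_names=None):
    """Loosely mimic the lookup order described above.

    local_names and module_names are hypothetical dicts standing in for
    the members of the enclosing class and of the current module.
    """
    # Steps 1 and 2: members of the class, then of the current module.
    for namespace in (local_names or {}, module_names or {}):
        if name in namespace:
            return namespace[name]
    # Step 3: try to import the whole name as a module.
    try:
        return importlib.import_module(name)
    except ImportError:
        pass
    # Step 4: split into "module.object" and fetch the object.
    if "." in name:
        module_name, _, object_name = name.rpartition(".")
        try:
            module = importlib.import_module(module_name)
            return getattr(module, object_name)
        except (ImportError, AttributeError):
            pass
    return None  # epydoc would issue a warning at this point


print(resolve("os") is os)
print(resolve("os.path.join") is os.path.join)
print(resolve("no_such_module_xyz"))
```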

Indexed Terms 

Epydoc automatically creates an index of term definitions for the API documentation. The inline markup construct 

'X{...}' is used to mark terms for inclusion in the index. The term itself will be italicized; and a link will be created 

from the index page to the location of the term in the text. The following example illustrates how index terms can be 

used: 

Docstring Input: 

def example(): 
    """ 
    An X{index term} is a term that 
    should be included in the index. 
    """ 
    #[...] 

Rendered Output: 

An index term is a term that should be included in the index. 

Index 

index term     example 
x intercept    x_intercept 
y intercept    x_intercept 

Symbols 

Symbols are used to insert special characters in your documentation. A symbol has the form 'S{code}', where code 

is a symbol code that specifies what character should be produced. The following example illustrates how symbols 

can be used to generate special characters: 


Symbols can be used in equations: 


""" 

∑ α/x ≤ β 

Symbols can be used in equations: 

← and ← both give left arrows. Some other arrows 

- S{sum}S{alpha}/x S{

dash (which would normally signal a list item), write 'E{-}'. In addition, two special escape codes are defined: 

'E{lb}' produces a left curly brace ('{'); and 'E{rb}' produces a right curly brace ('}'). The following example 

illustrates how escaping can be used: 

Docstring Input: 

def example(): 
    """ 
    This paragraph ends with two 
    colons, but does not introduce 
    a literal blockE{:}E{:} 

    E{-} This is not a list item. 

    Escapes can be used to write 
    unmatched curly braces: 
    E{rb}E{lb} 
    """ 
    #[...] 

Rendered Output: 

This paragraph ends with two colons, but does not introduce a literal block:: 

- This is not a list item. 

Escapes can be used to write unmatched curly braces: }{ 

Graphs 

The inline markup construct 'G{graphtype args...}' is used to insert automatically generated graphs. The following 

graph generation constructs are currently defined: 

Markup Description 

G{classtree classes...} Display a class hierarchy for the given class or 

classes (including all superclasses & subclasses). If 

no class is specified, and the directive is used in a 

class's docstring, then that class's class hierarchy 

will be displayed. 

G{packagetree modules...} Display a package hierarchy for the given module or 

modules (including all subpackages and 

submodules). If no module is specified, and the 

directive is used in a module's docstring, then that 

module's package hierarchy will be displayed. 

G{importgraph modules...} Display an import graph for the given module or 

modules. If no module is specified, and the directive 

is used in a module's docstring, then that module's 

import graph will be displayed. 

G{callgraph functions...} Display a call graph for the given function or 

functions. If no function is specified, and the 

directive is used in a function's docstring, then that 

function's call graph will be displayed. 

Characters 

Valid Characters 

Valid characters for an epytext docstring are space (\040); newline (\012); and any letter, digit, or punctuation, as 

defined by the current locale. Control characters (\000-\010 and \013-\037) are not valid content characters. 

Tabs (\011) are expanded to spaces, using the same algorithm used by the Python parser. Carriage-return/newline 

pairs (\015\012) are converted to newlines. 
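A minimal sketch of this normalisation, assuming that "the same algorithm used by the Python parser" means tab stops every 8 columns; the function names `normalize` and `invalid_characters` are hypothetical.

```python
def normalize(text, tabsize=8):
    """Convert CR/LF pairs to newlines, then expand tabs to spaces.

    tabsize=8 is an assumption about the tab stops used.
    """
    return text.replace("\r\n", "\n").expandtabs(tabsize)


def invalid_characters(text):
    """Return the control characters (octal 000-010 and 013-037) in text."""
    return [ch for ch in text if ord(ch) <= 0o10 or 0o13 <= ord(ch) <= 0o37]


cleaned = normalize("a\tb\r\nc")
print(repr(cleaned))
print(invalid_characters("ok\x07\n"))
```

Running `invalid_characters` after `normalize` matters in this sketch: a lone carriage return (\015) falls inside the invalid range, while one that is part of a CR/LF pair has already been removed.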

Content Characters 

Characters in a docstring that are not involved in markup are called content characters. Content characters are 

always displayed as-is. In particular, HTML codes are not passed through. For example, consider the following 

example: 


Docstring Input: 

def example(): 
    """ 
    <b>test</b> 
    """ 
    #[...] 

Rendered Output: 

<b>test</b> 

The docstring is rendered as <b>test</b>, and not as the word "test" in bold face. 

Spaces and Newlines 

In general, spaces and newlines within docstrings are treated as soft spaces. In other words, sequences of spaces and 

newlines (that do not contain a blank line) are rendered as a single space, and words may be wrapped at spaces. 

However, within literal blocks and doctest blocks, spaces and newlines are preserved, and no word-wrapping 

occurs; and within URL targets and documentation link targets, whitespace is ignored. 
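The soft-space rule for ordinary paragraphs can be sketched as below. `render_soft_spaces` is a hypothetical helper; it treats every block as a plain paragraph, so literal blocks, doctest blocks, and link targets (which follow the other rules above) are not handled.

```python
import re


def render_soft_spaces(docstring):
    """Collapse runs of spaces and newlines; keep blank-line breaks."""
    # One or more blank lines separate paragraphs.
    paragraphs = re.split(r"\n(?:[ \t]*\n)+", docstring.strip())
    # Within a paragraph, any run of whitespace becomes a single space.
    return [" ".join(p.split()) for p in paragraphs]


text = "This is a\n   paragraph.\n\nAnother   one.\n"
print(render_soft_spaces(text))
```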


Epydoc Fields 

Fields are used to describe specific properties of a documented object. For example, 

fields can be used to define the parameters and return value of a function; the instance 

variables of a class; and the author of a module. Each field consists of a tag, an 

optional argument, and a body. 

The tag is a case-insensitive word that indicates what kind of documentation is 

given by the field. 

The optional argument specifies what object, parameter, or group is documented 

by the field. 

The body contains the main contents of the field. 

Field Markup 

Each docstring markup language marks fields differently. The following table shows 

the basic fields syntax for each markup language. For more information, see the 

definition of field syntax for each markup language. 

Epytext                         reStructuredText                         Javadoc 
@tag: body...                   :tag: body...                            @tag body... 
@tag arg: body...               :tag arg: body...                        @tag arg body... 
Definition of epytext fields    Definition of ReStructuredText fields    Definition of Javadoc fields 

Supported Fields 

The following table lists the fields that epydoc currently recognizes. Field tags are 

written using epytext markup; if you are using a different markup language, then you 

should adjust the markup accordingly. 

Functions and Methods parameters 

@param p: ... 

A description of the parameter p for a function or method. 

@type p: ... 

The expected type for the parameter p. 

@return: ... 

The return value for a function or method. 

@rtype: ... 

The type of the return value for a function or method. 

@keyword p: ... 

A description of the keyword parameter p. 

@raise e: ...

A description of the circumstances under which a function or method raises 

exception e. 

These tags can be used to specify attributes of the parameters and return value of functions 

and methods. These tags are usually put in the docstring of the function to be 

documented. 

Note 

constructor parameters 

In C extension modules, extension classes cannot have a docstring 

attached to the __init__ function; consequently it is not possible to 

document parameters and exceptions raised by the class constructor. To 

overcome this shortcoming, the tags @param, @keyword, @type, 

@exception are also allowed to appear in the class docstring. In this case 

they refer to constructor parameters. 

@param fields should be used to document any explicit parameter (including the 

keyword parameter). @keyword fields should only be used for non-explicit keyword 

parameters: 

def plant(seed, *tools, **options): 
    """ 
    @param seed: The seed that should be planted. 
    @param tools: Tools that should be used to plant the seed. 
    @param options: Any extra options for the planting. 

    @keyword dig_deep: Plant the seed deep under ground. 
    @keyword soak: Soak the seed before planting it. 
    """ 
    #[...] 
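To make the distinction concrete, here is one way the elided body of `plant` might consume its non-explicit keyword parameters. The body is entirely hypothetical (the original shows only `#[...]`); it is meant only to show that `dig_deep` and `soak` never appear in the signature and arrive through `**options`.

```python
def plant(seed, *tools, **options):
    # dig_deep and soak are not named in the signature; they arrive
    # through **options, which is why @keyword documents them.
    dig_deep = options.get("dig_deep", False)
    soak = options.get("soak", False)
    steps = []
    if soak:
        steps.append("soak " + seed)
    steps.append("dig deep" if dig_deep else "dig shallow")
    steps.append("plant " + seed + " using " + (", ".join(tools) or "bare hands"))
    return steps


print(plant("acorn", "spade", soak=True))
print(plant("bean", dig_deep=True))
```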

Since the @type field allows for arbitrary text, it does not automatically create a 

crossreference link to the specified type, and is not written in fixed-width font by 

default. If you want to create a crossreference link to the type, or to write the type in a 

fixed-width font, then you must use inline markup: 

def ponder(person, time):
    """
    @param person: Who should think.
    @type person: L{Person} or L{Animal}
    @param time: How long they should think.
    @type time: C{int} or C{float}
    """
    #[...]
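As a brief illustration of how these fields combine, the sketch below documents one small function with @param, @type, @return, @rtype and @raise. The function safe_divide and its wording are hypothetical, chosen only for this example; the tags are the ones listed above. The body uses Python 3 syntax.

```python
def safe_divide(numerator, denominator):
    """
    Divide one number by another.

    @param numerator: The value to be divided.
    @type numerator: C{int} or C{float}
    @param denominator: The value to divide by.
    @type denominator: C{int} or C{float}
    @return: The quotient of the two arguments.
    @rtype: C{float}
    @raise ValueError: If C{denominator} is zero.
    """
    # Hypothetical example body; only the docstring layout matters here.
    if denominator == 0:
        raise ValueError("denominator must be non-zero")
    return float(numerator) / denominator
```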

Variables parameters 

@ivar v: ... 

A description of the class instance variable v.

@cvar v: ... 

A description of the static class variable v. 

@var v: ... 

A description of the module variable v. 

@type v: ... 

The type of the variable v. 

These tags are usually put in a module or class docstring. If the sources can be parsed
by Epydoc, it is also possible to document variables in their own docstrings: see
variable docstrings.

Epydoc considers as class variables the ones defined directly in the class body. A
common Python idiom is to create instance variables by setting their default value in the
class instead of in the constructor (which is safe only if the default is immutable).

If you want to force Epydoc to classify as an instance variable one whose default value is
set at class level, you can describe it using the tag @ivar in the context of a variable
docstring:

class B:
    y = 42
    """@ivar: This is an instance variable."""

Properties parameters 

@type: ... 

The type of the property. 

The @type tag can be attached to a property docstring to specify its type.

Grouping and Sorting 

@group g: c1,...,cn 

Organizes a set of related children of a module or class into a group. g is the 

name of the group; and c1,...,cn are the names of the children in the group. To 

define multiple groups, use multiple group fields. 

@sort: c1,...,cn 

Specifies the sort order for the children of a module or class. c1,...,cn are the 

names of the children, in the order in which they should appear. Any children 

that are not included in this list will appear after the children from this list, in 

alphabetical order. 

These tags can be used to present groups of related items in a logical way. They apply
to module and class docstrings.

For the @group and @sort tags, asterisks (*) can be used to specify multiple children at 

once. An asterisk in a child name will match any substring: 

class widget(size, weight, age):
    """
    @group Tools: zip, zap, *_tool
    @group Accessors: get_*
    @sort: get_*, set_*, unpack_*, cut
    """
    #[...]
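To see which children such patterns would pick up, the asterisk rule can be imitated with the standard fnmatch module. This is only an illustration of the matching idea, not Epydoc's own implementation (fnmatch also gives special meaning to ? and [], which the text above does not mention); the helper select_children and the list of child names are hypothetical.

```python
from fnmatch import fnmatchcase

def select_children(names, patterns):
    """Return the names matched by any pattern, in pattern order."""
    selected = []
    for pattern in patterns:
        for name in names:
            if fnmatchcase(name, pattern) and name not in selected:
                selected.append(name)
    return selected

# Hypothetical children of a class
children = ["zip", "zap", "cut_tool", "get_size", "get_age", "set_age", "cut"]

tools = select_children(children, ["zip", "zap", "*_tool"])
accessors = select_children(children, ["get_*"])
print(tools)      # ['zip', 'zap', 'cut_tool']
print(accessors)  # ['get_size', 'get_age']
```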

Note 

group markers 

It is also possible to group a set of related items by enclosing them in special
comments, starting with the group markers '#{' and '#}'. The group title can
be specified after the opening group marker. Example:

#{ Database access functions

def read(id):
    #[...]

def store(item):
    #[...]

def delete(id):
    #[...]

# groups can't be nested, so a closing marker is not required here.

#{ Web publish functions

def get(request):
    #[...]

def post(request):
    #[...]

#}

Notes and Warnings 

@note: ... 

A note about an object. Multiple note fields may be used to list separate notes. 

@attention: ... 

An important note about an object. Multiple attention fields may be used to list 

separate notes. 

@bug: ... 

A description of a bug in an object. Multiple bug fields may be used to report 

separate bugs. 

Note

If any @bug field is used, the HTML writer will generate the page
bug-index.html, containing links to all the items tagged with the
field.

@warning: ... 

A warning about an object. Multiple warning fields may be used to report 

separate warnings. 

Status 

@version: ... 

The current version of an object. 

@todo [ver]: ... 

A planned change to an object. If the optional argument ver is given, then it 

specifies the version for which the change will be made. Multiple todo fields 

may be used if multiple changes are planned. 

Note 

If any @todo field is used, the HTML writer will generate the
page todo-index.html, containing links to all the items tagged
with the field.

@deprecated: ... 

Indicates that an object is deprecated. The body of the field describes the reason
why the object is deprecated.

@since: ... 

The date or version when an object was first introduced. 

@status: ... 

The current status of an object. 

@change: ... 

A change log entry for this object. 

@permission: ... 

The object access permission, for systems such as Zope/Plone that support this
concept. It may be used more than once to specify multiple permissions.

Formal Conditions 

@requires: ... 

A requirement for using an object. Multiple requires fields may be used if an 

object has multiple requirements. 

@precondition: ... 

A condition that must be true before an object is used. Multiple precondition 

fields may be used if an object has multiple preconditions. 

@postcondition: ... 

A condition that is guaranteed to be true after an object is used. Multiple

postcondition fields may be used if an object has multiple postconditions. 

@invariant: ... 

A condition which should always be true for an object. Multiple invariant fields 

may be used if an object has multiple invariants. 

Bibliographic Information 

@author: ... 

The author(s) of an object. Multiple author fields may be used if an object has 

multiple authors. 

@organization: ... 

The organization that created or maintains an object. 

@copyright: ... 

The copyright information for an object. 

@license: ... 

The licensing information for an object. 

@contact: ... 

Contact information for the author or maintainer of a module, class, function, or 

method. Multiple contact fields may be used if an object has multiple contacts. 

Other fields 

@summary: ... 

A summary description for an object. This description overrides the default 

summary (which is constructed from the first sentence of the object's 

description). 

@see: ... 

A description of a related topic. see fields typically use documentation 

crossreference links or external hyperlinks that link to the related topic. 

Fields synonyms 

Several fields have synonyms, or alternate tags. The following table lists all field 

synonyms. Field tags are written using epytext markup; if you are using a different 

markup language, then you should adjust the markup accordingly. 

Name                  Synonyms

@param p: ...         @parameter p: ..., @arg p: ..., @argument p: ...
@return: ...          @returns: ...
@rtype: ...           @returntype: ...
@raise e: ...         @raises e: ..., @except e: ..., @exception e: ...
@keyword p: ...       @kwarg p: ..., @kwparam p: ...
@ivar v: ...          @ivariable v: ...
@cvar v: ...          @cvariable v: ...
@var v: ...           @variable v: ...
@see: ...             @seealso: ...
@warning: ...         @warn: ...
@requires: ...        @require: ..., @requirement: ...
@precondition: ...    @precond: ...
@postcondition: ...   @postcond: ...
@organization: ...    @org: ...
@copyright: ...       @(c): ...
@change: ...          @changed: ...

Module metadata variables 

Some module variables are commonly used as module metadata. Epydoc can use the
value provided by these variables as an alternate form for tags. The following table lists
the recognized variables and the tags they replace. Customized metadata variables can
be added using the method described in Adding New Fields.

Tag Variable 

@author __author__ 

@authors __authors__ 

@contact __contact__ 

@copyright __copyright__ 

@license __license__ 

@deprecated __deprecated__ 

@date __date__ 

@version __version__ 

Adding New Fields 

New fields can be defined for the docstrings in a module using the special @newfield 

tag (or its synonym, @deffield). This tag has the following syntax: 

@newfield tag: label [, plural ] 

Where tag is the new tag that's being defined; label is a string that will be used to mark 

this field in the generated output; and plural is the plural form of label, if different. 

New fields can be defined in any Python module. If they are defined in a package, it 

will be possible to use the newly defined tag from every package submodule. 

Each new field will also define a metadata variable which can be used to set the field 

value instead of the tag. For example, if a revision tag has been defined with: 

@newfield revision: Revision

then it will be possible to set a value for the field using a module variable: 

__revision__ = "1234" 

The following example illustrates how the @newfield can be used:

Docstring Input:

"""
@newfield corpus: Corpus, Corpora
"""

"""
@corpus: Bob's wordlist.
@corpus: The British National Corpus.
"""
[...]

Rendered Output:

Corpora:
Bob's wordlist.
The British National Corpus.

Note

The module-level variable __extra_epydoc_fields__ is deprecated; use
@newfield instead.


Python Docstrings 

Python documentation strings (or docstrings) provide a convenient way of 

associating documentation with Python modules, functions, classes, and methods. 

An object's docstring is defined by including a string constant as the first statement

in the object's definition. For example, the following function defines a docstring: 

def x_intercept(m, b):
    """
    Return the x intercept of the line y=m*x+b. The x intercept of a
    line is the point at which it crosses the x axis (y=0).
    """
    return -b/m

Docstrings can be accessed from the interpreter and from Python programs using 

the "__doc__" attribute: 

>>> print x_intercept.__doc__ 



The pydoc module, which became part of the standard library in Python 2.1, can be 

used to display information about a Python object, including its docstring: 

>>> from pydoc import help 

>>> help(x_intercept) 

Help on function x_intercept in module __main__: 

x_intercept(m, b) 



For more information about Python docstrings, see the Python Tutorial or the 

O'Reilly Network article Python Documentation Tips and Tricks. 
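As a short sketch in Python 3 syntax, the docstring can also be read through the standard inspect module, whose getdoc function returns the text with the source indentation cleaned up. The function is the x_intercept example from above; the variable names raw and clean are chosen for this illustration.

```python
import inspect

def x_intercept(m, b):
    """
    Return the x intercept of the line y=m*x+b.  The x intercept of a
    line is the point at which it crosses the x axis (y=0).
    """
    return -b / m

raw = x_intercept.__doc__            # exactly as written, indentation included
clean = inspect.getdoc(x_intercept)  # indentation and blank edges removed
print(clean)
```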

Variable docstrings 

Python does not directly support docstrings on variables: there is no attribute that can
be attached to variables and retrieved interactively like the __doc__ attribute on
modules, classes and functions.

While the language does not directly provide for them, Epydoc supports variable
docstrings: if a variable assignment statement is immediately followed by a bare
string literal, then that string is treated as a docstring for that variable. In
classes, variable assignments at the class definition level are considered class
variables; and assignments to instance variables in the constructor (__init__) are
considered instance variables:

class A:
    x = 22
    """Docstring for class variable A.x"""

    def __init__(self, a):
        self.y = a
        """Docstring for instance variable A.y"""

Variables may also be documented using comment docstrings. If a variable 

assignment is immediately preceded by a comment whose lines begin with the

special marker '#:', or is followed on the same line by such a comment, then it is 

treated as a docstring for that variable: 

#: docstring for x 

x = 22 

x = 22 #: docstring for x 

Notice that variable docstrings are only available for documentation when the
source code is available for parsing: it is not possible to retrieve variable
docstrings by introspection alone.
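The following sketch shows why parsing is needed. It uses the standard ast module (Python 3.8 or later) to find a string literal that directly follows an assignment, and then shows that the same string is not attached to anything at run time. This is an assumption-laden illustration of the idea, not Epydoc's own parser; the helper variable_docstrings is hypothetical.

```python
import ast

SOURCE = '''
class A:
    x = 22
    """Docstring for class variable A.x"""
'''

def variable_docstrings(source):
    """Map each assigned name to the string literal that directly follows it."""
    found = {}
    for node in ast.walk(ast.parse(source)):
        body = getattr(node, 'body', None)
        if not isinstance(body, list):
            continue
        for stmt, following in zip(body, body[1:]):
            is_assignment = (isinstance(stmt, ast.Assign)
                             and isinstance(stmt.targets[0], ast.Name))
            is_bare_string = (isinstance(following, ast.Expr)
                              and isinstance(following.value, ast.Constant)
                              and isinstance(following.value.value, str))
            if is_assignment and is_bare_string:
                found[stmt.targets[0].id] = following.value.value
    return found

print(variable_docstrings(SOURCE))   # {'x': 'Docstring for class variable A.x'}

namespace = {}
exec(SOURCE, namespace)
print(namespace['A'].__doc__)        # None: the string is discarded at run time
```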

Items visibility 

Any Python object (modules, classes, functions, variables...) can be public or 

private. Usually the object name decides the object visibility: objects whose name 

starts with an underscore and doesn't end with an underscore are considered private. 

All the other objects (including the "magic functions" such as __add__) are public. 

For each module and class, Epydoc generates pages with both public and private 

methods. A Javascript snippet allows you to toggle the visibility of private objects. 

If a module wants to hide some of the objects it contains (either defined in the
module itself or imported from other modules), it can explicitly list its
public names in the __all__ variable.

If a module defines the __all__ variable, Epydoc uses its content to decide if the 

module objects are public or private. 
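The visibility rule described above can be written down as a small predicate. This is a sketch of the rule as stated in the text, not Epydoc's code; the function is_public and its all_names argument (standing in for a module's __all__) are hypothetical.

```python
def is_public(name, all_names=None):
    """Apply the naming rule above; all_names plays the role of __all__."""
    if all_names is not None:
        # An explicit __all__ decides visibility on its own.
        return name in all_names
    # Private: starts with an underscore and does not end with one.
    return not (name.startswith("_") and not name.endswith("_"))

print(is_public("run"))               # True
print(is_public("_helper"))           # False
print(is_public("__add__"))           # True
print(is_public("run", ["other"]))    # False
```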


An extract from 'Scientific Scripting with Python'. Copyright 2008 Drew McCormack. 

Regular Expressions in Python 

Regular expressions are a special syntax for describing textual patterns. If you are familiar with the 

UNIX command line, you will have used a technique known as globbing in order to match file and 

directory names. For example, the UNIX command 

ls *.py 

matches all files in the current working directory that end with the .py extension. The wild card 

character (*) matches any number of characters, so the whole pattern '*.py' matches all names 

that end in '.py'. 

It is easy to confuse regular expressions with globbing, because both provide a means of matching 

textual patterns, but the syntax of the two is quite different, so try to keep them separate in your 

mind — globbing and regular expressions are not the same thing. 

There are two basic operations that regular expressions are used for: searching and matching. 

Searching involves moving through a string to locate a sub-string that matches a given pattern, 

and matching involves testing a string to see if it conforms to a pattern. 

To illustrate the difference, imagine first that you are trying to locate a number in a line of text. This 

requires a search, because you do not require the line to conform to a particular pattern; instead, 

you want to scan through the line looking for a particular pattern of digits. 

Now consider that you want to verify that a particular string conforms to some predefined format,
for example 'HI454NNN'. You want to confirm that the text begins with two
letters, is followed by some digits, and finishes with three letters. This is an example of matching:
you want to see if the string matches a given pattern.
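The two operations can be sketched with the re functions introduced in the following paragraphs. The sample strings and the pattern for the 'HI454NNN' format are assumptions made for this illustration, and the code uses Python 3 syntax.

```python
import re

# Searching: find a run of digits anywhere in the line.
found = re.search(r'\d+', 'The sample weighed 42 grams')
print(found.group(0))                            # 42

# Matching: test whether the whole string has the required shape
# (two letters, some digits, three letters).
code_format = r'[A-Za-z]{2}\d+[A-Za-z]{3}$'
print(bool(re.match(code_format, 'HI454NNN')))   # True
print(bool(re.match(code_format, 'HI454NN')))    # False
```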

Python regular expressions are handled by the re package. After you have imported it, you have 

access to functions for searching (search, findall), matching (match), substituting strings 

(sub), and splitting strings (split). We will address each of these in due course, but first you 

need to know the basics of the regular expression syntax. We'll begin with a table of the most 

important pattern matching characters. 

Characters  Description                                                                          Example

*           Matches zero or more of the preceding expression.                                    a* matches '', 'a', 'aa', 'aaa', etc
+           Matches one or more of the preceding expression.                                     a+ matches 'a', 'aa', 'aaa', etc
.           Matches any single character, except the new line.                                   . matches 'a', 'b', '2', '(' etc
            (You can change this behavior by passing re.DOTALL.)
?           Matches zero or one of the preceding expression.                                     a? matches '' or 'a'
$           Matches the end of a string, or just before a new line.
^           Matches the start of a string, or just after a new line.
{m}         Matches exactly m instances of the preceding expression.                             a{2} matches only 'aa'
[]          Matches any character, or character set (eg, \d), that appears in the brackets.      [a3t_] matches 'a', '3', 't', or '_'
|           Matches if either the preceding expression, or the expression that follows, matches. a|b matches 'a' or 'b'
\b          Matches the start or end of a word.
\s          Matches a whitespace character, including new lines and tabs.
\d          Matches any digit, ie, 0–9.
\w          Matches any alphanumeric character, or the underscore.
\n          Matches a new line character.
()          Group together terms in a single expression.

There are many more regular expression operators, but you can go a long way with just those 

listed in the table. We will now consider them in more detail. 

One of the most used regular expression characters is +; it matches one or more instances of an 

expression. Let's take a look at an example that makes use of this regular expression character in 

the match function: 

>>> import re
>>> re.match('a+', 'aaa')
<_sre.SRE_Match object at 0x...>

The first argument to the match function is the regular expression, and the second is the string to 

be checked for matching. If the regular expression matches at the start of the string, a Match 

object will be returned; otherwise, None will be returned. In this example, the expression a+ 

matches one or more of the letter 'a', so a Match object gets returned. 

The Match object contains information about the range of characters in the string that matched. 

Most of the time you don't need this detail, and it is only important to know whether there was a 

match or not. In such cases, you can use an if statement to check for a match.

>>> if re.match('a+', 'aaa'):
...     print 'It matched!'
...
It matched!

To show that the regular expression need only match at the start of the string, consider this:

>>> re.match('a+', 'aaabbb')
<_sre.SRE_Match object at 0x...>

The output shows that 'aaabbb' is also a match, even though the regular expression does not 

match the letter 'b'. 'bbbaaa', on the other hand, does not match, because the pattern does not 

match at the start of the string. 

>>> print re.match('a+', 'bbbaaa') 

None 
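The range information held by the Match object can be inspected with its span and group methods. The following is a small sketch in Python 3 syntax; the variable name at_start is chosen for this illustration.

```python
import re

at_start = re.match('a+', 'aaabbb')
print(at_start.span())            # (0, 3): only the leading 'aaa' matched
print(at_start.group(0))          # aaa
print(re.match('a+', 'bbbaaa'))   # None: the pattern must match at the start
```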

The search function is similar to match, but the match can occur anywhere in the string. Using 

the same regular expression and string as in the preceding example, the search function returns 

a Match object, corresponding to the first substring that matches the pattern. 

>>> print re.search('a+', 'bbbaaa')
<_sre.SRE_Match object at 0x...>

These simple examples could give you the idea that regular expressions are as primitive as 

command line globbing, but nothing could be further from the truth. Regular expressions are very 

powerful, and much of their power comes from the way you can combine operators into complex 

pattern matching expressions. For example, suppose you were searching a file for a line of text like 

this 

Wavelength (cm-1) :: 734.45 

A pattern that matches such a line is 

\s*Wavelength.*::\s*[\d\.]+ 

Let's dissect this to try to understand it. The pattern begins with \s*. The \s matches a 

whitespace character, and the * matches zero or more of the preceding expression. Taken 

together, the expressions match zero or more whitespace characters. 

Next in the pattern is the text Wavelength. You can enter literal text like this in a regular 

expression. It will only match if exactly the same text is found in the string. 

The character combination .* follows. This is similar to the \s* above, except that it matches zero 

or more of any character, not just whitespace characters. 

Next we have the literal text ::, followed again by \s*, which — as we now know — matches zero 

or more whitespace characters. 

Lastly, consider the expression [\d\.]+. Square brackets match any one of the characters they
enclose. We wish to match positive real numbers, so we could use a pattern like
[0123456789] to match any digit, but Python gives us some abbreviations for this. You can use
ranges, like [0-9], but you can also use \d, which matches any digit.

That covers the digits, but what about the decimal point? The period character has special 

meaning in regular expressions — as we have already seen — so you have to escape that

meaning by using a backslash, similar to how you use a backslash to escape special meaning of 

characters in strings. With this in mind, [\d\.] will match any digit, or a decimal point. [\d\.]+ 

thus matches one or more digits and/or decimal points, which are the characters that make up a 

real number. 

Putting this all together, the regular expression would thus read something like this in English:

A string that begins with zero or more whitespace characters, followed by the text 'Wavelength', 

followed by zero or more characters of any type, followed by two colons and zero or more whitespace 

characters, and concluding with one or more digits and/or periods. 

It's a mouthful, but hopefully this gives you some insight into how regular expressions work. Once 

you understand the strange syntax, you should realize they are just a language for describing 

textual patterns. 
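A quick way to check such a reading is to try the pattern on a few lines. The sketch below (Python 3 syntax) uses the pattern from the text; the second and third test strings are made up for this illustration. The last case shows a limitation that follows from the description: [\d\.]+ accepts any mix of digits and periods, not only well-formed numbers.

```python
import re

pattern = r'\s*Wavelength.*::\s*[\d\.]+'

print(bool(re.match(pattern, 'Wavelength (cm-1) :: 734.45')))   # True
print(bool(re.match(pattern, 'Frequency (Hz) :: 12.5')))        # False
# Any mix of digits and periods is accepted, so this also matches:
print(bool(re.match(pattern, 'Wavelength :: 1.2.3')))           # True
```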

It's useful to be able to search and match patterns of text, because it allows you to locate the 

proverbial needle in a haystack, but when you have found that elusive line of text, you still need to 

extract the values you are interested in. Regular expressions have a means of doing that too:
groups.

A group is nothing more than a section of a regular expression that is enclosed in parentheses. 

When the expression matches, the value matched by the group will be stored for later retrieval. 

To demonstrate this, we'll return to the previous example, and modify the regular expression to use 

groups to retrieve the numeric value from the data. 

\s*Wavelength.*::\s*([\d\.]+) 

The only difference between this regular expression, and the previous, is the parentheses around 

the part of the expression that matches digits and periods, ie, the part designed to match the real 

number. With this small change, whenever a match occurs, the sub-string that matches the pattern 

in the parentheses will be stored so that we can retrieve it afterwards. Here's how: the Match 

object returned by functions like match and search includes a method called group; if you pass 

an index corresponding to a group, it will return the string that matched. 

>>> r = r'\s*Wavelength.*::\s*([\d\.]+)'

>>> s = ' Wavelength (cm-1) :: 734.45' 

>>> match = re.match(r, s) 

>>> match.group(0) 

' Wavelength (cm-1) :: 734.45' 

>>> match.group(1) 

'734.45' 

Note that the group with index 0 is the part of the string that matched the whole regular expression. 

Thereafter, the indexes correspond to the order of groups in the regular expression. In this 

example, group number 1 holds the value we are interested in. 
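Since a group always holds a string, a typical next step is to convert it to a number. The following is a minimal sketch in Python 3 syntax, assuming the same line and pattern as above; the variable names are chosen for this illustration.

```python
import re

line = ' Wavelength (cm-1) :: 734.45'
m = re.match(r'\s*Wavelength.*::\s*([\d\.]+)', line)
if m:
    print(m.groups())                # ('734.45',): all groups as a tuple
    wavelength = float(m.group(1))   # captured text is a string; convert it
    print(wavelength)                # 734.45
```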

A regular expression can have as many groups as you like, and they can even be embedded. 

Consider the following data by way of example: 

X 2.45 -3454.4443 

Here is an expression that will match the line, and extract the label and numerical values. 

(\w+)\s+((\+|-)?\d*\.?\d*)\s+((\+|-)?\d*\.?\d*)

This is quite a convoluted expression, so let's rewrite it in verbose mode.

(\w+)               # Match and store the label
\s+                 # Skip whitespace
((\+|-)?\d*\.?\d*)  # Match a real number, with optional sign. Store group
\s+                 # More whitespace
((\+|-)?\d*\.?\d*)  # Another real number

Verbose mode allows you to spread out your regular expression, and use comments and 

whitespace to make it more legible. All whitespace and comments are ignored, unless explicitly 

escaped with a backslash. 

Here is how you use a verbose regular expression: 

import re 

from string import rjust 

# Setup regular expression string. 

# Use a raw string to prevent any substitutions. 

regEx = r"""
    (\w+)                 # Match and store the label
    \s+                   # Skip whitespace
    ((\+|-)?(\d*)\.?\d*)  # Match a real number, with optional sign. Store group
    \s+                   # More whitespace
    ((\+|-)?(\d*)\.?\d*)  # Another real number
"""

# Call function with verbose flag 

m = re.match(regEx, 'X 2.45 -3454.4443', re.VERBOSE) 

# Print results from groups 

print rjust('Label:', 20), m.group(1) 

print rjust('First Value:', 20), m.group(2) 

print rjust('Sign:', 20), m.group(3) 

print rjust('Integer part:', 20), m.group(4) 

print rjust('Second Value:', 20), m.group(5) 

print rjust('Sign:', 20), m.group(6) 

print rjust('Integer part:', 20), m.group(7) 

This script prints out 

Label: X 

First Value: 2.45 

Sign: None 

Integer part: 2 

Second Value: -3454.4443 

Sign: - 

Integer part: 3454 

To use the verbose mode, you pass an extra argument, and set it equal to the VERBOSE variable 

from the re module. 

A regular expression may contain many groups, and they can even be embedded within one 

another. Given this fact, how do you know which group corresponds to which index in the Match 

object? The easiest way to establish the index of a group is to count the opening parentheses: the 

group at index 1 is the one with the first opening parenthesis when reading from left-to-right; the 

group with index 2 is the one with the next opening parenthesis, and so forth.

To make this discussion more concrete, take a look at the various print statements, and try to 

match the group index for each with the value printed. In particular, note how various aspects of 

the numeric values can be extracted by careful embedding of groups, including the complete 

number, its sign, and the integer part of the real number. 

The numeric values are each matched by an expression that looks like this 

((\+|-)?(\d*)\.?\d*) 

Each one has three groups in all. The first encloses the whole expression, and will thus take on the 

value of the complete number. The second, (\+|-), matches either the plus symbol — which 

must be escaped due to its special meaning in regular expressions — or the negative symbol. If a 

sign is given in the string, its value will end up in the corresponding group; if no sign is given, the 

group will get the value None. The last group matches the integer value of the number, which 

appears before the decimal point. 
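The numbering rule can be checked by listing every group with its index. The sketch below is in Python 3 syntax and uses the same expression and data as the script above, written on one line; it prints the groups in the order of their opening parentheses.

```python
import re

pattern = r'(\w+)\s+((\+|-)?(\d*)\.?\d*)\s+((\+|-)?(\d*)\.?\d*)'
m = re.match(pattern, 'X 2.45 -3454.4443')

# groups() returns groups 1..n, ordered by opening parenthesis.
for index, value in enumerate(m.groups(), start=1):
    print(index, value)
```

Groups 2 and 5 hold the complete numbers, 3 and 6 the optional signs (None when absent), and 4 and 7 the integer parts.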

This is an advanced example which hopefully conveys just how powerful regular expressions can 

be. A single expression can be used to carve up a textual string, extracting any useful information, 

and storing it in groups for later use. 

We saw above that the regular expression functions support an optional third argument, which can 

be used to pass in flags like re.VERBOSE. These flags are particularly useful if you want to match 

strings that extend over several lines. Consider this data, for example: 

Irreducible Representations, including subspecies 

------------------------------------------------- 

S 

P:x P:y P:z 

D:z2 D:x2-y2 D:xy D:xz D:yz 

F:z3 F:z F:xyz F:z2x F:z2y F:x F:y 

Configuration of Valence Electrons 

================================== 

Occupation Numbers 

------------------------------------------------- 

S 1 

P 0 

D 0 

F 0 

------------------------------------------------- 

Now suppose you are interested in extracting the block of text that begins after the horizontal rule 

following 'Occupation Numbers'. This is clearly a multi-line piece of string. Here is how you could 

do it. 

import re, sys 

m = re.search(r'Occupation Numbers\s*-*(.*?)-', sys.stdin.read(), 

re.MULTILINE | re.DOTALL) 

print m.group(1) 

When run, and passed the data above via standard input, this script produces 

S 1

P 0 

D 0 

F 0 

There are a number of aspects of this short script that warrant discussion. First, a number of flags 

are passed via the third argument to the search function. You can pass multiple flags by 

combining them with the | operator. The re.MULTILINE option causes the ^ and $ operators to 

match wherever new line characters are found, rather than just at the start and end of the string. 

This isn't strictly necessary in this particular case, because neither of these characters appears in

the regular expression. But should the expression be altered in the future, the multiline behavior 

would be desirable, so it has been included anyway. 

The re.DOTALL flag is more important: it causes the . operator, which is a single period, to match 

all characters including the new line. Usually, the . operator does not match the new line 

character, but in multiline matching it is useful for the new line to be treated just like any other 

character. 

The regular expression also has some interesting aspects to it. 

Occupation Numbers\s*-*(.*?)- 

It begins with the text 'Occupation Numbers', followed by some whitespace (\s*) and zero or more 

hyphens (-*). Together, these expressions form a 'landmark': it is quite common when scripting to

extract a small section of data from a large file. A way to do this is to look for a unique sequence of 

characters just before and just after the section of interest. These allow you to anchor your regular 

expression, and extract the desired text. 

The terminal landmark in this case is the line of hyphens under the section of interest. A single 

hyphen has been included at the end of the regular expression, because once that has been 

found, we know that the block of text is finished. 

A group has been used to capture the section of text we are interested in. It looks like this 

(.*?) 

As we have already seen, .* matches zero or more characters, but what role is the ? playing in 

this case? The question mark actually modifies the behavior of the *, causing it to become non-greedy.
Regular expressions usually try to match as much as possible; they are said to be

greedy. If you want them to match the minimum possible, you need to make them non-greedy by 

using the ? character. 

What would happen if you didn't use the non-greedy operator in this case? .* will match zero or 

more of any character, since we are using the re.DOTALL flag, so it would simply match 

everything to the end of the string, including the line of hyphens, and anything else that might 

appear afterwards. This is clearly not the behavior we are looking for. We want to match as few 

characters as possible to get to the first of the trailing hyphens, and the non-greedy operator helps 

achieve this. 
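The difference is easy to see on a shortened version of the data. In this sketch (Python 3 syntax) the string text is a made-up stand-in for the file contents; the two patterns differ only in the question mark.

```python
import re

# A made-up, shortened stand-in for the data above
text = 'Occupation Numbers\n---\nS 1\nP 0\n---\nmore text\n---\n'

lazy = re.search(r'Occupation Numbers\s*-*(.*?)-', text, re.DOTALL)
greedy = re.search(r'Occupation Numbers\s*-*(.*)-', text, re.DOTALL)

print(repr(lazy.group(1)))    # '\nS 1\nP 0\n'
print(repr(greedy.group(1)))  # runs on to the last hyphen in the text
```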

Thus far, we have looked at regular expressions that get used once, and then discarded. 

Sometimes you will need to use a regular expression repeatedly. To improve performance, and to avoid the
need to duplicate the regular expression text, it is possible to compile an expression and store it in
a regular expression object. Compiling involves taking the string representation of the regular
expression, and converting that into an internal form that can be applied much faster.

Here is an example of compiling and applying a regular expression object. 

import re 

from string import ljust 

dateEx = re.compile(r'''
    ^([A-Z][a-z]{2})  # Match a month (eg Jan, Feb)
    \s+               # Match whitespace
    (\d{1,2})         # Match date (eg 1, 2, 10)
    ,\s*              # Match comma, and optional whitespace
    (\d{4})$          # Match year (eg 1999, 2008)
''', re.VERBOSE)

dates = ['Jan 23, 1999', 'jan 23, 1999', '23 Jan, 1999', 'Jan 23, 99']

for l in dates:
    m = dateEx.match(l)
    print 40*'-'
    print l
    if m:
        print 'Correct Date Format'
        print ljust('Month',20), m.group(1)
        print ljust('Day',20), m.group(2)
        print ljust('Year',20), m.group(3)
    else:
        print 'Incorrect Date Format'

In this example, which checks the validity of a number of date strings, rather than passing the 

regular expression string directly to the match function, the compile function is used to create a 

regular expression object. compile takes both the regular expression string, and the flags (eg 

VERBOSE), as arguments. The script then calls the match method of the object, rather than the 

match function, to apply the regular expression to a given string. 

The output of the script is 

----------------------------------------
Jan 23, 1999
Correct Date Format
Month                Jan
Day                  23
Year                 1999
----------------------------------------
jan 23, 1999
Incorrect Date Format
----------------------------------------
23 Jan, 1999
Incorrect Date Format
----------------------------------------
Jan 23, 99
Incorrect Date Format


It identifies the first date as being correctly formatted, and extracts strings for the month, day, and 

year. The other dates are all incorrectly formatted. 
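A compiled object is convenient to wrap in a helper that is called many times. The sketch below, in Python 3 syntax, assumes the same date format as the script above; the names date_ex and parse_date are hypothetical.

```python
import re

date_ex = re.compile(r'''
    ^([A-Z][a-z]{2})   # month abbreviation, such as Jan
    \s+                # whitespace
    (\d{1,2})          # day of the month
    ,\s*               # comma and optional whitespace
    (\d{4})$           # four-digit year
''', re.VERBOSE)

def parse_date(text):
    """Return (month, day, year) as strings, or None if the format is wrong."""
    m = date_ex.match(text)
    return m.groups() if m else None

print(parse_date('Jan 23, 1999'))   # ('Jan', '23', '1999')
print(parse_date('Jan 23, 99'))     # None
```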

(A small aside: the horizontal rules are generated by passing 40*'-' to the print command. In
Python, you can do such a 'multiplication' to repeat strings, in this case generating 40 hyphens.)

You can do a lot just with the search and match functions/methods, but there are a few other 

very useful functions in the re package. The first is split, which is similar to the string module 

split function, but more powerful. You use it to split up a string into components. For example, 

take this string: 

XXX,36346, 6633.334, -1 

This may seem trivial enough, but the string module's split function would have trouble,

because it can only work with either whitespace-delimited components, or components separated 

by a constant string. In this case, each component is separated by a comma and zero or more 

spaces. 

With the split function from the re module, you can use a regular expression to define the 

separator, like this 

>>> re.split(r',\s*', 'XXX,36346, 6633.334, -1') 

['XXX', '36346', '6633.334', '-1'] 

The first argument is the regular expression, in this case matching a comma followed by zero or 

more whitespace characters. The second argument is the string to be split. The result is a list of 

the string components, just as you get when using string.split. 
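The difference from a constant-separator split is easy to see side by side. A small sketch for Python 3, using the same sample string; the variable names are chosen only for illustration.

```python
import re

line = 'XXX,36346, 6633.334, -1'

# A constant separator leaves the stray spaces attached to the components.
plain = line.split(',')

# A pattern treats "comma plus any following whitespace" as one separator.
clean = re.split(r',\s*', line)

print(plain)
print(clean)
```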

The search function allows you to locate a single sub-string matching a given regular expression, 

but what if you want to locate many such sub-strings? You could apply the search function 

repeatedly, each time passing in what remains of the string to be searched, but this is a bit clumsy. 

A better solution is to use the findall function, which locates all non-overlapping matches, and 

returns them in a list. 

import re

data = """
Coordinates
H 3.234 34.3 55.
O 3.234 14.3 12.
Zn 3.234 34.2 55.2
Other
Sn 3.234 34.2 55.2
Pd -3.23 34.2 55.2
"""

numPattern = r'\s+([\+\-]?\d*\.?\d*)'  # Matches a real number with leading space

regEx = re.compile(r'''
    ^\s*            # Skip whitespace at start of line
    [A-Z][a-z]?     # Match a chemical symbol
    %s%s%s          # Three numbers, each preceded by whitespace
    \s*$            # Optional whitespace, and end of line
    ''' % (numPattern, numPattern, numPattern),
    re.VERBOSE | re.MULTILINE)

for m in regEx.findall(data):
    print m

The output of this script is 

('3.234', '34.3', '55.')

('3.234', '14.3', '12.') 

('3.234', '34.2', '55.2') 

('3.234', '34.2', '55.2') 

('-3.23', '34.2', '55.2') 

The script is designed to extract three coordinate values from any line in the data that matches a 

particular format, beginning with a chemical element symbol, and followed by three real numbers. 

The regular expression is quite involved. Note how it has been simplified somewhat by extracting 

the real number pattern — which gets repeated — into a variable, and using string substitution to 

form the regular expression. Without this the expression would be less readable and maintainable. 

Consider doing this in your own scripts: reduce complexity by moving parts of your regular 

expressions into variables, and using string operators to combine them into a single string. 

The return value of findall is a list. If there are multiple groups in the regular expression, such 

as is the case here, each entry in the list will be a tuple corresponding to a particular match, and 

each will contain the groups for the match. In the example above, each entry in the list is a tuple 

containing three strings, corresponding to the three coordinate values of the atoms in the data. 
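How the shape of the list returned by findall depends on the number of groups can be checked with a short experiment. This is a sketch for Python 3; the two-line sample text and the patterns are made up for the illustration.

```python
import re

text = 'H 3.234 34.3\nO 1.0 2.5'

# No groups: each entry is the whole matched string.
no_groups = re.findall(r'\d+\.\d+', text)

# One group: each entry is the string captured by that group.
one_group = re.findall(r'^([A-Z][a-z]?)\s', text, re.MULTILINE)

# Several groups: each entry is a tuple with one string per group.
two_groups = re.findall(r'^([A-Z][a-z]?)\s+(\d+\.\d+)', text, re.MULTILINE)

print(no_groups)
print(one_group)
print(two_groups)
```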

The last function that we will cover is sub. This allows you to search for, and replace, sequences of 

characters that match a given regular expression. For instance, imagine you have a program 

which has many labels of the form MT..., such as MTWaveFunction and MTOptimizer. You 

wish to replace the MT in each label with TM. How can you do this swiftly and safely? 

With the sub function, you can identify labels of the correct form, and transform them, like so 

import re 

code = """ 

waveFunc = MTWaveFunction() 

waveFunc += 5.0 

opt = MTOptimizer(waveFunc) 

""" 

print re.sub(r'\bMT(\w+)\b', r'TM\1', code) 

The output is 

waveFunc = TMWaveFunction() 

waveFunc += 5.0 

opt = TMOptimizer(waveFunc) 

The sub function takes three arguments: the regular expression to match, the string to replace each match 

with, and the string to search through and modify. The regular expression in this example is quite 

straightforward: 

\bMT(\w+)\b 

This matches a word boundary (\b), followed by the letters MT, followed by one or more 

alphanumeric or underscore characters, and finishing with another word boundary (\b). This 

describes the labels we are trying to transform.

Parentheses have been added around the pattern that matches the second half of the label, 

following the MT prefix. This creates a group which stores the matched sub-string. The reason for 

doing this is that you can access any groups matched in the regular expression from within the 

substitution string. To do this, you simply supply a backslash, followed by the index of the group. 

In the example above the substitution string is TM\1, which means 'replace the matched string with 

TM followed by the first group from the match'. The first group from the match was the text that 

followed MT, so the net effect is to swap TM for MT. 

This is quite a simple example of substitution, but you can do some very powerful manipulations 

using regular expressions, and some astute use of grouping. 
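As one illustration of what grouping allows, two groups can be captured and written back in the opposite order. A minimal sketch for Python 3; the "Surname, Firstname" input is invented for the example.

```python
import re

names = 'Lovelace, Ada\nHopper, Grace'

# Group 1 captures the surname, group 2 the first name;
# the replacement string emits them in reverse order.
swapped = re.sub(r'(\w+), (\w+)', r'\2 \1', names)

print(swapped)
```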

We will finish off this section on regular expressions with a warning, best encapsulated in the 

following quote: 

Some people, when confronted with a problem, think "I know, I'll use regular expressions." Now they 

have two problems. 

Jamie Zawinski 

Regular expressions are powerful, but they are not suitable for every situation. Not only that, they 

can be difficult to write — even for experienced developers — and even more difficult to read. By 

all means use them, but don't use them where a better solution already exists. (For example, don't 

parse XML documents with regular expressions. Use a specialized XML parser, as described in the 

next sub-section.) 
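To give a flavour of the alternative, here is a sketch that reads attributes with the xml.etree.ElementTree module from the standard library instead of a regular expression. It is written for Python 3, and the element and attribute names are invented for the illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical document; the tag and attribute names are assumptions.
doc = '<atoms><atom symbol="H" x="3.234"/><atom symbol="O" x="1.5"/></atoms>'

root = ET.fromstring(doc)

# The parser handles quoting, nesting and attribute order itself.
symbols = [atom.get('symbol') for atom in root.findall('atom')]
x_values = [float(atom.get('x')) for atom in root.findall('atom')]

print(symbols)
print(x_values)
```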

Exercise: Getting regular 

Come up with a regular expression to match the following date format: 

17/05/2009 8:15 

Test your regular expression using the re.match function on the string above. 

Exercise: Groupies 

Introduce groups in the regular expression from the script in the previous exercise, to extract the hour 

of the day, and the minutes. Use these values to calculate how many seconds have passed since 

midnight, and print out the answer. 

Exercise: Needle in haystack 

Consider the following data: 

*********************** 

* T E C H N I C A L * 

***********************

============================================================= 

P A R A L L E L I Z A T I O N and V E C T O R I Z A T I O N 

============================================================= 

Nr of parallel processes: 1 

Internal max. (compile-time) nr of processes: 8 

Maximum vector length in NumInt loops: 128 

=============== 

I O vs. C P U *** (store numerical data on disk or recalculate) *** 

=============== 

Basis functions: recalculate when needed 

Fit functions: recalculate when needed 

IO buffersize (Mb): 64.000000 

===================== 

S C F U P D A T E S 

===================== 

Max. nr. of cycles: 100 

Convergence criterion: 0.0000000100 

secondary criterion: 0.0000000100 

Mix parameter (when DIIS does not apply): 0.2000000000 

Special mix parameter for the first cycle: 1.0000000000 

Write a script that uses regular expressions to extract and print the number given for the 'IO 

buffersize'. 

Exercise: Pick up sticks 

Write a script that uses the re.findall function to match and extract data values from lines of the 

following form: 

XY19 : 23.4 -234.0 9854.0, 645.345 34453 34.3 b=b1 

Your script should extract the label at the beginning, each of the numbers before the comma, each of 

the numbers after the comma, and the string on the right of the = sign (b1 in above example). Test 

your script on this data: 

XY19 : 23.4 -234.0 9854.0, 645.345 34453 34.3 b=b1 

XY19 : 23.4 -234.0 9854.0, 645.345 34453 34.3 b=b1 

XY19 

Elevation 

--------------- 

YY19 : 2.4 -234.0 984.0, 645.345 3445 34. b=b3 

XY20 : 3.4 -24.0 9854.0, 65.345 3453 34.3 b=a1 

---- 

Print out the extracted values, and confirm that they are correct.

Exercise: The splits 

Use the re.split function with an appropriately formed regular expression to extract the numbers 

from the following line of data: 

45, 3453 : 19, -1.e-10 

Your script should be able to handle the case that any of the numbers are in exponential form (such as 

the last number shown above). 

Exercise: No substitute for practice 

Imagine you have a script that names variables with a leading underscore, like this: _someVar. You 

decide you want to remove the leading underscore, and use a trailing underscore instead, like this: 

someVar_. Write a short script that uses the re.sub function to achieve this transformation. 

Come up with a small amount of trial data, and test your script on it.


Unit Testing in Python 

This page is a short tutorial on unit testing in Python, using the PyUnit module 

that ships with Python. I assume the reader is familiar with xUnit test 

frameworks in general, for example JUnit for Java and NUnit for .NET. I also 

assume the reader is a new Python programmer (which I am), so I will explain 

Python concepts more than I will explain xUnit concepts. And finally, I assume 

that your primary programming languages are C# and Ruby. I have based 

these reader assumptions on myself because I also assume I will be the 

primary reader of this page. 

There is an introduction to the Python Unit Testing Framework that I tried to 

read, but I found it hard to follow because it is a depth-first exposition. You 

have to read all about setup and teardown methods, testcase classes with 

several test methods, aggregating tests into test suites, nesting test suites, 

and a discussion of how to organize large bodies of test code before you can 

run the simplest test. I got frustrated because I just wanted to know the 

simplest thing I could do to test my code. That is why I am writing my own 

tutorial. 

A Unit to Test 

We need to have an example to test. Wanting to keep this demo as simple as 

possible, I have decided to specify some really simple requirements. 

1. You must write a class named ClassUnderTest. 

2. This class must provide a method named krajik. 

3. The krajik method must accept a number and return a number twice 

the given number. 

Importing the Unit Test Module 

Before you can use the types in the unit test module, you have to import it. 

Here is a simple way to import it: 

import unittest 

The Testcase Class 

To write a testcase class, write a class that derives from the TestCase class of 

the unittest module: 

class CheckCUT(unittest.TestCase):
    def runTest(self):
        # test procedure goes here...
        pass

On the class statement, the base class (or classes) goes in parentheses after 

the name of the class. I say "classes" here because Python supports multiple 

inheritance. I have never tried using it, though. 

The class overrides the runTest method.

Asserting 

Write the usual four-phase unit test: Setup, Exercise, Verify, and Teardown. In 

this test, the teardown is done automatically by the garbage collector: 

class CheckCUT(unittest.TestCase):
    def runTest(self):
        cut = ClassUnderTest()
        actual = cut.krajik(17)
        expected = 34
        assert expected == actual, 'you are screwed'

I just copied the assert statement from the site mentioned earlier and edited 

it for this scenario. I do not understand enough Python to really understand its 

syntax. The referenced page says this: 

Note that in order to test something, we just use the built-in 'assert' statement of 

Python. If the assertion fails when the test case runs, an AssertionError will be raised, 

and the testing framework will identify the test case as a 'failure'. 

Running the Test 

To run the test, you have to construct an object of the CheckCUT class, 

construct a TextTestRunner, and finally ask the runner to run your test case. 

Like this: 

testCase = CheckCUT() 

runner = unittest.TextTestRunner() 

runner.run(testCase) 

Then you just invoke the script from the command line. Here is what you get 

when you run what we have so far: 

E:\PyUnit>ut 

E 

====================================================================== 

ERROR: runTest (__main__.CheckCUT) 

---------------------------------------------------------------------- 

Traceback (most recent call last): 

File "E:\PyUnit\ut.py", line 16, in runTest 

cut = ClassUnderTest(); 

NameError: global name 'ClassUnderTest' is not defined 

---------------------------------------------------------------------- 

Ran 1 test in 0.001s 

FAILED (errors=1) 

Of course it failed because we have not yet written the implementation. 

Write the Code 

Now let us see if we can fix the "not defined" error. First, let us write a class 

that contains the method to be sure it is declared correctly. And here it is: 

class ClassUnderTest:
    def krajik(self, foo):
        return 0

In some ways Python is a very clean language in terms of not having a lot of 

needless punctuation. For example, notice the refreshing lack of braces and 

semicolons. The block structure of the code is indicated strictly by its level of 

indentation. However, like all languages, it has its idiosyncrasies. In the case 

of Python, notice the gratuitous colons at the ends of the class and def lines. 

Those colons also show up at the ends of if and while statements. In fact, the 

basic use of a colon is to say redundantly "the next line should be indented." 

So anyway, the Python evangelists who wax eloquent about their favorite 

lovely language, tend to overlook this little detail. 

Enough whingeing for the moment. The class statement contains a method 

definition, which is indicated by the def keyword and another colon. Notice 

that you must always remember to put in a self parameter as the first 

argument. I guess that is a replacement for leaving out a static keyword for 

instance methods. 

Like Ruby, you do not need to declare variables before you assign to them. In 

fact, the first assignment that is executed for a given variable is a combination 

of setting a value and declaring the variable. In the case of a method 

parameter, it just acts like it is an assignment from the value of the argument 

that was used to call the method. 
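A small experiment makes the role of self and of first assignment concrete. This is a sketch for Python 3; the Greeter class and its greet method are hypothetical and are not part of the requirements above.

```python
class Greeter:
    def greet(self, name):
        message = 'Hello, ' + name  # first assignment creates the local variable
        return message


g = Greeter()
via_instance = g.greet('world')        # Python supplies g as self
via_class = Greeter.greet(g, 'world')  # the same call with self passed explicitly

print(via_instance)
print(via_instance == via_class)
```

Calling the method through the class shows that self is an ordinary first parameter that Python fills in when the call goes through an instance.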

Run the Test Again 

Now when we run the test we get this result: 

E:\PyUnit>ut 

F 

====================================================================== 

FAIL: runTest (__main__.CheckCUT) 

---------------------------------------------------------------------- 

Traceback (most recent call last): 

File "E:\PyUnit\ut.py", line 19, in runTest 

assert expected == actual, 'you are screwed' 

AssertionError: you are screwed 

---------------------------------------------------------------------- 


FAILED (failures=1) 

We have fixed the problem with the class being undefined, but we still have a 

virtual "red bar", because we have not implemented the method correctly. 

In other xUnit test frameworks, the message would have been more like this: 

you are screwed: Expected 34 but got 0 

I think I prefer the version that tells you what the expected and actual values 

are. I guess PyUnit is not really ready for prime time yet. 

Fix the Implementation 

Now let us fix the error and make the test pass: 

class ClassUnderTest:
    def krajik(self, foo):
        return foo * 2

Green Bar 

Now when we run the test, it passes. This is what we see: 

E:\PyUnit>ut 

. 

---------------------------------------------------------------------- 


OK 
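Assembled into one file, the pieces from the sections above look roughly like this. The sketch targets Python 3; sending the runner's report to an io.StringIO is an addition made here so the result can be checked, and is not something the original script does.

```python
import io
import unittest


class ClassUnderTest:
    def krajik(self, foo):
        return foo * 2


class CheckCUT(unittest.TestCase):
    def runTest(self):
        cut = ClassUnderTest()        # setup
        actual = cut.krajik(17)       # exercise
        expected = 34
        assert expected == actual, 'you are screwed'  # verify


testCase = CheckCUT()
stream = io.StringIO()  # assumption: keep the report in memory
runner = unittest.TextTestRunner(stream=stream)
result = runner.run(testCase)
print(stream.getvalue())
```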

Summary 

This tutorial has shown you the very simplest thing you can do to use the 

PyUnit framework without bogging you down with a lot of details that would 

only distract you at the beginning. Naturally, you will want to learn about 

those details after you get this much working. To learn those details, you 

could go to this site: 

Python Unit Testing Framework. 

Last updated August 14, 2010

Notes from 

Well House 

Consultants 

These notes are written by Well House Consultants and distributed 

under their Open Training Notes License. If a copy of this license is not 

supplied at the end of these notes, please visit 

http://www.wellho.net/net/whcotnl.html 

for details. 

Well House Consultants Samples Notes from Well House Consultants 1 

1

Q110 

1.1 Well House Consultants 

Well House Consultants provides niche training, primarily but not exclusively in 

Open Source programming languages. We offer public courses at our training centre 

and private courses at your offices. We also make some of our training notes available 

under our "Open Training Notes" license, such as we’re doing in this document here. 

1.2 Open Training Notes License 

With an "Open Training Notes License", for which we make no charge, you’re 

allowed to print, use and distribute these notes provided that you retain the complete 

and unaltered license agreement with them, including our copyright statement. This 

means that you can learn from the notes, and have others learn from them too. 

You are NOT allowed to charge (directly or indirectly) for the copying or distribution 

of these notes, nor are you allowed to charge for presentations making any use 

of them. 

1.3 Courses presented by the author 

If you would like to attend a course (Java, Perl, Python, PHP, Tcl/Tk, MySQL 

or Linux) presented by the author of these notes, please see our public course 

schedule at 

http://www.wellho.net/course/index.html 

If you have a group of 4 or more trainees who require the same course at the same 

time, it will cost you less to have us run a private course for you. Please visit our onsite 

training page at 

http://www.wellho.net/course/otc.html 

which will give you details and costing information. 

1.4 Contact Details 

Well House Consultants may be found online at 

http://www.wellho.net 

graham@wellho.net technical contact 

lisa@wellho.net administration contact 

Our full postal address is 

404 The Spa 

Melksham 

Wiltshire 

UK SN12 6QL 

Phone +44 (0) 1225 708225 

Fax +44 (0) 1225 707126 

2 Notes from Well House Consultants Well House Consultants, Ltd.

Best 

Programming 

Practice 

You can write good and bad programs in any programming language, 

and that includes Python. What makes for good and bad code? What guidelines 

should you follow to make your code quick to develop, be robust, easy 

to follow later, and flexible enough to be amendable to meet future requirements 

that you hadn’t even dreamed of when you wrote it? 

Isn’t it enough to be able to write a working program? 

Analysing the requirement 

Designing the solution 

Reusing code 

Official style guide for Python code 

Python Programming Best Programming Practice 3 

2

Y116 

2.1 Isn’t it enough to be able to write a working program? 

No, it isn’t! 

A far higher proportion of the life costs of a piece of software are in its maintenance 

rather than its original writing, so it pays to spend a little more time to make a 

piece of code a lot more maintainable. 

Writing and maintaining a program usually occupies a lot less time (and costs a lot 

less) than the investment that users will put into it in entering data and generating 

outputs. It pays to spend a little more time as you write a program ensuring that it has 

an excellent user interface that provides the user with what he needs to use it 

efficiently. 

Requirements change over time, and it’s usually far cheaper to adopt and adapt 

the existing system than keep coming up with a completely new one at each change. 

Sure, in time you may get to the point of doing a re-write but better to have a four-year 

cycle than a two-year cycle, and better to have a 10-year cycle than a five-year one. 

2.2 Analysing the requirement 

These paragraphs could be written for ANY language; they just happen to be part of 

a Python course in this case. Listen to the user’s requirements, question the user, 

learn as much as you can about what the application is to do. 

You may try and listen all at once (and it’s a good idea to do so in broad overview) 

and/or you may listen to details and partial requirements. Techniques such as 

extreme programming suggest a series of requirements, each of a few sentences and 

implemented and tested and integrated into the whole in a relatively short timescale, 

and with the whole project consisting of 50 to 100 such steps. 

2.3 Designing the solution 

These paragraphs are written for a language that supports the Object Oriented 

Mantra, of which Python is one of the most ardent adherents. 

For huge projects, formalised design systems such as UML, implemented using 

Rational Rose or other software, may be appropriate for you to use. For projects that 

are just large, that’s probably overkill, but you want to look for a good design solution 

and framework. 

Even if you’re not going to use a full UML system, learn the principles and how 

the views are derived from the model and think of how each of the diagrams would 

look for your system. Remember: 

• Use Case diagram 

• Class and Object diagrams 

• State diagram 

• Sequence diagram 

• Activity diagram 

• Component diagram 

• Deployment diagram 

No need, probably, to use all the fancy 

symbols, simple boxes and arrows will be fine, 

although you might want to come up with 

company standards if there’s a team of you 

working on a project. 

2.4 Reusing code 

Write your code to be re-usable. You’re using an Object Oriented language and so 

you should naturally be thinking of objects that your whole organisation can use 

within all of their applications and not just in your own little area! 

Figure 1 Example of a UML symbol 

that you really need to draw if you’re 

just using the principles of UML ;-) 


Figure 2 A source of Python code for 

Bioinformatics applications 


See if others have written re-usable code. If you’re working for a university, has 

someone else already written a "student" and a "lecturer" class, and can you simply call 

their classes? If you’re working for a pharmaceutical company, has someone already 

written an amino acid class, perhaps with subclasses for Alanine, Glycine and the 

rest? 

If you’ve got more than a handful of Python projects within your organisation, it 

may be worth someone’s while setting up a central repository or web site or discussion 

forum as appropriate. Perhaps you can even persuade your management to sponsor 

an annual meeting or event away from the office for cross-fertilisation of ideas and 

even a lecture or two from someone who’s using Python in another organisation. 

Search the Internet, too. There may already be classes out there that are freely available 

and will give you an excellent start. Have a look at the vaults of Parnassus. You’ll 

probably find a lot of things that are not useful – that’s the nature of searching – but 

you’ll find some that are. 

Here’s the web site http://biopython.org as one Python-source example: 

2.5 Official style guide for Python code 

The following is from the official style guide for Python code, written by Guido 

van Rossum, the author of the Python language, and placed in the public domain 

(which is why we’re able to reproduce it here). It’s available online at 

http://www.python.org/peps/pep-0008.html 

Why has Guido chosen to make this available not just "open source", but public 

domain? Because it is SO IMPORTANT that you write your Python code so that it’s 

easy to follow and easy to maintain, and he wants the document to have the widest 

possible circulation. 


Title: Style Guide for Python Code 

Version: Revision: 1.25 

Author: Guido van Rossum 

Barry Warsaw 

Status: Active 

Type: Informational 

Created: 05-Jul-2001 

Post-History: 05-Jul-2001 

Introduction 

This document gives coding conventions for the Python code comprising the 

standard library for the main Python distribution. Please see the companion informational 

PEP describing style guidelines for the C code in the C implementation of 

Python[1]. 

This document was adapted from Guido's original Python Style Guide essay[2], 

with some additions from Barry's style guide[5]. Where there's conflict, Guido's style 

rules for the purposes of this PEP. This PEP may still be incomplete (in fact, it may 

never be finished <wink>). 

A Foolish Consistency is the Hobgoblin of Little Minds 

A style guide is about consistency. Consistency with this style guide is important. 

Consistency within a project is more important. Consistency within one module or 

function is most important. 

But most importantly: know when to be inconsistent -- sometimes the style guide 

just doesn't apply. When in doubt, use your best judgement. Look at other examples 

and decide what looks best. And don't hesitate to ask! 

Two good reasons to break a particular rule: 

(1) When applying the rule would make the code less readable, even for someone 

who is used to reading code that follows the rules. 

(2) To be consistent with surrounding code that also breaks it (maybe for historic 

reasons) -- although this is also an opportunity to clean up someone else's mess 

(in true XP style). 

Code lay-out 

Indentation 

Use the default of Emacs' Python-mode: 4 spaces for one indentation level. For 

really old code that you don't want to mess up, you can continue to use 8-space tabs. 

Emacs Python-mode auto-detects the prevailing indentation level used in a file and 

sets its indentation parameters accordingly. 

Tabs or Spaces? 

Never mix tabs and spaces. The most popular way of indenting Python is with 

spaces only. The second-most popular way is with tabs only. Code indented with a 

mixture of tabs and spaces should be converted to using spaces exclusively. (In Emacs, 

select the whole buffer and hit ESC-x untabify.) When invoking the python 

command line interpreter with the -t option, it issues warnings about code that illegally 

mixes tabs and spaces. When using -tt these warnings become errors. These 

options are highly recommended! 
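The -t and -tt options belong to Python 2. Python 3 disallows inconsistent mixing of tabs and spaces outright, which can be observed with the built-in compile function. The sketch below is for Python 3, and the source string is a made-up example in which one line of a block is indented with a tab and the next with eight spaces.

```python
# One block, two indentation styles: a tab on the first line,
# eight spaces on the second.
source = "if True:\n\tx = 1\n        y = 2\n"

try:
    compile(source, '<mixed-indentation>', 'exec')
    outcome = 'compiled'
except TabError:
    # TabError is the SyntaxError subclass for inconsistent tabs and spaces.
    outcome = 'TabError'

print(outcome)
```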

For new projects, spaces-only are strongly recommended over tabs. Most editors 

have features that make this easy to do. (In Emacs, make sure indent-tabs-mode is nil). 

Maximum Line Length 

There are still many devices around that are limited to 80 character lines; plus, 

limiting windows to 80 characters makes it possible to have several windows side-by-side. 

The default wrapping on such devices looks ugly. Therefore, please limit all lines 



to a maximum of 79 characters (Emacs wraps lines that are exactly 80 characters 

long). For flowing long blocks of text (docstrings or comments), limiting the length 

to 72 characters is recommended. 

The preferred way of wrapping long lines is by using Python's implied line continuation 

inside parentheses, brackets and braces. If necessary, you can add an extra pair 

of parentheses around an expression, but sometimes using a backslash looks better. 

Make sure to indent the continued line appropriately. Emacs Python-mode does this 

right. Some examples: 

class Rectangle(Blob):

    def __init__(self, width, height,
                 color='black', emphasis=None, highlight=0):
        if width == 0 and height == 0 and \
           color == 'red' and emphasis == 'strong' or \
           highlight > 100:
            raise ValueError, "sorry, you lose"
        if width == 0 and height == 0 and (color == 'red' or
                                           emphasis is None):
            raise ValueError, "I don't think so"
        Blob.__init__(self, width, height,
                      color, emphasis, highlight)
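The same wrapping technique can be tried in a form that runs on Python 3, where exceptions are raised with call syntax. This is only a sketch: the function describe and its message text are hypothetical, chosen to show implied continuation inside parentheses without any backslashes.

```python
def describe(width, height,
             color='black', emphasis=None):
    # The extra pair of parentheses lets the condition span two lines.
    if (width == 0 and height == 0
            and (color == 'red' or emphasis is None)):
        raise ValueError("zero-sized shape that is red or has no emphasis")
    # Arguments inside the call parentheses may also continue on the next line.
    return '%dx%d %s' % (width, height,
                         color)


print(describe(2, 3))
```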

Blank Lines 

Separate top-level function and class definitions with two blank lines. Method definitions 

inside a class are separated by a single blank line. Extra blank lines may be 

used (sparingly) to separate groups of related functions. Blank lines may be omitted 

between a bunch of related one-liners (e.g. a set of dummy implementations). 

When blank lines are used to separate method definitions, there is also a blank 

line between the `class' line and the first method definition. 

Use blank lines in functions, sparingly, to indicate logical sections. 

Python accepts the control-L (i.e. ^L) form feed character as whitespace; Emacs 

(and some printing tools) treat these characters as page separators, so you may use 

them to separate pages of related sections of your file. 

Encodings (PEP 263) 

Code in the core Python distribution should always use the ASCII or Latin-1 

encoding (a.k.a. ISO-8859-1). Files using ASCII should not have a coding cookie. 

Latin-1 should only be used when a comment or docstring needs to mention an 

author name that requires Latin-1; otherwise, using \x escapes is the preferred way 

to include non-ASCII data in string literals. An exception is made for those files that 

are part of the test suite for the code implementing PEP 263. 

Imports 

Imports should usually be on separate lines, e.g.: 

No:  import sys, os

Yes: import sys
     import os

It's okay to say this though: 

from types import StringType, ListType 

Imports are always put at the top of the file, just after any module comments and 

docstrings, and before module globals and constants. Imports should be grouped, 

with the order being 

1. standard library imports 

2. related major package imports (i.e. all email package imports next) 

3. application specific imports 



You should put a blank line between each group of imports. 
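Laid out in a file, the grouping might look like the sketch below. The choice of modules is an assumption made so that the example runs anywhere: the standard email package stands in for a "related major package", and the application-specific import is left as a comment because the module named there is hypothetical.

```python
# Group 1: standard library imports
import os
import sys

# Group 2: related major package imports
import email.utils

# Group 3: application-specific imports would follow, for example
# import myapp.settings   (hypothetical module, left commented out)

loaded = [module.__name__ for module in (os, sys, email.utils)]
print(loaded)
```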

Relative imports for intra-package imports are highly discouraged. Always use the 

absolute package path for all imports. 

When importing a class from a class-containing module, it's usually okay to spell 

this 

from MyClass import MyClass 

from foo.bar.YourClass import YourClass 

If this spelling causes local name clashes, then spell them 

import MyClass 

import foo.bar.YourClass 

and use "MyClass.MyClass" and "foo.bar.YourClass.YourClass" 

Whitespace in Expressions and Statements 

Pet Peeves 

Guido hates whitespace in the following places: 

• Immediately inside parentheses, brackets or braces, as in: 

spam( ham[ 1 ], { eggs: 2 } ) 

Always write this as 

spam(ham[1], {eggs: 2}) 

• Immediately before a comma, semicolon, or colon, as in: 

if x == 4 : print x , y ; x , y = y , x 

Always write this as 

if x == 4: print x, y; x, y = y, x 

• Immediately before the open parenthesis that starts the argument list of a function 

call, as in spam (1) 

Always write this as spam(1) 

• Immediately before the open parenthesis that starts an indexing or slicing, as in: 

dict ['key'] = list [index] 

Always write this as 

dict['key'] = list[index] 

• More than one space around an assignment (or other) operator to align it with 

another, as in: 

x             = 1
y             = 2
long_variable = 3

Always write this as: 

x = 1 

y = 2 

long_variable = 3 

(Don't bother to argue with him on any of the above -- Guido's grown accustomed 

to this style over 20 years.) 

Other recommendations 

• Always surround these binary operators with a single space on either side: 

assignment (=) 

comparisons (==, <, >, !=, <>, <=, >=, in, not in, is, is not) 

Booleans (and, or, not). 

• Use your better judgment for the insertion of spaces around arithmetic operators. 

Always be consistent about whitespace on either side of a binary operator. Some 

examples: 

i = i+1 

submitted = submitted + 1 

x = x*2 - 1 

hypot2 = x*x + y*y 



c = (a+b) * (a-b) 

c = (a + b) * (a - b) 

• Don't use spaces around the '=' sign when used to indicate a keyword argument 

or a default parameter value. For instance: 

def complex(real, imag=0.0):
    return magic(r=real, i=imag)

• Compound statements (multiple statements on the same line) are generally 

discouraged. 

No:  if foo == 'blah': do_blah_thing()

Yes: if foo == 'blah':
         do_blah_thing()

No:  do_one(); do_two(); do_three()

Yes: do_one()
     do_two()
     do_three()

Comments 

Comments that contradict the code are worse than no comments. Always make a 

priority of keeping the comments up-to-date when the code changes! 

Comments should be complete sentences. If a comment is a phrase or sentence, 

its first word should be capitalized, unless it is an identifier that begins with a lower 

case letter (never alter the case of identifiers!). 

If a comment is short, the period at the end is best omitted. Block comments 

generally consist of one or more paragraphs built out of complete sentences, and each 

sentence should end in a period. 

You should use two spaces after a sentence-ending period, since it makes Emacs 

wrapping and filling work consistently. 

When writing English, Strunk and White apply. 

Python coders from non-English speaking countries: please write your comments 

in English, unless you are 120% sure that the code will never be read by people who 

don't speak your language. 

Block Comments 

Block comments generally apply to some (or all) code that follows them, and are 

indented to the same level as that code. Each line of a block comment starts with a # 

and a single space (unless it is indented text inside the comment). Paragraphs inside 

a block comment are separated by a line containing a single #. Block comments are 

best surrounded by a blank line above and below them (or two lines above and a 

single line below for a block comment at the start of a new section of function 

definitions). 

Inline Comments 

An inline comment is a comment on the same line as a statement. Inline 

comments should be used sparingly. Inline comments should be separated by at least 

two spaces from the statement. They should start with a # and a single space. 

Inline comments are unnecessary and in fact distracting if they state the obvious. 

Don't do this: 

x = x+1  # Increment x 

But sometimes, this is useful: 

x = x+1  # Compensate for border 

Documentation Strings 

Conventions for writing good documentation strings (a.k.a. "docstrings") are 

immortalized in PEP 257 [3]. 

Write docstrings for all public modules, functions, classes, and methods. 



Docstrings are not necessary for non-public methods but you should have a comment 

that describes what the method does. This comment should appear after the "def" 

line. 

PEP 257 describes good docstring conventions. Note that most importantly, the 

""" that ends a multiline docstring should be on a line by itself, e.g.: 

"""Return a foobang 

Optional plotz says to frobnicate the bizbaz first. 

""" 

For one liner docstrings, it's okay to keep the closing """ on the same line. 
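The docstring layout described above can be sketched as follows. The function names `foobang` and `is_empty`, and what they compute, are invented for this illustration and do not come from PEP 8 or PEP 257.

```python
def foobang(items, plotz=False):
    """Return a foobang (here: a plain list copy of items).

    Optional plotz says to frobnicate the bizbaz first (here: reverse
    the items).  The closing quotes sit on a line by themselves.
    """
    values = list(items)
    if plotz:
        values.reverse()
    return values


def is_empty(items):
    """Return True if items contains no elements."""
    return not items
```

The one-line docstring keeps its closing quotes on the same line, while the multi-line one ends with the quotes alone on the last line.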

Version Bookkeeping 

If you have to have RCS or CVS crud in your source file, do it as follows. 

__version__ = "$Revision: 1.25 $" 

# $Source: /cvsroot/python/python/nondist/peps/pep-0008.txt,v $ 

These lines should be included after the module's docstring, before any other 

code, separated by a blank line above and below. 

Naming Conventions 

The naming conventions of Python's library are a bit of a mess, so we'll never get 

this completely consistent -- nevertheless, here are the currently recommended 

naming standards. New modules and packages (including 3rd party frameworks) 

should be written to these standards, but where an existing library has a different 

style, internal consistency is preferred. 

Descriptive: Naming Styles 

There are a lot of different naming styles. It helps to be able to recognize what 

naming style is being used, independently from what they are used for. 

The following naming styles are commonly distinguished: 

- b (single lowercase letter) 

- B (single uppercase letter) 

- lowercase 

- lower_case_with_underscores 

- UPPERCASE 

- UPPER_CASE_WITH_UNDERSCORES 

- CapitalizedWords (or CapWords, or CamelCase -- so named because of the 

bumpy look of its letters[4]). This is also sometimes known as StudlyCaps. 

- mixedCase (differs from CapitalizedWords by initial lowercase character!) 

- Capitalized_Words_With_Underscores (ugly!) 

There's also the style of using a short unique prefix to group related names 

together. This is not used much in Python, but it is mentioned for completeness. For 

example, the os.stat() function returns a tuple whose items traditionally have names 

like st_mode, st_size, st_mtime and so on. The X11 library uses a leading X for all its 

public functions. (In Python, this style is generally deemed unnecessary because 

attribute and method names are prefixed with an object, and function names are 

prefixed with a module name.) 

In addition, the following special forms using leading or trailing underscores are 

recognized (these can generally be combined with any case convention): 

- _single_leading_underscore: weak "internal use" indicator (e.g. "from M import 

*" does not import objects whose name starts with an underscore). 

- single_trailing_underscore_: used by convention to avoid conflicts with Python 

keyword, e.g. "Tkinter.Toplevel(master, class_='ClassName')". 

- __double_leading_underscore: class-private names as of Python 1.4. 

- __double_leading_and_trailing_underscore__: "magic" objects or attributes that 


live in user-controlled namespaces, e.g. __init__, __import__ or __file__. Sometimes 

these are defined by the user to trigger certain magic behavior (e.g. operator overloading); 

sometimes these are inserted by the infrastructure for its own use or for 

debugging purposes. Since the infrastructure (loosely defined as the Python interpreter 

and the standard library) may decide to grow its list of magic attributes in 

future versions, user code should generally refrain from using this convention for its 

own use. User code that aspires to become part of the infrastructure could combine 

this with a short prefix inside the underscores, e.g. __bobo_magic_attr__. 

Prescriptive: Naming Conventions 

Names to Avoid 

Never use the characters `l' (lowercase letter el), `O' (uppercase letter oh), or `I' 

(uppercase letter eye) as single character variable names. In some fonts, these characters 

are indistinguishable from the numerals one and zero. When tempted to use `l' 

use `L' instead. 

Module Names 

Modules should have short, lowercase names, without underscores. 

Since module names are mapped to file names, and some file systems are case 

insensitive and truncate long names, it is important that module names be chosen to 

be fairly short -- this won't be a problem on Unix, but it may be a problem when the 

code is transported to Mac or Windows. 

When an extension module written in C or C++ has an accompanying Python 

module that provides a higher level (e.g. more object oriented) interface, the C/C++ 

module has a leading underscore (e.g. _socket). 

Python packages should have short, all-lowercase names, without underscores. 

Class Names 

Almost without exception, class names use the CapWords convention. Classes for 

internal use have a leading underscore in addition. 

Exception Names 

If a module defines a single exception raised for all sorts of conditions, it is generally 

called "error" or "Error". It seems that built-in (extension) modules use "error" (e.g. 

os.error), while Python modules generally use "Error" (e.g. xdrlib.Error). The trend 

seems to be toward CapWords exception names. 

Global Variable Names 

(Let's hope that these variables are meant for use inside one module only.) The 

conventions are about the same as those for functions. Modules that are designed for 

use via "from M import *" should prefix their globals (and internal functions and 

classes) with an underscore to prevent exporting them. 

Function Names 

Function names should be lowercase, possibly with words separated by underscores 

to improve readability. mixedCase is allowed only in contexts where that's 

already the prevailing style (e.g. threading.py), to retain backwards compatibility. 

Method Names and Instance Variables 

The story is largely the same as with functions: in general, use lowercase with words 

separated by underscores as necessary to improve readability. 

Use one leading underscore only for internal methods and instance variables 

which are not intended to be part of the class's public interface. Python does not 

enforce this; it is up to programmers to respect the convention. 

Use two leading underscores to denote class-private names. Python "mangles" 

these names with the class name: if class Foo has an attribute named __a, it cannot 

be accessed by Foo.__a. (An insistent user could still gain access by calling 

Foo._Foo__a.) Generally, double leading underscores should be used only to avoid 

name conflicts with attributes in classes designed to be subclassed. 
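The effect of the single and double leading underscore can be seen in a small sketch. The class `Foo` and its attributes are hypothetical and exist only to show the mangling; here the mangled name is an instance attribute rather than a class attribute.

```python
class Foo(object):
    """Hypothetical class used only to demonstrate name mangling."""

    def __init__(self):
        self._internal = 1  # Non-public by convention only
        self.__a = 2  # Stored under the mangled name _Foo__a

    def get_a(self):
        """Return the class-private value from inside the class."""
        return self.__a


foo = Foo()
mangled_ok = foo._Foo__a == 2  # The mangled name is still reachable
plain_hidden = not hasattr(foo, '__a')  # The plain name does not exist
```

Inside the class body `self.__a` works as written; outside it, only the mangled spelling finds the attribute, which is what keeps a subclass from clobbering it by accident.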



Designing for inheritance 

Always decide whether a class's methods and instance variables should be public 

or non-public. In general, never make data variables public unless you're implementing 

essentially a record. It's almost always preferable to give a functional 

interface to your class instead (and some Python 2.2 developments will make this 

much nicer). 

Also decide whether your attributes should be private or not. The difference 

between private and non-public is that the former will never be useful for a derived 

class, while the latter might be. Yes, you should design your classes with inheritance 

in mind! 

Private attributes should have two leading underscores, no trailing underscores. 

Non-public attributes should have a single leading underscore, no trailing 

underscores. 

Public attributes should have no leading or trailing underscores, unless they 

conflict with reserved words, in which case, a single trailing underscore is preferable 

to a leading one, or a corrupted spelling, e.g. class_ rather than klass. (This last point 

is a bit controversial; if you prefer klass over class_ then just be consistent. :). 

Programming Recommendations 

Code should be written in a way that does not disadvantage other implementations 

of Python (PyPy, Jython, IronPython, Pyrex, Psyco, and such). For example, do 

not rely on CPython's efficient implementation of in-place string concatenation for 

statements in the form a+=b or a=a+b. Those statements run more slowly in Jython. 

In performance sensitive parts of the library, the ''.join() form should be used 

instead. This will assure that concatenation occurs in linear time across various 

implementations. 
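As a minimal sketch of the two concatenation styles (the function names are made up for this example):

```python
def concat_slow(parts):
    """Concatenate with +=; may take quadratic time on some implementations."""
    result = ''
    for part in parts:
        result += part
    return result


def concat_fast(parts):
    """Concatenate with str.join, which builds the result in one pass."""
    return ''.join(parts)


words = ['spam', 'ham', 'eggs']
```

Both functions return the same string; only the second avoids depending on an optimization specific to one interpreter.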

Comparisons to singletons like None should always be done with 'is' or 'is not'. 

Also, beware of writing "if x" when you really mean "if x is not None" -- e.g. when 

testing whether a variable or argument that defaults to None was set to some other 

value. The other value might be a value that's false in a Boolean context! 
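The difference between the two tests shows up as soon as a caller passes a false value such as 0 or an empty list. The functions below are illustrative only:

```python
def describe(value=None):
    """Return a label showing how the argument was interpreted."""
    if value is None:
        return 'not set'
    return 'set to %r' % (value,)


def describe_buggy(value=None):
    """Same idea, but wrongly treats every false value as missing."""
    if not value:
        return 'not set'
    return 'set to %r' % (value,)
```

Calling `describe(0)` reports that the argument was set, whereas `describe_buggy(0)` silently treats it as missing.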

Class-based exceptions are always preferred over string-based exceptions. Modules 

or packages should define their own domain-specific base exception class, which 

should be subclassed from the built-in Exception class. Always include a class 

docstring. E.g.: 

class MessageError(Exception): 

    """Base class for errors in the email package.""" 

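A short sketch of how such a base class might be used. `HeaderParseError` and `check_header` are hypothetical names introduced for this example; they are not taken from the email package.

```python
class MessageError(Exception):
    """Base class for errors in the (hypothetical) message package."""


class HeaderParseError(MessageError):
    """Raised when a header line cannot be parsed."""


def check_header(line):
    """Return line unchanged, or raise HeaderParseError if it has no colon."""
    if ':' not in line:
        raise HeaderParseError('missing colon in header: %r' % (line,))
    return line


try:
    check_header('Subject')
except MessageError as err:  # Catches HeaderParseError via the base class
    message = str(err)
```

Because every error derives from one package-level base class, callers can catch `MessageError` without knowing which specific subclass was raised.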
Use string methods instead of the string module unless backward-compatibility 

with versions earlier than Python 2.0 is important. String methods are always much 

faster and share the same API with unicode strings. 

Avoid slicing strings when checking for prefixes or suffixes. Use startswith() 

and endswith() instead, since they are cleaner and less error prone. For example: 

No: if foo[:3] == 'bar': 

Yes: if foo.startswith('bar'): 

The exception is if your code must work with Python 1.5.2 (but let's hope not!). 
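To see why slicing is error prone, consider a prefix whose length no longer matches the slice. The helper names below are invented for the illustration:

```python
def has_prefix_slice(text):
    """Buggy check: the slice length (3) does not match the prefix length (4)."""
    return text[:3] == 'spam'


def has_prefix(text):
    """Robust check: there is no length to keep in sync with the prefix."""
    return text.startswith('spam')


def is_python_file(name):
    """Return True for names ending in .py or .pyw (tuple form of endswith)."""
    return name.endswith(('.py', '.pyw'))
```

The sliced version can never be true, because a three-character slice is compared with a four-character string; `startswith()` leaves no such mismatch to get wrong.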

Object type comparisons should always use isinstance() instead of comparing 

types directly. E.g. 

No: if type(obj) is type(1): 

Yes: if isinstance(obj, int): 

When checking if an object is a string, keep in mind that it might be a unicode 

string too! In Python 2.3, str and unicode have a common base class, basestring, so 

you can do: 

if isinstance(obj, basestring): 

In Python 2.2, the types module has the StringTypes type defined for that purpose, 



e.g.: 

from types import StringTypes 

if isinstance(obj, StringTypes): 

In Python 2.0 and 2.1, you should do: 

from types import StringType, UnicodeType 

if isinstance(obj, StringType) or \ 

        isinstance(obj, UnicodeType): 
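The version-specific checks above predate Python 3, where `basestring` no longer exists and `str` is the single text type. A minimal sketch that runs on both lines of the language, assuming only that distinction (the names `string_types` and `is_string` are chosen for this example):

```python
try:
    string_types = basestring  # Python 2.3 and later 2.x releases
except NameError:
    string_types = str  # Python 3: basestring no longer exists


def is_string(obj):
    """Return True if obj is a text string on the running interpreter."""
    return isinstance(obj, string_types)
```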

For sequences, (strings, lists, tuples), use the fact that empty sequences are false, so 

"if not seq" or "if seq" is preferable to "if len(seq)" or "if not len(seq)". 

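A small sketch of the emptiness test; the helper name `first_or_default` is made up for this example:

```python
def first_or_default(seq, default=None):
    """Return the first element of seq, or default if seq is empty."""
    if not seq:  # Preferred over: if len(seq) == 0
        return default
    return seq[0]
```

This idiom applies to the built-in sequences named above. NumPy arrays with more than one element raise an error when used in a Boolean context, so they need an explicit size check instead.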
Don't write string literals that rely on significant trailing whitespace. Such trailing 

whitespace is visually indistinguishable and some editors (or more recently, 

reindent.py) will trim them. 

Don't compare boolean values to True or False using == (bool types are new in 

Python 2.3): 

No: if greeting == True: 

Yes: if greeting: 

References 

[1] PEP 7, Style Guide for C Code, van Rossum 

[2] http://www.python.org/doc/essays/styleguide.html 

[3] PEP 257, Docstring Conventions, Goodger, van Rossum 

[4] http://www.wikipedia.com/wiki/CamelCase 

[5] Barry's GNU Mailman style guide 

http://barry.warsaw.us/software/STYLEGUIDE.txt 

Copyright 

This [The Style Guide for Python code] document has been placed in the public 

domain. 

The Style Guide finishes here. 

Copyright of the rest of this module is retained by Well House Consultants and is subject to 

the full copyright statement that is reproduced elsewhere and covers this set of training notes as 

a whole. 




License 

These notes are distributed under the Well House Consultants 

Open Training Notes License. Basically, if you distribute it and use it 

for free, we’ll let you have it for free. If you charge for its distribution of 

use, we’ll charge. 


3.1 Open Training Notes License 

Training notes distributed under the Well House Consultants Open Training 

Notes License (WHCOTNL) may be reproduced for any purpose PROVIDE THAT: 

• This License statement is retained, unaltered (save for additions to the change log) 

and complete. 

• No charge is made for the distribution, nor for the use or application thereof. This 

means that you can use them to run training sessions or as support material for 

those sessions, but you cannot then make a charge for those training sessions. 

• Alterations to the content of the document are clearly marked as being such, and 

a log of amendments is added below this notice. 

• These notes are provided "as is" with no warranty of fitness for purpose. Whilst 

every attempt has been made to ensure their accuracy, no liability can be accepted 

for any errors of the consequences thereof. 

Copyright is retained by Well House Consultants Ltd, of 404, The Spa, Melksham, 

Wiltshire, UK, SN12 6QL - phone number +44 (1) 1225 708225. Email 

contact - Graham Ellis (graham@wellho.net). 

Please send any amendments and corrections to these notes to the Copyright 

holder - under the spirit of the Open Distribution license, we will incorporate suitable 

changes into future releases for the use of the community. 

If you are charged for this material, or for presentation of a course (Other than by 

Well House Consultants) using this material, please let us know. It is a violation of 

the license under which this notes are distributed for such a charge to be made, 

except by the Copyright Holder. 

If you would like Well House Consultants to use this material to present a training 

course for your organisation, or if you wish to attend a public course is one is available, 

please contact us or see our web site - http://www.wellho.net - for further 

details. 

Change log 

Original Version, Well House Consultants, 2004 

Updated by: ___________________ on _________________ 

Updated by: ___________________ on _________________ 

Updated by: ___________________ on _________________ 

Updated by: ___________________ on _________________ 

Updated by: ___________________ on _________________ 

Updated by: ___________________ on _________________ 

Updated by: ___________________ on _________________ 

License Ends. 


NumPy is the fundamental package for scientific computing with Python. It contains among other 

things: 


a powerful N-dimensional array object 

sophisticated (broadcasting) functions 

tools for integrating C/C++ and Fortran code 

useful linear algebra, Fourier transform, and random number capabilities 

Besides its obvious scientific uses, NumPy can also be used as an efficient multi-dimensional 

container of generic data. Arbitrary data-types can be defined. This allows NumPy to seamlessly 

and speedily integrate with a wide variety of databases. 

Numpy is licensed under the BSD license, enabling reuse with few restrictions. 

Getting Started 

Getting Numpy 

Installing NumPy and SciPy 

NumPy and SciPy documentation page 

NumPy Tutorial 

NumPy for MATLAB© Users 

NumPy functions by category 

NumPy Mailing List 

More Information 

NumPy Sourceforge Home Page 

SciPy Home Page 

Interfacing with compiled code 

Older python array packages 

© Copyright 2012 Numpy developers. Created using Sphinx 1.1.2. 


Basic Plotting with Python and Matplotlib 

This guide assumes that you have already installed NumPy and Matplotlib for your Python distribution. 

You can check whether they are installed by importing them: 

import numpy as np 

import matplotlib.pyplot as plt # The code below assumes this convenient renaming 

For those of you familiar with MATLAB, the basic Matplotlib syntax is very similar. 

1 Line plots 

The basic syntax for creating line plots is plt.plot(x,y), where x and y are arrays of the same length that 

specify the (x, y) pairs that form the line. For example, let’s plot the cosine function from −2 to 1. To do 

so, we need to provide a discretization (grid) of the values along the x-axis, and evaluate the function on 

each x value. This can typically be done with numpy.arange or numpy.linspace. 

xvals = np.arange(-2, 1, 0.01) # Grid of 0.01 spacing from -2 to 1 (endpoint excluded) 

yvals = np.cos(xvals) # Evaluate function on xvals 

plt.plot(xvals, yvals) # Create line plot with yvals against xvals 

plt.show() # Show the figure 

You should put the plt.show command last after you have made all relevant changes to the plot. You can 

create multiple figures by creating new figure windows with plt.figure(). To output all these figures at 

once, you should only have one plt.show command at the very end. Also, unless you turned the interactive 

mode on, the code will be paused until you close the figure window. 

Suppose we want to add another plot, the quadratic approximation to the cosine function. We do so 

below using a different color and line type. We also add a title and axis labels, which is highly recommended 

in your own work. Also note that we moved the plt.show command to the end so that it shows both plots. 

newyvals = 1 - 0.5 * xvals**2 # Evaluate quadratic approximation on xvals 

plt.plot(xvals, newyvals, 'r--') # Create line plot with red dashed line 

plt.title('Example plots') 

plt.xlabel('Input') 

plt.ylabel('Function values') 

plt.show() # Show the figure (remove the previous instance) 

The third parameter supplied to plt.plot above is an optional format string. The particular one specified 

above gives a red dashed line. See the extensive Matplotlib documentation online for other formatting 

commands, as well as many other plotting properties that were not covered here: 

http://matplotlib.sourceforge.net/api/pyplot_api.html#matplotlib.pyplot.plot 


2 Contour plots 

The basic syntax for creating contour plots is plt.contour(X,Y,Z,levels). To trace a contour, plt.contour 

requires a 2-D array Z that specifies function values on a grid. The underlying grid is given by X and Y, 

either both as 2-D arrays with the same shape as Z, or both as 1-D arrays where len(X) is the number of 

columns in Z and len(Y) is the number of rows in Z. 

In most situations it is more convenient to work with the underlying grid (i.e., the former representation). 

The meshgrid function is useful for constructing 2-D grids from two 1-D arrays. It returns two 2-D arrays 

X,Y of the same shape, where each element-wise pair specifies an underlying (x, y) point on the grid. Function 

values on the grid Z can then be calculated using these X,Y element-wise pairs. 
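To make the shape of the two returned arrays concrete, here is a pure-Python sketch of the same construction using nested lists instead of NumPy arrays. The function `meshgrid_lists` is a hypothetical stand-in written for this illustration; it is not part of NumPy and is far slower than `np.meshgrid`.

```python
def meshgrid_lists(xlist, ylist):
    """Return (X, Y) as nested lists: len(ylist) rows, len(xlist) columns."""
    X = [[x for x in xlist] for _ in ylist]  # Every row repeats xlist
    Y = [[y for _ in xlist] for y in ylist]  # Every column repeats ylist
    return X, Y


X, Y = meshgrid_lists([-2.0, 0.0, 1.0], [-1.0, 2.0])

# Evaluate Z = sqrt(x**2 + y**2) point by point on the grid
Z = [[(x**2 + y**2) ** 0.5 for x, y in zip(xrow, yrow)]
     for xrow, yrow in zip(X, Y)]
```

Each position (row, column) in X and Y together names one grid point, so Z has one row per y value and one column per x value, which is the layout plt.contour expects.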

plt.figure() # Create a new figure window 

xlist = np.linspace(-2.0, 1.0, 100) # Create 1-D arrays for x,y dimensions 

ylist = np.linspace(-1.0, 2.0, 100) 

X,Y = np.meshgrid(xlist, ylist) # Create 2-D grid xlist,ylist values 

Z = np.sqrt(X**2 + Y**2) # Compute function values on the grid 

We also need to specify the contour levels (of Z) to plot. You can either specify a positive integer for the 

number of automatically decided contours to plot, or you can give a list of contour (function) values in the 

levels argument. For example, we plot several contours below: 

plt.contour(X, Y, Z, [0.5, 1.0, 1.2, 1.5], colors='k', linestyles='solid') 

plt.show() 

Note that we also specified the contour colors and linestyles. By default, negative contours are given by 

dashed lines, hence we specified solid. Again, many properties are described in the Matplotlib specification: 

http://matplotlib.sourceforge.net/api/pyplot_api.html#matplotlib.pyplot.contour 

3 More plotting properties 

The function considered above should actually have circular contours. Unfortunately, due to the different 

scales of the axes, the figure likely turned out to be flattened and the contours appear like ellipses. This 

is undesirable, for example, if we wanted to visualize 2-D Gaussian covariance contours. We can force the 

aspect ratio to be equal with the following command (placed before plt.show): 

plt.gca().set_aspect('equal') # Scale the current axes to get the same aspect ratio 

Finally, suppose we want to zoom in on a particular region of the plot. We can do this by changing the 

axis limits (again before plt.show). The input list to plt.axis has form [xmin, xmax, ymin, ymax]. 

plt.axis([-1.0, 1.0, -0.5, 0.5]) # Set axis limits 

Notice that the aspect ratio is still equal after changing the axis limits. Also, the commands above only 

change the properties of the current axis. If you have multiple figures you will generally have to set them 

for each figure before calling plt.figure to create the next figure window. 

You can find out how to set many other axis properties at: 

http://matplotlib.sourceforge.net/api/pyplot_api.html#matplotlib.pyplot.axis 

http://matplotlib.sourceforge.net/api/axes_api.html#matplotlib.axes 

The final link covers many things, but most functions for changing axis properties begin with “set_”. 


4 Figures 

Figure 1: Example from section on line plots. 

Figure 2: Example from section on contour plots. 

Figure 3: Setting the aspect ratio to be equal and zooming in on the contour plot. 

5 Code 

import numpy as np 

import matplotlib.pyplot as plt 

xvals = np.arange(-2, 1, 0.01) # Grid of 0.01 spacing from -2 to 1 (endpoint excluded) 

yvals = np.cos(xvals) # Evaluate function on xvals 

plt.plot(xvals, yvals) # Create line plot with yvals against xvals 

# plt.show() # Show the figure 

newyvals = 1 - 0.5 * xvals**2 # Evaluate quadratic approximation on xvals 

plt.plot(xvals, newyvals, 'r--') # Create line plot with red dashed line 

plt.title('Example plots') 

plt.xlabel('Input') 

plt.ylabel('Function values') 

# plt.show() # Show the figure 

plt.figure() # Create a new figure window 

xlist = np.linspace(-2.0, 1.0, 100) # Create 1-D arrays for x,y dimensions 

ylist = np.linspace(-1.0, 2.0, 100) 

X,Y = np.meshgrid(xlist, ylist) # Create 2-D grid xlist,ylist values 

Z = np.sqrt(X**2 + Y**2) # Compute function values on the grid 

plt.contour(X, Y, Z, [0.5, 1.0, 1.2, 1.5], colors='k', linestyles='solid') 

plt.gca().set_aspect('equal') # Scale the current axes to get the same aspect ratio 

plt.axis([-1.0, 1.0, -0.5, 0.5]) # Change axis limits 

plt.show() 


2.39 Scientific Python (scipy.org), see also Sec. 2.37 

First reference occurs in Numerical Python (scipy.org), see Section 2.37 on page 246. 


About 

SymPy is a Python library for symbolic mathematics. It aims to become a full-featured computer 

algebra system (CAS) while keeping the code as simple as possible in order to be 

comprehensible and easily extensible. SymPy is written entirely in Python and does not require 

any external libraries. 

Features 

Core capabilities 

Basic arithmetic: Support for operators such as +, -, *, /, ** (power) 

Simplification 

Expansion 

Functions: trigonometric, hyperbolic, exponential, roots, logarithms, absolute value, 

spherical harmonics, factorials and gamma functions, zeta functions, polynomials, 

special functions, ... 

Substitution 

Numbers: arbitrary precision integers, rationals, and floats 

Noncommutative symbols 

Pattern matching 

Polynomials 

Basic arithmetic: division, gcd, ... 

Factorization 

Square-free decomposition 

Gröbner bases 

Partial fraction decomposition 

Resultants 

Calculus 

Limits: limit(x*log(x), x, 0) -> 0 

Differentiation 

Integration: uses the extended Risch-Norman heuristic 

Taylor (Laurent) series 

Solving equations 

Polynomial equations 

Algebraic equations 

Differential equations 

Difference equations 

Systems of equations 

Discrete math 

Binomial coefficients 

Summations 

Products 

Number theory: generating prime numbers, primality testing, integer factorization, ... 

Logic expressions 

Matrices 

Basic arithmetic 

Eigenvalues/eigenvectors 

Determinants 

Inversion 

Solving 

Geometric Algebra 

Geometry 

points, lines, rays, segments, ellipses, circles, polygons, ... 

Intersection 

Tangency 

Similarity 

Plotting 

Coordinate modes 

Plotting Geometric Entities 

2D and 3D 

Interactive interface 

Colors 

Physics 

Units 

Mechanics 

Quantum 

Gaussian Optics 

Pauli Algebra 

SymPy 


Download Now 

Releases: Google Code downloads 

Latest git version: github.com/sympy/sympy 

Quick Links 

Documentation 

Downloads (source tarballs) 

Downloads (packages for distributions) 

Mailing list 

Source code 

Issues tracker 

Google Code page 

Wiki 

Try SymPy online now 

Official SymPy blog 

Planet SymPy 

News 

More Information 

17 Mar 2012 SymPy is accepted as a mentoring organization 

for Google Summer of Code 2012 

29 Jul 2011 Version 0.7.1 released (changes) 

28 Jun 2011 Version 0.7.0 released (changes) 

18 Mar 2011 SymPy is accepted as a mentoring organization 

for Google Summer of Code 2011 

23 Oct 2010 New website launched at sympy.org 

18 Oct 2010 Final page about the 2010 Google Summer of 

Code in SymPy is available. 

17 Mar 2010 Version 0.6.7 released (changes) 

20 Dec 2009 Version 0.6.6 released (changes) 

26 Sep 2009 Final page about the 2009 Google Summer of 

Code in SymPy is available.

Statistics 

Normal distributions 

Uniform distributions 

Probability 

Printing 

Pretty printing: ASCII/Unicode pretty printing, LaTeX 

Code generation: C, Fortran, Python 

Copyright © 2012 SymPy Development Team. This page is open source. Fork the project on 

GitHub to edit it. Languages (beta): [Cs], [De], [En], [Fr], [Pt], [Ru], [Zh]

Moka Minimalist functional python library 


Moka is a minimalist functional library wrapping common Python default data 

structures. In other words, it lets you chain functional constructs in a readable and 

pythonic way. 

import operator 

from moka import List 

(List() # Create a new instance of moka.List 

.extend(range(1,20)) # Insert the numbers from 1 to 19 

.keep(lambda x: x > 5) # Keep only the numbers bigger than 5 

.rem(operator.gt, 7) # Remove the numbers bigger than 7 using partial application 

.rem(eq=6) # Remove the number 6 using the 'operator shortcut' 

.map(str) # Call str on each number (creating a list of strings) 

.invoke('zfill', 3) # Call zfill(x, 3) on each string (filling some 0 on the left) 

.insert(0, 'I am') # Insert the string 'I am' at the head of the list 

.join(' ')) # Join every string of the list, separated by a space 

>>> 'I am 007' 
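For comparison, the same steps can be written with only the Python builtins. This is a sketch of equivalent plain-Python code, not Moka itself; the variable names are chosen for this example.

```python
numbers = range(1, 20)  # The numbers 1 to 19
kept = [x for x in numbers if x > 5]  # keep: numbers bigger than 5
kept = [x for x in kept if not x > 7]  # rem: drop numbers bigger than 7
kept = [x for x in kept if x != 6]  # rem(eq=6): drop the number 6
padded = [str(x).zfill(3) for x in kept]  # map(str), then zfill(3)
result = ' '.join(['I am'] + padded)  # Insert at the head, then join
```

Each intermediate step needs its own statement or a nested expression here, which is the friction the chained form is meant to remove.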

Get started 

# With pip 

pip install moka 

# From github 

git clone git@github.com:phzbox/Moka.git 

python setup.py install 

Why Moka? 

Although the standard library provides various useful functional constructs, it's hard 

to use them as each has its own interface. 

For instance, when should one use map/filter rather than list comprehension? When 

should one use itertools instead of the default dict/list builtins? 

Sometimes, for simple code, one construct seems useful, but with a bit more 

complexity, it becomes a nightmare to maintain. (List comprehensions spanning 

multiple lines, anyone? Clever map/filter/reduce that is hard to read?) 

The goal of Moka is to create a simple and uniform interface to make it easy to use 

functional paradigms. 

General Idea high level view of Moka. 

Although its syntax is somewhat lispy, Moka has been built with a pythonic mentality and 

thus is perfectly usable in conjunction with the standard python library. 

In fact, moka's constructs are simple wrappers around the builtins list and dict. 

Among the differences are: 

1. how you can chain multiple operations to improve readability. 

2. how, by default, each method returns a new structure (i.e. moka is immutable by 

default). 


3. how moka tries to reduce the friction of using high level functions by providing 

some shortcuts. 

Chaining 

Inspired by jQuery and clojure's '->', we believe chaining constructs are easier to 

read and maintain than deeply nested expressions. 

For instance: 

Dict(a=1, b=2).update(c=3).rem(lambda x, y: x=='a') 

# to debug 

Dict(a=1, b=2).update(c=3).do(print).rem(lambda x, y: x=='a') 

# making it do more complex operations is trivial 

(Dict(a=1, b=2) 

.update(c=3) 

.rem(lambda x, y: x=='a') 

.all(lambda x, y: y

# Equivalent to: 

List(users).keep(lambda x: User.has_permission(x, 'write')) 

Shortcuts 

Wherever it is logical to do so, Moka lets you use keywords as a shortcut for 

operator.*name* 

List([1,2,3]).keep(gt=1) # [2,3] 

# Equivalent to (Using partial application): 

import operator 

List([1,2,3]).keep(operator.gt, 1) # [2,3] 

# Or: 

List([1,2,3]).keep(lambda x: operator.gt(x, 1)) 

List([1,2,3]).keep(eq=1) # [1] 

# Equivalent, using partial application: 

List([1,2,3]).keep(operator.eq,1) # [1] 

# Note: We need to use 'Blank' to reverse the order of arguments. 

# The syntax is contains(haystack, needle). 

# Sadly, the operator module doesn't have a 'in_' 

# function (As it has a not_ function) 

from moka import Blank as _ 

List([4]).keep(operator.contains, [1,2,3,4], _) #4 

# Equivalent to: 

List([4]).keep(lambda x: operator.contains([1,2,3,4], x)) #4 
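The keyword shortcut and the partial application shown above can be imitated in a few lines of plain Python. The `keep` function below is a hypothetical stand-alone helper written for this illustration; it is not Moka's implementation and omits the `Blank` placeholder.

```python
import operator


def keep(seq, func=None, *args, **shortcuts):
    """Return the items of seq for which the predicate holds.

    Hypothetical helper: keep(seq, gt=1) is treated like
    keep(seq, operator.gt, 1); extra positional arguments are
    passed to the predicate after the item itself.
    """
    if shortcuts:
        name, value = list(shortcuts.items())[0]
        func, args = getattr(operator, name), (value,)
    return [item for item in seq if func(item, *args)]
```

With this sketch, `keep([1, 2, 3], gt=1)`, `keep([1, 2, 3], operator.gt, 1)` and `keep([1, 2, 3], lambda x: x > 1)` all select the same items.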

List gives any sequence a functional/chainable interface. 

Summary 

moka.List is a wrapper around the builtin python list. 

All default methods will work as expected. But the ones returning None will instead 

return 'self'. 


All 

Returns True if all elements satisfy a predicate. If the predicate is not callable, the 

identity function is used. 

# All elements are smaller than 100 

List(range(1,10)).all(lambda x: x < 100) 

>>> True 

# All elements are *not* even 

List(range(1,10)).all(lambda x: x % 2) 

>>> False 

# All elements equal 5 

List([5,5,5]).all(eq=5) 

>>> True

Usage in real code 

def user_logged(users): 
    for user in users: 
        if not user.is_logged(): 
            return False 
    return True 

# With moka 

def user_logged(users): 
    return List(users).all(User.is_logged) 
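The same check can also be written with Python's built-in all(). The sketch below assumes a hypothetical User class with an is_logged() method, since the original does not show one.

```python
class User:
    """Hypothetical stand-in for the User class used in the example."""

    def __init__(self, logged):
        self._logged = logged

    def is_logged(self):
        return self._logged


def user_logged(users):
    # True only if every user reports being logged in.
    return all(user.is_logged() for user in users)


print(user_logged([User(True), User(False)]))
```

Like the explicit loop, all() stops at the first user that is not logged in.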

Append 

Same as list.append, but returns a new list. 

List(range(5)).append(5) 

>>> [0, 1, 2, 3, 4, 5] 

Attr 

Shortcut for map(lambda x: x.*name*) 

List([complex(1,2)]).attr('imag') 

>>> [2.0] 

# Also possible to use operator.attrgetter: 

List([complex(1,2)]).map(operator.attrgetter('imag')) 

>>> [2.0] 

Compact 

Remove all falsy elements 

List([None, 0, 2, []]).compact() 

>>> [2] 

# Customize what is false: 

List([None, 0, 2, []]).compact(lambda x: x != None) 

>>> [0, 2, []] 

Count 

Total of elements matching a predicate 

# How many elements smaller than 5? 

List(range(1,10)).count(lambda x: x < 5) 

>>> 4 

# How many 5? 

List(range(1,10)).count(eq=5) 

>>> 1 

# Without predicate, equivalent to len(list) 

List(range(1,10)).count() 

>>> 9 

Do 

*do* invokes a function passing the whole list as first parameter. 

It can be useful to debug or for operations with side-effects. 

# Let 'print' be a function

from __future__ import print_function 

def my_function(my_list, param_1): 

print('My function is called. List: %s Param1: %s' % (my_list, param_1)) 

(List(range(10)) 

.do(print) 

.keep(lambda x: x < 5) 

.do(print) 

.keep(eq=2) 

.do(my_function, 'parameter..')) 

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9] 

[0, 1, 2, 3, 4] 

'My function is called. List: [2] Param1: parameter..' 

>>> [2] 

Empty 

Return True if the list is empty. 

List([]).empty() 

>>> True 

List([None, 0, 0]).empty() 

>>> False 

List([None, 0, 0]).empty(lambda x: not x) 

>>> True 

Extend 

Same as the builtin list.extend but returns the list instead of None. 

List([1,2]).extend([3,4,5]) 

>>> [1, 2, 3, 4, 5] 

Flatten 

Removes multiple levels of nested lists while preserving the elements. 

(List(range(1,8)) 

.map(lambda x: [x,[x,[x]]]) 

.do(print) 

.flatten()) 

[[1, [1, [1]]], [2, [2, [2]]], [3, [3, [3]]], [4, [4, [4]]], [5, [5, [5]]], [6, [6, [6]]], [7, [7, [7]]]] 

>>> [1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5, 5, 6, 6, 6, 7, 7, 7] 

l = [[1,2,3],[4,5,6], [7], [8,9]] 

List(l).flatten() 

>>> [1, 2, 3, 4, 5, 6, 7, 8, 9] 

# Some other ways to flatten lists 

# http://stackoverflow.com/questions/952914/making-a-flat-list-out-of-list-of-lists-in-python 

import itertools 

import operator 

[item for sublist in l for item in sublist] 

sum(l, []) 

merged = list(itertools.chain.from_iterable(l)) 

reduce(lambda x,y: x+y,l) 

reduce(operator.add, l) 
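The one-liners above remove only one level of nesting. For lists nested to arbitrary depth, as in the first Flatten example, a recursive function is one option. This is a plain-Python sketch and not how Moka implements flatten.

```python
def flatten(nested):
    """Return a flat list with the elements of arbitrarily nested lists."""
    flat = []
    for item in nested:
        if isinstance(item, list):
            # Recurse into sub-lists and splice their elements in place.
            flat.extend(flatten(item))
        else:
            flat.append(item)
    return flat


print(flatten([[1, [1, [1]]], [2, [2, [2]]]]))
```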

Join 

Same as string.join but chainable from List

List(['a', 'b']).join(', ') 

>>> 'a, b' 

List(range(10)).map(str).join('') 

>>> '0123456789' 

Insert 

Same as default list.insert but returns the list instead of None 

List([1,2]).insert(2, 3).insert(0,0) 

>>> [0, 1, 2, 3] 

Invoke 

Shortcut for map(lambda x: x.*name*(*args)) 

List(['hello','world']).invoke('title').join(' ') 

>>> 'Hello World' 

(List([7]) 

.map(str) 

.invoke('zfill', 3) 

.join('')) 

>>> '007' 

Item 

Shortcut for map(lambda x: x['name']) 

List([dict(a=1), dict(a=2)]).item('a') 

>>> [1, 2] 

Keep (also known as filter or select) 

Keep *filters* the list based on the given predicate. If the predicate is not callable, the 

identity function is used. 

List(range(1,10)).keep(lambda x: x < 5) 

>>> [1, 2, 3, 4] 

List(range(1,10)).keep(eq=5) 

>>> [5] 

Map 

Map transforms all elements. 

List(range(3)).map(lambda x: x * 2) 

>>> [0, 2, 4] 

(List(range(3)) 

.map(str) 

.join('')) 

>>> '012' 

Rem 

Rem removes the elements satisfying the predicate. If the predicate is not callable, 

the identity function is used. 

List(range(5)).rem(lambda x: x in [1,2]) 

>>> [0, 3, 4] 

List(range(3)).rem(eq=1) 

>>> [0, 2] 

Reverse 

Reverse the list (Same as reversed but chainable) 

List([1,2,3]).reverse() 

>>> [3,2,1] 

Some (alias: has) 

*Some* returns True if at least one element satisfies the predicate. If the predicate is not callable, the identity function is used. 

List(range(5)).some(lambda x: x > 4) 

>>> False 

List(range(5)).some(lambda x: x > 3) 

>>> True 

List(range(5)).some(eq=3) 

>>> True 

# has is an alias for some 

List(['a', 'b', 'c']).has(eq='b') 

>>> True 

Sort 

Same as *sorted* but chainable. 

List([5,3,1]).sort() 

>>> [1, 3, 5] 

List('abcABC').sort().join('') 

>>> 'ABCabc' 

import string 

List('abcABC').sort(key=string.lower).join('') 

>>> 'aAbBcC' 

Uniq 

Removes duplicates. A function may specify what to compare. 

List([1,1,2,3,2,1]).uniq().sort() 

>>> [1, 2, 3] 

List('abcABC').uniq(string.lower).join('') 

>>> 'ACB' # or 'ABC' or 'abc' 

(List('abcABC') 

.uniq(string.lower) 

.sort() 

.map(string.lower) 

.join('')) 

>>> 'abc' 

Dict 

Wrapper around the dict builtin providing a chainable/functional interface. 

Summary 

moka.Dict is a wrapper around the builtin python dict. 

All default methods will work as expected. But the ones returning None will instead 

return 'self'. 

all compact count do empty fromkeys invoke keep map rem some 

All 

Returns True if all elements satisfy a predicate. If the predicate is not callable, the identity function is used. 

Dict(a=1, b=1).all(1) 

>>> True 

Dict(a=1, b=2).all(lambda x, y: y < 3) 

>>> True 

Compact 

Remove all empty elements. A predicate can be given to choose what is considered 

as empty. 

Dict(a=1, b=2, c=None, d=[], e={}).compact() 

>>> {'a': 1, 'b': 2} 

# Remove only 'None' values 

Dict(a=1, b=2, c=None, d=[], e={}).compact(lambda *x: x[1] is None) 

>>> {'a': 1, 'b': 2, 'd': [], 'e': {}} 

Count 

Total of elements matching a predicate. If no predicate is given, returns the number 

of elements. 

# How many elements smaller than 5? 

Dict(a=1, b=2, c=3).count(lambda *x: x[1] < 5) 

>>> 3 

# How many values = 3? 

Dict(a=3, b=3, c=3).count(eq=3) 

>>> 3 

# how many keys are in lower case? 

Dict(a=1, B=2, C=3).count(lambda x,_: x.lower() == x) 

>>> 1 

# Without predicate, equivalent to len(dict) 

Dict(a=1, b=2).count() 

>>> 2 

Do 

*do* calls a function passing the whole dict as first parameter. (Remaining args are 

also passed as arguments to the function). 

It can be useful to debug or for operations with side-effects. 

def my_function(my_dict, param_1): 

print('My function is called. Dict: %s Param1: %s' % (my_dict, param_1)) 

(Dict(a=1, b=2, c=3, d=4) 

.keep(lambda x, y: y < 3) 

.do(my_function, 'parameter..') 

.rem(eq=1))

My function is called. Dict: {'a': 1, 'b': 2} Param1: parameter.. 

>>> {'b': 2} 

Empty 

Return True if there is no element. A predicate can be given to choose what is 

considered 'empty'. 

Dict().empty() 

>>> True 

Dict(a=None).empty() 

>>> False 

Dict(a=None).empty(lambda x, y: y is None) 

>>> True 

Fromkeys 

Same as the builtin but returns a new Dict. 

Dict().fromkeys(range(1,5)) 

>>> {1: None, 2: None, 3: None, 4: None} 

Dict().fromkeys(range(1,5), None) 

>>> {1: None, 2: None, 3: None, 4: None} 

Invoke 

Shortcut for map(lambda x, y: y.function(args..)) 

Dict(a='hello', b='hi').invoke('upper') 

>>> {'a': 'HELLO', 'b': 'HI'} 

Keep 

Filter elements based on a predicate. If the predicate is not callable, the identity 

function is used. 

Dict(a=1, b=2, c=3).keep(lambda x,y: y < 3) 

>>> {'a': 1, 'b': 2} 

Dict(a=1, b=2, c=3).keep(eq=2) 

>>> {'b': 2} 

Map 

Transforms each element of the dict. 

Dict(a='hello', b='hello').map(lambda x,y: (x, x.upper())) 

>>> {'a': 'A', 'b': 'B'} 

Rem 

Remove elements based on a predicate. If the predicate is not callable, the identity 

function is used. 

Dict(a=1, b=2, c=3).rem(lambda x, y: y == 2) 

>>> {'a': 1, 'c': 3} 

Some

Returns True if one or more elements match a predicate. If the predicate is not callable, the identity function is used. 

Dict(a=1, b=2, c=3).some(lambda x, y: y==2) 

>>> True 

Dict(a=1, b=2, c=3).some(lambda x, y: y==4) 

>>> False 

Dict(a=1, b=2, c=3).some(lambda x, y: y > 2) 

>>> True
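For readers without Moka installed, the Dict operations keep, rem, some and all correspond roughly to dict comprehensions and the built-ins any() and all(). The following is a plain-Python sketch; the dictionary and variable names are made up for illustration.

```python
prices = {"a": 1, "b": 2, "c": 3}

# keep: retain the items whose value satisfies the predicate.
kept = {k: v for k, v in prices.items() if v < 3}

# rem: drop the items whose value satisfies the predicate (here: v == 2).
remaining = {k: v for k, v in prices.items() if v != 2}

# some / all: test the values with the built-ins any() and all().
has_two = any(v == 2 for v in prices.values())
all_small = all(v < 3 for v in prices.values())

print(kept, remaining, has_two, all_small)
```

Unlike the chained form, each step here creates a named intermediate result, which is the trade-off the chaining style is meant to avoid.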

2.42 The Transparent Language Popularity Index, see also Sec. 2.17 

First reference occurs in The Transparent Language Popularity Index, see Section 2.17 on page 

108. 


Process Modelling TKP4106 

Heinz A. Preisig 


email: preisig@nt.ntnu.no 

phone: +47-7359-??? 

"Bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla 

bla bla bla bla bla bla bla bla bla. " 

Introductory words to TKP4106, Heinz Preisig (2012) 

This page is the index to the modelling sessions of Process Modelling 

TKP4106. For easy off-line browsing you can download the entire 5 MB pdf-file 

here. There is also a FAQ list and a Syllabus available. All subjects are taught 

(chronologically) in a top-down manner. The Goals give an overview of where 

we are heading. 

Goals (ontology): back 

1. 

2. 

3. 

Goals (paradigms): back 

1. 

2. 

3. 

Goals (modelling): back 

HTML paragraph. HTML paragraph. HTML paragraph. HTML paragraph. HTML 

paragraph. HTML paragraph. HTML paragraph. HTML paragraph. HTML 


paragraph. HTML paragraph. HTML paragraph. 

back 

Predefined number 1a. 

Predefined number 1b. 

Predefined number 1c. 

1. 

2. 

3. 


Frequently asked questions (FAQ) 

Nuts and bolts: 

1. How to create a homepage on the stud server: change permissions here. 

2. Publishing .py files: By default the server folk.ntnu.no treats files ending 
in .py as binary files, so clicking a link to a .py file downloads it. This is 
fine if you want to edit or play around with the file, but not if you only want 
a quick look. However, folk.ntnu.no runs the Apache web server, which can be 
configured (recursively, on a folder-by-folder basis) using a special hidden 
file called .htaccess. The trick is to configure the server so that .py files 
are served with the text/plain MIME type. Because different versions of Windows 
can make working with hidden files confusing, and because the task is very 
simple to solve in Linux, which is available to all students at 
logon.stud.ntnu.no, open PuTTY (it is on most NTNU computers, or you can 
install it on your own) and connect to logon.stud.ntnu.no. Then log on using 
your normal username and password and copy the following into the terminal: 
echo 'AddType text/plain .py' >> ~/public_html/.htaccess 

3. 

4. 

Python: 

1. Use quit() or ctrl + Z to exit Python in the command window. 

2. Comparison operators in Python are the same as in C/C++, namely ==, !=, <, >, <= and >=. 

3. The indexing of lists, vectors, etc. starts at 0 - not at 1 as in FORTRAN 

and Matlab. 

4. Use a colon (:) to terminate the header line of if, else, while and for statements. 

5. The elif in Python corresponds to else if in C/C++. 

6. To start writing Python you must be familiar with the most basic 

programming concepts: 

recursion 

loops (for, while) 

regular expressions 

functions 

7. You must know how to work with basic objects and containers: 

string 

number (float, int) 

list 

dictionary 

8. Finally, you must know the meaning of a few reserved words and built-in functions: 

for, while 

if, else, elif 

def 

len 

return 

int 

help 

import 

dir 

9. Importing mathematics package: 

import math 

10. Importing regular expression package: 

import re 

m = re.match(pattern, string) 

m.group() # the matched text; only valid if m is not None 

11. Working with dictionaries: 

dict.get() 

dict.pop() 

dict.iteritems() 

12. 

13. 
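Several of the points in the list above can be tried out in a few lines. The sketch below is written for Python 3, where dict.items() takes the place of the Python 2 method dict.iteritems(); the variable names and sample values are only illustrative.

```python
import re

# Item 2: comparison operators behave as in C/C++.
is_ordered = 3 != 4 and 3 <= 4

# Item 3: indexing starts at 0.
atoms = ["C", "H", "O"]
first = atoms[0]


# Items 4 and 5: a colon ends each header line; elif replaces "else if".
def sign(x):
    if x < 0:
        return -1
    elif x == 0:
        return 0
    else:
        return 1


# Item 10: re.match returns a match object (or None if nothing matched);
# group() is a method of that object, not of the re module.
match = re.match(r"[A-Z][a-z]?", "He2")
symbol = match.group() if match else None

# Item 11: dictionary access.
masses = {"H": 1.008, "O": 15.999}
carbon = masses.get("C", 0.0)   # a default value instead of a KeyError
oxygen = masses.pop("O")        # removes the key and returns its value
pairs = sorted(masses.items())  # (key, value) pairs of what is left

print(first, sign(-2.5), symbol, carbon, oxygen, pairs)
```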

Unix/Linux/Cygwin: 

1. Find all files of kind TeX or LaTeX in your Documents directory: find ~/Documents/ -iname '*.tex' 

2. Find all occurrences of PYTHON, Python, python etc. in those files: grep -E -i --color 'python' `find ~/Documents/ -iname '*.tex'` 

3. Collect all .py files in every sub-directory into a new file called tmp: for file in **/*.py; do cat $file; done > tmp 

4. Calculate the cumulative number of words in the entire directory tree: ls -R ./**/* | wc -w 

5. Remove comment lines and blank lines from a Python script: grep -Ev '^\s*(#.*)?$' foo.py 

6. 

7. 
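The filter in item 5 can also be expressed in Python with the same regular expression. This is a minimal sketch; the function name strip_comment_lines is made up for illustration.

```python
import re

# Same pattern as in the grep command: a line that is empty, contains only
# white-space, or contains only a comment (optionally indented).
COMMENT_OR_BLANK = re.compile(r"^\s*(#.*)?$")


def strip_comment_lines(source):
    """Return the source text without blank and comment-only lines."""
    kept = [line for line in source.splitlines()
            if not COMMENT_OR_BLANK.match(line)]
    return "\n".join(kept)


example = "# header\nimport re\n\n    # indented comment\nx = 1  # trailing\n"
print(strip_comment_lines(example))
```

As with grep -v, comments that follow code on the same line are kept, because the pattern only matches lines that contain nothing else.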

Windows: 

1. Use quit() or ctrl + Z to exit Python in the command window. 


2. How to use epydoc in the command window: Open the cmd window. 
Change directory (cd) to the folder where epydoc.py was saved 
(C:\Python27\Scripts). Enter the command epydoc.py followed by the path to 
the file you want to run epydoc on (e.g. epydoc.py 
C:\Python27\myfiles\atoms.py). Command line options can be added (e.g. -o 
myfiles\html will send the output to an html folder in myfiles). 

3. How to set the Python path in windows 7: My computer -> system 

properties -> advanced settings -> environment variables -> scroll down 

to path in the window below -> edit -> add ;C:\Python27 at the end of list. 

Press OK. Now you can start python.exe in the cmd window regardless 
of which directory you are in. 

4. How to change text colour in command window: Right click on the 

command line. Choose properties -> colors -> windows text -> choose the 

pale green color. 

5. 

6. 

TextPad: 

1. How to find syntax for regular expressions: help -> help topics -> how to... 

-> find and replace text -> use regular expressions. Will then get a list of 

all legal search expressions. 

2. How to get line numbers: Configure -> preferences -> view -> line numbers (tick the box). 

3. How to get default file ending of .py: Configure -> preferences -> file -> 

default extension: py. 

4. Downloading of syntax highlighting: Choose the one of python(8) -> 

download to the Samples folder where TextPad has been installed. Go to 

TextPad, close all open documents. Choose configure-> new document 

class -> follow the instructions for installation. Remember to tick off the 

Enable syntax highlighting box. In the drop-down window: Syntax 

definition file -> choose the file you just have downloaded. 

5. How to change background colours: Close all open documents. Configure 

-> preferences -> document classes -> python... -> colors -> choose more 

colors -> choose the yellow color close to the centre of the circle. 

6. Use ctrl + tab to switch between open documents. 

7. 

8. 

Last updated: 28 August 2012. © THW+EHW

Using Epydoc 

Epydoc provides two user interfaces: 

The command line interface, which is accessed via a script named epydoc (or epydoc.py on Windows). 

The graphical interface, which is accessed via a script named epydocgui (or epydoc.pyw on Windows). 

Epydoc can also be accessed programmatically; see epydoc's API documentation for more information. 

The Command Line Interface 

The epydoc script extracts API documentation for a set of Python objects, and writes it using a selected output format. 

Objects can be named using dotted names, module filenames, or package directory names. (On Windows, this script 

is named epydoc.py.) 

Command Line Usage (Abbreviated) 

epydoc [--html|--pdf] [-o DIR] [--parse-only|--introspect-only] [-v|-q] 

[--name NAME] [--url URL] [--docformat NAME] [--graph GRAPHTYPE] 

[--inheritance STYLE] [--config FILE] OBJECTS... 

OBJECTS... 

A list of the Python objects that should be documented. Objects can be specified using dotted names (such as 

os.path), module filenames (such as epydoc/epytext.py), or package directory names (such as epydoc/). 

Packages are expanded to include all sub-modules and sub-packages. 

--html Generate HTML output. (default) 

--pdf Generate Adobe Acrobat (PDF) output, using LaTeX. 

-o DIR, --output DIR, --target DIR 

The output directory. 

--parse-only, --introspect-only 

By default, epydoc will gather information about each Python object using two methods: 

parsing the object's source code; and importing the object and directly introspecting it. 

Epydoc combines the information obtained from these two methods to provide more 

complete and accurate documentation. However, if you wish, you can tell epydoc to use 

only one or the other of these methods. For example, if you are running epydoc on 

untrusted code, you should use the --parse-only option. 

-v, -q Increase (-v) or decrease (-q) the verbosity of the output. These options may be repeated 

to further increase or decrease verbosity. Docstring markup warnings are suppressed 

unless -v is used at least once. 

--name NAME The documented project's name. 

--url URL The documented project's URL. 

--docformat NAME 

The markup language that should be used by default to process modules' docstrings. This 

is only used for modules that do not define the special __docformat__ variable; it is 

recommended that you explicitly specify __docformat__ in all your modules. 

--graph GRAPHTYPE 

Include graphs of type GRAPHTYPE in the generated output. Graphs are generated using 

the Graphviz dot executable. If this executable is not on the path, then use --dotpath to 

specify its location. This option may be repeated to include multiple graph types in the 

output. To include all graphs, use --graph all. The available graph types are: 

classtree: displays each class's base classes and subclasses; 

callgraph: displays the callers and callees of each function or method. These 

graphs are based on profiling information, which must be specified using the 

--pstate option. 

umlclass: displays each class's base classes and subclasses, using UML style. 

Methods and attributes are listed in the classes where they are defined. If type 

information is available about attributes (via the @type field), then those types are 

displayed as separate classes, and the attributes are displayed as associations. 

--inheritance STYLE 

The format that should be used to display inherited methods, variables, and properties. 

Currently, three styles are supported:

grouped: Inherited objects are gathered into groups, based on which class they are 

inherited from. 

listed: Inherited objects are listed in a short list at the end of the summary table. 

included: Inherited objects are mixed in with non-inherited objects. 

--config FILE Read the given configuration file, which can contain both options and Python object 

names. This option may be used multiple times, if you wish to use multiple configuration 

files. See Configuration Files for more information. 

The complete list of command line options is available in the Command Line Usage section. 

Examples 

The following command will generate HTML documentation for the sys module, and write it to the directory 

sys_docs: 

[epydoc]$ epydoc --html sys -o sys_docs 

The following commands are used to produce the API documentation for epydoc itself. The first command writes 

html output to the directory html/api, using epydoc as the project name and http://epydoc.sourceforge.net as the 

project URL. The white CSS style is used; inheritance is displayed using the listed style; and all graphs are included 

in the output. The second command writes pdf output to the file api.pdf in the directory latex/api, using Epydoc as 

the project name. 

[epydoc]$ epydoc -v -o html/api --name epydoc --css white \ 

--url http://epydoc.sourceforge.net \ 

--inheritance listed --graph all src/epydoc 

[epydoc]$ epydoc -v -o latex/api --pdf --name "Epydoc" src/epydoc 

Configuration Files 

Configuration files, specified using the --config option, may be used to specify both the list of objects to document, 

and the options that should be used to document them. Configuration files are read using the standard ConfigParser 

module. The following is a simple example of a configuration file. 

[epydoc] # Epydoc section marker (required by ConfigParser) 

# Information about the project. 

name: My Cool Project 

url: http://cool.project/ 

# The list of modules to document. Modules can be named using 

# dotted names, module filenames, or package directory names. 

# This option may be repeated. 

modules: sys, os.path, re 

modules: my/project/driver.py 

# Write html output to the directory "apidocs" 

output: html 

target: apidocs/ 

# Include all automatically generated graphs. These graphs are 

# generated using Graphviz dot. 

graph: all 

dotpath: /usr/local/bin/dot 

A more complete example, including all of the supported options, is also available. 
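To get a feeling for how such a file is parsed, a reduced version of the example can be read with Python's standard configparser module (the Python 3 name of ConfigParser). This sketch only illustrates the file format and is not epydoc's own loading code; the splitting of the modules option on commas is an assumption made for the illustration.

```python
import configparser

# A reduced copy of the example above, without comments and repeated options.
CONFIG_TEXT = """\
[epydoc]
name: My Cool Project
url: http://cool.project/
modules: sys, os.path, re
output: html
target: apidocs/
"""

parser = configparser.ConfigParser()
parser.read_string(CONFIG_TEXT)
section = parser["epydoc"]

# Assumption for this sketch: module names are separated by commas.
modules = [name.strip() for name in section["modules"].split(",")]

print(section["name"], modules, section["target"])
```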

The Graphical Interface 

Epydoc also includes a graphical interface, for systems where command line interfaces are not convenient (such as 

Windows). The graphical interface can be invoked with the epydocgui command, or with epydoc.pyw in the Scripts 

subdirectory of the Python installation directory under Windows. Currently, the graphical interface can only generate 

HTML output.

Use the Add box to specify what objects you wish to document. Objects can be specified using dotted names (such as 

os.path), module filenames (such as epydoc/epytext.py), or package directory names (such as epydoc/). Packages 

are expanded to include all sub-modules and sub-packages. Once you have added all of the modules that you wish to 

document, press the Start button. Epydoc's progress will be displayed on the progress bar. 

To customize the output, click on the Options arrow at the bottom of the window. This opens the options pane, which 

contains fields corresponding to each command line option. 

The epydoc graphical interface can save and load project files, which record the set of modules and the options that 

you have selected. Select File->Save to save the current modules and options to a project file; and File->Open to 

open a previously saved project file. (These project files do not currently use the same format as the configuration 

files used by the command line interface.) 

For more information, see the epydocgui(1) man page.

Documentation Completeness Checks 

The epydoc script can be used to check the completeness of the reference documentation. In particular, it will check 

that every module, class, method, and function has a description; that every parameter has a description and a type; 

and that every variable has a type. If the -p option is used, then these checks are run on both public and private 

objects; otherwise, the checks are only run on public objects. 

epydoc --check [-p] MODULES... 

MODULES... 

A list of the modules that should be checked. Modules may be specified using either filenames (such as 

epydoc/epytext.py) or module names (such as os.path). The filename for a package is its __init__.py file. 

-p Run documentation completeness checks on private objects. 

For each object that fails a check, epydoc will print a warning. For example, some of the warnings generated when 

checking the completeness of the documentation for epydoc's private objects are: 

epydoc.html.HTML_Doc._dom_link_to_html........No docs 

epydoc.html.HTML_Doc._module..................No type 

epydoc.html.HTML_Doc._link_to_html.link.......No descr 

epydoc.html.HTML_Doc._author.return...........No type 

epydoc.html.HTML_Doc._author.authors..........No descr, No type 

epydoc.html.HTML_Doc._author.container........No descr, No type 

epydoc.html.HTML_Doc._base_tree.uid...........No descr, No type 

epydoc.html.HTML_Doc._base_tree.width.........No descr, No type 

epydoc.html.HTML_Doc._base_tree.postfix.......No descr, No type 

If you'd like more fine-grained control over what gets checked, or you would like to check other fields (such as the 

author or version), then you should use the DocChecker class directly. 
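As an illustration of what the checks look for, the sketch below shows a function whose docstring gives a description, a description and type for every parameter, and a return type, using the epytext fields @param, @type, @return and @rtype. The function molar_mass is a made-up example, and whether it passes every check has not been verified against epydoc.

```python
__docformat__ = "epytext en"


def molar_mass(counts, masses):
    """
    Sum the atomic masses of a molecule.

    @param counts: Number of atoms of each element, keyed by element symbol.
    @type counts: dict
    @param masses: Atomic mass of each element, keyed by element symbol.
    @type masses: dict
    @return: The molar mass in the unit used by C{masses}.
    @rtype: float
    """
    return sum(n * masses[symbol] for symbol, n in counts.items())


print(molar_mass({"H": 2, "O": 1}, {"H": 1.008, "O": 15.999}))
```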

HTML Files 

Every Python module and class is documented in its own file. Index files, tree files, a help file, and a frames-based 

table of contents are also created. The following list describes each of the files generated by epydoc: 

index.html 

The standard entry point for the documentation. Normally, index.html is a copy of the frames file 

(frames.html). But if the --no-frames option is used, then index.html is a copy of the API documentation 

home page, which is normally the documentation page for the top-level package or module (or the trees page if 

there is no top-level package or module). 

module-module.html 

The API documentation for a module. module is the complete dotted name of the module, such as sys or 

epydoc.epytext. 

class-class.html 

The API documentation for a class, exception, or type. class is the complete dotted name of the class, such as 

epydoc.epytext.Token or array.ArrayType. 

module-pysrc.html 

A page with the module's colorized source code, with links back to the objects' main documentation pages. The 

creation of the colorized source pages can be controlled using the options --show-sourcecode and 

--no-sourcecode. 

module-tree.html 

The documented module hierarchy. 

class-tree.html 

The documented classes hierarchy. 

identifier-index.html 

The index of all the identifiers found in the documented items. 

term-index.html 

The index of all the term definitions found in the docstrings. Term definitions are created using the Indexed 

Terms markup. 

bug-index.html 

The index of all the known bugs in the documented sources. Bugs are marked using the @bug tag. 

todo-index.html 

The index of all the to-do items in the documented sources. They are marked using the @todo tag.

help.html 

The help page for the project. This page explains how to use and navigate the webpage produced by epydoc. 

epydoc-log.html 

A page with the log of the epydoc execution. It is available by clicking on the timestamp below each page, if the 

documentation was created using the --include-log option. The page also contains the list of the options 

enabled when the documentation was created. 

api-objects.txt 

A text file containing each available item and the URL where it is documented. Each item takes one line of the file 

and is separated from the URL by a tab character. Such a file can be used to create external API links. 

redirect.html 

A page containing JavaScript code that redirects the browser to the documentation page indicated by the 

accessed fragment. For example opening the page redirect.html#epydoc.apidoc.DottedName the browser 

will be redirected to the page epydoc.apidoc.DottedName-class.html. 

frames.html 

The main frames file. Two frames on the left side of the window contain a table of contents, and the main frame 

on the right side of the window contains API documentation pages. 

toc.html 

The top-level table of contents page. This page is displayed in the upper-left frame of frames.html, and provides 

links to the toc-everything.html and toc-module-module.html pages. 

toc-everything.html 

The table of contents for the entire project. This page is displayed in the lower-left frame of frames.html, and 

provides links to every class, type, exception, function, and variable defined by the project. 

toc-module-module.html 

The table of contents for a module. This page is displayed in the lower-left frame of frames.html, and provides 

links to every class, type, exception, function, and variable defined by the module. module is the complete 

dotted name of the module, such as sys or epydoc.epytext. 

epydoc.css 

The CSS stylesheet used to display all HTML pages. 
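The api-objects.txt file described above is easy to read back, since each line holds an item name and a URL separated by a tab. The sketch below assumes exactly that layout; the function name and the sample line are hypothetical and only follow the naming seen in the redirect.html example.

```python
def parse_api_objects(text):
    """Map each documented item name to the URL on the same line."""
    index = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # skip empty lines
        name, url = line.split("\t", 1)
        index[name] = url
    return index


# Hypothetical sample line: name, tab, URL.
SAMPLE = "epydoc.apidoc.DottedName\tepydoc.apidoc.DottedName-class.html\n"
print(parse_api_objects(SAMPLE))
```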

CSS Stylesheets 

Epydoc creates a CSS stylesheet (epydoc.css) when it builds the API documentation for a project. You can specify 

which stylesheet should be used using the --css command-line option. If you do not specify a stylesheet, and one is 

already present, epydoc will use that stylesheet; otherwise, it will use the default stylesheet. 


Syllabus 

Week: One, Two, Three, Four, Five, Six, Seven, Eight, Nine, Ten, Eleven, Twelve, Thirteen 

• • • 

Programming topics: 

Introduction to Python 
Running Python from the command line using text files. 

Getting started 
Editors and regular expression search-and-replace, and the handling of multiple files. 

Documentation 
Embedded documentation (epytext), and automatic documentation (epydoc). 

Molecular formula parser 
Backus-Naur formalism, regular expressions, and string parsing. 

The atom matrix 
Dictionaries (hash tables) and iterators. 

Independent reactions 
Matrix algebra, null space, and the mass balance of chemically reacting systems. 

Root solvers 
Solving non-linear problems in one variable. Safeguarding the iteration. 

A thermodynamic equation solver 
Solving a specification in H,p,N1,N2,... with respect to T,V,N1,N2,... 

The reactor model 
Making a generic simulation model for plug-flow reactors. 

Integration 
Solving ODEs using explicit and implicit Euler integration. 

Unit testing 
Verification and validation of computer code, and exception handling. 

Putting the model to work 
Unit testing the model, and producing plots. 

• • • 

Modelling topics: 

Introduction to modelling 
The concept of mathematical modelling, modelling scenarios, and topologies. 

Topology 
An algebraic view on modelling. 

Mass balance 
The mass balance principle and chemical reactions. 

Energy balance 
The concepts of internal energy, heat and work. 

Steady state 
Dynamic states without dynamics. 

Physical events 
Singularities in Nature. Do they exist? 

Matrix theory 
Linear algebra is one way of organizing our equations. 

ODE 
Ordinary differential equations. 

PID 
Process control. 

AAA 
Whatever about subject AAA. 

BBB 
Whatever about subject BBB. 

CCC 
Whatever about subject CCC. 

Last updated: 28 August 2012. © THW+EHW 

Regular Expression Search-and-replace 

"There is no reason anyone would want a computer in their home." 

Ken Olsen, founder of DEC (1977) 

Zooball/Dove 

Assignments 

1. Read A Smalltalk about Modelling. The paper explains some of the 

reasons why you should learn about computer languages in your natural 

science study. 

2. Install either Vim, Emacs, Smultron or TextPad on your computer. 

Change the color preferences to light grey or pastel background, black 

text and low brightness highlight colors. Never use a gleaming white 

background and bright red, blue, green, etc. colors. The contrast will 

affect your eyes badly. The reason is that you will at times be staring very 
intensely at the screen for a long time, thinking hard about an algorithm or 
hunting for a bug. This work mode is very different from what you have 
experienced before using e.g. word processors, so you must learn to take 
care of your eyes! 

3. Convert critical_data from XML (eXtensible Markup Language) to CSV 
(Comma-Separated Values) format. Often, it is safer to use semicolon 
rather than comma as the field separator, especially if the fields 
themselves contain commas (like many chemical component names do). 
Or, you can enclose the field in double quotes and still use comma 
as the separator. 

Note: Line endings differ between Windows (carriage return + newline), 
classic Mac OS (carriage return) and Unix (newline). These characters 
have ASCII codes 13 (CR) and 10 (NL) respectively. Their regular 
expression equivalents are \r and \n. Modern editors are aware of this 
problem and let you change the newline character(s) to whatever you like 
before saving the file. This becomes important when you are matching 
strings that span several lines in the file. 

XML belongs to a world of its own, but we do not need to know 
much about the language to solve this task. We only need to 
identify the repetitive pattern that is used to store our 
data. The characteristic encoding of the XML-file is: 

 

 

 

 

 

 

 

 

 

 

 

 

... 

 

The output shall be on the form: 

Name, Tc, Pc, Vc, Zc 

, K, atm, cc mol^{-1}, 

"ACETIC ANHYDRIDE", 569, 46.2, 290, 0.287 

... 

4. Convert all files in Archive from their non-standard in-house format to 

CSV format. 

In programming, working with multiple source files is the rule 
rather than the exception. For a couple of files I would probably 
make the changes by hand, but if the number of files grows to 5 
or maybe 10 I would definitely look for a pattern to see if it is 
possible to make simultaneous changes to all the files. The encoding 
of the data files does in this case follow a very simple 
pattern: 

DALEX76B 

Alexandrov, A.A., Khasanahin, T.S., and Larkin, D.K. 

Paper to the Working Group 1 of the IAPS, Kyoto, Japan, (1976). 

T90(K) P(MPa) d(kg/m3) 

96 

423.114 55.568000 945.20639 

423.114 40.152000 938.01591 

... 

The output shall be on the form: 

T90, P, d 

K, MPa, kg/m3 

423.114, 55.568000, 945.20639 

423.114, 40.152000, 938.01591 

... 

5. Make sure the output files can be opened without trouble in Excel or 

OpenOffice. 
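The conversion in item 4 can be sketched in a few lines of Python. This is only a minimal sketch, assuming Python 3 and assuming that every file follows the layout shown above: three lines of identification, one header line of the form Name(unit), one line with the number of rows, and then the data. The function name inhouse_to_csv and the two-row sample are made up for illustration. The csv module writes a plain comma without a trailing space, and it puts double quotes around any field that itself contains a comma, which also covers the concern raised in item 3.

```python
import csv
import io
import re

def inhouse_to_csv(text):
    """Convert one in-house data file (given as a string) to CSV text."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    # Lines 1-3: data set id, authors, reference (not used in the output).
    # Line 4: column header such as 'T90(K) P(MPa) d(kg/m3)'.
    # Line 5: number of data rows (not needed, we simply read to the end).
    header = re.findall(r'(\w+)\((.*?)\)', lines[3])
    names = [name for name, unit in header]
    units = [unit for name, unit in header]
    rows = [ln.split() for ln in lines[5:]]
    out = io.StringIO()
    writer = csv.writer(out, lineterminator='\n')
    writer.writerow(names)
    writer.writerow(units)
    writer.writerows(rows)
    return out.getvalue()

# Shortened sample: the row count is set to 2 here instead of 96.
sample = """DALEX76B
Alexandrov, A.A., Khasanahin, T.S., and Larkin, D.K.
Paper to the Working Group 1 of the IAPS, Kyoto, Japan, (1976).
T90(K) P(MPa) d(kg/m3)
2
423.114 55.568000 945.20639
423.114 40.152000 938.01591
"""

print(inhouse_to_csv(sample))
```

To process a whole directory, the same function could be called in a loop over the file names, writing each result to a file with the .csv extension.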

Regular expressions belong to the simplest of all languages. An excerpt from

Wikipedia informs us that: "In computing, a regular expression, also referred to 

as regex or regexp, provides a concise and flexible means for matching strings 

of text, such as particular characters, words, or patterns of characters. A regular 

expression is written in a formal language that can be interpreted by a regular 

expression processor." Regular expressions are widely used for 
analyzing text, defining programming language syntax and for generic search-and-replace 

in editors. A very short overview of the basic commands is given 

below: 


^ Start of a string 

$ End of a string 

. Any character (except \n) 

* 0 or more of previous expression 

+ 1 or more of previous expression 

? 0 or 1 of previous expression 

\w Matches any word character 

\W Matches any non-word character 

\s Matches any white-space character 

\S Matches any non-white-space character 

\d Matches any decimal digit 

\D Matches any nondigit 

[abc] Matches any single character included in the set 

[^abc] Matches any single character not in the set 

[a-z] Contiguous character ranges 

(a|b) a or b 

ab{2} Matches a followed by two b characters 

(expr) Makes a backreference of whatever is matched. 

The backreference is made available as \1 or $1 

in many search-and-replace routines. 

A few examples follow. The text string we want to analyze is: "Hello TKP4106!" 


^.*$ Matches 'Hello TKP4106!' 

^[a-zA-Z0-9 !]*$ Matches 'Hello TKP4106!' 

^.*(o T).*$ Matches 'Hello TKP4106!' (\1=>'o T') 

\w+ Matches 'Hello' 

\s\w+ Matches ' TKP4106' 

\d+ Matches '4106' 

\W Matches ' ' 

\w*(\W+)\w*(\W+) Matches 'Hello TKP4106!' (\1=>' ' and \2=>'!') 
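The matches listed above can be checked with Python's re module. The sketch below assumes Python 3, and the helper first_match is a made-up name for illustration. It returns the first match together with the captured groups, which correspond to \1 and \2 in the table.

```python
import re

text = "Hello TKP4106!"

def first_match(pattern, string):
    """Return (matched text, captured groups) for the first match, or None."""
    m = re.search(pattern, string)
    if m is None:
        return None
    return m.group(0), m.groups()

patterns = [
    r'^.*$',
    r'^[a-zA-Z0-9 !]*$',
    r'^.*(o T).*$',
    r'\w+',
    r'\s\w+',
    r'\d+',
    r'\W',
    r'\w*(\W+)\w*(\W+)',
]

for p in patterns:
    print(p, first_match(p, text))
```

Note that re.search reports only the first match in the string, which is why \W gives the space and not the exclamation mark.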

Maybe you remember the "burglar's language" from your childhood? It was a 
simple translation of all consonants b, c, d, etc. into bob, coc, dod, etc. So, 
"Python" would become "popytothohonon". This is hard exercise for your 
tongue but it is very easy to achieve with regular expressions: 


Search for: ([^aeiouy\W]) 

Replace by: \1o\1 
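The same search-and-replace can be tried in Python with re.sub. This is a small sketch, assuming Python 3 and assuming that the character class starts with a caret, so that it matches every word character that is not one of the listed vowels. The function name burglar is made up for illustration.

```python
import re

def burglar(text):
    """Double every consonant with an 'o' in between (bob, coc, dod, ...)."""
    # [^aeiouy\W] matches a word character that is not a lowercase vowel;
    # \1 in the replacement refers to the character that was captured.
    return re.sub(r'([^aeiouy\W])', r'\1o\1', text)

print(burglar('python'))
```

The pattern is case sensitive, so a capital P is repeated as a capital letter, and digits and underscores are treated like consonants. A more careful version would need a more restrictive character class.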

There is plenty of regex documentation on the Web. This link to Regular 

Expressions seems quite OK. Note, however, that there are many flavors of 

regular expressions and that the syntax can (will) differ when you switch 

between two different editors, operating systems or programming languages. 


Last updated: 16 October 2011. © THW+EHW

Quotes: Prophecy, Prophets 

(something people get tired of hearing someone say, "I told you it would happen.") 

prophecy 

1. A prediction of a future event that is believed to reveal the will of a deity. 

2. A prediction that something will occur in the future. 

prophet 

1. Someone who foretells or predicts what is to come; such as, a weather prophet or prophets of doom. 

2. A spokesperson of some doctrine, cause, or movement. 

Quotations 

Prophecies, or judgments, that have proven to be false: 

1. "Computers, in the future, may weigh more than 1.5 tons." —Popular Mechanics, forecasting the 

relentless march of science, 1949. 

2. "I think there is a world market for, maybe, five computers." —Thomas Watson, chairman of IBM, 

1943. 

3. "I have traveled the length and breadth of this country, and talked with the best people, and I can 

assure you that data processing is a fad that won't last out the year." —The editor in charge of business 

books for Prentice Hall, 1957. 

4. "But what . . . is it good for?" —Engineer at the Advanced Computing Systems Division of IBM, 1968, 

commenting on the microchip. 

5. "There is no reason anyone would want a computer in their home." —Ken Olson, president, chairman 

and founder of Digital Equipment Corp., 1977. 

6. "This 'telephone' has too many shortcomings to be seriously considered as a means of communication. 

The device is, inherently, of no value." —Western Union internal memo, 1876. 

7. "The wireless music box has no imaginable commercial value. Who would pay for a message sent to 

nobody in particular?" —David Sarnoff's associates in response to his urgings for investment in the 

radio in the 1920s. 

8. "The concept is interesting and well-formed. But, in order to earn better than a 'C', the idea must be 

feasible." —A Yale Univ. management professor in response to Fred Smith's paper proposing reliable 

overnight delivery service. (Smith went on to found Federal Express Corp). 

9. "Who wants to hear actors talk?" —H.M. Warner, Warner Brothers, 1927. 

10. "I'm just glad it will be Clark Gable who is falling on his face and not Gary Cooper." — Gary Cooper on 

his decision not to take the leading role in Gone With The Wind. 

11. "A cookie store is a bad idea. Besides, the market research reports say America likes crispy cookies, not 

soft and chewy cookies like you make." —Response to Debbi Fields' idea of starting Mrs. Fields' 

Cookies. 

12. "We don't like their sound and guitar music is on the way out." —Decca Recording Co. rejecting the 

Beatles, 1962. 

13. "Stocks have reached what looks like a permanently high plateau." —Irving Fisher, Professor of 

Economics, Yale University, 1929. 

14. "Airplanes are interesting toys, but of no military value." —Marechal Ferdinand Foch, Professor of 

Strategy, Ecole Superieure de Guerre. 


1 Background 

A Smalltalk † about modelling 




5 June 2009 

The modern era of the human race is deeply rooted in the Enlightenment and the 

contemporary search for a rational description of nature. Man is the only 
animal on planet Earth that systematically investigates, interprets, and employs the basic 
laws of nature to its own benefit. It is not too much to state that the understanding of 

the laws of nature has paved our road to technological success, and to the proliferation 

of our own species beyond any control. But, notwithstanding the tremendous success we 

have had in the technological arena there is still room for a more accurate understanding 

of natural phenomena and in particular those of complex nature. 

We tend to think that a complex system must be technically intricate as well. That 

is wrong. For example: Life at the kitchen sink is quite simple (technically), but at the 

same time so complex (mathematically) that it is possible to enjoy a full academic career 

trying to explain all the physical phenomena that are observed: Drop formation, water 

twirls, shock fronts, bubble coalescence, foams, vortices, etc. This daily experience, 

which we rarely appreciate, is quite contrary to the situation in the laboratory. There, 

we try to eliminate all random factors in order to understand one particular phenomenon. 

The outcome of the study can be a measured value of some kind, or the input to a refined 

model of the phenomenon being studied. Actually, the old saying “seeing is believing” 

is for us akin to “observing is explaining”. Every observable physical phenomenon must 

find a rational explanation. There is no easy escape from this dilemma because we believe 

so strongly in our present understanding of the physics. But there are insurmountable 

problems in explaining all the nitty-gritty details of Nature. We pretend, therefore, that 

our models are too simple still. 

Collecting many small pieces of information makes us able to understand and model 
parts of the world around us. At this point the use of computers has strengthened our capabilities 
of formulating and solving complex physico-mathematical models for a diverse 

set of industrial operations like fluid transport, chemical reaction, separation, casting, 

† Smalltalk is a purely object-oriented programming language developed in the 1970s and 
released publicly as Smalltalk-80 in 1980. It later inspired the development of Ruby, 
a modern scripting language of the same breed as Perl and Python. 


electrolysis, extrusion and rolling. The continuum description of a full-sized control volume 

with stress–strain interactions and complicated geometry may now be formulated 

and solved as systems of equations with millions of unknowns. Weather forecasting is 

maybe the ultimate example. 

2 Computer science 

Modelling does also depend on numerical issues like rounding error, computation speed, 

memory capacity and discretization schemes. Focus is thereby lifted from the understanding 

of the laws of nature to the understanding of numerics and computer languages. 

Most important maybe, is the observation that a physical model can be refined 

indefinitely without coming to a full answer to “life, universe and everything”. All models 

have to give in at some point of refinement. This has to do with the granularity of the 

model. The calculation of fluid flow, for instance, does normally ignore the propagation 

of sound waves. So, if sound waves are important, the model will fail. It does not matter 

how many parameters we introduce, or how clever we are at tweaking the numbers. It does 

simply fail. We say that the model must be validated against experiments to be trusted. 

Another unfortunate situation occurs when the model gives consistently wrong results. 

Changing the direction of gravity for instance would cause a stone to fall upwards. Apart 

from this flaw all the derived results could be correct. There is no way a computer can 

understand or check this out without human interaction. The programmer must verify 

that the equations are solved correctly. Our first statement about modelling is therefore: 

Validation: The right model is made (experiment decides) 
Verification: The model is made right (programmer decides) 

The secret is to make sure that the model has the right granularity with respect to what 

it is supposed to do, and to choose an implementation that makes the best out of the 

time available and the human resources. The old rule of thumb that one line of code is 

equivalent to one working hour is still valid. For bigger projects devoted to advanced 

modelling this number may easily drop to two lines per day. It is impossible to give a 

totally satisfactory implementation guide to all kinds of physical problems, but it pays to 

keep a close eye on the physics (mostly conservation laws), the solution methods, and the 

program structure. Ideally, a physico-mathematical model consists of four main parts: 

1. A deterministic ∗ function (the model) 

2. Model parameters (perhaps quite many and ill-organized) 

3. A numerical solver (normally linearized) 

4. Calculated results (vector fields or matrices maybe) 

∗ Quantum mechanics makes an interesting case in physical modelling since it is not strictly 
deterministic. 
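As a toy illustration of the four parts listed above, consider the following sketch in Python. Everything in it is an assumption made for the sake of the example: the "model" is the trivial equation x*x - a = 0, the parameter set contains only a, and the solver is a plain Newton iteration with a numerical derivative. A real physical model would of course be far richer, but the division of labour is the same.

```python
def residual(x, params):
    """1. Deterministic model function (a stand-in: x*x - a = 0)."""
    return x * x - params['a']

# 2. Model parameters, kept apart from the model function itself.
params = {'a': 2.0}

def newton(f, x0, params, tol=1e-12, max_iter=50, h=1e-7):
    """3. Numerical solver: Newton iteration on a local linearization."""
    x = x0
    for _ in range(max_iter):
        fx = f(x, params)
        if abs(fx) < tol:
            return x
        # Central difference approximation of df/dx.
        dfdx = (f(x + h, params) - f(x - h, params)) / (2 * h)
        x -= fx / dfdx
    raise RuntimeError('no convergence within max_iter iterations')

# 4. Calculated result (a single number here, a vector field in general).
result = newton(residual, 1.0, params)
print(result)
```

Keeping the model function, the parameters and the solver apart makes it possible to verify the solver on a problem with a known answer before the model is validated against experiments.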


Considering these four parts of the model from the very beginning will inevitably limit 

the modelling task to comply with the available human resources. But even the best 

modelling practice gives no clue about how the model is going to be used. Should it be 

a stand-alone tool or made part of a program library? Is it required to make a compiled 

program or will an interpreted script do? In higher education it would be very beneficial 

if the joint modelling efforts from all the math and science classes were put into a small 

toolbox that the students could bring out from university into their future jobs. The 

current situation is nearly the opposite and that is not prosperous for academia. To 

shed some light on this topic I would like to present a somewhat personal view on the 

links between programming languages, modelling and model uses: 

Languages |= Mathematics |= Physics 

|= Modelling |= Simulation 

|= Animation |= GUI 

The binary operator |= means a dependency, in the sense that Mathematics relies on 
a (formal) Language, Physics relies on Mathematics, Modelling relies on Physics, etc. In 
the late medieval period European universities taught natural languages (Greek and 

Latin mostly), medicine, theology and astronomy. About 300 years ago mathematics 

and physics entered the scene as subjects of their own, while modelling and simulation 

were not commonplace till after WWII. These subjects were quite early moved out of 

the university, however, and safely placed in governmental research institutes, mostly 

connected to defense and aero-space industries. Animation belongs to the computer 

science era, and Graphical User Interfaces (GUI) had basically to await the introduction 

of the Windows 3.1 operating system in the early 1990s. 

3 Natural sciences 

As a consequence of our expanding knowledge it becomes increasingly hard to give 
priority to one particular subject at the cost of the others. Like Figure 1 says: What 

is the most important subject to teach first? Languages or GUI? Not an easy question 

because mathematics is a language of its own and a textbook is a kind of a graphical 

user interface. Or, perhaps the subjects should be taught in parallel? There are no 
definite answers to these questions, yet we must choose what to teach, when to teach and 

how to teach it. It is interesting to note that our education system which started out 

teaching natural languages several hundreds of years ago has by now ended up as a big 

consumer of formal language procedures and computer programs. 

Classic knowledge has in a way been replaced by synthetic know-how. Just think 

about the use of Internet as a platform for collecting and retrieving information. The 

funny thing is that this change has not been taken into account in the natural science 

curriculums we see today. Retrospectively, the computer was born in a top secret physics 

lab but quickly moved out to become an everyday entertainment machine. It shall be 

our challenge to bring it back into scientific teaching as a mind extender—not a mind 


Figure 1: What is the most important subject to teach first? Languages or GUI? Or, 

perhaps the subjects should be taught in parallel? 

boggler. In order to do this we need to understand the buzzwords mentioned above, and 

we need to make a choice about where we should put our efforts. The worst scenario is 

doing a little of everything which easily ends up in nothing. 

Let it be my bold statement that the university must focus on the teaching of formal 

Languages, Mathematics and Physics. This is a very conservative approach, but on top 

of this we should introduce Modelling as a separate issue from day one at the university. 

This does not mean that the students shall run commercial software with advanced 

graphical interfaces. It means, however, that the computer (language) development has 

come to a point where it is possible to solve (non-linear) physical problems at a pace 

that was unimaginable 15 years ago. So, rather than talking about models—not to say 

model simplifications—we can teach the students how to model. Our focus can thereby 

be shifted from mathematical details † to physical insight. 

At the same time it is important to make a sharp distinction between modelling and 

simulation. Modelling is the mathematical description of a physical event into a formal 

language, while simulation is the systematic use of models to study a complete process. 

Simulation is great for validation purposes and for our understanding of complex systems, 

but it should definitely be kept out of the classroom because it does not bring in any new 

understanding of the basics. The control people may disagree with me here, but I am 

talking about basics in the sense of physics—not about systems behaviour. 

The situation is somewhat different when it comes to Animation and GUI since 

these subjects are touched upon already in the elementary school. Moreover, the World 

Wide Web is a gigantic software enterprise which cannot possibly be kept out of the 

classroom. It is also true that the Ministries of Education worldwide think these topics 

are especially important, maybe because “seeing is believing”. I believe these simplistic 

thoughts are harmful, however, because only a small fraction of the resources spent on 

developing computer games, movies, music and entertainment finds its way back to where 
it all started; namely increasing the knowledge of the world around us. E.g. the Avatar 
(2009) movie, which by all means was a trendsetter, is a good example of how reality and 

fiction can be seamlessly merged using a good deal of computing power. But, however 

breathtaking the movie is, it does not increase our understanding of the world around 

† The mathematicians do not need to worry. There is plenty of room for a thorough mathematical 

underpinning in all physical disciplines. 


us. 

It is also a common misconception that kids in general get very excited, and want to 

learn science, by simply watching animations and simulations on the computer screen. 

This is simply not true as virtually all students today have watched animated TV programs 

and fabulous action movies since they were 3 years old. The professors are enthusiastic, 

but the students think it is downright boring. However, it is our duty to teach 

the students natural sciences, and even though it is sad to watch how the universities 

in Norway are lacking a good strategy on how to cope with this undertaking now that 
we have definitely entered the computer age, we must do something. In my opinion this 
something should be a mix of traditional mathematics, physics and chemistry, interspersed 

with modelling as a tool for learning. The second statement about modelling 

(and computer science in natural science education) is therefore that we should limit our 

focus to: 

Languages |= Mathematics |= Physics |= Modelling 

It is necessary to put some emphasis on the learning of formal languages to understand 

what can be done on a computer, not only how it can be done. The common double– 

clicking–machine is good for everyday surfing on the web and manipulating song lists, 

but it has nothing to do with scientific computing. The situation today is that all 

students are trained in their mother tongue, and in one or two foreign languages. This 

is very good but it is worth a second thought that they are not equally well trained in 

speaking any of the computer languages. Quite interestingly though, since they may 

easily spend 3–8 hours behind the screen every day. Some people would claim that 

there are more than 5000 computer languages today and that the students cannot learn 

everything, but formal languages are quite simplistic and follow the same basic ideas: 

Alphabet, vocabulary, syntax and semantics. The crucial point is that the students must 

learn how to express their thoughts (model = structure + physics + math) in at least 

one such language. To ignore this focus is like traveling to a foreign country without 

knowing the local lingo: You will be nothing but a tourist. In my opinion students of 

natural sciences at NTNU should definitely not be computer tourists. They should know 

how to master their new frontier. 


5.1.3 Regular Expressions, see also Sec. 2.11 

First reference occurs in Regex (Stephen Ramsay), see Section 2.11 on page 77. 



www.whatip.org 


Your IP Address Is 84.52.217.188 

Your Browser reports: 

Mozilla/5.0 (Macintosh; Intel Mac OS X 10_6_8) 

AppleWebKit/534.52.7 (KHTML, like Gecko) wkpdf/0.5.0 

Provided by www.whatip.org 

Give the above information to whoever asked you to 

visit this website. 

Does this work with any computer? 

Yes, www.whatip.org works with any Macintosh, Windows 

or Linux machine, it even works with mobile phones and 

other embedded browsers. 

What is an IP address? 

IP Addresses are what identify your computer on the 

Internet. Think of it as a phone number, every time your 

computer connects to the Internet it obtains an address 

to be able to make a connection to the websites, email 

and other services you use on the Internet. 

IP Addresses are important to technical support 

representatives, webmasters and other tech people 

because they are generally logged on their services, 

which means if you're having a problem, they can then 

search for your IP Address in the logs and find out what is 

going on with their service and correct the problem. 

Why were you sent here? 

Often times it's hard for a support specialist to determine your IP address over the 

phone, depending on how your computer is connected to the internet, your actual PC 

may think it's using a different address than what is actually showing up in their logs. 

By using an outside service like www.whatip.org, the technician can quickly get your 

address without having you go thru complicated steps to determine this on your own. 

Who are you and how do you do this? 


www.whatip.org is run by 3io, Inc. as a public service for the Internet Community. 

3io, Inc. is an internet service company that runs HostItHere.com an Internet 

colocation provider and a number of other free services on the internet. For more 

information feel free to visit our website. 

www.whatip.org uses a simple server call to determine what IP Address you arrived at 

our site from. This may be a proxy (a computer that calls webpages for your network) 

or an address assigned to your cable, DSL, office T1 line or dialup service. 

(c) 2007 3io, Inc. All Rights Reserved. Your IP is 84.52.217.188

Documenting your Code 





The real programmer 

Assignments 

"Real Programmers write programs, not documentation." 

Zooball/Chicken 

1. Install Python v2.7.x on your computer. You are going to run Python from the 
terminal, also called the command window (we don't use IDEs, do we?). I 

suggest you change the color preferences of your terminal to black screen 

and amber or green text. This sounds like an echo from the old days of 

monochrome displays, but it stands the test even today. The terminal is for 

punching in cryptic commands and having maybe thousands of lines of output 

pour over your screen. It is mainly for your information, not for producing 

readable code. A black screen is more relaxing to the eye than a bright 

screen. 

2. Install epydoc on your computer. 

3. a. Download the Python stub program atoms.py. 

b. Run epydoc on the stub file. Use stylesheet TKP4106.css. The syntax is 

explained further down on this page. 

c. Learn how epydoc uses epytext for rendering its output. 

d. Publish the HTML output from epydoc on your home page. 

4. Download the Python scripts morse.py and antimorse.py for translating back 

and forth between the Latin and Morse alphabets. Learn how you can run 

these scripts in the terminal window. Study Python strings in general and 

objects and calls like sys.stdin, re.sub and keywords like import, if-elif-else 

and print in particular. 
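To give a flavour of the language elements mentioned in item 4, here is a small sketch of a Latin-to-Morse translator. It is not the course script morse.py, only an illustration written under my own assumptions: Python 3, letters A to Z only, one space between letters and a slash between words. The names MORSE, to_morse and main are made up.

```python
import re
import sys

MORSE = {
    'A': '.-',   'B': '-...', 'C': '-.-.', 'D': '-..',  'E': '.',
    'F': '..-.', 'G': '--.',  'H': '....', 'I': '..',   'J': '.---',
    'K': '-.-',  'L': '.-..', 'M': '--',   'N': '-.',   'O': '---',
    'P': '.--.', 'Q': '--.-', 'R': '.-.',  'S': '...',  'T': '-',
    'U': '..-',  'V': '...-', 'W': '.--',  'X': '-..-', 'Y': '-.--',
    'Z': '--..',
}

def to_morse(text):
    """Translate text to Morse code, one character at a time."""
    def encode(match):
        ch = match.group(0)
        if ch.upper() in MORSE:
            return MORSE[ch.upper()] + ' '
        elif ch.isspace():
            return '/ '          # word separator
        else:
            return ''            # characters outside the table are dropped
    return re.sub(r'.', encode, text, flags=re.DOTALL).strip()

def main(stream=sys.stdin):
    """Translate each line read from the stream (standard input by default)."""
    for line in stream:
        print(to_morse(line.strip()))

print(to_morse('SOS'))
```

Calling main() at the end of the file instead of the last print statement turns the sketch into a filter that can be fed from the terminal, for example through a pipe.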

The source code documentation can be made at two levels. The traditional 

approach is to write lucid comments directly in the code — either above a block of 

code of major significance, say an if-else test or a for loop — or in-line to the 

right of each code statement. The block comment is easier to format and can be 

shaped into a paragraph of its own, while the in-line comment has the nicety that it 

vanishes if the statement should ever be deleted (a comment which is out of sync 

with the source code is incredibly misleading). I tend to use both comment styles in

my programming of small stand-alone scripts like the Matlab script shown below. 

Note that all the comments have a flush right margin. This helps the reading a lot, 
especially if you have a context-sensitive editor, which is almost certainly the 
case. 


%Simplex algorithm applied to solve a limited LP-problem. The syntax
%is [x,b,A,it] = LP(x,b,A,c). The starting point is a minimization
%problem on the form
%
%  min(c'*y_{k+1})
%  A*y_{k+1} = A*y_{k}
%  y_{k+1} >= 0
%  where:
%  y(b) = x                                     (basis variables)
%  y(f) = 0                                      (free variables)
%
%  x  = solution vector (basis variables)               [m x 1]
%  b  = column indices of basis variables in A          [m x 1]
%  A  = coefficient matrix where rank(A) = m > 1        [m x n]
%  c  = cost vector                                     [n x 1]
%  it = number of iterations spent in this function
%
%Copyright Tore Haug-Warberg 2008 (course TKP4175, KP8108, NTNU)
%
function [x,b,A,it] = LP(x,b,A,c)
%
  f    = 1:length(c);           % temporary list of all variable indices
  f(b) = [];                    % remove basis variables => free variables
%
  for it=1:prod(size(A))        % restricted no of iterations for simplex
    dldx = c(f)' - c(b)'*A(:,f);    % derivatives of d(c'*x)/dx(f)
    if all(dldx>=0)             % all derivatives are non-negative
      return                    % converged, further progress impossible
    else                        % there is at least one negative derivative
      i = find(dldx

the code itself. This approach is suitable for larger projects but it requires a bit of 

metaprogramming, i.e. there is "coding in the coding". It is important, therefore, that 

the mark-up stays out of the way without cluttering the code. This is the 

documentation form used in many programming languages today and tools like e.g. 

Doxygen make it possible to churn out PDF and HTML documentation from many 

different sources of code written in C, C++, Fortran, Ruby, Python, etc. A simpler 

tool that goes with Python is epydoc. It builds on epytext, a kind of docstring 

format. An example is shown below. The code is admittedly polluted by artifacts like 

@summary, @author and other so-called metacommands, but the benefit of doing 

this extra formatting more than outweighs the drawback. From running the source 

code through epydoc 

$ epydoc -v --css=TKP4106.css --parse-only atoms.py 

an HTML Epydoc output file is generated. Notice how the documentation looks 

quite the same independent of the programmer's personal coding style. 


""" 

@summary: Chemical formula parsing suite. Bla-bla. 

@author: Tore Haug-Warberg 

@organization: Department of Chemical Engineering, NTNU, Norway 

@contact: haugwarb@nt.ntnu.no 

@license: GPLv3 

@requires: Python 2.3.5 or higher 

@since: 2011.06.30 (THW) 

@version: 0.9 

@todo 1.0: Bla-bla. 

@change: started (2011.06.30) 

@change: continued (2011.07.12) 

@note: Bla-bla. 

""" 

import re 

def atoms(formula, debug=False, stack=[{}], \
          atom=r'([A-Z][a-z]?)(\d+)?', ldel=r'\(', rdel=r'\)(\d+)?'):

""" 

The 'atoms' parser takes a chemical formula on standard form - something 

like 'COOH(C(CH3)2)3CH3' - and breaks it into a dictionary of recognized 

atoms and their respective occurrences {'C': 11, 'H': 22, 'O': 2}. The 

parsing is performed left-to-right in a recursive manner which means it 

can handle nested parentheses. 

@param formula: a chemical formula 'COOH(C(CH3)2)3CH3' 

@param debug: True or False flag 

@param stack: an initial list of dictionaries 

@param atom: string equivalent of RE matching atom name including an 

optional number 'He', 'N2', 'H3', etc. 

@param ldel: string equivalent of RE matching the left delimiter '(' 

@param rdel: string equivalent of RE matching the right delimiter including 

an optional number ')', ')3', etc. 

@type formula: aString 

@type debug: aBoolean 

@type stack: aList 

@type atom: aRE on raw string format

@type ldel: aRE on raw string format 

@type rdel: aRE on raw string format 

@return: aDictionary e.g. {'C': 11, 'H': 22, 'O': 2} 

""" 

The secret of documentation lies in documenting your code from day one. Always 

make ready for documentation. Never wait. It will be too late before you know. In 

your future job you will be constantly assigned new tasks, which of course are more 

important than the one you are doing at the moment. By adopting a suitable 

documentation style you will always be able to return to your programs after a 

shorter or longer break. Without such a standard you will be lost. As a spin-off you 

can also produce documents that are valuable to your colleagues. It does not 

matter how clever you are in programming if things only work on your 

desktop! 
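To make the docstring of the 'atoms' parser shown earlier on this page more concrete, here is one possible way of counting atoms in a formula. It is only a sketch under my own assumptions, not the solution expected in the assignment: it uses Python 3, an explicit stack of dictionaries instead of recursion, and the delimiter patterns \( and \)(\d+)? for the left and right parentheses. The function name count_atoms is made up to avoid confusion with the stub.

```python
import re

def count_atoms(formula,
                atom=r'([A-Z][a-z]?)(\d+)?',
                ldel=r'\(',
                rdel=r'\)(\d+)?'):
    """
    Count the atoms in a chemical formula.

    @param formula: a chemical formula like 'COOH(C(CH3)2)3CH3'
    @type formula: aString
    @return: aDictionary e.g. {'C': 11, 'H': 22, 'O': 2}
    """
    atom_re, ldel_re, rdel_re = (re.compile(p) for p in (atom, ldel, rdel))
    stack = [{}]                 # created per call, not in the header
    pos = 0
    while pos < len(formula):
        m = atom_re.match(formula, pos)
        if m:                    # an atom with an optional count
            name, n = m.group(1), int(m.group(2) or 1)
            stack[-1][name] = stack[-1].get(name, 0) + n
            pos = m.end()
            continue
        m = ldel_re.match(formula, pos)
        if m:                    # open a new group
            stack.append({})
            pos = m.end()
            continue
        m = rdel_re.match(formula, pos)
        if m and len(stack) > 1: # close the group and scale its counts
            factor = int(m.group(1) or 1)
            group = stack.pop()
            for name, n in group.items():
                stack[-1][name] = stack[-1].get(name, 0) + n * factor
            pos = m.end()
            continue
        raise ValueError('cannot parse %r at position %d' % (formula, pos))
    if len(stack) != 1:
        raise ValueError('unbalanced parentheses in %r' % formula)
    return stack[0]

print(count_atoms('COOH(C(CH3)2)3CH3'))
```

The stack is created inside the function on purpose. A list given as a default argument in the function header is created only once in Python and would be shared between calls.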



5.3.1 The real programmer, see also Sec. 2.1 

First reference occurs in Real Programmers use FORTRAN, see Section 2.1 on page 12. 


Overview 

Epydoc is a tool for generating API documentation for Python modules, based on 
their docstrings. For an example of epydoc's output, see the API documentation 
for epydoc itself (html, pdf). A lightweight markup language called epytext can 
be used to format docstrings, and to add information about specific fields, such 
as parameters and instance variables. Epydoc also understands docstrings written 
in reStructuredText, Javadoc, and plaintext. For a more extensive example of 
epydoc's output, see the API documentation for Python 2.5. 


Epydoc 

Automatic API Documentation Generation for Python 


Latest Release 

The latest stable release is Epydoc 3.0. If you wish to keep up 

on the latest developments, you can also get epydoc from the 

subversion repository. See Installing Epydoc for more 

information. 

Screenshots 

News

Epydoc 3.0 released [January 2008]

Epydoc version 3.0 is now available on the SourceForge download page. See the
What's New page for details. Epydoc is under active development; if you wish to
keep up on the latest developments, you can get epydoc from the subversion
repository. If you find any bugs, or have suggestions for improving it, please
report them on sourceforge.

Presentation at PyCon [March 2004]

Epydoc was presented at PyCon by Edward Loper. Video and audio from the
presentation are available for download.


5.3.3 Verbatim: “atoms.py” 

"""
@summary: Chemical formula parser.
@author:
@organization: Department of Chemical Engineering, NTNU, Norway
@contact:
@license:
@requires: Python or higher
@since: ()
@version:
@todo 1.0:
@change: started ()
@change: ()
@note:
"""

def atoms(formula, debug=False, stack=[], delim=0, \
          atom=r'', ldel=r'', rdel=r''):
    """
    The 'atoms' parser.

    @param formula: a chemical formula 'COOH(C(CH3)2)3CH3'
    @param debug: True or False flag
    @param stack: list of dictionaries { 'atom name': int, ... }
    @param delim: number of left-delimiters that have been opened and not yet
                  closed.
    @param atom: string equivalent of RE matching atom name including an
                 optional number 'He', 'N2', 'H3', etc.
    @param ldel: string equivalent of RE matching the left-delimiter '('
    @param rdel: string equivalent of RE matching the right-delimiter
                 including an optional number ')', ')3', etc.

    @type formula:
    @type debug: aBoolean
    @type stack:
    @type delim:
    @type atom: aRE on raw string format
    @type ldel:
    @type rdel:

    @return: aList [aDictionary, aDictionary, ...]
             e.g. [{'C': 11, 'H': 22, 'O': 2}]
    """

    import re

    # Empty strings do always pose problems. Test explicitly.
    pass

    # Initialize the dictionary stack. Can't be done in the function header be-
    # cause Python initializes only once. Subsequent calls to this function will
    # then increment the same dictionary rather than making a new one.
    stack = stack or [{}]

    # Python has no switch-case construct. Match all possibilities first and
    # test afterwards:
    re_atom = pass
    re_ldel = pass
    re_rdel = pass

    # Atom followed by an optional number (default is 1).
    if re_atom:
        tail = formula[len(re_atom.group()):]
        head = pass
        num = pass

        if stack[-1].get(head, True):  # verbose testing of Hash key
            pass  # increment occurrence
        else:
            pass  # initialization

        if debug: print([head, num, tail])

    # Left-delimiter.
    elif re_ldel:
        tail = pass
        delim += pass

        stack.append({})  # will be popped from stack by next right-delimiter

        if debug: print(['left-delimiter', tail])

    # Right-delimiter followed by an optional number (default is 1).
    elif re_rdel:
        tail = pass
        num = pass
        delim -= pass

        if delim < 0:
            raise SyntaxError("un-matched right parenthesis in '%s'"%(formula,))

        for (k, v) in stack.pop().items():
            stack[-1][k] = pass

        if debug: print(['right-delimiter', num, tail])

    # Wrong syntax.
    else:
        raise SyntaxError("'%s' does not match any regex"%(formula,))

    # The formula has not been consumed yet. Continue recursive parsing.
    if len(tail) > pass
        atoms(pass, pass, pass, pass, pass, pass, pass)
        return stack

    # Nothing left to parse. Stop recursion.
    else:
        if delim > 0:
            raise SyntaxError("un-matched left parenthesis in '%s'"%(formula,))
        if debug: print(stack[-1])
        return stack
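The skeleton above leaves the regular expressions and the bookkeeping as blanks. As a rough illustration of the same idea (one dictionary per open parenthesis, merged into its parent when the parenthesis closes), the sketch below counts atoms with a loop instead of recursion. It is not the course's solution: the function name `count_atoms`, the single combined regular expression and the plain-dictionary return value are all assumptions made for this example.

```python
import re

# Hypothetical token pattern: an atom with optional count, a left
# parenthesis, or a right parenthesis with optional multiplier.
TOKEN = re.compile(r"([A-Z][a-z]?)(\d*)|(\()|(\))(\d*)")

def count_atoms(formula):
    """Return {'element': count} for a formula such as 'COOH(C(CH3)2)3CH3'."""
    stack = [{}]  # one dictionary per open parenthesis level
    pos = 0
    while pos < len(formula):
        match = TOKEN.match(formula, pos)
        if not match:
            raise SyntaxError("'%s' does not match any regex" % formula[pos:])
        name, count, left, right, mult = match.groups()
        if name:
            # Atom followed by an optional number (default is 1).
            stack[-1][name] = stack[-1].get(name, 0) + int(count or 1)
        elif left:
            # Left delimiter: open a new level.
            stack.append({})
        else:
            # Right delimiter: scale the closed level and merge it downwards.
            if len(stack) == 1:
                raise SyntaxError("un-matched right parenthesis in '%s'" % formula)
            for key, value in stack.pop().items():
                stack[-1][key] = stack[-1].get(key, 0) + value * int(mult or 1)
        pos = match.end()
    if len(stack) > 1:
        raise SyntaxError("un-matched left parenthesis in '%s'" % formula)
    return stack[0]

result = count_atoms("COOH(C(CH3)2)3CH3")
```

For the example formula from the docstring, `result` holds 11 carbon, 22 hydrogen and 2 oxygen atoms. A loop avoids the mutable default argument that the skeleton has to work around, at the cost of departing from the recursive structure the exercise asks for.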


5.3.4 epytext, see also Sec. 2.31 

First reference occurs in Epytext markup (sourceforge), see Section 2.31 on page 193. 


5.3.5 Verbatim: “morse.py” 

"""
@summary: Translate from Latin to Morse using Regular Expressions.
@author: Tore Haug-Warberg
@contact: haugwarb@nt.ntnu.no
@license: GPLv3
@requires: Python 2.3.5 or higher
@since: 2011.08.30 (THW)
@version: 0.9
@todo 1.0:
@change: started (2011.08.30)
@note: This is an overly simple procedure, just for fun really.
       On a Unix terminal you can use the script like this::

         echo 'Oh, hello World everybody' | \
         python morse.py | \
         python antimorse.py | \
         python morse.py

       etc. etc.
"""

import sys
import re

# Read input from keyboard (and delete newline character).
str = re.sub(r'\n', "", sys.stdin.readline())

# Use this example string if input is empty.
if not str:
    str = r'Oh, hello World everybody!'

# White spaces
str = re.sub(r"\s+", "| ", str)

# Special symbols
str = re.sub(r"[.:;,?!]", "|| ", str)

# 2**1 patterns
str = re.sub("e|E", ". ", str)
str = re.sub("t|T", "- ", str)

# 2**2 patterns
str = re.sub("i|I", ".. ", str)
str = re.sub("a|A", ".- ", str)
str = re.sub("n|N", "-. ", str)
str = re.sub("m|M", "-- ", str)

# 2**3 patterns
str = re.sub("s|S", "... ", str)
str = re.sub("u|U", "..- ", str)
str = re.sub("r|R", ".-. ", str)
str = re.sub("w|W", ".-- ", str)
str = re.sub("d|D", "-.. ", str)
str = re.sub("k|K", "-.- ", str)
str = re.sub("g|G", "--. ", str)
str = re.sub("o|O", "--- ", str)

# 2**4 patterns
str = re.sub("h|H", ".... ", str)
str = re.sub("v|V", "...- ", str)
str = re.sub("f|F", "..-. ", str)
str = re.sub("l|L", ".-.. ", str)
str = re.sub("p|P", ".--. ", str)
str = re.sub("j|J", ".--- ", str)
str = re.sub("b|B", "-... ", str)
str = re.sub("x|X", "-..- ", str)
str = re.sub("c|C", "-.-. ", str)
str = re.sub("y|Y", "-.-- ", str)
str = re.sub("z|Z", "--.. ", str)
str = re.sub("q|Q", "--.- ", str)

# 2**5 patterns
str = re.sub("5", "..... ", str)
str = re.sub("4", "....- ", str)
str = re.sub("3", "...-- ", str)
str = re.sub("2", "..--- ", str)
str = re.sub("1", ".---- ", str)
str = re.sub("6", "-.... ", str)
str = re.sub("7", "--... ", str)
str = re.sub("8", "---.. ", str)
str = re.sub("9", "----. ", str)
str = re.sub("0", "----- ", str)

# Do away with the rest
str = re.sub(r"[^ .|\-]", "", str)

print(str)
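The listing applies one substitution per letter, in an order that matters. A table-driven variant may be easier to extend. The sketch below is only an illustration under the same output conventions as the listing ("| " for a run of white space, "|| " for a punctuation mark, every code followed by a blank); the names `CODES`, `MORSE` and `to_morse` are invented for this example.

```python
import re

# Compact code table: first character is the symbol, the rest is its code.
CODES = ("a.- b-... c-.-. d-.. e. f..-. g--. h.... i.. j.--- k-.- l.-.. m-- "
         "n-. o--- p.--. q--.- r.-. s... t- u..- v...- w.-- x-..- y-.-- z--.. "
         "0----- 1.---- 2..--- 3...-- 4....- 5..... 6-.... 7--... 8---.. 9----.")
MORSE = {item[0]: item[1:] for item in CODES.split()}

def to_morse(text):
    """Translate text to Morse in a single regular expression pass."""
    def encode(match):
        chunk = match.group()
        if chunk.isspace():
            return "| "              # a whole run of white space
        if chunk in ".:;,?!":
            return "|| "             # special symbols
        code = MORSE.get(chunk.lower())
        return code + " " if code else ""   # unknown characters are dropped
    return re.sub(r"\s+|.", encode, text)

signal = to_morse("Oh, hello World everybody!")
```

Because each input character is looked up exactly once, the order of the table entries no longer matters, which is the main design difference from the sequential substitutions.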


5.3.6 Verbatim: “antimorse.py” 

"""
@summary: Translate from Morse to Latin using Regular Expressions.
@since: 2011.08.30 (THW)
@version: 0.9
@note: This is an overly simple procedure, just for fun really.
       On a Unix terminal you can use the script like this:

         echo '--- .... || | .... . .-.. .-.. --- | '\
              '.-- --- .-. .-.. -.. | '\
              '. ...- . .-. -.-- -... --- -.. -.-- ' | \
         python antimorse.py | \
         python morse.py | \
         python antimorse.py

       etc. etc.
"""

import sys
import re

# Read input from keyboard (and delete newline character).
str = re.sub(r'\n', "", sys.stdin.readline())

# Use this example string if input is empty.
if not str:
    str = '--- .... || | '\
          '.... . .-.. .-.. --- | '\
          '.-- --- .-. .-.. -.. | '\
          '. ...- . .-. -.-- -... --- -.. -.-- || '

# 2**5 patterns.
str = re.sub(r"\.\.\.\.\. ", "5", str)
str = re.sub(r"\.\.\.\.- ", "4", str)
str = re.sub(r"\.\.\.-- ", "3", str)
str = re.sub(r"\.\.--- ", "2", str)
str = re.sub(r"\.---- ", "1", str)
str = re.sub(r"-\.\.\.\. ", "6", str)
str = re.sub(r"--\.\.\. ", "7", str)
str = re.sub(r"---\.\. ", "8", str)
str = re.sub(r"----\. ", "9", str)
str = re.sub(r"----- ", "0", str)

# 2**4 patterns.
str = re.sub(r"\.\.\.\. ", "h", str)
str = re.sub(r"\.\.\.- ", "v", str)
str = re.sub(r"\.\.-\. ", "f", str)
str = re.sub(r"\.-\.\. ", "l", str)
str = re.sub(r"\.--\. ", "p", str)
str = re.sub(r"\.--- ", "j", str)
str = re.sub(r"-\.\.\. ", "b", str)
str = re.sub(r"-\.\.- ", "x", str)
str = re.sub(r"-\.-\. ", "c", str)
str = re.sub(r"-\.-- ", "y", str)
str = re.sub(r"--\.\. ", "z", str)
str = re.sub(r"--\.- ", "q", str)

# 2**3 patterns.
str = re.sub(r"\.\.\. ", "s", str)
str = re.sub(r"\.\.- ", "u", str)
str = re.sub(r"\.-\. ", "r", str)
str = re.sub(r"\.-- ", "w", str)
str = re.sub(r"-\.\. ", "d", str)
str = re.sub(r"-\.- ", "k", str)
str = re.sub(r"--\. ", "g", str)
str = re.sub(r"--- ", "o", str)

# 2**2 patterns.
str = re.sub(r"\.\. ", "i", str)
str = re.sub(r"\.- ", "a", str)
str = re.sub(r"-\. ", "n", str)
str = re.sub(r"-- ", "m", str)

# 2**1 patterns.
str = re.sub(r"\. ", "e", str)
str = re.sub(r"- ", "t", str)

# Periods (followed by zero or more white space).
str = re.sub(r"\|\| (\| )*", ". ", str)

# Remaining white space.
str = re.sub(r"\| ", " ", str)

print(str)
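The listing works only because longer codes are replaced before shorter ones. Decoding whole codes at once removes that ordering constraint. The following is a sketch under the conventions of the listing ("|| " becomes a period, "| " a blank, all punctuation collapses to a period); `CODES`, `DECODE` and `from_morse` are hypothetical names introduced for this example.

```python
import re

# Compact code table: first character is the symbol, the rest is its code.
CODES = ("a.- b-... c-.-. d-.. e. f..-. g--. h.... i.. j.--- k-.- l.-.. m-- "
         "n-. o--- p.--. q--.- r.-. s... t- u..- v...- w.-- x-..- y-.-- z--.. "
         "0----- 1.---- 2..--- 3...-- 4....- 5..... 6-.... 7--... 8---.. 9----.")
DECODE = {item[1:]: item[0] for item in CODES.split()}

def from_morse(signal):
    """Translate a Morse signal back to lower-case text."""
    # Each run of dots and dashes followed by a blank is one complete code.
    text = re.sub(r"[.-]+ ",
                  lambda match: DECODE.get(match.group().strip(), ""),
                  signal)
    # Periods (followed by zero or more white space).
    text = re.sub(r"\|\| (\| )*", ". ", text)
    # Remaining white space.
    return re.sub(r"\| ", " ", text)

text = from_morse("--- .... || | .... . .-.. .-.. --- | ")
```

The letters must be decoded before the "|| " markers are turned into periods, otherwise the inserted period would be mistaken for the code of the letter e.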




Python v2.7.3 documentation » The Python Standard Library » 7. String Services


7.1. string — Common string operations

Source code: Lib/string.py

The string module contains a number of useful constants and classes, as well as
some deprecated legacy functions that are also available as methods on strings.
In addition, Python's built-in string classes support the sequence type methods
described in the Sequence Types — str, unicode, list, tuple, bytearray, buffer,
xrange section, and also the string-specific methods described in the String
Methods section. To output formatted strings use template strings or the %
operator described in the String Formatting Operations section. Also, see the
re module for string functions based on regular expressions.

7.1.1. String constants 

The constants defined in this module are: 

string.ascii_letters
    The concatenation of the ascii_lowercase and ascii_uppercase constants
    described below. This value is not locale-dependent.

string.ascii_lowercase
    The lowercase letters 'abcdefghijklmnopqrstuvwxyz'. This value is not
    locale-dependent and will not change.

string.ascii_uppercase
    The uppercase letters 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'. This value is not
    locale-dependent and will not change.

string.digits
    The string '0123456789'.

string.hexdigits
    The string '0123456789abcdefABCDEF'.

string.letters
    The concatenation of the strings lowercase and uppercase described below.
    The specific value is locale-dependent, and will be updated when
    locale.setlocale() is called.

string.lowercase
    A string containing all the characters that are considered lowercase
    letters. On most systems this is the string 'abcdefghijklmnopqrstuvwxyz'.
    The specific value is locale-dependent, and will be updated when
    locale.setlocale() is called.

string.octdigits
    The string '01234567'.

string.punctuation
    String of ASCII characters which are considered punctuation characters in
    the C locale.

string.printable
    String of characters which are considered printable. This is a combination
    of digits, letters, punctuation, and whitespace.

string.uppercase
    A string containing all the characters that are considered uppercase
    letters. On most systems this is the string 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'.
    The specific value is locale-dependent, and will be updated when
    locale.setlocale() is called.

string.whitespace
    A string containing all characters that are considered whitespace. On most
    systems this includes the characters space, tab, linefeed, return,
    formfeed, and vertical tab.
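A typical use of these constants is membership testing. The sketch below sorts the characters of a string into classes; the function name `classify` and the sample text are made up for illustration. It deliberately uses only the locale-independent constants, because the locale-dependent string.letters, string.lowercase and string.uppercase exist in Python 2 only.

```python
import string

def classify(text):
    """Count the characters of text by the string constant they belong to."""
    counts = {"letters": 0, "digits": 0, "punctuation": 0,
              "whitespace": 0, "other": 0}
    for ch in text:
        if ch in string.ascii_letters:
            counts["letters"] += 1
        elif ch in string.digits:
            counts["digits"] += 1
        elif ch in string.punctuation:
            counts["punctuation"] += 1
        elif ch in string.whitespace:
            counts["whitespace"] += 1
        else:
            counts["other"] += 1
    return counts

summary = classify("H2O, 25 C")
```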

7.1.2. String Formatting 

New in version 2.6. 

The built-in str and unicode classes provide the ability to do complex variable
substitutions and value formatting via the str.format() method described in
PEP 3101. The Formatter class in the string module allows you to create and
customize your own string formatting behaviors using the same implementation as
the built-in format() method.

class string.Formatter
    The Formatter class has the following public methods:

    format(format_string, *args, **kwargs)
        format() is the primary API method. It takes a format string and an
        arbitrary set of positional and keyword arguments. format() is just a
        wrapper that calls vformat().

    vformat(format_string, args, kwargs)
        This function does the actual work of formatting. It is exposed as a
        separate function for cases where you want to pass in a predefined
        dictionary of arguments, rather than unpacking and repacking the
        dictionary as individual arguments using the *args and **kwds syntax.
        vformat() does the work of breaking up the format string into character
        data and replacement fields. It calls the various methods described
        below.

    In addition, the Formatter defines a number of methods that are intended to
    be replaced by subclasses:

    parse(format_string)
        Loop over the format_string and return an iterable of tuples
        (literal_text, field_name, format_spec, conversion). This is used by
        vformat() to break the string into either literal text, or replacement
        fields.

        The values in the tuple conceptually represent a span of literal text
        followed by a single replacement field. If there is no literal text
        (which can happen if two replacement fields occur consecutively), then
        literal_text will be a zero-length string. If there is no replacement
        field, then the values of field_name, format_spec and conversion will
        be None.

    get_field(field_name, args, kwargs)
        Given field_name as returned by parse() (see above), convert it to an
        object to be formatted. Returns a tuple (obj, used_key). The default
        version takes strings of the form defined in PEP 3101, such as
        "0[name]" or "label.title". args and kwargs are as passed in to
        vformat(). The return value used_key has the same meaning as the key
        parameter to get_value().

    get_value(key, args, kwargs)
        Retrieve a given field value. The key argument will be either an
        integer or a string. If it is an integer, it represents the index of
        the positional argument in args; if it is a string, then it represents
        a named argument in kwargs.

        The args parameter is set to the list of positional arguments to
        vformat(), and the kwargs parameter is set to the dictionary of keyword
        arguments.

        For compound field names, these functions are only called for the first
        component of the field name; subsequent components are handled through
        normal attribute and indexing operations.

        So for example, the field expression '0.name' would cause get_value()
        to be called with a key argument of 0. The name attribute will be
        looked up after get_value() returns by calling the built-in getattr()
        function.

        If the index or keyword refers to an item that does not exist, then an
        IndexError or KeyError should be raised.

    check_unused_args(used_args, args, kwargs)
        Implement checking for unused arguments if desired. The arguments to
        this function are the set of all argument keys that were actually
        referred to in the format string (integers for positional arguments,
        and strings for named arguments), and a reference to the args and
        kwargs that were passed to vformat(). The set of unused args can be
        calculated from these parameters. check_unused_args() is assumed to
        raise an exception if the check fails.

    format_field(value, format_spec)
        format_field() simply calls the global format() built-in. The method is
        provided so that subclasses can override it.

    convert_field(value, conversion)
        Converts the value (returned by get_field()) given a conversion type
        (as in the tuple returned by the parse() method). The default version
        understands 's' (str) and 'r' (repr) conversion types.
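To make the override points concrete, here is a minimal sketch of a Formatter subclass that replaces get_value() so that a missing argument yields a default text instead of raising IndexError or KeyError. The class name `DefaultingFormatter` and its `default` parameter are invented for this illustration and are not part of the string module.

```python
import string

class DefaultingFormatter(string.Formatter):
    """Formatter that substitutes a default text for missing arguments."""

    def __init__(self, default="?"):
        string.Formatter.__init__(self)
        self.default = default

    def get_value(self, key, args, kwargs):
        # key is an integer for positional fields and a string for named ones.
        try:
            return string.Formatter.get_value(self, key, args, kwargs)
        except (IndexError, KeyError):
            return self.default

fmt = DefaultingFormatter(default="n/a")
line = fmt.format("{name} weighs {weight} kg", name="Ni")
```

Only the lookup of the first component of a field name goes through get_value(), so a compound field such as '{weight.unit}' would still fail on the attribute access performed afterwards.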

7.1.3. Format String Syntax 

The str.format() method and the Formatter class share the same syntax for
format strings (although in the case of Formatter, subclasses can define their
own format string syntax).

Format strings contain "replacement fields" surrounded by curly braces {}.
Anything that is not contained in braces is considered literal text, which is
copied unchanged to the output. If you need to include a brace character in the
literal text, it can be escaped by doubling: {{ and }}.
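A short sketch of these rules in action: literal text is copied, doubled braces produce single braces, and replacement fields may refer to positional arguments, keyword arguments, attributes or items. The class `Cargo` and all sample values are assumptions chosen for the example.

```python
class Cargo(object):
    """Hypothetical object with one attribute to format."""
    def __init__(self, weight):
        self.weight = weight

# Attribute access on a positional argument, item access on a keyword argument.
report = "{0.weight} tons for {crew[0]} and {crew[1]}".format(
    Cargo(12), crew=["Ann", "Bo"])

# Doubled braces are literal braces; the field in between is still replaced.
braces = "{{{0}}} is written as {{0}}".format("value")

# Omitting the indices numbers the fields automatically (Python 2.7 and later).
same = "{} {}".format("a", "b") == "{0} {1}".format("a", "b")
```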

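The grammar for a replacement field is as follows:

    replacement_field ::=  "{" [field_name] ["!" conversion] [":" format_spec] "}"
    field_name        ::=  arg_name ("." attribute_name | "[" element_index "]")*
    arg_name          ::=  [identifier | integer]
    attribute_name    ::=  identifier
    element_index     ::=  integer | index_string
    index_string      ::=  <any source character except "]"> +
    conversion        ::=  "r" | "s"
    format_spec       ::=  <described in the next section>

Changed in version 2.7: The positional argument specifiers can be omitted, so
'{} {}' is equivalent to '{0} {1}'.

Some simple format string examples:

    "First, thou shalt count to {0}"  # References first positional argument
    "Bring me a {}"                   # Implicitly references the first positional argument
    "From {} to {}"                   # Same as "From {0} to {1}"
    "My quest is {name}"              # References keyword argument 'name'
    "Weight in tons {0.weight}"       # 'weight' attribute of first positional arg
    "Units destroyed: {players[0]}"   # First element of keyword argument 'players'.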

The conversion field causes a type coercion before formatting. Normally, the job
of formatting a value is done by the __format__() method of the value itself.
However, in some cases it is desirable to force a type to be formatted as a
string, overriding its own definition of formatting. By converting the value to
a string before calling __format__(), the normal formatting logic is bypassed.

Two conversion flags are currently supported: '!s' which calls str() on the
value, and '!r' which calls repr().

Some examples:

    "Harold's a clever {0!s}"        # Calls str() on the argument first
    "Bring out the holy {name!r}"    # Calls repr() on the argument first

The format_spec field contains a specification of how the value should be
presented, including such details as field width, alignment, padding, decimal
precision and so on. Each value type can define its own "formatting
mini-language" or interpretation of the format_spec.

Most built-in types support a common formatting mini-language, which is
described in the next section.

A format_spec field can also include nested replacement fields within it. These
nested replacement fields can contain only a field name; conversion flags and
format specifications are not allowed. The replacement fields within the
format_spec are substituted before the format_spec string is interpreted. This
allows the formatting of a value to be dynamically specified.

See the Format examples section for some examples.
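As a small illustration of nested replacement fields, the sketch below passes the width and precision as ordinary arguments, so the layout is decided at run time. The helper name `table_row` and the numbers are assumptions made for this example.

```python
def table_row(value, width, precision):
    """Format value as fixed point in a field whose layout is given at run time."""
    # {w} and {p} are substituted first; the result, e.g. '10.3f', is then
    # interpreted as the format_spec for argument 0.
    return "{0:{w}.{p}f}".format(value, w=width, p=precision)

cell = table_row(3.14159, 10, 3)

# Fill character, alignment and width can be supplied the same way.
boxed = "{0:{fill}{align}{width}}".format("x", fill="*", align="^", width=5)
```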

7.1.3.1. Format Specification Mini-Language

"Format specifications" are used within replacement fields contained within a
format string to define how individual values are presented (see Format String
Syntax). They can also be passed directly to the built-in format() function.
Each formattable type may define how the format specification is to be
interpreted.

Most built-in types implement the following options for format specifications,
although some of the formatting options are only supported by the numeric types.

A general convention is that an empty format string ("") produces the same
result as if you had called str() on the value. A non-empty format string
typically modifies the result.

The general form of a standard format specifier is:

    format_spec ::=  [[fill]align][sign][#][0][width][,][.precision][type]
    fill        ::=  <a character other than '{' or '}'>
    align       ::=  "<" | ">" | "=" | "^"
    sign        ::=  "+" | "-" | " "
    width       ::=  integer
    precision   ::=  integer
    type        ::=  "b" | "c" | "d" | "e" | "E" | "f" | "F" | "g" | "G" | "n" | "o" | "s" | "x" | "X" | "%"

The fill character can be any character other than '{' or '}'. The presence of a
fill character is signaled by the character following it, which must be one of
the alignment options. If the second character of format_spec is not a valid
alignment option, then it is assumed that both the fill character and the
alignment option are absent.

The meaning of the various alignment options is as follows:

    Option   Meaning
    '<'      Forces the field to be left-aligned within the available space
             (this is the default for most objects).
    '>'      Forces the field to be right-aligned within the available space
             (this is the default for numbers).
    '='      Forces the padding to be placed after the sign (if any) but before
             the digits. This is used for printing fields in the form
             '+000000120'. This alignment option is only valid for numeric
             types.
    '^'      Forces the field to be centered within the available space.

Note that unless a minimum field width is defined, the field width will always
be the same size as the data to fill it, so that the alignment option has no
meaning in this case.

The sign option is only valid for number types, and can be one of the following:

    Option   Meaning
    '+'      indicates that a sign should be used for both positive as well as
             negative numbers.
    '-'      indicates that a sign should be used only for negative numbers
             (this is the default behavior).
    space    indicates that a leading space should be used on positive numbers,
             and a minus sign on negative numbers.

The '#' option is only valid for integers, and only for binary, octal, or
hexadecimal output. If present, it specifies that the output will be prefixed by
'0b', '0o', or '0x', respectively.

The ',' option signals the use of a comma for a thousands separator. For a
locale-aware separator, use the 'n' integer presentation type instead.

Changed in version 2.7: Added the ',' option (see also PEP 378).

width is a decimal integer defining the minimum field width. If not specified,
then the field width will be determined by the content.

Preceding the width field by a zero ('0') character enables sign-aware
zero-padding for numeric types. This is equivalent to a fill character of '0'
with an alignment type of '='.

The precision is a decimal number indicating how many digits should be displayed
after the decimal point for a floating point value formatted with 'f' and 'F',
or before and after the decimal point for a floating point value formatted with
'g' or 'G'. For non-number types the field indicates the maximum field size, in
other words, how many characters will be used from the field content. The
precision is not allowed for integer values.
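The options described above can be tried directly with the built-in format() function. The snippet below is a sketch with arbitrarily chosen values; the variable names are only labels for this example.

```python
# Sign-aware zero padding: '0' before the width equals fill '0' with align '='.
padded = format(-42, "06d")
explicit = format(-42, "0=6d")

# For non-number types the precision is a maximum field size.
clipped = format("abcdef", ".3")

# ',' thousands separator combined with a fixed-point precision.
grouped = format(1234567.891, ",.2f")

# '#' adds the base prefix for binary, octal and hexadecimal output.
prefixed = [format(10, spec) for spec in ("#b", "#o", "#x")]

# A precision on an integer presentation type is rejected.
try:
    format(3, ".2d")
    rejected = False
except ValueError:
    rejected = True
```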

Finally, the type determines how the data should be presented.

The available string presentation types are:

    Type   Meaning
    's'    String format. This is the default type for strings and may be
           omitted.
    None   The same as 's'.

The available integer presentation types are:

    Type   Meaning
    'b'    Binary format. Outputs the number in base 2.
    'c'    Character. Converts the integer to the corresponding unicode
           character before printing.
    'd'    Decimal Integer. Outputs the number in base 10.
    'o'    Octal format. Outputs the number in base 8.
    'x'    Hex format. Outputs the number in base 16, using lowercase letters
           for the digits above 9.
    'X'    Hex format. Outputs the number in base 16, using uppercase letters
           for the digits above 9.
    'n'    Number. This is the same as 'd', except that it uses the current
           locale setting to insert the appropriate number separator
           characters.
    None   The same as 'd'.

In addition to the above presentation types, integers can be formatted with the
floating point presentation types listed below (except 'n' and None). When doing
so, float() is used to convert the integer to a floating point number before
formatting.

The available presentation types for floating point and decimal values are:

    Type   Meaning
    'e'    Exponent notation. Prints the number in scientific notation using
           the letter 'e' to indicate the exponent.
    'E'    Exponent notation. Same as 'e' except it uses an upper case 'E' as
           the separator character.
    'f'    Fixed point. Displays the number as a fixed-point number.
    'F'    Fixed point. Same as 'f'.
    'g'    General format. For a given precision p >= 1, this rounds the number
           to p significant digits and then formats the result in either
           fixed-point format or in scientific notation, depending on its
           magnitude.

           The precise rules are as follows: suppose that the result formatted
           with presentation type 'e' and precision p-1 would have exponent
           exp. Then if -4 <= exp < p, the number is formatted with
           presentation type 'f' and precision p-1-exp. Otherwise, the number
           is formatted with presentation type 'e' and precision p-1. In both
           cases insignificant trailing zeros are removed from the significand,
           and the decimal point is also removed if there are no remaining
           digits following it.

           Positive and negative infinity, positive and negative zero, and
           nans, are formatted as inf, -inf, 0, -0 and nan respectively,
           regardless of the precision.

           A precision of 0 is treated as equivalent to a precision of 1.
    'G'    General format. Same as 'g' except switches to 'E' if the number
           gets too large. The representations of infinity and NaN are
           uppercased, too.
    'n'    Number. This is the same as 'g', except that it uses the current
           locale setting to insert the appropriate number separator
           characters.
    '%'    Percentage. Multiplies the number by 100 and displays in fixed ('f')
           format, followed by a percent sign.
    None   The same as 'g'.
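The rule for the 'g' type can be checked by spelling it out in code. The function below is a sketch, not the interpreter's implementation: it follows the textual rule step by step using only the 'e' and 'f' types, handles finite floats only (no inf or nan), and its name `general` is an assumption made for this example.

```python
def general(x, p=6):
    """Re-create format(x, '.<p>g') for a finite float from the 'e' and 'f' types."""
    p = max(p, 1)                        # a precision of 0 is treated as 1
    # Format with type 'e' and precision p-1 to find the exponent exp.
    mantissa, exp = "{0:.{1}e}".format(x, p - 1).split("e")
    exp = int(exp)
    if -4 <= exp < p:
        # Fixed-point branch with precision p-1-exp.
        body, tail = "{0:.{1}f}".format(x, p - 1 - exp), ""
    else:
        # Scientific branch with precision p-1.
        body, tail = mantissa, "e{0:+03d}".format(exp)
    if "." in body:
        # Remove insignificant trailing zeros, then a dangling decimal point.
        body = body.rstrip("0").rstrip(".")
    return body + tail

samples = [1234.5678, 0.00001234, 1e10, 100000.0, 1000000.0, 0.0001, -0.5, 0.0]
agree = all(general(x, p) == format(x, ".%dg" % p)
            for x in samples for p in (1, 3, 6, 10))
```

With the default precision of 6, 100000.0 still fits the fixed-point branch (exp is 5) while 1000000.0 switches to scientific notation (exp is 6).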

7.1.3.2. Format examples 

This section contains examples of the new format syntax and comparison with the
old %-formatting.

In most of the cases the syntax is similar to the old %-formatting, with the
addition of the {} and with : used instead of %. For example, '%03.2f' can be
translated to '{:03.2f}'.

The new format syntax also supports new and different options, shown in the
following examples.

Accessing arguments by position:

>>> '{0}, {1}, {2}'.format('a', 'b', 'c')
'a, b, c'
>>> '{}, {}, {}'.format('a', 'b', 'c')  # 2.7+ only
'a, b, c'
>>> '{2}, {1}, {0}'.format('a', 'b', 'c')
'c, b, a'
>>> '{2}, {1}, {0}'.format(*'abc')      # unpacking argument sequence
'c, b, a'
>>> '{0}{1}{0}'.format('abra', 'cad')   # arguments' indices can be repeated
'abracadabra'

Accessing arguments by name:

>>> 'Coordinates: {latitude}, {longitude}'.format(latitude='37.24N', longitude='-115.81W')
'Coordinates: 37.24N, -115.81W'
>>> coord = {'latitude': '37.24N', 'longitude': '-115.81W'}
>>> 'Coordinates: {latitude}, {longitude}'.format(**coord)
'Coordinates: 37.24N, -115.81W'

Accessing arguments' attributes:

>>> c = 3-5j
>>> ('The complex number {0} is formed from the real part {0.real} '
...  'and the imaginary part {0.imag}.').format(c)
'The complex number (3-5j) is formed from the real part 3.0 and the imaginary part -5.0.'
>>> class Point(object):
...     def __init__(self, x, y):
...         self.x, self.y = x, y
...     def __str__(self):
...         return 'Point({self.x}, {self.y})'.format(self=self)
...
>>> str(Point(4, 2))
'Point(4, 2)'

Accessing arguments' items:

>>> coord = (3, 5)
>>> 'X: {0[0]}; Y: {0[1]}'.format(coord)
'X: 3; Y: 5'

Replacing %s and %r:

>>> "repr() shows quotes: {!r}; str() doesn't: {!s}".format('test1', 'test2')
"repr() shows quotes: 'test1'; str() doesn't: test2"

Aligning the text and specifying a width:

>>> '{:<30}'.format('left aligned')
'left aligned                  '
>>> '{:>30}'.format('right aligned')
'                 right aligned'
>>> '{:^30}'.format('centered')
'           centered           '
>>> '{:*^30}'.format('centered')  # use '*' as a fill char
'***********centered***********'

Replacing %+f, %-f, and % f and specifying a sign:

>>> '{:+f}; {:+f}'.format(3.14, -3.14)  # show it always
'+3.140000; -3.140000'
>>> '{: f}; {: f}'.format(3.14, -3.14)  # show a space for positive numbers
' 3.140000; -3.140000'
>>> '{:-f}; {:-f}'.format(3.14, -3.14)  # show only the minus -- same as '{:f}; {:f}'
'3.140000; -3.140000'

Replacing %x and %o and converting the value to different bases:

>>> # format also supports binary numbers
>>> "int: {0:d}; hex: {0:x}; oct: {0:o}; bin: {0:b}".format(42)
'int: 42; hex: 2a; oct: 52; bin: 101010'
>>> # with 0x, 0o, or 0b as prefix:
>>> "int: {0:d}; hex: {0:#x}; oct: {0:#o}; bin: {0:#b}".format(42)
'int: 42; hex: 0x2a; oct: 0o52; bin: 0b101010'

Using the comma as a thousands separator:

>>> '{:,}'.format(1234567890)
'1,234,567,890'

Expressing a percentage:

>>> points = 19.5
>>> total = 22
>>> 'Correct answers: {:.2%}'.format(points/total)
'Correct answers: 88.64%'

Using type-specific formatting:

>>> import datetime
>>> d = datetime.datetime(2010, 7, 4, 12, 15, 58)
>>> '{:%Y-%m-%d %H:%M:%S}'.format(d)
'2010-07-04 12:15:58'

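Nesting arguments and more complex examples:

>>> for align, text in zip('<^>', ['left', 'center', 'right']):
...     '{0:{fill}{align}16}'.format(text, fill=align, align=align)
...
'left<<<<<<<<<<<<'
'^^^^^center^^^^^'
'>>>>>>>>>>>right'

7.1.4. Template strings

Templates provide simpler string substitutions as described in PEP 292. Instead
of the normal %-based substitutions, Templates support $-based substitutions,
using the following rules: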

$$ is an escape; it is replaced with a single $.

$identifier names a substitution placeholder matching a mapping key of
"identifier". By default, "identifier" must spell a Python identifier. The first
non-identifier character after the $ character terminates this placeholder
specification.

${identifier} is equivalent to $identifier. It is required when valid identifier
characters follow the placeholder but are not part of the placeholder, such as
"${noun}ification".

Any other appearance of $ in the string will result in a ValueError being raised.

The string module provides a Template class 

that implements these rules. The methods of 

Template are: 

class string.Template(template) 

The constructor takes a single argument 

which is the template string. 

substitute(mapping[, **kws]) 

Performs the template substitution, 

returning a new string. mapping is 

any dictionary-like object with keys 

that match the placeholders in the 

template. Alternatively, you can 

provide keyword arguments, where 

the keywords are the placeholders. 

When both mapping and kws are 

given and there are duplicates, the 

placeholders from kws take 

precedence. 

safe_substitute(mapping[, **kws])

Like substitute(), except that if 

placeholders are missing from 

mapping and kws, instead of raising 

a KeyError exception, the original 

placeholder will appear in the 

resulting string intact. Also, unlike 

with substitute(), any other 

appearances of the $ will simply 

return $ instead of raising 

ValueError. 

While other exceptions may still 

occur, this method is called “safe” 

because it always tries to

return a usable string instead of 

raising an exception. In another 

sense, safe_substitute() may be 

anything other than safe, since it will 

silently ignore malformed templates 

containing dangling delimiters, 

unmatched braces, or placeholders 

that are not valid Python identifiers. 

Template instances also provide one 

public data attribute: 

template 

This is the object passed to the 

constructorʼs template argument. In 

general, you shouldnʼt change it, but 

read-only access is not enforced. 

Here is an example of how to use a 

Template:

>>> from string import Template
>>> s = Template('$who likes $what')
>>> s.substitute(who='tim', what='kung pao')
'tim likes kung pao'
>>> d = dict(who='tim')
>>> Template('Give $who $100').substitute(d)
Traceback (most recent call last):
[...]
ValueError: Invalid placeholder in string: line 1, col 11
>>> Template('$who likes $what').substitute(d)
Traceback (most recent call last):
[...]
KeyError: 'what'
>>> Template('$who likes $what').safe_substitute(d)
'tim likes $what'

Advanced usage: you can derive subclasses 

of Template to customize the placeholder 

syntax, delimiter character, or the entire 

regular expression used to parse template 

strings. To do this, you can override these 

class attributes: 

delimiter – This is the literal string 

describing a placeholder introducing 

delimiter. The default value is $. Note 

that this should not be a regular 

expression, as the implementation will 

call re.escape() on this string as 

needed. 

idpattern – This is the regular 

expression describing the pattern for 

non-braced placeholders (the braces 

will be added automatically as 

appropriate). The default value is the 

regular expression [_a-z][_a-z0-9]*. 

Alternatively, you can provide the entire 

regular expression pattern by overriding the 

class attribute pattern. If you do this, the 

value must be a regular expression object 

with four named capturing groups. The 

capturing groups correspond to the rules 

given above, along with the invalid 

placeholder rule: 

escaped – This group matches the 

escape sequence, e.g. $$, in the default

pattern. 

named – This group matches the 

unbraced placeholder name; it should 

not include the delimiter in capturing 

group. 

braced – This group matches the brace 

enclosed placeholder name; it should 

not include either the delimiter or 

braces in the capturing group. 

invalid – This group matches any other 

delimiter pattern (usually a single 

delimiter), and it should appear last in 

the regular expression. 
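As a hedged illustration of the class attributes described above, the sketch below derives a subclass that uses % instead of $ as the delimiter. The class name PercentTemplate and the sample text are made up for this example; only Template, delimiter and substitute() come from the string module.

```python
from string import Template


class PercentTemplate(Template):
    # Hypothetical subclass: placeholders start with % instead of $.
    # The delimiter is a literal string, not a regular expression.
    delimiter = '%'


t = PercentTemplate('%who likes %what, 100%% sure')

# A doubled delimiter (%%) is the escape and yields a single %.
result = t.substitute(who='tim', what='kung pao')
print(result)

# safe_substitute() leaves unknown placeholders untouched.
partial = t.safe_substitute(who='tim')
print(partial)
```

Overriding delimiter alone is usually enough; the full pattern attribute is only needed when the placeholder syntax itself has to change.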

7.1.5. String functions 

The following functions are available to 

operate on string and Unicode objects. They 

are not available as string methods. 

string.capwords(s[, sep]) 

Split the argument into words using 

str.split(), capitalize each word using 

str.capitalize(), and join the capitalized 

words using str.join(). If the optional 

second argument sep is absent or None, 

runs of whitespace characters are 

replaced by a single space and leading 

and trailing whitespace are removed, 

otherwise sep is used to split and join the 

words. 

string.maketrans(from, to) 

Return a translation table suitable for 

passing to translate(), that will map 

each character in from into the character 

at the same position in to; from and to 

must have the same length. 

Note: Donʼt use strings derived from 

lowercase and uppercase as arguments; 

in some locales, these donʼt have the 

same length. For case conversions,

always use str.lower() and 

str.upper(). 
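The effect of the optional sep argument of capwords() is easiest to see side by side. The following is a small sketch with made-up input strings; it only uses string.capwords, which is available in both Python 2 and Python 3.

```python
import string

# Default: runs of whitespace collapse to one space, ends are trimmed.
collapsed = string.capwords('  parsing   a molecular  formula ')

# With an explicit separator the text is split and re-joined on it,
# so other whitespace is left alone.
dashed = string.capwords('left-center-right', '-')

print(collapsed)
print(dashed)
```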

7.1.6. Deprecated string 

functions 

The following functions are also defined

as methods of string and Unicode objects; 

see section String Methods for more 

information on those. You should consider 

these functions as deprecated, although they 

will not be removed until Python 3. The 

functions defined in this module are: 

string.atof(s) 

Deprecated since version 2.0: Use the 

float() built-in function. 

Convert a string to a floating point 

number. The string must have the 

standard syntax for a floating point literal 

in Python, optionally preceded by a sign 

(+ or -). Note that this behaves identically

to the built-in function float() when 

passed a string. 

Note: When passing in a string, 

values for NaN and Infinity may be 

returned, depending on the underlying 

C library. The specific set of strings 

accepted which cause these values to 

be returned depends entirely on the C 

library and is known to vary. 

string.atoi(s[, base]) 


Deprecated since version 2.0: Use the int() built-in function.

Convert string s to an integer in the given 

base. The string must consist of one or 

more digits, optionally preceded by a

sign (+ or -). The base defaults to 10. If it 

is 0, a default base is chosen depending 

on the leading characters of the string 

(after stripping the sign): 0x or 0X means 

16, 0 means 8, anything else means 10. 

If base is 16, a leading 0x or 0X is always 

accepted, though not required. This 

behaves identically to the built-in function 

int() when passed a string. (Also note: 

for a more flexible interpretation of 

numeric literals, use the built-in function 

eval().) 

string.atol(s[, base]) 


Deprecated since version 2.0: Use the long() built-in function.

Convert string s to a long integer in the 

given base. The string must consist of 

one or more digits, optionally preceded 

by a sign (+ or -). The base argument 

has the same meaning as for atoi(). A 

trailing l or L is not allowed, except if the 

base is 0. Note that when invoked 

without base or with base set to 10, this 

behaves identically to the built-in function

long() when passed a string. 

string.capitalize(word) 

Return a copy of word with only its first 

character capitalized. 

string.expandtabs(s[, tabsize]) 

Expand tabs in a string replacing them 

by one or more spaces, depending on 

the current column and the given tab 

size. The column number is reset to zero 

after each newline occurring in the string. 

This doesnʼt understand other nonprinting 

characters or escape sequences. 

The tab size defaults to 8. 

string.find(s, sub[, start[, end]])

Return the lowest index in s where the 

substring sub is found such that sub is 

wholly contained in s[start:end]. Return 

-1 on failure. Defaults for start and end 

and interpretation of negative values is 

the same as for slices. 

string.rfind(s, sub[, start[, end]]) 

Like find() but find the highest index. 

string.index(s, sub[, start[, end]]) 

Like find() but raise ValueError when the 

substring is not found. 

string.rindex(s, sub[, start[, end]]) 

Like rfind() but raise ValueError when 

the substring is not found. 

string.count(s, sub[, start[, end]]) 

Return the number of (non-overlapping) 

occurrences of substring sub in string 

s[start:end]. Defaults for start and end 

and interpretation of negative values are 

the same as for slices. 

string.lower(s) 

Return a copy of s, but with upper case 

letters converted to lower case. 

string.split(s[, sep[, maxsplit]]) 

Return a list of the words of the string s. 

If the optional second argument sep is 

absent or None, the words are separated 

by arbitrary strings of whitespace 

characters (space, tab, newline, return, 

formfeed). If the second argument sep is 

present and not None, it specifies a string 

to be used as the word separator. The 

returned list will then have one more item 

than the number of non-overlapping 

occurrences of the separator in the 

string. If maxsplit is given, at most 

maxsplit number of splits occur, and the

remainder of the string is returned as the 

final element of the list (thus, the list will 

have at most maxsplit+1 elements). If 

maxsplit is not specified or -1, then there 

is no limit on the number of splits (all 

possible splits are made). 

The behavior of split on an empty string 

depends on the value of sep. If sep is not 

specified, or specified as None, the result 

will be an empty list. If sep is specified as 

any string, the result will be a list 

containing one element which is an 

empty string. 

string.rsplit(s[, sep[, maxsplit]]) 

Return a list of the words of the string s, 

scanning s from the end. To all intents 

and purposes, the resulting list of words 

is the same as returned by split(), 

except when the optional third argument 

maxsplit is explicitly specified and 

nonzero. If maxsplit is given, at most 

maxsplit number of splits – the rightmost 

ones – occur, and the remainder of the 

string is returned as the first element of 

the list (thus, the list will have at most 

maxsplit+1 elements). 
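The difference between split() and rsplit(), and the empty-string rule, can be checked with a few lines. This sketch uses the equivalent string methods rather than the deprecated module functions, and the record string is invented for the example.

```python
record = 'C:11;H:22;O:2'

all_fields = record.split(';')        # every separator is used
first_only = record.split(';', 1)     # one split, counted from the left
last_only = record.rsplit(';', 1)     # one split, counted from the right

# Splitting an empty string depends on whether sep is given.
no_sep = ''.split()
with_sep = ''.split(';')

print(all_fields, first_only, last_only, no_sep, with_sep)
```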


string.splitfields(s[, sep[, maxsplit]]) 

This function behaves identically to 

split(). (In the past, split() was only 

used with one argument, while 

splitfields() was only used with two 

arguments.) 

string.join(words[, sep]) 

Concatenate a list or tuple of words with 

intervening occurrences of sep. The 

default value for sep is a single space 

character. It is always true that 

string.join(string.split(s, sep), sep)

equals s. 

string.joinfields(words[, sep]) 

This function behaves identically to 

join(). (In the past, join() was only 

used with one argument, while 

joinfields() was only used with two 

arguments.) Note that there is no 

joinfields() method on string objects; 

use the join() method instead. 

string.lstrip(s[, chars]) 

Return a copy of the string with leading 

characters removed. If chars is omitted 

or None, whitespace characters are 

removed. If given and not None, chars 

must be a string; the characters in the 

string will be stripped from the beginning 

of the string this method is called on. 

Changed in version 2.2.3: The chars 

parameter was added. The chars 

parameter cannot be passed in earlier 

2.2 versions. 

string.rstrip(s[, chars]) 

Return a copy of the string with trailing 

characters removed. If chars is omitted 

or None, whitespace characters are 

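removed. If given and not None, chars
must be a string; the characters in the
string will be stripped from the end of the
string this method is called on.

Changed in version 2.2.3: The chars
parameter was added. The chars
parameter cannot be passed in earlier
2.2 versions.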

string.strip(s[, chars]) 

Return a copy of the string with leading 

and trailing characters removed. If chars 

is omitted or None, whitespace characters 

are removed. If given and not None, chars
must be a string; the characters in the
string will be stripped from both ends
of the string this method is called on.

Changed in version 2.2.3: The chars
parameter was added. The chars
parameter cannot be passed in earlier
2.2 versions.

string.swapcase(s) 

Return a copy of s, but with lower case 

letters converted to upper case and vice 

versa. 

string.translate(s, table[, 

deletechars]) 

Delete all characters from s that are in 

deletechars (if present), and then 

translate the characters using table, 

which must be a 256-character string 

giving the translation for each character 

value, indexed by its ordinal. If table is 

None, then only the character deletion 

step is performed. 

string.upper(s) 

Return a copy of s, but with lower case 

letters converted to upper case. 

string.ljust(s, width[, fillchar]) 

string.rjust(s, width[, fillchar]) 

string.center(s, width[, fillchar]) 

These functions respectively left-justify, 

right-justify and center a string in a field 

of given width. They return a string that is 

at least width characters wide, created 

by padding the string s with the character 

fillchar (default is a space) until the given 

width on the right, left or both sides. The 

string is never truncated. 

string.zfill(s, width) 

Pad a numeric string on the left with zero

digits until the given width is reached. 

Strings starting with a sign are handled 

correctly. 

string.replace(str, old, new[, 

maxreplace]) 

Return a copy of string str with all 

occurrences of substring old replaced by 

new. If the optional argument 

maxreplace is given, the first maxreplace 

occurrences are replaced. 
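Since the functions in this section are deprecated in favour of string methods, a short sketch of the method equivalents for padding and replacing may be useful. The sample values are arbitrary; all calls are ordinary str methods.

```python
name = 'He'

left = name.ljust(6, '.')       # pad on the right
right = name.rjust(6, '.')      # pad on the left
middle = name.center(6, '.')    # pad on both sides

padded = '-42'.zfill(6)         # the sign stays in front of the zeros
untouched = 'hydrogen'.ljust(3) # never truncated, even if width is too small

once = 'a-b-c'.replace('-', '+', 1)   # only the first occurrence

print(left, right, middle, padded, untouched, once)
```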


© Copyright 1990-2012, Python Software Foundation. 

Last updated on Sep 06, 2012.

5.3.8 docstring, see also Sec. 2.33 

First reference occurs in Python Docstrings (Sourceforge), see Section 2.33 on page 211. 


Module atoms_stub 

Author: 

Organization: Department of Chemical Engineering, NTNU, Norway 

Contact: 

License: 

Requires: Python or higher 

Since: () 

Version: 

To Do (1.0): 

Change Log: 

started () 

() 

Note: 

Functions

atoms(formula, debug=False, stack=[], delim=0, atom=r'', ldel=r'', rdel=r'')
The 'atoms' parser.

Function Details

atoms(formula, debug=False, stack=[], delim=0, atom=r'', ldel=r'', rdel=r'')
The 'atoms' parser.



Parameters: 

formula () - a chemical formula 'COOH(C(CH3)2)3CH3' 

debug (aBoolean) - True or False flag 

stack () - list of dictionaries { 'atom name': int, ... } 

delim () - number of left-delimiters that have been opened and 

not yet closed. 

atom (aRE on raw string format) - string equivalent of RE matching 

atom name including an optional number 'He', 'N2', 'H3', etc. 

ldel () - string equivalent of RE matching the left-delimiter '(' 

rdel () - string equivalent of RE matching the right-delimiter 

including an optional number ')', ')3', etc. 

Returns: 

aList [ aDictionary, aDictionary, ... ] e.g. [{'C': 11, 'H': 22, 'O': 2}] 


Generated by Epydoc 3.0.1 on Thu Sep 6 23:39:52 2012 

http://epydoc.sourceforge.net

Title ??? 




phone: +47-7359-??? 

Zooball/Dove 



the day. " 

Reference ??? 

Table ??? 

1. Hello, 

2. World. 


... 

... 

... 

4. Continue. 










5.4.1 Reference ???, see also Sec. 5.2.1 

First reference occurs in Reference ???, see Section 5.2.1 on page 290. 


Parsing a Molecular Formula 





"A language that doesn't affect the way you think about programming, is not worth knowing."
Alan J. Perlis (1982)

Assignments

1. a. Download the stub program atoms.py. Save the file in your local Python
folder. Keep the file name as indicated.

b. Learn about Python dictionaries and lists in general, and about function
calls like re.match and len and keywords like def, pass and return in
particular. We shall also make use of a programming concept called
recursion. A simple example is the calculation of, say, 5 factorial.
We can either program it like this:

def factorial(n=5):
    m = 1
    for i in range(1, n+1):
        m *= i
    return m

or, using recursive function calls: 

def factorial(n=5):
    if n > 1:
        return n*factorial(n-1)
    else:
        return 1

Recursion gives beautiful, albeit hard-to-debug, computer code. There
are special languages devoted entirely to so-called functional
programming, e.g. Lisp and Haskell, but Python is also quite well
suited for such tasks.

2. Write a chemical formula parser called atoms that takes a string input and
returns a dictionary (hash table) of atom names (keys) and stoichiometric
numbers (values), wrapped in a list. For instance:

atoms('COOH(C(CH3)2)3CH3') == [{'H': 22, 'C': 11, 'O': 2}]

Use atoms.py as a template. Do not change any of the variable names,
because this makes assistance and co-operation among students much harder!

Chemical formulas are, from a mass balance perspective, simple linear
algebraic expressions. This may sound a little strange, but the rules of
summation and multiplication are implicitly understood from the formula. Take e.g.
water (H2O). The mass of one water molecule is H*2 + O*1 where H and O stand
for the atomic masses of hydrogen and oxygen. So, when we write H2O we really
mean H*2 + O*1. The same rule applies to more complicated molecules, for
instance COOH(C(CH3)2)3CH3. The mass is C*1 + O*1 + O*1 + H*1 + (C*1 +
(C*1 + H*3)*2)*3 + C*1 + H*3. We see that parentheses are used just
like in everyday algebra. This means that it is possible to interpret (we shall
hereafter call it parse) the formula into a list of atoms and a corresponding list of
stoichiometric numbers. These two lists are conveniently held together in what is
called a dictionary (hash table). In programming lingo we would say:

'COOH(C(CH3)2)3CH3' -> [{'H': 22, 'C': 11, 'O': 2}]
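The algebra can be checked by expanding the formula by hand in Python. The sketch below is not the parser asked for in the assignment; it only spells out the sums and products group by group. The helper times() and the variable names are invented for this illustration, and collections.Counter is used as a dictionary that supports addition.

```python
from collections import Counter


def times(counts, n):
    """Multiply every stoichiometric number in counts by n."""
    return Counter({atom: k * n for atom, k in counts.items()})


ch3 = Counter(C=1, H=3)                   # CH3
branch = Counter(C=1) + times(ch3, 2)     # C(CH3)2
cooh = Counter(C=1, O=2, H=1)             # COOH
total = cooh + times(branch, 3) + ch3     # COOH(C(CH3)2)3CH3

print(dict(total))
```

Adding the counters corresponds to the + in the mass expression and times() to the multiplication by the number after a closing parenthesis.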

To make the syntax clear: [{}] means a list of length one which contains an
empty dictionary. Note that for technical reasons the hash table is put inside a list
(an array). This makes later use of the code easier (the exact reason is not visible
at the moment). To write a parser we must know a little about Backus-Naur
Form (BNF). The idea is quite simple, but it is hard to explain in words. An
example serves better. Here is the BNF description of a floating-point decimal number:


S := FN | '-' FN 

FN := DL | DL '.' DL 

DL := D | D DL 

D := '0' | '1' | ... | '9' 

Here S stands for sentence, FN for floating number, DL for digit list and D for digit.
These are called the production rules. They have the form SYMBOL := SYMBOL
| TERMINAL. A symbol is something that is defined by := while a terminal is a
literal string in quotes. We see that our number is composed of the terminals -, .,
0, 1, ..., 9. OK, fine. Let's see if the BNF can represent a number for us. Starting at
the top of the production list we continue making arbitrary decisions till there is
nothing more to decide:


S

S := '-' ? D + ('.' D +) ? 

D := '0' | '1' | ... | '9' 

This is definitely simpler and it is also quite close to the regular expression (RE)
notation in Python. Actually, there are many dialects of RE but they are all close to
this form:


S := (-)?([0-9]+)(\.([0-9]+))? 

or even simpler: 

S := -?\d+(\.\d+)? 
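The simplified expression can be tried directly with Python's re module. The following is a minimal sketch; the names FLOAT and is_number and the sample text are chosen for this example only.

```python
import re

# The simplified expression for S. (?: ... ) groups without capturing,
# so that findall() returns the whole matches.
FLOAT = re.compile(r'-?\d+(?:\.\d+)?')

text = 'T = 273.15 K, dT = -5 K, p = 1.013 bar'
numbers = FLOAT.findall(text)
print(numbers)


def is_number(s):
    """True if the whole string s is a number according to S."""
    return re.match(r'-?\d+(\.\d+)?$', s) is not None


print(is_number('-3.14'), is_number('3.'), is_number('.5'))
```

Note that S requires digits on both sides of the decimal point, so '3.' and '.5' are rejected as whole strings.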

The idea is now to use S inside a program to match all occurrences of floating point
numbers. This is an incredibly strong concept as it opens up for the programming of
programming languages (making parsers and compilers). Now, back to our
chemical formula: we need only three regular expressions:

1) An atom name (chemical symbol) followed by nothing or an integer. 

2) A left delimiter (left parenthesis). 

3) A right delimiter (right parenthesis) followed by nothing or an integer. 

At the moment these expressions will do all right: 


ATOM := ([A-Z][a-z]?)(\d+)? 

LDEL := \(

RDEL := \)(\d+)?

I haven't mentioned it yet, but there are a few reserved characters in REs. These
include: ., -, +, (, ), [, ], {, }, ?, |, ^ and $. Any use of these characters as
terminal strings must be preceded by \ (a backslash). The technique is called
"escaping" in the local lingo.

The trick is now to make use of ATOM, LDEL and RDEL to break the chemical
formula into bits and pieces using recursive function calls starting at the left end of
the formula. Exactly how this procedure should be written is part of your
assignment (but you are welcome to ask).
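Before writing the recursive procedure it may help to try the three expressions on their own. The sketch below only shows what re.match returns for each of them; it is not a solution to the assignment. The delimiter patterns are written here as escaped parentheses, which is an assumption about the intended expressions, and the test strings are made up.

```python
import re

ATOM = re.compile(r'([A-Z][a-z]?)(\d+)?')
LDEL = re.compile(r'\(')          # assumed: an escaped left parenthesis
RDEL = re.compile(r'\)(\d+)?')    # assumed: escaped ')' plus optional number

# An atom name followed by a number: both groups are filled.
m = ATOM.match('Cl2O')
name, count, next_pos = m.group(1), m.group(2), m.end()

# An atom name followed by nothing: the second group is None.
no_count = ATOM.match('CH3').groups()

# Delimiters are matched at the left end of the remaining formula.
opens = LDEL.match('(CH3)2') is not None
closing = RDEL.match(')3CH3').group(1)

print(name, count, next_pos, no_count, opens, closing)
```

The value of m.end() tells where the unparsed remainder of the formula starts, which is the piece a recursive call would continue with.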



5.5.1 Alan J. Perlis (1982), see also Sec. 2.29 

First reference occurs in 2000 languages, see Section 2.29 on page 165. 


5.5.2 atoms.py, see also Sec. 5.3.3 

First reference occurs in atoms.py, see Section 5.3.3 on page 298. 


Python v2.7.3 documentation » The Python Tutorial » 



5. Data Structures 

This chapter describes some things youʼve 

learned about already in more detail, and 

adds some new things as well. 

5.1. More on Lists 

The list data type has some more methods. 

Here are all of the methods of list objects: 

list.append(x) 

Add an item to the end of the list; 

equivalent to a[len(a):] = [x]. 

list.extend(L) 

Extend the list by appending all the items 

in the given list; equivalent to a[len(a):] 

= L. 

list.insert(i, x) 

Insert an item at a given position. The 

first argument is the index of the element 

before which to insert, so a.insert(0, x) 

inserts at the front of the list, and 

a.insert(len(a), x) is equivalent to 

a.append(x). 

list.remove(x) 

Remove the first item from the list whose 

value is x. It is an error if there is no such 

item. 

list.pop([i]) 

Remove the item at the given position in 

the list, and return it. If no index is 

specified, a.pop() removes and returns 

the last item in the list. (The square 

brackets around the i in the method 

signature denote that the parameter is 

optional, not that you should type square

brackets at that position. You will see this

notation frequently in the Python Library 

Reference.) 

list.index(x) 

Return the index in the list of the first 

item whose value is x. It is an error if 

there is no such item. 

list.count(x) 

Return the number of times x appears in 

the list. 

list.sort() 

Sort the items of the list, in place. 

list.reverse() 

Reverse the elements of the list, in place. 

An example that uses most of the list 

methods: 

>>> a = [66.25, 333, 333, 1, 1234.5]
>>> print a.count(333), a.count(66.25), a.count('x')

2 1 0 

>>> a.insert(2, -1) 

>>> a.append(333) 

>>> a 

[66.25, 333, -1, 333, 1, 1234.5, 333] 

>>> a.index(333) 

1 

>>> a.remove(333) 

>>> a 

[66.25, -1, 333, 1, 1234.5, 333] 

>>> a.reverse() 

>>> a 

[333, 1234.5, 1, 333, -1, 66.25] 

>>> a.sort() 

>>> a 

[-1, 1, 66.25, 333, 333, 1234.5] 

5.1.1. Using Lists as Stacks 

The list methods make it very easy to use a 

list as a stack, where the last element added 

is the first element retrieved (“last-in, first-out”).

To add an item to the top of the stack, 

use append(). To retrieve an item from the top

of the stack, use pop() without an explicit 

index. For example: 

>>> stack = [3, 4, 5] 

>>> stack.append(6) 

>>> stack.append(7) 

>>> stack 

[3, 4, 5, 6, 7] 

>>> stack.pop() 

7 

>>> stack 

[3, 4, 5, 6] 


>>> stack.pop()
6
>>> stack.pop()
5

>>> stack 

[3, 4] 

5.1.2. Using Lists as Queues 

It is also possible to use a list as a queue, 

where the first element added is the first 

element retrieved (“first-in, first-out”); 

however, lists are not efficient for this 

purpose. While appends and pops from the 

end of list are fast, doing inserts or pops from 

the beginning of a list is slow (because all of 

the other elements have to be shifted by 

one). 

To implement a queue, use collections.deque 

which was designed to have fast appends 

and pops from both ends. For example: 

>>> from collections import deque
>>> queue = deque(["Eric", "John", "Michael"])
>>> queue.append("Terry")           # Terry arrives
>>> queue.append("Graham")          # Graham arrives
>>> queue.popleft()                 # The first to arrive now leaves
'Eric'
>>> queue.popleft()                 # The second to arrive now leaves
'John'
>>> queue                           # Remaining queue in order of arrival
deque(['Michael', 'Terry', 'Graham'])

5.1.3. Functional Programming Tools

There are three built-in functions that are 

very useful when used with lists: filter(), 

map(), and reduce(). 

filter(function, sequence) returns a 

sequence consisting of those items from the 

sequence for which function(item) is true. If 

sequence is a string or tuple, the result will 

be of the same type; otherwise, it is always a 

list. For example, to compute a sequence of 

numbers divisible by neither 2 nor 3:

>>> def f(x): return x % 2 != 0 and x % 3 != 0

... 

>>> filter(f, range(2, 25)) 

[5, 7, 11, 13, 17, 19, 23] 

map(function, sequence) calls function(item) 

for each of the sequenceʼs items and returns 

a list of the return values. For example, to 

compute some cubes: 

>>> def cube(x): return x*x*x

... 

>>> map(cube, range(1, 11)) 

[1, 8, 27, 64, 125, 216, 343, 512, 729, 1000] 

More than one sequence may be passed; the 

function must then have as many arguments 

as there are sequences and is called with the 

corresponding item from each sequence (or 

None if some sequence is shorter than 

another). For example: 

>>> seq = range(8) 

>>> def add(x, y): return x+y 

... 

>>> map(add, seq, seq) 

[0, 2, 4, 6, 8, 10, 12, 14] 

reduce(function, sequence) returns a single 

value constructed by calling the binary 

function function on the first two items of the 

sequence, then on the result and the next 

item, and so on. For example, to compute the 

sum of the numbers 1 through 10:

>>> def add(x,y): return x+y 

... 

>>> reduce(add, range(1, 11)) 

55 

If thereʼs only one item in the sequence, its 

value is returned; if the sequence is empty, 

an exception is raised. 
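The examples above are written for Python 2. In Python 3, filter() and map() return iterators instead of lists, and reduce() has moved to the functools module. The following sketch is one way to write the same three computations so that they run unchanged on both versions; the function names are chosen for this example.

```python
from functools import reduce   # also available in Python 2.6 and later


def not_div_2_or_3(x):
    return x % 2 != 0 and x % 3 != 0


def cube(x):
    return x * x * x


def add(x, y):
    return x + y


# list() is a no-op in Python 2 and materializes the iterator in Python 3.
kept = list(filter(not_div_2_or_3, range(2, 25)))
cubes = list(map(cube, range(1, 11)))
total = reduce(add, range(1, 11))

print(kept)
print(cubes)
print(total)
```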

A third argument can be passed to indicate 

the starting value. In this case the starting 

value is returned for an empty sequence, and 

the function is first applied to the starting 

value and the first sequence item, then to the 

result and the next item, and so on. For 

example, 

>>> def sum(seq):
...     def add(x,y): return x+y
...     return reduce(add, seq, 0)
...

>>> sum(range(1, 11)) 

55 

>>> sum([]) 

0 

Donʼt use this exampleʼs definition of sum(): 

since summing numbers is such a common 

need, a built-in function sum(sequence) is 

already provided, and works exactly like this. 


5.1.4. List Comprehensions 

List comprehensions provide a concise way 

to create lists. Common applications are to 

make new lists where each element is the 

result of some operations applied to each 

member of another sequence or iterable, or 

to create a subsequence of those elements 

that satisfy a certain condition. 

For example, assume we want to create a list 

of squares, like:

>>> squares = []
>>> for x in range(10):
...     squares.append(x**2)
...

>>> squares 

[0, 1, 4, 9, 16, 25, 36, 49, 64, 81] 

We can obtain the same result with: 

squares = [x**2 for x in range(10)] 

This is also equivalent to squares = 

map(lambda x: x**2, range(10)), but itʼs more 

concise and readable. 

A list comprehension consists of brackets 

containing an expression followed by a for 

clause, then zero or more for or if clauses. 

The result will be a new list resulting from 

evaluating the expression in the context of 

the for and if clauses which follow it. For 

example, this listcomp combines the
elements of two lists if they are not equal:

>>> [(x, y) for x in [1,2,3] for y in [3,1,4] if x != y]
[(1, 3), (1, 4), (2, 3), (2, 1), (2, 4), (3, 1), (3, 4)]

and it's equivalent to:

>>> combs = []
>>> for x in [1,2,3]:
...     for y in [3,1,4]:
...         if x != y:
...             combs.append((x, y))
...
>>> combs
[(1, 3), (1, 4), (2, 3), (2, 1), (2, 4), (3, 1), (3, 4)]

Note how the order of the for and if 

statements is the same in both these 

snippets. 

If the expression is a tuple (e.g. the (x, y) in 

the previous example), it must be 

parenthesized.

>>> vec = [-4, -2, 0, 2, 4]
>>> # create a new list with the values doubled
>>> [x*2 for x in vec]
[-8, -4, 0, 4, 8]
>>> # filter the list to exclude negative numbers
>>> [x for x in vec if x >= 0]
[0, 2, 4]
>>> # apply a function to all the elements
>>> [abs(x) for x in vec]
[4, 2, 0, 2, 4]
>>> # call a method on each element
>>> freshfruit = [' banana', ' loganberry ', 'passion fruit ']
>>> [weapon.strip() for weapon in freshfruit]
['banana', 'loganberry', 'passion fruit']
>>> # create a list of 2-tuples like (number, square)
>>> [(x, x**2) for x in range(6)]
[(0, 0), (1, 1), (2, 4), (3, 9), (4, 16), (5, 25)]
>>> # the tuple must be parenthesized, otherwise an error is raised
>>> [x, x**2 for x in range(6)]
  File "<stdin>", line 1
    [x, x**2 for x in range(6)]
               ^
SyntaxError: invalid syntax
>>> # flatten a list using a listcomp with two 'for'
>>> vec = [[1,2,3], [4,5,6], [7,8,9]]
>>> [num for elem in vec for num in elem]
[1, 2, 3, 4, 5, 6, 7, 8, 9]

List comprehensions can contain complex 

expressions and nested functions: 

>>> from math import pi
>>> [str(round(pi, i)) for i in range(1, 6)]

['3.1', '3.14', '3.142', '3.1416', '3.14159'] 

5.1.4.1. Nested List Comprehensions 

The initial expression in a list comprehension 

can be any arbitrary expression, including 

another list comprehension. 

Consider the following example of a 3x4 

matrix implemented as a list of 3 lists of 

length 4: 

>>> matrix = [
...     [1, 2, 3, 4],
...     [5, 6, 7, 8],
...     [9, 10, 11, 12],
... ]

The following list comprehension will
transpose rows and columns:

>>> [[row[i] for row in matrix] for i in range(4)]

[[1, 5, 9], [2, 6, 10], [3, 7, 11], [4, 8, 12]] 

As we saw in the previous section, the 

nested listcomp is evaluated in the context of 

the for that follows it, so this example is 

equivalent to: 

>>> transposed = []
>>> for i in range(4):
...     transposed.append([row[i] for row in matrix])
...

>>> transposed 

[[1, 5, 9], [2, 6, 10], [3, 7, 11], [4, 8, 12]] 

which, in turn, is the same as: 

>>> transposed = []
>>> for i in range(4):
...     # the following 3 lines implement the nested listcomp
...     transposed_row = []
...     for row in matrix:
...         transposed_row.append(row[i])
...     transposed.append(transposed_row)
...

>>> transposed 

[[1, 5, 9], [2, 6, 10], [3, 7, 11], [4, 8, 12]] 

In the real world, you should prefer built-in 

functions to complex flow statements. The 

zip() function would do a great job for this 

use case: 

>>> zip(*matrix)

[(1, 5, 9), (2, 6, 10), (3, 7, 11), (4, 8, 12)] 

See Unpacking Argument Lists for details on 

the asterisk in this line. 
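Note that zip() yields tuples, and in Python 3 it returns an iterator rather than a list. If a list of lists is wanted, as in the list comprehension version, a small sketch like the following works on both Python 2 and Python 3.

```python
matrix = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
]

# zip(*matrix) pairs up the i-th items of every row; list() turns each
# resulting tuple into a list.
transposed = [list(column) for column in zip(*matrix)]
print(transposed)
```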

5.2. The del statement 

There is a way to remove an item from a list

given its index instead of its value: the del 

statement. This differs from the pop() method 

which returns a value. The del statement can 

also be used to remove slices from a list or 

clear the entire list (which we did earlier by 

assignment of an empty list to the slice). For 

example: 

>>> a = [-1, 1, 66.25, 333, 333, 1234.5]

>>> del a[0] 

>>> a 

[1, 66.25, 333, 333, 1234.5] 

>>> del a[2:4] 

>>> a 

[1, 66.25, 1234.5] 

>>> del a[:] 

>>> a 

[] 

del can also be used to delete entire 

variables: 

>>> del a 

Referencing the name a hereafter is an error 

(at least until another value is assigned to it). 

Weʼll find other uses for del later. 

5.3. Tuples and Sequences 

We saw that lists and strings have many 

common properties, such as indexing and 

slicing operations. They are two examples of 

sequence data types (see Sequence Types 

— str, unicode, list, tuple, bytearray, buffer, 

xrange). Since Python is an evolving 

language, other sequence data types may be 

added. There is also another standard 

sequence data type: the tuple. 

A tuple consists of a number of values 

separated by commas, for instance:

t = 12345, 54321, 'hello!' >>> 

>>> t[0] 

12345 

>>> t 

(12345, 54321, 'hello!') 

>>> # Tuples may be nested: 

... u = t, (1, 2, 3, 4, 5) 

>>> u 

((12345, 54321, 'hello!'), (1, 2, 3, 4, 5)) 

>>> # Tuples are immutable: 

... t[0] = 88888
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'tuple' object does not support item assignment

>>> # but they can contain mutable objects: 

... v = ([1, 2, 3], [3, 2, 1]) 

>>> v 

([1, 2, 3], [3, 2, 1]) 

As you see, on output tuples are always 

enclosed in parentheses, so that nested 

tuples are interpreted correctly; they may be 

input with or without surrounding 

parentheses, although often parentheses are 

necessary anyway (if the tuple is part of a 

larger expression). It is not possible to assign 

to the individual items of a tuple, however it is 

possible to create tuples which contain 

mutable objects, such as lists. 

Though tuples may seem similar to lists, they 

are often used in different situations and for 

different purposes. Tuples are immutable, 

and usually contain a heterogeneous 

sequence of elements that are accessed via 

unpacking (see later in this section) or 

indexing (or even by attribute in the case of 

namedtuples). Lists are mutable, and their 

elements are usually homogeneous and are 

accessed by iterating over the list. 

A special problem is the construction of 

tuples containing 0 or 1 items: the syntax has 

some extra quirks to accommodate these. 

Empty tuples are constructed by an empty 

pair of parentheses; a tuple with one item is 

constructed by following a value with a 

comma (it is not sufficient to enclose a single

value in parentheses). Ugly, but effective. For 

example: 

>>> empty = ()
>>> singleton = 'hello',    # <-- note trailing comma
>>> len(empty)

0 

>>> len(singleton) 

1 

>>> singleton 

('hello',) 

The statement t = 12345, 54321, 'hello!' is 

an example of tuple packing: the values 

12345, 54321 and 'hello!' are packed 

together in a tuple. The reverse operation is 

also possible: 

>>> x, y, z = t 

This is called, appropriately enough, 

sequence unpacking and works for any 

sequence on the right-hand side. Sequence 

unpacking requires the list of variables on the 

left to have the same number of elements as 

the length of the sequence. Note that multiple 

assignment is really just a combination of 

tuple packing and sequence unpacking. 
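A short sketch of packing and unpacking, including the swap idiom that multiple assignment makes possible and the error raised when the number of names does not match the sequence length (the names p, q and mismatch are only illustrative):

```python
t = 12345, 54321, 'hello!'   # tuple packing
x, y, z = t                  # sequence unpacking

# Multiple assignment packs the right-hand side into a tuple and then
# unpacks it, so two names can be swapped without a temporary variable.
x, y = y, x

# The number of names on the left must equal the length of the sequence.
try:
    p, q = t
    mismatch = None
except ValueError as exc:
    mismatch = str(exc)

print(x)
print(y)
print(z)
print(mismatch)
```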

5.4. Sets 

Python also includes a data type for sets. A 

set is an unordered collection with no 

duplicate elements. Basic uses include 

membership testing and eliminating duplicate 

entries. Set objects also support 

mathematical operations like union, 

intersection, difference, and symmetric 

difference. 

Here is a brief demonstration: 

>>> basket = ['apple', 'orange', 'apple', 'pear', 'orange', 'banana']
>>> fruit = set(basket)               # create a set without duplicates
>>> fruit
set(['orange', 'pear', 'apple', 'banana'])
>>> 'orange' in fruit                 # fast membership testing
True
>>> 'crabgrass' in fruit
False

>>> # Demonstrate set operations on unique letters from two words
...
>>> a = set('abracadabra')
>>> b = set('alacazam')
>>> a                                  # unique letters in a
set(['a', 'r', 'b', 'c', 'd'])
>>> a - b                              # letters in a but not in b
set(['r', 'd', 'b'])
>>> a | b                              # letters in either a or b
set(['a', 'c', 'r', 'd', 'b', 'm', 'z', 'l'])
>>> a & b                              # letters in both a and b
set(['a', 'c'])
>>> a ^ b                              # letters in a or b but not both
set(['r', 'd', 'b', 'm', 'z', 'l'])

5.5. Dictionaries 

Another useful data type built into Python is 

the dictionary (see Mapping Types — dict). 

Dictionaries are sometimes found in other 

languages as “associative memories” or 

“associative arrays”. Unlike sequences, 

which are indexed by a range of numbers, 

dictionaries are indexed by keys, which can 

be any immutable type; strings and numbers 

can always be keys. Tuples can be used as 

keys if they contain only strings, numbers, or 

tuples; if a tuple contains any mutable object 

either directly or indirectly, it cannot be used 

as a key. You canʼt use lists as keys, since 

lists can be modified in place using index 

assignments, slice assignments, or methods 

like append() and extend(). 

It is best to think of a dictionary as an 

unordered set of key: value pairs, with the 

requirement that the keys are unique (within 

one dictionary). A pair of braces creates an 

empty dictionary: {}. Placing a comma-separated 

list of key:value pairs within the 

braces adds initial key:value pairs to the 

dictionary; this is also the way dictionaries 

are written on output. 

The main operations on a dictionary are 

storing a value with some key and extracting 

the value given the key. It is also possible to 

delete a key:value pair with del. If you store 

using a key that is already in use, the old 

value associated with that key is forgotten. It 

is an error to extract a value using a non-existent 

key. 

The keys() method of a dictionary object 

returns a list of all the keys used in the 

dictionary, in arbitrary order (if you want it 

sorted, just apply the sorted() function to it). 

To check whether a single key is in the 

dictionary, use the in keyword. 

Here is a small example using a dictionary: 

>>> tel = {'jack': 4098, 'sape': 4139}

>>> tel['guido'] = 4127 

>>> tel 

{'sape': 4139, 'guido': 4127, 'jack': 4098} 

>>> tel['jack'] 

4098 

>>> del tel['sape'] 

>>> tel['irv'] = 4127 

>>> tel 

{'guido': 4127, 'irv': 4127, 'jack': 4098} 

>>> tel.keys() 

['guido', 'irv', 'jack'] 

>>> 'guido' in tel 

True 
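The session above does not show the error case. A minimal sketch, assuming the same tel dictionary, of what happens with a non-existent key and how the dict.get() method avoids the exception (the names missing and fallback are only illustrative):

```python
tel = {'jack': 4098, 'sape': 4139}

# Extracting a value with a non-existent key raises KeyError.
try:
    tel['guido']
    missing = False
except KeyError:
    missing = True

# dict.get() returns a default value instead of raising.
fallback = tel.get('guido', 0)

# Storing with a key that is already in use replaces the old value.
tel['jack'] = 4100

print(missing)
print(fallback)
print(sorted(tel.items()))
```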

The dict() constructor builds dictionaries 

directly from lists of key-value pairs stored as 

tuples. When the pairs form a pattern, list 

comprehensions can compactly specify the 

key-value list. 

>>> dict([('sape', 4139), ('guido', 4127), ('jack', 4098)])

{'sape': 4139, 'jack': 4098, 'guido': 4127} 

>>> dict([(x, x**2) for x in (2, 4, 6)]) 

{2: 4, 4: 16, 6: 36}

Later in the tutorial, we will learn about 

Generator Expressions which are even better 

suited for the task of supplying key-value 

pairs to the dict() constructor. 
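As a preview, a generator expression passed to dict() looks like the list comprehension version without the square brackets, so no intermediate list is built (a sketch; the name squares is only illustrative):

```python
# Each generated item is a (key, value) tuple consumed directly by dict().
squares = dict((x, x**2) for x in (2, 4, 6))

print(sorted(squares.items()))
```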

When the keys are simple strings, it is 

sometimes easier to specify pairs using 

keyword arguments: 

>>> dict(sape=4139, guido=4127, jack=4098)

{'sape': 4139, 'jack': 4098, 'guido': 4127} 

5.6. Looping Techniques 

When looping through a sequence, the 

position index and corresponding value can 

be retrieved at the same time using the 

enumerate() function. 

>>> for i, v in enumerate(['tic', 'tac', 'toe']):
...     print i, v

... 

0 tic 

1 tac 

2 toe 

To loop over two or more sequences at the 

same time, the entries can be paired with the 

zip() function. 

>>> questions = ['name', 'quest', 'favorite color']
>>> answers = ['lancelot', 'the holy grail', 'blue']
>>> for q, a in zip(questions, answers):
...     print 'What is your {0}?  It is {1}.'.format(q, a)

... 

What is your name? It is lancelot. 

What is your quest? It is the holy grail. 

What is your favorite color? It is blue. 

To loop over a sequence in reverse, first 

specify the sequence in a forward direction 

and then call the reversed() function.

>>> for i in reversed(xrange(1,10,2)):
...     print i

... 

9 

7 

5 

3 

1 

To loop over a sequence in sorted order, use 

the sorted() function which returns a new 

sorted list while leaving the source unaltered. 

>>> basket = ['apple', 'orange', 'apple', 'pear', 'orange', 'banana']
>>> for f in sorted(set(basket)):
...     print f

... 

apple 

banana 

orange 

pear 

When looping through dictionaries, the key 

and corresponding value can be retrieved at 

the same time using the iteritems() method. 

>>> knights = {'gallahad': 'the pure', 'robin': 'the brave'}
>>> for k, v in knights.iteritems():
...     print k, v

... 

gallahad the pure 

robin the brave 

5.7. More on Conditions 

The conditions used in while and if 

statements can contain any operators, not 

just comparisons. 

The comparison operators in and not in 

check whether a value occurs (does not 

occur) in a sequence. The operators is and 

is not compare whether two objects are 

really the same object; this only matters for 

mutable objects like lists. All comparison 

operators have the same priority, which is 

lower than that of all numerical operators. 

Comparisons can be chained. For example, a < b == c 

tests whether a is less than b and 

moreover b equals c. 

Comparisons may be combined using the 

Boolean operators and and or, and the 

outcome of a comparison (or of any other 

Boolean expression) may be negated with 

not. These have lower priorities than 

comparison operators; between them, not 

has the highest priority and or the lowest, so 

that A and not B or C is equivalent to (A and 

(not B)) or C. As always, parentheses can 

be used to express the desired composition. 

The Boolean operators and and or are so-called 

short-circuit operators: their arguments 

are evaluated from left to right, and 

evaluation stops as soon as the outcome is 

determined. For example, if A and C are true 

but B is false, A and B and C does not 

evaluate the expression C. When used as a 

general value and not as a Boolean, the 

return value of a short-circuit operator is the 

last evaluated argument. 

It is possible to assign the result of a 

comparison or other Boolean expression to a 

variable. For example, 

>>> string1, string2, string3 = '', 'Trondheim', 'Hammer Dance'

>>> non_null = string1 or string2 or string3 

>>> non_null 

'Trondheim' 
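The short-circuit behaviour can be observed directly. In this sketch, check is a hypothetical helper that records which operands were actually evaluated:

```python
calls = []

def check(name, value):
    """Record that the operand called `name` was evaluated, then return it."""
    calls.append(name)
    return value

# B is false, so evaluation stops there and C is never evaluated.
result = check('A', True) and check('B', False) and check('C', True)

# The value of an 'or' chain is the last operand evaluated, not just True/False.
default = '' or 0 or 'fallback'

print(result)
print(calls)
print(default)
```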

Note that in Python, unlike C, assignment 

cannot occur inside expressions. C 

programmers may grumble about this, but it 

avoids a common class of problems 

encountered in C programs: typing = in an 

expression when == was intended. 

5.8. Comparing Sequences and Other Types 

Sequence objects may be compared to other 

objects with the same sequence type. The 

comparison uses lexicographical ordering: 

first the first two items are compared, and if 

they differ this determines the outcome of the 

comparison; if they are equal, the next two 

items are compared, and so on, until either 

sequence is exhausted. If two items to be 

compared are themselves sequences of the 

same type, the lexicographical comparison is 

carried out recursively. If all items of two 

sequences compare equal, the sequences 

are considered equal. If one sequence is an 

initial sub-sequence of the other, the shorter 

sequence is the smaller (lesser) one. 

Lexicographical ordering for strings uses the 

ASCII ordering for individual characters. 

Some examples of comparisons between 

sequences of the same type: 

(1, 2, 3) < (1, 2, 4) 

[1, 2, 3] < [1, 2, 4] 

'ABC' < 'C' < 'Pascal' < 'Python' 

(1, 2, 3, 4) < (1, 2, 4) 

(1, 2) < (1, 2, -1) 

(1, 2, 3) == (1.0, 2.0, 3.0) 

(1, 2, ('aa', 'ab')) < (1, 2, ('abc', 'a'), 4) 

Note that comparing objects of different types 

is legal. The outcome is deterministic but 

arbitrary: the types are ordered by their 

name. Thus, a list is always smaller than a 

string, a string is always smaller than a tuple, 

etc. [1] Mixed numeric types are compared 

according to their numeric value, so 0 equals 

0.0, etc. 

Footnotes 

[1] The rules for comparing objects of 

different types should not be relied 

upon; they may change in a future 

version of the language.



© Copyright 1990-2012, Python Software Foundation. 

Last updated on Sep 06, 2012. 

5.5.4 Backus-Naur Formalism, see also Sec. 2.13 

First reference occurs in BNF and EBNF (L. M. Garshol), see Section 2.13 on page 85. 


Copyright © tutorialspoint.com 

Python - Regular Expressions 


A regular expression is a special sequence of characters that helps you match or find 

other strings or sets of strings, using a specialized syntax held in a pattern. Regular 

expressions are widely used in the UNIX world. 

The module re provides full support for Perl-like regular expressions in Python. The 

re module raises the exception re.error if an error occurs while compiling or using a 

regular expression. 

We will cover two important functions that are used to handle regular 

expressions. But a small thing first: various characters have a 

special meaning when they are used in a regular expression. To avoid any confusion 

while dealing with regular expressions we will use raw strings, written as r'expression'. 

The match Function 

This function attempts to match the RE pattern at the beginning of string, with optional flags. 

Here is the syntax for this function: 

re.match(pattern, string, flags=0) 

Here is the description of the parameters: 

Parameter Description 

pattern This is the regular expression to be matched. 

string This is the string which is searched to match the pattern. 

flags You can specify different flags using bitwise OR (|). These are modifiers which are listed in the table below. 

The re.match function returns a match object on success, None on failure. We 

use the group(num) or groups() method of the match object to get the matched 

expression. 

Match Object Methods Description 

group(num=0) This method returns the entire match (or the specific subgroup num). 

groups() This method returns all matching subgroups in a tuple (empty if there weren't any). 

Example: 

#!/usr/bin/python
import re

line = "Cats are smarter than dogs"

matchObj = re.match(r'(.*) are(\.*)', line, re.M|re.I)
if matchObj:
    print "matchObj.group() : ", matchObj.group()
    print "matchObj.group(1) : ", matchObj.group(1)
    print "matchObj.group(2) : ", matchObj.group(2)
else:
    print "No match!!"

This will produce the following result: 

matchObj.group() :  Cats are 

matchObj.group(1) : Cats 

matchObj.group(2) : 

The search Function 

This function searches for the first occurrence of the RE pattern within string, with optional 

flags. 

Here is the syntax for this function: 

re.search(pattern, string, flags=0) 

Here is the description of the parameters: 

Parameter Description 

pattern This is the regular expression to be matched. 

string This is the string which is searched to match the pattern. 

flags You can specify different flags using bitwise OR (|). These are modifiers which are listed in the table below. 

The re.search function returns a match object on success, None on failure. We 

use the group(num) or groups() method of the match object to get the matched 

expression. 

Match Object Methods Description 

group(num=0) This method returns the entire match (or the specific subgroup num). 

groups() This method returns all matching subgroups in a tuple (empty if there weren't any). 

Example: 

#!/usr/bin/python
import re

line = "Cats are smarter than dogs"

matchObj = re.search(r'(.*) are(\.*)', line, re.M|re.I)
if matchObj:
    print "matchObj.group() : ", matchObj.group()
    print "matchObj.group(1) : ", matchObj.group(1)
    print "matchObj.group(2) : ", matchObj.group(2)
else:
    print "No match!!"

This will produce the following result: 

matchObj.group() :  Cats are 
matchObj.group(1) :  Cats 
matchObj.group(2) : 

Matching vs Searching: 

Python offers two different primitive operations based on regular expressions: match 

checks for a match only at the beginning of the string, while search checks for a 

match anywhere in the string (this is what Perl does by default). 

Example: 

#!/usr/bin/python
import re

line = "Cats are smarter than dogs"

matchObj = re.match(r'dogs', line, re.M|re.I)
if matchObj:
    print "match --> matchObj.group() : ", matchObj.group()
else:
    print "No match!!"

matchObj = re.search(r'dogs', line, re.M|re.I)
if matchObj:
    print "search --> matchObj.group() : ", matchObj.group()
else:
    print "No match!!"

This will produce the following result: 

No match!! 
search --> matchObj.group() :  dogs 

Search and Replace: 

One of the most important re functions is sub. 

Syntax: 

re.sub(pattern, repl, string, count=0) 

This function replaces occurrences of the RE pattern in string with repl, substituting 

all occurrences unless count is provided. It returns the modified string. 

Example: 


#!/usr/bin/python
import re

phone = "2004-959-559 #This is Phone Number"

# Delete Python-style comments
num = re.sub(r'#.*$', "", phone)
print "Phone Num : ", num

# Remove anything other than digits
num = re.sub(r'\D', "", phone)
print "Phone Num : ", num

This will produce the following result: 

Phone Num :  2004-959-559 
Phone Num :  2004959559 
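The effect of the count argument can be seen in a small sketch (the variable names are only illustrative; the code runs on Python 2.7 and Python 3):

```python
import re

phone = "2004-959-559 # This is Phone Number"

# count=1 replaces only the first occurrence of the pattern.
first_only = re.sub(r'-', '', phone, count=1)

# The default count=0 replaces every occurrence.
all_dashes = re.sub(r'-', '', phone)

print(first_only)
print(all_dashes)
```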

Regular-expression Modifiers - Option Flags 

The regular expression functions accept an optional modifier to control various 

aspects of matching. The modifiers are specified as an optional flag. You can provide 

multiple modifiers using bitwise OR (|), as shown previously, and each may be 

one of these: 

Modifier Description 

re.I Performs case-insensitive matching. 

re.L Interprets words according to the current locale. This interpretation affects the alphabetic group (\w and \W), as well as word boundary behavior (\b and \B). 

re.M Makes $ match the end of a line (not just the end of the string) and makes ^ match the start of any line (not just the start of the string). 

re.S Makes a period (dot) match any character, including a newline. 

re.U Interprets letters according to the Unicode character set. This flag affects the behavior of \w, \W, \b, \B. 

re.X Permits "cuter" regular expression syntax. It ignores whitespace (except inside a set [] or when escaped by a backslash), and treats unescaped # as a comment marker. 
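A brief sketch of how some of these flags change the result of a match. The sample text and variable names are made up for illustration:

```python
import re

text = "first line\nSecond LINE\nthird line"

# re.M: ^ matches at the start of every line, not only at the start of the string.
starts = re.findall(r'^\w+', text, re.M)

# re.I: case-insensitive matching.
insensitive = re.findall(r'line', text, re.I)

# re.S: the dot also matches a newline, so the pattern can span lines.
spans_with_s = re.search(r'first.*third', text, re.S) is not None
spans_without_s = re.search(r'first.*third', text) is not None

# Flags are combined with bitwise OR.
combined = re.findall(r'^second', text, re.M | re.I)

# re.X: whitespace and # comments inside the pattern are ignored.
verbose = re.compile(r"""
    (\d{3})   # first group: three digits
    -         # a literal dash
    (\d{4})   # second group: four digits
""", re.X)
groups = verbose.match("555-1234").groups()

print(starts)
print(insensitive)
print(spans_with_s)
print(spans_without_s)
print(combined)
print(groups)
```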

Regular-expression patterns: 

Except for control characters, (+ ? . * ^ $ ( ) [ ] { } | \), all characters match 

themselves. You can escape a control character by preceding it with a backslash. 

The following table lists the regular expression syntax that is available in Python. 

Pattern Description 

^ Matches beginning of string (and of each line when re.M is set). 

$ Matches end of string, or just before a newline at the end of the string (and at the end of each line when re.M is set). 

. Matches any single character except newline. Using the re.S option allows it to match newline as well. 

[...] Matches any single character in brackets. 

[^...] Matches any single character not in brackets. 

re* Matches 0 or more occurrences of preceding expression. 

re+ Matches 1 or more occurrences of preceding expression. 

re? Matches 0 or 1 occurrence of preceding expression. 

re{n} Matches exactly n occurrences of preceding expression. 

re{n,} Matches n or more occurrences of preceding expression. 

re{n,m} Matches at least n and at most m occurrences of preceding expression. 

a|b Matches either a or b. 

(re) Groups regular expressions and remembers matched text. 

(?imx) Turns on the i, m, or x options. In Python 2.7 the flags apply to the entire regular expression. 

(?-imx) Turns off the i, m, or x options (Perl-style syntax; not accepted by the re module in Python 2.7). 

(?:re) Groups regular expressions without remembering matched text. 

(?imx:re) Turns on the i, m, or x options within parentheses (Perl-style syntax; not accepted by the re module in Python 2.7). 

(?-imx:re) Turns off the i, m, or x options within parentheses (Perl-style syntax; not accepted by the re module in Python 2.7). 

(?#...) Comment. 

(?=re) Specifies position using a pattern. Doesn't have a range. 

(?!re) Specifies position using pattern negation. Doesn't have a range. 

(?>re) Matches independent pattern without backtracking (Perl-style syntax; not accepted by the re module in Python 2.7). 

\w Matches word characters. 

\W Matches nonword characters. 

\s Matches whitespace. Equivalent to [ \t\n\r\f\v]. 

\S Matches nonwhitespace. 

\d Matches digits. Equivalent to [0-9]. 

\D Matches nondigits. 

\A Matches beginning of string. 

\Z Matches end of string. 

\z Matches end of string in Perl; not a special sequence in Python's re module. 

\G Matches point where last match finished in Perl; not a special sequence in Python's re module. 

\b Matches word boundaries when outside brackets. Matches backspace (0x08) when inside brackets. 

\B Matches nonword boundaries. 

\n, \t, etc. Matches newlines, carriage returns, tabs, etc. 

\1...\9 Matches nth grouped subexpression. 

\10 Matches nth grouped subexpression if it matched already. Otherwise refers to the octal representation of a character code. 

Regular-expression Examples: 

Literal characters: 

Example Description 

python Match "python". 

Character classes: 


[Pp]ython Match "Python" or "python" 

rub[ye] Match "ruby" or "rube" 

[aeiou] Match any one lowercase vowel 

[0-9] Match any digit; same as [0123456789] 

[a-z] Match any lowercase ASCII letter 

[A-Z] Match any uppercase ASCII letter 

[a-zA-Z0-9] Match any of the above 

[^aeiou] Match anything other than a lowercase vowel 

[^0-9] Match anything other than a digit 

Special Character Classes: 


. Match any character except newline 

\d Match a digit: [0-9] 

\D Match a nondigit: [^0-9] 

\s Match a whitespace character: [ \t\r\n\f] 

\S Match nonwhitespace: [^ \t\r\n\f] 

\w Match a single word character: [A-Za-z0-9_] 

\W Match a nonword character: [^A-Za-z0-9_] 

Repetition Cases: 


ruby? Match "rub" or "ruby": the y is optional 

ruby* Match "rub" plus 0 or more ys 

ruby+ Match "rub" plus 1 or more ys 

\d{3} Match exactly 3 digits 

\d{3,} Match 3 or more digits 

\d{3,5} Match 3, 4, or 5 digits 

Nongreedy repetition: 

This matches the smallest number of repetitions: 

<.*> Greedy repetition: matches "<python>perl>" 

<.*?> Nongreedy: matches "<python>" in "<python>perl>" 
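The difference is easy to verify with the sample string from the table (a sketch; the variable names are only illustrative):

```python
import re

text = "<python>perl>"

# Greedy: .* grabs as much as possible, so the match runs to the last '>'.
greedy = re.match(r'<.*>', text).group()

# Nongreedy: .*? grabs as little as possible, so the match stops at the first '>'.
lazy = re.match(r'<.*?>', text).group()

print(greedy)
print(lazy)
```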

Grouping with parentheses: 


\D\d+ No group: + repeats \d 

(\D\d)+ Grouped: + repeats \D\d pair 

([Pp]ython(, )?)+ Match "Python", "Python, python, python", etc.

Backreferences: 

This matches a previously matched group again: 


([Pp])ython&\1ails Match python&pails or Python&Pails 

(['"]).*?\1 Single or double-quoted string. \1 matches whatever the 1st group matched. \2 matches whatever the 2nd group matched, etc. 

Alternatives: 


python|perl Match "python" or "perl" 

rub(y|le) Match "ruby" or "ruble" 

Python(!+|\?) "Python" followed by one or more ! or one ? 
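Backreferences and alternatives can be tried out as follows. This is a sketch with made-up sample strings; the quoted-string pattern uses a nongreedy .*? followed by \1, because a backreference is not interpreted as such inside a character class:

```python
import re

# \1 must repeat exactly what group 1 matched (here: the opening quote).
quoted = re.compile(r"""(['"]).*?\1""")
single = quoted.search("name = 'x'").group()
double = quoted.search('say "hi" now').group()
mismatched = quoted.search("\"oops'")   # opening " but closing ': no match

# A repeated word: group 1 followed by a space and the same word again.
doubled = re.search(r'\b(\w+) \1\b', 'this is is a test').group()

# Alternatives inside a (non-capturing) group.
rubs = re.findall(r'rub(?:y|le)', 'ruby ruble rube')
bang = re.search(r'Python(!+|\?)', 'I like Python!!!').group()

print(single)
print(double)
print(mismatched)
print(doubled)
print(rubs)
print(bang)
```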

Anchors: 

These specify the match position: 


^Python Match "Python" at the start of a string or internal line 

Python$ Match "Python" at the end of a string or line 

\APython Match "Python" at the start of a string 

Python\Z Match "Python" at the end of a string 

\bPython\b Match "Python" at a word boundary 

\brub\B \B is nonword boundary: match "rub" in "rube" and "ruby" but not alone 

Python(?=!) Match "Python", if followed by an exclamation point 

Python(?!!) Match "Python", if not followed by an exclamation point 
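A few of the anchors above in action (a sketch with made-up sample strings; the variable names are only illustrative):

```python
import re

text = "Python first\nPython second"

# Without re.M, ^ only matches at the very start of the string.
start_only = re.findall(r'^Python', text)

# With re.M, ^ also matches at the start of each internal line.
every_line = re.findall(r'^Python', text, re.M)

# \A always means the start of the string, even with re.M.
string_start = re.findall(r'\APython', text, re.M)

# \B requires that "rub" is followed by another word character.
inside_word = re.findall(r'\brub\B', 'rub ruby rube')

# Lookahead assertions test what follows without consuming it.
sample = "Python! Python?"
followed = re.findall(r'Python(?=!)', sample)
not_followed_at = re.search(r'Python(?!!)', sample).start()

print(start_only)
print(every_line)
print(string_start)
print(inside_word)
print(followed)
print(not_followed_at)
```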

Special syntax with parentheses: 


R(?#comment) Matches "R". All the rest is a comment 

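R(?i)uby Case-insensitive matching; in Python 2.7's re module an inline flag applies to the entire pattern 

R(?i:uby) Case-insensitive while matching "uby" only (Perl-style syntax; not accepted by the re module in Python 2.7) 

rub(?:y|le) Group only without creating \1 backreference 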



The Atom Matrix 





Spell Check Song 

Assignments 

"Spell Check Song" 

I have a spelling checker. 

It came with my PC. 

It plane lee marks four my revue 

Miss steaks aye can knot see. 

Eye ran this poem threw it. 

Your sure real glad two no. 

Its very polished in its weigh, 

• • • 

Zooball/Penguin 

1. Write a procedure atom_matrix for calculating the formula matrix of an 

ordered set of substances from their chemical formulas (given as a list of 

strings). Make the output a list of lists of integers [[int11, int12, 

...], [int21, int22, ...], ...]. Use the stub program 

atom_matrix.py as template. 

2. Spin-off (not compulsory): Write a procedure molecular_weight for 

calculating the molecular weight of a substance given its chemical formula 

(string). Make the output a list of two integers [int1,int2] where Mw = 

int1/int2 and all the digits of int1 are significant. Use the stub 

program molecular_weight.py as template. 

3. Learn about Python sets (as in "set" theory) and about method calls like 

str.sort and keywords like list in particular. We shall also start 

talking about the list iterator for x in xlist and the List 

comprehension [a+b for (a, b) in zip(alist, blist)]. 

Python is a programming language which to a large extent is built on the 

concept of lists and list comprehensions. Mix it with recursive function calls and 

you have a powerful programming environment! About the difference between 

for-loops, list comprehension and recursive function calls I shall say this much:

1. For-loops are for casual problems without any particular data structure. 

2. List comprehension is a Good Thing if you are dealing entirely with lists. 

3. Recursive programming is The Way of making lists of arbitrary length 

when termination (convergence) can be guaranteed. 

Three stylistic examples follow. Let args be a list, or any other data structure 

with an iterator implemented (that is, a method which visits the members of the 

list once, and exactly once), holding objects arg of unknown types. fun is a function 

that takes one arg and does something with it, and err is a second function 

that evaluates the convergence criterion for the sequence: 

# Imperative for-loop:
for arg in args:
    fun(arg)
    pass

# List comprehension:
[fun(arg) for arg in args]

# Recursive function call:
def rc(arg, fun, err, seq=[]):
    if err(arg, fun): rc(fun(arg), fun, err, seq)
    seq.insert(0, arg)
    return seq

Note that in the first two cases fun appears as a function in the mathematical 

sense. In the last case, however, fun (and err) appear as function objects 

given to rc. They are sometimes called functors to remind you of functionals 

in mathematics. Think about integrals: an integral is a mathematical operation 

awaiting your function of interest in order to produce a number. rc is doing the 

same. It awaits a starting point arg and two functors fun and err in order to 

produce the convergence sequence seq. If you are new to Python this may sound 

like Greek, but give it a chance! Invent a few problems and increase your 

knowledge... A minimal example is the convergence of x_(n+1) = x_n*x_n 

=> 0 for |x_0| < 1 and n => infinity. A possible implementation is: 

# Perfectly general Fixed Point Iteration.
def rc(arg, fun, err, seq=[]):
    if err(arg, fun): rc(fun(arg), fun, err, seq)
    seq.insert(0, arg)
    return seq

# Your function implementation.
def myfun(arg):
    return arg**2

# Your termination criteria.
def myerr(arg, fun):
    if abs(arg-fun(arg)) > 0: return True
    return False

args = rc(0.999, myfun, myerr)
print args

The sequence converges beautifully to zero (make sure to run the program 

yourself in order to achieve a better understanding of the matter): 

[ 0.999, 

0.99800100000000003, 

0.99600599600100004, 

etc. 

3.3406915454655646e-29, 

1.1160220001945103e-57, 

1.2455051049181556e-114, 

1.5512829663771860e-228, 

0.0 ] 
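One thing to be aware of when experimenting with rc: the default argument seq=[] is created once, when the function is defined, so a second call without an explicit seq keeps adding to the list returned by the first call. The sketch below is an iterative alternative, not the course's method; the name fixed_point and the max_steps guard are assumptions introduced here. It produces the sequence in the same order (starting point first) with a fresh list on every call:

```python
def fixed_point(arg, fun, err, max_steps=10000):
    """Return [x0, x1, ...], stopping at the first x for which err(x, fun) is false.

    max_steps is a safety guard (an assumption, not part of the original rc).
    """
    seq = [arg]                       # a new list for every call
    while err(seq[-1], fun) and len(seq) < max_steps:
        seq.append(fun(seq[-1]))
    return seq

# Same function and termination criterion as in the example above.
def myfun(arg):
    return arg**2

def myerr(arg, fun):
    return abs(arg - fun(arg)) > 0

first = fixed_point(0.999, myfun, myerr)
second = fixed_point(0.999, myfun, myerr)

print(first[0])
print(first[-1])
print(first == second)
```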

Back to business... The formula (atom) matrix of a mixture (an ordered set of 

substances called a component list) is defined as a stoichiometry matrix 

where each of the columns is assigned to a substance and each of the rows is 

assigned to a chemical element (atom). The column sequence must correspond 

to the given component list, while the rows may come in any order. One simple 

example illustrates the concept: 

[ 

[2, 4, 0, 2], # H 

[1, 0, 2, 0], # O 

[0, 1, 1, 0] # C 

] 

This is the formula (atom) matrix corresponding to the component list: H2O, 

CH4, CO2 and H2. The generalization to more complex mixtures is 

straightforward. We shall, however, calculate the matrix by first parsing each 

formula into a dictionary telling how many atoms there are of each kind and 

then transcribing the dictionaries into a list of lists of stoichiometric numbers. For 

the simple example given above the programmatic actions look like: 

['H2O', 'CH4', 'CO2', 'H2'] 

=> 

[

{'H':2, 'O':1}, 

{'C':1, 'H':4}, 

{'C':1, 'O':2}, 

{'H':2} 

] 

=> 

[ 

[2, 4, 0, 2], # H 

[1, 0, 2, 0], # O 

[0, 1, 1, 0] # C 

] 

In order to do so we need to learn about lists and dictionaries, and about 

iterators and list comprehensions in Python. Recursive functions are also part of 

this picture since our formula parser is built on that principle. 
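The second step, from dictionaries to matrix, can be sketched as follows. This is only one possible approach under stated assumptions: the dictionaries are written out by hand instead of being produced by the formula parser, and the rows come out in sorted (alphabetical) order of the element symbols, which is allowed since the rows may come in any order:

```python
# Hand-written stand-in for the parsed formulas of ['H2O', 'CH4', 'CO2', 'H2'].
stack = [
    {'H': 2, 'O': 1},
    {'C': 1, 'H': 4},
    {'C': 1, 'O': 2},
    {'H': 2},
]

# Collect the unique element symbols from all dictionaries and sort them.
syms = sorted(set(sym for hsh in stack for sym in hsh))

# One row per element, one column per substance; a missing element counts as 0.
arr = [[hsh.get(sym, 0) for hsh in stack] for sym in syms]

print(syms)
for row in arr:
    print(row)
```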



5.7.1 Spell Check Song, see also Sec. 2.27 

First reference occurs in About spell checkers (WWW), see Section 2.27 on page 161. 


5.7.2 Verbatim: “atom_matrix.py” 

"""
@summary: Return the (atoms x species) formula matrix for a given list of
          chemical formulas.
@since: 2011.08.30 (THW)
@version: 0.9
@todo 1.0:
"""

def atom_matrix(formulas, debug=False):
    """
    Calculate an atom stoichiometry matrix which is conformal to the chemical
    formulas given in list 'formulas'.

    @param formulas: list of chemical formulas e.g. ['H2O', 'CO2', ...]

    @type formulas:

    @return: aList[aList[aNumber, aNumber, ...]]
             e.g. [[2, 0, ...], [1, 2, ...], [0, 1, ...], ...]
    """

    from atoms import atoms

    import sys

    if sys.version_info < (2, 4):
        from sets import Set  # deprecated since version 2.4
        stack = []    # list of parsed formulas (dictionaries) e.g. {'H': 2, 'O': 1}
        syms = Set()  # set of unique atom names (chemical symbols)
    else:
        stack = []    # list of parsed formulas (dictionaries) e.g. {'H': 2, 'O': 1}
        syms = set()  # set of unique atom names (chemical symbols)

    # Build 'stack' and 'syms'.
    for formula in formulas:
        stack.append({})
        pass  # update chemical symbols Set

    syms = list(syms)  # transform set into list before sorting!
    syms.sort()        # sort atom names lexicographically (in-place sorting)

    arr = []  # the atom stoichiometry 'matrix'

    # Build 'arr'.
    for sym in syms:        # for all atoms
        arr.append([])      # make a new row of stoichiometric coefficients
        for hsh in stack:   # for all formulas
            pass            # fill in with values in the last row

    return arr  # size is (m x n) where n = len(formulas) and m = len(syms)

5.7.3 Verbatim: “molecular weight.py” 

"""
@summary: Return molecular weight tuple (val, err) for a given formula.
@since: 2011.08.30 (THW)
@version: 0.9
"""

def molecular_weight(formula, debug=False, mw=[]):
    """
    Calculate molecular weight (mass per mole) of a substance with chemical
    composition equal to 'formula'. The atomic masses of the elements are (by
    default) taken from: M. E. Wieser, Atomic Weights of the Elements 2005, Pure
    Appl. Chem., Vol. 78, No. 11, pp. 2051-2066, 2006 (see code), unless
    explicitly provided by the user (in list 'mw'). The calculated molecular
    weight is returned as a scaled integer, i.e. val[0], where all the digits
    are significant. The order of magnitude of the scaling is returned as a
    second value val[1] such that the actual Mw = val[0]/val[1].

    @param formula: a chemical formula 'COOH(C(CH3)2)3CH3'
    @param mw: list of tuple ('name', 'symbol', number, mass, uncertainty)

    @type formula:
    @type mw:

    @return: theList[anInt, anInt]
    """

    # Chemical formula parser and transcendental math.
    import math

    stack = pass  # parse formula into [{'Symbol': int, 'Symbol': int, ...}]

    if not stack: return [0, 1]  # no atom stoichiometry is available

    hsh = pass  # continue with {'Symbol': int, 'Symbol': int, ...}

    # Enter periodic table information: The 'mw' list is either given as input
    # to the function 'molecular_weight' or it is an empty list in which case it
    # must be properly defined here.
    mw = mw or \
        [
            ('carbon', 'C', 6, 12.0107, 8E-5),
            ('hydrogen', 'H', 1, 1.00794, 7E-6)
        ]

    val = 0.0  # molecular weight [amu]
    err = 0.0  # truncation error (approx. uncertainty)
    m = 0  # number of elements recognized in the formula

    # Calculate 'val', 'err' and 'm'.
    for tup in mw:
        if hsh.has_key(tup[1]):
            pass  # increment molecular weight
            pass  # increment error (uncertainty)
            pass  # increment the number of elements in the formula
        else:
            pass

    if m != len(hsh): raise SyntaxError("weird atom in '%s'" % (formula,))

    n = abs(int(math.log10(err)))  # calculate order of magnitude (abs value)

    if debug: print [val, err, n]

    return [int(round(val*10**n)), 10**n]  # make sure last digit is significant
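The scaled-integer return value described in the docstring can be illustrated in isolation. The snippet below is only a sketch of the last two statements of the listing, written for Python 3; the helper name `scale_to_significant`, the methane input and the rule that the uncertainties simply add up are assumptions made for this illustration and are not taken from the course code.

```python
import math

def scale_to_significant(val, err):
    """Return [scaled_int, scale] such that val is approximately scaled_int/scale."""
    n = abs(int(math.log10(err)))   # order of magnitude of the uncertainty
    return [int(round(val * 10**n)), 10**n]

# Hypothetical input: methane (CH4), using the C and H entries of the 'mw' list.
val = 12.0107 + 4 * 1.00794         # molecular weight [amu]
err = 8e-5 + 4 * 7e-6               # assumed: uncertainties are summed
scaled = scale_to_significant(val, err)
print(scaled, scaled[0] / float(scaled[1]))
```

With these numbers the uncertainty is of order 1e-4, so three decimals are kept and the actual Mw is recovered as `scaled[0]/scaled[1]`.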


5.7.4 Python sets, see also Sec. 5.5.3 

First reference occurs in Python dictionaries, see Section 5.5.3 on page 343. 


5.7.5 List comprehension, see also Sec. 5.5.3 

First reference occurs in Python dictionaries, see Section 5.5.3 on page 343. 


Independent Reactions 





Reasons computers are male: 

In order to get their attention, you have to turn them on. 

They have a lot of data but are still clueless. 


They are supposed to help you solve your problems, but half the time they cause the problem. 

As soon as you commit to one, you realize that if you had waited a little longer, you could have had 

a better model. 

Computers are male 

Assignments 

1. Write a procedure rref for calculating the row-reduced echelon form
rref(A) = inv(G)*A of a given matrix A. Matrix G is formally required,
but it will never show up in the code. The return values shall be the matrix
B = [B1^T, 0]^T, where B is of the same shape as A, the rank rank(A) =
rank(B1), and the list pivots(B) = [None|anInt, ...] identifying the
pivot elements used in the elimination process (row pivoting only). That is,
rref(A) => B, rank(B1), pivots(B). Use the stub program rref.py as a
template.

2. Based on the output of rref, write another procedure for calculating the
nullspace N of A such that [A^T, N] makes an invertible basis for the
vector space. That is, null(A) => N, rank(N) where
rank(A)+rank(N)=rowdim(A), and rowdim(A) is the length of a row of A (its
number of columns). Use the stub program null.py as a template.

Read the whitepaper The mass balance if you need a more thorough
explanation of the nullspace theory than you will find on the current page.

From the formula matrix A we can calculate a row-reduced echelon form B1 by
doing Gauss elimination on the rows of A. This process requires row
permutations if one of the pivot elements becomes zero, but it needs no
column permutations. Let inv(G) be a matrix that performs the required steps.
Then, by definition, rref(A) = inv(G)*A. The shape of rref(A) is the same
as that of A, but it may have one or more rows that are fully zero (filling out
the lower part of the matrix) even when A is dense. Hence, rref(A) = [B1^T, 0]^T
where the 0 matrix may or may not exist.

The next operation is to make an elementary matrix E1 by setting all non-pivot
columns in B1 to zero and transposing the result, so that E1*B1 is a square
matrix. The non-pivot columns are the columns that have not been fully
row-reduced (an invertible matrix has, by the way, no such columns). This process
is hard to explain in words, but the examples below are quite illuminating.

From B1 and E1 we can easily calculate E1*B1-I. This matrix has the property
that B1*(E1*B1-I) = 0. Prove it! Furthermore, we can show (after a second
or maybe third thought) that A*(E1*B1-I) = 0. This means the non-zero
columns of E1*B1-I define the null space of A. Hence N is E1*B1-I with its
zero columns removed.
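The construction above can be sketched in a few lines of Python 3. This is one possible illustration, not the reference solution for null.py: the function name `nullspace_from_rref` is hypothetical, and the inputs `b1` and `pivots` are assumed to have the form described in the assignment (the non-zero rows of rref(A) and a list with `None` for every non-pivot column). The test data are those of the formaldehyde mixture CHOH, CO, H2, H2O, O2.

```python
def nullspace_from_rref(b1, pivots):
    """Build N as the non-zero columns of E1*B1 - I (sketch)."""
    ncols = len(pivots)
    # E1*B1: row c is the B1 row whose pivot sits in column c, otherwise zero.
    e1b1 = [[0.0] * ncols for _ in range(ncols)]
    row = 0
    for c, p in enumerate(pivots):
        if p is not None:
            e1b1[c] = [float(x) for x in b1[row]]
            row += 1
    # Subtract the identity matrix.
    for c in range(ncols):
        e1b1[c][c] -= 1.0
    # The non-pivot columns are exactly the non-zero columns of E1*B1 - I.
    free = [c for c, p in enumerate(pivots) if p is None]
    return [[e1b1[r][c] for c in free] for r in range(ncols)]

amat = [[1, 1, 0, 0, 0], [2, 0, 2, 2, 0], [1, 1, 0, 1, 2]]
b1 = [[1, 0, 1, 0, -2], [0, 1, -1, 0, 2], [0, 0, 0, 1, 2]]
nmat = nullspace_from_rref(b1, [0, 1, None, 3, None])

# Check that A*N = 0.
residual = [[sum(amat[i][k] * nmat[k][j] for k in range(5)) for j in range(2)]
            for i in range(3)]
print(nmat)
print(residual)
```

For an invertible A the pivot list contains no `None` entries, and the function returns rows without columns, i.e. an empty nullspace.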

Our first example is a one-component mixture of water. Water (H2O) has 2
hydrogen atoms and 1 oxygen atom. The formula matrix and the corresponding
Gauss elimination are shown below:

A = [ [ 2 ]    'H'
      [ 1 ] ]  'O'

Elimination step 1: 0.5*R1
Elimination step 2: R2 - R1

rref(A) = [ [ 1 ]
            [ 0 ] ]

inv(G) = [ [  0.5  0 ]
           [ -0.5  1 ] ]

B1 = [ [ 1 ] ]

rank = 1

pivots = [ 0 ]

E1 = [ [ 1 ] ]

E1*B1-I = [ [ 0 ] ]

N = [ [ ] ]

The second example is a binary mixture of water monomer and water dimer
(H2O, (H2O)2). Note that A is a square matrix, albeit with two linearly
dependent rows. rref(A) therefore has a zero row at the end, which means B1
has only 1 row while A has 2. We say that A is rank deficient, which means there
is the possibility of a chemical reaction in the mixture. From the stoichiometry of
N we can deduce 2*H2O - 1*(H2O)2 = 0 or 2*H2O = (H2O)2. The two
forms are equivalent.

A = [ [ 2 4 ]    'H'
      [ 1 2 ] ]  'O'

Elimination step 1: 0.5*R1
Elimination step 2: R2 - R1

rref(A) = [ [ 1 2 ]
            [ 0 0 ] ]

inv(G) = [ [  0.5  0 ]
           [ -0.5  1 ] ]

B1 = [ [ 1 2 ] ]

rank = 1

pivots = [ 0 None ]

E1 = [ [ 1 ]
       [ 0 ] ]

E1*B1-I = [ [ 0  2 ]
            [ 0 -1 ] ]

N = [ [  2 ]
      [ -1 ] ]

The third example is a binary mixture of hydrogen and oxygen (H2, O2). Again,
A is a square matrix but this time it is non-singular. This means that no
chemical reaction is possible.

A = [ [ 2 0 ]    'H'
      [ 0 2 ] ]  'O'

rref(A) = [ [ 1 0 ]
            [ 0 1 ] ]

inv(G) = [ [ 0.5  0   ]
           [ 0    0.5 ] ]

B1 = [ [ 1 0 ]
       [ 0 1 ] ]

rank = 2

pivots = [ 0 1 ]

E1 = [ [ 1 0 ]
       [ 0 1 ] ]

E1*B1-I = [ [ 0 0 ]
            [ 0 0 ] ]

N = [ [ ]
      [ ] ]

The fourth example is a quinary mixture of formaldehyde, carbon monoxide,
hydrogen, water and oxygen (CHOH, CO, H2, H2O, O2). This is an almost
full-blown example (it does not require row permutations though) because the
elimination process leaves a non-pivot column in the middle of A. The rank of A
is 3 (all rows are independent) and the row-size is 5. That means there are 2
degrees of freedom which manifest themselves as chemical reactions. From
the stoichiometry matrix N we get: 1*CHOH - 1*CO - 1*H2 = 0 and
-2*CHOH + 2*CO + 2*H2O - 1*O2 = 0, or, alternatively, CHOH = CO + H2
and 2*CO + 2*H2O = 2*CHOH + O2.

A = [ [ 1 1 0 0 0 ]    'C'
      [ 2 0 2 2 0 ]    'H'
      [ 1 1 0 1 2 ] ]  'O'

Elimination step 1: R2 - 2*R1
Elimination step 2: R3 - 1*R1
Elimination step 3: -0.5*R2
Elimination step 4: R1 - 1*R2
Elimination step 5: R1 - 1*R3
Elimination step 6: R2 + 1*R3

rref(A) = [ [ 1 0  1 0 -2 ]
            [ 0 1 -1 0  2 ]
            [ 0 0  0 1  2 ] ]

inv(G) = [ [  1  0.5 -1 ]
           [  0 -0.5  1 ]
           [ -1  0    1 ] ]

B1 = [ [ 1 0  1 0 -2 ]
       [ 0 1 -1 0  2 ]
       [ 0 0  0 1  2 ] ]

rank = 3

pivots = [ 0 1 None 3 None ]

E1 = [ [ 1 0 0 ]
       [ 0 1 0 ]
       [ 0 0 0 ]
       [ 0 0 1 ]
       [ 0 0 0 ] ]

E1*B1-I = [ [ 0 0  1 0 -2 ]
            [ 0 0 -1 0  2 ]
            [ 0 0 -1 0  0 ]
            [ 0 0  0 0  2 ]
            [ 0 0  0 0 -1 ] ]

N = [ [  1 -2 ]
      [ -1  2 ]
      [ -1  0 ]
      [  0  2 ]
      [  0 -1 ] ]
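Reading the reactions off the columns of N is a mechanical task, and a small Python 3 sketch may make the convention explicit. The helper `reaction_strings` is a hypothetical name introduced here; it puts positive coefficients on the left-hand side and negative ones on the right-hand side, which reproduces the alternative forms quoted in the fourth example.

```python
def reaction_strings(nmat, names):
    """Turn each column of N into a 'left = right' reaction string (sketch)."""
    reactions = []
    for j in range(len(nmat[0])):
        left, right = [], []
        for name, row in zip(names, nmat):
            coeff = row[j]
            if coeff > 0:
                left.append("%g*%s" % (coeff, name))
            elif coeff < 0:
                right.append("%g*%s" % (-coeff, name))
        reactions.append(" + ".join(left) + " = " + " + ".join(right))
    return reactions

names = ["CHOH", "CO", "H2", "H2O", "O2"]
nmat = [[1, -2], [-1, 2], [-1, 0], [0, 2], [0, -1]]
for reaction in reaction_strings(nmat, names):
    print(reaction)
```

Multiplying a column of N by -1 swaps the two sides of the reaction, so the direction carries no physical meaning at this stage.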


Top 10 reasons computers are male 

10. They have a lot of data but are still clueless. 

9. A better model is always just around the corner. 

8. They look nice and shiny until you bring them home. 

7. It is always necessary to have a backup. 

6. They'll do whatever you say if you push the right buttons. 

5. The best part of having either one is the games you can play. 

4. In order to get their attention, you have to turn them on. 

3. The lights are on but nobody's home. 

2. Big power surges knock them out for the night. 

1. Size does matter. 


Washington Apple Pi IFAQ 

lic Wednesday, November 5, 1997

5.9.2 Verbatim: “rref.py” 

"""
@summary: Calculate the row-reduced echelon form of a given matrix.
@since: 2011.08.30 (THW)
@version: 0.8
"""

def rref(amat, debug=False):
    """
    Calculate the row-reduced-echelon of 'amat' using Gauss elimination of the
    rows. There is partial pivoting only - i.e. no column permutations. The
    output is a matrix of the same shape as 'amat'::

                     | 0 ... 0 1 * ... 0 * ... 0 * ... * |
                     | 0 ... 0 0 0 ... 1 * ... 0 * ... * |
                     | 0 ... 0 0 0 ... 0 0 ... 0 * ... * |
        rref(amat) = | :     : : :     : :     : :     : |
                     | 0 ... 0 0 0 ... 0 0 ... 1 * ... * |
                     | 0 ... 0 0 0 ... 0 0 ... 0 0 ... 0 |
                     | :     : : :     : :     : :     : |
                     | 0 ... 0 0 0 ... 0 0 ... 0 0 ... 0 |

    Notice the zero blocks at the left and bottom of 'rref(amat)'. For chemical
    formula matrices the left block is always missing while the bottom block is
    present in the case 'amat' is rank deficient (more atoms than components for
    example). The 'rank' of 'rref(amat)' is equal to the number of non-zero
    rows. The 'pivots' list holds the position of all the pivot elements used in
    the elimination, i.e. [None, ..., None, i, None, ..., j, None, ..., k, None,
    ..., None] in the example above. Note: The output matrix elements are
    converted to Float irrespective of what comes in (Int or Float).

    @param amat: Input matrix given as a list of lists of numbers

    @type amat: aList[aList[aNumber, aNumber, ...], ...]

    @return: aList[rref(amat), anInt, aList[None|anInt, ...]]
    """

    if not (amat) or not (amat[0]):
        raise ArithmeticError("zero rows in amat '%s'" % (amat,))

    amat = pass  # make work copy and convert to float
    pivots = range(0, len(amat[0]))  # assume len(amat[0]) = len(amat[1]) = ...
    rank = 0  # initialize number of non-zero rows in amat

    if debug: print '\nrref() :\n' + \
                    '\ninput amat = ' + str(amat)

    for c in pivots:  # consider all columns of amat
        piv, val = 0, 0.0  # starting pivot row, pivot value
        for r in range(pass, pass):  # partial pivoting of remaining rows
            arc = pass  # current amat[row, column] element
            if abs(arc) > abs(val):  # new pivot candidate found
                pass  # change pivot row, pivot value

        if debug:
            print '\namat : ' + str(amat) + \
                  '\ncolumn : ' + str(c) + \
                  '\npivot element: ' + str(piv) + \
                  '\npivot value : ' + str(val)

        if val != 0.0:  # a non-zero pivot value was found
            pass  # swap rows

            for j in range(pass, pass):  # start pivot row scaling
                pass  # make amat[rank][c] = 1

            # Note reversed order in row elimination. You either have to do
            # this, or use a temporary variable. If you use
            # j in range(c, len(pivots)) then amat[i][c] is changed at the very
            # beginning of the loop which screws up the algorithm.
            for i in range(pass, pass):  # start row elimination
                if i == rank: continue  # ignore pivot row
                for j in range(pass, pass):  # reversed row elimination
                    pass  # make amat[i][c] = 0

            rank += 1  # increase the rank
        else:  # zero pivot value
            pivots[c] = None  # current column is not a free variable

    if debug:
        print '\noutput amat : ' + str(amat) + \
              '\nrank : ' + str(rank) + \
              '\npivots : ' + str(pivots)

    return [amat, rank, pivots]

5.9.3 Verbatim: “null.py” 

"""
@summary: Calculate the nullspace of a given matrix.
@since: 2011.08.30 (THW)
@version: 0.9
"""

def null(amat, debug=False):
    """
    Calculate the nullspace of 'amat' from rref(amat) and fiddling around with
    the Gauss elimination structure. The result is that amat*null(amat) = zero.
    That's all. No fancy mathematics like e.g. orthonormalization of the
    nullspace.

    @return: aList[aList[aFloat, aFloat, ...]]
             e.g. [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0], ...]
    """

    # Row-reduced-echelon-form.
    from rref import rref

    bmat, rank, pivots = rref(amat, debug)

    if debug:
        print '\nnull() :\n' + \
              '\ninput bmat = ' + str(bmat) + \
              '\ninput rank = ' + str(rank) + \
              '\ninput pivots = ' + str(pivots)

    # Insert -1 along the main diagonal for each of the dependent variables.
    for r in [i for i in range(0, len(pivots)) if pivots[i] is None]:
        pass
        pass

    # Strip off rows that have been pushed outside the matrix boundary (they are
    # anyway fully zero).
    pass

    # Remove the columns corresponding to independent variables in the nullspace
    # solution.
    for r in range(0, len(pivots)):
        if debug:
            print '\nbmat : ' + str(bmat) + \
                  '\nrow : ' + str(r)

        # Remove independent variables by popping from right to left.
        for c in [pass]:
            pass

    if debug: print '\noutput bmat : ' + str(bmat)

    return bmat

Plug Flow Reactor. Part I 




23 August 2011 

(completed after 120 hours of writing, programming and testing) 

1 The mass balance 

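[Figure: control volume of the reactor between z and z + ∆z with cross-sectional area A, atom inflow ḃ_in, atom outflow ḃ_out, holdup b(t, z, ∆z) and reaction rate ξ̇.]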

From Einstein’s mass–energy equivalence $E = mc^2$ we know that energy and mass are in principle convertible state properties. At least so for relativistic processes and nuclear reactions. In everyday physics and chemistry the mass changes are so small, however, that we are not able to measure them correctly, and for all practical purposes we may therefore assume that mass and energy are independent properties. The mass balance of an open system can then be written

\[
M(t, z, \Delta z) = \int_0^t \dot{M}_{\mathrm{in}} \, d\tau - \int_0^t \dot{M}_{\mathrm{out}} \, d\tau .
\]

In this equation $M$ is used (rather than $m$) for the total mass to conform with thermodynamic practice, where extensive quantities are designated by capital letters. The balance of total mass is an absolute must for all non-nuclear systems, but for multicomponent mixtures of chemical origin we can go a bit further. The balance principle does not only apply to the total mass, but to the mass of each individual atom in the mixture. Or, we may consider the mole number $B_i$ of each atom since the atomic masses are constant properties of the atoms. This means that the mass $M_i = B_i M_{w,i}$ of atom $i$ is conserved if $B_i$ is conserved. Let $\mathbf{b} \mathrel{\hat=} [B_1, B_2, \cdots]$ be a vector of mole numbers for all the atoms in the mixture. The mass balance of an open chemical system is then

\[
\mathbf{b}(t, z, \Delta z) = \int_0^t \dot{\mathbf{b}}_{\mathrm{in}} \, d\tau - \int_0^t \dot{\mathbf{b}}_{\mathrm{out}} \, d\tau
\]

To proceed we need to elaborate on the concepts of chemical formulas and chemical reactions. Quite interestingly, we can in the present context look upon chemical formulas as algebraic expressions written in a very condensed form. Take for instance iron(II) acetate: Fe(CH3COO)2 · 4H2O. Using standard rules of operation (from IUPAC) the formula expands to:

Fe(CH3COO)2 · 4H2O = Fe + 2 · (2C + 3H + 2O) + 4 · (2H + O)
                   = Fe + 4C + 14H + 8O

Convince yourself that this expression evaluates to the molecular weight of iron(II) acetate provided the symbols Fe, C, H and O are assigned the atomic masses of the chemical elements in question. You can also verify that summations of pair products (a number times a symbol) are the only operations needed in the calculation. This makes matrix algebra a useful tool since the inner product of matrix algebra is just that: a summation of pair products. By considering a mixture of known chemical substances it is possible to make a corresponding list of all atoms encountered in the mixture. The link between these two lists is the so-called formula matrix. Let again $\mathbf{b} \mathrel{\hat=} [B_1, B_2, \cdots]$ and this time also $\mathbf{n} \mathrel{\hat=} [N_1, N_2, \cdots]$, where $N_i$ is the mole number of compound $i$, often referred to as substance $i$. Using matrix algebra we can now write:

b = An 
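The summation of pair products mentioned above can be tried out directly. The following Python 3 lines are a minimal sketch: the atom counts are those of the expanded formula Fe + 4C + 14H + 8O, while the atomic masses of Fe and O are assumed here from the 2005 IUPAC table (the values for C and H are the ones used in molecular_weight.py).

```python
# Atom counts of Fe(CH3COO)2 . 4H2O, taken from the expanded formula.
atoms = ["Fe", "C", "H", "O"]
counts = [1, 4, 14, 8]

# Atomic masses [g/mol]; Fe and O are assumed values for this illustration.
masses = [55.845, 12.0107, 1.00794, 15.9994]

# Inner product: a summation of pair products (a number times a symbol).
mw = sum(k * m for k, m in zip(counts, masses))
print("Mw = %.5f g/mol" % mw)
```

The same inner product, applied row by row, is what the relation b = An performs for every atom in a mixture.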

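The stoichiometric coefficients of each substance, of which iron(II) acetate is one example, are collected into the corresponding columns of $\mathbf{A}$. Albeit quite trivial, the principle is best served by a concrete example. Take e.g. the combustion of methane (CH4) in air (0.78 N2, 0.21 O2 and 0.01 Ar) to the reaction products CO, CO2, H2O, H2, OH, H and NO. Altogether there are 11 substances and 5 atoms in the mixture:

\[
\mathbf{A} =
\begin{array}{c|ccccccccccc}
 & \mathrm{CH_4} & \mathrm{N_2} & \mathrm{O_2} & \mathrm{Ar} & \mathrm{CO} & \mathrm{CO_2} & \mathrm{H_2O} & \mathrm{H_2} & \mathrm{OH} & \mathrm{H} & \mathrm{NO} \\ \hline
\mathrm{C}  & 1 & 0 & 0 & 0 & 1 & 1 & 0 & 0 & 0 & 0 & 0 \\
\mathrm{H}  & 4 & 0 & 0 & 0 & 0 & 0 & 2 & 2 & 1 & 1 & 0 \\
\mathrm{N}  & 0 & 2 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 \\
\mathrm{O}  & 0 & 0 & 2 & 0 & 1 & 2 & 1 & 0 & 1 & 0 & 1 \\
\mathrm{Ar} & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0
\end{array}
\tag{1}
\]

and, to make what we are talking about absolutely clear:

\[
\mathbf{n} = \begin{pmatrix} N_{\mathrm{CH_4}} & N_{\mathrm{N_2}} & N_{\mathrm{O_2}} & N_{\mathrm{Ar}} & N_{\mathrm{CO}} & N_{\mathrm{CO_2}} & N_{\mathrm{H_2O}} & N_{\mathrm{H_2}} & N_{\mathrm{OH}} & N_{\mathrm{H}} & N_{\mathrm{NO}} \end{pmatrix}^{T} ,
\]
\[
\mathbf{b} = \begin{pmatrix} B_{\mathrm{C}} & B_{\mathrm{H}} & B_{\mathrm{N}} & B_{\mathrm{O}} & B_{\mathrm{Ar}} \end{pmatrix}^{T} .
\]

The mass balance is now written

\[
\mathbf{A}\mathbf{n}(t, z, \Delta z) = \int_0^t \mathbf{A}\dot{\mathbf{n}}_{\mathrm{in}} \, d\tau - \int_0^t \mathbf{A}\dot{\mathbf{n}}_{\mathrm{out}} \, d\tau . \tag{2}
\]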
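The relation b = An for the methane–air system can be checked numerically. The Python 3 sketch below uses the formula matrix of Eq. 1; the feed and product compositions are hypothetical numbers chosen for illustration (1 mol CH4 in 10 mol air, and complete combustion to CO2 and H2O), and the helper name `atom_balance` is not part of the course code.

```python
# Formula matrix of Eq. 1: rows C, H, N, O, Ar; columns
# CH4, N2, O2, Ar, CO, CO2, H2O, H2, OH, H, NO.
A = [
    [1, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0],
    [4, 0, 0, 0, 0, 0, 2, 2, 1, 1, 0],
    [0, 2, 0, 0, 0, 0, 0, 0, 0, 0, 1],
    [0, 0, 2, 0, 1, 2, 1, 0, 1, 0, 1],
    [0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0],
]

def atom_balance(amat, n):
    """Return b = A*n, the mole numbers of the atoms."""
    return [sum(a * x for a, x in zip(row, n)) for row in amat]

# Hypothetical feed: 1 mol CH4 mixed with 10 mol air.
n_feed = [1.0, 7.8, 2.1, 0.1, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
# Hypothetical product: complete combustion, CH4 + 2 O2 = CO2 + 2 H2O.
n_prod = [0.0, 7.8, 0.1, 0.1, 0.0, 1.0, 2.0, 0.0, 0.0, 0.0, 0.0]

b_feed = atom_balance(A, n_feed)
b_prod = atom_balance(A, n_prod)
print(b_feed)
print(b_prod)
```

The two atom vectors agree although the compositions differ, which is exactly the situation the nullspace discussion that follows is about.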

But $\mathbf{A}$ is usually a singular matrix (except for mixtures of pure elements), which prohibits a simple solution to these equations. The physical reasoning is that chemical transpositions can occur in the system, taking one set of substances (reactants) into another set of substances (products). Such a transposition is called a chemical reaction. It is known by experiment that chemical reactions can change the composition of the system without altering the mole numbers of the atoms. The mathematical explanation of the phenomenon lies in the nullspace of $\mathbf{A}$. It is defined as a matrix $\mathbf{N}$ such that $\mathbf{A}\mathbf{N} = \mathbf{0}$ and where $\begin{pmatrix} \mathbf{A}^T & \mathbf{N} \end{pmatrix}$ constitutes an invertible matrix of full rank. From the definition of the nullspace it is clear that whatever happens in the column space of $\mathbf{N}$ will not affect the atoms vector $\mathbf{b}$. To make this situation very clear we shall consider a closed system that is changed from one compositional state 1 to another state 2. The equations describing the changes are listed below:

b2 = b1 

An2 = An1 

A(n2 − n1) = 0 

A∆n = 0 

If we now calculate ∆n as a linear combination of the columns of N we have a full-blown 

solution to the mass balance problem of the closed system: 

∆n = Nξ ⇒ A∆n = ANξ = 0 
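A quick numerical check of ∆n = Nξ may be helpful. The Python 3 sketch below uses the formula matrix and the stoichiometry matrix of a mixture of CHOH, CO, H2, H2O and O2 as an assumed example; the initial mole numbers and the extents of reaction are arbitrary, hypothetical values.

```python
def matvec(mat, vec):
    """Plain matrix-vector product for lists of lists."""
    return [sum(m * v for m, v in zip(row, vec)) for row in mat]

# Assumed example mixture: CHOH, CO, H2, H2O, O2 (rows of amat: C, H, O).
amat = [[1, 1, 0, 0, 0],
        [2, 0, 2, 2, 0],
        [1, 1, 0, 1, 2]]
nmat = [[1, -2], [-1, 2], [-1, 0], [0, 2], [0, -1]]

n1 = [1.0, 1.0, 1.0, 1.0, 1.0]   # hypothetical initial mole numbers
xi = [0.3, 0.1]                  # hypothetical extents of reaction

dn = matvec(nmat, xi)                    # change in composition
n2 = [a + b for a, b in zip(n1, dn)]     # new composition
b1 = matvec(amat, n1)                    # atoms before
b2 = matvec(amat, n2)                    # atoms after
print(n2)
print(b1, b2)
```

Any choice of ξ gives the same atom vector, because every column of N lies in the nullspace of A.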

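The elements $\xi_i$ of the solution vector $\boldsymbol{\xi}$ are the extents of reaction for each independent reaction in the system. With this understanding in mind we can recast the mass balance into

\[
\mathbf{n}(t, z, \Delta z) = \int_0^t \dot{\mathbf{n}}_{\mathrm{in}} \, d\tau - \int_0^t \dot{\mathbf{n}}_{\mathrm{out}} \, d\tau + \int_0^t \int_z^{z+\Delta z} A \mathbf{N} \dot{\boldsymbol{\xi}} \, d\zeta \, d\tau , \tag{3}
\]

where the scalar $A$ stands for the cross-sectional area of the reactor (perpendicular to the flow), not to be confused with the formula matrix $\mathbf{A}$, and $\dot{\boldsymbol{\xi}}$ is the vector of independent reaction rates (moles per unit time and volume). It is easy to verify that Eq. 3 is a solution of Eq. 2. Multiplying by the formula matrix $\mathbf{A}$ on both sides of the equation makes the chemical reaction integral drop out because $\mathbf{A}\mathbf{N} = \mathbf{0}$. Eq. 3 is thereby reduced to Eq. 2.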

To calculate actual numbers for $\xi_i$ we need to model either the reaction kinetics or the thermodynamic equilibria (or both) in the mixture, and to do this we must couple the mass balance equations with the energy and momentum balances of the system. This is our ultimate goal, explained in Part III of this paper entitled Modelling Issues. We must first concentrate on the nullspace calculation, however, and find a clear-cut and solid way to do the matrix operations that are needed. There are several nullspace algorithms on the market but we shall define our own. The reasons are twofold: Firstly, the problems we are dealing with are on a tiny scale (5–20 variables) and there is no need for a very fast and numerically robust algorithm. Secondly, bringing in an advanced nullspace algorithm has the disadvantage that we do not learn much about simpler things like Gauss elimination, row dependencies and matrix ranks. Calculating the row-reduced echelon (staircase) form $\mathbf{B} = \mathrm{rref}(\mathbf{A}) = \mathbf{G}^{-1}\mathbf{A}$ is one way to define the nullspace. Let $\mathbf{G}$ be an invertible matrix doing a sequence of zero or more steps of Gauss elimination to reach the following result:

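\[
\mathbf{B} \mathrel{\hat=}
\begin{pmatrix} \mathbf{B}_1 \\ \mathbf{0} \end{pmatrix}
= \mathbf{G}^{-1}\mathbf{A} =
\left(\begin{array}{cccccccccccc}
0 & \cdots & 0 & 1 & * & \cdots & 0 & * & \cdots & 0 & * & \cdots \\
0 & \cdots & 0 & 0 & 0 & \cdots & 1 & * & \cdots & 0 & * & \cdots \\
\vdots & \ddots & \vdots & \vdots & \vdots & \ddots & \vdots & \vdots & \ddots & \vdots & \vdots & \ddots \\
0 & \cdots & 0 & 0 & 0 & \cdots & 0 & 0 & \cdots & 1 & * & \cdots \\
0 & \cdots & 0 & 0 & 0 & \cdots & 0 & 0 & \cdots & 0 & 0 & \cdots \\
\vdots & \ddots & \vdots & \vdots & \vdots & \ddots & \vdots & \vdots & \ddots & \vdots & \vdots & \ddots \\
0 & \cdots & 0 & 0 & 0 & \cdots & 0 & 0 & \cdots & 0 & 0 & \cdots
\end{array}\right)
\]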

The matrix element $*$ can be any real number (i.e. not necessarily 0 or 1) or a missing element (in which case the whole column is missing).

The elimination process is properly defined for all matrices regardless of their shape and content, but columns that are fully zero have no meaning in thermodynamics (they correspond to chemical formulas without any atoms). Rows that are fully zero are on the other hand physically acceptable, and are in fact quite inevitable for single-component systems with two or more atoms. Note also that there are two special cases of $\mathbf{B}$: If $\mathbf{A}$ is invertible then $\mathbf{B}_1 = \mathbf{I}$ and $\mathbf{G}^{-1} = \mathbf{A}^{-1}$. If $\mathbf{A} = \mathbf{0}$ then $\mathbf{B}_1$ is empty and $\mathbf{G} = \mathbf{I}$. From $\mathbf{B}_1$ we can define the elementary matrix

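\[
\mathbf{E}_1^{T} =
\left(\begin{array}{cccccccccccc}
0 & \cdots & 0 & 1 & 0 & \cdots & 0 & 0 & \cdots & 0 & 0 & \cdots \\
0 & \cdots & 0 & 0 & 0 & \cdots & 1 & 0 & \cdots & 0 & 0 & \cdots \\
\vdots & \ddots & \vdots & \vdots & \vdots & \ddots & \vdots & \vdots & \ddots & \vdots & \vdots & \ddots \\
0 & \cdots & 0 & 0 & 0 & \cdots & 0 & 0 & \cdots & 1 & 0 & \cdots
\end{array}\right)
\]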

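by putting all $*$ to zero. Thus $\dim(\mathbf{E}_1^{T}) = \dim(\mathbf{B}_1)$. The product of $\mathbf{E}_1$ and $\mathbf{B}_1$ is thereby a square matrix with either 0 or 1 along the diagonal. Hence $\mathbf{E}_1\mathbf{B}_1 - \mathbf{I}$ is a similarly shaped matrix with either $-1$ or 0 along the diagonal. In order to see this clearly we remove for a moment all ellipses ($\cdots$, $\vdots$ and $\ddots$) from the matrix expression:

\[
\mathbf{E}_1\mathbf{B}_1 - \mathbf{I} =
\begin{pmatrix}
-1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & -1 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & * & 0 & * & 0 & * \\
0 & 0 & 0 & -1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & * & 0 & * \\
0 & 0 & 0 & 0 & 0 & -1 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & * \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & -1
\end{pmatrix}
\]

The outcome of the manipulation is that $\mathbf{B}(\mathbf{E}_1\mathbf{B}_1 - \mathbf{I}) = \mathbf{0}$. This property follows from the definition of $\mathbf{E}_1$, which implies $\mathbf{B}_1\mathbf{E}_1 \mathrel{\hat=} \mathbf{I}_{\mathrm{rank}(\mathbf{A}) \times \mathrm{rank}(\mathbf{A})}$. Furthermore:

\[
\mathbf{B}(\mathbf{E}_1\mathbf{B}_1 - \mathbf{I})
\mathrel{\hat=} \begin{pmatrix} \mathbf{B}_1 \\ \mathbf{0} \end{pmatrix} (\mathbf{E}_1\mathbf{B}_1 - \mathbf{I})
= \begin{pmatrix} \mathbf{I} \\ \mathbf{0} \end{pmatrix} \mathbf{B}_1 - \begin{pmatrix} \mathbf{B}_1 \\ \mathbf{0} \end{pmatrix}
= \mathbf{0}
\]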

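It also means we have captured the nullspace of $\mathbf{A}$ since $\mathbf{A} = \mathbf{G}\mathbf{B}$. If $\mathbf{B}(\mathbf{E}_1\mathbf{B}_1 - \mathbf{I})$ is zero then $\mathbf{A}(\mathbf{E}_1\mathbf{B}_1 - \mathbf{I})$ is zero because $\mathbf{G}$ is an invertible (non-singular) matrix. What remains now is to extract $\mathbf{N}$ by selecting the non-zero columns of $\mathbf{E}_1\mathbf{B}_1 - \mathbf{I}$. Let $\mathbf{E}_2$ be an elementary selection matrix doing these operations. Then:

\[
\mathbf{N} \mathrel{\hat=} (\mathbf{E}_1\mathbf{B}_1 - \mathbf{I})\mathbf{E}_2
\]

Each column of $\mathbf{N}$ corresponds to a chemical reaction with coefficients taken from the elements of that column. From its physical interpretation $\mathbf{N}$ is also called the reaction stoichiometry matrix of the system.

Let $\mathbf{A} = \begin{pmatrix} 1 & 2 \end{pmatrix}$ be the atom matrix of a chemical system comprised of component A and its dimer A2. We shall find the reaction stoichiometry of this system using the matrix formulations above. The result is

\[
\begin{aligned}
\mathbf{A} &= \begin{pmatrix} 1 & 2 \end{pmatrix} \\
\mathbf{B}_1 = \mathbf{B} &= \begin{pmatrix} 1 & 2 \end{pmatrix} \\
\mathbf{E}_1^{T} &= \begin{pmatrix} 1 & 0 \end{pmatrix} \\
\mathbf{E}_1\mathbf{B}_1 &= \begin{pmatrix} 1 & 2 \\ 0 & 0 \end{pmatrix} \\
\mathbf{E}_1\mathbf{B}_1 - \mathbf{I} &= \begin{pmatrix} 0 & 2 \\ 0 & -1 \end{pmatrix} \\
\mathbf{N} \mathrel{\hat=} (\mathbf{E}_1\mathbf{B}_1 - \mathbf{I})\mathbf{E}_2 &= \begin{pmatrix} 2 \\ -1 \end{pmatrix}
\end{aligned}
\]

where $\mathbf{B}_1\mathbf{E}_1 = \begin{pmatrix} 1 & 2 \end{pmatrix} \begin{pmatrix} 1 & 0 \end{pmatrix}^{T} = (1) \equiv \mathbf{I}_{\mathrm{rank}(\mathbf{A}) \times \mathrm{rank}(\mathbf{A})}$. Note: The stoichiometry matrix $\mathbf{N}$ is in chemical lingo written 2A ⇔ A2. Left as an exercise for the reader is finding all six(!) reactions in the methane–air system mentioned in Eq. 1.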

After this lengthy digression on nullspaces and chemical reactions we shall finally continue with the mass balance in Eq. 3. The forthcoming discussion has much in common with the energy balance in Part II of this paper, but the mass balance is inherently simpler when seen from a modelling point of view. To continue we first need the partial derivative of $\mathbf{n}$ with respect to time at a fixed spatial position $z$:

\[
\left( \frac{\partial \mathbf{n}}{\partial t} \right)_{z} = \dot{\mathbf{n}}_{z} - \dot{\mathbf{n}}_{z+\Delta z} + \int_z^{z+\Delta z} A \mathbf{N} \dot{\boldsymbol{\xi}} \, d\zeta
\]

As is also explained in the second paper, this equation has a very special meaning whenever the physical situation is such that it allows the left-hand side to be put to zero. This is the celebrated steady state, which reduces the differential equation to an algebraic equation of the form:

\[
\dot{\mathbf{n}}_{z+\Delta z} - \dot{\mathbf{n}}_{z} = \int_z^{z+\Delta z} A \mathbf{N} \dot{\boldsymbol{\xi}} \, d\zeta
\]

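The mole flows can be factored into the flow of total mass and a composition term:

\[
\dot{\mathbf{n}} = \dot{M} \mathbf{c}
\]

The mass balance is then reduced to:

\[
(\dot{M}\mathbf{c})_{z+\Delta z} - (\dot{M}\mathbf{c})_{z} = \int_z^{z+\Delta z} A \mathbf{N} \dot{\boldsymbol{\xi}} \, d\zeta
\]

From the mass conservation principle we know that (for steady-state flow):

\[
\dot{M}_{z+\Delta z} - \dot{M}_{z} = 0
\]

Division by $\dot{M}_{z+\Delta z} = \dot{M}_{z} \mathrel{\hat=} \dot{M}$ on both sides of the equation yields:

\[
\mathbf{c}_{z+\Delta z} - \mathbf{c}_{z} = \int_z^{z+\Delta z} A \mathbf{N} \dot{M}^{-1} \dot{\boldsymbol{\xi}} \, d\zeta
\]

In the limit of $\Delta z \to 0$ we get:

\[
\lim_{\Delta z \to 0} (\mathbf{c}_{z+\Delta z} - \mathbf{c}_{z}) = A \mathbf{N} \dot{M}^{-1} \dot{\boldsymbol{\xi}} \, \Delta z
\]

or rearranged:

\[
\lim_{\Delta z \to 0} \frac{\mathbf{c}_{z+\Delta z} - \mathbf{c}_{z}}{\Delta z} = A \mathbf{N} \dot{M}^{-1} \dot{\boldsymbol{\xi}}
\]

We immediately recognize the left-hand side as the partial derivative of $\mathbf{c}$ with respect to $z$. On the right-hand side we can make the definition $\mathbf{r} \mathrel{\hat=} \dot{M}^{-1}\dot{\boldsymbol{\xi}}$ standing for the specific reaction yield (moles per unit mass and volume). The mass balance for a steady-state reactor is finally written:

\[
\left( \frac{\partial \mathbf{c}}{\partial z} \right)_{\mathrm{s\text{-}s}} = A \mathbf{N} \mathbf{r}
\]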

To solve this equation we need N and a kinetic model for r(z, c). An algorithm for 

calculating N is discussed in this paper, but the calculation of r has to await a more 

thorough discussion of thermodynamic state variables in Parts II and III of this paper. 

The reason is that r is a strong function of thermodynamic variables like temperature 

and pressure in addition to the composition variable c. 

There is another formal issue here which must not be forgotten: The mass balance 

is written as a partial derivative with respect to the spatial co-ordinate. This is odd 

since c is by no means a function of z. It only depends on z through the solution of 

the differential equation. The thread to this discussion will be picked up in conjunction 

with the energy balance in Part II. 
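To get a feeling for the steady-state balance, the equation ∂c/∂z = A N r can be integrated numerically once a rate expression is available. The Python 3 sketch below is purely illustrative: it uses the monomer–dimer system with N = (2, -1)^T, explicit Euler steps, and a hypothetical first-order rate law r = 0.5*c_A2 together with made-up values for the area and the reactor length. None of these modelling choices come from the paper; the function name `integrate_steady_state` is also an assumption.

```python
def integrate_steady_state(c0, nmat, rate, area, length, steps):
    """Explicit Euler integration of dc/dz = area * N * r(z, c) (sketch)."""
    c = list(c0)
    dz = length / float(steps)
    for k in range(steps):
        r = rate(k * dz, c)              # one rate per independent reaction
        for i in range(len(c)):
            c[i] += dz * area * sum(nmat[i][j] * r[j] for j in range(len(r)))
    return c

nmat = [[2.0], [-1.0]]                   # monomer A and dimer A2: 2A <=> A2

def rate(z, c):
    """Hypothetical kinetics: first order in the dimer concentration."""
    return [0.5 * c[1]]

c_in = [0.0, 1.0]                        # hypothetical inlet: pure dimer
c_out = integrate_steady_state(c_in, nmat, rate, area=1.0, length=2.0, steps=200)
print(c_out)
print(c_out[0] + 2.0 * c_out[1])         # atom balance, formula matrix (1 2)
```

Whatever rate law is chosen, the combination c_A + 2*c_A2 stays constant along the reactor because the formula matrix (1 2) multiplied by N is zero.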


Root solvers 





Reasons computers are female 

No one but their creator understands their internal logic. 


The native language they use to communicate with other computers is incomprehensible to 

everyone else. 

Even your smallest mistakes are stored in long-term memory for later retrieval. 

As soon as you make a commitment to one, you find yourself spending half your paycheck on 

accessories for it. 

Computers are female 

Assignments 

1. Write a procedure sqrt for solving x=sqrt(y) using Newton-Raphson 

iteration. Variable y is supposed to be a known number taken in from the 

command line and you are asked to find x. Note: You cannot iterate on 

x-sqrt(y)=0 directly because this problem already requires sqrt() 

which is an unknown function (without importing the math module in 

Python). Rather, you should consider iterating on x**2-y=0. Use the 

stub program sqrt.py as template. 

2. Play around with sqrt() and see if you can trick it somehow. Make it 

diverge in other words. 

3. Write a procedure pv for solving pv(p,t,v,ntot)=0 using Newton- 

Raphson iteration. Variables p, t, v and ntot are supposed to be known 

numbers taken in from the command line. However, v is a starting value 

only and will change during the iteration. Note: You must avoid unphysical 

solutions. That is to say negative volumes. Use the stub program pv.py as 

template. 

4. Play around with pv() and learn more about Newton-Raphson iteration 

sequences. Run it a couple of thousand times at different starting values 

to see how stable it is. Observe that the iteration method is of 2nd order,
i.e. that the number of significant digits roughly doubles in every iteration
once the iterate is close to the solution.
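The 2nd order behaviour mentioned in the last assignment can be observed on any smooth test problem. The Python 3 sketch below deliberately uses an equation that is not part of the assignments, x**3 - 2 = 0, and the helper name `newton` is hypothetical. It prints the error of each iterate so that the doubling of correct digits becomes visible.

```python
def newton(f, dfdx, x, steps):
    """Return the list of Newton-Raphson iterates x_0, x_1, ..., x_steps."""
    history = [x]
    for _ in range(steps):
        x = x - f(x) / dfdx(x)
        history.append(x)
    return history

root = 2.0 ** (1.0 / 3.0)                # reference solution of x**3 - 2 = 0
history = newton(lambda x: x**3 - 2.0, lambda x: 3.0 * x**2, 1.5, 5)
errors = [abs(x - root) for x in history]
for k, e in enumerate(errors):
    print("k=%d  x=%.15f  error=%.3e" % (k, history[k], e))
```

Each error is roughly proportional to the square of the previous one until the machine precision is reached.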

Start reading about The energy balance to get into the thinking of physical 

problem formulations, equations of state and numerical solvers. 

Most of the time we will be using Newton-Raphson iteration in this course for
solving non-linear equations, but there is something called recursive iteration
(using the Banach fixed-point theorem) which can be very efficient. Perhaps you
know this type of iteration as 'direct substitution'. It is worthwhile looking
at, now that we know a little Python.
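Direct substitution simply feeds the result of a function back into the same function until nothing changes. A minimal Python 3 sketch is shown below, using the classic test problem x = cos(x); the function name `fixed_point`, the tolerance and the iteration limit are assumptions made for this illustration.

```python
import math

def fixed_point(g, x, tol=1.0e-12, max_iter=1000):
    """Iterate x_k+1 = g(x_k) until two successive iterates agree."""
    for k in range(max_iter):
        x_new = g(x)
        if abs(x_new - x) < tol:
            return x_new, k + 1
        x = x_new
    raise RuntimeError("direct substitution did not converge")

x, iterations = fixed_point(math.cos, 1.0)
print("x = %.12f after %d iterations" % (x, iterations))
```

The method converges here because the slope of cos(x) is smaller than 1 in absolute value near the solution, but it needs many more iterations than Newton-Raphson would.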


1. Write a recursive procedure for iterating x_k+1 = x_k**2 starting at 

x_0 < 1. 

Have a look at for_lc_rc.py for some compelling thoughts on how this iteration 

can be achieved. 



Top 10 reasons compilers are female: 

10. Picky, picky, picky. 

9. They hear what you say, but not what you mean. 

8. Beauty is only shell deep. 

7. When you ask what's wrong, they say "nothing". 

6. Can produce incorrect results with alarming speed. 

5. Always turning simple statements into big productions. 

4. Smalltalk is important. 

3. You do the same thing for years, and suddenly it's wrong. 

2. They make you take the garbage out. 

1. Miss a period and they go wild. 


Washington Apple Pi IFAQ 

lic Wednesday, November 5, 1997

5.11.2 Verbatim: “sqrt.py” 

"""
@summary: Calculate the square root of any set of positive numbers using
          Newton-Raphson iteration on::

              x*x - y = 0

          where y is the given number. In the implementation below y and x
          are not plain numbers but lists of numbers.
@organization: Department of Chemical Engineering, NTNU, Norway
@since: 2011.10.13 (THW)
@version: 1.0
@todo 2.0: nothing
@note: On a Unix terminal you can use the script like this:

       >>> python sqrt.py
       >>> python sqrt.py y1 y2 ...

       y1 = aNumber
       y2 = aNumber
       ... = aNumber

"""

def sqrt(y, x, debug=False, norm=1e999):

    if debug:
        print x

    dy = pass  # calc max(abs(residual))

    if dy < 1.0e-8 and dy >= norm:  # iterate till the bitter end
        return x

    else:
        return pass  # sqrt(y, x_k+1, debug, dy)

# Test the code. Feed it pretty bad starting values...
#
if __name__ == '__main__':

    import sqrt
    import sys

    # User problem.
    if len(sys.argv) > 1:
        y1 = [float(yi) for yi in sys.argv[1:]]
        x0 = y1
        debug = False

    # Default problem.
    else:
        y1 = [2, 3, 4]
        x0 = [1.0e-10, 1, 1.0e10]
        debug = True

    print sqrt.sqrt(y1, x0, debug)

5.11.3 Verbatim: “pv.py” 

"""
@summary: Solve p^{ig}(v) = p1 using Newton-Raphson iteration.
          Step size is controlled in order to avoid v < 0.

@since:   2011.10.13 (THW)
@version: 0.9
@todo 1.0:

@note: On a Unix terminal you can use the script like this:

       >>> python pv.py
       >>> python pv.py p1 t v0 ntot

       p1   = pressure [kbar]
       t    = temperature [kK]
       v0   = initial volume [dm3]
       ntot = total number of moles [mol]

"""

def pv(p1, t=0.29815, v0=1.0, ntot=1.0, debug=False):

    converged = False              # convergence flag
    norm      = 1.0                # convergence control variable
    eps       = 1.0e-8             # convergence tolerance
    v         = v0                 # start volume
    r         = 0.083145119843087  # gas constant [10^5 J mol^{-1} kK^{-1}]

    # Solve p(v) = p1 using Newton's method.
    while not converged:
        dpdv = pass  # Jacobian
        dp   = pass  # pressure residual
        dv   = pass  # volume change
        converged = abs(dv) < eps and abs(dv) >= norm  # decreasing norm?
        norm = abs(dv)  # new norm

        # The model fails if 'v' becomes negative volume. Shorten the iteration
        # step till the updated volume is positive. Raise an exception if the
        # step becomes too small.
        while v+dv < 0.0:
            if abs(dv) < eps:
                raise SyntaxError("cannot converge p(v) = p1 relation")
            pass  # reduce the step length (heuristic rule)
        pass  # update volume

        if debug:
            print "norm=%8.3g; v=%16.15g;" % (norm, v)

    return v

# Test the code.
#
if __name__ == '__main__':

    import pv
    import sys

    # User problem.
    if len(sys.argv) == 5:
        p1    = float(sys.argv[1])
        t     = float(sys.argv[2])
        v0    = float(sys.argv[3])
        ntot  = float(sys.argv[4])
        debug = False

    # Default problem.
    else:
        p1    = 0.2   # given pressure [kbar]
        t     = 0.8   # temperature [kK]
        v0    = 1.0   # initial volume [dm3]
        ntot  = 13.0  # total mole number [mol]
        debug = True

    print '\nInput:'
    print 'p1=%8.6f; T=%8.6f; V0=%8.6f; Ntot=%8.6f\n' % (p1, t, v0, ntot)
    print '\nOutput:\nV1=%8.6f\n' % (pv.pv(p1, t, v0, ntot, debug),)
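For orientation, here is one way the `pass` placeholders in the stub could be completed for the ideal gas law p = ntot*R*t/v. It is a sketch in Python 3 under stated assumptions: the function name `solve_volume`, the `maxiter` guard, the step-halving rule and the use of `ArithmeticError` are choices made for this example, not the course's reference solution.

```python
R_GAS = 0.083145119843087  # gas constant [10^5 J mol^-1 kK^-1], value from the listing


def solve_volume(p1, t=0.29815, v0=1.0, ntot=1.0, eps=1.0e-8, maxiter=100):
    """Solve ntot*R*t/v = p1 for v by Newton's method with step halving."""
    v = v0
    norm = float("inf")
    for _ in range(maxiter):
        p = ntot * R_GAS * t / v      # ideal gas pressure at the current volume
        dpdv = -p / v                 # Jacobian (dp/dv) at constant t and ntot
        dv = (p1 - p) / dpdv          # full Newton step
        while v + dv <= 0.0:          # keep the updated volume positive
            dv *= 0.5                 # heuristic rule: halve the step
            if abs(dv) < eps:
                raise ArithmeticError("cannot converge p(v) = p1 relation")
        v += dv
        if abs(dv) < eps and abs(dv) >= norm:  # small step that no longer shrinks
            return v
        norm = abs(dv)
    raise ArithmeticError("max iterations exceeded")


if __name__ == "__main__":
    # Default problem of the stub: 0.2 kbar, 0.8 kK, 13 mol, start at 1 dm3.
    print(solve_volume(0.2, t=0.8, v0=1.0, ntot=13.0))
```

Starting far above the solution (for example `v0=100.0` in the default problem) makes the full Newton step overshoot into negative volume, which is the situation the inner `while` loop is there to catch.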


Plug Flow Reactor. Part II 




16 October 2011 


1 The energy balance 

[Figure: control volume with internal energy $\mathcal{U}(t, z, \Delta z)$ between the positions $z$ and $z + \Delta z$, with inflow $(\dot{U} + p\dot{V})_{\mathrm{in}}$, outflow $(\dot{U} + p\dot{V})_{\mathrm{out}}$ and heat flow $\dot{Q}$.]

The derivation of a rigorous energy balance for any real-life system, of which the idealized Plug Flow Reactor (PFR) is one simple example, demands a tour de continuum mécanique which definitely is beyond the scope of this little text. But we cannot ignore the energy balance altogether, so we must somehow pick up a model description that is mathematically succinct and at the same time physically correct. The following derivation is a humble attempt to reach a reasonably clear disposition of the subject.

Let $\mathcal{U}(t, z, \Delta z)$ be the internal energy of a control volume with one inlet and one outlet. The material flow into the control volume, and out from it, is assumed to be perpendicular to the control surfaces which are situated at $z$ and $z + \Delta z$. This simplification reduces the traditional inner product of the surface normal (vector) and the (vectorial) flows of heat, displacement work, and energy, into their scalar counterparts called $\dot{Q}$, $p\dot{V}$ and $\dot{U}$. Note that we shall only consider the flow of internal energy $\dot{U}$ while in the general case we might need to include terms for potential energy, kinetic energy, surface energy, electromagnetic energy and so forth. But, because the picture becomes immensely complicated when every possible term is included, it is important to simplify the model as much as possible without losing the grip of reality. According to the aforementioned simplifications and the principle of energy conservation we shall write

$$\mathcal{U}(t, z, \Delta z) = U_\circ + \int_0^t \left( \dot{U} + p\dot{V} \right)_z d\tau - \int_0^t \left( \dot{U} + p\dot{V} \right)_{z+\Delta z} d\tau + \int_0^t \left( \dot{Q} - \dot{W}_s \right) d\tau$$

where $\dot{W}_s$ is the mechanical “shaft” work applied to the reactor. Normally it is close to zero. Subscripts $z$ and $z+\Delta z$ are used to denote physical properties that are calculated at these two spatial positions. This is not to say that $\dot{U}$ and $p\dot{V}$ are functions of $z$ per se. They have co-ordinates of their own which in a way are defined at every point in space and time. This subtlety is discussed further down the text.

In the current context we may put the integration constant U◦ to zero. It implies 

that a material system with zero mass has zero energy. This is an important thermodynamic 

consideration which is true for all chemical systems in the absence of strong 

electromagnetic radiation. 

The symbols $\dot{Q}$, $\dot{V}$ and $\dot{U}$ stand for the transported heat, volume and energy (per unit time) and have nothing to do with the derivative of a mathematical function, say $F$, which is defined like:

$$\left( \frac{\partial F}{\partial t} \right)_{x_1, x_2, \cdots} \mathrel{\hat{=}} \lim_{\Delta t \to 0} \frac{F(t + \Delta t) - F(t)}{\Delta t}$$

This means we need to distinguish clearly between the transportation $\dot{F}$ and the time derivative $(\partial F/\partial t)$. The scientific units are the same but their interpretations are entirely different¹. In other papers you may find $\hat{F}$ being used rather than the dotted form favored here. The meaning is the same though.

To continue, $\mathcal{U}$ and the $U$ from which $\dot{U}$ is derived look quite similar, but they do actually measure two different aspects of internal energy. $\mathcal{U}$ is a mathematical construction (we may call it a functional) which has no simple physical description, while $U$ is a thermodynamic state function $U(S, V, N_1, N_2, \cdots)$ which by definition is independent of time. That is to say $U(x, t_1, z_1) = U(x, t_2, z_2) = \ldots$ for fixed values of entropy, volume and mole numbers (collected into one vector $x$). To be a state function $U$ must represent the energy of an isotropic system in equilibrium with respect to certain restricted changes in the state variables $S$, $V$, $N_1$, $N_2$, etc. (the definition of state variables is made broader later in this text). Hence, it is generally true that $(\partial U/\partial t) = 0$ while $(\partial \mathcal{U}/\partial t) \neq 0$. To proceed, we introduce from thermodynamic theory that $H \mathrel{\hat{=}} U + pV$. This definition also works for the transported enthalpy:

$$\dot{H} \mathrel{\hat{=}} \dot{U} + p\dot{V} \qquad (1)$$

¹ Formal arguments can be raised against this conjecture. Consider a functional $F$ that describes the amount of energy, mass or any other extensive property that has passed the control surface at $z$ over the time period $[0, t]$. Then

$$F(t, z) = \int_0^t A \dot{f} \, d\tau$$

where $\dot{f}$ is the flux (amount per unit area and time) of $F$, and $A$ is the cross-sectional area of the transport. The time derivative of $F$ is

$$\left( \frac{\partial F}{\partial t} \right)_z = A \dot{f} \mathrel{\hat{=}} \dot{F}$$

So, in a sense $\dot{F}$ is really a partial derivative, but it must be understood that $F$ has no explicit (and time independent) function expression like e.g. the thermodynamic and kinetic models we are using. Most students have problems in understanding the fundamental difference between $dF/dt$ and $(\partial F/\partial t)$ and I therefore hesitate in calling $\dot{F}$ a derivative because it will bring even more confusion into the subject.

It works because $p$ (the pressure) is an intensive state variable which is independent of the magnitude of the volume flow. At the same time we want to integrate the total heat flux over the external surface of the reactor section

$$\dot{Q} = \int_z^{z+\Delta z} C \dot{q} \, d\zeta , \qquad (2)$$

where $C$ is the circumference of the reactor and $\dot{q}$ is the heat flux (per unit time and surface area). Note that $d\zeta$ rather than $dz$ is acting as an integrator for $\dot{q}$. We use this convention (Greek integrator, Latin variable) to make sure we do not mix up the integrator symbol with the symbol of either the upper or the lower limit of the integral². This makes the integral a function of $z$ while $\zeta$ is consumed during the integration.

It is customary to neglect the heat flow in the axial direction, which is why the integral is carried out over the outer surface only. However, strictly speaking there is an order-of-magnitude analysis missing here, but this is left as an exercise for the reader.

The internal energy of the control volume is then:

$$\mathcal{U}(t, z, \Delta z) = \int_0^t \left( \dot{H}_z - \dot{H}_{z+\Delta z} \right) d\tau + \int_0^t \int_z^{z+\Delta z} C \dot{q} \, d\zeta \, d\tau$$

This states the energy balance of a simple plug flow reactor. In the form given it is particularly useful for testing and verifying the accuracy of numerical integrators used in dynamic simulation studies, but this is not our goal. We shall proceed instead by calculating the partial derivative of $\mathcal{U}$ at a fixed spatial position $z$ with respect to time:

$$\left( \frac{\partial \mathcal{U}}{\partial t} \right)_{z, \Delta z} = \dot{H}_z - \dot{H}_{z+\Delta z} + \int_z^{z+\Delta z} C \dot{q} \, d\zeta \qquad (3)$$

In its current form Eq. 3 leads to a partial differential equation (PDE) in time and space which is considered to be a hard numerical task. But there are relevant simplifications. In particular we shall study the behaviour of closed systems without throughput of mass and steady state (time independent) systems.

1.1 First law of thermodynamics 

A special form of the energy balance applies to closed systems. Here, closed means $\dot{H}_z = \dot{H}_{z+\Delta z} = 0$. This appears to be outside the scope of our PFR model but it is still in reach of the thermodynamic formalism. In a system of this kind energy changes solely because heat is expelled to, or brought in from, the environment. For the change of $\mathcal{U}$ we can then write:

$$(d\mathcal{U})_{\text{c-s}} = \int_z^{z+\Delta z} C \dot{q} \, d\zeta \, dt$$

Backsubstitution of $\dot{Q}$ from Eq. 2 yields the simpler form: $d\mathcal{U} = \dot{Q} \, dt$. A similar argument holds also for any kind of external work even though it by coincidence has been excluded in Eq. 3. The reason is that the PFR model is not subject to any volume change nor is it equipped with a mechanical stirrer. If we had decided to include external work (positive when work is delivered by the system) the energy equation would have been extended to $d\mathcal{U} = \dot{Q} \, dt - \dot{W} \, dt$.

² Dealing mostly with closed and definite integrals we may not even realise the problem, but as we move on to indefinite integrals (antiderivatives) the symbol clash becomes very noticeable. In thermodynamics we define for example the residual function $G^{\mathrm{r,p}}(p) \mathrel{\hat{=}} \int_0^p \left( V(\pi) - V^{\mathrm{ig}}(\pi) \right) d\pi$ where $\pi$ is an integrator (over pressure) and $p$ is the system pressure. The convolution integral $F(t) = \int_0^t \varphi(\tau) \psi(t - \tau) \, d\tau$ used in signal theory is another example. The mutual roles of $\tau$ and $t$ must here be sorted out beforehand.

Taken a bit further, it is customary to say that $\dot{Q} \, dt = \delta Q$ and $\dot{W} \, dt = \delta W$ where $\delta Q$ and $\delta W$ stand for the non-exact differentials of $Q$ and $W$. Non-exact means that $U$ does not depend on $Q$ and $W$ in a definite way, i.e. there exists no function $U(Q, W)$ such that when $Q$ and $W$ are given then $U$ is also given. This should be quite intuitive since $U$ is the energy of a material system where the masses of the chemical constituents must also play a role.

In fact, Q and W are path dependent functions of the thermodynamic state, and also 

of the spatial co-ordinates and of time. They are not state functions in any way and they 

do not constitute a part of the system. Rather, they express the transportation of energy 

across the system border. Inside the system, however, heat and work can only be stored 

as internal energy. There are in other words no “heat content” or “energy content” 

of the system, only the ability to exchange heat and work with the environment. We 

therefore talk about “heat potential” and “work potential” to stress the fact that energy 

(the thermodynamic potential) has to be converted back and forth between heat and 

work all the time. 

Finally, before we leave the discussion of the closed system we shall make a precise interpretation of $\mathcal{U}$ and $U$. It has already been stated that $\mathcal{U}$ is a constructed energy function (a functional) that serves the need of an accumulation term in the energy equation. From the discussion given above it is clear that $\mathcal{U}$ does not change in a closed system unless there is heat or work exchange with the environment. If there are no interactions of any kind, then all experiments made over the past 200 years indicate that $\mathcal{U}$ gradually becomes indistinguishable from $U$. That is:

$$\mathcal{U}_{\mathrm{eq}} \mathrel{\hat{=}} \lim_{t \to \infty} \mathcal{U} \to U$$

The two functions $\mathcal{U}$ and $U$ are identical whenever their function values are the same over the entire definition domain³. In this case $\mathcal{U}$ is constant throughout the experiment, so how can it then become gradually indistinguishable from $U$? The experiment tells us that $\mathcal{U}$ does not change in a closed system over time. Our postulate says that $\mathcal{U}$ is identical to $U$ when all internal agitation and transients have died out. Before that the measurements of any intensive variables like temperature, pressure and chemical potentials give unreliable readings even though the function values are the same at any time. It is only when all the readings are stable that we can say that $\mathcal{U} \equiv U$ in the mathematical understanding of the statement. We call this the equilibrium state of the system. It has an incredibly simple representation in the sense that only $n+2$ macroscopic variables are needed in order to establish the value of $U(S, V, N_1, N_2, \cdots, N_n)$. From a microscopic point of view this is really incredible because there are $6 N_A \sum_i N_i$ mechanical degrees of freedom when all the particles in the system are considered as a Newtonian universe. Thermodynamic systems are much simpler, however, because experimentally only the statistically most relevant state is being observed, and since thermodynamics is a phenomenological science the observations and theory go hand in hand. This means we can write the energy balance of a closed system as

$$(dU)_{\text{c-s}} = \delta Q - \delta W$$

which is precisely the first law of thermodynamics. The energy balance in Eq. 3 fulfills in other words the requirements of the first law of thermodynamics albeit in disguise. It must be understood, however, that the usability of $U = \mathcal{U}_{\mathrm{eq}}$ hinges on the fact that the relaxation time of the equilibrium process must be smaller than the time scale of the simulation. This may, or may not, be the case, but for the present purpose we shall assume that $\mathcal{U}$ has the meaning of $U$; at least locally for each point in space, if not for the entire system.

³ E.g. the two functions $f(x) = \cos^2(x) + \sin^2(x)$ and $g(x) = 1$ are mathematically identical for $x \in \mathbb{R}$.

1.2 Steady state solution 

Eq. 3 has another special meaning whenever the physical situation is such that it allows the left hand side to be put to zero. It is the celebrated steady state which reduces the differential equation to a time-independent algebraic equation of the form:

$$\left( \dot{H}_{z+\Delta z} - \dot{H}_z \right)_{\text{s-s}} = \int_z^{z+\Delta z} C \dot{q} \, d\zeta$$

Despite its simple form the last equation has a wide range of applicability. It is valid for any type of fluid flow, inviscid or not, gas or liquid, one-phase or multi-phase, and with or without chemical reactions.

Just like the displacement work in Eq. 1 was factored into $p\dot{V}$, the transported enthalpy can be factored into the transported mass and a term called the specific enthalpy $h$:

$$\dot{H} = h \dot{M}$$

The inherent scaling properties, namely that $\dot{W} = p\dot{V}$ and $\dot{H} = h\dot{M}$, are deeply rooted in thermodynamic theory and are examples of the so-called Euler homogeneous functions. The energy balance is then reduced to:

$$(h \dot{M})_{z+\Delta z} - (h \dot{M})_z = \int_z^{z+\Delta z} C \dot{q} \, d\zeta$$

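From the mass conservation principle we know that (for steady-state flow):

$$\dot{M}_{z+\Delta z} - \dot{M}_z = 0$$

Division by $\dot{M}_{z+\Delta z} = \dot{M}_z \mathrel{\hat{=}} \dot{M}$ on both sides of the equation yields:

$$h_{z+\Delta z} - h_z = \int_z^{z+\Delta z} \frac{C \dot{q}}{\dot{M}} \, d\zeta$$

In the limit of $\Delta z \to 0$ we get:

$$\lim_{\Delta z \to 0} \left( h_{z+\Delta z} - h_z \right) = \frac{C \dot{q}}{\dot{M}} \Delta z$$

or rearranged:

$$\lim_{\Delta z \to 0} \frac{h_{z+\Delta z} - h_z}{\Delta z} = \frac{C \dot{q}}{\dot{M}}$$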

We immediately recognize the left hand side as the partial derivative of $h$ with respect to $z$. On the right hand side we can make the definition $q \mathrel{\hat{=}} \dot{q}/\dot{M}$ standing for the specific heat load (energy per unit mass and area). The energy balance for a steady state reactor with only internal energy flow is then:

$$\left( \frac{\partial h}{\partial z} \right)^{\text{s-s}} = C q$$

The anti-derivative of the energy balance defines the so-called enthalpy equation (please note the integral on the right side is zero for an adiabatic reactor without external heat load):

$$h(z) = h(0) + \int_0^z C(\zeta) q(\zeta) \, d\zeta$$
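The enthalpy equation can be evaluated numerically once $C(\zeta)$ and $q(\zeta)$ are known. The sketch below uses the trapezoidal rule in Python 3. The function name `enthalpy_profile` and its arguments are hypothetical, and the constant circumference and heat load in the usage lines are made-up illustration values.

```python
def enthalpy_profile(h0, circumference, heat_load, z_end, nsteps=100):
    """Return h(z) on an equidistant grid from z = 0 to z = z_end.

    circumference(z) and heat_load(z) play the roles of C and q in the
    enthalpy equation; the integral is accumulated by the trapezoidal rule.
    """
    dz = z_end / nsteps
    h = [h0]
    for k in range(nsteps):
        za, zb = k * dz, (k + 1) * dz
        fa = circumference(za) * heat_load(za)   # integrand at the left node
        fb = circumference(zb) * heat_load(zb)   # integrand at the right node
        h.append(h[-1] + 0.5 * dz * (fa + fb))
    return h


if __name__ == "__main__":
    # Constant C and q (illustration values only): h grows linearly with z.
    heated = enthalpy_profile(10.0, lambda z: 2.0, lambda z: 3.0, 1.5)
    # Adiabatic reactor: q = 0, so h(z) = h(0) everywhere.
    adiabatic = enthalpy_profile(10.0, lambda z: 2.0, lambda z: 0.0, 1.5)
    print(heated[-1], adiabatic[-1])
```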

At this point we need to worry about the mathematical notation we are using. The operations are formally correct up to the point where $\Delta z \to 0$, but here it stops. At some finite value of $\Delta z$ it becomes smaller than the resolution of the measurement. Or, it may in fact become smaller than the effective size of the molecules comprising the system, and on this tiny scale $h$ loses its meaning since it requires a big number of colliding molecules to establish a thermodynamic state variable. Hence, the derivative $(\partial h/\partial z)$ does not properly exist. It is only the finite difference $h_{z+\Delta z} - h_z$ that is physically measurable, and then only if $\Delta z$ is sufficiently large. This is not a practical problem in most cases, but for e.g. high-vacuum systems we must take precautions because the distance covered between two successive collisions of the molecules can be of the order of millimeters or even centimeters.

Our second worry is that $h$ is not a function of the spatial co-ordinate $z$. It is in fact a function of the state variables $T$, $v \mathrel{\hat{=}} V/M$, $c_1 \mathrel{\hat{=}} N_1/M$, $c_2 \mathrel{\hat{=}} N_2/M$, etc. when any of the modern pressure explicit equations of state are being used in the modelling (most of them are descendants of the Van der Waals equation of state from 1873). Hence, $(\partial h/\partial z)$ does not exist other than as a formal expression, but from differential calculus we know that $dh/dz$ takes the same numerical value as $(\partial h/\partial z)$ when all the degrees of freedom except one (i.e. $z$) are locked. However, the total differential of $h$ is

$$dh = \left( \frac{\partial h}{\partial T} \right)_{v, c_1, c_2, \cdots} dT + \left( \frac{\partial h}{\partial v} \right)_{T, c_1, c_2, \cdots} dv + \left( \frac{\partial h}{\partial c_1} \right)_{T, v, c_2, c_3, \cdots} dc_1 + \left( \frac{\partial h}{\partial c_2} \right)_{T, v, c_1, c_3, \cdots} dc_2 + \cdots$$

or given a more compact form:

$$dh = \partial_T h \cdot dT + \partial_v h \cdot dv + \partial_{c_1} h \cdot dc_1 + \partial_{c_2} h \cdot dc_2 + \cdots$$

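Inventing a new notation overnight is not something I usually recommend, but we will run out of paper pretty soon unless we do something about the partial derivatives flourishing all over the place. Dividing by $dz$ (which is an algebraic quantity, remember, and by the way quite different from $\partial z$ which is an operator) gives the differential quotient:

$$\frac{dh}{dz} = \left( \frac{\partial h}{\partial T} \right)_{v, c_1, c_2, \cdots} \frac{dT}{dz} + \left( \frac{\partial h}{\partial v} \right)_{T, c_1, c_2, \cdots} \frac{dv}{dz} + \left( \frac{\partial h}{\partial c_1} \right)_{T, v, c_2, c_3, \cdots} \frac{dc_1}{dz} + \left( \frac{\partial h}{\partial c_2} \right)_{T, v, c_1, c_3, \cdots} \frac{dc_2}{dz} + \cdots$$

or, using our shorter notation:

$$\nabla h = \partial_T h \cdot \nabla T + \partial_v h \cdot \nabla v + \partial_{c_1} h \cdot \nabla c_1 + \partial_{c_2} h \cdot \nabla c_2 + \cdots$$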

This is precisely the expression we are looking for. The crux of the matter is that $\nabla h$ takes the same numerical value as $(\partial h/\partial z)$, but to carry on we need to first solve an equation system that settles the values of $\nabla T$, $\nabla v$, $\nabla c_1$, $\nabla c_2$, etc. This is done by simultaneously solving the energy, momentum and mass balances at the inlet of the reactor and integrating the solution variables along the spatial co-ordinate $z$. The how's and why's are fully explained in Part III of this paper entitled Modelling Issues. The implicitness of the conservation statement is so fundamental to the thermodynamicist, however, that it really deserves an introductory example. The internal workings of the so-called Jacobian transformation are explained below.
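The statement that $\nabla h$ built from the chain rule takes the same numerical value as the derivative of $h$ along $z$ can be checked numerically. The Python 3 sketch below does so for a two-variable case. The enthalpy model `h(T, v)` and the linear profiles `T_of_z` and `v_of_z` are hypothetical functions invented for this check; they carry no physical meaning.

```python
def h(T, v):
    """Hypothetical specific enthalpy model (illustration only)."""
    return 2.0 * T + 0.5 * T / v


def dh_dT(T, v):
    return 2.0 + 0.5 / v            # partial derivative at constant v


def dh_dv(T, v):
    return -0.5 * T / v**2          # partial derivative at constant T


def T_of_z(z):
    return 300.0 + 10.0 * z         # assumed profile, gradient dT/dz = 10


def v_of_z(z):
    return 1.0 + 0.1 * z            # assumed profile, gradient dv/dz = 0.1


def grad_h_chain(z):
    """Chain rule: grad h = d_T h * grad T + d_v h * grad v."""
    T, v = T_of_z(z), v_of_z(z)
    return dh_dT(T, v) * 10.0 + dh_dv(T, v) * 0.1


def grad_h_fd(z, dz=1.0e-5):
    """Central finite difference of h along the path (T(z), v(z))."""
    hi = h(T_of_z(z + dz), v_of_z(z + dz))
    lo = h(T_of_z(z - dz), v_of_z(z - dz))
    return (hi - lo) / (2.0 * dz)


if __name__ == "__main__":
    print(grad_h_chain(0.5), grad_h_fd(0.5))
```

In a reactor model the gradients of $T$ and $v$ are of course not given beforehand; they are the unknowns of the linear equation system mentioned in the text.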

1.3 Calculation example 

Doing matrix algebra by hand is hard work but there is no other way we can get an understanding of how the linearization really works. So, to gain the insight we shall practise on a minimalistic 2 × 2 example. Assume a problem of the form:

$$H^{\mathrm{ig}}(T, V) \mathrel{\hat{=}} C_P^{\mathrm{ig}} T = H_\circ$$

$$p^{\mathrm{ig}}(T, V) \mathrel{\hat{=}} \frac{NRT}{V} = p_\circ$$

where $N$ is constant, and $H_\circ$ and $p_\circ$ are conserved quantities. Let $x \mathrel{\hat{=}} (\, T \;\; V \,)$ and $y \mathrel{\hat{=}} (\, H \;\; p \,)$. To solve $y(x) = y_\circ$ we first linearize $y(x)$ and then attempt to solve the equations iteratively using the Newton–Raphson method:

$$y_k + \left( \frac{\partial y}{\partial x^T} \right)_k (x_{k+1} - x_k) = y_\circ$$

Rearrangement gives:

$$x_{k+1} = x_k - J_k^{-1} (y_k - y_\circ)$$

where

$$J_k \mathrel{\hat{=}} \left( \frac{\partial y}{\partial x^T} \right)_k = \begin{pmatrix} \left( \frac{\partial H}{\partial T} \right) & \left( \frac{\partial H}{\partial V} \right) \\ \left( \frac{\partial p}{\partial T} \right) & \left( \frac{\partial p}{\partial V} \right) \end{pmatrix}_k = \begin{pmatrix} C_P^{\mathrm{ig}} & 0 \\ \frac{NR}{V} & -\frac{NRT}{V^2} \end{pmatrix}_k$$

so that:

$$J_k^{-1} = \frac{-1}{C_P^{\mathrm{ig}} \frac{NRT}{V^2}} \begin{pmatrix} -\frac{NRT}{V^2} & 0 \\ -\frac{NR}{V} & C_P^{\mathrm{ig}} \end{pmatrix}_k = \begin{pmatrix} \frac{1}{C_P^{\mathrm{ig}}} & 0 \\ \frac{V}{C_P^{\mathrm{ig}} T} & \frac{-V^2}{NRT} \end{pmatrix}_k$$

The remaining algebra is straightforward:

$$\begin{pmatrix} T \\ V \end{pmatrix}_{k+1} = \begin{pmatrix} T \\ V \end{pmatrix}_k - \begin{pmatrix} \frac{1}{C_P^{\mathrm{ig}}} & 0 \\ \frac{V}{C_P^{\mathrm{ig}} T} & \frac{-V^2}{NRT} \end{pmatrix}_k \left[ \begin{pmatrix} C_P^{\mathrm{ig}} T \\ \frac{NRT}{V} \end{pmatrix}_k - \begin{pmatrix} H \\ p \end{pmatrix}_\circ \right]$$

Iteration example: $H_\circ = 10^4$ J, $p_\circ = 10^6$ Pa, $N = 1$ mol, $C_P^{\mathrm{ig}} = \frac{5}{2} R$, $R = 8.3145$ J mol$^{-1}$ K$^{-1}$:

k   T [K]               V [m^3]
0   298.15              0.001
1   481.087257201275    0.00221018092537634
2   481.087257201275    0.00319913692002833
3   481.087257201275    0.00383965458178457
4   481.087257201275    0.00399357233671433
5   481.087257201275    0.00399998967128617
6   481.087257201275    0.00399999999997333
7   481.087257201275    0.004

The Newton–Raphson iteration is a so-called second order method. One characteristic feature is that the number of significant digits will double in each iteration sufficiently close to the solution (iteration 3 onward). Verify this behaviour. From the table it is also clear that $T$ converges in one step whilst $V$ requires 7 iterations. Give a reason for this observation⁴. Finally, it should be mentioned that the Newton–Raphson method is sensitive to the starting values. E.g. try to start the iteration at $V = 0.01$ rather than $V = 0.001$. Suggest a possible fix to the algorithm in this case⁵.

⁴ $H(T, V)$ is strictly linear in both $T$ and $V$.

⁵ Unphysical volume update. Step length restriction is necessary.
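The iteration can be reproduced with a few lines of Python 3. The sketch below applies the explicit inverse Jacobian of the 2 × 2 example directly. The function names `newton_step` and `iterate` are illustrative, while the numerical constants are those of the iteration example.

```python
R = 8.3145      # gas constant [J mol^-1 K^-1]
N = 1.0         # mole number [mol]
CP = 2.5 * R    # ideal gas heat capacity C_P = 5/2 R for N = 1 mol [J K^-1]
H0 = 1.0e4      # specified enthalpy [J]
P0 = 1.0e6      # specified pressure [Pa]


def newton_step(T, V):
    """One Newton-Raphson update x_{k+1} = x_k - inv(J_k) * (y_k - y0)."""
    dH = CP * T - H0                 # enthalpy residual
    dp = N * R * T / V - P0          # pressure residual
    # Rows of inv(J): (1/CP, 0) and (V/(CP*T), -V**2/(N*R*T)).
    T_new = T - dH / CP
    V_new = V - (V / (CP * T) * dH - V**2 / (N * R * T) * dp)
    return T_new, V_new


def iterate(T, V, nsteps):
    """Return the list of iterates [(T_0, V_0), ..., (T_nsteps, V_nsteps)]."""
    history = [(T, V)]
    for _ in range(nsteps):
        T, V = newton_step(T, V)
        history.append((T, V))
    return history


if __name__ == "__main__":
    for k, (T, V) in enumerate(iterate(298.15, 0.001, 7)):
        print(k, T, V)
```

Calling `newton_step(298.15, 0.01)` returns a negative volume, which is the failure the last exercise asks about.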

1.4 Epilogue 

I have in this little text sought to establish a fairly rigorous derivation of the energy balance for an idealized plug flow reactor. It is neither highly sophisticated nor does it require advanced mathematics. Still, it is not of a kind that is eagerly agreed upon by the chemical engineering community, be it professors, students or working professionals. Many people find the painstaking calculations of differentials and partial derivatives confusing and of little practical interest, but the latter is definitely wrong. The very fact that $\nabla T$, $\nabla v$ and $\nabla c_i$ are solution variables of a set of model equations whereas $\partial_T h$, $\partial_v h$ and $\partial_{c_i} h$ are explicit (or sometimes implicit) state functions establishing the coefficient matrix of the model equations is so important that it can hardly be overemphasized.

The culprit in this controversy might be the teaching of $dy/dx = y'$ in high school mathematics. By doing so the students learn that $dy/dx$ is synonymous with $y' \mathrel{\hat{=}} (\partial y/\partial x)$ and that the rest of the story is just syntactic sugar. For one-variable systems I can agree that the difference is subtle, but for many-variable systems it is not. The discussion has much in common with the use of substantial derivatives in fluid mechanics which says: $dy/dt = (\partial y/\partial t) + (\partial y/\partial x_1) \, dx_1/dt + (\partial y/\partial x_2) \, dx_2/dt + \cdots$. In this case I think it can hardly be misunderstood that $dy/dt$ and $(\partial y/\partial t)$ are different mathematical objects, and very different ones as well.

5.11.5 Verbatim: “for lc rc.py” 

"""
@summary: Demonstrate Banach's fix point theorem on recursive function iteration
          using the recurrence formula::

              x_k+1 = x_k**2

          on one single starting value and on a list of many starting values.

@since:   September 2011 (THW)
"""

# Converging a single starting value.
#
def rc(arg, fun, err, seq=[]):
    if err(arg, fun): rc(fun(arg), fun, err, seq)
    seq.insert(0, arg)
    return seq

def myfun(arg):
    return arg**2  # recurrence formula

def myerr(arg, fun):
    if abs(arg-fun(arg)) > 0: return True  # convergence criterion
    return False

args = rc(0.999, myfun, myerr)  # using functions as first class variables

print "\nRecursion of x_k+1 = x_k**2 starting at " + str(0.999) + ":\n"
print args

# Converging a list of starting values.
# Note that function 'rc' stays the same as in the single variable case!
#
def rc(arg, fun, err, seq=[]):
    if err(arg, fun): rc(fun(arg), fun, err, seq)
    seq.insert(0, arg)
    return seq

def myfun(arg):
    return [x**2 for x in arg]  # vectorized recurrence formula

def myerr(arg, fun):
    if max([abs(x-fx) for (x, fx) in zip(arg, fun(arg))]) > 0: return True  # ditto
    return False

args = rc([0.9, 0.99, 0.999], myfun, myerr)  # vectorized input

print "\n\nRecursion of x_k+1 = x_k**2 starting at " + \
      str([0.9, 0.99, 0.999]) + \
      ":\n"

for x in args:
    print x
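The same fixed-point iteration can also be written as a loop. The Python 3 sketch below is an alternative formulation, not a replacement for the recursive listing. The name `fixed_point` and the `maxiter` guard are assumptions of this example. It stops when an iterate reproduces itself exactly, which for x_k+1 = x_k**2 with a start value below one happens once the value has underflowed to zero.

```python
def fixed_point(x, fun, maxiter=1000):
    """Iterate x <- fun(x) until fun(x) == x; return all iterates in order."""
    seq = [x]
    for _ in range(maxiter):
        x_next = fun(x)
        if x_next == x:          # same criterion as abs(arg - fun(arg)) > 0
            break
        seq.append(x_next)
        x = x_next
    return seq


if __name__ == "__main__":
    # Single starting value; x * x is used for the square.
    print(fixed_point(0.999, lambda x: x * x))
    # List of starting values: only the recurrence function changes.
    for row in fixed_point([0.9, 0.99, 0.999], lambda xs: [x * x for x in xs]):
        print(row)
```

Because the list of iterates is created inside the function, repeated calls do not share state, which a default argument such as `seq=[]` would do.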



Solving a Set of Non-Linear Equations





"... one of the main causes of the fall of the Roman Empire was that, lacking zero, they had no way to indicate successful termination of their C programs."

Robert Firth 

Assignments 

1. Write a procedure solve for solving a set of linear equations using the 

Row-Reduced-Echelon form of matrix A. Hint: For the linear equation 

system A X = B we get rref([A | B]) = [I | X] according to the 

definition of rref. Object B is a "matrix" in this case. If it so happens that 

B has a single column b we end up with the special case A x = b, but 

there is not much to save, neither in time nor in programming lines, from 

disregarding the general solution. So, go for it! Use the stub program 

solve.py as template. 

2. Linearize the energy balance and the pressure specification of the Plug 

Flow Reactor. Combine it with the mass balance into one simultaneous 

set of linear(ized) equations. Write a solver that iterates on T, v, c_1, 

c_2, ... to find a thermodynamic state which is constrained by h, p, 

c_1, c_2, .... Use the stub program hpn.py as template. 

3. It can also be worthwhile to program the matrix (inner) product for later

use. Use the stub program mprod.py as template.

Continue reading about The energy balance if you need further guidance to the 

understanding of energy, enthalpy, thermodynamics and the mapping between 

different co-ordinate systems. 
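As a pointer for assignment 3, the matrix product of two lists of lists can be written compactly in Python 3. This is a minimal sketch: the function name follows the stub file name mprod.py, but the body and the choice of `ValueError` are assumptions and not the course's reference solution.

```python
def mprod(amat, bmat):
    """Return the matrix product amat * bmat for matrices stored as lists of rows."""
    if not amat or not bmat or len(amat[0]) != len(bmat):
        raise ValueError("inner dimensions of amat and bmat do not match")
    # zip(*bmat) yields the columns of bmat; each entry is a row-column inner product.
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*bmat)]
            for row in amat]


if __name__ == "__main__":
    print(mprod([[1, 2], [3, 4]], [[5, 6], [7, 8]]))
```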



5.13.1 Robert Firth, see also Sec. 2.29 

First reference occurs in 2000 languages, see Section 2.29 on page 165. 


5.13.2 Verbatim: “solve.py” 

"""
@summary: Calculate xmat from amat * xmat = bmat.

@since:   2011.08.30 (THW)
@version: 0.8

"""

def solve(amat, bmat, debug=False):
    """
    Solve the linear equation system amat * xmat = bmat using rref(augm) where
    augm = [amat | bmat] is the row augmented matrix [amat[0]+bmat[0], ...].

    @param bmat: Right hand specification given as a list of lists of numbers

    @type bmat:
    @type debug:

    @return: aList[aList[aFloat, aFloat, aFloat]]
             e.g. [[1.0, 2.0, ...], [3.0, 4.0, ...], [5.0, 6.0, ...], ...]
    """

    # Row-reduced-echelon-form.

    if not (amat) or not (amat[0]):
        pass  # raise exception

    if len(amat) != len(amat[0]):
        pass  # raise exception

    if not (bmat) or not (bmat[0]):
        pass  # raise exception

    if len(bmat) != len(amat[0]):
        pass  # raise exception

    augm = pass  # augmented matrix [amat | bmat]

    augm, rank, pivots = rref(augm, debug)

    if rank != len(amat):
        pass  # raise exception

    return pass  # return solution
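To make the idea rref([A | B]) = [I | X] concrete, the Python 3 sketch below combines a small Gauss-Jordan reduction with the solver. The course's own `rref` is not shown in this section, so the `rref` here is a stand-in written for this example (partial pivoting, a fixed tolerance `tol`, no `debug` argument), and the use of `ValueError` is likewise an assumption.

```python
def rref(mat, tol=1.0e-12):
    """Return (reduced matrix, rank, pivot columns) by Gauss-Jordan elimination."""
    a = [row[:] for row in mat]                 # work on a copy
    nrow, ncol = len(a), len(a[0])
    pivots = []
    r = 0                                       # next pivot row
    for c in range(ncol):
        if r == nrow:
            break
        p = max(range(r, nrow), key=lambda i: abs(a[i][c]))  # partial pivoting
        if abs(a[p][c]) < tol:
            continue                            # no usable pivot in this column
        a[r], a[p] = a[p], a[r]
        piv = a[r][c]
        a[r] = [x / piv for x in a[r]]          # scale the pivot row to 1
        for i in range(nrow):
            if i != r:
                f = a[i][c]
                a[i] = [x - f * y for x, y in zip(a[i], a[r])]
        pivots.append(c)
        r += 1
    return a, len(pivots), pivots


def solve(amat, bmat):
    """Solve amat * xmat = bmat via rref([amat | bmat]) = [I | xmat]."""
    n = len(amat)
    if any(len(row) != n for row in amat) or len(bmat) != n:
        raise ValueError("amat must be square and match the rows of bmat")
    augm = [ra + rb for ra, rb in zip(amat, bmat)]   # row augmented matrix
    red, rank, pivots = rref(augm)
    if pivots != list(range(n)):                     # amat is singular
        raise ValueError("amat is singular")
    return [row[n:] for row in red]                  # right block is xmat


if __name__ == "__main__":
    print(solve([[2.0, 1.0], [1.0, 3.0]], [[3.0], [5.0]]))
```

Checking the pivot columns rather than the rank alone avoids accepting a singular `amat` whose augmented matrix happens to pick up a pivot in the `bmat` block.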


5.13.3 Verbatim: “hpn.py” 

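"""
@summary: Solve (H, p, N1, N2, ..., N5) versus (T, V, N1, N2, ..., N5) for the
          ideal gas equation of state. The pertinent equations are::

              H       = sum_i ( h(T)_i * N_i )
              h(T)_i  = h_0_i + int_{T0}^{T} ( cp(T)_i * dT )
              Cp      = sum_i ( cp(T)_i * N_i )
              cp(T)_i = c_1_i + c_2_i*t + c_3_i*t**2 + c_4_i*t**3
              p       = ntot*R * T / V
              ntot    = sum_i ( N_i )

          The strategy is to implement a standard Newton-Raphson iterator and
          solve::

              (tvn)^{k+1} = (tvn)^{k} + d(tvn)
              d(tvn)      = inv(jac) * (y1 - hpn)

          repeatedly until the norm of d(tvn) is not decreasing anymore. On
          the right side 'y1' is a given constraint "matrix"::

                    [[ H1   ],
                     [ p1   ],
              y1  =  [ N1_1 ],
                     [ N1_2 ],
                     ...
                    ]

          and 'hpn' is a similarly shaped "matrix" of ideal gas properties
          calculated as functions of T, V, and N_1, ... N_5::

                    [[ H  ],
                     [ p  ],
              hpn =  [ N1 ],
                     [ N2 ],
                     ...
                    ]

          The Jacobian of H, p, N1, N2, ... with respect to T, V, N1, N2, ...
          is on the form::

                    [[ dH/dT,  dH/dV,  dH/dN1,  dH/dN2,  ... ],
                     [ dp/dT,  dp/dV,  dp/dN1,  dp/dN2,  ... ],
              jac =  [ dN1/dT, dN1/dV, dN1/dN1, dN1/dN2, ... ],
                     [ dN2/dT, dN2/dV, dN2/dN1, dN2/dN2, ... ],
                     ...
                    ]

@since:   2011.10.13 (THW)
@version: 0.0.1
@todo 1.0:

@note:

Test the program entering one of the following lines from the command line::

    >>> python hpn.py
    >>> python hpn.py ...
    >>> python hpn.py ... ...

    H1   = final enthalpy [10^5 J]
    p1   = final pressure [kbar]
    N1_1 = final mole number of component 1 [mol]
    ...
    N1_5 = final mole number of component 5 [mol]
    T0   = initial temperature [kK]
    V0   = initial volume [dm3]
    N0_1 = initial mole number of component 1 [mol]
    ...
    N0_5 = initial mole number of component 5 [mol]

"""

import tkp4106

def hpn_vs_tvn_solver(y1, x0, eps=1.0e-8, maxiter=50):

    fix_rgas  = 0.083145119843087              # gas constant
    var_t     = x0[0][-1]                      # temperature [kK]
    var_v     = x0[1][-1]                      # volume [dm3]
    var_n     = [ni[-1] for ni in x0[2:]]      # mole numbers [mol]
    par_h0    = [-.45898, 0.00000, 0.00000, 0.00000, -.74520]  # h0 [10^5 J/mol]
    par_c1_cp = [0.27310, 0.31150, 0.27140, 0.20786, 0.01925]  # Cp coefficient
    par_c2_cp = [0.23830, -.13570, 0.09274, 0.00000, 0.52130]  # Cp coefficient
    par_c3_cp = [0.17070, 0.26800, -.13810, 0.00000, 0.11970]  # Cp coefficient
    par_c4_cp = [-.11850, -.11680, 0.07645, 0.00000, -.11320]  # Cp coefficient

    converged = False       # convergence flag
    norm      = 1.0         # convergence control variable
    ni        = 0           # number of iterations
    nc        = len(var_n)  # number of components in mixture

    while not converged:
        ni += 1

        t = var_t
        v = var_v
        n = var_n
        r = fix_rgas

        ntot = sum(n)

        # Initialization of enthalpy and its derivatives.
        state_h   = 0.0
        state_h_t = 0.0
        state_h_v = 0.0
        state_h_n = [0.0] * nc

        state_p   = pass  # p(T,V,n)
        state_p_t = pass  # (dp/dT)_{V,n}
        state_p_v = pass  # (dp/dV)_{T,n}
        state_p_n = pass  # (dp/dn[i])_{T,V,n[j]}

        state_n   = n
        state_n_t = pass  # (dn/dT)_{V,n}
        state_n_v = pass  # (dn/dV)_{T,n}
        state_n_n = [int(i==j) for i in xrange(0, nc) for j in xrange(0, nc)]

        t0 = 0.29815  # standard state temperature

        for i in xrange(0, nc):
            h_t_i = par_h0[i] + \
                    pass + \
                    pass + \
                    pass + \
                    pass                       # int_{t0}^{T} cp[i](t) dt
            cp_i  = par_c1_cp[i] + \
                    par_c2_cp[i] * t + \
                    par_c3_cp[i] * t**2 + \
                    par_c4_cp[i] * t**3        # cp[i](T)
            state_h      += pass  # H(T,V,n)
            state_h_t    += pass  # (dH/dT)_{V,n}
            state_h_v    += pass  # (dH/dV)_{T,n}
            state_h_n[i]  = pass  # (dH/dn[i])_{T,V,n[j]}

        # 'nj' (not 'ni') is used as loop variable: in Python 2 the variable of
        # a list comprehension would otherwise overwrite the iteration counter.
        hpn = [[state_h]] + [[state_p]] + [[nj] for nj in state_n]

        dh = [state_h_t] + [state_h_v] + state_h_n  # dH/d(T,V,n)
        dp = pass                                   # dp/d(T,V,n)
        dn = [ \
               [state_n_t[i]] +
               [state_n_v[i]] + \
               state_n_n[i*nc:(i+1)*nc] for i in xrange(0, nc)\
             ]                                      # dn/d(T,V,n)

        jac = pass  # d(H,p,n)/d(T,V,n)

        dy  = pass  # y1 - (H,p,n)
        dx  = tkp4106.solve(jac, dy)
        tmp = max([abs(dxi[-1]) for dxi in dx])
        converged = abs(tmp) < eps and abs(tmp) >= norm
        norm = abs(tmp)
        print "norm=%8.3g;" % (norm,)
        if not converged and ni >= abs(maxiter):
            raise SyntaxError("max iterations (%s) exceeded" % (ni,))
        var_t += pass  # update temperature
        var_v += pass  # update volume
        var_n  = pass  # update mole numbers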

61 >>> python hpn . py 

62 >>> python hpn . py . . . 

63 >>> python hpn . py . . . . . . 

64 

65 H1 = f i n a l enthalpy [10ˆ5 J ] 

66 p1 = f i n a l p r e s s u r e [ kbar ] 

67 N1 1 = f i n a l mole number o f component 1 [ mol ] 

68 . . . 

69 N1 5 = f i n a l mole number o f component 5 [ mol ] 

70 T0 = i n i t i a l temperature [ kK ] 

71 V0 = i n i t i a l volume [ dm3 ] 

72 N0 1 = i n i t i a l mole number o f component 1 [ mol ] 

73 . . . 

74 N0 5 = i n i t i a l mole number o f component 5 [ mol ] 

75 

76 ””” 

77 

78 import tkp4106 

79 

def hpn_vs_tvn_solver(y1, x0, eps=1.0e-8, maxiter=50):

    fix_rgas  = 0.083145119843087                              # gas constant
    var_t     = x0[0][-1]                                      # temperature [kK]
    var_v     = x0[1][-1]                                      # volume [dm3]
    var_n     = [ni[-1] for ni in x0[2:]]                      # mole numbers [mol]
    par_h0    = [-.45898, 0.00000, 0.00000, 0.00000, -.74520]  # h0 [10^5 J/mol]
    par_c1_cp = [0.27310, 0.31150, 0.27140, 0.20786, 0.01925]  # Cp coefficient
    par_c2_cp = [0.23830, -.13570, 0.09274, 0.00000, 0.52130]  # Cp coefficient
    par_c3_cp = [0.17070, 0.26800, -.13810, 0.00000, 0.11970]  # Cp coefficient
    par_c4_cp = [-.11850, -.11680, 0.07645, 0.00000, -.11320]  # Cp coefficient


    ni = 0           # number of iterations
    nc = len(var_n)  # number of components in mixture


        ni += 1

        t = var_t
        v = var_v
        n = var_n
        r = fix_rgas

        ntot = sum(n)

        # Initialization of enthalpy and its derivatives.
        state_h   = 0.0
        state_h_t = 0.0
        state_h_v = 0.0
        state_h_n = [0.0]*nc

        state_p   = pass  # p(T,V,n)
        state_p_t = pass  # (dp/dT)_{V,n}
        state_p_v = pass  # (dp/dV)_{T,n}
        state_p_n = pass  # (dp/dn[i])_{T,V,n[j]}

        state_n   = n
        state_n_t = pass  # (dn/dT)_{V,n}
        state_n_v = pass  # (dn/dV)_{T,n}
        state_n_n = [int(i==j) for i in xrange(0, nc) for j in xrange(0, nc)]

        t0 = 0.29815  # standard state temperature

        for i in xrange(0, nc):
            h_ti = par_h0[i] + \
                   pass + \
                   pass + \
                   pass + \
                   pass  # int_{t0}^{T} cp[i](t) dt
            cp_i = par_c1_cp[i] + \
                   par_c2_cp[i]*t + \
                   par_c3_cp[i]*t**2 + \
                   par_c4_cp[i]*t**3  # cp[i](T)
            state_h      += pass  # H(T,V,n)
            state_h_t    += pass  # (dH/dT)_{V,n}
            state_h_v    += pass  # (dH/dV)_{T,n}
            state_h_n[i]  = pass  # (dH/dn[i])_{T,V,n[j]}

        # The comprehension variable is named 'nk' because a Python 2 list
        # comprehension leaks its variable and would overwrite the counter 'ni'.
        hpn = [[state_h]] + [[state_p]] + [[nk] for nk in state_n]

        dh = [state_h_t] + [state_h_v] + state_h_n  # dH/d(T,V,n)
        dp = pass                                   # dp/d(T,V,n)
        dn = [\
              [state_n_t[i]] +
              [state_n_v[i]] + \
              state_n_n[i*nc:(i+1)*nc] for i in xrange(0, nc)\
             ]                                      # dn/d(T,V,n)

        jac = pass  # d(H,p,n)/d(T,V,n)

        dy = pass   # y1 - (H,p,n)
        dx = tkp4106.solve(jac, dy)
        tmp = max([abs(dxi[-1]) for dxi in dx])
        converged = abs(tmp) < eps and abs(tmp) >= norm
        norm = abs(tmp)
        print "norm=%8.3g;" % (norm,)
        if not converged and ni >= abs(maxiter):
            raise SyntaxError("max iterations (%s) exceeded" % (ni,))
        var_t += pass  # update temperature
        var_v += pass  # update volume
        var_n  = pass  # update mole numbers

    tvn = [[var_t]] + [[var_v]] + [[nk] for nk in var_n]
    hpn = [[state_h]] + [[state_p]] + [[nk] for nk in state_n]

    return [tvn, hpn]

# Test the code.
#
if __name__ == '__main__':

    import hpn
    import sys

    # Read in H1, p1 and n1, plus T0, V0 and n0 from the command line.
    if len(sys.argv) == 7+7+1:
        x0 = [[float(x0i)] for x0i in sys.argv[8:]]   # T, V, n
        y1 = [[float(y1i)] for y1i in sys.argv[1:8]]  # H, p, n

    # Read in H1, p1 and n1 from the command line. Use default T0, V0 and n0.
    elif len(sys.argv) == 7+1:
        x0 = [[0.29815], [0.001], [2.0], [1.5], [0.5], [3.0], [1.0]]  # T, V, n
        y1 = [[float(y1i)] for y1i in sys.argv[1:]]                   # H, p, n

    # Use default H1, p1 and n1, plus default T0, V0 and n0.
    else:
        x0 = [[0.29815], [0.001], [2.0], [1.5], [0.5], [3.0], [1.0]]  # T, V, n
        y1 = [[0], [0.1], [1.0], [2.5], [1.5], [2.0], [3.0]]          # H, p, n

    tvn, hpn = hpn.hpn_vs_tvn_solver(y1, x0)

    print '\nInput:'
    print "T0=%12.6g; V0=%12.6g; n0=%s;" % (x0[0][-1], x0[1][-1], x0[2:])
    print "H1=%12.6g; p1=%12.6g; n1=%s;" % (y1[0][-1], y1[1][-1], y1[2:])

    print '\nOutput:'
    print "T =%12.6g; V =%12.6g; n =%s;" % (tvn[0][-1], tvn[1][-1], tvn[2:])
    print "H =%12.6g; p =%12.6g; n =%s;" % (hpn[0][-1], hpn[1][-1], hpn[2:])
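The listing leaves the state functions and the Newton update as exercises. The sketch below shows the same kind of iteration on a much smaller problem. It is an illustration under stated assumptions, not the solution to hpn.py: it assumes one ideal-gas component with a constant heat capacity, and the names newton_hp and cp, as well as the 2x2 solve by Cramer's rule (standing in for tkp4106.solve), are made up for this example. It is written so that it also runs under Python 3.

```python
def newton_hp(h1, p1, n, t, v, eps=1.0e-8, maxiter=50):
    """Solve H(T,V,n) = h1 and p(T,V,n) = p1 for T and V by Newton's method.

    Assumptions (not taken from hpn.py): one ideal-gas component with a
    constant heat capacity 'cp', using the units of the listing.
    """
    r = 0.083145119843087  # gas constant [kbar dm3/(mol kK)]
    t0 = 0.29815           # standard state temperature [kK]
    h0 = -0.45898          # enthalpy of formation [10^5 J/mol]
    cp = 0.35              # assumed constant heat capacity [10^5 J/(mol kK)]

    for ni in range(1, maxiter + 1):
        # State functions and their derivatives with respect to T and V.
        h, h_t, h_v = n * (h0 + cp * (t - t0)), n * cp, 0.0
        p, p_t, p_v = n * r * t / v, n * r / v, -n * r * t / v**2

        # Residual dy = y1 - (H, p) and Newton step from jac * dx = dy,
        # solved with Cramer's rule for the 2x2 system.
        dh, dp = h1 - h, p1 - p
        det = h_t * p_v - h_v * p_t
        dt = (dh * p_v - h_v * dp) / det
        dv = (h_t * dp - p_t * dh) / det

        t += dt
        v += dv
        if max(abs(dt), abs(dv)) < eps:
            return t, v, ni

    raise ArithmeticError("max iterations (%s) exceeded" % (maxiter,))
```

Calling newton_hp(h1, p1, 2.0, 0.3, 0.5) with a consistent pair (h1, p1) returns the temperature, the volume and the number of iterations used. Because H is linear in T for a constant heat capacity, the temperature is found in the first step and the remaining iterations refine the volume.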


5.13.4 Verbatim: “mprod.py” 

"""
@summary: Calculate the full matrix product cmat = amat * bmat.





@since:   2011.08.30 (THW)
@version: 0.9


"""

def mprod(amat, bmat, debug=False):
    """
    Matrix multiplication of amat * bmat = cmat.

    @param amat:
    @param bmat:
    @param debug:


    @type bmat:
    @type debug:

    @return: aList[aList[aFloat, aFloat, aFloat]]
             e.g. [[1.0, 2.0, ...], [3.0, 4.0, ...], [5.0, 6.0, ...], ...]
    """



    if not(bmat) or not(bmat[0]):


    if len(bmat) != len(amat[0]):


    # Output matrix has dimension: rows(amat) x columns(bmat).
    cmat = [[0 for b in bmat[0]] for a in amat]

    for i pass              # rows in amat = rows in cmat
        for j pass          # columns in bmat = columns in cmat
            for k pass      # columns in amat = rows in bmat
                pass        # calculate cmat[i][j]

    return cmat
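The triple loop indicated in the listing can be completed in several ways. The following is one possible sketch, not the reference solution: the name mprod_sketch and the choice of ValueError for the dimension checks are assumptions, since the error handling of mprod.py is not shown.

```python
def mprod_sketch(amat, bmat):
    """Return the matrix product amat * bmat as a list of row lists."""
    if not amat or not amat[0] or not bmat or not bmat[0]:
        raise ValueError("empty matrix")            # assumed error type
    if len(bmat) != len(amat[0]):
        raise ValueError("columns(amat) must equal rows(bmat)")

    # Output matrix has dimension: rows(amat) x columns(bmat).
    cmat = [[0 for b in bmat[0]] for a in amat]

    for i in range(len(amat)):              # rows in amat = rows in cmat
        for j in range(len(bmat[0])):       # columns in bmat = columns in cmat
            for k in range(len(bmat)):      # columns in amat = rows in bmat
                cmat[i][j] += amat[i][k] * bmat[k][j]

    return cmat

print(mprod_sketch([[1.0, 2.0], [3.0, 4.0]], [[5.0], [6.0]]))  # [[17.0], [39.0]]
```

The innermost index k runs over the shared dimension, which is why the number of columns in amat must equal the number of rows in bmat.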


5.13.5 The energy balance, see also Sec. 5.11.4 

First reference occurs in The energy balance, see Section 5.11.4 on page 414. 


The Plug Flow Reactor 






At a computer expo (COMDEX), Bill Gates reportedly compared the computer industry with the auto 

industry and stated: "If GM had kept up with the technology like the computer industry has, we would all 

be driving $25.00 cars that got 1,000 miles to the gallon." In response to Bill's comments, General Motors 

issued a press release (by Mr. Welch himself) stating: If GM had developed technology like Microsoft, we 

would all be driving cars with the following characteristics: 

For no reason at all, your car would crash twice a day. 

Every time they repainted the lines on the road, you would have to buy a new car. 

Only one person at a time could use the car, unless you bought Car95 or CarNT, and then added 

more seats. 

Apple would make a car powered by the sun, reliable, five times as fast, and twice as easy to drive, 

but would run on only five per cent of the roads. 

The airbag would say 'Are you sure?' before going off. 

Occasionally, for no reason, your car would lock you out and refuse to let you in until you 

simultaneously lifted the door handle, turned the key, and grabbed the radio antenna. 

You would press the start button to shut off the engine. 

• • • 

General Motors vs. Bill Gates 

Assignments 

1. Download the thermodynamics module srk_ammonia.py. 

2. Download the flowsheet module flowsheet.py. 

3. a. Download the ammonia reactor module ammonia_reactor.py. 

b. Beware the integrating namespace tkp4106.py. 

c. Finish the initialization of p(V) = p0 in Section 2. 

d. Make sure the equation is solved correctly. 

4. Run ammonia_reactor.py from the command line:

python ammonia_reactor.py rk2 explicit 12 1

until you hit an error in the integration method hpn_vs_tvn_integrator(), confer Section 6 in that file. You may have a problem getting past feed.get_cfw() in Section 3. That is probably because you have not implemented tkp4106.molecular_weight(), which is referenced in srk_ammonia.py. Fix this problem by hard-coding the missing values in place.

Start reading about Modelling issues to understand the three physical principles (energy, momentum, mass) that lie behind the Plug-Flow-Reactor model, and also the meaning of linearization.
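One way of hard-coding the missing values is sketched below. The return convention of tkp4106.molecular_weight() is not shown in these notes, so the sketch replaces the whole mw lambda of srk_ammonia.py rather than the library function. The names MOLAR_MASS and mw are illustrative; the numbers are standard molar masses in g/mol, scaled to the mass unit used in the model.

```python
# Hypothetical stand-in for the 'mw' lambda in srk_ammonia.py.
# Standard molar masses [g/mol] of the five hard-coded model components.
MOLAR_MASS = {
    'NH3': 17.031,
    'N2':  28.014,
    'H2':   2.016,
    'Ar':  39.948,
    'CH4': 16.043,
}

def mw(formula):
    """Molar mass in the model's mass unit [10^10 g/mol]."""
    return 1.0e-10 * MOLAR_MASS[formula]
```

A quick sanity check is the mass balance of the ammonia synthesis reaction: mw('N2') + 3*mw('H2') should equal 2*mw('NH3').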



Bill Gates and General Motors 

At a recent computer expo (COMDEX), Bill Gates reportedly compared the computer 

industry with the auto industry and stated: "If GM had kept up with technology like the 

computer industry has, we would all be driving $25 cars that get 1,000 miles to the gallon."

In response to Bill's comments, General Motors issued a press release (From Mr. Welch 

himself): "If GM had developed technology like Microsoft, we would all be driving cars 

with the following characteristics:" 

1. For no reason whatsoever, your car would crash twice a day. 

2. Every time they repainted the lines on the road, you would have to buy a new car. 

3. Occasionally your car would die on the freeway for no reason, and you would just 

accept this, restart and drive on. 

4. Occasionally, executing a maneuver such as a left turn, would cause your car to 

shut down and refuse to restart, in which case you would have to reinstall the 

engine. 

5. Only one person at a time could use the car, unless you bought "Car95" or 

"CarNT." But then you would have to buy more seats. 

6. Macintosh would make a car that was powered by the sun, reliable, five times as 

fast, and twice as easy to drive, but would only run on five percent of the roads. 

7. The oil, water temperature and alternator warning lights would be replaced by a 

single "general car fault" warning light. 

8. New seats would force everyone to have the same size butt. 

9. The airbag system would say "Are you sure?" before going off. 

10. Occasionally for no reason whatsoever, your car would lock you out and refuse to 

let you in until you simultaneously lifted the door handle, turned the key and grabbed

hold of the radio antenna.

11. GM would require all car buyers to also purchase a deluxe set of Rand McNally

Road maps (now a GM subsidiary), even though they would immediately cause the 

car's performance to diminish by 50% or more. Moreover, GM would become a 

target for investigation by the Justice Department. 

12. Every time GM introduced a new model, car buyers would have to learn how to 

drive all over again because none of the controls would operate in the same manner 

as the old car. 

13. You'd press the "start" button to shut off the engine.


5.15.2 Verbatim: “srk_ammonia.py”

"""
@summary: Mock-up thermodynamic class for ammonia reactor calculations.
          Based on ideal gas as a function of T, V, n i.e. *not* T, p, n.
          Pressure has therefore to be iterated if necessary. This is part
          of the training of our students though...





@requires: Python 2.3.5 or higher
@since:    2011.10.04 (THW)
@version:  0.6
@todo 1.0:

@note:     Bla-bla.
"""

import math

class Model:
    '''Ideal gas implemented on the form of Helmholtz energy A(T, V, nvec).'''
    def __init__(self, args):

        # Turn component names into lower case before any comparisons are made.
        args = [arg.lower() for arg in args]

        # from string import lower  # alternative conversion
        # args = map(lower, args)   # alternative conversion

        # The model component list is hard-coded. This may change in the future,
        # but so far we must live with the hack.
        import tkp4106

        # Molecular weight [10^10 g/mol].
        mw = lambda str: \
            [1.0e-10*n/m for (n, m) in [tkp4106.molecular_weight(str)]].pop()

        cfw = [('ammonia',  'NH3', mw('NH3')),
               ('nitrogen', 'N2',  mw('N2')),
               ('hydrogen', 'H2',  mw('H2')),
               ('argon',    'Ar',  mw('Ar')),
               ('methane',  'CH4', mw('CH4'))]

        tmp = [c for (c, f, w) in cfw]

        # Check that given components are in range of model.
        for arg in args:
            if not arg in tmp:
                raise SyntaxError("unknown component '%s'" % (arg,))

        tmp = [c for (c, f, w) in cfw if c in args]

        # Check that given components are in correct order.
        if not tmp == args:
            print 'Warning: component list: %s\n' \
                  ' reordered to: %s' % (args, tmp)

        # Select values from list 'v' being True in list 'b'.
        compact = lambda b, v: [vi for (bi, vi) in zip(b, v) if bi]

        # Make Boolean flags (True|False) for extraction of data.
        flags = [c in args for (c, f, w) in cfw]
        self.cfw = compact(flags, cfw)  # extract (component name, formula, mw)s
        nc = len(self.cfw)              # number of chemical components in mixture

        # Enthalpies of formation and standard entropies from DIPPR (1996) data-
        # base. Heat capacity parameters from Reid, Poling and Prausnitz (1987)
        # book. These are the data needed for calculating other state variables.
        # The units are:
        #   temperature [kK]
        #   pressure    [kbar]
        #   volume      [dm3]
        #   mole number [mol]
        #   energy      [10^5 J]
        #   mass        [10^10 g]
        #   time        [s]
        # The reason for these odd choices is numerical stability.
        #
        self.dict = {\
            'fix_rgas'  : 0.083145119843087,
            'var_t'     : 0.29815,
            'var_v'     : 0.001,
            'var_n'     : [1.0]*nc,
            'par_h0'    : compact(flags, [-.45898, 0.00000, 0.00000, 0.00000, -.74520]),
            'par_s0'    : compact(flags, [1.92660, 1.91500, 1.30571, 1.54737, 1.86270]),
            'par_c1_cp' : compact(flags, [0.2731, 0.3115, 0.27140, 0.20786, 0.01925]),
            'par_c2_cp' : compact(flags, [0.2383, -.1357, 0.09274, 0.00000, 0.52130]),
            'par_c3_cp' : compact(flags, [0.1707, 0.2680, -.13810, 0.00000, 0.11970]),
            'par_c4_cp' : compact(flags, [-.1185, -.1168, 0.07645, 0.00000, -.11320])
        }

        # Run self.__call__() to calculate derived 'state' properties.
        self()

    def __call__(self, **args):
        for (k, v) in args.iteritems():
            self.dict[k] = v  # store input arguments (if any)

        t = self.dict['var_t']
        v = self.dict['var_v']
        n = self.dict['var_n']
        r = self.dict['fix_rgas']

        if t

        eye = [int(i==j) for i in xrange(0, nc) for j in xrange(0, nc)]

        # Derivatives of the free variables (T, V, n) with respect to
        # themselves: dT/dT = 1, dV/dV = 1 and dn/dn = identity matrix.
        self.dict['state_t_t'] = 1.0
        self.dict['state_t_v'] = 0.0
        self.dict['state_t_n'] = [0.0]*nc
        self.dict['state_v_t'] = 0.0
        self.dict['state_v_v'] = 1.0
        self.dict['state_v_n'] = [0.0]*nc
        self.dict['state_n_t'] = [0.0]*nc
        self.dict['state_n_v'] = [0.0]*nc
        self.dict['state_n_n'] = eye

        t0 = 0.29815     # reference temperature [kK]
        p0 = 0.00101325  # standard state pressure [kbar]

        self.dict['state_p']   = ntot*r*t/v
        self.dict['state_p_t'] = ntot*r/v
        self.dict['state_p_v'] = -ntot*r*t/v**2
        self.dict['state_p_n'] = [r*t/v]*nc
        self.dict['state_h']   = 0.0
        self.dict['state_h_t'] = 0.0
        self.dict['state_h_v'] = 0.0
        self.dict['state_h_n'] = [0.0]*nc
        self.dict['state_mu']  = [0.0]*nc
        self.dict['state_mu0'] = [0.0]*nc


        for i in xrange(0, nc):
            h_ti = self.dict['par_h0'][i] + \
                   self.dict['par_c1_cp'][i]*(t-t0) + \
                   self.dict['par_c2_cp'][i]*(t**2-t0**2)/2.0 + \

                   self.dict['par_c4_cp'][i]*(t**4-t0**4)/4.0
            cp_i = self.dict['par_c1_cp'][i] + \
                   self.dict['par_c2_cp'][i]*t + \
                   self.dict['par_c3_cp'][i]*t**2 + \
                   self.dict['par_c4_cp'][i]*t**3
            s_ti = self.dict['par_s0'][i] + \
                   self.dict['par_c1_cp'][i]*math.log(t/t0) + \
                   self.dict['par_c2_cp'][i]*(t-t0) + \

                   self.dict['par_c4_cp'][i]*(t**3-t0**3)/3.0
            self.dict['state_h']       += h_ti*n[i]
            self.dict['state_h_t']     += cp_i*n[i]
            self.dict['state_h_n'][i]   = h_ti
            self.dict['state_mu'][i]    = h_ti - t*s_ti + r*t*math.log(n[i]*r*t/v/p0)
            self.dict['state_mu0'][i]   = h_ti - t*s_ti

        return True

    def __getitem__(self, key):
        return self.dict[key]

    def __setitem__(self, key, val):
        self.dict[key] = val
        return None

    def __str__(self):
        return 'T=%8.6f; p=%8.6f; H=%8.5f; V=%8.6f' % \
               (self.dict['state_t'],
                self.dict['state_p'],
                self.dict['state_h'],
                self.dict['state_v'])

    def get_cfw(self):
        return self.cfw
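The enthalpy in the Model class is the term-by-term integral of the cubic heat capacity polynomial from the reference temperature t0 to t. The sketch below illustrates that relation for a single component. The function names cp_poly and h_poly are assumptions made for this example, and the ammonia parameters are copied from the first entry of each list in the listing.

```python
def cp_poly(c, t):
    """Heat capacity cp(T) = c1 + c2*T + c3*T**2 + c4*T**3."""
    return c[0] + c[1]*t + c[2]*t**2 + c[3]*t**3

def h_poly(h0, c, t, t0=0.29815):
    """Enthalpy h0 + integral of cp from t0 to t, integrated term by term."""
    return h0 + sum(ck * (t**(k + 1) - t0**(k + 1)) / (k + 1)
                    for (k, ck) in enumerate(c))

# Ammonia parameters from the listing: [c1, c2, c3, c4] and h0 [10^5 J/mol].
c_nh3 = [0.2731, 0.2383, 0.1707, -0.1185]
h0_nh3 = -0.45898

# Consistency check: dH/dT must reproduce cp (central difference).
d = 1.0e-4
slope = (h_poly(h0_nh3, c_nh3, 0.7 + d) - h_poly(h0_nh3, c_nh3, 0.7 - d)) / (2*d)
print(abs(slope - cp_poly(c_nh3, 0.7)) < 1.0e-6)
```

The same finite-difference comparison is a cheap way of testing the analytic derivatives state_h_t and state_p_t before they are used in a Jacobian.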


5.15.3 Verbatim: “flowsheet.py” 

"""
@summary: Flowsheet module. UnitParentClass is an 'abstract' class used for
          implementing features that are common to all unit operations (so far
          Stream and Reactor). Common features are (in regular Python syntax)::

              obj['variable_name']          # __getitem__('variable_name')
              obj['variable_name'] = value  # __setitem__('variable_name', value)
              obj()                         # __call__()
              print obj                     # __str__()
              obj.component_list()          # [(name, formula), ...]
              obj.connect(another_obj)      # obj[var_t] = another_obj[var_t], ...
              obj.functor(name, fun, args)  # obj.name(*args) => fun(z, *args)

          The module also contains a collection of functions for calculating the
          pressure drop, heat exchange, kinetics, Jacobian matrix, etc. of a
          unit operation object.






@since:   2011.10.04 (THW)
@version: 0.5
@todo 1.0: Finish methods arrhenius(), tube_and_shell()

@note:
"""

import srk_ammonia


# Unit operation parent class. It should have been an abstract class (that is a
# class without a constructor), but this is not straightforward in Python. Note
# that 'UnitParentClass' represents a thermodynamic state object, it is *NOT* a
# flow object since there is no concept of time here.
class UnitParentClass:
    '''Base class for unit operation objects.'''
    def __init__(self, tag, module, component_list):
        self.model = module.Model(component_list)
        self.tag = tag
        self.module = module
        self.functors = {}

    def __getitem__(self, key):
        return self.model[key]

    def __setitem__(self, key, val):
        self.model[key] = val
        return None

    def __call__(self, **args):
        return self.model(**args)

    def __str__(self):
        return "'" + self.tag + "'; " + str(self.model)

    def get_cfw(self):
        return self.model.get_cfw()

    def get_module(self):
        return self.module

    def connect(self, arg):
        self.model['var_t'] = arg.model['var_t']
        self.model['var_v'] = arg.model['var_v']
        self.model['var_n'] = arg.model['var_n']
        self.model()
        for (name, fun) in arg.functors.iteritems():
            setattr(self.__class__, name, fun)

    def functor(self, *args):
        fun = lambda self, x=None: args[1](self, x, *args[2])
        setattr(self.__class__, args[0], fun)
        self.functors[args[0]] = fun
        return self

    def duplicate(self, tag, arg={}):
        component_list = [name for (name, formula, mw) in self.get_cfw()]
        module = self.get_module()
        obj = self.__class__(tag, module, component_list)
        obj.connect(self)
        return obj


# Derived process Stream class.
class Stream(UnitParentClass):
    '''Syntactic sugar.'''
    pass

# Derived chemical Reactor class.
class Reactor(UnitParentClass):
    '''Syntactic sugar.'''
    pass

# Global functions used in reactor simulation. Connect to UnitParentClass object
# using so-called 'lambda'-functions, see method 'functor()' in this file.
def constantpdrop(obj, z, dp):
    """
    Constant pressure drop (dp/dz = constant) along the unit.
    @param obj: unit operation object
    @param z: axial position
    @param dp: pressure drop [kbar] per reactor length
    @type obj: aUnitParentClass
    @type z: aFloat
    @type dp: aFloat
    @return: aFloat
    """
    return dp

def constantcooling(obj, z, duty):
    """
    Constant heat transfer (dQ/dz = constant) along the unit.
    @param obj: unit operation object
    @param z: axial position
    @param duty: heat transfer [1.0e5 J] per reactor length
    @type obj: aUnitParentClass
    @type z: aFloat
    @type duty: aFloat
    @return: aFloat
    """
    return duty

def tube_and_shell(obj, z, ua, t0):
    """
    Heat transfer calculation for a 'tube-and-shell' heat exchanger.




    @return: aFloat
    """
    return ua*(t0 - obj['state_t'])

def constantrate(obj, z, nmat, k):
    """
    Constant reaction rate (r = constant) along the unit.


    @param nmat: reaction stoichiometry matrix
    @param k: extent of reactions (one for each column of nmat)


    @type nmat: aList[aList[aFloat, aFloat, ...]]
    @type k: aList[aFloat, aFloat, ...]
    @return: aList[aFloat, aFloat, ...]
    """
    return [sum([nui*ki for (nui, ki) in zip(nu, k)]) for nu in nmat]

def firstorder(obj, z, nmat, keyc, k):
    """
    First order kinetics with respect to given 'key' components.


    @param nmat: reaction stoichiometry matrix
    @param keyc: key components (one for each column of nmat)
    @param k: rate constants (one for each column of nmat)


    @type nmat: aList[aList[aFloat, aFloat, ...]]
    @type keyc: aList[anInt, anInt, ...]
    @type k: aList[aFloat, aFloat, ...]

    """
    return [\
        sum([nui*obj['state_n'][ci]*ki for (nui, ci, ki) in zip(nu, keyc, k)]) \
        for nu in nmat\
    ]

def arrhenius(obj, z, nmat, keyc, k, a, t0):
    """
    Arrhenius chemical reaction kinetics.





    """
    return [\
        sum([nui*(math.exp(-a/obj['state_t']/obj['fix_rgas'])/math.exp(-a/t0/obj['fix_rgas'
        for nu in nmat\
    ]

# Matrix-like thermodynamic state functions. Explicit in temperature, volume and
# mole numbers.
def hpn_vs_tvn_jacobian(obj, null=None):
    """
    Thermodynamic Jacobian of d(H,p,N1,N2,...)/d(T,V,N1,N2,...) calculated as
    [[dH/dT, dH/dV, dH/dN1, ...], [dp/dT, ...], ...].

    @param null: not used

    @type null: anObject
    @return: aList[aList[aFloat, aFloat, ...]]
    """
    nc = len(obj['state_n'])
    dh = [obj['state_h_t']] + [obj['state_h_v']] + obj['state_h_n']
    dp = [obj['state_p_t']] + [obj['state_p_v']] + obj['state_p_n']
    dn = [\
        [obj['state_n_t'][i]] +
        [obj['state_n_v'][i]] +
        obj['state_n_n'][i*nc:(i+1)*nc] for i in xrange(0, nc)\
    ]
    return [dh] + [dp] + dn

def hpn(obj, null=None):
    """
    Thermodynamic constraint function [[H], [p], [N1], [N2], ...].

    @param null: not used

    @type null: anObject
    @return: aList[[aFloat], [aFloat], ...]
    """
    return [[obj['state_h']]] + \
           [[obj['state_p']]] + [[ni] for ni in obj['state_n']]

# Enthalpy, pressure, composition solver. No fall-back solution for erroneous
# thermodynamic calculations (cross your fingers). This is quite easy to program
# but it causes a mild code bloat and is left as an exercise for the interested
# reader.

def hpn_vs_tvn_solver(obj, y1, eps, maxiter=50):
    """
    Thermodynamic equation solver. Iterates on 'tvn' = (T,V,N1,N2,...) to meet a
    given specification 'y1' = (H,p,N1,N2,...).

    @param y1: [[H], [p], [N1], [N2], ...] specification
    @param eps: convergence criterion (upper bound)
    @param maxiter: maximum number of iterations (negative value implies a fixed
                    number of iterations).

    @type y1: aList[aList[aFloat, aFloat, ...]]
    @type eps: aFloat
    @type maxiter: anInt
    @return: aUnitParentClass
    """


    nc = len(obj['state_n'])  # number of chemical components in mixture
    ni = 0                    # number of iterations

        ni += 1
        dy = pass  # y1 - (h,p,n)
        dx = tkp4106.solve(obj.jac(), dy)
        tmp = max([abs(dxi[-1]) for dxi in dx])
        converged = tmp < eps and tmp >= norm or (ni+maxiter) == 0
        norm = tmp
        if maxiter > 0:
            print "norm=%8.3g; %s;" % (norm, obj)
        if not converged and ni >= abs(maxiter):
            raise ArithmeticError("max iterations (%s) exceeded" % (ni,))
        obj['var_t'] += pass  # dt
        obj['var_v'] += pass  # dv
        obj['var_n']  = pass  # dni
        obj()

    return obj

# Numerical integration of enthalpy, pressure and composition problems. With or
# without chemical reactions.
def hpn_vs_tvn_integrator(method, obj, z0, z1, nz):
    """
    Thermodynamic integrator using Euler, RK2 or RK4 methods. Both explicit and
    implicit update schemes are possible. The lambda function 'obj.update()' is
    supposed to exist and is used to iterate on 'tvn' = (T,V,N1,N2,...) to meet
    a given specification 'y1' = (H,p,N1,N2,...) in one or more iterations. One
    iteration means an explicit update. Iteration till full convergence is also
    possible. This is the implicit update. In calculating the right side of the
    differential equation three other lambda functions must exist: These are
    'obj.heatexchange()', 'obj.pressureprofile()' and 'obj.kinetics()'.
    @author: Stud. Techn. Stig-Erik Nogva

    @param method: 'euler', 'rk2' or 'rk4'

    @param z0: start of integration
    @param z1: end of integration
    @param nz: number of integration steps
    @type method: aString

    @type z0: aNumber
    @type z1: aNumber
    @type nz: aNumber
    @return: theUnitParentClass
    """
    objs = []             # utility list (Runge-Kutta needs intermediate states)
    dz = float(z1-z0)/nz  # integrator step size

    for z in [z0+k*dz for k in xrange(0, nz)]:

        # Calculate right side of ODE on the dot(y) = y(z) form.
        yz = [obj.heatexchange(z)] + \
             [obj.pressureprofile(z)] + obj.kinetics(z)

        if method == 'euler':
            y1 = pass  # (h,p,n) + yz*dz

        elif method == 'rk2':
            while len(objs) < 2:
                tmp = obj.duplicate('RK2_'+str(len(objs)))  # 1 intermediate obj
                objs.append(tmp)

            for i in range(0, len(objs)):
                objs[i].connect(obj)  # connect to master object in every step

            # Obtain 1 auxiliary quantity
            k1 = [yzi*dz for yzi in yz]
            yk2 = [[yi[-1]+k1i] for (yi, k1i) in zip(objs[0].hpn(), k1)]
            objs[0].update(yk2)  # iterate on the intermediate state

            yz2 = [objs[0].heatexchange(z+1.0*dz)] + \
                  [objs[0].pressureprofile(z+1.0*dz)] + \
                  objs[0].kinetics(z+1.0*dz)
            k2 = [yzi*dz for yzi in yz2]
            k = [k1i+k2i for (k1i, k2i) in zip(k1, k2)]

            y1 = [[yi[-1]+(1/float(2))*ki] for (yi, ki) in zip(obj.hpn(), k)]

        elif method == 'rk4':
            while len(objs) < 4:
                tmp = obj.duplicate('RK4_'+str(len(objs)))  # 3 intermediate objs
                objs.append(tmp)

            for i in range(0, len(objs)):
                objs[i].connect(obj)  # connect to master object in every step

            # Obtain the 4 auxiliary quantities
            k1 = [yzi*dz for yzi in yz]
            yk2 = [[yi[-1]+0.5*k1i] for (yi, k1i) in zip(objs[0].hpn(), k1)]
            objs[0].update(yk2)  # iterate on intermediate state 1





            yk3 = [[yi[-1]+0.5*k2i] for (yi, k2i) in zip(objs[1].hpn(), k2)]






            yk4 = [[yi[-1]+k3i] for (yi, k3i) in zip(objs[2].hpn(), k3)]


            yz4 = [objs[2].heatexchange(z)] + \
                  [objs[2].pressureprofile(z)] + objs[2].kinetics(z)

            k = [k1i+2*k2i+2*k3i+k4i for (k1i, k2i, k3i, k4i) \
                 in zip(k1, k2, k3, k4)]

            y1 = [[yi[-1]+(1/float(6))*ki] for (yi, ki) in zip(obj.hpn(), k)]

        else:
            raise NameError('Method "' + method + '"' + ' not implemented yet')

        # Note: 'y1' is the final [[H], [p], [N1], ...] after the step 'dz' is
        # taken. Lambda function 'obj.update()' is responsible for updating the
        # thermodynamic state accordingly.
        obj.update(y1)

        print "z=%5.3f; %s;" % (z+dz, obj)

    return obj
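The 'rk2' branch of the integrator takes a full Euler step to an intermediate state, evaluates the right-hand side again at z+dz, and averages the two increments. This scheme is commonly known as Heun's method. The sketch below shows the same update on a plain list of numbers, without the thermodynamic state objects or the obj.update() iteration. The names heun_step, integrate and decay are assumptions made for this example.

```python
import math

def heun_step(f, z, y, dz):
    """One RK2 (Heun) step for dy/dz = f(z, y) with y given as a list."""
    k1 = [fi * dz for fi in f(z, y)]
    y_mid = [yi + k1i for (yi, k1i) in zip(y, k1)]   # full Euler predictor
    k2 = [fi * dz for fi in f(z + dz, y_mid)]
    return [yi + 0.5 * (k1i + k2i) for (yi, k1i, k2i) in zip(y, k1, k2)]

def integrate(f, y0, z0, z1, nz):
    """Integrate from z0 to z1 in nz equal steps."""
    dz = float(z1 - z0) / nz
    y = list(y0)
    for k in range(nz):
        y = heun_step(f, z0 + k * dz, y, dz)
    return y

# Test problem with a known solution: dy/dz = -y, y(0) = 1, y(1) = exp(-1).
decay = lambda z, y: [-yi for yi in y]
y_end = integrate(decay, [1.0], 0.0, 1.0, 100)
print(abs(y_end[0] - math.exp(-1.0)) < 1.0e-4)
```

Halving the step size should reduce the error by roughly a factor of four, which is a convenient check that a second-order method has been coded correctly.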


5.15.4 Verbatim: “ammonia_reactor.py”

"""
@summary: A simple ammonia reactor calculation illustrating some principles
          of OOP (Object Oriented Programming) in chemical engineering::

              'feed'       ----------------       'outlet'
            )------------> | ... 'rx' ... | --------------> (
                           ----------------

          The outcome of the study is a converged feed stream and an
          integrated outlet from the reactor.

@since: 2011.10.04 (THW)
@version: 0.6
@todo 1.0:

@note: This module defines the reaction chemistry (kinetics) and heat
       transport for a minimal setup of an ammonia reactor. Nothing very
       fancy, but there are 7 things to learn (see item numbering in
       source code). From the command line run this script as::

       >>> python ammonia_reactor.py 'euler|rk2|rk4' \
                                     'implicit|explicit' \


       nz = number of integration steps.
       maxiter = maximum number of iterations spent on the thermodynamic
       state calculations. If maxiter < 0 then exactly abs(maxiter)
       iterations will be used independent of the residuals norm.
"""

import math  # used by math.exp() in item 7

import srk_ammonia
import flowsheet
import tkp4106  # used by tkp4106.solve(), atom_matrix(), null() and mprod()


# 1) There are 3 thermodynamic objects in action: 'feed', 'rx' and 'outlet'.
# Each object represents one - and only one - thermodynamic state. This means
# that 'rx', describing a state that varies in space, has to be integrated over
# the length of the reactor. The reactor profiles of temperature, pressure,
# etc. are lost in the process of integration, however, because 'rx' can keep
# only one (1) state at a time. It is of course possible to keep the profiles
# in memory as intermediate thermodynamic state objects, but this could easily
# be an overkill because explicit Euler integration requires somewhere in the
# range of 10,000 - 100,000 steps in order to reach 6 digits precision - which
# would eventually bind a substantial block of memory.
syngas = ['ammonia', 'nitrogen', 'hydrogen']

feed = flowsheet.Stream('Feed', srk_ammonia, syngas)
outlet = flowsheet.Stream('Outlet', srk_ammonia, syngas)
rx = flowsheet.Reactor('Rx', srk_ammonia, syngas)

# Initialize feed stream.
feed['var_t'] = 0.7  # temperature [kK]
feed['var_v'] = 1.0  # volume [dm3]
feed['var_n'] = [0.04, 0.24, 0.72]  # mole fractions
feed()  # run thermodynamics code
feed['var_n'] = [ni/feed['state_mtot']/1e7 for ni in feed['state_n']]  # [mol/kg]

# Re-initialize (change T and V to show extra flexibility).
feed(var_t=0.8, var_v=feed['var_v']/feed['state_mtot']/1e7)

print "Initial %s" % (feed,)

# 2) The feed stream has a specified pressure p0 whereas most thermodynamic
# equations of state are explicit in volume (and temperature and composition).
# The relation p(V) = p0 must therefore be solved iteratively (using Newton's
# method in this case).
eps = 1.0e-8  # convergence criterion
p0 = 0.25  # synthesis pressure [kbar]

print "\nNewton-Raphson solution of p(v) = p0:"





# Solve p(v) = p0 using Newton's method. The thermodynamics model responds to
# the free variable 'var_v' and calculates pressure 'state_p' and pressure
# derivative 'state_p_n'.


    dpdv = pass  # Jacobian (1 x 1)
    dp = pass  # pressure residual (1 x 1)
    dv = tkp4106.solve(dpdv, dp)[0][-1]  # volume change (scalar)
    feed['var_v'] += pass  # update the model
    converged = abs(dv) < eps and abs(dv) >= norm  # continue till norm is steady
    norm = abs(dv)  # new norm

    # The model fails if 'var_v' becomes unphysical (negative volume typically).
    # If this happens we must shorten the iteration step until the model says it
    # is OK. An exception is raised if the step becomes too small.
    while not feed():
        if abs(dv) < eps:
            raise ArithmeticError("cannot converge p(v) = p0 relation")
        pass  # step back to last successful state
        pass  # reduce the step length
        pass  # try once more
    print "norm=%8.3g; %s;" % (norm, feed)


print "\nConverged %s" % (feed,)

# 3) Calculate the (atoms x component) matrix and the (components x reactions)
# stoichiometry from molecular formulas of the components in the mixture.
tmp = [formula for (name, formula, mw) in feed.getcfw()]
amat = tkp4106.atom_matrix(tmp)
nmat = tkp4106.null(amat)


# 4) There is the use of functors in the simulation code. Their meaning is a bit
# magic to newbies, but to old-timers they offer a great way of code separation.
# The key issue is that we can start writing algorithms (an Euler integrator in
# this case) requiring a certain functionality (pressure drop, heat exchange
# and reaction kinetics), without knowing the exact nature of the underlying
# functions. The properties are instead registered in the 'rx' object using
# so-called lambda expressions calling the correct function at run time by
# dereferencing the function pointer. In effect, the heat exchange, pressure
# drop and reaction kinetics can be changed in one place of the code without
# affecting the solution algorithm. It yields, in fact, a way of defining the
# transport properties externally without changing either the unit operation
# class or the integration method. The same idiom is also used for defining
# thermodynamic state derivatives (the Jacobian). In this case we want to control
# the exact meaning of 'y1', 'y2', 'x1', 'x2', etc. in d(y1,y2,...)/d(x1,x2,...).
rx.connect(feed)

# Select a 'key' component for the reaction kinetics. Normalize the corresponding
# stoichiometric coefficient to -1. Make a shallow copy of matrix row before
# doing operations on 'nmat'. The algorithm works for single reactions only.
keyc = [name for (name, formula, mw) in rx.getcfw()].index('nitrogen')
piv = list(nmat[keyc])
for i in xrange(0, len(nmat)):
    for j in xrange(0, len(nmat[i])):
        nmat[i][j] /= -piv[j]


# Declare transport properties and kinetics for the reactor. Non-linear example.
# rx.functor('pressureprofile', flowsheet.constantpdrop, [-.005])  # dp/dz
# rx.functor('heatexchange', flowsheet.tubeandshell, [30.0, 0.28])  # ua*(t-t0)
# rx.functor('kinetics', flowsheet.arrhenius, [nmat, [keyc], [4/3.0], 0.1, 0.8])

# Declare transport properties and kinetics for the reactor. Linear example.
rx.functor('pressureprofile', flowsheet.constantpdrop, [0.0])  # dp/dz
rx.functor('heatexchange', flowsheet.constantcooling, [-20.0])  # heat [1.0e5 J]
rx.functor('kinetics', flowsheet.firstorder, [nmat, [keyc], [4/3.0]])  # rx rates

# 5) Interact with the command line reader to get hold of the integrator scheme
# and the number of steps required for the integration.
import sys

method, iterator, nz, maxiter = sys.argv[1:]
nz, maxiter = int(nz), int(maxiter)

# Declare a thermodynamic iterator (for use inside the integrator).
if iterator == 'implicit':
    maxiter = abs(maxiter)

if iterator == 'explicit':
    maxiter = -abs(maxiter)

# Declare a thermodynamic function solver and state derivatives for the reactor.
rx.functor('update', flowsheet.hpn_vs_tvn_solver, [eps, maxiter])  # state update
rx.functor('jac', flowsheet.hpn_vs_tvn_jacobian, [])  # Jacobian matrix
rx.functor('hpn', flowsheet.hpn, [])  # constraint variables

# 6) Integrate over the reactor using the given integration 'method' and the
# given 'iterator' mechanism.
print "\n%s %s integration using %s steps:" % \
    (iterator.capitalize(), method.capitalize(), nz)

flowsheet.hpn_vs_tvn_integrator(method, rx, 0, 1, nz)  # integrate from z=0 to z=1

print "\nIntegrated %s" % (rx,)


# 7) Calculate the reactor outlet using an analytic solution based on the matrix
# exponential of the (constant) ODE coefficient. Let y = (h, p, c) and dot(y)=C*y.
# Then y(z=1) = expm(C)*y(z=0) where 'expm' is the matrix exponential of C:
#
#        | 1  Q/p  0  0                    0 |
#        | 0  1    0  0                    0 |
# expm = | 0  0    1  nu_0/nu_1 (fac - 1)  0 |
#        | 0  0    0  fac                  0 |
#        | 0  0    0  nu_2/nu_1 (fac - 1)  1 |
#
# Here, 'Q' is the heat load, 'p' is the (constant) reactor pressure, 'nu_i' are
# stoichiometric coefficients and 'fac' is the resilience factor of the 'key'
# component.


outlet.connect(rx)  # inherit lambda functions from 'rx'
outlet(var_t=feed['var_t'], var_v=feed['var_v'], var_n=feed['var_n'])  # re-init

# Calculate the resilience factor of the 'key' component.
fac = math.exp(outlet.kinetics(0)[keyc]/outlet['state_n'][keyc])

# Calculate the matrix exponential.
nc = len(outlet['state_n'])
expm = [[float(i==j) for i in xrange(0, nc+2)] for j in xrange(0, nc+2)]  # identity
expm[0][1] = outlet.heatexchange(0)/outlet['state_p']  # heat transfer
expm[2+keyc][2+keyc] = fac  # 'key' component resilience
for i in [j for j in xrange(0, nc) if j != keyc]:
    expm[2+i][2+keyc] = nmat[i][-1]/nmat[keyc][-1]*(fac-1.0)  # other reactions

# Calculate the outlet state from y(z=1) = expm(C)*y(z=0).
y1 = tkp4106.mprod(expm, outlet.hpn())

print "\nNewton-Raphson solution of f(h,p,c) = 0:"

flowsheet.hpn_vs_tvn_solver(outlet, y1, eps, 20)

print "\nConverged %s" % (outlet,)
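Item 2 of the listing leaves the Newton iteration on p(v) = p0 as an exercise. The sketch below shows the same idea, a Newton step followed by step shortening whenever the volume becomes unphysical, on a stand-alone toy problem. It assumes a van der Waals equation of state with made-up dimensionless parameters (`rt`, `a`, `b`); none of the function names belong to `flowsheet` or `srk_ammonia`.

```python
def pressure(v, rt=1.0, a=0.2, b=0.1):
    """Toy van der Waals pressure p(v); parameters are illustrative only."""
    return rt / (v - b) - a / v ** 2


def dpdv(v, rt=1.0, a=0.2, b=0.1):
    """Analytic derivative dp/dv of the toy model (the 1 x 1 Jacobian)."""
    return -rt / (v - b) ** 2 + 2.0 * a / v ** 3


def solve_volume(p0, v, eps=1.0e-10, maxiter=50, b=0.1):
    """Solve p(v) = p0 by Newton's method with step shortening."""
    for _ in range(maxiter):
        dv = (p0 - pressure(v)) / dpdv(v)  # full Newton step
        # Shorten the step while the trial volume is unphysical (v <= b).
        while v + dv <= b:
            dv *= 0.5
            if abs(dv) < eps:
                raise ArithmeticError("cannot converge p(v) = p0 relation")
        v += dv
        if abs(dv) < eps:
            return v
    raise ArithmeticError("too many iterations in p(v) = p0")


v_solution = solve_volume(2.0, 1.0)
residual = abs(pressure(v_solution) - 2.0)
```

Starting from v = 1.0 the first full Newton step overshoots to a negative volume, so the step is halved once before the iteration proceeds normally.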


5.15.5 Verbatim: “tkp4106.py” 

"""
@summary: Increase local namespace with TKP4106 functionality.

@since: 2012.09.05 (THW)
@version: 0.9

"""

from molecular_weight import molecular_weight
from tridiagmprod import tridiagmprod
from atom_matrix import atom_matrix

from solve import solve
from mprod import mprod

from null import null

5.15.6 ammonia_reactor.py, see also Sec. 5.15.4

First reference occurs in ammonia_reactor.py, see Section 5.15.4 on page 455.

5.15.7 srk_ammonia.py, see also Sec. 5.15.2

First reference occurs in srk_ammonia.py, see Section 5.15.2 on page 444.

Plug Flow Reactor. Part III 




13 November 2011 


1 Modelling issues

[Figure: control volume of the plug flow reactor between z and z + Δz, with circumference C and cross-sectional area A. Inlet: ḃ_in, p_in and (U̇ + pV̇)_in. Outlet: ḃ_out, p_out and (U̇ + pV̇)_out. Heat flow Q̇ across the wall; holdup b(t, z, Δz) and U(t, z, Δz) with reaction rate ξ̇.]

From an academic perspective the title of this text is a little pretentious. It says “Modelling Issues”, which means quite a lot to people devoting their professional lives to the several aspects of chemical reactor calculations, while it means next to nothing to a novice in the field. Let our perspective be something in between: that of an expert novice, maybe. For our purposes, then, the idealized plug flow reactor is like the one depicted in the figure. The mass and energy balances for steady state (s-s) operation of the reactor were developed in Parts I and II of this paper. In short we found that:

$$\left(\frac{\partial h\;[\text{energy mass}^{-1}]}{\partial z\;[\text{length}]}\right)_{\text{s-s}} = C\,[\text{length}]\;\; q\,[\text{heat mass}^{-1}\,\text{area}^{-1}]$$

and

$$\left(\frac{\partial c\;[\text{mole mass}^{-1}]}{\partial z\;[\text{length}]}\right)_{\text{s-s}} = A\,[\text{area}]\;\; N r\,[\text{mole mass}^{-1}\,\text{volume}^{-1}]$$

What is missing here is a momentum balance of the reactor. It is needed to resolve the 

pressure distribution inside the reactor, which of course is of great interest for reactor 

design and operation, but at the same time it is pulling our wagon too far. The calculations 

are so involved and require so much input about reactor geometry, transport 

properties and kinetics that we must do without. Our replacement of the momentum 

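balance is simply:

$$\left(\frac{\partial p\;[\text{pressure}]}{\partial z\;[\text{length}]}\right)_{\text{s-s}} = \nabla p\;\left[\frac{\text{pressure}}{\text{length}}\right]$$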

That is to say, we rely on an explicit pressure profile p(z) given at the outset of the simulation (most of the time we shall use ∇p = 0).

Counting the number of equations there is 1 energy balance, 1 pressure profile and C mass balances. That makes C + 2 equations which are going to be solved simultaneously in C + 2 variables. The big question is: What variables? In practice we cannot choose the solution variables freely but must meet whatever needs our models impose on us, i.e. the models we use to evaluate h, q and r, and there is much fuss about which variables are the most versatile.

Chemical engineers traditionally use T, p, x1, x2, ···, that is temperature, pressure and mole fractions. There is no theoretical reason for this choice except that these variables are always reported in process flow diagrams. They are also quite natural in the sense that they are part of how we sense the physical world.

Thermodynamicists think differently and usually prefer T, v, c1, c2, ···, that is temperature, specific volume and specific concentrations. This choice is natural from a theoretical point of view because most equations of state are given as p(T, v, c) models. By iterating directly on the variables as they appear in the equation of state we can formulate very concise and elegant solvers.

Being trained thermodynamicists and having a keen eye on aesthetics we shall stick to the last alternative even though we then have to solve for pressure as a function of volume rather than just specifying it. The equations we need to solve can be condensed into (see Parts I and II for an explanation of the syntax):

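$$
\begin{aligned}
\text{Energy:} &\quad \partial_T h \cdot \nabla T + \partial_v h \cdot \nabla v + \partial_{c_1} h \cdot \nabla c_1 + \partial_{c_2} h \cdot \nabla c_2 + \cdots = Cq \\
\text{Momentum:} &\quad \partial_T p \cdot \nabla T + \partial_v p \cdot \nabla v + \partial_{c_1} p \cdot \nabla c_1 + \partial_{c_2} p \cdot \nabla c_2 + \cdots = \nabla p \\
\text{Mass (1):} &\quad \nabla c_1 = A \sum_i N_{1,i}\, r_i \\
\text{Mass (2):} &\quad \nabla c_2 = A \sum_i N_{2,i}\, r_i \\
&\quad \;\vdots
\end{aligned}
$$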

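This set of equations is more easily handled using matrix algebra. To minimize the use of extra symbols ∂_c h and ∂_c p are taken to be row vectors while r is (still) a column vector:

$$
\begin{pmatrix}
\partial_T h & \partial_v h & \partial_c h \\
\partial_T p & \partial_v p & \partial_c p \\
0 & 0 & I
\end{pmatrix}
\begin{pmatrix}
\nabla T \\ \nabla v \\ \nabla c
\end{pmatrix}
=
\begin{pmatrix}
Cq \\ \nabla p \\ ANr
\end{pmatrix}
$$
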

The equations above illustrate the ambivalence we are facing with regard to p or v being 

our primary iteration variable. In this case we shall iterate on v to satisfy ∇p given as 

the gradient of a predefined function p(z). But, since pressure is a non-linear function 

of v it implies that ∇p shows up on the right side while ∇T , ∇v and ∇c appear as 

solution variables on the left side. If p had been a primary iteration variable we could 

have dropped the second row in the equation set, but at the same time we had to handle 

the p(v) inversion inside the equation of state. This is a questionable approach because 


it involves a nested hierarchy of solvers which can cause all kinds of numerical problems. 

Usually, it is safer to handle all the equations in one solver, at least when the equations are few in number, as in this case. In a very condensed form we can write

J(x)∇x = f(z, x) (1) 

which is the equation system we have to integrate in order to calculate the temperature 

and concentration profiles of the reactor. Note carefully that J(x) is a purely thermodynamic 

state function while f(z, x) is a function of both the thermodynamic state 

variables and the space co-ordinate. The mathematical definitions of J and f are not 

known to us at this point—they are what we might call anonymous lambda-functions 

in functional programming—but their semantic meaning is all clear. E.g. their scientific 

units must conform (see footnote 1).
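The idea that J and f are anonymous functions whose definitions are supplied later can be sketched in a few lines. The example below assumes nothing about the thermodynamics: `jac` and `rhs` are any callables returning a matrix and a vector, and `gradient` solves J(x)∇x = f(z, x) by Gaussian elimination. All names and the toy lambdas are hypothetical and chosen only for illustration.

```python
def solve_linear(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [bi] for row, bi in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def gradient(jac, rhs, z, x):
    """Return grad(x) from J(x)*grad(x) = f(z, x); jac and rhs are callables."""
    return solve_linear(jac(x), rhs(z, x))


# Toy definitions registered as lambda functions (illustrative numbers only).
toy_jac = lambda x: [[2.0, 0.0], [1.0, 1.0]]
toy_rhs = lambda z, x: [2.0, 3.0]

grad = gradient(toy_jac, toy_rhs, 0.0, [1.0, 1.0])
```

The point of the design is that `gradient` never needs to change when the model behind `toy_jac` or `toy_rhs` is replaced.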

The separation of the problem into J and f tells us that the transport and kinetic 

properties q and r, used in defining f on the right side, may require thermodynamic 

information, while the Jacobian J is independent of the spatial co-ordinate and of the 

transport properties. Anyhow, the anti-derivative of the reactor model is

$$x(z) = x^\circ + \int_0^z J(x)^{-1} f(\zeta, x)\, d\zeta ,$$


and the next question is how we can make an integrator for this problem. Basically, 

there are three options: Analytic, explicit and implicit solutions. We shall have a look 

at all three cases. Briefly stated there are few analytical solutions of practical interest, 

but the few that exist are important for: i) our theoretical insight, and ii) serving as 

test cases for numerical calculations. For the numerical solutions we must be aware that 

words like “explicit” and “implicit” have two different meanings. The terms refer either to how the ODE is formulated or to how the integration is performed. The distinction is quite subtle and the implementation details are bewildering. These are the combinations we shall look at:

• Explicit ODE with explicit Euler integration (forward Euler). 

• Implicit ODE with (semi)implicit Euler integration (backward Euler). 

• Explicit ODE with explicit Runge–Kutta integration. 

• Implicit ODE with explicit Runge–Kutta integration. 

From a practical point of view it is easier to implement the explicit solvers compared to 

the implicit ones, but at the same time they are numerically unstable. This is a classic 

result from numerical mathematics which we should know about, but which is not so 

important for the PFR we are studying. What we shall see is that the explicit model 

formulation fails to conserve (even explicit) constraints in energy and pressure, while the 

implicit formulation does this to our satisfaction. 

1 It also means that f(z, x) and f(z, x(y)), and f(z, y), shall refer to the same kind of function in this document. The free variables change, and the function definitions need not be the same, but the function values are always interpreted as the gradient in specific enthalpy, pressure and composition.

1.1 Analytic solutions 

Equation 1 is written with the variables x ≙ [T, v, c] in mind but it applies equally well to any other set of thermodynamic state variables yielding an invertible Jacobian J. In particular we could try to replace x by y ≙ [h, p, c] which yields a much simpler formulation. Note carefully that the Jacobian reduces to J(y) ≡ I:

∇y = f(z, y) (2) 

Now, if f(z, y) is written as a linear function in y we have the classic problem of an ordinary 

differential equation (ODE) with constant coefficients. The standard formulation 

of the problem is shown below (matrix C has nothing to do with the circumference C 

used in the energy balance): 

∇y = Cy 

For PFRs that experience a constant circumference C, constant cross-sectional area 

A, constant pressure drop ∇p, constant heat flux q, and constant reaction rates r or 

first order kinetics r_i ∝ c_{j(i)}, we can spell out four different cases of linear differential

equations with constant coefficients. To keep the algebra as simple as possible—but 

not simpler—we shall assume one chemical reaction (i.e. dim N = dim c × 1) and a 

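dimensionless reactor length in the range z ∈ [0, 1]:

$$
\text{1)}\;
\begin{cases}
\nabla h = 0 \\
\nabla p = \nabla p \\
\nabla c = \xi N
\end{cases}
\qquad
\text{2)}\;
\begin{cases}
\nabla h = 0 \\
\nabla p = \nabla p \\
\nabla c = k c_1 N
\end{cases}
$$

$$
\text{3)}\;
\begin{cases}
\nabla h = q \\
\nabla p = 0 \\
\nabla c = \xi N
\end{cases}
\qquad
\text{4)}\;
\begin{cases}
\nabla h = q \\
\nabla p = 0 \\
\nabla c = k c_1 N
\end{cases}
$$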

Here, ξ means the overall extent of reaction, q means the overall heat transfer and kc1 

denotes the first order reaction with respect to component 1 (an arbitrary choice from 

our side). A textual interpretation of the four cases follows: 

Case Description 

1 Adiabatic, fixed pressure drop, fixed extent of reaction 

2 Adiabatic, fixed pressure drop, first order reaction 

3 Fixed heat load, isobaric, fixed extent of reaction 

4 Fixed heat load, isobaric, first order reaction 

Behind the terminology of constant coefficients there is an implication that the equations 

can be recast into matrix expressions. This is advantageous from a theoretical perspective 

because it renders a generic solution of the problem ∇y = Cy where C takes one 


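of the four shapes shown below:

$$
C_1 = \begin{pmatrix}
0 & 0 & 0 & 0 \\
\frac{\nabla p}{h} & 0 & 0 & 0 \\
\frac{\xi\nu_1}{h} & 0 & 0 & 0 \\
\frac{\xi\nu_2}{h} & 0 & 0 & 0
\end{pmatrix}
\qquad
C_2 = \begin{pmatrix}
0 & 0 & 0 & 0 \\
\frac{\nabla p}{h} & 0 & 0 & 0 \\
0 & 0 & k\nu_1 & 0 \\
0 & 0 & k\nu_2 & 0
\end{pmatrix}
$$

$$
C_3 = \begin{pmatrix}
0 & \frac{q}{p} & 0 & 0 \\
0 & 0 & 0 & 0 \\
0 & \frac{\xi\nu_1}{p} & 0 & 0 \\
0 & \frac{\xi\nu_2}{p} & 0 & 0
\end{pmatrix}
\qquad
C_4 = \begin{pmatrix}
0 & \frac{q}{p} & 0 & 0 \\
0 & 0 & 0 & 0 \\
0 & 0 & k\nu_1 & 0 \\
0 & 0 & k\nu_2 & 0
\end{pmatrix}
$$
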

Here, we have been assuming a two-component mixture with chemical reaction ν1A = 

ν2B. More components can easily be added without violating the structure of the matrices. 

The solution(s) can be written 

$$y(z) = e^{zC}\, y(0)$$

where e^{zC} means the matrix exponential of zC. Covering the matrix theory in detail

would take us astray from the PFR subject, but it is important to know that what is 

said next can be formalized—if not always as closed analytical formulas—at least in 

the form of numerical calculations. But, for the C-matrices mentioned above we can 

follow the simple approach and find the matrix exponentials by inspection because the 

matrices have such simple structures. Writing out solutions of mathematical problems 

without any further details is somewhat arrogant but I think that in this case it implies 

less confusion—not more confusion—to do it quick and simple. You should verify the 

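results by back-substituting into the matrix formula using y(0) = [h, p, c] at z = 0, though:

$$
e^{zC_1} = \begin{pmatrix}
1 & 0 & 0 & 0 \\
\frac{z\nabla p}{h} & 1 & 0 & 0 \\
\frac{z\xi\nu_1}{h} & 0 & 1 & 0 \\
\frac{z\xi\nu_2}{h} & 0 & 0 & 1
\end{pmatrix}
\qquad
e^{zC_2} = \begin{pmatrix}
1 & 0 & 0 & 0 \\
\frac{z\nabla p}{h} & 1 & 0 & 0 \\
0 & 0 & e^{zk\nu_1} & 0 \\
0 & 0 & \frac{\nu_2}{\nu_1}\left(e^{zk\nu_1} - 1\right) & 1
\end{pmatrix}
$$

$$
e^{zC_3} = \begin{pmatrix}
1 & \frac{zq}{p} & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & \frac{z\xi\nu_1}{p} & 1 & 0 \\
0 & \frac{z\xi\nu_2}{p} & 0 & 1
\end{pmatrix}
\qquad
e^{zC_4} = \begin{pmatrix}
1 & \frac{zq}{p} & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & e^{zk\nu_1} & 0 \\
0 & 0 & \frac{\nu_2}{\nu_1}\left(e^{zk\nu_1} - 1\right) & 1
\end{pmatrix}
$$
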

Case 4 is maybe the most interesting for the chemical engineering student since it gives the opportunity to study PFRs with a maximum in temperature along the reactor. The argument is simple: Consider an exothermic first order reaction with constant cooling. A first order reaction means that the reaction rate will decrease monotonically along the reactor. Then, by balancing the heat production in the middle of the reactor with the heat taken away at the same spot, it should be clear that excess heat is produced at the inlet and excess cooling is applied at the outlet. The result is a curved temperature profile which of course looks more interesting than a flat one.
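The closed-form exponential for case 4 can be checked numerically against the defining power series of the matrix exponential. The sketch below does this in plain Python for a two-component mixture; the numerical values of q, p, k, ν1 and ν2 are arbitrary test numbers, and the helper names are assumptions made for this illustration.

```python
import math


def matmul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def expm_series(c, z, terms=30):
    """Truncated power series exp(z*C) = sum_m (z*C)^m / m!."""
    n = len(c)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for m in range(1, terms):
        term = [[z * t / m for t in row] for row in matmul(term, c)]
        result = [[r + t for r, t in zip(rrow, trow)]
                  for rrow, trow in zip(result, term)]
    return result


def expm_case4(z, q, p, k, nu1, nu2):
    """Closed-form exp(z*C4) as written out in the text above."""
    e = math.exp(z * k * nu1)
    return [[1.0, z * q / p, 0.0, 0.0],
            [0.0, 1.0, 0.0, 0.0],
            [0.0, 0.0, e, 0.0],
            [0.0, 0.0, nu2 / nu1 * (e - 1.0), 1.0]]


q, p, k, nu1, nu2 = -2.0, 0.25, 1.5, -1.0, 2.0  # arbitrary test values
c4 = [[0.0, q / p, 0.0, 0.0],
      [0.0, 0.0, 0.0, 0.0],
      [0.0, 0.0, k * nu1, 0.0],
      [0.0, 0.0, k * nu2, 0.0]]

series = expm_series(c4, 1.0)
closed = expm_case4(1.0, q, p, k, nu1, nu2)
max_diff = max(abs(s - c) for srow, crow in zip(series, closed)
               for s, c in zip(srow, crow))
```

The same check works for the other three cases by swapping in the corresponding C matrix and closed form.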

1.2 Explicit Euler-integration 

In numerical integration the word explicit means that the differential equations are stated without iterative calculations. So, how can that be arranged for a non-linear problem? The short answer is that it cannot; the long answer is that we can make piecewise linear

approximations to the functions we want to integrate and solve each little sub-problem 

explicitly. The outcome will not be the answer, but merely a numerical approximation 

to it. There are many things to worry about in such calculations. Numerical accuracy 

and stability are maybe the most important issues. 

We shall not look very deep into the matter but try to understand what happens in a 

numerical integrator and see how we can formulate the equations in a piecewise manner. 

Our starting point is Eq. 1: 

J(x)∇x = f(z, x) 

Inverting J (yes, we must assume that the Jacobian is invertible—else the problem is 

thermodynamically inconsistent) yields the explicit formula 

$$\nabla x = J(x)^{-1} f(z, x)$$

Then comes the piecewise approximation ∇x ≈ (Δz)^{-1} Δx, which is assumed to be valid on the range [z, z + Δz]:

$$\Delta x = J(x_z)^{-1} f(z, x_z)\,\Delta z + O(\Delta z)^2$$

The truncation error is of second order, that is O(Δz)^2, but the integrated answer will not be that accurate because the number of steps taken in the interval is proportional to (Δz)^{-1}, which means the integration error will be O(Δz)^2 (Δz)^{-1} = O(Δz)^1, that is of first order only. We shall later learn how to implement schemes of higher order, namely the Runge–Kutta integration methods of 2nd and 4th order. From the definition Δx ≙ x_{z+Δz} − x_z we can write the final update formula as:

$$x_{z+\Delta z}^{\text{e-e}} \mathrel{\hat=} x_z + J(x_z)^{-1} f(z, x_z)\,\Delta z \tag{3}$$

By applying this formula successively on the integration domain z ∈ [0, 1] we can calculate 

the sequence x_0, x_{Δz}, x_{2Δz}, ··· very easily. Furthermore, it is (almost) evident that x_{NΔz} will converge to the true solution x(NΔz) when Δz → 0 and N → ∞. But, this requires an infinite number of steps which eventually would take infinite time on a computer. Another problem of the numerical solution is that computers have fixed word lengths. Real numbers are approximated inside the computer as binary floating-point numbers of 16, 32, 64, or 128 bits length. This gives a round-off error in (nearly) every arithmetic operation that is carried out. There is therefore a trade-off between

a smaller ∆z to achieve higher accuracy in the updating formula, and a not-so-small ∆z 

to avoid excessive round-off errors (and to reduce the computation time). 
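The first-order accuracy of Eq. 3 can be observed on a scalar toy problem. The sketch below assumes J(x) = 2x and f(z, x) = 1 with x(0) = 1, for which the exact solution is x(z) = sqrt(1 + z). This problem is chosen purely for illustration and has nothing to do with the reactor model; halving Δz should roughly halve the error at z = 1.

```python
import math


def jacobian(x):
    """Toy 1 x 1 'Jacobian' J(x) = 2x (an assumption for this example)."""
    return 2.0 * x


def rhs(z, x):
    """Toy right-hand side f(z, x) = 1."""
    return 1.0


def explicit_euler(x0, nz):
    """Apply x <- x + J(x)^-1 f(z, x) dz successively on z in [0, 1]."""
    dz = 1.0 / nz
    x = x0
    for i in range(nz):
        x += rhs(i * dz, x) / jacobian(x) * dz
    return x


exact = math.sqrt(2.0)  # x(1) from x^2 = 1 + z
err_coarse = abs(explicit_euler(1.0, 100) - exact)
err_fine = abs(explicit_euler(1.0, 200) - exact)
ratio = err_coarse / err_fine  # close to 2 for a first-order method
```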


1.3 Implicit Euler-integration 

Physical theories build on a limited number of conservation laws. For example mass and 

energy conservation is essentially what lies behind our PFR model. This is the strong 

point of physics. The weaker part of the theory arises from the lack of appropriate 

models expressed directly in the conserved properties. This branch of physics belongs 

to thermodynamics. In our case the conservation laws are made linear in the thermodynamic 

variables h, p, c, while in most cases the equation of state serving the calculation of p (and h) is of the form p(T, v, c). Hence, to update the equation of state we need to

solve the relationships between T, v and h, p iteratively (the problem is strongly coupled 

and non-linear). If these relationships are solved at each step taken from z to z + ∆z 

the method is said to be implicit. Recall that for the explicit method in Eq. 3 there is 

no need for an iterative solution because matrix inversion in itself is an explicit method. 

Why should we worry about implicit integration then? It sounds complicated and if 

explicit integration works why bother? The answer is simple, definite and instructive: 

Explicit integration violates the conservation principle(s) because of the linearization 

term that is behind Eq. 3. If this feature is considered to be unfortunate we should consider 

implicit integration. This is because it solves the conservation equations accurately 

at each step of the integration. It is not to say that the integration is more accurate, it 

is only consistent. Consistently wrong you might say, but it is not inconsistent. 

To write an implicit integrator we need to understand that the conservation laws put 

constraints on y ≙ [h, p, c] while the thermodynamics, heat exchange, pressure drop and kinetics models rely on x ≙ [T, v, c]. We must therefore be able to solve the relationship x(y) by e.g. Newton–Raphson iteration (to obtain second order convergence) in parallel with the integration task. This topic is also known as integration on manifolds, geometric integration, or solving differential-algebraic equations (DAEs). The starting

point is the same as in Eq. 2 except for the implicit relation x(y) that sits on the right 

side: 

$$\nabla y = f(z, x(y))$$

Linearization (this time in y) yields:

$$\Delta y = f(z, x(y_{z+\Delta z}))\,\Delta z + O(\Delta z)^2$$

This is the fully implicit formulation of the problem, where “fully” indicates that the 

right side is evaluated at the next location z + ∆z, i.e. not the current z. Solving this 

problem with Newton–Raphson iteration is not so easy because it requires derivative 

information about f(z, x(y)). We know very little about the structure of this function 

and can hardly make anything ready on general terms, but for the relation x(y) we 

know a lot. It is a thermodynamic mapping with a fixed structure awaiting only a 

thermodynamic model to calculate the numbers at run time. We shall therefore restrict ourselves to the following semi-implicit formulation of the problem

$$\Delta y = f(z, x(y_z))\,\Delta z + O(\Delta z)^2$$

where the right side is assumed constant at each position z. This yields the simpler update formula:

$$y_{z+\Delta z} \mathrel{\hat=} y_z + f(z, x(y_z))\,\Delta z$$

Even though the formulation above is semi-implicit it is consistent with any conservation 

principle that yields a constant contribution on the right side (linear profile). The 

method is therefore referred to as just “implicit” when there is no danger of misunderstanding. 

Later on we shall see in practice how the method works for a problem with

linear enthalpy and pressure profiles. Notwithstanding these merits the semi-implicit 

method is just an approximation with respect to changes that are not subject to conservation. 

Temperature is one example. So, even when the energy is conserved the 

temperature profile is not necessarily correct. Incorrect is not the same as inconsistent 

though. 

To solve for $y_{z+\Delta z}$ we shall alter the values of x. We must then make some additional calculations denoted as iterations 0, 1, ..., k, k+1. Because the problem formulation is semi-implicit we need derivatives for y versus x but not for f(z, x(y)). Linearization of $y_{z+\Delta z}$ on the left side yields the following approximation:

$$y_z^k + J(x_z^k)\,\Delta x^k \approx y_z^0 + f(z, x(y_z^0))\,\Delta z$$

By definition $y_z^0 \equiv y_z$ and we sincerely hope that $y_z^\infty \to y_{z+\Delta z}$. We cannot prove the

last property, but if it is correct the iteration process is said to converge locally. The Newton–Raphson procedure may converge or it may diverge, and it is in fact impossible to say which without problem-specific information. If it does converge, however, it shows second order convergence. In practice this means that the number of significant digits will double in each iteration when k is sufficiently large. What sufficiently large means is also hard to say, but in normal cases it is typically in the range $k_{crit} \in [3, 5]$. Solving for $\Delta x^k$ we get:

$$\Delta x^k \approx J(x_z^k)^{-1} \Big[ \underbrace{y_z^0 + f(z, x(y_z^0))\,\Delta z}_{y_{z+\Delta z}} - y_z^k \Big] \qquad (4)$$

Note the underbrace above: $y_{z+\Delta z}$ comes in as a constant estimate on the right side such that if (i.e. hopefully then) the iteration converges we get $y_z^k \to y_z^\infty \to y_{z+\Delta z}$, which makes $\Delta x^k \to 0$. Finally, when the update norm satisfies $\|\Delta x^k\| \le \epsilon$ the iteration is stopped. A suitable stop criterion must be set by us, or in practice by the programmer.

The definition $\Delta x^k \hat{=} x_z^{k+1} - x_z^k$ leads to

$$x_z^{\text{i-e},k+1} \hat{=} x_z^k + J(x_z^k)^{-1} \big[ y_{z+\Delta z} - y_z^k \big] \qquad (5)$$

which is the final update formula for the implicit Euler integration method. But, for the special case k = 0 we can identify $y_{z+\Delta z} - y_z^k$ on the right side as being equal to $y_{z+\Delta z} - y_z^0 = f(z, x(y_z^0))\,\Delta z$, see Eq. 4. This leaves the much simpler formula:

$$x_z^{\text{i-e},1} = x_z^0 + J(x_z^0)^{-1} f(z, x(y_z^0))\,\Delta z$$

Comparing the right side of this formula with the explicit Euler formula in Eq. 3 reveals the following relationship (after noticing that $x_z^0 \equiv x_z$ and $y_z^0 \equiv y_z$):

$$x_z^{\text{i-e},1} \equiv x_{z+\Delta z}^{\text{e-e}}$$

The conclusion is that the first iteration of the implicit Euler scheme is identical to the 

explicit Euler update (if, and only if, the update is calculated using Newton–Raphson 

iteration). We can therefore say that the two integration methods are examples of 

N’th level explicit Euler schemes. For N = 1 we retain the classic Euler integration 

and for N → ∞ we get implicit Euler integration, but in many cases it is enough to 

make only 2 or 3 Newton–Raphson updates in order to reach a sufficiently converged 

x-state. Thus, it makes sense to integrate several times trying out 1st, 2nd and 3rd level 

updates to verify that the solution converges smoothly to a value that is independent 

of the linearization. What cannot be controlled in this manner is the accuracy of the 

integration. Usually, higher accuracy means higher order approximation methods like 

for instance the Runge–Kutta family of non-stiff integrators. To control stiffness as well (that is, integrating ODEs showing a wide spread in the eigenvalues) we have to deal with an entirely different approach using variable step length and preconditioning of the equations. This is way outside the current scope.
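The N-th level update of Eq. 5 can be sketched on a toy problem. The scalar mapping y(x) = x³ + x and the right side f = −y below are made up for illustration only (they are not taken from flowsheet.py), and the function names are hypothetical. With one iteration the update is the explicit Euler step; with many iterations the target value of y is reproduced to machine precision.

```python
# Toy illustration of the N-level (semi-implicit) Euler update in Eq. 5.
# ASSUMPTION: y(x) = x**3 + x and f = -y are invented for this sketch.

def y_of_x(x):
    """Nonlinear mapping from the free variable x to the conserved variable y."""
    return x**3 + x

def jac(x):
    """Derivative dy/dx (a 1x1 Jacobian)."""
    return 3.0 * x**2 + 1.0

def rhs(z, x):
    """Right side f(z, x(y)) of the differential equation dy/dz = f."""
    return -y_of_x(x)

def euler_step(z, x, dz, n_iter):
    """One Euler step with n_iter Newton-Raphson updates of x."""
    # Constant estimate of y at z + dz (the underbraced term in Eq. 4).
    y_target = y_of_x(x) + rhs(z, x) * dz
    xk = x
    for _ in range(n_iter):
        xk = xk + (y_target - y_of_x(xk)) / jac(xk)  # Eq. 5
    return xk

def integrate(x0, steps, n_iter):
    """Integrate from z = 0 to z = 1 in equally sized steps."""
    dz = 1.0 / steps
    x, z = x0, 0.0
    for _ in range(steps):
        x = euler_step(z, x, dz, n_iter)
        z += dz
    return x

if __name__ == "__main__":
    for n_iter in (1, 2, 3, 30):
        x_end = integrate(1.0, 10, n_iter)
        print(n_iter, x_end, y_of_x(x_end))
```

Running the loop for N = 1, 2, 3 shows how quickly the end state stops depending on the linearization, which is the check recommended in the text.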

1.4 Runge–Kutta integration 

The Runge–Kutta methods belong to a family of explicit integrators often considered to 

be the work horses of numerical integration. The members of this family are characterized 

by an order parameter n saying that the global integration error is proportional to $(\Delta z)^n$, where n is typically 2, 3, 4 or 5. A Runge–Kutta method of order 1 is then equivalent to explicit (forward) Euler integration. It can be argued that schemes of even order are better “balanced” than schemes of odd order. The odd-ordered schemes are therefore mostly used for truncation error control, while the integration itself is carried out with one of the even-ordered schemes.

We shall have a further look at second and fourth order schemes called RK2 and 

RK4 throughout this text. These are explicit integration schemes, but the methods will 

be defined such that we can choose to stay on the h, p, c manifold if we wish. It is 

then important to know what “on” means. Just like for the explicit and (semi)implicit 

Euler methods this question does not need to be answered once and for all, but can await

us specifying (later) the number of iterations we would like to spend on the update of 

T, v, c at each step of the integration. 
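The meaning of the order parameter n can be checked numerically. The sketch below integrates the test equation dy/dz = −y from z = 0 to z = 1 and estimates n from the error ratio when the step count is doubled. The test equation, the function names and the choice of the midpoint variant for RK2 are assumptions made for this illustration; the text does not say which RK2 variant is used in the course code.

```python
import math

def euler(f, z, y, dz):
    """Explicit Euler step (order 1)."""
    return y + f(z, y) * dz

def rk2(f, z, y, dz):
    """Midpoint rule (one common RK2 variant, assumed here)."""
    k1 = f(z, y)
    k2 = f(z + 0.5 * dz, y + 0.5 * dz * k1)
    return y + dz * k2

def rk4(f, z, y, dz):
    """Classic fourth order Runge-Kutta step."""
    k1 = f(z, y)
    k2 = f(z + 0.5 * dz, y + 0.5 * dz * k1)
    k3 = f(z + 0.5 * dz, y + 0.5 * dz * k2)
    k4 = f(z + dz, y + dz * k3)
    return y + dz * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0

def integrate(step, f, y0, n):
    """Integrate from z = 0 to z = 1 in n equal steps."""
    dz = 1.0 / n
    y = y0
    for i in range(n):
        y = step(f, i * dz, y, dz)
    return y

def observed_order(step, n=20):
    """Estimate the order from the global errors with n and 2n steps."""
    f = lambda z, y: -y
    exact = math.exp(-1.0)
    e1 = abs(integrate(step, f, 1.0, n) - exact)
    e2 = abs(integrate(step, f, 1.0, 2 * n) - exact)
    return math.log2(e1 / e2)

if __name__ == "__main__":
    for name, step in (("Euler", euler), ("RK2", rk2), ("RK4", rk4)):
        print(name, round(observed_order(step), 2))
```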

1.5 Calculation example 

A good calculation example must serve many needs. Firstly, it should be verifiable. 

Only this way is it possible to prove (or disprove) that the equations are solved correctly. 

Secondly, it should be familiar to the reader. An example that comes as a total surprise 

can hardly serve as an example because the perspective is missing. Thirdly, it should 

be realistic. An unrealistic example can perhaps be more intriguing but it adds nothing 

to our physical experience. Fourthly, it should contribute new insight. However, to come up with an example that is at once verifiable, familiar, realistic and new is not so easy.

The production of ammonia from nitrogen and hydrogen is a classical textbook example. 

It is the most important of all the industrial reactions and without it we would still have been in the 19th century. But it has a very complicated reactor design and we

shall not try too hard to be realistic. Uniform cooling, zero pressure drop and first order 

reaction is the best we can do if we also want to verify the calculation by comparing it 

with an analytical solution, see also Section 1.1. 

The ammonia reaction is exothermic and shows a substantial temperature increase 

under normal operation. So, by matching the cooling duty with the reaction rate it is 

possible to obtain a curved temperature profile along the reactor axis. The chemical 

compositions vary exponentially along the same axis and for the reactor as a whole we 

can expect a pronounced non-linear behaviour. This puts our solution method on trial. 

We shall therefore investigate several integration schemes: explicit and implicit Euler, 

and explicit RK2 and RK4 (Runge–Kutta 2nd and 4th order) with both explicit and 

implicit function updates. 

For the reactor calculation we need of course a set of differential equations, but we 

also need to fill in with thermodynamic state information. Ideal gas is the simplest 

non-trivial concept we can use in this case. The gas mixture of ammonia, nitrogen and 

hydrogen is non-ideal at synthesis conditions, but the physical insight of the problem is 

not changed very much by this fact. The only artifact we should know about is that 

the ideal gas enthalpy is independent of pressure whereas the real enthalpy is not (this 

feature can betray us badly at adiabatic conditions). The thermodynamic relations we 

are using are listed below: 

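$$p^{\text{ig}} = \sum_i \frac{c_i R T}{v}$$

$$h^{\text{ig}} = \sum_i c_i \left( \Delta_f h_i^\circ + \int_{0.29815}^{T} c_{p,i}^\circ(\tau)\, d\tau \right)$$

where $R \hat{=} 0.083145\ldots$ [10^5 J mol^-1 kK^-1], and where

$$\frac{\Delta_f h^\circ_{\mathrm{NH_3}}}{[10^5\,\mathrm{J\,mol^{-1}}]} = -0.45898; \qquad \Delta_f h^\circ_{\mathrm{N_2}} = \Delta_f h^\circ_{\mathrm{H_2}} = 0$$

and finally:

$$\frac{c^\circ_{p,\mathrm{NH_3}}(\tau)}{[10^5\,\mathrm{J\,mol^{-1}\,kK^{-1}}]} = 0.27310 + 0.23830\tau + 0.17070\tau^2 - 0.11850\tau^3$$

$$\frac{c^\circ_{p,\mathrm{N_2}}(\tau)}{[10^5\,\mathrm{J\,mol^{-1}\,kK^{-1}}]} = 0.31150 - 0.13570\tau + 0.26800\tau^2 - 0.11680\tau^3$$

$$\frac{c^\circ_{p,\mathrm{H_2}}(\tau)}{[10^5\,\mathrm{J\,mol^{-1}\,kK^{-1}}]} = 0.27140 + 0.09274\tau - 0.13810\tau^2 + 0.07645\tau^3$$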
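The ideal gas relations above are easy to evaluate directly. The following sketch is one possible way of coding them and is not the course's own implementation: the function and variable names are hypothetical, R is truncated to the digits printed in the text, and the inlet mole numbers and volume are taken from the first row of the table of exact results later in this section.

```python
# Ideal gas model in the units of the text: kK, kbar, 1e5 J, dm3 and mol.
R = 0.083145      # [1e5 J mol-1 kK-1], truncated as printed in the text
T_REF = 0.29815   # [kK], lower limit of the heat capacity integral

# Enthalpy of formation [1e5 J mol-1] and cp polynomial coefficients
# [1e5 J mol-1 kK-1] in increasing powers of temperature.
DFH = {"NH3": -0.45898, "N2": 0.0, "H2": 0.0}
CP = {
    "NH3": (0.27310, 0.23830, 0.17070, -0.11850),
    "N2": (0.31150, -0.13570, 0.26800, -0.11680),
    "H2": (0.27140, 0.09274, -0.13810, 0.07645),
}

def species_enthalpy(name, T):
    """Molar enthalpy: formation term plus the integrated cp polynomial."""
    integral = sum(a * (T**(k + 1) - T_REF**(k + 1)) / (k + 1)
                   for k, a in enumerate(CP[name]))
    return DFH[name] + integral

def enthalpy(T, c):
    """Total ideal gas enthalpy [1e5 J] for mole numbers c (a dict)."""
    return sum(c[name] * species_enthalpy(name, T) for name in c)

def pressure(T, v, c):
    """Ideal gas pressure [kbar] for temperature T [kK] and volume v [dm3]."""
    return sum(c.values()) * R * T / v

# Inlet state (one kilogram of gas); values assumed from the table of
# exact results at z = 0.
c_inlet = {"NH3": 4.5168, "N2": 27.1006, "H2": 81.3019}
p_inlet = pressure(0.800, 30.0438, c_inlet)
h_inlet = enthalpy(0.800, c_inlet)

if __name__ == "__main__":
    print("p [bar] =", 1000.0 * p_inlet)
    print("h [MJ]  =", 0.1 * h_inlet)
```

The two printed numbers can be compared with the pressure and enthalpy columns of the tabulated inlet state.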

As explained at the beginning of this chapter the mixture is normalized to one kilogram 

of material which implies that all enthalpies, volumes and mole numbers are reported as 

specific quantities in the upcoming tables. This fixes the size of the problem. Everything is on a mass basis. The last statement can be a little bewildering because the reaction stoichiometry is

N2 + 3 H2 = 2 NH3

which is independent of the system size. This equation reflects only the chemical stoichiometry, 

however, and not the total conversion in the system. It is the kinetics model 

that scales the chemical reaction equation to the size of the system. Now, to integrate 

through the reactor we need to know the complete intensive state of the gas mixture at 

the inlet. The initial temperature, pressure and composition (mole fractions) chosen in 

this case are: 

T◦ = 0.800 [kK] 

p◦ = 0.250 [kbar] 

z◦ = [ 0.04, 0.24, 0.72 ] [-] 

The units of thermodynamics (kK, kbar, 10^5 J, dm³ and mol) may seem curious but they are in fact judiciously selected to increase the numerical stability of the solvers. This issue is hard to explain without prior knowledge of numerical mathematics and fixed word-length computers and we shall leave it open for the interested reader. Note also that the initial pressure is a dependent variable in this case and that it must be iterated on since the thermodynamic model is explicit in volume, not in pressure. Carrying on we shall assume a uniform cooling profile along the reactor equal to ∇h = −20 [10^5 J], zero pressure drop ∇p = 0 [kbar], and first order reaction of nitrogen equal to ∇cN2 = −(4/3)cN2 [mol]. All gradients are defined per kilogram of material and per reactor length. The outcome is a set of differential equations equivalent to Case 4 in Section 1.1:

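$$\nabla y \hat{=} \nabla \begin{pmatrix} h \\ p \\ c_{\mathrm{NH_3}} \\ c_{\mathrm{N_2}} \\ c_{\mathrm{H_2}} \end{pmatrix} \to \begin{pmatrix} -20 \\ 0 \\ (8/3)\,c_{\mathrm{N_2}} \\ -(4/3)\,c_{\mathrm{N_2}} \\ -(4/1)\,c_{\mathrm{N_2}} \end{pmatrix}$$

The analytical solution is

$$y(z) \hat{=} \begin{pmatrix} h \\ p \\ c_{\mathrm{NH_3}} \\ c_{\mathrm{N_2}} \\ c_{\mathrm{H_2}} \end{pmatrix} \to \begin{pmatrix} h^\circ - 20z \\ p^\circ \\ c^\circ_{\mathrm{NH_3}} - 2(\alpha - 1)\,c^\circ_{\mathrm{N_2}} \\ \alpha\, c^\circ_{\mathrm{N_2}} \\ c^\circ_{\mathrm{H_2}} + 3(\alpha - 1)\,c^\circ_{\mathrm{N_2}} \end{pmatrix}$$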

where $\alpha \hat{=} e^{-(4/3)z}$. The enthalpy, pressure and composition profiles are easily calculated from the last formula, and by iterating on temperature and volume at each step along the reactor axis (we need in fact only one step to integrate the entire reactor) we can calculate the profiles at our discretion. E.g. dividing the reactor into 5 segments yields the following exact answer to our differential equation problem (reported in more familiar units for ease of reading):

z     T [K]     V [dm³]   h [MJ]      p [bar]   cNH3 [mol]   cN2 [mol]   cH2 [mol]
0     800.000   30.0438    1.495255   250.000    4.5168      27.1006     81.3019
0.2   882.267   29.4106    1.095255   250.000   17.2037      20.7571     62.2714
0.4   919.963   27.6941    0.695255   250.000   26.9211      15.8985     47.6954
0.6   921.796   25.4676    0.295255   250.000   34.3638      12.1771     36.5313
0.8   894.927   23.0285   −0.104745   250.000   40.0645       9.3268     27.9804
1     844.596   20.5069   −0.504745   250.000   44.4307       7.1436     21.4309
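The exact profile can be reproduced with a few lines of code. The sketch below evaluates the analytical composition and enthalpy at a given z, finds the temperature by Newton–Raphson iteration on the enthalpy (which for an ideal gas does not depend on volume) and then gets the volume from the ideal gas law at constant pressure. All names are hypothetical, the species order (NH3, N2, H2) and the inlet mole numbers are assumptions read from the first table row, and R is truncated as printed in the text.

```python
import math

R, T_REF = 0.083145, 0.29815
# Species order assumed to be (NH3, N2, H2).
DFH = (-0.45898, 0.0, 0.0)
CP = ((0.27310, 0.23830, 0.17070, -0.11850),
      (0.31150, -0.13570, 0.26800, -0.11680),
      (0.27140, 0.09274, -0.13810, 0.07645))

def cp_total(T, c):
    """Total heat capacity dh/dT [1e5 J kK-1]."""
    return sum(ci * sum(a * T**k for k, a in enumerate(coef))
               for ci, coef in zip(c, CP))

def enthalpy(T, c):
    """Total ideal gas enthalpy [1e5 J]."""
    def h_i(dfh, coef):
        return dfh + sum(a * (T**(k + 1) - T_REF**(k + 1)) / (k + 1)
                         for k, a in enumerate(coef))
    return sum(ci * h_i(dfh, coef) for ci, dfh, coef in zip(c, DFH, CP))

def exact_state(z, T0=0.800, p0=0.250, c0=(4.5168, 27.1006, 81.3019)):
    """Analytical y(z) followed by an iteration on T and v."""
    alpha = math.exp(-4.0 * z / 3.0)
    c = (c0[0] - 2.0 * (alpha - 1.0) * c0[1],
         alpha * c0[1],
         c0[2] + 3.0 * (alpha - 1.0) * c0[1])
    h = enthalpy(T0, c0) - 20.0 * z
    T = T0
    for _ in range(50):                      # Newton-Raphson on h(T) = h
        T -= (enthalpy(T, c) - h) / cp_total(T, c)
    v = sum(c) * R * T / p0                  # ideal gas law at p = p0
    return T, v, h, p0, c

if __name__ == "__main__":
    for i in range(6):
        z = 0.2 * i
        T, v, h, p, c = exact_state(z)
        print(f"{z:3.1f} {1000*T:8.3f} {v:8.4f} {0.1*h:9.6f} {1000*p:8.3f} "
              + " ".join(f"{ci:8.4f}" for ci in c))
```

The print statement converts to the units of the table (K, dm³, MJ, bar and mol) so that the rows can be compared side by side.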

The numbers printed in blue ink are the variables we want to investigate further using 

a small assortment of homemade integrators. So, integrating from z = 0 to z = 1 in 3 

steps (numbers being exact to 6 digits are printed in blue) yields: 

Method   N   T [K]     V [dm³]   h [MJ]      p [bar]
Euler    1   923.156   21.7968   −0.522353   239.498
Euler    3   928.546   21.0031   −0.504745   250.001
RK2      1   828.557   20.4743   −0.507512   248.660
RK2      3   829.427   20.3859   −0.504745   250.000
RK4      1   844.365   20.5106   −0.504997   249.918
RK4      3   844.444   20.5057   −0.504745   250.000
Exact    -   844.596   20.5069   −0.504745   250.000

We see that all the explicit methods fail: Euler-1 fails badly, RK2-1 fails less, while RK4-1 is pretty close, but they all fail. The implicit methods behave differently. Except for

Euler-3 they are all correct in their predictions of enthalpy and pressure. This means 

the energy and momentum balances are consistent with the underlying conservation 

principles. The temperature and the volume are still off which means the calculations 

are not correct—only consistent. 

By increasing the number of integration steps we may hope to rectify the situation 

and get truly correct answers. In fact, by integrating from z = 0 to z = 1 in 12 steps

(numbers being exact to 6 digits are still printed in blue) we get: 

Method   N   T [K]     V [dm³]   h [MJ]      p [bar]
Euler    1   862.454   20.7456   −0.507013   248.550
Euler    3   863.160   20.6421   −0.504745   250.000
RK2      1   843.829   20.5017   −0.504892   249.982
RK2      3   843.875   20.5014   −0.504745   250.000
RK4      1   844.595   20.5069   −0.504746   250.000
RK4      3   844.596   20.5069   −0.504745   250.000
Exact    -   844.596   20.5069   −0.504745   250.000

This time RK4-3 yields correct answers all over the line. The same resolution with RK2-3 and Euler-3 would require 380 and 500,000 steps respectively. Note: The total calculation effort is bigger because one step of RK4-3 requires 4 intermediate steps each using 3 iterations in Eq. 5. The total number of steps is then 12*4*3 = 144. For RK2-3 the total number of steps is 360*2*3 = 2160, and for Euler-3 it is 500,000*1*3 = 1,500,000. Notwithstanding the extra calculations required to fulfill the RK4 and RK2 steps, the conclusion is that higher order schemes are superior to lower order schemes (of course I should say).

An interesting spin-off from this discussion is that there is no difference between implicit and explicit problem formulations when we talk about numerical accuracy. I.e. explicit Euler and implicit Euler yield the same accuracy, as do RK2 with explicit and implicit model formulations, and the same holds for RK4. But when it comes to conservation laws we see the difference. The implicit model formulation always yields correct enthalpies and pressures whereas the explicit formulations do not. For RK4 the difference is in the last digit only, but it is nevertheless present and it is visible.
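The difference between explicit and implicit function updates can be tried out with a compact N-level Euler integrator for the reactor example. This is only a sketch under stated assumptions and not the hpn_vs_tvn_integrator() of flowsheet.py: the names are hypothetical, the state is x = (T, v, cNH3, cN2, cH2) with y = (h, p, cNH3, cN2, cH2), the Jacobian is written out by hand from the ideal gas relations, and NumPy is used for the linear algebra.

```python
import numpy as np

R, T_REF = 0.083145, 0.29815
DFH = np.array([-0.45898, 0.0, 0.0])            # order: NH3, N2, H2
CP = np.array([[0.27310, 0.23830, 0.17070, -0.11850],
               [0.31150, -0.13570, 0.26800, -0.11680],
               [0.27140, 0.09274, -0.13810, 0.07645]])
K = np.arange(4)

def h_i(T):
    """Molar enthalpies of the three species [1e5 J mol-1]."""
    return DFH + CP @ ((T**(K + 1) - T_REF**(K + 1)) / (K + 1))

def cp_i(T):
    """Molar heat capacities [1e5 J mol-1 kK-1]."""
    return CP @ T**K

def y_of_x(x):
    """Map x = (T, v, c) to y = (h, p, c)."""
    T, v, c = x[0], x[1], x[2:]
    return np.concatenate(([c @ h_i(T), c.sum() * R * T / v], c))

def jacobian(x):
    """J = dy/dx for the ideal gas model."""
    T, v, c = x[0], x[1], x[2:]
    n = c.sum()
    J = np.eye(5)
    J[0, :] = np.concatenate(([c @ cp_i(T), 0.0], h_i(T)))
    J[1, :] = np.concatenate(([n * R / v, -n * R * T / v**2],
                              np.full(3, R * T / v)))
    return J

def rhs(z, x):
    """Right side of the reactor model (cooling, no pressure drop, kinetics)."""
    c_n2 = x[3]
    return np.array([-20.0, 0.0, 8.0 / 3.0 * c_n2,
                     -4.0 / 3.0 * c_n2, -4.0 * c_n2])

def euler(x0, steps, n_iter):
    """Integrate z = 0..1 with n_iter Newton updates per step (Eq. 5)."""
    dz = 1.0 / steps
    x = np.array(x0, dtype=float)
    for i in range(steps):
        y_target = y_of_x(x) + rhs(i * dz, x) * dz
        for _ in range(n_iter):
            x = x + np.linalg.solve(jacobian(x), y_target - y_of_x(x))
    return x

X0 = np.array([0.800, 30.0438, 4.5168, 27.1006, 81.3019])

if __name__ == "__main__":
    for n_iter in (1, 3, 30):
        x = euler(X0, 3, n_iter)
        y = y_of_x(x)
        print(n_iter, 1000 * x[0], x[1], 0.1 * y[0], 1000 * y[1])
```

The printed columns are N, T [K], V [dm³], h [MJ] and p [bar]. With many iterations the enthalpy and pressure follow the conservation laws, while a single iteration leaves a visible deviation in both.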


Numerical Integration 





We don't need no...

"Another Glitch in the Call"

We don't need no indirection
We don't need no flow control
No data typing or declarations
Hey! You! Leave those lists alone!

Chorus:
All in all, it's just a pure-LISP function call.
All in all, it's just a pure-LISP function call.

Assignments

1. Finish the equation solver hpn_vs_tvn_solver() in flowsheet.py. 




python ammonia_reactor.py rk2 implicit 12 30 



python ammonia_reactor.py rk4 implicit 12 30 

3. Finish the Euler integration option in method 

hpn_vs_tvn_integrator() in flowsheet.py. 


python ammonia_reactor.py euler explicit 12 1 

python ammonia_reactor.py euler explicit 12 3 

python ammonia_reactor.py euler implicit 12 30 

5. Compare the results you've got. 

Continue reading about Modelling issues with focus on Euler and Runge-Kutta 

integration. 


5.17.1 Verbatim: “We don’t need no...” 

We don't need no indirection
We don't need no flow control
No data typing or declarations
Hey! did you leave those lists alone?
Hey hacker! Leave those lists alone!

Oh no! it's just a pure LISP function call

We don't need no compilation
We don't need no load control
No linkedit for external bindings
Hey! did you leave that source alone?
Hey hacker! Leave that source alone!

We don't need no side effecting
We don't need no flow control
No global variables for execution
Hey! did you leave the args alone?
Hey hacker! Leave the args alone!

We don't need no allocation
We don't need no special nodes
No dark bit flipping for debugging
Hey! did you leave those bits alone?
Hey hacker! Leave those bits alone!


5.17.2 flowsheet.py, see also Sec. 5.15.3 

First reference occurs in flowsheet.py, see Section 5.15.3 on page 448. 


5.17.4 flowsheet.py, see also Sec. 5.15.3 

First reference occurs in flowsheet.py, see Section 5.15.3 on page 448. 


5.17.6 Modelling issues, see also Sec. 5.15.8 

First reference occurs in Modelling issues, see Section 5.15.8 on page 462. 


Unit Testing 





Comparative Religion

Taoism: Shit happens.

Confucianism: Confucius say, "Shit happens."

Hinduism: This shit has happened before.

Protestantism: Let shit happen to someone else.

Seventh Day Adventism: No shit shall happen on Saturdays.

Jehovah's Witnesses: May we have a moment to show you some of our shit? 

Creationism: God made all shit. 

Hare Krishna: Shit happens, rama rama. 

Rastafarianism: Let's smoke this shit! 

Satanism: SNEPPAH TIHS. 

Stoicism: This shit is good for me. 

Nihilism: No shit. 

••• 

The Origin of Faeces 

Assignments 

1. Blabla 




Two Classics 

(both have appeared in many places, in many versions) 

The Origin of Faeces 

1. In the beginning was the Plan. 

2. And then came the Assumptions. 

3. And the Assumptions were without form. 

4. And the Plan was without Substance. 

5. And darkness was upon the face of the Workers. 

6. And they spoke among themselves saying, "It is a crock of shit 

and it stinks." 

7. And the Workers went unto their Supervisors and said, "It is a 

pail of dung and we cannot live with the smell." 

8. And the Supervisors went unto their Managers saying, "It is a 

container of organic waste, and it is very strong, such that none 

may abide by it." 

9. And the Managers went unto their Directors, saying, "It is a 

vessel of fertilizer, and none may abide its strength." 

10. And the Directors spoke among themselves, saying to one 

another, "It contains that which aids plant growth, and it is 

very strong." 

11. And the Directors went to the Vice Presidents, saying unto them, 

"It promotes growth, and it is very powerful." 

12. And the Vice Presidents went to the President, saying unto him, 

"This new plan will actively promote the growth and vigor of the 

company with very powerful effects." 

13. And the President looked upon the Plan and saw that it was good. 

14. And the Plan became Policy. 

15. And this is how shit happens. 

Comparative Religion 

Taoism: Shit happens. 

Confucianism: Confucius say, "Shit happens." 

Buddhism: If shit happens, it isn't really shit. 

Zen Buddhism: What is the sound of shit happening? 

Hinduism: This shit has happened before. 

Mormonism: This shit is going to happen again. 

Islam: If shit happens, it is the will of Allah. 

Catholicism: If shit happens, you deserve it. 

Calvinism: Shit happens because you don't work hard enough. 

Protestantism: Let shit happen to someone else.* 

Judaism: Why does this shit always happen to us? 

Seventh Day Adventism: No shit shall happen on Saturdays. 

Christian Science: Shit is in your mind. 

Jehovah's Witnesses: May we have a moment to show you some of our shit? 

Creationism: God made all shit.


Secular Humanism: Shit evolves. 

Oshoism: If shit happens, celebrate it. 

Scientology: If shit happens, see "Dianetics", p.157. 

Hare Krishna: Shit happens, rama rama. 

Rastafarianism: Let's smoke this shit! 

Agnostic: Shit might have happened; then again, maybe not. 

Satanism: SNEPPAH TIHS. 

Stoicism: This shit is good for me. 

Atheism: I can't believe this shit! 

Advaitism: Inquire into who it is that gives a shit 

Nihilism: No shit. 

* = you got a better one for this? 


The Final Touch 





There is always a second bug. 

From a Real Programmer's diary: 

If it's possible to make a mistake, you'll make it. 

If it's possible to forget something, you'll soon forget it. 

If it's possible to postpone a task, you'll postpone it. 

If you find a simple solution to a problem it's most likely wrong. 

Anything that walks and quacks like a duck is probably something else. 

Make a clever design and you'll end up shooting yourself in the foot. 

Never trust someone else's code and especially not your own. 

Things take time — about three times more than you expect. 

Every rule is a rule, but no rule is absolute. 

Bjørn Tore Løvfall and Tore Haug-Warberg (2004 - 2008) 

Assignments 

1. Install GNUplot and GhostScript on your computer. 

2. Download the plot files graph.plt and graph.dat. 


3. Plot the file content(s) from the command line. In a UNIX-style 

environment the commands are: 

gnuplot graph.plt 

ps2pdf graph.ps 

open graph.pdf 

The output shall be like this: graph.pdf 

4. Modify your version of ammonia_reactor.py to make it produce some 

decent GNUplot output. Make a template similar to graph.plt for plotting 

the calculated results. 

5. Have Great Fun with the tools you've got! 
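As a starting point for assignment 4, calculated profiles can be dumped in the same whitespace-separated layout as graph.dat, with a commented header line that gnuplot ignores. The sketch below is only one way of doing it: the function names and the output file name profile.dat are hypothetical, and the two data rows are the exact inlet and outlet temperatures from the table in the calculation example.

```python
def format_gnuplot_data(header, rows):
    """Return a gnuplot-readable text block: '#' header plus one row per line."""
    lines = ["# " + "  ".join(header)]
    for row in rows:
        lines.append("  ".join(f"{value:.6g}" for value in row))
    return "\n".join(lines) + "\n"

def write_gnuplot_data(path, header, rows):
    """Write the formatted block to a file that gnuplot can plot directly."""
    with open(path, "w") as handle:
        handle.write(format_gnuplot_data(header, rows))

if __name__ == "__main__":
    # Position z [-] and temperature T [K] at the reactor inlet and outlet.
    profile = [(0.0, 800.000), (1.0, 844.596)]
    write_gnuplot_data("profile.dat", ["z", "T"], profile)
```

In a plot template the file would then be read with a line such as plot "profile.dat" using 1:2, in the same way graph.plt reads graph.dat.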



5.21.1 Verbatim: “graph.plt” 

#!/sw/bin/gnuplot -persist
#
# Test script plotting a y(t) graph with error bars and
# separate boxes showing the error level. Data are loaded
# from file "graph.dat" and dumped to "graph.ps".
#
set terminal postscript \
    landscape noenhanced monochrome \
    dashed defaultplex "Helvetica" 18

set output 'graph.ps'

set title 'Testing out GNUplot'
set xlabel 'Time [s]'
set ylabel 'Measurement'

set xrange [0:9]
set yrange [0:3]
set mxtics 2
set mytics 2

set style line 1 \
    linetype 2 linewidth 4 pointsize 2 pointtype 6
set style line 2 \
    linetype 1 linewidth 1 pointsize 0

set multiplot
set style data boxes
set key left

plot "graph.dat" using 1:3 \
    title "error" linestyle 2

set style data lines
set key right

plot "graph.dat" using 1:2 \
    title "y(t)" with linespoints linestyle 1

plot "graph.dat" using 1:2:3 \
    notitle with yerrorbars linestyle 2


5.21.2 Verbatim: “graph.dat” 

# graph.dat
#
# gnuplot ignores lines that start with #
#
# t  y  error-in-y
#
0  0     0.01
1  0.25  0.1
2  0.5   0.05
3  0.75  0.4
4  1.25  0.2
5  1.30  0.3
6  1.55  0.33
7  1.80  0.1
8  2.05  0.5
9  2.0   0.2

[Figure: graph.pdf, titled "Testing out GNUplot". Measurement (0 to 3) versus Time [s] (0 to 9), showing the curve y(t) with error bars and boxes for the error level.]

5.21.5 graph.plt, see also Sec. 5.21.1 

First reference occurs in graph.plt, see Section 5.21.1 on page 498. 
