It is very dangerous to live in a society where a few people have high-level thinking skills, and the rest don’t. Democracy does not work well in such a society.
Also: People who have high-level thinking skills are generally more productive than people who don’t. As a consequence, jobs that require high-level thinking generally pay better than jobs that don’t.
Almost everybody knows how to run, after a fashion. However, if you sign up for the track team, or the soccer team, or anything like that, the coach will train you to run better, possibly a lot better.  Everybody knows how to think. It would be incorrect and insulting to tell someone they don’t know how to think. However, the fact remains that a good science class will train you to think better, possibly a lot better. 
Sometimes good thinking and good learning habits are taught implicitly, by osmosis ... but sometimes they are discussed explicitly, as a topic unto themselves. This topic is called metacognition, which means “thinking about thinking”.
That is, thinking skills are by-and-large separate from domain knowledge. To solve real-world problems in a particular domain, you need knowledge about the domain plus general thinking skills.
If you have high-level thinking skills, you can become proficient in a new domain just by learning the new domain-specific knowledge; you don’t need to learn the thinking skills all over again. For more about problem-solving in general, see reference 1.
Einstein said “An education is what remains after you have forgotten everything you learned in school.” I’m pretty sure his point was that thinking skills remain, even after all the narrow domain-specific factoids have been forgotten (or have become irrelevant).
Anecdote: Once upon a time, a friend and I were conducting sea trials in a large, brand-new sailboat. The two of us had worked together before, debugging large computer programs. As you can imagine, debugging a computer program requires a detailed understanding of the computer language ... whereas debugging a boat requires considerable knowledge about how boats work, which is quite a different body of knowledge. However, both of us were struck by the fact that we used essentially the same process in both cases. We checked the typical case, then we checked the edges of the envelope, then we checked the corners of the envelope. When we observed small anomalies, we made a note of them, and then did whatever was necessary to make them reproducible. And so forth. We both knew what had to be done, and we each knew what the other guy was thinking, which helped us to work quickly and efficiently.
I call such things “game-show tests”. They cause some serious problems, as we now discuss. (Additional discussion can be found in reference 2.)
I used to say that such tests don’t predict anything at all, but if things keep going the way they are, such tests will begin to predict success in school ... for the simple reason that success in school is being measured, more and more, by such tests. This is circular in a truly ghastly way. It encourages rote learning and discourages thinking.
In particular, we need tests that measure thinking skills.
If you do it right, kids will increase their thinking skills and enjoy it.
After years of a steady diet of such problems, students will be alarmed and recalcitrant if you suddenly assign them homework that requires nontrivial thinking. You will have to explain that your course is different from other courses, past and present. Then you will have to patiently teach them the required thinking skills. Then you can assign problems that require thinking, with gradually increasing complexity.
Some parts of the critical-thinking task are hard, but some of them are easy. Indeed you should have learned some of them in third grade, such as item 3, item 4, and item 10 in the following list.
Here are a few ideas that contribute to critical thinking, and can be applied to almost any field:
This is hugely important. A memory is not useful, and hardly even counts as a memory, if you cannot recall it when needed. Thinking about the connections increases the usefulness of each memory, by increasing the number of ways in which it can be recalled. See section 10.1 and reference 4.
For example:
More generally:
With the math, you can learn a relatively small number of powerful, versatile principles. You can understand where the rules come from and how they fit together.  Without the math, you are reduced to learning a huge number of unprincipled inflexible rules that apply in special situations. These rules have to be learned by rote, because there is no way to see where they come from or how they fit together. 
The principles are easy to remember, because they get used again and again. Every time a memory is used, it gets stronger.  Special situations arise so rarely that by the time the situation arises, you are likely to have forgotten the specialpurpose rule. 
Sometimes the indirect approach is more convenient than the direct approach. Also, as discussed in section 5.2, a devious solution can serve as a valuable cross-check on the straightforward solution.
Here are some examples, in order of increasing complexity:
a) Add 198 plus 215. 
You can easily do this in your head if you rearrange it as (215 + (200 − 2)) which is 415 − 2 which is 413.  The small point here is that by rearranging things, a lot of carrying can be avoided. 
b) Decide which is larger: 12345/12347 or 23457/23459. 
You can easily do this in your head if you rearrange it as (1 − 2/12347) versus (1 − 2/23459). Both are less than 1. The latter is clearly closer to 1.  The small point here is that by rearranging things, and by using quantitative facts about arithmetic, such as the idea that the inverse (reciprocal) is a monotonic function of a positive number, long division can be avoided. 
Note: You could perhaps consider a shorter problem, perhaps 5/7 versus 7/9.  On the other hand, the longer version makes it clear that this is meant to be a logic puzzle, not a simple division problem. It increases the value of the clever approach over the brute-force approach. The indirect, understanding-based approach is equally easy for both versions. So it’s not clear that the shorter version is doing anybody a favor. 
c) In your head, without a calculator or even pencil and paper, decide which is larger: 25/28 or 15/17. 
We can write the question mathematically as

$$ \frac{25}{28} \;\gtrless\; \frac{15}{17} \tag{1} $$

This looks simpler than the previous question, because the numbers are smaller, but it is more challenging, because the two sides are more nearly equal. 
We can rewrite this as

$$ \frac{a-3}{a} \;\gtrless\; \frac{b-2}{b} \tag{2} $$

We know the value of a and b (here a = 28 and b = 17), so we don’t need to solve for them, but for now let’s leave them as symbolic rather than numeric. 
Multiply top and bottom on the LHS by 2, and multiply top and bottom on the RHS by 3. That leaves us with

$$ \frac{2a-6}{2a} \;\gtrless\; \frac{3b-6}{3b} \tag{3} $$

Observe that this now has the same structure as the previous example (item b), and the previous reasoning can be reused.  I can do all of this in my head, in less time than it takes to find a pencil and paper. 
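The three mental-arithmetic examples above can also be cross-checked mechanically. What follows is only a minimal sketch in Python, assuming exact rational arithmetic from the standard `fractions` module; the helper names `larger_direct` and `larger_via_gap` are made up for this illustration. Each pair of fractions is compared twice: once by brute force, and once by the indirect route of asking how far each fraction falls short of 1.

```python
from fractions import Fraction

def larger_direct(p, q):
    """Brute force: return whichever exact rational value is larger."""
    return p if p > q else q

def larger_via_gap(p, q):
    """Indirect route: the fraction with the smaller shortfall from 1 is the larger one."""
    gap_p, gap_q = 1 - p, 1 - q
    return p if gap_p < gap_q else q

# Item (a): rearranging 198 as (200 - 2) avoids the carrying.
assert 198 + 215 == 215 + 200 - 2 == 413

pairs = [
    (Fraction(12345, 12347), Fraction(23457, 23459)),  # item (b)
    (Fraction(25, 28), Fraction(15, 17)),              # item (c)
]
for p, q in pairs:
    # The two routes should agree; a disagreement would signal a mistake.
    assert larger_direct(p, q) == larger_via_gap(p, q)
    print(f"{p} vs {q}: larger is {larger_direct(p, q)}; gaps are {1 - p} and {1 - q}")
```

For item (c) the gaps are 3/28 and 2/17, which is the same comparison as 6/56 versus 6/51 once the numerators are made equal.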
More importantly, these examples illustrate several larger points:
There is artistry involved in finding a “good” generalization. Equation 2 is not the only possible generalization of equation 1.
The artistry is not however a shot in the dark. Experience suggests patterns that are worth looking at. In this case there is an analogy to differential-mode signaling. On the LHS of equation 2, (a) is the common-mode signal, common to both numerator and denominator, while (3) is the differential-mode signal. Rewriting it so as to focus attention on what’s common and what’s different is a technique that you can use in lots of situations.
I tell students: You’re not Superman, and I’m not Superman either. I’m not going to teach you how to leap tall buildings in a single bound. Instead I will show you where the stairwell is. We can get to the top one step at a time. Very often when the task seems impossibly hard, you can break it down into a number of easy steps.
As I see it, question 4 is plain old arithmetic, not algebra ... whereas question 5 is algebra. It’s just barely algebra, but it’s definitely on the algebra side of the border. The point is that equation 5 can be solved using subtraction, but it doesn’t explicitly tell you that; no minus signs appear in the equation. You have to apply a little bit of reasoning before it becomes a routine subtraction problem.
It’s important to make the work checkable. Some day you will be doing something important, where a mistake could cost millions of dollars and/or put people’s lives at risk. Important work obviously needs to be checked. Therefore you should get in the habit of making your work easy to check. Even if you are doing exercises that aren’t really important, pretend they are important, and check the work. To put it bluntly, if you cannot be trusted to check your work and to make it easily checkable by others, you won’t get hired to do anything important.
The first step toward making the work checkable is to show your work. Also, to the extent that you have checked your own work, show the checks.
The same principle applies to software: Document your code. For more on this, see reference 12.
Here’s a simple example of what it means to show your work in a useful way: When doing a calculation, whenever you write down a number, write an equation or a sentence that explains what the number means. That is to say, do not write down a bunch of unlabeled numbers:
 ☠ (6) 
but instead write equations:
 (7) 
This applies not only to calculations on paper, but also to computer programs including spreadsheets. If one cell in the spreadsheet contains a number, use the neighboring cells to document what the number means.
There is a double rationale for labeling the numbers: It is partly so that you can go back and check your work ... and partly so that other folks can check your work.
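In a computer program, the same habit might look like the following sketch. The quantities and values here are invented purely for illustration; the point is only that every number carries a name, a unit, and the equation it came from, and that the check is shown along with the work.

```python
# Unlabeled version (hard to check): what do 2.5, 48 and 120.0 mean?
#   print(2.5 * 48)

# Labeled version: every number gets a name, a unit, and an equation.
trip_time_h = 2.5                 # time spent driving, in hours (made-up value)
avg_speed_km_per_h = 48           # average speed, in km/h (made-up value)
distance_km = avg_speed_km_per_h * trip_time_h   # distance = speed * time

# Show the check too: recover the speed from the computed distance.
speed_check_km_per_h = distance_km / trip_time_h
assert speed_check_km_per_h == avg_speed_km_per_h

print(f"distance = {avg_speed_km_per_h} km/h * {trip_time_h} h = {distance_km} km")
```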
This is one of the reasons why standardized multiple-guess tests are the enemy of critical thinking: Only the final answer is scored, so there is no incentive for making the intermediate steps checkable. (Arguably there is an indirect incentive for double-checking the result, because it improves the odds of getting the right answer ... but one major reason for grades in the first place is that students at the introductory level do not respond well to indirect incentives. They need more direct, immediate feedback.)
Also note that the architecture of a multiple-guess test rewards taking a slapdash approach to every problem. IMHO we urgently need better tests, and a better educational system in general, something with better rewards for meticulous work. That includes choosing a modest number of important problems, rather than a huge number of trivial problems.
Suggestion to teachers: One small step in the right direction is to arrange the scoring scheme on every quiz so that getting the numerically-correct answer is nowhere near sufficient for getting full credit. Another part of the score is for showing the work leading up to the answer, and yet another part is for double-checking the answer and showing that work as well.
Also: Sometimes it helps to assign the same question twice, perhaps as two items on the same assignment, or perhaps on two separate assignments a week apart. The point is to find a second, independent method of solution. Anybody who used method A the first time should use method B or C the second time. Anybody who used method B the first time should use method A or C the second time. And so forth. It must be emphasized that we are not redoing the problem because the first solution was wrong. The point is that there are multiple perfectly correct solutions, and we want to find them.
There are many, many ways that a solution can be verified.
There are almost always numerous constraints that must be satisfied at each step along the way. The details will vary from problem to problem, so it is your job to figure out what constraints apply to the problem at hand, and then verify that each step satisfies the constraints. If there are multiple elements in the solution-set, be sure to check each of them.
The same idea applies if you want to find the force experienced by the rider in a centrifuge, or practically any other situation involving rotational motion. Analyze it once in the lab frame and then analyze it again in the rotating frame.
Verifying the solution is valuable, but beware that it does not protect you from all possible errors. For example, if you have overlooked some elements of the solution-set, verifying whatever solutions you have found won’t alert you to the existence of other solutions. As another example, in the trigonometric example given above, if you inadvertently reverse the roles of a and b, you will get the wrong answer, but it will still satisfy the four trigonometric constraints itemized above.
Loosely speaking, any problem that requires thinking is called a puzzle or (equivalently) a riddle. Also, most puzzles have the further property that it is much harder to find a solution than it is to verify and understand a solution once it has been found. For example, consider the “eleven words in one” puzzle (reference 13). A given solution can be verified directly ... but a direct attack to find the solution would be thousands of times harder, since it would require searching through all the six-letter words in the English language.
Note: Easy verification is related to what computer scientists call the NP property. (If you don’t know what this means, don’t worry about it.) This is also related to what some puzzle aficionados call the “Aha!” property, especially if the puzzle hinges on a single point that is obvious in retrospect.
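The gap between finding and verifying can be made concrete with a toy example. The sketch below uses integer factoring as a stand-in (it is not the “eleven words in one” puzzle, and the function names `verify` and `find` are hypothetical). Checking a proposed answer takes one multiplication, whereas the deliberately naive direct attack has to try candidates one by one.

```python
def verify(n, p, q):
    """Checking a proposed solution: one multiplication and two comparisons."""
    return p > 1 and q > 1 and p * q == n

def find(n):
    """Direct attack: try candidate divisors in order, counting the attempts."""
    tries = 0
    d = 2
    while d * d <= n:
        tries += 1
        if n % d == 0:
            return d, n // d, tries
        d += 1
    return None, None, tries      # no nontrivial factorization exists

n = 1009 * 1013                   # product of two primes, chosen for illustration
p, q, tries = find(n)
assert verify(n, p, q)
print(f"found {p} * {q} = {n} after {tries} trial divisions; verifying took one multiplication")
```

Trial division is chosen here only because it is easy to read; smarter search methods exist, but the asymmetry between searching and checking remains.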
Puzzles can be classified along various axes, as we now discuss.
One axis indicates how much domain knowledge the puzzle requires. Let’s call this the K axis. There are thousands of available puzzles that are near K=0. They are completely self-contained, i.e. the statement of the problem contains all the information necessary to solve it. Good starting places include the “20 questions” game (reference 14) and the “twelve coins” puzzle (reference 15). A collection of classic puzzles by Henry Dudeney can be found in reference 16. Some of them are word puzzles, while others involve (in subtle ways) a fair bit of mathematical sophistication. From the same era and in a similar style we have the notorious Sam Loyd (reference 17). More recently there are whole series of books by the likes of Raymond Smullyan and Martin Gardner. Self-contained puzzles are useful as a starting point, so that students can get accustomed to thinking even before they have much domain knowledge. As it says in reference 18, “Children lack knowledge and experience, but not reasoning ability.”
Moving along the K axis we come to problems that are “almost” self-contained, in the sense that they depend on facts that are unstated but well-known and easy to bring to mind. Farther along this axis are problems that require some amount of domain-specific knowledge. Reference 19 is a well-known source of open-ended questions and puzzles that involve modest amounts of physics knowledge.
At the far end of the K axis we find problems that require deep and broad knowledge of the world. To illustrate the range of the K axis, consider the following contrast:
The “Who Owns the Fish” problem (reference 20) is intricate enough to scare away most people, but it is near K=0, because it is completely self-contained and well-posed. The statement of the problem contains just the information required to solve the problem ... no less, and essentially no more.  The “Mississippi Flow” problem (section 14.4) is very far from being self-contained. It requires you to rack your brain searching for information that might help solve the problem. A wide search is necessary, because seemingly very disparate tidbits of information turn out to be helpful. This is characteristic of a wide range of real-world problems. 
We can also define a B axis, which indicates to what extent a direct approach suffices, or not. The nine-dots puzzle (section 14.2) is the quintessential example and the source of the expression “outside-the-box thinking”. Other venerable examples where the direct approach fails include the dog-duck-grain problem and the orchard with 10 trees in five straight rows of four trees each.
If an indirect approach is needed, you need to use your imagination to find it, as discussed in section 4.
We can also define an H axis, which indicates how large the space of hypotheses that must be considered is. The number of hypotheses might be small, large, or infinite. Examples include
As large as those numbers are, they are still finite, so in principle one could enumerate all the possibilities by brute force. On the other hand, if you want the time you spend on it to be less than the age of the universe, you need to find an approach that is more efficient than brute force.
Similarly, the Mississippi flow problem (section 14.4) is in some sense infinite or nearly so ... not so much because there are infinitely many answers, but because there are nearly infinitely many dusty corners of your mind where you must look for potentially-useful information.
It is also worth noting that some puzzles (and many real-world problems) have multiple solutions; that is, there are multiple members of the solution set. As an elementary example, suppose the desired answer is x and we know that x² = 81. If you find an x-value that solves the equation, you may or may not have found the desired answer.
A much more challenging example is to find the complete solution-set to the “south/east/north triangle problem” (section 14.6). Many people find one solution and express absolute certainty that it is the only solution. It’s not.
For some reason that I don’t fully understand, finding one solution creates a tremendous psychological barrier to finding another solution. Perhaps this is just a result of poor training: the students have been trained to expect that every homework problem will have only one solution.
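As a small illustration of the elementary example, here is a sketch that collects the whole solution set rather than stopping at the first hit, verifies each member, and only then applies a side condition to pick out the desired answer. The search range and the side condition (x must be positive, as if it were a length) are assumptions made up for this example.

```python
def solution_set(target, search_range):
    """Collect every integer x in search_range with x*x == target, not just the first one."""
    return [x for x in search_range if x * x == target]

candidates = solution_set(81, range(-100, 101))

# Verify each member of the solution set individually.
assert all(x * x == 81 for x in candidates)

# Hypothetical side condition: the desired answer is a length, so it must be positive.
desired = [x for x in candidates if x > 0]

print("all solutions:", candidates)
print("solutions meeting the side condition:", desired)
```

Stopping at the first solution found (here, the negative one) would have produced a number that satisfies the equation but is not the desired answer.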
We now turn to a topic that is somewhat related but somewhat different, namely methods of solution. (This topic was introduced in section 4.) For example, there are two completely independent ways of finding how much water flows in the Mississippi. That means we can ask questions at two different metaphysical levels:
Question (a) has essentially only one answer, but question (b) has a solution-set with at least two members.
Again it seems that finding one answer to question (b) creates a tremendous psychological barrier to finding another answer.
It must be emphasized that being able to solve question (a) in two different ways is a tremendously valuable skill, because it vastly decreases the chance of making an undetected error.
We now consider problems that are underspecified, overspecified, or otherwise ill-posed. The most troublesome kind of ill-posed problems involve inconsistencies. That is, sometimes the “facts” you’re working with are not entirely true.
To deal with such problems, you need to move beyond black-and-white notions of true-and-false; instead you need to weigh the probabilities. Similarly, you are no longer dealing with facts; instead you are weighing the evidence.
Some of the inconsistencies are exogenous, i.e. they come from what other people have told you. Other inconsistencies are endogenous, i.e. they come from assumptions that you have made on your own.
Some “recreational” puzzles, especially those that involve outside-the-box thinking, are useful for developing a subset of critical thinking skills, because they tempt you to make false assumptions, and force you to question your assumptions.
On the other hand, the overwhelming majority of “recreational” puzzles are well-posed, which means they don’t really exercise the full range of critical thinking skills.
For more discussion of ill-posed problems, see reference 22.
It is a classic student-pilot mistake to flip down the handle that controls the landing gear, and then announce that the landing gear is down. We agree that the handle is down, but there are many ways in which the landing gear itself might fail to come down. (This kind of failure is rare, but it is the pilot’s job to defend against rare-but-important failures.)
We agree that the handle in question is meant to symbolize the landing gear. It is even shaped like a miniature landing gear, as shown in figure 1. However, the handle is just a symbol; the real landing gear is something dramatically different. Wanting the landing gear to be down is not sufficient. Putting the handle down is not sufficient. It is the pilot’s job to verify that the actual landing gear is down and locked.
People are normally good at symbolism. It is one of the things that makes us human. In figure 2, the dolls’ eyes are made of buttons, but we are not supposed to think of them as buttons. We are expected to look past the symbol and think in terms of the thing symbolized, namely eyes.
In the same way, numerals are different from numbers. The numeral is a symbol that merely symbolizes the actual number. Any given number can be symbolized in many different ways, including seventeen, 17, xvii, 0x11, 17.0000, et cetera.
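The numeral/number distinction shows up directly in software, where several different strings (numerals) parse to one and the same number. The following is a minimal sketch; `roman_to_int` is a hypothetical helper written for this illustration, not a library function, and it does no validation of its input.

```python
def roman_to_int(numeral):
    """Convert a Roman numeral such as 'xvii' to the number it symbolizes."""
    values = {"i": 1, "v": 5, "x": 10, "l": 50, "c": 100, "d": 500, "m": 1000}
    digits = [values[ch] for ch in numeral.lower()]
    total = 0
    for i, v in enumerate(digits):
        # A smaller digit written before a larger one is subtracted (as in 'iv').
        if i + 1 < len(digits) and v < digits[i + 1]:
            total -= v
        else:
            total += v
    return total

numerals = ["17", "0x11", "17.0000", "xvii"]          # four different symbols
numbers = [
    int("17"),
    int("0x11", 16),       # hexadecimal numeral
    float("17.0000"),
    roman_to_int("xvii"),
]

assert len(set(numerals)) == 4            # the numerals are all different strings
assert all(n == 17 for n in numbers)      # yet they all denote the same number
```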
The idea of symbolism has been pretty well understood for at least 2000 years; see reference 23 and reference 24.
Proper symbolism should never be taken for granted. By way of counterexample, a pocket calculator does arithmetic without understanding what the numbers mean. It does not really deal with numbers at all; instead it unthinkingly manipulates the numerals according to cut-and-dried rules.
Sometimes you can get away with manipulating empty symbols, but sometimes you can’t. Anything resembling critical reasoning requires looking past the symbol to understand the thing symbolized. In the context of actual thinking, the rule is:
Here’s another example: There is an abstract, ideal thing called a derivative. Like prisoners in Plato’s cave, we never get to see the real thing, but we do get to see various manifestations of it. For starters, we get to see symbols such as dy/dx.
It is distressingly common for students to recognize dy/dx as a derivative, but not recognize dE/dS as a derivative. This suggests that they have not been watching the shadows closely enough, because on Tuesdays the derivative projects out as dy/dx, and on Wednesdays it projects out as dE/dS, et cetera. As always, the goal is to look beyond the shadows to understand the big picture. You need to wrap your head around the abstract, ideal, Platonic derivative. When you understand the derivative, you know that it projects out as all sorts of things of the form d(⋯)/d(⋯), acting on some variables. The idea of derivative transcends the choice of variables.
The nice thing about math and physics is that we are much better off than those old Greek guys. We are better off because our cave has multiple walls, allowing us to look at multiple projections of the same ideal thing. For example: On one wall the derivative projects out as symbols of the form d(⋯)/d(⋯). Meanwhile, on another wall, it projects out as a diagram with a tangent vector that indicates the instantaneous slope. On yet another wall, dE projects out as the gradient vector (aka the exterior derivative), which we can diagram as a set of contours of constant E. Then dE = T dS can be considered a statement about proportionality between two vectors in some highly abstract space. In any case, your job is to look past all these representations to see the referent, to see the thing being represented.
A closely-related issue is discussed in section 8.2.
On Madison Avenue, there is a saying: “The sizzle sells the steak”.
The problem is that Madison Avenue is infamous for applying sizzle to products that do not deserve it. However, that is a dangerous oversimplification, because there are in fact four possibilities:
                 wholesome steak     unwholesome steak
  sizzle:               A                    B
  no sizzle:            C                    D
This is related to what we discussed in section 8.1, insofar as the sizzle is only a symbol that is meant to call attention to the steak.
Logical thinking requires considering all four possibilities. In contrast, it is a common mistake to think that there are only two possibilities, namely B and C. Sometimes people get so disgusted by bad products with deceptive advertising that they make a knee-jerk transition to the diagonally-opposite policy, namely a good product with little or no advertising. This is a bonehead move. The correct move is not diagonal, but rather horizontal, from B to A. You can perfectly well apply sizzle to a nice wholesome steak.
More generally, the point is: Do not assume two variables are linked. Some variables are linked, but some are not. If they are linked, there may be tradeoffs; you may have to optimize one variable at the expense of the other. However, if they are unlinked, you may be able to optimize both of them simultaneously.
Here is another example, applied to persuasion in general. Category B might be called empty rhetoric ... but keep in mind that not all rhetoric is empty.
                   true and sincere     bogus and insincere
  persuasive:             A                     B
  unpersuasive:           C                     D
Here is yet another example, applied to politics. Category B might be called dirty politics ... but keep in mind that not all politics is dirty. You want to find a politician who has good policies and enough charisma and political strength to get people to go along with those policies.
                         good policy     bad policy
  political strength:         A               B
  political weakness:         C               D
Very commonly, there are two or more concepts that are not clearly distinguished. This is a major obstacle to correct thinking.
Often the ambiguity is reflected in the names we give things. That is to say, there are multiple different things represented by the same symbol. For example, in physics there are two different things that “gravity” might mean, and at least four different things that “heat” might mean. In math, there are at least three different things that “closed” might mean, and two different things that “primitive” might mean. And so on. Sometimes it is possible to figure out from context which meaning should apply, but this is laborious and error-prone. As a specific example, a well-known cryptography book quoted one definition of “primitive” in a context where the other definition was required. Additional examples of ambiguous and/or misleading terminology can be found in reference 25.
Sometimes you can fix this problem by using different words, for example replacing the word “heat” with TdS, if that’s the intended meaning.
Sometimes you can fix the problem by adding adjectives to the old words, for example talking about “massogenic gravity” versus “framative gravity”.
By way of example, suppose you were given the measured points shown in figure 3, and asked to fit a sine wave to them.
The obvious solution to this problem is shown in figure 4. That looks like a good fit. The amplitude, frequency, and phase of the fitted function are determined to high precision, according to the ordinary widely-used curve-fitting formulas.
Even so, some crucial questions remain: How sure are you that this is the right answer? How well does this fitted function predict the position of the next measured point? These are tricky questions, because an unrestricted search for the sine wave that best fits the points is almost certainly not the best way to predict the next point. Figure 5 is the key to understanding why this is so.
It turns out that for almost any set of points, you can always find some sine wave that goes through the points, as closely as you please, if you make the frequency high enough. However, this ordinarily results in extreme overfitting. You’re fitting to the noise, rather than averaging out the noise. The resulting overfitted sine wave will be useless for predicting the next point. Another term that gets applied to this concept is bias-variance tradeoff. These concepts can be quantified and formalized using the Vapnik-Chervonenkis (VC) dimensionality and related ideas. A sine wave has an infinite VC dimensionality.
The sine wave stands in contrast to a polynomial with N adjustable coefficients, for which the VC dimensionality is at most N. That means if you fit the polynomial to a large number of points, large compared to N, the coefficients will be well determined and the polynomial will be a good predictor.
The distinction is that a sine wave can encode an infinite amount of information in the trailing digits of the fitting parameters, whereas a polynomial cannot. The trailing digits of the coefficients of the polynomial are truly insignificant. Your intuition about limiting the number of parameters (Ockham’s razor) fails miserably for sine waves.
There are some deep ideas here, ideas of proof, disproof, predictive power, et cetera. For more on this, see the machine-learning literature, especially PAC learning. Reference 26 is a good place to start.
This sine-wave example calls attention to the fact that the family of fitting functions we are using (sine waves with adjustable amplitude, frequency, and phase) has an infinite VC dimensionality, even though there are only three adjustable parameters. We see that three data points – or even a couple dozen data points – are nowhere near sufficient to pin down these three parameters. This tells us that VC dimensionality is the important concept, and “number of parameters” is only an approximate concept, sometimes valid but definitely not always.
For a polynomial, the VC dimensionality is just the number of parameters, which makes sense. So you may be wondering how sin(ωt+φ), with only two parameters (ω and φ), can have an infinite VC dimensionality. Here’s the trick: The sine takes its argument modulo 2π. When the frequency is high, this can be used to pick out the thousandth or the millionth insignificant digit of t. You can use the frequency (if it’s high enough) almost like a computer program, and get it to do anything you want.
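The digit-picking trick can be demonstrated numerically. The sketch below uses a standard textbook construction for the classifier sign(sin(ωt)), with the phase φ fixed at zero; it is a classification example rather than the curve fit in the figures, and the function names and the choice of five sample points are assumptions made for illustration. The sample times are t = 0.1, 0.01, 0.001, ... and the desired ±1 labels are packed into the decimal digits of a single frequency.

```python
import math
from itertools import product

def frequency_for(labels):
    """Encode the desired +1/-1 labels in the decimal digits of one frequency."""
    # Digit i of k is 1 exactly when label i should be -1.
    k = 1 + sum(10 ** i for i, y in enumerate(labels, start=1) if y == -1)
    return math.pi * k

def predict(omega, t):
    """Classify the point t by the sign of sin(omega * t)."""
    return 1 if math.sin(omega * t) > 0 else -1

m = 5
ts = [10.0 ** -i for i in range(1, m + 1)]    # sample times 0.1, 0.01, ..., 1e-5

# Every one of the 2**m possible labelings is reproduced by some frequency.
for labels in product([1, -1], repeat=m):
    omega = frequency_for(labels)
    assert tuple(predict(omega, t) for t in ts) == labels

print(f"all {2 ** m} labelings of {m} points were fitted with a single parameter")
```

Multiplying ω by t = 10⁻ʲ shifts the decimal point so that digit j decides whether the argument lands in the first or second half of the sine’s period; higher digits contribute whole periods and lower digits only a small positive offset. The same construction works for any number of points, limited in practice only by floating-point precision.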
Another example of what can go wrong is shown in figure 6. The black curve represents the raw data. We have lots and lots of data points, with very high precision. We know a priori that the area under the black curve is the sum of two rectangles – a red rectangle and a blue rectangle. All we need to do is a simple fit, to determine the height, width, and center of the two rectangles. As you can see from the figure, there are two equally good solutions. There are two equally perfect fits. Alas, this leaves us with very considerable uncertainty about the area, width, and center of the blue rectangle.
Some problems in this category can be solved by introducing some sort of regularizer, as discussed in reference 22.
Ill-posed questions, especially when they are not super-obviously ill-posed, create lots of opportunities for people to fool themselves into “knowing” that they have “the” answer (when in fact they have not considered all the possibilities). Some examples can be found in reference 27.
For more about problem-solving in general, see reference 1.
In the words of the famous philosopher Don Schlitz:

The school experience – especially the standardized “game-show” testing discussed in section 2 – gives many people the destructive idea that if it takes more than 45 seconds to solve a problem, they should give up. In the real world, you don’t get 40 questions in 30 minutes. That’s off by multiple orders of magnitude. More commonly you get 4 questions in 300 minutes, or something even beyond that. Therefore you must learn not to give up too soon.  At some point you should give up. You don’t want to spend the rest of your life stuck on some problem that you can’t solve. If you don’t want to give up entirely, you can set the problem aside temporarily, and return to it later, after you have acquired more knowledge and skill. 
If you give up on the main goal you are admitting defeat. Many people are too quick to give up on the main goal.  Many problems require exploring the possibilities. That involves choosing tentative, hypothetical subgoals. If such a hypothesis doesn’t work out satisfactorily, you need to backtrack and redo the analysis, choosing the next item from the list of hypotheses. Many people are too slow to give up on an untenable hypothesis (and therefore too slow to begin consideration of alternative hypotheses). 
The process of exploring the hypotheses can often be formalized as searching a tree. More formally, this is called combinatorial search. Solving a Sudoku puzzle is a classic example of combinatorial search. Also, many chess problems involve searching a tree. Another example is exploring a maze. Giving up on deadend subgoals is absolutely necessary for making progress toward the main goal.
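As a rough illustration of the maze example, here is a minimal depth-first search with backtracking in Python. The maze layout and the name `solve_maze` are made up for this sketch. The thing to notice is the `path.pop()` line: that is where a dead-end subgoal is abandoned, so that the search can keep making progress toward the main goal.

```python
def solve_maze(maze, start, goal):
    """Depth-first search with backtracking.

    maze is a list of equal-length strings; '#' marks a wall.
    Returns a list of (row, col) cells from start to goal, or None.
    """
    rows, cols = len(maze), len(maze[0])
    visited = set()
    path = []

    def explore(cell):
        r, c = cell
        if not (0 <= r < rows and 0 <= c < cols):
            return False                      # off the board
        if maze[r][c] == "#" or cell in visited:
            return False                      # wall, or already tried
        visited.add(cell)
        path.append(cell)                     # adopt a tentative subgoal
        if cell == goal:
            return True
        for dr, dc in ((0, 1), (1, 0), (0, -1), (-1, 0)):
            if explore((r + dr, c + dc)):
                return True
        path.pop()                            # dead end: give up on this subgoal
        return False

    return path if explore(start) else None


# Hypothetical maze: S = start, G = goal, '#' = wall.
MAZE = [
    "S.#.",
    ".##.",
    "...G",
]

route = solve_maze(MAZE, (0, 0), (2, 3))
print(route)
```

In this maze the search first tries the cell to the right of the start, finds that it leads nowhere, backs out, and then succeeds by going down instead.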
You will get into all sorts of trouble if you pay attention only to evidence that supports your favorite hypothesis. (This is called “selecting the data” and it is a Bad Thing.) For each hypothesis, you should check equally diligently for supporting evidence and conflicting evidence.
Indeed, the most dangerous models are the ones that get the right answer 99% of the time, in “normal” situations – and then betray you in some rare but critical situation.
If you leave non-experts to figure stuff out on their own, they run the risk of latching on to some oversimplified theory that explains some undersized set of data. You can’t fault them for this. At this stage they have no way of knowing that their simple theory will not survive when more data becomes available.
Indeed, sometimes solving a small instance of the problem puts you in a position to solve all larger instances by induction.
We want knowledge to be useful, which means you must be able to apply the knowledge when needed. We want memory to be useful, which means you must be able to recall the memory when needed.
The key to having a good memory is to put stuff in to your memory in such a way as to facilitate getting it out of your memory when needed.
In the 1890s William James (reference 4) described thought memory in terms of the associations between ideas:
Each of the associates is a hook to which [the memory] hangs, a means to fish it up when sunk below the surface. Together they form a network of attachments by which it is woven into the entire tissue of our thought. The ’secret of a good memory’ is thus the secret of forming diverse and multiple associations with every fact we care to retain. But this forming of associations with a fact, – what is it but thinking about the fact as much as possible? Briefly, then, of two men with the same outward experiences, the one who thinks over his experiences most, and weaves them into the most systematic relations with each other, will be the one with the best memory.
[Italics in the original.] This (along with lots of other evidence) tells us that thought and memory are inextricably intertwined. Analytical thought requires facts; otherwise there is nothing to analyze. Conversely, the process of memorizing and recalling facts is itself a thought process ... and if done right, it is a rather deep thought process.
The cycle time for neurons in the human brain is on the order of 20 milliseconds. That’s 50 cycles per second. In contrast, there are many thousands of facts that you can recall in less than a second, including names, phone numbers, vocabulary words, et cetera. That should make it obvious that recall is not a serial process. You do not run down a sequential list of facts, looking for the one you want. Instead, the recall process is massively parallel. Roughly speaking, it’s like posing a question to a very large audience: If any one person knows the answer, they raise their hand, and you receive the answer very quickly.
In item 10 in section 9, the advice was to “figure it out”. Sometimes that’s exactly the right advice, especially if you are not accustomed to figuring things out, or if you didn’t realize that a given problem was figureoutable.  Once you have begun the process of figuring things out, having somebody tell you “Figure it out” is no longer of much help. 
Learning to be smart is a long-term process. It is a lifestyle, in much the same way that the habit of healthy eating and regular exercise is a lifestyle.  By the time you are facing a hard problem with a short deadline, it is too late. You cannot make yourself significantly smarter in the time available. 
If you want to improve your ability to recall things, you need to plan ahead. You need to make an extra effort at the time you encounter each new idea.  If you wait until you need to recall the idea, making an extra effort at that time won’t help much. It’s too late. 
The idea is to live your life in such a way that you continually get smarter. One of the key ingredients is this: Every time you hear a new idea you should turn it over in your mind, looking for ways in which it is connected to other ideas. Make a note of which old ideas are consistent with – or inconsistent with – the new idea. The process of mulling ideas and looking for connections takes time and effort. Over a period of weeks and months and years, these connections will make you smarter. They will make you more prepared to handle hard problems.
Let’s be clear that we are talking about something very odd: Forming a useful memory requires making a conscious effort to teach your subconscious what to do. The recall process is mostly subconscious ... but by consciously mulling over an idea you can form the connections that will allow the subconscious processes to work properly, later, when they are needed.
Now that we have discussed useful learning, let’s say a few words about useless learning. As a first example, relying on cramming is a bad idea. Anything you learn in a couple of days you will forget a couple of days later. Cramming might improve your grade this semester, but it guarantees you will crash and burn next semester. If a “cram” situation ever arises, it tells you something very important: It means you need to change your lifestyle and change your learning habits so you never get into that situation again.
As another example: Rote learning allows you to recall an idea in one way. As such, it counts as a memory, but it’s not a very useful memory. In contrast, mulling over an idea and establishing connections to other ideas allows you to recall the idea in 100 different ways, which is 100 times more useful.
Reference 7 explains how a scaling argument based on figure 7 can be used to figure out the formula for the area of an ellipse.
This leaves us with multiple ways of figuring out the area of an ellipse: You could just plain remember the formula from high-school geometry, and/or you could look it up, and/or you could easily reconstruct it whenever it is needed.
I know some successful physicists who have quite bad memories. They carefully remember a few fundamental facts, and rederive everything else on an as-needed basis. For example, with a little practice, you can rederive the formula for the area of an ellipse faster than most people can recall it from memory (and with less probability of error).
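For concreteness, one common form of the scaling argument runs as follows. This is a sketch of the general idea, and reference 7 may present it differently. Start with a circle of radius r. Stretch it by a factor of a/r in the x direction and b/r in the y direction, which produces an ellipse with semi-axes a and b. Stretching multiplies every area by the product of the two stretch factors, so:

```latex
A_{\mathrm{circle}} = \pi r^{2}
\quad\Longrightarrow\quad
A_{\mathrm{ellipse}} = \frac{a}{r}\cdot\frac{b}{r}\cdot\pi r^{2} = \pi a b
```

As a check, setting a = b = r recovers the area of the circle.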
It may be that some people develop extrasharp thinking skills as a way of compensating for bad memory ... in analogy to the way that blind persons often develop extrasharp hearing skills. However, I am not going to recommend bad memory any more than I would recommend blindness. Memory is a valuable skill. Obviously it is best to have a good memory and good thinking skills.
Feynman said that knowledge is like a grand tapestry. A forgotten fact is like a hole in the tapestry. You should be able to repair the hole in several different ways, by reweaving down from the top, or up from the bottom, or in from the sides. Any important fact can be rederived in numerous ways, because the things we know are interconnected in innumerable ways.
Therefore: You should practice rederiving things. Even if it is something that you remember, rederive it anyway. This provides multiple advantages: First, it serves as a crosscheck on your memory. Secondly, it builds up your thinking skills. Thirdly, it improves your understanding and recall of facts related to the one you are looking for, by exercising the allimportant connections between facts.
Remember that any important formula should be derivable in multiple different ways, so if you derived it one way last time, try to derive it another way next time.
Some things can’t be derived, so you just have to remember them.
Conversely, some things can’t be remembered, so you just have to figure them out. In particular, if/when you visit unexplored territory, it is nice to be able to derive new formulas on the spot. It is a really good feeling to know that even though you are in unexplored territory, you are not lost. Based on your good thinking skills, you can move around more freely than most people do in familiar territory.
In contrast, the guy who tries to get by on memory alone, to the neglect of good thinking skills, will get seriously stuck as soon as he sets foot in unexplored territory, because the facts he needs are nowhere in his memory.
Last but not least: There is no clearcut distinction between remembering something and figuring it out. If there is any distinction at all, it is of zero importance. Memory is itself a thought process. Sometimes it is a subconscious process, and sometimes it is a recognizably conscious process, but there is no important distinction. As an example, if I need to know the square root of 40, I can never remember the numerical value, but I know at least two ways of figuring it out in my head. I can figure it out to 1% accuracy in less time than it takes to talk about it, and figure it out to 0.1% accuracy almost as quickly. There’s thinking involved, but not much in the way of creative thinking, because I know exactly what procedures to use. You could ask whether this counts as memory, or as thinking, or both ... but the answer doesn’t matter.
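The text does not say which two procedures are meant, so the following is only one plausible candidate: the Babylonian (Newton) iteration, starting from the nearby perfect square 36. The function name `newton_sqrt_steps` is hypothetical. One step gives 19/3, which is within roughly 0.14% of the true value, and a second step does far better than 0.1%.

```python
import math


def newton_sqrt_steps(n, guess, steps):
    """Return the starting guess followed by successive Newton estimates of sqrt(n)."""
    estimates = [guess]
    for _ in range(steps):
        # Average the guess with n/guess; this is Newton's method for x*x = n.
        guess = (guess + n / guess) / 2
        estimates.append(guess)
    return estimates


# Start from 6, since 6*6 = 36 is the nearest perfect square below 40.
estimates = newton_sqrt_steps(40, 6, 2)
exact = math.sqrt(40)
for x in estimates:
    rel_err = abs(x - exact) / exact
    print(f"{x:.5f}  relative error {rel_err:.1e}")
```

The first step, 6 + 4/12, is easy to do mentally; the second takes a bit more effort but is still a fixed, well-understood procedure.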
Once upon a time, there was a sophomore who heard that fruits and vegetables are good for you. So he ate nothing but apples and celery for three months. Then he died.
Some members of the community reacted by saying “Apples are corruption! Celery is emblematic of everything that is wrong with society today! We must destroy all fruits and vegetables immediately!”
I beg to differ. I still think fruits and vegetables are good for you. The problem was not what the guy ate ... the problem was what the guy didn’t eat. Do not confuse the presence of apples and celery with the absence of a balanced diet.
Let’s turn our attention now to understanding, and how it relates to algorithms, mnemonics, formalism, et cetera.
I’ve heard math teachers tell me, in all seriousness, that long division is evil, because it is an algorithm, and all algorithms are mindless. (If you want to know how I do long division, see reference 8.)  I get really tired of hearing that. 
I’ve heard chemistry teachers tell me that students should not be allowed to use Gaussian elimination to balance chemical reaction equations, even when the number of variables is huge, because that would be an algorithm, and all algorithms are rote, and rote is evil.  I get really tired of hearing that, too. 
I’ve heard physics teachers, in all seriousness, use the word “algorithms” as an antonym for understanding.  I get really tired of hearing that, too. 
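For readers who have not seen it, here is roughly what balancing a reaction by Gaussian elimination looks like. This is a minimal sketch, not a general-purpose tool: the function name `balance` is hypothetical, and the code assumes the reaction has exactly one free variable (a one-dimensional solution space). Each row of the matrix is an element, each column is a chemical species, and the product columns are negated, so that a balanced reaction is a vector that the matrix maps to zero.

```python
from fractions import Fraction
from math import lcm


def balance(matrix):
    """Smallest positive integer coefficients for one reaction.

    matrix[i][j] = number of atoms of element i in species j,
    with the columns for products negated.
    Assumes exactly one free variable.
    """
    rows = [[Fraction(v) for v in row] for row in matrix]
    n_cols = len(rows[0])
    pivots = []
    r = 0
    for c in range(n_cols):
        # Find a row at or below r with a nonzero entry in column c.
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        rows[r] = [v / rows[r][c] for v in rows[r]]        # scale pivot to 1
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                factor = rows[i][c]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        pivots.append(c)
        r += 1
        if r == len(rows):
            break

    free = [c for c in range(n_cols) if c not in pivots]
    if len(free) != 1:
        raise ValueError("expected exactly one free variable")
    f = free[0]

    # Set the free variable to 1 and read off the pivot variables.
    solution = [Fraction(0)] * n_cols
    solution[f] = Fraction(1)
    for row_index, c in enumerate(pivots):
        solution[c] = -rows[row_index][f]

    # Clear denominators to get whole-number coefficients.
    scale = lcm(*(v.denominator for v in solution))
    coeffs = [int(v * scale) for v in solution]
    if any(k <= 0 for k in coeffs):
        raise ValueError("no all-positive solution found")
    return coeffs


# Propane combustion: C3H8 + O2 -> CO2 + H2O
# Columns: C3H8, O2, CO2, H2O.  Rows: C, H, O.
propane = [
    [3, 0, -1, 0],
    [8, 0, 0, -2],
    [0, 2, -2, -1],
]
print(balance(propane))
```

Using the tool still requires understanding: you have to know that atoms are conserved, set up the matrix correctly, and recognize when the one-free-variable assumption fails.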
My point is that properly-chosen algorithms / mnemonics / equations / procedures / formalisms / methods are good for you. Really they are ... just like fruits and vegetables are good for you, as part of a balanced diet. If a student has some formal tools but lacks a gut feeling for how things work, the problem is not what the student has ... the problem is what the student doesn’t have.
Everyone needs a balanced diet. That is, everyone needs gut feelings and formalism.
Real understanding is represented by point B, in the upperright corner, where there is a high level of feeling for the subject backed up by a high level of rigor.
As indicated by the red and blue arrows, you don’t get to the goal in one step. You start out with a little bit of feeling and a little bit of formalism. They reinforce each other and provide a foundation for the next step. The red leverages the blue and the blue leverages the red. And so you itsybitsyspider your way up and over toward point B.
Let’s be clear:
The problem is not what the students have; the problem is what they don’t have. They don’t have a feeling for the subject.
This situation is represented by point D in figure 8. It sometimes goes by the name “rigor mortis”, which is a pretty good name for rigor without feeling.
This manifests itself in many ways. As an example, sometimes people sling buzzwords around without any real understanding. If they had checked their feelings against the theory, they would have known their feelings were nonsense.
Many additional examples are classified under the educationalese term “negative transference”. That means your gut feeling based on experience in one domain might give you the wrong answer when applied in another domain.
I’m not saying that gut feelings are bad. I’m saying that gut feelings have to be checked against the facts.
Red Queen: “Why, sometimes I’ve believed as many as six impossible things before breakfast.”
— Lewis Carroll
Also, I’m saying that sometimes having some sophistication gives you useful information about the limits of validity of your gut feelings.
Lady Thiang: “This is a man who thinks with his heart, His heart is not always wise.”
— Oscar Hammerstein
This sheds some light on the so-called “new math” and its relationship to “old math”, which has remained an unsettled issue since the 1960s. (If you’re interested in the history of this, reference 29 is a reasonably informative, non-hysterical, non-polemical news article.) This issue is commonly referred to as the “Math Wars” but I don’t like to use that term. The warlike aspects are a discredit to everyone involved. The sensible approach is to use smart, efficient algorithms^{2} and to understand the principles involved.
Some people object to algorithms because they can be memorized. Of course algorithms can be memorized, but that’s usually irrelevant, sometimes an advantage, and never a disadvantage. As mentioned in section 10.2, I don’t recommend doing away with memory, for the same reason I don’t recommend blindness. Memory is not the opposite of thought, nor the enemy of thought. Using an algorithm is not necessarily the nonthoughtful approach; usually it is the most thoughtful approach. Algorithms are like tools. When I tighten a bolt, I use a wrench. That does not make me any less skillful than the guy who tries to tighten the bolt with his bare hands. I’m allowed to use the wrench, even though I didn’t invent it or even manufacture it.
Continuing that thought: There have been many occasions where I did invent and construct a specialized wrench or other tool to solve a specialized problem. Building custom tools and jigs requires an investment, but often this approach pays off handsomely, leading to overall faster and better results, compared to the brute-force head-on approach.
It is always possible to learn an algorithm in a mindless way, and to apply the algorithm by rote. That’s unsurprising, because any tool can be abused. Similarly equations can be abused by students who plug and chug, without any thought as to what the symbols mean. However:
You should never use “equation” as a synonym for plug-and-chug. You should never use “algorithm” as a synonym for mindless. You should never use “systematic” as a synonym for rote.  If you mean rote, say “rote”. If you mean mindless, say “mindless”. If you mean plug-and-chug, say “plug-and-chug”. 
Having a tool does not oblige you to abuse the tool.  You must not blame the presence of one tool for the absence of another. 
A tool that is well-suited for “Task A” might be laughably ill-suited for “Task B” – and vice versa. It’s your job to figure out which tool to use for the task at hand. This requires judgement.
This is relevant to any discussion of why it pays to be smart and well educated. Any dummy can do a crude numerical integration, but once you start doing it there is a tremendous incentive to use a smarter algorithm. If you don’t know some physics, you have no idea what phase space is or why a symplectic integrator is advantageous. If you don’t know some algebra, you can’t even be part of the conversation about second-order versus first-order.
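To see why a symplectic integrator is worth knowing about, consider a unit-mass, unit-frequency harmonic oscillator, whose energy should stay at exactly 0.5 forever. The sketch below (with hypothetical function names, and a step size and step count chosen arbitrarily) compares the crude explicit Euler method with the symplectic Euler method. Both are first-order; the only difference is that the symplectic version uses the updated velocity when updating the position.

```python
def explicit_euler(x, v, dt):
    """Crude first-order step: both updates use the old values."""
    return x + v * dt, v - x * dt


def symplectic_euler(x, v, dt):
    """First-order symplectic step: update v first, then use the new v for x."""
    v_new = v - x * dt
    return x + v_new * dt, v_new


def energy(x, v):
    """Total energy of a unit-mass, unit-frequency harmonic oscillator."""
    return 0.5 * (x * x + v * v)


def final_energy(step, dt=0.1, n_steps=1000):
    """Integrate from x=1, v=0 and report the energy at the end."""
    x, v = 1.0, 0.0
    for _ in range(n_steps):
        x, v = step(x, v, dt)
    return energy(x, v)


print("exact energy:     0.5")
print("explicit Euler:  ", final_energy(explicit_euler))
print("symplectic Euler:", final_energy(symplectic_euler))
```

With these settings the explicit Euler energy grows by a factor of thousands, while the symplectic Euler energy merely wobbles within a few percent of the correct value. Going to a second-order method is a separate improvement on top of that.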
An algorithm is like a tool, much like a piano is a tool for making music. Having a powerful tool doesn’t automagically make me a good musician, but it doesn’t make me worse. Having the tool does not preclude me from having a feel for the music, based on artistry as well as understanding.
So it is with physics formalism and mathematical algorithms: Using the tool does not automagically bestow understanding, but neither does it detract. By using the tool wisely, you can get a lot more done.
When I teach a kid to ride a bike I do not first require him to build the bike from scratch. When I teach a kid to play the piano I do not first require him to build the piano from scratch. By the same token, when teaching kids to do physics, I do not require them to reinvent all the formalism from scratch.
I often cook without a recipe, improvising based on what’s available and on the whim of the moment. On the other hand, reading a cookbook won’t make me a worse cook, and quite likely will make me safer, make me more efficient, and produce a better end-product.
Algorithms do not prevent understanding, but neither do they bestow it ... or even require it.
Physics formalism is very powerful. Everybody, present company not excepted, uses the formalism without fully understanding it. Feynman said he didn’t really understand quantum mechanics (even though, relatively speaking, he understood it better than anyone else alive). Newton did not pretend to attribute any deep meaning to the law of universal gravitation (“hypotheses non fingo”). I daresay a lot of other folks are in the same boat or worse, yet we are proficient at using the formalism to get good results.
In any case, as the proverb says, it is a poor workman who blames his tools (or his algorithms).
We should also say a few words about crutches:
Sometimes there is a legitimate need for a crutch. That can happen if somebody has a broken leg. Even then, the crutch is appropriate only after you have taken direct action to treat the underlying malady, and provided you have briefed the user on the correct usage and limitations of the crutch.
On the other hand, crutches can actually cause secondary injuries, especially if overused or abused. For a person with normal abilities, a crutch is worse than useless. It gets in the way, and hinders development of normal performance.
So ... there are upsides and downsides to crutches. We should not overreact to the upsides or the downsides. I’ve seen some algorithms – such as the infamous “density triangle” – that should be categorized as crutches. They may be useful in some rare, temporary situations, but otherwise are worse than useless.
If you see somebody using a crutch that is not really needed, it is a good idea to wean them off the crutch, sooner rather than later.
Last but not least: The right answer depends on the background and developmental level of the student. If a five year old kid asks “how does this flashlight work”, he does not want a lecture on the chemistry of batteries or the physics of LEDs. A more appropriate answer would be something purely operational, such as “you need to twist it, like so.”
If the student actually wants a more detailed answer, he can always ask a more detailed question.
In section 10.3 we argued that memory is part of thought (not the opposite of thought or the enemy of thought). Similarly we argued that algorithms and methods are part of thought (not the opposite of thought or the enemy of thought). These parts reinforce each other in a lattice, as shown in figure 8.
The same applies to creativity. Not all thinking counts as creative thinking, but if you are going to do any creative thinking, it will necessarily be based on a foundation of memories and methods, of gut feelings and algorithms.
Most inventions can be described as pushing forward the frontier of knowledge. In order to do this, you need to know where the frontier is! In almost all cases, usefully original thinking is not wildly original. For example, Beethoven is famous for breaking the rules of classical music theory ... but he did not break all the rules at once. He broke a rule here and a rule there, in crafty and purposeful ways.
Any discussion of critical thinking must necessarily cover much of the same ground as a discussion of scientific methods. See reference 10.
Consider the following scenario: I pose the “Mississippi Flow” problem to two different people who have nominally similar educational backgrounds and experience.
The usual case is that I work with the person for 45 minutes, telling them “don’t give up” and “if you need to know that, figure it out” ... and giving a series of hints. At the end of this time, they have a solution. They realize in retrospect that in principle they could have solved the problem, in the sense that they knew everything necessary to permit a solution. At the same time, they realize that in practice they could never have found the solution on their own, because they would not have been able to organize their thinking in such a way as to call attention to the relevant facts.
In a not-very-small minority of the cases, the person can solve the puzzle very quickly. They outline the method of solution in about four seconds, and then take another few seconds to carry out the required multiplications.
The fact that proficiency with this sort of problemsolving is so unevenly distributed makes this sort of problem difficult to discuss in a classroom situation. The class as a whole, working as a team, can solve the problem relatively quickly, but that defeats one of the major purposes, namely giving each person experience racking their brain to find and organize the required bits of information. I don’t really know how to solve this problem. It would be ideal to spend 45 minutes with each student oneonone, going over this puzzle, but that would be prohibitively expensive in a typical school setting.
Similar considerations apply to homework. If the purpose of the exercise is to get experience racking one’s brain, the purpose is defeated if students google the solution, or get the solution from a classmate. This problem cannot be prevented, but it can be fairly well controlled, as follows: You can separate the sheep from the goats by assigning a modified version of the puzzle on a closedbook inclass quiz. Someone who understands the method of solution will be able to solve the modified version instantly, whereas someone who merely copied the solution will not. (I don’t know of any suitable modifications of the “Mississippi Flow” problem, but others such as the “Who Owns the Fish” problem are readily modifiable.)
Let us return to the question of what is a puzzle. Consider the contrast:
Many puzzles have the unfortunate property that even if you solve the puzzle, it’s still just a puzzle. The reward for solving it is trivial, artificial, or very indirect. Most homework problems are in this category; that is, the teacher already knows the right answer, and is not going to make any lifeordeath decisions based on the student’s answer.  In many realworld situations, there is a lot riding on the question. It may truly be a lifeanddeath decision. 
As my friend Larry says: If it’s not worth doing, it’s not worth doing right.  If it’s really worth doing, it’s worth double-checking to make sure you did it right. 
Consider someone who is learning to ride a bike. Why are they doing it? They typically are not doing it for the challenge; they are not doing it because the learning process is difficult. They are doing it because being able to ride a bike will empower them to go places and do things they could not do otherwise.
Consider the following four scenarios:
Problem A is hard, and the solution is worth $10.00.
Problem B is hard, and the solution is worth $100.00.
Problem C is easy, and the solution is worth $10.00.
Problem D is easy, and the solution is worth $100.00.
Given the choice, I would prefer problem B over problem A every time. That is, we should not value puzzles because they are hard; instead we should value puzzles if and when the answer is important. Homework problems have indirect value if (and only if) they teach skills that will have direct value later.
It is also true that given the choice, I would prefer problem C over problem A. Easy problems are preferable to hard problems, other things being equal.
Of course problem D is the most preferable of all.
More generally, I need to do a cost/benefit analysis. Given the choice between an easy, low-value problem and a hard, high-value problem, a tradeoff must be made. Making wise tradeoffs requires analysis and judgement.
In any case, we need to maintain a clear understanding of what is primary versus what is secondary, what is directly valuable versus what is only indirectly valuable, and what is real versus what is artificial.
Therefore do not get carried away with doing puzzles for the sake of doing puzzles. Choose puzzles that cultivate some useful general skill. Explicitly discuss what skills are being taught, and why. (See section 9 for some basic thoughts about this.)
The idea is neither to work harder, nor to work less hard. The idea is to get more done, by being clever. Things that formerly seemed difficult become easy once you know how. Above all, you should learn to solve important problems.
For more on this, see reference 31.
Some of these are interesting because they have more than one answer, i.e. the solution-set is not a singleton. Others are interesting because even though there is only one final answer, there are multiple methods of solution. Others are interesting because they require outside-the-box thinking.
Note that some of these are more challenging than they appear. I’ve seen some famously smart people get them wrong.
I have a quantity x such that x^{2}=81. Please tell me the value of x. How do you know? How sure are you?
It’s amazing how many people get this wrong, and express complete confidence in the wrong answer.
Arrange nine dots in a regular array, three rows of three:
•  •  •  
•  •  •  
•  •  • 
The task is to draw a single path^{3} consisting of four straight line segments, such that the path goes through all of the dots. This requires outsidethebox thinking, literally as well as figuratively.
Given a barometer, how many different ways can you think of for measuring the height of a building?
This is a classic, although the original story (reference 32) set up the question differently, not quite so directly.
Please give me an estimate of how much Mississippi River water flows past New Orleans in a year. This is a closedbook question; don’t look anything up; figure it out.
Once upon a time a farmer went to market and purchased a wolf, a goat, and a cabbage. On his way home, he came to a river crossing where a small boat was available. The boat was so small that the farmer could carry only himself and a single one of his purchases : the wolf, the goat, or the cabbage.
If given a chance, not in the farmer’s presence, the goat would eat the cabbage, and the wolf would eat the goat.
The goal is to keep all of the purchases intact and get them to the far side of the river. How would you do it?
The “cannibals and missionaries” problem is similar. It permits fewer outsidethebox solutions, which is usually a bad thing, but might sometimes help somebody get out of a rut.
There are a number of other “river crossing puzzles”, and more generally a vast number of “transport puzzles”.
You start out at point A. You travel strictly south for one mile. You then make a rightangle turn and travel strictly east for one mile. You then make another rightangle turn and travel strictly north for one mile. It turns out that you are now back at point A. So, please tell me, where is point A? How do you know? How sure are you?
Note: For present purposes, we approximate the earth as being perfectly spherical. Point A is on the surface, and all travel takes place along the surface.
Pencil and paper only. No calculators or computers of any kind allowed.
Bongard problems (reference 33) have the advantage that they don’t require much if any domain specific knowledge, so you can use them on Day One of the course. Also, there are many dozens of them (unlike the Mississippi Flow problem, which is a oneoff). The disadvantage is that Bongard problems are artificial puzzles. The downside to puzzles and games is that even if you win the game, it’s still just a game. The idea here is to use Bongard problems as scaffolding, to allow students to feel what it’s like to deal with openended problems. In the long run they won’t need the scaffolding, because there are plenty of realworld problems begging to be solved.
Students generally enjoy Bongard problems. They teach some useful thinking skills, including the necessity of looking at a problem from more than one viewpoint.
Here’s a challenge of a different kind: Can you draw a continuous path that crosses each of the line segments in figure 10, crossing each one once and only once?
Crossing at the corners is not allowed.
The path shown in red in figure 11 is almost – but not quite – a successful solution; it fails to cross the middle vertical line segment. Can you do better than this?
Truly a classic. Make sure the assumed constraints really are constraints. For details, see reference 34.
Let’s talk about creativity, originality, artistry, and imagination. These topics are closely related to the idea of openended questions and to the larger idea of critical thinking, in ways that will be discussed shortly.
Whenever (in section 15.2 or elsewhere) I talk about “the building block approach”, please imagine a child building something out of Legos ... something complicated and imaginative. The question of what to build and how to build it is openended. There is an astronomical number of right answers.  In contrast, please do not imagine a menial laborer building a brick wall. In this situation, if you ask where the next brick comes from, there is usually only one right answer. Similarly, if you ask where the next brick needs to go, there is usually only one or two right answers. 
With the Legos, reducing it to a step-by-step, multiple-choice, checklist-oriented activity would be spectacularly bad. It would remove the spontaneity, creativity, and open-endedness.  With menial bricklaying, multiple-choice questions might be perfectly appropriate. 
Here’s another analogy: At many universities, the music department offers a “music appreciation for dummies” course. The students listen to music and talk about it. There is little if any openendedness. This stands in contrast to the course in “composition and orchestration” that is taken by music majors, by real musicians, where originality and artistry are required. There is a great deal of structure, but also a great deal of openendedness.
I mention this because all too often, the introductory physics course degenerates into “physics appreciation for dummies”. The students are on the outside looking in. The students look at physics and talk about physics, but they don’t actually do any physics. In particular,
I insist that it doesn’t have to be this way. Physics, even at the introductory level, does not have to be a mindless, joyless, multiple guess activity.
Life is not a multipleguess test. Teaching is certainly not a multiple guess job. I mean, seriously, when was the last time a student came up to a teacher and said “I’m confused, and here are the four possible ways in which I could be confused. Pick one.”
Anybody who has more than a day of experience in the teaching profession (or any other profession) knows what it’s like to deal with openended questions. Everybody on this list knows in their bones how to do it.
There may be some questions about how best to teach this idea, but that is to be expected.
On the other hand, there are some tried-and-true ideas that we can keep in our bag of tricks, and there are some known pitfalls to be avoided. The following sections contain suggestions from the keen-grasp-of-the-obvious department.
When faced with a complex task, consider the classic “building block” approach. It consists of two phases:
The first phase of this approach (learning the individual buildingblocks) is not sufficient. You also need the second phase, namely putting the blocks together to complete the overall edifice.
As an aid to the second phase, consider the “Music Minus One” approach as described in reference 35.
That is, rather than asking kids to solve a complex realworld problem ab initio, you hand them an almostcomplete solution and let them provide the missing piece(s). I do this routinely with flying students, when they are learning to land the airplane:
The buildingblock approach stands in contrast to the “allornothing” fallacy. Never allow yourself to be put in a position where the only options are:
There are lots of more nuanced approaches, whereby they gradually learn how to swim.
The buildingblock metaphor is continued in section 15.3.
Here are two more fallacies, which are in some sense mirror images of each other. As usual, both extremes are wrong.
At one extreme is the repertory-only approach, aka learning-by-doing, aka the problem-solving-only approach. Imagine trying to learn a new subject, reading nothing but the Schaum’s outline. This is nuts. You need more explanation than that. You need more theory than that.  At the opposite extreme is the principles-and-concepts-only approach. This corresponds to giving the kid an enormous set of Legos, enough in principle to build almost anything ... but not allowing him to ever actually build anything. This is also nuts. If the principles and concepts are not used for interesting and important applications, there will be zero motivation and zero retention. 
As a more positive way of saying the same thing, you need to teach both principles and applications, so that applications illustrate and motivate the theory, and the theory enables and explains the applications. At this point I do the itsy-bitsy-spider thing with my hands.
Physics principles are like legos, or like fullscale masonry building blocks. Individually they aren’t very interesting. Even a large pile of them is not very interesting. The interesting part comes when you combine them in artistic ways to build something.
Analogies aside: The thing that makes physics interesting, the thing that makes physics physics, is the fact that you can learn a surprisingly small number of basic principles and then combine them in complicated multi-step chains of reasoning to figure out interesting stuff.
If the student is given nothing but legoshaped holes into which they must plug one individual lego, then the whole program is worthless, and we might as well get rid of the legos. However, that seems like curing a headache by amputation.
If the student can answer the exercises by equationhunting, there’s nothing wrong with the student and there’s nothing wrong with the equations. There’s probably something wrong with the question, insofar as it isn’t making the point you wanted to make, but that’s your problem, not the students’ problem.
You shouldn’t encourage equationhunting, because in the real world it is phenomenally inefficient. On the other hand, you shouldn’t make a rule against it, partly because the rule is unenforceable, and also because there are a few realworld examples of successful equationhunting. Indeed, some of the most famous equations of all time were hunted, and had to be hunted, because there was no preexisting basis from which to derive them. Galileo. George Green. Max Planck.
By far the best way to discourage equationhunting is to keep the domain of discourse sufficiently large that there are too many available equations. To say the same thing the other way: Sometimes people behave as if there were an unwritten rule that says “Solve using the methods of this chapter” — but that is a very unwise rule, because (a) it encourages mindless equationhunting, and (b) it penalizes creativity and insight. Therefore: (a) Make a point of mixing in some exercises that use the methods of previous chapters. This serves as a useful review of the material, and penalizes equationhunting in a natural, authentic way. Also: (b) Accept unexpected methods of solution, so long as they lead to the right answer.
A course that focuses on concepts only, in the absence of reasoning, is like a disorganized pile of legos. Students will find it boring and useless, and rightly so. They will remember none of it. One has to wonder, what’s the point? Why bother?
Students will put a lot more effort into the course if they see that it’s actually good for something.
Suppose you don’t know anything about how to play the piano. Now imagine plopping yourself in front of a grand piano and trying to play the Hammerklavier sonata (reference 36). If you believe in the direct approach, you start by playing just the first note. That’s doable, right? Then you play just the first two notes. That’s doable, right? Then imagine working your way through the whole piece that way. If at first you don’t succeed, try harder. TRY HARDER!
Alas, that is not going to work. Instead you need to back up many many steps and approach the problem indirectly, using ultra-simple level-1 pieces and lots of scales and études. This is indirect because later, after you have mastered the masterpieces, it is OK to forget the level-1 pieces. They are not wrong and do not need to be unlearned; they just get left behind or integrated and absorbed ... just as a chick leaves behind its shell and absorbs its yolk sac.
Another term for this is scaffolding: To assemble a huge statue, you need scaffolding ... but afterwards you remove the scaffolding. It’s not needed anymore, and it just gets in the way. You can call this unlearning if you want, but it’s the ultrasimple nonproblematic kind of unlearning, because it’s obvious which part is statue and which part is scaffolding. It’s obvious what you need to remove. There is nothing confusing or deceptive.
Remark: Only on the rarest of occasions does it help to tell a student to “TRY HARDER”. Usually the problem is that the kid doesn’t have the requisite foundation. You have to go back and build the foundation.
To say the same thing another way: Putting more effort into an approach that isn’t working is worse than useless. See section 9 for more on this.
Remark: The scales-and-études approach is indirect in another way: There is some sort of Cartesian direct product. Imagine a matrix with N rows and M columns. Studying N different scales and études prepares you to play M different repertory pieces. There is not a one-to-one relationship between a given étude and a given sonata.
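To make the matrix picture concrete, here is a minimal sketch. The labels are placeholders invented for illustration (they are not real études or pieces), and the only thing that matters is the counting: with N exercises and M repertory pieces there are N×M cells in the matrix, whereas a one-to-one pairing would give only as many cells as the shorter list.

```python
from itertools import product

# Hypothetical labels; only the counts N and M matter here.
etudes = ["scale_1", "scale_2", "etude_1"]             # N = 3
pieces = ["piece_A", "piece_B", "piece_C", "piece_D"]  # M = 4

# Cartesian product: every exercise can contribute to every piece.
matrix_cells = list(product(etudes, pieces))

# A one-to-one scheme would pair each exercise with at most one piece.
one_to_one = list(zip(etudes, pieces))

print(len(matrix_cells))  # 12 cells, i.e. N * M
print(len(one_to_one))    # 3 cells, i.e. min(N, M)
```

The gap between the two counts grows quickly as N and M grow, which is one way to see why the indirect approach pays off.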
That leads us back to the topic of openendedness. There are plenty of things in the world that are not openended. If you are the home plate umpire, for every pitch you need to call “ball” or “strike”. It’s multiple choice. You can’t split the difference.
On the other hand, there are also lots of openended things in the world. Most of real life consists of dealing with openended questions. Some examples that can be used in the classroom include:
We need to emphasize openended problems in order to restore balance, because in recent years closedended checklistoriented multipleguess problems have been grotesquely overemphasized.
Suggestion: Whenever selecting or creating quiz questions, include a goodly proportion of openended questions. Whenever I see a multipleguess question, I wince, and then ask myself whether it could be converted into an openended question.
If you say multiplechoice questions save a lot of time, because they are easier to grade, then I respond as follows: If we are going to talk about savings, we need to talk about the costs, too. The cost of overemphasizing shortanswer multipleguess tests is that they defeat the purpose of the entire educational system. The cost exceeds your entire salary plus overhead and more than that besides. Ask the kids whether they would rather learn a small number of truly useful things, or be “exposed to” a huge number of useless things that they won’t remember anyway.
Besides, consider the options for a 50-minute test: You could assign 65 multiple-guess questions (45 seconds apiece) or you could assign 4 open-ended questions (12 minutes apiece). You may find that grading the 4-question version does not take much (if any) longer than grading the 65-question version.
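As a quick check of that arithmetic, here is a small sketch. The function name is my own; the numbers are the ones quoted above. Both formats fit inside the same 50-minute period with a minute or two to spare.

```python
TEST_MINUTES = 50

def minutes_used(num_questions, seconds_each):
    """Total working time, in minutes, for a test of equal-length questions."""
    return num_questions * seconds_each / 60

multiple_guess = minutes_used(65, 45)    # 65 questions at 45 seconds apiece
open_ended = minutes_used(4, 12 * 60)    # 4 questions at 12 minutes apiece

print(multiple_guess)  # 48.75
print(open_ended)      # 48.0
```

The sketch covers only the time budget for taking the test; it says nothing about grading time, which is an empirical question.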
As mentioned in section 3, the critical thinking must be baked into the thinking process, like the oatmeal in oatmeal cookies, not sprinkled on as an afterthought. So it is with the teaching of critical thinking: It has to be baked into the curriculum. It is not something you can advocate for 15 minutes on Tuesday morning and then ignore – or penalize – the rest of the time.
Combining several of the items mentioned in section 15 and elsewhere, we come back to the idea that 99.999% of reasoning is massively parallel and subconscious. Consider the Mississippi Flow problem. Most people have a very hard time with this problem. Solving it involves racking the brain and sifting the memory, searching for information that is somehow related to the problem. If on Day One of the course you ask students to do this, they can’t do it, and pestering them to “try harder” won’t help ... and the direct approach won’t work either. This problem doesn’t rise to the Hammerklavier level of complexity, but the same idea applies: Rather than attacking it directly, you need to go back and spend a long time working on scales and études so as to build up the skills necessary to attack the problem.
More specifically: You can’t fetch stuff out of your memory in a good way if you didn’t put it into your memory in a good way. This helps explain why the current overemphasis on fullyscripted problemsolving and multipleguess quizzes is so poisonous. If there is a script for solving every problem that is going to be on the state test, then students naturally get the idea that rote learning is sufficient. The hallmark of rote learning is that each idea can be recalled in exactly one way. Technically that counts as a memory, but it is not a very useful memory. As discussed in section 10.1, the smart approach is to mull over each new idea, checking it against previouslyknown ideas, looking for connections ... and, conversely, checking for inconsistencies. If you do this, each idea can be recalled in 100 different ways, which makes it 100 times more useful than a rote memory.

The point is, you need to make it a habit to give every new idea this treatment. Don’t wait until you have cavities to start brushing your teeth. Don’t wait until you are facing an openended question to suddenly wish you had a more agile, effective mind. Wishing won’t help. This is one of the many things I like about the Harry Potter stories: the kids didn’t get to be powerful wizards overnight. They didn’t do it by praying or wishing. They worked hard for years to develop their skills, including reasoning and teamwork as well as the more domainspecific skills.
There is a bit of a chickenandegg problem here, because until the students learn how to build up richlyconnected memories, they won’t be very good at solving openended problems ... and conversely, until they have some success at recalling offthewall and outsidethebox ideas, they won’t appreciate the value of richlyconnected memories, and won’t be motivated to do the work – the years of work – necessary to build such memories.
Here is a bit of an exercise: The association game. Call on students, in order, so that everybody has to participate. The assignment is to come up with some word or idea that is associated with the Mississippi.
Kids who are called on later have the advantage of more time to think, but the disadvantage that the lowhanging fruit has already been picked.
Then we can go back and take those items two at a time, looking for other connections. In particular, what do riverboats and Huckleberry Finn have in common? What does that mean? Could that possibly help solve the Mississippi Flow problem?
Don’t tell me students can’t play this game. They play six degrees of Kevin Bacon for fun. The downside to 6°KB is that even if you win, it’s still just a trivia game. What we’re doing here is just as much fun, but it’s better because it’s not just a game. We are building up the skills needed to solve important real-world problems.
More generally, you can play six degrees of physics. The law of universal gravitation is related to Coulomb’s law which is related to conservation of flux lines which is related to conservation of other things (such as the butter gun discussed in Feynman) which is related to continuity of world lines in spacetime which is ............
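For readers who like to see the game written down, here is a minimal sketch that treats ideas as nodes and "is related to" as links, then finds the shortest chain between two ideas by breadth-first search. The links are just the ones named in the chain above; the function names are my own, and a real classroom map would of course have many more links per idea.

```python
from collections import deque

# "Is related to" links, taken from the chain in the text.
links = [
    ("universal gravitation", "Coulomb's law"),
    ("Coulomb's law", "conservation of flux lines"),
    ("conservation of flux lines", "conservation of other things"),
    ("conservation of other things", "continuity of world lines"),
]

def build_graph(pairs):
    """Build an undirected adjacency map from a list of linked pairs."""
    graph = {}
    for a, b in pairs:
        graph.setdefault(a, []).append(b)
        graph.setdefault(b, []).append(a)
    return graph

def shortest_chain(graph, start, goal):
    """Breadth-first search; returns the shortest list of ideas, or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in graph.get(path[-1], ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None

graph = build_graph(links)
path = shortest_chain(graph, "universal gravitation", "continuity of world lines")
print(len(path) - 1)  # 4 degrees of separation
```

The more connections each idea has, the shorter the chains become, which is the point of building richly connected memories.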
Students (and parents etc.) get irate if they think you are cheating, if they think you are not playing by the rules.
Therefore it is superimportant to make the point to all concerned that you’re not cheating and you’re not merely changing the rules ... you’re changing the game. Tell them:
You don’t show up to play football wearing your baseball uniform and carrying your bat and mitt. It’s a different game, with different rules, different skills, and different equipment.
So it is with this class. You’ve spent 12 years learning how to play trivia games, and there’s nothing wrong with that, but in this class we are playing a whole different game. It has different rules, and requires different equipment and different skills. For starters, rather than flirting with a large number of trivial problems, we are going to solve a small number of important problems. There will be lots of openended questions, and relatively few multiple guess questions. There will be few if any questions that can be answered in 45 seconds. Creativity and originality will be encouraged.
We do not choose problems because they are hard, or because they are easy. We choose problems that are important.
If you can find an easy solution to an important problem, that’s the biggest win of all. That’s what we call lowhanging fruit. I love lowhanging fruit. I go for the watermelons, because they are huge and delicious and very, very lowhanging.
In this class, we do not do hard problems. We will sometimes do problems that would have been hard if you didn’t know the tricks. So let’s get started. Let me show you a few good tricks.
You’ll have to give that speech multiple times before anybody believes you ... and then you have to deliver. They’ve heard (most of) that speech before, from people who didn’t mean it and/or didn’t even understand what they were saying.
Also at some point it is necessary to explain what I mean by “good tricks”.
Good tricks help solve lots of important problems. If the trick can do that, it’s a good trick, and I don’t care whether you call it physics or nonphysics.
Mathematicians think that all of physics is just a bunch of tricks for guessing the right answer without doing a mathematicallyrigorous proof. I take it as a compliment, even if it was not intended as such.
In contrast, a worthless trick solves only one or two problems, none of which are very important. Learning such a trick is not worth the trouble. The poster child for this involves simplifying the fraction
$$\frac{16}{64} \tag{8}$$
by “canceling” the sixes. Even when it works it’s a ridiculous method.
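To see just how few problems this trick solves, here is a brute-force sketch. It assumes one particular reading of the trick: a two-digit numerator over a larger two-digit denominator, where the numerator's last digit equals the denominator's first digit and striking out that shared digit leaves the correct value. The function name is mine.

```python
from fractions import Fraction

def digit_cancellations():
    """Two-digit fractions n/d < 1 where 'canceling' the shared digit works.

    Pattern checked: n = 10a + b, d = 10b + c, and n/d == a/c.
    """
    hits = []
    for n in range(10, 100):
        for d in range(n + 1, 100):
            a, b = divmod(n, 10)        # digits of the numerator
            b2, c = divmod(d, 10)       # digits of the denominator
            if b != b2 or c == 0:
                continue                # no shared digit, or division by zero
            if Fraction(n, d) == Fraction(a, c):
                hits.append((n, d))
    return hits

print(digit_cancellations())
# [(16, 64), (19, 95), (26, 65), (49, 98)]
```

Under that reading the trick gives the right answer in only four cases, and in every other case with a shared digit it gives the wrong one.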
Worthless tricks come in various flavors and colors:
 Rote memory counts as a memory, and sometimes it is exactly the right approach, but it comes at a terrible cost. It’s not very scalable. You could make a list of all 500,000 questions that could be on the final, and if you want to memorize and retain all of the questions and answers, Rain Man style, that’s fine with me. However, if your name is not Rain Man, you will find it easier – a whole lot easier – to learn a handful of deep principles and learn how to apply them.
 According to legend, when they (separately) faced the Kobayashi Maru scenario, Ensign James T. Kirk and Ensign Montgomery Scott each found ways of «solving» the problem that depended on the fact that it was a simulation, not a real situation. In other words, they converted an important problem into an unimportant problem and «solved» it. That is the opposite of good behavior. That is not a survival skill in real life.
Our goal is to convert hard important problems into easy important problems. It has to remain important or it’s not worth the trouble.
And by the way, at any decent school, if something goes wrong with the simulation, you do not get credit for a win. We come up with a better scenario and you get to start over. Life is not some made-for-TV courtroom drama where the objective is to win on a technicality.
Suggestion: On occasion, assign the same exercise more than once, with instructions to find the answer by a different method. This is not easy, but it can undoubtedly be done. For example, more than 250 different proofs of the Pythagorean theorem are known.
Discuss the various solutions. Start by deciding which are correct ... but don’t stop there. It is also appropriate to evaluate the degree of originality ... but don’t stop there, either. As shown in figure 12, we don’t want to cultivate originality just for the sake of originality; we want to cultivate correctness, style, and elegance.
Originality without correctness is the domain of kooks and crackpots. Originality without good style is sometimes perverse and sometimes just weird. Good style is subjective, but it is nevertheless real and important. For more about the role of style, elegance, and artistry in science, see reference 37.
Einstein complained that his teachers required Kadavergehorsamkeit – the obedience of a corpse.
Once upon a time, I was helping a kid translate a sentence for his Spanish homework. He thought of two possible translations: the literal translation, and the fluent translation. He immediately wrote down the literal translation. I thought the fluent translation would be better, because in real life, nobody with any sense would use the literal translation. The teacher speaks Spanish; she will know this is better.
However, the kid was adamant:
Everybody else is going to use the translation in the back of the book. If I do that, it will be marked correct. If I do anything else, it might or might not be marked correct. I could argue the point, and I might even win, but then I get the reputation for being argumentative, which is not good. By the time she gets around to grading the papers, she will have forgotten what the question was, so it doesn’t matter whether she speaks Spanish or not. The most important thing is to get the same answer as everybody else. Just because you think the fluent translation is better doesn’t mean the teacher thinks it is better. Maybe in your world it is better, but in my world there is a downside and no possible upside to the fluent translation.
I told the kid he was right, and apologized for giving bad advice. We briefly discussed the difference between what works in school and what is the right thing in real life.
We discussed the possibility of providing the second translation off to the side, or as an appendix, or asking about it in class. The kid stuck to his position: That school’s motto is “the nail that sticks up gets hammered down” and nothing was going to tempt the kid to stick his head up.
The lesson here is simple: If you want kids to learn to think, you have to reward the fluent translation. Also explicitly mention that it is possible to have more than one correct solution to homework questions. Ask in class if anybody has any questions. Ask if anybody can come up with alternative approaches.
It is not sufficient to merely say you want such things; you have to practice what you preach. Merely tolerating it would be an improvement over what we have now, but actually rewarding it would be even better. If you want kids to not learn by rote, you must not grade by rote.
I am quite aware that this requires the teacher to do more work. It is, however, necessary work.
Here’s a constructive suggestion that is simple yet superimportant: Start by removing the thousands of little things that reward conformity and rote regurgitation while penalizing creativity and critical thinking. You don’t have to remove all of them at once; just remove them a few at a time, as you come to them, all day every day.
For example, all too often on page 101 of the textbook there will be a definition, and then on page 105 there will be an “exercise” that calls for regurgitation of the definition, word for word. Suggestion: don’t assign that exercise. Come up with some exercise that requires applying the idea rather than memorizing empty words. If you can’t come up with such an exercise, then the idea must not be very important, and you need not assign any exercise whatsoever on that topic.
Here’s an even more obvious and more important suggestion: Don’t require students to learn things that can’t possibly be true. For example:
Another suggestion: Set up a program that rewards students for finding errors in the textbook. The reward should depend on the importance of the error: One point for simple spelling errors, and many points for fundamental misconceptions.
Similarly, assign students to find examples of nonsense in real life. There are plenty of blatant examples, including advertisements for “low calorie energy bars” and so forth. The aisles of a typical drug store contain homeopathic “drugs”, magnetictherapy bracelets, and many other products that could not possibly work as advertised. Many news articles cannot keep straight the distinction between a millisievert and a millisievert per hour. Politicians promise to reduce taxes and balance the budget without cutting government programs. More subtle forms of nonsense are even more abundant.
Last but not least, get rid of the highstakes gameshow tests, as discussed in section 2.
One thing that poisons the system is the idea that “practice makes perfect”. That is, all too often, students and teachers alike think that the best preparation for the test is to do things similar to the test over and over again. Alas, this is a terrible approach. First of all, there is heavy time pressure on the SAT and similar tests, so in that context the students are well advised to not take the time to ponder and/or play. Secondly, during the test there is no reward for showing the work, and not enough time to do it. This teaches exactly the wrong lesson about the importance of showing the work.
One thing that helps a little bit is to explain the distinction between knowledgein and knowledgeout ... that is, between learning and recall. The test measures your ability to pour knowledge out of your head, as quickly as possible. The skills necessary for this task are wildly different from the skills necessary for putting knowledge into your head.
The two ideas are linked, but only very indirectly. They are linked because you can’t get knowledge out efficiently unless you put it in properly. Still, the point remains, the procedures for knowledgein are very different from the procedures for knowledgeout. The former absolutely requires mulling each new idea, to see how it connects – and how it conflicts – with other ideas. See section 10.1. This takes time and effort. It requires making lots of connections, because you rarely know in advance which connections will pay off.
Another thing that helps a little bit is to explain the difference between testprep and lifeprep ... and to explain about burning the furniture.
Cramming for a test might make sense in an emergency, but it is like burning the furniture. I’m not even sure it’s good testprep, but I guarantee that it’s not good lifeprep. Cramming is a bad practice, and it should be avoided like the plague, except maybe in emergencies. Anything that is quickly learned will be quickly forgotten. Unless it is an absolute emergency, your life will be better if you spend the time properly learning a few things, rather than pseudolearning a whole bunch of things.
As a teacher, there are lots of little things you can do to gradually convince students you’re serious about mulling, pondering, and playing:
These questions do not even remotely resemble SAT questions or FCI questions, but thinking about them puts you in a much better position to answer such questions, and a lot of other questions besides.
The most important lifeprep lesson is this: As a student, the key to doing better on the test does not consist in working harder during the test. Instead, what really matters is how you lead your life in the months and years leading up to the test. If you ponder stuff and play with stuff, that makes you smarter, bit by gradual bit. Then when the test comes around – and when realworld challenges come around – you will have a much easier time.
One last thing: The movie “Stand and Deliver” revolves around a high-stakes standardized test (the AP Calculus exam) ... but the movie is far more insightful than you normally expect a movie to be. My all-time favorite line is this. The day before the big test, a student quietly speaks to the teacher:
Claudia: You’re worried that we’ll screw up royally tomorrow, aren’t you?
Jaime Escalante: Tomorrow’s another day. I’m worried you’re gonna screw up the rest of your lives.
Thinking is hard. Teaching and learning are hard. Teaching and learning critical thinking skills is particularly hard.
We need to do these things, even though they are hard. Some specific suggestions and hints on how to do this were presented above. See the table of contents.
Let’s be clear:
Those provisos are important! The current reliance on highstakes multipleguess tests is not the right approach. Not even close.
Recommendation: Step #1: We simply must stop overemphasizing highstakes “standardized” multipleguess tests. I’m not sure we need to do away with all such tests immediately, but the overreliance needs to end, immediately and entirely. It is simply outrageous that we would judge students, teachers, and/or schools based on such horrible tests. For more on this, see reference 2.
Recommendation: Step #2: We need to establish proper systems for assessment, accountability, and standardization.
We need to take step #1 immediately. Step #2 will take some time and effort. We can begin step #2 in parallel with step #1, but we need to take step #1 immediately, without waiting for completion of step #2. The assessment, accountability, and standardization we have now is worse than nothing, so there is no downside to step #1.
Beware that in some cases (e.g. problem 5) the answer in the back of the book is not correct.