Soft versus Hard Evidence; Appeal to Authority etc.

John Denker

* Contents

1 Introduction
2 How Research Is Done
3 The Value and Limitations of Soft Evidence
4 The Gray Area
5 Definitions, Terminology, Notation, and Other Conventions
6 The Problems with Appeal to Authority
7 More Examples of Scientific Successes and Failures
8 How To Be a Heretic
9 References

1 Introduction

Argument-by-authority is unscientific. Authority, credentials, and reputation are sometimes useful, as will be discussed in section 3, but at best they provide only soft evidence, always inferior to hard scientific evidence. (For more about scientific methods, see reference 1.)

Let me explain why, starting with a few examples. (Numerous additional examples can be found in reference 2.)

Many of the most-important scientific discoveries involved overthrowing established ideas. For example,

Michelson and Morley overthrew the ether theory.
Rutherford overthrew the plum-pudding model of atomic structure.
Rumford overthrew the idea that heat was conserved separately from other forms of energy.

The task of challenging established ideas is not assigned only to giants like Michelson and Rumford and Rutherford, but also to every worker-bee in the scientific community. In my own professional life, several of the most important things I’ve ever done involved overturning seemingly well-established ideas. Examples include:

One could cite lots of authorities as the basis for “knowing” that it is impossible to design a voltmeter with input noise less than √ℏ.

But my colleagues and I did design a voltmeter with very low noise, substantially less than √ℏ. My colleagues actually built such a thing a few years later. I really don’t care how many textbooks say it can’t be done.

One could cite lots of authorities as the basis for “knowing” that the outputs of a neural network converge to a representation of the maximum a posteriori probability P(C|I), not the maximum likelihood probability P(I|C), nor the joint probability P(C,I), nor anything else.

But my colleagues and I did construct a network that produces the joint probability P(C,I), which was just what was needed to make a complex product work right.

One could cite lots of authorities as the basis for “knowing” that it is impossible to design a logic gate that dissipates less than ½ C V² per operation.

My colleagues and I did construct a logic chip dissipating much less than ½ C V² per gate per operation. I really don’t care how many textbooks say this is impossible.

The authoritative FAA Flight Training Handbook says that to recover from a spiral dive (graveyard spiral) you should roll the wings level and pull back on the yoke.

To recover from a severe spiral dive, you should roll the wings level and not pull back. This is a matter of life and death.

Et cetera.

Other examples don’t rise to the same level of originality and/or importance, but further illustrate why you can’t always rely on authority:

One could cite lots of authorities as the basis for “knowing” that a crystal is an “insulator” whenever the band gap is large compared to kT. Diamond is the canonical example.

I have built amplifier circuits using silicon field-effect-transistors, and operated them not only at nitrogen temperature (77K) but also at helium temperature (4K). They work fine, even though the band gap is huge compared to kT.

(For more about what it takes to create a practical insulator, see reference 3.)

2 How Research Is Done

A central part of the research job description involves finding out where the authorities are wrong. People who have never done research commonly think it revolves around exploring entirely uncharted areas and discovering entirely new ideas. That does happen sometimes, but it’s very rare. Even the most brilliant and revolutionary ideas have antecedents. Newton acknowledged that he stood on the shoulders of giants.

Successful research commonly involves reviewing seemingly-well-established topics and dethroning not-quite-reliable assumptions. This requires judgment, skill, and intuition, because if you go around questioning everything you’ll never make any progress. Science is full of assumptions that have usually held in the past; the trick is to find a breakdown in the assumptions, and see if it can be exploited.

3 The Value and Limitations of Soft Evidence

There is a huge difference between hard evidence and soft evidence. Reliance on authority, credentials, and reputation is near the top of the soft-evidence scale. If you don’t have any hard evidence, you ought to be guided by the best available soft evidence. However: remember that the top of the soft-evidence scale remains far below the bottom of the hard-evidence scale.

Hard evidence always outweighs soft evidence.

See also section 4 for a discussion of the gray area.

In order of declining value, items on the soft-evidence scale include:

Authoritative opinion. (However: remember, even at the top of the soft-evidence scale, it’s still just soft evidence.)
Non-authoritative opinion.
Random guessing.
Seeing who argues the loudest and/or the longest. (This is worse than nothing, because it gives the biggest advantage to the biggest scoundrel, and just encourages bad behavior.)

A fine example of the role of authority is as follows: suppose a civilization that has never had roads or traffic suddenly invents them. There is a rash of head-on collisions. One group starts agitating for a rule that says everybody should drive on the right. Another group wants everybody to drive on the left. There is no hard evidence that either idea is intrinsically better. There never will be, either; the choice is fundamentally arbitrary. One day the chief shaman has a vision, and announces which side is to be used. The problem is solved. It’s a miracle ... before, there was no reason to prefer one side over the other, and afterward you’ll be pilloried if you drive on the wrong side. A choice needed to be made, and soft evidence carried the day — in the absence of hard evidence.

To illustrate the limitations of authority, consider the notion that the sun rises in the east each day. That is authoritative, indeed proverbial. Now suppose you are visiting northern Sweden. Yesterday you observed the sun rise in the south, and today it didn’t rise at all. You’ve got first-hand hard evidence, and no amount of soft evidence can outweigh it. (In addition to your observational evidence, there is strong theoretical evidence, based on celestial mechanics supported by millions of very precise measurements, far more precise than any proverb will ever be.)
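
As a rough sketch of the geometry behind this example: for an idealized horizon (no atmospheric refraction, sun treated as a point), the compass bearing A of sunrise satisfies cos A = sin(declination) / cos(latitude). The function name and the sample latitudes below are illustrative choices of mine, and the solstice declination of 23.44 degrees is an approximate value.

```python
import math

def sunrise_azimuth_deg(latitude_deg, declination_deg):
    """Compass bearing of sunrise in degrees (0 = north, 90 = east, 180 = south).

    Returns None when the sun does not cross the horizon on that day.
    Idealized geometry: no refraction, sun treated as a point, and
    latitude assumed to lie strictly between -90 and +90 degrees.
    """
    lat = math.radians(latitude_deg)
    dec = math.radians(declination_deg)
    cos_az = math.sin(dec) / math.cos(lat)
    if abs(cos_az) > 1.0:
        return None  # polar night or midnight sun: no sunrise at all
    return math.degrees(math.acos(cos_az))

SOLSTICE_DECLINATION = 23.44  # degrees; approximate tilt of Earth's axis

cases = [
    ("temperate latitude, equinox", 40.0, 0.0),
    ("temperate latitude, June solstice", 40.0, SOLSTICE_DECLINATION),
    ("latitude 66, December solstice", 66.0, -SOLSTICE_DECLINATION),
    ("latitude 68, December solstice", 68.0, -SOLSTICE_DECLINATION),
]
for label, lat, dec in cases:
    az = sunrise_azimuth_deg(lat, dec)
    shown = "no sunrise" if az is None else f"{az:.1f} degrees from north"
    print(f"{label}: {shown}")
```

Under these assumptions the sun rises due east only at the equinoxes; near the Arctic Circle in midwinter the bearing swings around to nearly due south, and a little farther north there is no sunrise at all.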

The idea of reputation is related to the idea of authority. As an example, typical citizens who buy a voltmeter at the home-center have no first-hand hard evidence that the voltage readings will be accurate. Mostly they are relying on the reputation of the store, and perhaps the reputation of the manufacturer. This makes a certain amount of sense. A reputable store will accept returns on defective merchandise, and naturally they want to stay in business, so they can’t afford too many returns. Similarly the manufacturer of unreliable instruments will acquire a bad reputation, which will cause people to stop buying their stuff.

These reputation-based arguments hinge on the notion that the vendor wants to stay in business. If that’s questionable, all bets are off. For instance, if the vendor has been taken over and is being looted by some corporate raider, prior reputable actions are not hard evidence of present or future good behavior. Similarly a firm that is in financial distress will cut corners to survive in the short term, even if doing so ruins their reputation in the longer term. So beware: reputation is a lagging indicator.

If my scientific reputation were at stake, I would be sure to calibrate my meters against a primary standard ... or better yet, if possible, design the experiment to produce a dimensionless result immune to calibration errors.

If life-and-death were at stake, I would take even stronger measures to obtain hard evidence ... I would not rely merely on some vendor’s reputation.

Scientific results in a given area are considered hard facts by the people doing the work. The rest of the world reads the initial reports of the work and treats them as soft evidence, based on the reputation of the author, the author’s institution, and the journal.

Sometimes mistakes (and sometimes outright frauds) can come from supposedly-reputable institutions and appear in supposedly-reputable journals.

Scientific empirical results become firmer when other workers replicate the experiment. Theoretical results become firmer when people repeat the calculation, and/or check for consistency with other theoretical and empirical results.

4 The Gray Area

In principle, there must exist a gray area between the top of the usual soft-evidence scale and the bottom of the usual hard-evidence scale.

It seems rather uncommon to land in this area. I’m not sure why. Perhaps it is because there are some strong nonlinearities involved.

One result with 10-sigma confidence far outweighs ten results with 1-sigma confidence each.

(Remember that the probability is exponential in the square of the confidence, and exp(10²) is a big number.)
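
To put rough numbers on this, here is a minimal sketch. It assumes Gaussian statistics and independent results, and it models "ten weak results" crudely as the chance that all ten are flukes. Those modeling choices, and the function name, are my own illustration.

```python
import math

def gaussian_tail(z):
    """One-sided probability that a Gaussian fluctuation exceeds z standard deviations."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

p_one_sigma = gaussian_tail(1.0)
# Chance that ten independent 1-sigma results are all mere fluctuations.
p_ten_weak = p_one_sigma ** 10
# Chance that a single 10-sigma result is a mere fluctuation.
p_ten_sigma = gaussian_tail(10.0)

print(f"one 1-sigma fluke:   {p_one_sigma:.3e}")
print(f"ten 1-sigma flukes:  {p_ten_weak:.3e}")
print(f"one 10-sigma fluke:  {p_ten_sigma:.3e}")
print(f"ratio:               {p_ten_weak / p_ten_sigma:.3e}")
```

Because the tail probability falls off like the exponential of the square of the number of sigmas, while combining independent weak results only multiplies modest factors, the single strong result wins by many orders of magnitude.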

5 Definitions, Terminology, Notation, and Other Conventions

There are some things that exist only by convention. This includes the meaning of ordinary words, technical terminology, notation, et cetera. There is no point in writing a word unless the reader can figure out what you mean by it.

There are endless disputes – among professional lexicographers as well as laypersons – as to what really defines a word.

Some emphasize the prescriptive approach, based on what a word “should” mean. This involves a certain amount of appeal to authority.
Some give more emphasis to the descriptive approach, based on how the word is actually used in practice. This relies more on statistics than authority.
Last but not least, some say that authors should be allowed to define terms however they like, within reason.

I’m not going to wade into these disputes. I like to stay away from both extremes. I’m not fond of the narrow authority-based approach (e.g. l’Académie française), but I’m not overly fond of the wide-open free-for-all approach either, especially when it comes to technical terms.

Mostly I think definitions receive far too much emphasis. Knowing what a word really means often requires a great deal more than any dictionary definition could provide.

6 The Problems with Appeal to Authority

6.1 Interlude: Some Quotations

With tongue firmly in cheek, let me cite some things that authorities have said about appeal to authority:

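Chiunque conduca un argomento appellandosi all’autorità non usa la sua intelligenza ma la sua memoria. [Whoever conducts an argument by appealing to authority is using not his intelligence but his memory.]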
      – Leonardo
The Devil can cite scripture for his purpose.
      – Shakespeare (Merchant of Venice)
Aliquando bonus dormitat Homerus. [Sometimes even good Homer nods.]
      – Horace [approximately]
Your instructor was right not to give you any points, for your answer was wrong, as he demonstrated using Gauss’s law. You should, in science, believe logic and arguments, carefully drawn, and not authorities. [....] I am not sure how I did it, but I goofed. And you goofed, too, for believing me.
      – Richard Feynman (references 4 and 5)

6.2 Overvaluing Symbols

As the proverb says:

Never confuse a symbol
with the thing symbolized.

Having credentials is not the same as having a clue. For example, a college diploma is only a symbol, only loosely symptomatic of anything we ought to care about.

Consider:

Would you hire somebody with no degrees other than astrophysics and ask them to teach biochemistry?
Would you hire somebody with no degrees other than physics and ask them to teach history?
Would you hire somebody with no degrees other than political science and ask them to teach computing or psychology?

You know, somebody like Max Delbrück, Thomas Kuhn, or Herb Simon?

Those are extreme examples, but the underlying trend is both broad and deep. In many fields, the pace of change is so fast that it is virtually impossible to find anybody working on the exact thing they were trained for.

When I see somebody placing strict emphasis on credentials (such as GPA, degree, major and minor field, etc.), it tells me they do not trust their own recruiting team to exercise good judgment. That’s a problem. Perhaps this distrust is well founded, in which case you have two problems: oppressive bureaucracy on top of incompetent recruiting due to lack of judgment. Two wrongs don’t make a right.

On the other hand, I’ve seen places where they do trust their own judgment. They will happily hire and promote somebody who lacks credentials, if that is in fact the right person for the job.

6.3 Overvaluing Soft Evidence

One problem is that far too often, people try to use appeal to authority to outweigh hard facts. That is, often people overestimate the value of appeal to authority. Perhaps because it outweighs most other forms of soft evidence, people imagine it might outweigh at least some forms of hard evidence ... but it doesn’t.

If an authority with every possible credential makes a physically-incorrect argument, it is a physically-incorrect argument.

If a ten-year-old with no credentials whatsoever makes a physically-correct argument, it is a physically-correct argument.

Relative to other forms of soft evidence, authority is important.

Relative to any form of hard evidence, authority is worthless.

I have had many, many experiences where people tried to win an argument by waving credentials in my face – and almost always it’s a sign that they’re dead wrong. If they had a physically-correct argument, they wouldn’t need to flaunt their credentials. It is particularly hilarious when somebody tries this without realizing that their credentials are a small subset of mine. In such cases I could maybe win the argument by citing my credentials, but I never do, because that would set a bad example about the larger issue, namely the folly of appeal to authority.

I try to support my arguments by saying “if you don’t believe me, go do the following experiment” or some such ... as opposed to saying “you have to agree with me because I have such-and-such fancy credentials or such-and-such fancy affiliation”.

On the other side of the same coin: I get tons of mail from people who have no credentials whatsoever. I make a big point of paying respectful attention to them. I’ve learned a lot this way!

6.4 Fudging the Evidence

Another problem is the tendency for people to misquote authorities. Appeal to authority is bad enough ... but bogus appeal to authority is even worse.

Commonly, people hype an idea by shamelessly making up some “quote” out of whole cloth and putting it in the mouth of some authority. There seems to be no end to the made-up “quotes” attributed to Einstein.
Sometimes a genuine quote is modified in a way that violates both the letter and spirit of the original meaning.
Commonly a correct idea is attributed to the wrong person. This leaves you with the correct physics but the wrong history. Physicists do not have a license to make up bogus history, just as historians do not have a license to make up bogus physics.
- The so-called Wheatstone bridge was not invented by Wheatstone.
- The so-called Cavendish balance was not invented by Cavendish.
- The so-called Newton’s first law of motion is not Newton’s at all, but rather Galileo’s.
- The principle of relativity is often attributed to Einstein, but in fact it was stated by Galileo, clearly and in detail, more than 270 years previously.
- The idea that time is the fourth dimension is often attributed to Einstein, but in fact it originated with Minkowski. Einstein didn’t even like the idea when he first heard of it.
- Et cetera.
Sometimes the quote adheres to the letter of the original, but takes it so badly out of context as to radically distort the meaning.
Conversely, sometimes the quotation captures most of the meaning, even though the wording is not exactly correct, as in the misquotation of Horace given above.¹

I get tired of hearing the argument that “So-And-So said U=0” as if merely intoning the name of So-And-So were sufficient to establish the universal and eternal validity of the U=0 expression.

What does the symbol U denote? I can think of several possibilities, depending on context.
Was So-And-So advocating the notion of U=0, or merely mentioning it in passing, or actively deprecating it?
Was U=0 meant to be universally, eternally, and tautologically true, or was it a special case? Maybe it was merely a hypothesis, assumed temporarily for sake of argument. Maybe it was a narrow conclusion, based on a horde of hypotheses, premises, and provisos.

Most importantly: Presumably So-And-So had a reason for concluding that U=0. So please don’t give us just the name of So-And-So. Instead, please give us the reasoning of So-And-So, and let us judge the reasoning on its merits. Let us judge for ourselves.

7 More Examples of Scientific Successes and Failures

Sometimes the scientific process works well, sometimes not so well. Actually there are four main classes:

Sometimes the ideas need changing, and the scientific community handles it well. Examples: Michelson and Morley; Rutherford; Rumford.

Sometimes idea-changes are handled poorly. As a relatively recent deplorable example: the idea of continental drift was not adopted nearly as fast as it should have been.

Sometimes the ideas don’t need changing, and it is handled well. Example: The null results of Eötvös.

Sometimes the ideas don’t need changing, and it is handled poorly, i.e. heretical ideas are sometimes given much more attention than they deserve. Examples include N-rays and cold fusion.

And there is a fifth class, where the scientific community responds scientifically but fails to bring the broader society along. Examples: copper bracelet therapy, magnetic bracelet therapy, homeopathic medicines.

8 How To Be a Heretic

There are some fairly-well-established procedures that heretics are expected to follow to gain respectability. These include:

As James Randi likes to point out: extraordinary claims require extraordinary proof. If you are making tremendous claims, be prepared to back them up with a tremendous amount of explaining. Your results will not become accepted until they are confirmed by others, but nobody will invest the effort to attempt confirmation until you make your claims plausible. The burden of plausibility is on you, the heretic.
Uphold the correspondence principle. That is, if you have a new theory, show that it agrees with classical notions in some appropriate limit, at least in the cases where things have actually been checked. Similarly, if you have surprising data, show that your methods would have been consistent with classical results, within the error bars, over the range where things have actually been checked.
This is related to the more general pedagogical principle that “learning proceeds from the known to the unknown”. For example:
- Once I had someone claim that the sun rises in the east and sets in the west, exactly, by definition. I suggested that was proverbially true but not exactly true; it applied to people who live in temperate latitudes and don’t look too closely. The other person argued with me. They ridiculed me. They told me all about their credentials, including their fancy education and their fancy research job.
  I started by agreeing that twice a year it’s more-or-less exact. However, the rest of the time it’s off by a noticeable amount, and the amount depends on latitude. I mentioned some first-hand evidence: once I saw the sun rise in the south, and the next day it didn’t rise at all. (I was in Finland at the time.) I also mentioned that for years I drove to work along an east/west road, and on some days the rising sun was directly in my eyes and sometimes it wasn’t. Then I grabbed an orange and held a pencil against it, tangentially, to represent the east-west line of sight. After a few minutes of that they conceded that I might have a point.
- Similarly, special relativity predicts that the kinetic energy is ½mv² (or, better, ½p·v) in cases that have actually been checked, i.e. in the low-speed limit. This correspondence builds confidence in the theory. At super-high speeds the same relativistic formula says the KE is a whole p·v. See reference 6.
- Similarly, quantum mechanics agrees with classical physics in the appropriate limits. If it didn’t, we wouldn’t trust QM at all.
- et cetera.
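
The special-relativity item in the list above can be checked numerically. The following is a minimal sketch using the standard expressions KE = (γ − 1)mc² and p = γmv, so that the mass and the speed of light cancel out of the ratio KE/(p·v). The function name and the sample speeds are illustrative choices.

```python
import math

def ke_over_pv(beta):
    """Ratio KE / (p*v) for a particle moving at speed v = beta * c, with 0 < beta < 1.

    Uses KE = (gamma - 1) m c^2 and p = gamma m v; m and c cancel.
    """
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    return (gamma - 1.0) / (gamma * beta * beta)

for beta in (0.001, 0.1, 0.5, 0.9, 0.999999):
    print(f"v/c = {beta:<9} KE/(p.v) = {ke_over_pv(beta):.6f}")
```

The ratio starts at one half in the low-speed limit, where things have been checked classically, and climbs smoothly toward one at very high speeds. A single formula covers both regimes, which is the correspondence principle at work.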

9 References

I thank Tim Folkerts and other members of the Phys-L discussion group for helping me think more clearly about this topic.

1. John Denker, “Scientific Methods” www.av8n.com/physics/scientific-methods.htm
2. John Denker, “Famous Authoritative Pronouncements” www.av8n.com/physics/ex-cathedra.htm
3. John Denker, “Why White Things are White” www.av8n.com/physics/white.htm
4. Michelle Feynman, ed., “Perfectly Reasonable Deviations from the Beaten Track: The Letters of Richard P. Feynman” (2005) pp 288-289.
5. Richard Feynman apud Joseph McClain, “Feynman’s advice to W&M student resonates 45 years later” https://www.wm.edu/news/stories/2020/feynmans-advice-to-wm-student-resonates-45-years-later.php
6. John Denker, “Welcome to Spacetime” www.av8n.com/physics/spacetime-welcome.htm

1: The actual quote is: ... indignor quandoque bonus dormitat Homerus ....
