Posts tagged with operator theory

If your vector space is a shopping cart full of groceries, then the checkout clerk is a linear operator on that space.
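A toy rendering of the metaphor (all prices and items invented here): the cart is a vector of item quantities, and the clerk acts as a linear functional on it, namely a dot product with the price list. Linearity is exactly the fact that ringing up two carts together costs the same as ringing them up separately and adding.

```python
# A toy rendering of the metaphor (prices invented): a cart is a
# vector of item quantities, and the clerk is a linear functional --
# a dot product with the price list.

prices = {"milk": 2.50, "bread": 1.80, "eggs": 3.20}

def checkout(cart):
    """The 'clerk' operator: maps a cart (a vector) to a total (a scalar)."""
    return sum(prices[item] * qty for item, qty in cart.items())

cart_a = {"milk": 1, "bread": 2, "eggs": 0}
cart_b = {"milk": 0, "bread": 1, "eggs": 6}
combined = {item: cart_a[item] + cart_b[item] for item in prices}

# Linearity: pricing the merged cart = pricing each cart and adding.
print(round(checkout(combined), 2))                   # 27.1
print(round(checkout(cart_a) + checkout(cart_b), 2))  # 27.1
```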













Why is the boundary of a boundary always null? What is a coboundary?

Beware, some of the + signs in Ghrist’s Elementary Applied Topology are formal sums, which paste things together
http://upload.wikimedia.org/wikipedia/en/e/e8/F2_Cayley_Graph.png
rather than summing in the usual dimension-reducing sense.










A fun exercise/problem/puzzle introducing function space.





Suppose you are an intellectual impostor with nothing to say, but with strong ambitions to succeed in academic life, collect a coterie of reverent disciples and have students around the world anoint your pages with respectful yellow highlighter. What kind of literary style would you cultivate?

Not a lucid one, surely, for clarity would expose your lack of content. The chances are that you would produce something like the following:

We can clearly see that there is no bi-univocal correspondence between linear signifying links or archi-writing, depending on the author, and this multireferential, multi-dimensional machinic catalysis. The symmetry of scale, the transversality, the pathic non-discursive character of their expansion: all these dimensions remove us from the logic of the excluded middle and reinforce us in our dismissal of the ontological binarism we criticised previously.

This is a quotation from the psychoanalyst Félix Guattari, one of many fashionable French ‘intellectuals’ outed….

— Richard Dawkins, scientist and polemicist, “Postmodernism Disrobed”: a review of Intellectual Impostures, published in Nature, 9 July 1998, vol. 394, pp. 141–143.

 

Above we read an assertion without evidence. Dawkins posits that an intellectual impostor with nothing to say would write in a certain way. But where’s the proof? I guess whoever’s reading this book review is assumed to already know what Dawkins (and Sokal/Bricmont) are talking about and to agree with his implications: namely, that postmodernists have nothing to say, that they cultivate an obtuse literary style to obscure the fact, and that this somehow also attracts followers.

Who says “chances are”? Dawkins’ attack amounts to a flame.

 

Here is a not-unusual passage written in that other famously obtuse jargon, mathematics:

The prototypical example of a C*-algebra is the algebra B(H) of bounded (equivalently continuous) linear operators defined on a complex Hilbert space H; here x* denotes the adjoint operator of the operator x: H → H. In fact every C* algebra, A, is *-isomorphic to a norm-closed adjoint closed subalgebra of B(H)….

That’s from Wikipedia’s article on C* algebras. I think the language is as impenetrable as Guattari’s. But mathematics = science = good and humanities = not science = bad, at least in the minds of some.

Here is an excerpt (via @wtnelson) written for teachers of 4–12-year-olds, 40 years ago, by Zoltán Pál Dienes:

psychologically speaking, relating an object to another object is a very different matter from relating a set of objects to another set of objects. In the first case, perceptual judgment can be made on whether the relation holds or not in most cases, whereas in the case of sets, a certain amount of conceptual activity is necessary before such a judgment can take place. For example, we might need to count how many of a certain number of things there are in the set and how many of a certain number of these or of other things there are in another set before we can decide whether the first and the second sets are or are not related by a certain particular relation to each other.

Clear as mud! Clearly Z. P. Dienes was an intellectual impostor with ambitions to collect a coterie of reverent disciples.

 

I don’t know enough about postmodernism to opine on it. I just get annoyed when putatively sceptical people casually wave it off without proving their point.

(And if you’re going to point me to the Sokal Affair or Postmodernism Generator CGI, I’ll point you to At Whom Are We Laughing?.)

 

In Lacan: A Beginner’s Guide, Lionel Bailly describes his subject as “a thinker whose productions are sometimes irritatingly obscure”. He goes on:

Most Lacanian theory [comes from his]  spoken teachings…developed in discourse with…pupils…. [Various modes of presentation which are appropriate in speech] make frustrating reading. …leading the reader toward an idea, but never becoming absolutely explicit…difficult to discover what he actually said…thought on his feet—the ideas…in his seminars were never intended to be cast in stone…freely ascribes to common words new meanings within his theoretical model…Lacan, despite the fuzziness of his communication style, strove desperately hard for intellectual rigour….at the end of the day, it is … clinical relevance that validates Lacan’s model. [Lacan being a psychoanalyst and his ideas coming out of that work.]

So there’s an alternative hypothesis from an authority. Bailly admits the communication style was poor and gives reasons why it was. But rather than judging the work on rhetorical grounds, we should judge it on clinical merit—the ultimate empirical test!

Compare this to Dawkins. Besides the suppositions I already mentioned, he chooses words like: “intellectuals” within scare quotes; ‘anoint’, ‘revere’, ‘coterie’—to undermine the intellectual seriousness of his targets. Who are the empiricists here and who relies on rhetoric?

(Source: members.multimania.nl)




What is the best interpretive program for making sense of quantum mechanics? Here is the way I would put it now. The question is completely backward. It acts as if there is this thing called quantum mechanics, displayed and available for everyone to see as they walk by it—kind of like a lump of something on a sidewalk. The job of interpretation is to find the right spray to cover up any offending smells. The usual game of interpretation is that an interpretation is always something you add to the pre-existing, universally recognized quantum theory.


What has been lost sight of is that physics [theory] is a dynamic interplay between storytelling and equation writing. Neither one stands alone, not even at the end of the day. But which has the more fatherly role? If you ask me, it’s the storytelling…. An interpretation is powerful if it gives guidance, and I would say the very best interpretation is the one whose story is so powerful it gives rise to the mathematical formalism itself (the part where nonthinking can take over)….


Take the nearly empty imagery of the many-worlds interpretation(s). Who could derive the specific structure of complex Hilbert space out of it if one didn’t already know the formalism? Most present-day philosophers of science just don’t seem to get this: If an interpretation is going to be part of physics, instead of a self-indulgent ritual to the local god, it had better have some cash value for physical practice itself.




One way to think about quantum operators is as Questions that are asked of a quantum system.

  • Identity operator = "Who are you?"
  • Energy operator = "How much do you weigh?"
  • Spin operator = “What is your spin along the z axis?”
  • and so on.
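A minimal sketch of this in pure Python (no physics library; states and operators are just small lists): “asking a question” of a quantum state means computing the expectation value ⟨ψ|A|ψ⟩. The spin-z operator here is the Pauli-z matrix, in units of ħ/2.

```python
# "Asking a question" of a quantum state = computing the expectation
# value <psi|A|psi>. Operators are 2x2 matrices, states are 2-vectors.

def matvec(A, v):
    return [sum(A[i][j] * v[j] for j in range(len(v))) for i in range(len(A))]

def inner(u, v):
    return sum(ui.conjugate() * vi for ui, vi in zip(u, v))

def ask(operator, state):
    """The 'answer' to the question the operator poses of the state."""
    return inner(state, matvec(operator, state))

identity = [[1, 0], [0, 1]]    # "Who are you?" -- always answers 1
spin_z   = [[1, 0], [0, -1]]   # "What is your spin along z?" (units of h-bar/2)

up   = [1, 0]                  # spin-up state
down = [0, 1]                  # spin-down state

print(ask(spin_z, up))      # 1  (definitely up)
print(ask(spin_z, down))    # -1 (definitely down)
print(ask(identity, up))    # 1
```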







Statistical moments, letter values, and other verbs that are often just called “statistics” can be thought of the same way: asking questions of a data set.


For example, after you run the ∑/n operation to get the mean happiness in Europe (2.0 / 3.0) versus the mean happiness in the US (1.2 / 2.0), you naturally would want to ask things like:

  • What about the least happy people? Are there more people answering near 0.0 in the US or Europe?
  • What’s the variance ∑(x − x̄)²/n?
  • What’s the skewness? (Blanchflower & Oswald’s data survey 45,000 Americans and 400,000 Europeans — enough degrees of freedom to meaningfully measure skew.)
  • What’s the conditional value-at-risk at the 10% level? (average of the bottom 10% unhappiness.)
  • Apply a smoothing kernel to pick up which country has the more least-happy people without choosing a particular cutoff. (And maybe a second kernel to deal with the different scales: should we assume US1.0 = EUR1.5? Or maybe count from the top, to US1.8 = EUR2.8?)

Running these operators on the dataset will tell you an answer to one question, just like in English.

One difference is that classical statistical operators typically spit out two numbers in reply to your question: an answer, and a confidence level in that answer. The confidence in the answer is computed based on experimental assumptions by people with names like Pearson, Fisher, and Chisquare.
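Those questions can be sketched as code. The numbers below are a toy happiness sample invented for illustration (not Blanchflower & Oswald’s actual survey data); each function is one “question” asked of the data set.

```python
import math

# Toy happiness scores (invented), scale 0.0 - 3.0.
data = [0.1, 0.4, 0.9, 1.2, 1.5, 1.8, 2.0, 2.3, 2.6, 2.9]

def mean(xs):                        # the sum/n operator
    return sum(xs) / len(xs)

def variance(xs):                    # sum of (x - mean)^2, over n
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

def skewness(xs):                    # third standardised moment
    m, sd = mean(xs), math.sqrt(variance(xs))
    return sum(((x - m) / sd) ** 3 for x in xs) / len(xs)

def cvar(xs, level=0.10):            # mean of the worst `level` fraction
    tail = sorted(xs)[: max(1, int(len(xs) * level))]
    return mean(tail)

print(mean(data))        # 1.57
print(variance(data))
print(skewness(data))
print(cvar(data))        # 0.1 -- the least happy decile
```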




I hope I can say this in a way that makes sense.

One kind of mathematical symbology your eyes eventually get used to is the “Σum over all individuals” concept:


Yes, at first it’s painful, but eventually it looks no more confusing than the oddly-bent Phoenician-alphabet letters that make up English text.

I believe there is a generally worthwhile pearl of thought-magic wrapped up in the sub-i, sub-j pattern. I mean, a metaphor that’s good for non-mathematicians to introduce into their mind-swamps.

That pearl is a certain connection between the specific and the general—a way of reaching valid generality, without dropping the specificity.

 

Before I explain it, let me talk a bit more about the formalism and how it’s used. I’ll introduce one word: functional. A functional ƒ maps from a small, large, convoluted, or simple domain onto a one-dimensional codomain (range).

Examples (and non-examples) of functionals:

  • size — you can measure the volume of a convex hull, the length of an N-dimensional vector, the magnitude of a complex number, the girth of a rod, the supremum of a functional, the sum of a sequence, the length of a sequence, the number of books someone has read, the breadth of books someone has read (is that one-dimensional? maybe not), the complicatedness (Vapnik-Chervonenkis dimension) of a functional, the Gini coefficient of a country’s income distribution, the GNP of a country, the personal incomes of the lowest earning 10% of a country, the placement rate of an MBA programme, the mean post-MBA income differential, the circumference of a ball, the volume of a ball, … and many other kinds of size.
  • goodness / score — business metrics often rank super-high-dimensional things, like the behaviour of a group of team members, into a total ordering from desirable to less desirable. When businesses use several different metrics (scores), that’s not a functional but instead the concatenation of several functionals (into a function).
  • utility — for homo economicus, all possible choices are totally, linearly ordered by equivalence classes of isoclines.
  • fitness — all evolutionary traits (a huge, huge space) are cross-producted with an evolutionary environment to give a Fitness Within That Environment: a single score.
  • angle — if “angle” has meaning (if the space is an inner product space) then angle is a one-dimensional codomain. In the abstract sense of "angle" I could be talking about correlation or … something else that doesn’t seem like a geometrical angle as normally construed.
  • distance … or difference — Intimately related to size, distance is kind of like “size between two things”. If that makes sense.
  • quantum numbers — four quantum numbers define an electron. Each number (n, l, m, spin) maps to a one-dimensional answer from a finite corpus. Some of the corpora are interrelated though, so maybe it’s not really 1-D.
  • quantum operators — Actually, some quantum operators are non-examples because they return an element of Hilbert space as the answer. (like the Identity operator). But for example the Energy operator returns a unidimensional value.
  • ethics — Do I need more non-examples of functionals? A complete ethical theory might return a totally rankable value for any action+context input. But I think it’s more realistic to expect an ethical theory to return a complicated return-value type since ethics hasn’t been completely figured out.
  • regression analysis — You get several β's as return values, each mogrified by a t-value. So: not a one-dimensional return type.
  • logic — in the propositional calculus, declarative sentences return a value from {true, false} or from {true, false, n/a, don’t know yet}. You could argue about whether the latter is one-dimensional. But in modal logic you might return a value from the codomain “list of possible worlds in which proposition is true”, which would definitely not be a 1-dimensional return type.
  • factor a number — last non-example of a functional. You put in 136 and you get back {1, 2, 4, 8, 17, 34, 68, 136}. Which is 8 numbers rather than 1. (And potentially more: 1239872 has fourteen divisors or seven prime factors, whichever you want to count.)
  • median — There’s no simple formula for it, but the potential answers come from a codomain of just-one-number, i.e. one parameter, i.e. one dimension.
  • other descriptive statistics — interquartile range, largest member of the set (max), 72nd percentile, trimean, 5%-winsorised mean, … and so on, are 1-dimensional answers.
  • integrals — Integrals don’t always evaluate to unidimensional, but they frequently do. “Area under a curve” has a unidimensional answer, even though the curve is infinite-dimensional. In statistics one uses marginalising integrals, which reduce the dimensionality by one. But you also see single ∫'s that represent a sequence of ∫∫∫'s reducing to a size-type answer.
  • variability — Although wiggles are by no means linear, variance (2nd moment of a distribution) measures a certain kind of wiggliness in a linearly ordered, unidimensional way.
  • autocorrelation — Another form of wiggliness, also characterised by just one number.
  • Conditional Value-at-Risk — a so-called “coherent risk measure”. It’s like the expected value of the lowest decile. Also known as expected tail loss. It’s used in financial mathematics and, like most integrals, it maps to one dimension (expected £ loss).
  • "the" temperature — Since air is made up of particles, and heat is to do with the motions of those particles, there are really something like 10^23 dynamical orbits that make a room warm or cold (not counting the sun’s rays). “The” temperature is some functional of those—like an average, but exactly what I don’t know.
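A few of the functionals above, sketched as code. Each one takes a many-numbered input and answers with a single number (the Gini formula used here is the standard sorted-index form; the data are invented).

```python
# Three functionals: each maps a many-dimensional input to one number.

def length(v):
    """Euclidean norm of an N-dimensional vector."""
    return sum(x * x for x in v) ** 0.5

def gini(incomes):
    """Gini coefficient of an income distribution (0 = perfect equality)."""
    xs = sorted(incomes)
    n = len(xs)
    cum = sum((2 * i - n - 1) * x for i, x in enumerate(xs, start=1))
    return cum / (n * sum(xs))

def median(xs):
    """No simple formula, but still a one-dimensional answer."""
    s = sorted(xs)
    mid = len(s) // 2
    return s[mid] if len(s) % 2 else (s[mid - 1] + s[mid]) / 2

print(length([3, 4]))           # 5.0
print(gini([1, 1, 1, 1]))       # 0.0  (perfect equality)
print(median([5, 1, 9]))        # 5
```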
 

Functionals can potentially take a bunch of complicated stuff and say one concrete thing about it. For example I could take all the incomes of all the people in Manhattan, apply this functional:

average income of Manhattanites

and get the average income of Manhattan.

Obviously there is a huge amount of individual variation among Manhattan’s residents. However, by applying a functional I can get Just One Answer about which we can share a discussion. Complexity = reduced. Not eliminated, but collapsed.

I could apply other functionals to the population, like

  • count the number of trust fund babies (if “trust fund baby” can be defined)
  • calculate the fraction of artists (if “artist” can be defined)
  • calculate the “upper tail risk” (ETL integral from 90% to 100%, which average would include Nueva York’s several billionaires)

Each answer I am getting, despite the wide variation, is a simple, one-dimensional answer. That’s the point of a functional. You don’t have to forget the profundity or specificity of individual or group variation, but you can collapse all the data onto a single, manageable scale (for a time).
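For instance, the “upper tail” functional from the list above, run on a toy income list (numbers invented, in thousands, with one billionaire-ish outlier standing in for the real thing):

```python
# Toy incomes (invented, in $1000s). One functional collapses all of
# them to a mean; another answers only about the top 10%.

incomes = [18, 25, 31, 40, 52, 60, 75, 90, 200, 5000]

def upper_tail(xs, level=0.10):
    """Mean of the top `level` fraction -- the 'upper tail risk'."""
    k = max(1, int(len(xs) * level))
    top = sorted(xs)[-k:]
    return sum(top) / len(top)

print(sum(incomes) / len(incomes))  # the mean, pulled way up by the outlier
print(upper_tail(incomes))          # 5000.0
```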

 

The payoff

The sub-i sub-j pattern allows you to think about something both specifically and in general, at once.

  1. Each individual is counted uniquely. The description of each individual (in terms of the parameter/s) is unique.
  2. Yet there is a well-defined, actual generalisation to be made as well. (Or multiple generalisations if the codomain is multi-dimensional.) These are valid generalisations. If you combine together many such generalisations (median, 95th percentile, 5th percentile, interquartile range) then you can quickly get a decent description of the whole.


Kind of like how thinking with probability distributions can help you avoid stereotypes: you can understand the distinctions between

  • the mean 100m sprint time of all men is faster than the mean 100m sprint time of all women
  • the medians are rather close, perhaps identical
  • the top 10% of women run faster than the bottom 80% of men
  • the variance of male sprint times is greater than the variance of female sprint times
  • differences in higher moments, should they exist
  • the CVaR's of the distributions are probably equivalent
  • conditional distributions (sub-divisions of sprint times) measured of old men; age 30-42 black women; age 35 Caribbean-born women of any race of non-US nationality who live in the state of Alabama
  • and so on.
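A sketch of that list in code, with two made-up samples of sprint times: different functionals applied to the same two groups can point in different directions, which is exactly why the single-sentence generalisation misleads.

```python
# Invented sprint times (seconds). Not real athletics data -- just
# shaped so that different functionals tell different stories.

men   = [9.6, 9.8, 10.0, 10.2, 11.0, 12.5, 13.5]
women = [10.8, 10.9, 11.0, 11.0, 11.1, 11.2, 11.3]

def mean(xs):
    return sum(xs) / len(xs)

def variance(xs):
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

print(mean(men) < mean(women))          # True: the male mean is faster...
print(variance(men) > variance(women))  # True: ...but far more spread out
print(min(women))                       # the fastest woman beats the slowest men
```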

It becomes harder to sustain sexism, racism, and to sustain stereotypes of all sorts. It becomes harder to entertain generalistic, simplistic, model-driven, data-less economic thinking.

  • For instance, the unemployment rate is the collapse/sum of ∀ lengths of individual unemployment spells: ∫ (length of unemp) • (# of people w/ that unemp length) = ∫ x • ƒ(x) dx.

    Like the dynamic vapor pressure of a warm liquid in a closed container, where different molecules are pushing around in the gas and alternately returning to the soup. The total pressure looks like a constant, but that doesn’t mean the same molecules are gaseous—nor does it mean the same people are unemployed.

    (So, for example, knowing that the unemployment rate is higher doesn’t tell you whether there are a few more long-term unemployed people, a lot more short-term unemployed people, or a mix.)
  • You can generalise about a group using different functionals. The average wealth (mean functional) of an African-American Estadounidense is lower than the average wealth of a German-American Estadounidense, but that doesn’t mean there aren’t wealthy AA’s (max functional) or poor GA’s (min functional).
  • You don’t have to collapse all the data into just one statistic.

    You can also collapse the data into groups, for example collapsing workers into groups based on their industry.

    (here the vertical axis = number of Estadounidenses employed in a particular industry — so the collapse is done differently at each time point)

Various facts about Venn Diagrams, calculus, and measure theory constrain the possible logic of these situations. It becomes tempting to start talking about underlying models, variation along a dimension, and “the real causes” of things. Which is fun.

At the same time, it becomes harder to conceive overly simplistic statements like “Kentuckians are poorer than New Yorkers”. Which Kentuckians do you mean? And which New Yorkers? Are you saying the median Kentuckian is poorer than the median New Yorker? Or perhaps that the dollar cutoff for the bottom 70% of Kentuckians is lower than the cutoff for the bottom 50% of New Yorkers? I’m sorry, but there’s too much variation among KY’s and NY’s for the statement to make sense without a more specific functional mapping from the two domains of the people in the states onto a dollar figure.
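Making the sentence precise means naming the quantile functionals. A sketch with invented incomes (in thousands) and a deliberately crude quantile function:

```python
# "Kentuckians are poorer than New Yorkers" made precise: compare
# specific quantile cutoffs, not the whole populations. Incomes invented.

def quantile(xs, q):
    """Crude quantile: the value at position floor(q*n) of the sorted sample."""
    s = sorted(xs)
    return s[min(len(s) - 1, int(q * len(s)))]

ky = [12, 15, 18, 22, 25, 30, 38, 45, 60, 120]
ny = [10, 14, 20, 28, 35, 48, 70, 95, 150, 900]

print(quantile(ky, 0.5), quantile(ny, 0.5))  # median vs median
print(quantile(ky, 0.7), quantile(ny, 0.5))  # a 70% cutoff vs a 50% cutoff
```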

 

ADDED: This still isn’t clear enough. A friend read this piece and gave me some helpful feedback. I think maybe what I need to do is explain what the sub-i, sub-j pattern protects against. It protects against making stupid generalisations.

To be clear: in mathematics, a generalisation is good. A general result applies very broadly, and, like the more specific cases, it’s true. Since I talk about both mathematical speech and regular speech here, this might be confusing. But: in mathematics, a generalisation is just as true as the original idea but just applies in more cases. Hence is more likely to apply to real life, more likely to connect to other ideas within mathematics, etc. But as everyone knows, people who “make generalisations” in regular speech are usually getting it wrong.

Here are some stupid generalisations I’ve found on the Web.

  • Newt Gingrich: "College students are lazy."
     Is that so? I bet that only some college students are lazy.

    Maybe you could say something true like “The total number of hours studied divided by total number of students (a functional ℝ⁺^{# students}→ℝ⁺) is lower than it was a generation ago.” That’s true. But look at the quantiles, man! Are there still the same number of studious kids but only more slackers have enrolled? Or do 95% of kids study less? Is it at certain schools? Because I think U Chicago kids are still tearing their hair out and banging their heads against the wall. 
  • Do heterodox economists straw-man mainstream economics?
    I’m sure there are some who do and some who don’t.
  • The bad economy is keeping me unemployed.
    That’s foul reasoning. A high general unemployment rate says nothing directly about your sector or your personal skills. It’s a spatial average. Anyway, you should look at the length of personal unemployment spells for  
  • Conservatives say X. Liberals say Y. Libertarians think Z.
    Probably not ∀ conservatives say X. Nor ∀ liberals say Y. Nor do ∀ libertarians think Z. Do 70% of liberals say Y? Now that I’m asking you to put numbers to the question, that should make you think about defining who is a liberal and measuring what they say. Not only listening to the other side, but quantifying what they say. Are you so sure that 99% of libertarians think Z now?
  • The United States needs to focus on creating high-tech jobs.
    Are you actually just talking about opportunities for upper-middle-class people in Travis County, TX and Marin County, CA? Or does your idea really apply to Tuscaloosa, Flint, Plano, Des Moines, Bemidji, Twin Falls, Lawrence, Tempe, Provo, Cleveland, Shreveport, and Jacksonville?
  • Green jobs are the future!
    For whom?
  • Alaskans are enslaved to oil companies.
  • Meat eaters, environmentalists, blacks, hipsters, … you can find something negative said about almost any group.
    Without quantification or specificity, it will almost always be false. With quantification, one must become aware of the atoms that make up a whole—that the unique atoms may clump into natural subgroups; that variation may derive from other associations—that the true story of a group is always richer and more interesting than the imagined stereotypes and mental shorthand.
  • What’s wrong with the teenage mind? WSJ.
    a teenage mind?
  • French women eat rich food without getting fat. Book.
  • French parents are better than American parents. WSJ.
  • What is it about twenty-somethings? NY Times.

If you sub-i, sub-j these statements, you can come up with a more accurate and productive sentence that could move disagreeing parties forward in a conversation.

Worf

Unwarranted generalisations are like Star Trek: portraying an entire race as being defined by exactly one personality trait (“Klingons are warlike”, “Ferengi only care about money”). That sucks. The sub-i, sub-j way is more like Jack Kerouac’s On the Road: observing and experiencing individuals for who they are. That’s the way.

Neal Cassady; Allen Ginsberg; William S. Burroughs (old); Jack Kerouac

If you want to make true generalisations—well, you’re totally allowed to use a functional. That means the generalisations you make are valid—limited, not overbearing, not reading too much into things, not railroading individuals who contradict your idea in service of your all-important thesis.

OK, maybe I’ve found it: a good explanation of what I’m trying to say. There are valid ways to generalise about groups and there are invalid ways. Invalid is making sweeping over-generalisations that aren’t true. Sub-i, sub-j generalisations are true to the subject while still moving beyond “Everyone is different”.




The Laplace transform is the continuous version of a power series.

Think of a power series
\sum_n \text{const}_n \cdot \blacksquare^n \ = \ f(\blacksquare)
as mapping a sequence of constants to a function.
{ const₁, const₂, … } ↦ f(x)
Well, it does, after all.

Then turn the ∑ into an ∫. And turn the xᵏ into exp(−ks), where s = −ln x. Now you have the continuous version of the “spectrum” view that allows so many tortuous ODE’s to be solved in a flash. I wonder what the economic value of that formula is?

In addition to solving some ODE’s that occur in engineering applications, there is also wisdom to be had here. Thinking of functions as all being made up of the same components allows fair comparisons between them.
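A numerical sketch of the ∑→∫ step: a power series sums constₙ·xⁿ over integer n, while the Laplace transform integrates f(t)·e^(−st) dt over continuous t. The crude Riemann sum below checks the textbook fact that the transform of the constant function 1 is 1/s.

```python
import math

# Crude left-Riemann-sum Laplace transform: F(s) = integral of
# f(t) * exp(-s*t) dt from 0 to `upper`.

def laplace(f, s, upper=40.0, steps=100_000):
    dt = upper / steps
    return sum(f(k * dt) * math.exp(-s * k * dt) * dt for k in range(steps))

# Transform of f(t) = 1 is 1/s; at s = 2 that's 0.5.
print(laplace(lambda t: 1.0, s=2.0))   # approximately 0.5
```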

eXp <- 1 / cumprod(1:11)   # 1/1!, 1/2!, …, 1/11! — the power-series coefficients of exp
plot(eXp, xlab="exponent in the power series", ylab="value of constant", main="Spectrum of exp", log="y", cex.lab=1.1, cex.axis=.9, type="h", lwd=8, lend="butt", col="#333333")

(If you really want to know what a power series is, read Roger Penrose’s book.

To summarise: a lot of functions can be approximated by summing weighted powers of the input variable, as an equally valid alternative to applying the function itself. For example, adding 1 + input¹ + 1/2 ⨯ input² + 1/2/3 ⨯ input³ + 1/2/3/4 ⨯ input⁴ and so on, eventually approximates e^input.)
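That parenthetical claim is easy to check numerically: the partial sums 1 + x + x²/2! + x³/3! + … converge to e^x.

```python
import math

# Partial sums of the power series for exp, built term by term:
# each new term is the previous one times x/(n+1).

def exp_series(x, terms=20):
    total, term = 0.0, 1.0
    for n in range(terms):
        total += term
        term *= x / (n + 1)
    return total

print(exp_series(1.0))   # approximately 2.718281828... = e
```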





Bilinear maps and dual spaces

Think of a function that takes two inputs and gives one output. The + operator is like that. 9+10=19 or, if you prefer to be computer-y about it, plus(9, 10) returns 19.

So is the relation “the degree to which X loves Y”. Takes as inputs two people and returns the degree to which the first loves the second. Not necessarily symmetrical! I.e. love(A→B) ≠ love(B→A). * It can get quite dramatic.

L(A,B) ≠ L(B,A)

An operator could also take three or four inputs.  The vanilla Black-Scholes price of a call option asks for {the current price, desired exercise price, [European | American | Asian], date of expiry, volatility}.  That’s five inputs: three ℝ⁺ numbers, one option from a set isomorphic to {1,2,3} = ℕ₃, and one date.

inputs and outputs of vanilla Black-Scholes

A bilinear map takes two inputs, and it’s linear in both terms.  Meaning if you adjust one of the inputs, the final change to the output is only a linear difference.

Multiplication is a bilinear operation (think 3×17 versus 3×18). Vectorial dot multiplication is a bilinear operation. Vectorial cross multiplication is a bilinear operation but it returns a vector instead of a scalar. Matrix multiplication is a bilinear operation which returns another matrix. And tensor multiplication ⊗, too, is bilinear.
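Bilinearity of the dot product can be checked directly: hold one slot fixed and the map is linear in the other, meaning it respects addition and scaling there.

```python
# Checking bilinearity of the dot product: linear in each slot
# separately, with the other slot held fixed. Integer vectors so
# every comparison is exact.

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

u, v, w = [1, 2, 3], [4, 5, 6], [7, 8, 9]
lam = 3

# Linear in the first argument:
print(dot([a + b for a, b in zip(u, w)], v) == dot(u, v) + dot(w, v))  # True
print(dot([lam * a for a in u], v) == lam * dot(u, v))                 # True
```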

Above, Juan Marquez shows the different bilinear operators and their duals. The point is that it’s just symbol chasing.


* The distinct usage “I love sandwiches” would be considered a separate mathematical operator since it takes a different kind of input.

