Posts tagged with functionals

Today’s glossary provoked by Michael Spivak.

tensors spivak diff geom vol 1 ch 4

Below I’ve drawn two vector spaces connected by a linear homomorphism ƒ, plus a linear functional λ going to ℝ. After seeing these pictures I hope it’s easier to understand how the pullback ƒ* works.

Here’s one of the main pictures for flavour:

dual space linear functional

Also, you can probably just skim the pictures and get the point (especially the Number Field one and the final one). That’s the fastest way to read this post.

Start with an abstract 𝓥ector space.

abstract vector space

I’ll do some violence because I’ll need coordinates in a minute.

invent linear functional


If your vector space is a shopping cart full of groceries, then the checkout clerk is a linear operator on that space.

Thank you, steel manufacturing companies, and thank you, chemical processing companies, for giving us the time to read. —Hans Rosling

Totally good point about how the mechanisation of the rich world has allowed us to have so many professors, doctors, photographers, lawyers, and social media managers.


But I wonder: why is laundry so important?

There has to be a good reason; no one working with their hands for 70+ hours a week would choose to do an extra 10 hours of labour a week if they could avoid it. But I know from experience that, in my world, if you don’t do laundry for months at a time, nothing bad happens to you.

What did I do instead of laundry? I’ve taken a few options, some of which would have been available to poor humans now or in the past:

  1. wash clothing with the excess soapy water that falls off me in the shower (not available to them)
  2. turn clothing inside out and leave it outside (requires a lot of socks but before the 19th century no one was wearing socks anyway)

The second you would think poor people could do pretty easily. I used my porch, which got sun and wind that blew away, over time, most of the smells.

So what’s the reason they couldn’t do that? I have a few theories.

  • They laboured with their bodies, getting much sweatier than I do at my computer.
  • Bugs and germs were more prevalent in their environment and got in their clothing if it weren’t soaped — or at least exposed to ammonia rising off the castle pissing grounds.
  • They got dirtier, muddier, muckier. But why would you need to deal with that?
  • Having clean clothes raised your appeal to the opposite sex, and social status went along with that as it goes along with attractiveness today. Clean isn’t necessary; it’s just sexy (on average).

Anyway, I wonder if it isn’t the other changes to the modern OECD environment (reduction in bugs and reduction in manual labour) that made for the progress. Nowadays I just use the washer when I’ve exercised or played in the mud.

If the wash was always just a way of keeping up with the Joneses, however, then we can’t congratulate the washing machine for saving us necessary labour — it just helps us live out our autocompetitive rank obsessions in other ways now the elbow’s been surpassed on that dimension.

I hope I can say this in a way that makes sense.

One kind of mathematical symbology your eyes eventually get used to is the “Σum over all individuals” concept:


Yes, at first it’s painful, but eventually it looks no more confusing than the oddly-bent Phoenician-alphabet letters that make up English text.

I believe there is a generally worthwhile pearl of thought-magic wrapped up in the sub-i, sub-j pattern. I mean, a metaphor that’s good for non-mathematicians to introduce into their mind-swamps.

That pearl is a certain connection between the specific and the general—a way of reaching valid generality, without dropping the specificity.


Before I explain it, let me talk a bit more about the formalism and how it’s used. I’ll introduce one word: functional. A functional ƒ maps from a small, large, convoluted, or simple domain onto a one-dimensional codomain (range).

Examples (and non-examples) of functionals:

  • size — you can measure the volume of a convex hull, the length of an N-dimensional vector, the magnitude of a complex number, the girth of a rod, the supremum of a functional, the sum of a sequence, the length of a sequence, the number of books someone has read, the breadth of books someone has read (is that one-dimensional? maybe not), the complicatedness (Vapnik-Chervonenkis dimension) of a functional, the Gini coefficient of a country’s income distribution, the GNP of a country, the personal incomes of the lowest earning 10% of a country, the placement rate of an MBA programme, the mean post-MBA income differential, the circumference of a ball, the volume of a ball, … and many other kinds of size.
  • goodness / score — business metrics often rank super-high-dimensional things like the behaviour of a group of team members into a total ordering of desirable through less desirable. When businesses use several different metrics (scores) then that’s not a functional but instead the concatenation of several functionals (into a function).
  • utility — for homo economicus, all possible choices are totally, linearly ordered by equivalence classes of isoclines.
  • fitness — all evolutionary traits (a huge, huge space) are cross-producted with an evolutionary environment to give a Fitness Within That Environment: a single score.
  • angle — if “angle” has meaning (if the space is an inner product space) then angle is a one-dimensional codomain. In the abstract sense of “angle” I could be talking about correlation or … something else that doesn’t seem like a geometrical angle as normally described.
  • distance … or difference — Intimately related to size, distance is kind of like “size between two things”. If that makes sense.
  • quantum numbers — four quantum numbers define an electron. Each number (n, l, m, spin) maps to a one-dimensional answer from a finite corpus. Some of the corpora are interrelated though, so maybe it’s not really 1-D.
  • quantum operators — Actually, some quantum operators are non-examples because they return an element of Hilbert space as the answer. (like the Identity operator). But for example the Energy operator returns a unidimensional value.
  • ethics — Do I need more non-examples of functionals? A complete ethical theory might return a totally rankable value for any action+context input. But I think it’s more realistic to expect an ethical theory to return a complicated return-value type since ethics hasn’t been completely figured out.
  • regression analysis — You get several β's as return values, each mogrified by a t-value. So: not a one-dimensional return type.
  • logic — in the propositional calculus, declarative sentences return a value from {true, false} or from {true, false, n/a, don’t know yet}. You could argue about whether the latter is one-dimensional. But in modal logic you might return a value from the codomain “list of possible worlds in which proposition is true”, which would definitely not be a 1-dimensional return type.
  • factor a number — last non-example of a functional. You put in 136 and you get back {1, 2, 4, 8, 17, 34, 68, 136}. Which is 8 numbers rather than 1. (And potentially more: 1239872 has fourteen divisors or seven prime factors, whichever you want to count.)
  • median — There’s no simple formula for it, but the potential answers come from a codomain of just-one-number, i.e. one parameter, i.e. one dimension.
  • other descriptive statistics — interquartile range, largest member of the set (max), 72nd percentile, trimean, 5%-winsorised mean, … and so on, are 1-dimensional answers.
  • integrals — Integrals don’t always evaluate to unidimensional, but they frequently do. “Area under a curve” has a unidimensional answer, even though the curve is infinite-dimensional. In statistics one uses marginalising integrals, which reduce the dimensionality by one. But you also see ∫'s that represent a sequence of ∫∫∫'s reducing to a size-type answer.
  • variability — Although wiggles are by no means linear, variance (2nd moment of a distribution) measures a certain kind of wiggliness in a linearly ordered, unidimensional way.
  • autocorrelation — Another form of wiggliness, also characterised by just one number.
  • Conditional Value-at-Risk — A so-called “coherent risk measure”. It’s like the expected value of the lowest decile. Also known as expected tail loss. It’s used in financial mathematics and, like most integrals, it maps to one dimension (expected £ loss).
  • "the" temperature — Since air is made up of particles, and heat is to do with the motions of those particles, there are really something like 10^23 dynamical orbits that make a room warm or cold (not counting the sun’s rays). “The” temperature is some functional of those—like an average, but exactly what I don’t know.
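
A few of the examples and non-examples above can be sketched in code. This is only an illustration: the `returns` series is made up, and `expected_tail_loss` is a crude discrete stand-in for CVaR (the mean of the worst 10% of outcomes), not a production risk measure. The point is the return type: each functional hands back one number, while `divisors` hands back a whole list.

```python
from math import isqrt
from statistics import mean, median, pvariance

# Hypothetical series of ten monthly returns (made up for illustration).
returns = [3.0, -1.0, 4.0, -1.5, 5.0, -9.0, 2.0, 6.0, -5.0, 3.5]


def expected_tail_loss(xs, tail=0.10):
    """Mean of the lowest `tail` fraction of outcomes: one number out."""
    ordered = sorted(xs)
    k = max(1, round(len(ordered) * tail))  # size of the lowest slice
    return mean(ordered[:k])


def divisors(n):
    """Non-example of a functional: returns many numbers, not one."""
    small = [d for d in range(1, isqrt(n) + 1) if n % d == 0]
    large = [n // d for d in reversed(small) if d * d != n]
    return small + large


# Functionals: many numbers in, one number out.
print(median(returns), pvariance(returns), expected_tail_loss(returns))

# Non-example: one number in, a list out.
print(divisors(136))
print(len(divisors(1239872)))
```
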

Functionals can potentially take a bunch of complicated stuff and say one concrete thing about it. For example I could take all the incomes of all the people in Manhattan, apply this functional:

average income of Manhattanites

and get the average income of Manhattan.

Obviously there is a huge amount of individual variation among Manhattan’s residents. However, by applying a functional I can get Just One Answer about which we can share a discussion. Complexity = reduced. Not eliminated, but collapsed.

I could apply other functionals to the population, like

  • count the number of trust fund babies (if “trust fund baby” can be defined)
  • calculate the fraction of artists (if “artist" can be defined)
  • calculate the “upper tail risk” (ETL integral from 90% to 100%, which average would include Nueva York’s several billionaires)

Each answer I am getting, despite the wide variation, is a simple, one-dimensional answer. That’s the point of a functional. You don’t have to forget the profundity or specificity of individual or group variation, but you can collapse all the data onto a single, manageable scale (for a time).
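
As a minimal sketch of the Manhattan example, assume each resident is a made-up record of (income, is-artist, has-trust-fund). The data and the function names are hypothetical. Each function below collapses the whole population to a single number.

```python
from statistics import mean

# Hypothetical residents: (annual income in dollars, is_artist, has_trust_fund)
residents = [
    (28_000, True, False),
    (54_000, False, False),
    (61_000, True, False),
    (75_000, False, False),
    (90_000, False, False),
    (120_000, False, True),
    (150_000, False, False),
    (310_000, False, True),
    (2_000_000, False, False),
    (40_000_000, False, True),
]


def average_income(people):
    """Mean functional: sum over all individuals i, divided by N."""
    return mean(income for income, _, _ in people)


def count_trust_fund_babies(people):
    """Counting functional (given some definition of 'trust fund baby')."""
    return sum(1 for _, _, trust in people if trust)


def fraction_of_artists(people):
    """Proportion functional (given some definition of 'artist')."""
    return sum(1 for _, artist, _ in people if artist) / len(people)


def upper_tail_mean(people, tail=0.10):
    """Mean income of the top `tail` fraction: a discrete 90%-100% tail average."""
    incomes = sorted((income for income, _, _ in people), reverse=True)
    k = max(1, round(len(incomes) * tail))
    return mean(incomes[:k])


for functional in (average_income, count_trust_fund_babies,
                   fraction_of_artists, upper_tail_mean):
    print(functional.__name__, functional(residents))
```

Note how the one billionaire-scale income drags the mean far above what most of these ten people earn, which is exactly why one functional is rarely enough.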


The payoff

The sub-i sub-j pattern allows you to think about something both specifically and in general, at once.

  1. Each individual is counted uniquely. The description of each individual (in terms of the parameter/s) is unique.
  2. Yet there is a well-defined, actual generalisation to be made as well. (Or multiple generalisations if the codomain is multi-dimensional.) These are valid generalisations. If you combine together many such generalisations (median, 95th percentile, 5th percentile, interquartile range) then you can quickly get a decent description of the whole.


Kind of like how thinking with probability distributions can help you avoid stereotypes: you can understand the distinctions between

  • the mean 100m sprint time of all men is faster than the mean 100m sprint time of all women
  • the medians are rather close, perhaps identical
  • the top 10% of women run faster than the bottom 80% of men
  • the variance of male sprint times is greater than the variance of female sprint times
  • differences in higher moments, should they exist
  • the CVaR's of the distributions are probably equivalent
  • conditional distributions (sub-divisions of sprint times) measured of old men; age 30-42 black women; age 35 Caribbean-born women of any race of non-US nationality who live in the state of Alabama
  • and so on.

It becomes harder to sustain sexism, racism, and to sustain stereotypes of all sorts. It becomes harder to entertain generalistic, simplistic, model-driven, data-less economic thinking.

  • For instance, the unemployment rate is the collapse/sum of ∀ lengths of individual unemployment spells: ∫ (length of unemp) • (# of people w/ that unemp length) = ∫ x • ƒ(x) dx.

    Like the dynamic vapor pressure of a warm liquid in a closed container, where different molecules are pushing around in the gas and alternately returning to the soup. The total pressure looks like a constant, but that doesn’t mean the same molecules are gaseous—nor does it mean the same people are unemployed.

    (So, for example, knowing that the unemployment rate is higher doesn’t tell you whether there are a few more long-term unemployed people, a lot more short-term unemployed people, or a mix.)
  • You can generalise about a group using different functionals. The average wealth (mean functional) of an African-American Estadounidense is lower than the average wealth of a German-American Estadounidense, but that doesn’t mean there aren’t wealthy AA’s (max functional) or poor GA’s (min functional).
  • You don’t have to collapse all the data into just one statistic.

    You can also collapse the data into groups, for example collapsing workers into groups based on their industry.

    (here the vertical axis = number of Estadounidenses employed in a particular industry — so the collapse is done differently at each time point)
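
The unemployment point can be sketched with two made-up towns. Everything here is hypothetical: the spell lengths, the labour-force size, and the 27-week cutoff for “long-term”. The two towns share the same rate, yet the other functionals and the grouped collapse tell different stories.

```python
from collections import Counter
from statistics import median

LABOUR_FORCE = 100  # hypothetical size of each town's labour force

# Length (in weeks) of each unemployed person's current spell.
spells_a = [4] * 10                           # everyone briefly out of work
spells_b = [1, 1, 1, 2, 2, 2, 2, 3, 60, 80]   # mostly churn, plus two long spells


def unemployment_rate(spells, labour_force=LABOUR_FORCE):
    """One number: unemployed people divided by the labour force."""
    return len(spells) / labour_force


def bucket(weeks, cutoff=27):
    """Assumed cutoff: 27+ weeks counts as long-term."""
    return "long-term" if weeks >= cutoff else "short-term"


def collapse_into_groups(spells):
    """Collapse to a few group counts instead of one statistic."""
    return Counter(bucket(w) for w in spells)


for name, spells in (("A", spells_a), ("B", spells_b)):
    print(name, unemployment_rate(spells), median(spells), max(spells),
          dict(collapse_into_groups(spells)))
```
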

Various facts about Venn Diagrams, calculus, and measure theory constrain the possible logic of these situations. It becomes tempting to start talking about underlying models, variation along a dimension, and “the real causes" of things. Which is fun.

At the same time, it becomes harder to conceive overly simplistic statements like “Kentuckians are poorer than New Yorkers”. Which Kentuckians do you mean? And which New Yorkers? Are you saying the median Kentuckian is poorer than the median New Yorker? Or perhaps that the dollar cutoff for the bottom 70% of Kentuckians is lower than the cutoff for the bottom 50% of New Yorkers? I’m sorry, but there’s too much variation among KY’s and NY’s for the statement to make sense without a more specific functional mapping from the two domains of the people in the states onto a dollar figure.


ADDED: This still isn’t clear enough. A friend read this piece and gave me some helpful feedback. I think maybe what I need to do is explain what the sub-i, sub-j pattern protects against. It protects against making stupid generalisations.

To be clear: in mathematics, a generalisation is good. A general result applies very broadly, and, like the more specific cases, it’s true. Since I talk about both mathematical speech and regular speech here, this might be confusing. But: in mathematics, a generalisation is just as true as the original idea but just applies in more cases. Hence is more likely to apply to real life, more likely to connect to other ideas within mathematics, etc. But as everyone knows, people who “make generalisations” in regular speech are usually getting it wrong.

Here are some stupid generalisations I’ve found on the Web.

  • Newt Gingrich: "College students are lazy."
     Is that so? I bet that only some college students are lazy.

    Maybe you could say something true like “The total number of hours studied divided by total number of students (a functional ℝ⁺^{# students}→ℝ⁺) is lower than it was a generation ago.” That’s true. But look at the quantiles, man! Are there still the same number of studious kids but only more slackers have enrolled? Or do 95% of kids study less? Is it at certain schools? Because I think U Chicago kids are still tearing their hair out and banging their heads against the wall. 
  • Do heterodox economists straw-man mainstream economics?
    I’m sure there are some who do and some who don’t.
  • The bad economy is keeping me unemployed.
    That’s foul reasoning. A high general unemployment rate says nothing directly about your sector or your personal skills. It’s a spatial average. Anyway, you should look at the length of personal unemployment spells for  
  • Conservatives say X. Liberals say Y. Libertarians think Z.
    Probably not ∀ conservatives say X. Nor ∀ liberals say Y. Nor do ∀ libertarians think Z. Do 70% of liberals say Y? Now that I’m asking you to put numbers to the question, that should make you think about defining who is a liberal and measuring what they say. Not only listening to the other side, but quantifying what they say. Are you so sure that 99% of libertarians think Z now?
  • The United States needs to focus on creating high-tech jobs.
    Are you actually just talking about opportunities for upper-middle-class people in Travis County, TX and Marin County, CA? Or does your idea really apply to Tuscaloosa, Flint, Plano, Des Moines, Bemidji, Twin Falls, Lawrence, Tempe, Provo, Cleveland, Shreveport, and Jacksonville?
  • Green jobs are the future!
    For whom?
  • Alaskans are enslaved to oil companies.
  • Meat eaters, environmentalists, blacks, hipsters, … you can find something negative said about almost any group.
    Without quantification or specificity, it will almost always be false. With quantification, one must become aware of the atoms that make up a whole—that the unique atoms may clump into natural subgroups; that variation may derive from other associations—that the true story of a group is always richer and more interesting than the imagined stereotypes and mental shorthand.
  • What’s wrong with the teenage mind? WSJ.
    a teenage mind?
  • French women eat rich food without getting fat. Book.
  • French parents are better than American parents. WSJ.
  • What is it about twenty-somethings? NY Times.

If you sub-i, sub-j these statements, you can come up with a more accurate and productive sentence that could move disagreeing parties forward in a conversation.
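
Here is one way to sub-i, sub-j the “college students are lazy” claim, as a sketch only. The weekly study hours and both cutoffs are invented for illustration. The ∀-claim fails, the mean really does drop, and yet the number of studious students has not changed at all: more slackers simply enrolled.

```python
from statistics import mean

LAZY_BELOW = 5       # hypothetical cutoff, weekly study hours
STUDIOUS_FROM = 15   # hypothetical cutoff, weekly study hours

# Hypothetical weekly study hours, one entry per student i.
hours_then = [10, 12, 14, 15, 16, 18, 20, 25]
hours_now = hours_then + [1, 2, 2, 3]   # same students, plus newly enrolled slackers


def fraction(predicate, xs):
    """Share of individuals i for which predicate(x_i) holds."""
    return sum(1 for x in xs if predicate(x)) / len(xs)


def is_lazy(h):
    return h < LAZY_BELOW


def is_studious(h):
    return h >= STUDIOUS_FROM


print("all lazy?", all(is_lazy(h) for h in hours_now))
print("fraction lazy:", fraction(is_lazy, hours_now))
print("mean hours then/now:", mean(hours_then), mean(hours_now))
print("studious then/now:",
      sum(map(is_studious, hours_then)), sum(map(is_studious, hours_now)))
```
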


Unwarranted generalisations are like Star Trek: portraying an entire race as being defined by exactly one personality trait (“Klingons are warlike”, “Ferengi only care about money”). That sucks. The sub-i, sub-j way is more like Jack Kerouac’s On the Road: observing and experiencing individuals for who they are. That’s the way.

Neal Cassady, Allen Ginsberg, William S Burroughs (old), Jack Kerouac

If you want to make true generalisations—well, you’re totally allowed to use a functional. That means the generalisations you make are valid—limited, not overbearing, not reading too much into things, not railroading individuals who contradict your idea in service of your all-important thesis.

OK, maybe I’ve found it: a good explanation of what I’m trying to say. There are valid ways to generalise about groups and there are invalid ways. Invalid is making sweeping over-generalisations that aren’t true. Sub-i, sub-j generalisations are true to the subject while still moving beyond “Everyone is different”.

With the increasing availability of complicated alternative investment strategies to both retail and institutional investors, and the broad availability of financial data, an engaging debate about performance analysis and evaluation is as important as ever. There won’t be one right answer delivered in these metrics and charts. What there will be is an accretion of evidence, organized to assist a decision maker in answering a specific question that is pertinent to the decision at hand.
Performance Analytics R package
(by Brian G. Peterson & Peter Carl) 

For those not in the know, here’s what mathematicians mean by the word “measurable”:

  1. The problem of measure is to assign a ℝ size ≥ 0 to a set. (The points not necessarily contiguous.) In other words, to answer the question:
    How big is that?
  2. Why is this hard? Well just think about the problem of sizing up a contiguous ℝ subinterval between 0 and 1.
    • It’s obvious that [.4, .6] is .2 long and that
    • [0, .8] has a length of .8.
    • I don’t know what the length of √2√π/3] is but … it should be easy enough to figure out.
    • But real numbers can go on forever: .2816209287162381682365...1828361...1984...77280278254....
    • Most of them (the transcendentals) we don’t even have words or notation for.
      most of the numbers are black = transcendental
    • So there are a potentially infinite number of digits in each of these real numbers — which is essentially why the real numbers are so f#cked up — and therefore ∃ an infinitely infinite number of numbers just between 0% and 100%.

    Yeah, I said infinitely infinite, and I meant that. More real numbers exist in-between .999999999999999999999999 and 1 than there are atoms in the universe. There are more real numbers just in that teensy sub-interval than there are integers (and there are infinitely many integers).

    In other words, if you filled a set with all of the things between .99999999999999999999 and 1, there would be infinity things inside. And not a nice, tame infinity either. This infinity is an infinity that just snorted a football helmet filled with coke, punched a stripper, and is now running around in the streets wearing her golden sparkly thong and brandishing a chainsaw:
    I think the analogy of ℶ₁ to Patrick Bateman is a solid and indisputable one.

    Talking still of that particular infinity: in a set-theoretic continuum sense, ∃ infinite number of points between Barcelona and Vladivostok, but also an infinite number of points between my toe and my nose. Well, now the simple and obvious has become not very clear at all!
    Europe. Data set: R’s built-in eurodist, the table of pairwise distances between 21 European cities (Athens through Vienna), printed with

        > eurodist

    Multi-dimensional scaling of the distances:

        > cmdscale(eurodist)
                                [,1]        [,2]
        Athens           2290.274680  1798.80293
        Barcelona        -825.382790   546.81148
        Brussels           59.183341  -367.08135
        Calais            -82.845973  -429.91466
        Cherbourg        -352.499435  -290.90843
        Cologne           293.689633  -405.31194
        Copenhagen        681.931545 -1108.64478
        Geneva             -9.423364   240.40600
        Gibraltar       -2048.449113   642.45854
        Hamburg           561.108970  -773.36929
        Hook of Holland   164.921799  -549.36704
        Lisbon          -1935.040811    49.12514
        Lyons            -226.423236   187.08779
        Madrid          -1423.353697   305.87513
        Marseilles       -299.498710   388.80726
        Milan             260.878046   416.67381
        Munich            587.675679    81.18224
        Paris            -156.836257  -211.13911
        Rome              709.413282  1109.36665
        Stockholm         839.445911 -1836.79055
        Vienna            911.230500   205.93020

    Plot:

        require(stats)
        loc <- cmdscale(eurodist)
        rx <- range(x <- loc[,1])
        ry <- range(y <- -loc[,2])
        plot(x, y, type="n", asp=1, xlab="", ylab="")
        abline(h = pretty(rx, 10), v = pretty(ry, 10), col = "light gray")
        text(x, y, labels(eurodist), cex=0.8)
    So it’s a problem of infinities, a problem of sets, and a problem of the continuum being such an infernal taskmaster that it took until the 20th century for mathematicians to whip-crack the real numbers into shape.
  3. If you can define “size” on the [0,1] interval, you can define it on the [−535,19^19] interval as well, by extension.

    If you can’t even define “size” on the [0,1] interval — how do you think you’re going to define it on all of ℝ? Punk.
  4. A reasonable definition of “size” (measure) should work for non-contiguous subsets of ℝ such as “just the rational numbers” or “all solutions to cos² x = 0” (they’re not next to each other) as well.

    Just another problem to add to the heap.
  5. Nevertheless, the monstrosity has more-or-less been tamed. Epsilons, deltas, open sets, Dedekind cuts, Cauchy sequences, well-orderings, and metric spaces had to be invented in order to bazooka the beast into submission, but mostly-satisfactory answers have now been obtained.

    It just takes a sequence of 4-5 university-level maths classes to get to those mostly-satisfactory answers.
    One is reminded of the hypermathematicians from The Hitchhiker’s Guide to the Galaxy who time-warp themselves through several lives of study before they begin their real work.
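
To make the “how big is that?” question concrete, here is a small sketch. The function names are my own. The first part sizes up a finite union of intervals, which is the easy case from step 2. The second part shows the standard covering trick behind the claim that “just the rational numbers” has size zero: wrap the n-th rational in an interval of length ε/2ⁿ⁺¹, so the whole cover has total length below ε, however small you pick ε. Only the first 50 rationals are covered here, since a program cannot list them all.

```python
from fractions import Fraction


def length_of_union(intervals):
    """Total length of a finite union of closed intervals (a, b)."""
    total = 0
    start = end = None
    for a, b in sorted(intervals):
        if end is None or a > end:      # disjoint from the current run
            if end is not None:
                total += end - start
            start, end = a, b
        else:                           # overlaps: extend the current run
            end = max(end, b)
    if end is not None:
        total += end - start
    return total


def first_rationals(count):
    """First `count` distinct rationals in [0, 1], ordered by denominator."""
    seen, out = set(), []
    q = 1
    while len(out) < count:
        for p in range(q + 1):
            r = Fraction(p, q)
            if r not in seen:
                seen.add(r)
                out.append(r)
                if len(out) == count:
                    break
        q += 1
    return out


print(length_of_union([(0.4, 0.6)]))
print(length_of_union([(0, 0.8)]))

eps = Fraction(1, 100)
rationals = first_rationals(50)
# The n-th rational gets an interval of length eps / 2**(n + 1).
cover = [(r - eps / 2 ** (n + 2), r + eps / 2 ** (n + 2))
         for n, r in enumerate(rationals)]
print(length_of_union(cover) < eps)
```
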


For a readable summary of the reasoning & results of Henri Lebesgue's measure theory, I recommend this 4-page PDF by G.H. Meisters. (NB: His weird ∁ symbol means complement.)

That doesn’t cover the measurement of probability spaces, functional spaces, or even more abstract spaces. But I don’t have an equally great reference for those.

Oh, I forgot to say: why does anyone care about measurability? Measure theory is just a highly technical prerequisite to true understanding of a lot of cool subjects — like complexity, signal processing, functional analysis, Wiener processes, dynamical systems, Sobolev spaces, and other interesting and relevant such stuff.

It’s hard to do very much mathematics with those sorts of things if you can’t even say how big they are.

Given a time-series of one security’s price-train P[t], a low-frequency trader’s job (forgetting trading costs) is to find a step function S[t] to convolve against price changes dP[t]

    profit(τ) = ∫₀^τ S[t] dP[t]

with the proviso that the other side to the trade exists.

S[t] represents the bet size long or short the security in question. The trader’s profit at any point in time τ is then given by the above definite integral.
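
In discrete time the integral becomes a running sum of bet size times price change. A minimal sketch, assuming the position S[t] is held over the interval from t to t+1 and ignoring trading costs as above (the prices and positions are made up, and `profit_path` is a hypothetical name):

```python
from itertools import accumulate


def profit_path(prices, positions):
    """Running profit: cumulative sum of S[t] * (P[t+1] - P[t])."""
    changes = [b - a for a, b in zip(prices, prices[1:])]
    if len(positions) != len(changes):
        raise ValueError("need exactly one position per price change")
    return list(accumulate(s * dp for s, dp in zip(positions, changes)))


prices = [100, 102, 101, 105, 103]   # hypothetical P[t]
positions = [1, 0, 2, -1]            # hypothetical S[t]: long 1, flat, long 2, short 1

print(profit_path(prices, positions))
```

The last entry of the returned list is the profit at the final time τ. Changing nothing but `positions` changes the whole path, which is the sense in which the bet size is the only thing the trader controls.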

  • I haven’t seen anyone talk this way about the problem, perhaps because I don’t read enough or because it’s not a useful idea. But … it was a cool thought, representing a >0 amount of cogitation.
  • This came to mind while reading a discussion of “Monkey Style Trading” on NuclearPhynance. My guess is that monkey style is a Brownian ratchet and as such should do no useful work.
  • If I were doing a paper investigating the public-welfare consequences of trading, this is how I’d think about the problem.

    Each hedge fund / central bank / significant player is reduced to a conditional response strategy, chosen from the set of all step functions uniformly less than a liquidity constraint. This endogenously coughs up the trading volume which really should be fed back into the conditional strategies.
  • Does this viewpoint lead to new risk metrics?
  • Should be mechanical to expand to multiple securities. Would anything interesting come from that?

I wouldn’t usually think that multiplication of functions has anything to do with trading. Maybe some theorems can do a bit of heavy lifting here; maybe not.

It at least feels like an antidote to two wrongful axiomatic habits. For economists who look for real value, logic, and Information Transmission, it says The market does whatever it wants, and the best response is a response to whatever that is. For financial engineering graduates who spent too long chanting the mantra “μ dt + σ dBt”, this is just another way of emphasising: you can’t control anything except your bet size.

UPDATE: Thanks to an anonymous commenter for a correction.

OKCupid is using the wrong mathematics to match potential dates together. But before I critique them, let me compliment them on what they’re doing right:

  • "Our" mutual score is the geometric average of your score of me, and my score of you.
  • They low-ball the match % until they have enough statistical confidence in the number of questions we’ve both answered.
  • Questions come from users as well as staff. So they avoid some potential blind spots. (crowdsourcing)
  • OKCupid prompts you with questions that have the greatest chance of distinguishing you as quickly as possible. (maximally separating hyperplanes) If OKC already knows you want your date to shower at least once a day, keep a clean room, and that picking food from the trashcan is unacceptable, it won’t ask if you prefer crustpunks or gutterpunks.
  • You don’t have to be the same as me for us to match. I get to specify what answers I want from you.
  • They use a logarithmic scale of importance. Logs are the natural way we perceive levels or categories of importance. (For example “categories” of how big a war was, emerge naturally when you take the log of number of deaths.)
  • It’s simple. At least they’re not using a non-linear Bayesian splitting tree didactogram or some other hunky machine-learning jiu jitsu.

But, there’s still room for improvement. Particularly the following critique, originally made by Becky Russoniello. Currently, OKCupid is set up to award high scores just for being not-a-terrible match. That’s bad.


To show why I need to first detail how your score of me is calculated:

  1. You answer questions like, “Is homosexuality a sin?” Your answer consists of: (a) what you think, (b) what answer/s are acceptable for me to give, and (c) how important it is for me to get this question “right” per your definition.
  2. The question’s importance draws from {Mandatory, Very Important, Somewhat Important, A Little Important, Irrelevant} which biject to the numbers {250, 50, 10, 1, 0}.
  3. If I get a Very Important question “right”, I get 50/50 points, and if I get a Very Important question “wrong”, I get 0/50 points. If I haven’t answered the Very Important question, I get 0/0 points — neither penalised nor rewarded.
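Here is a minimal sketch of those three steps. The point values come from the list above; the function name, the data layout, and returning None when we share no answered questions are my assumptions, not anything OKCupid has published.

```python
# Importance levels and their point values, as listed above.
IMPORTANCE_POINTS = {
    "Mandatory": 250,
    "Very Important": 50,
    "Somewhat Important": 10,
    "A Little Important": 1,
    "Irrelevant": 0,
}

def one_way_score(questions):
    """Your score of me, as a fraction.

    questions: iterable of (importance, outcome) pairs, where outcome is
    True ("right"), False ("wrong"), or None (I haven't answered).
    """
    earned = 0
    possible = 0
    for importance, outcome in questions:
        if outcome is None:
            continue  # 0/0: neither penalised nor rewarded
        points = IMPORTANCE_POINTS[importance]
        possible += points  # the denominator grows on every shared question
        if outcome:
            earned += points
    # Assumption: with nothing in common there is no score to report.
    return earned / possible if possible else None

answers = [
    ("Very Important", True),       # 50/50
    ("Somewhat Important", False),  # 0/10
    ("Mandatory", None),            # 0/0
]
print(one_way_score(answers))  # 50/60
```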

For more details, see their FAAAQ.



Here’s the important flaw: the denominator grows as long as we’ve answered the same question. In practice, the Mandatory questions both

  1. crowd out more interesting differentiators, and
  2. inflate the scores of people who merely have tolerable political views.

To demonstrate this, I’ll share some of the Mandatory questions from my own OKCupid profile.

  • Do you think homosexuality is a sin?
  • How often are you open with your feelings? (can’t be Rarely or Never)
  • Would it bother you if your boss was minority, female, or gay?
  • Would you write your child’s college entry essay?
  • What volume level do you prefer when listening to music? (can’t be “I prefer not to listen to music”)
  • Would you try to control your mate with threats of suicide?
  • Gay marriage — should it be legal?
  • Are you married, engaged to be married, or in a relationship that you believe will lead to marriage?
  • How important to you is a match’s sense of humor? (can’t be Not Important)
  • Would the world be a better place if people with low IQ’s were not allowed to reproduce?

Some other doozies which I might wrongly make Mandatory include:

  • Which is bigger? The Earth, or the Sun?
  • How many continents are there?
  • Do you consider astrology to be a legitimate science?

The problem with all of these filters is that I mean them to act only in a negative direction. (Could I call them “quasi-filters”?)



In other words, someone doesn’t become a great potential match simply because they’re not

  • a bigot,
  • a cheat,
  • a eugenicist,
  • or a depressive manipulative.

You need to receive those check-marks just to get to zero with me. You also need to be not-married-to-someone-else. That doesn’t win you plus points, it’s just a requirement. But under the current OKCupid schema, you do win 250/250 from me for simply being available. Oops.

Likewise, knowing basic facts from grade-school seems, like, uh, necessary. But, even if somebody thinks there are 6 or 8 continents, do you really think you won’t be able to tell once they message you?

Few people will be culled by the Continents question, and if you make 10 such easy questions Mandatory, then everybody else will start with 2500/2500 points, so the rest of your match questions will barely distinguish one from the other. Even the Very Important questions (50 points apiece) will only budge the score a little below a default of 100%. And the Somewhat Important questions, which tend to be the more discriminative ones, are mowed down by the juggernaut of Easy Questions.
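To put numbers on the dilution, here is a small worked example. The candidate (2 of 5 Somewhat Important questions right) is invented for illustration, and `percent` and `padding` are just names I made up for the arithmetic.

```python
def percent(right_points, wrong_points, padding=0):
    """Match percentage, where `padding` is points everyone gets from easy questions."""
    earned = padding + right_points
    possible = padding + right_points + wrong_points
    return 100 * earned / possible

# A hypothetical candidate: 2 of 5 Somewhat Important questions right (10 points each).
right, wrong = 20, 30

print(percent(right, wrong))                          # 40.0
print(round(percent(right, wrong, padding=2500), 2))  # 98.82
```

Someone who gets all five right scores 100% either way, so ten easy Mandatory questions shrink the gap between these two candidates from 60 percentage points to a bit over 1.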

EDIT (23 NOV): According to the comments, the number of continents is not a universal fact, but rather varies from culture to culture (and within cultures). So that’s a really terrible question to make Mandatory! I should have said above Few people will be culled by asking whether the Earth is bigger than the sun, and if you make 10 such easy questions Mandatory, then everybody else will start with 2500/2500 points.

OKCupid asks other, more useful questions, like:

  • Are you annoyed by people who are super logical?
  • Do you like abstract art?
  • Do you spend more money on clothes, or food?
  • Could you tolerate a ___________________ [my political / religious views] ?
  • Do you like dogs?

which would actually distinguish among potential dates for me. Let’s face it: I write a blog about mathematics, so someone who is annoyed by super logical people is probably going to dislike me. And, I like abstract art. Maybe we could go to a gala for our first date.

Although everyone knows the Sun is bigger than the Earth, not everyone is bothered by “logical” personalities. So those questions better sort the available dates.

want to go on a cruise on us stevenf?


The worst side effect of the current scoring system is that a spammer could easily answer only the questions with obvious answers (basic facts and display of non-bigotry) and get a decently high match percentage with a lot of people. At which point, the spammer uploads a picture of an attractive guy/girl, writes some generic profile text, and scams away.



I think a better model of how people evaluate potential dates can be found within economics. Specifically, Kahneman & Tversky's Prospect Theory:


The main lessons I draw from prospect theory, as a theory of psychology, are:

  1. We evaluate things based on a reference point (“zero”).
  2. Small perceived negatives are about twice as bad as small perceived positives are good (“local kink at zero”).
  3. When something is really bad or really good, we lose our ability to coherently measure how far from zero it is (“log-like at great distances”).
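Those three lessons can be squeezed into one toy function. To be clear, this is not the functional form Kahneman & Tversky fitted; it is just a shape I picked that has a reference point at zero, a factor-of-two kink there, and log-like flattening far away. The name `perceived_value` and the scale of `x` are hypothetical.

```python
import math

def perceived_value(x: float, loss_aversion: float = 2.0) -> float:
    """Toy value function: x is distance from the reference point (zero).

    Assumed shape, for illustration only: log1p on the gain side,
    the same curve scaled by `loss_aversion` on the loss side.
    """
    if x >= 0:
        return math.log1p(x)                # slope ~1 near zero, log-like far out
    return -loss_aversion * math.log1p(-x)  # slope ~2 near zero: the kink

small_gain = perceived_value(0.01)
small_loss = perceived_value(-0.01)
print(abs(small_loss) / small_gain)  # ratio of small loss to small gain, about 2
print(perceived_value(100), perceived_value(1000))  # ten times further out, only a few units more
```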

How does P.T. apply to dating and OKCupid?

Amos Tversky

Bigots, cheats, eugenicists, and depressive manipulatives are way off in negative land. I’m not even interested in meeting them. I don’t care whether OKC gives them a 0% or a 10%, because those are effectively the same to me: ignore. I only need OKCupid to accurately score people who are somewhere north of my reference point.

  • What if the scoring system simply binned everyone below 50%? They could all be labelled “non-match” and then twice as many numbers would be available to grade the remaining candidates.

    That’s a mathematically good idea, but doesn’t address the issue of dilution. And, it seems to ignore an aspect of “numbers psychology”: people like using only the upper half of the scale. Think about how people use the hotness scale: they would never be comfortable dating a 4.
  • What if OKCupid revamped their whole framework along the lines of Prospect Theory? Try to establish a reference point, do some research into psychology papers that bear on the topic, and so on.

    Well, it might be cool. But that’s a lot of work, and OKC is already successful. Big changes alienate users.

Here’s the simplest solution I can think of — which requires no UI changes and no research. In fact an OKC developer should only need to amend one line of code.

  • Mandatory questions can only give out negative points for answering wrong. No plus points for right answers to Mandatory.

Mathematically this is ugly because you introduce a discontinuity — but, so what? I think this is what the broad majority of people mean when they say something is mandatory. If you have a mandatory employee meeting, do people get a bonus for showing up? Does HM Revenue pat you on the back for paying tax?

In the eloquent phrasing of Chris Rock:

If OKC ends up giving some negative (or I guess imaginary, under the square root from the geometric average) scores, so what? I was ignoring everybody under 60% anyway.
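Here is one way the one-line change could look. This is my reading of the proposal, not OKCupid's code: a right answer to a Mandatory question counts 0/0, a wrong answer subtracts 250 from the numerator, and the denominator only grows on non-Mandatory questions. The function name and the handling of the empty case are assumptions.

```python
import cmath
import math

POINTS = {
    "Mandatory": 250,
    "Very Important": 50,
    "Somewhat Important": 10,
    "A Little Important": 1,
    "Irrelevant": 0,
}

def score_with_negative_mandatory(questions):
    """questions: (importance, outcome) pairs; outcome is True, False, or None."""
    earned = 0
    possible = 0
    for importance, outcome in questions:
        if outcome is None:
            continue  # unanswered: 0/0 as before
        points = POINTS[importance]
        if importance == "Mandatory":
            if not outcome:
                earned -= points  # penalty only, no reward for passing
            continue
        possible += points
        if outcome:
            earned += points
    if possible == 0:
        # Assumption: nothing scoreable, unless a Mandatory was failed.
        return None if earned == 0 else -math.inf
    return earned / possible

passes = [("Mandatory", True), ("Very Important", True), ("Somewhat Important", False)]
fails = [("Mandatory", False), ("Very Important", True), ("Somewhat Important", False)]

print(score_with_negative_mandatory(passes))  # 50/60: the Mandatory adds nothing
print(score_with_negative_mandatory(fails))   # (50 - 250)/60: negative

# The geometric average of a negative and a positive score is imaginary.
print(cmath.sqrt(score_with_negative_mandatory(fails) * 0.9))
```

Anyone who clears the Mandatory questions is now scored only on the questions that actually distinguish them, and anyone who fails one drops out of the range I was looking at anyway.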


If you use OKCupid, there is a way to improve your matches even if they never change their matching algorithm:

  • Lower the importance of questions with obvious answers. I bet you won’t start matching with people who believe the Earth is larger than the Sun. And you will pick up extra precision in matches with other people.
  • Even if something is mandatory for you to date someone, don’t use the Mandatory category like that. Maybe you can have a few mandatory questions, but overall it just dilutes the scoring.