Posts tagged with optimisation

High modernist subjectivity gives an extraordinary privilege … to judgement and especially to cognition…. The modern predominance of reading….

High [modernism] … furthermore … privileges the cognitive and moral over the aesthetic and the libidinal, the ego over the id, the visual over touch, and discursive over figural communication.

…the individual [is] somehow ‘closed’ instead of open; to be somehow obsessed with self-mastery and self-domination.

Lash, S. & Friedman, J. (Eds.). (1993). Modernity & Identity. Massachusetts: Blackwell, pg. 5

via writingcapital



gradient descent on a 2-dimensional convex, quadratic cost function with condition number=100

  • adding momentum to the gradient speeds up convergence in these high-condition-number cases, while still using gradient descent (which scales better than Newton-Raphson in high dimensions)
  • like adding momentum in an oscillating mechanical system that vibrates too much
  • heavy ball method (Polyak)
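
The caption above can be tried out numerically. What follows is only a sketch under stated assumptions: the cost function, the starting point, the tolerance and all names are made up for illustration, and the step sizes are the usual textbook tuning for a quadratic with known smallest and largest curvature, not anything taken from the picture.

```python
import math

# Assumed test problem: f(x, y) = 0.5 * (1 * x**2 + 100 * y**2).
# Its Hessian has eigenvalues 1 and 100, so the condition number is 100.
CURVATURES = (1.0, 100.0)


def gradient(point):
    """Gradient of the assumed quadratic at `point`."""
    return tuple(c * p for c, p in zip(CURVATURES, point))


def iterations_to_converge(step, momentum, start=(1.0, 1.0), tol=1e-6, max_iter=10_000):
    """Count heavy-ball steps until the iterate is within `tol` of the minimum (the origin).

    momentum=0 reduces the update to plain gradient descent.
    """
    previous = current = start
    for k in range(1, max_iter + 1):
        grad = gradient(current)
        # Heavy ball: gradient step plus a fraction of the previous displacement.
        nxt = tuple(
            x - step * g + momentum * (x - x_prev)
            for x, g, x_prev in zip(current, grad, previous)
        )
        previous, current = current, nxt
        if math.hypot(*current) < tol:
            return k
    return max_iter


mu, big_l = CURVATURES
sqrt_mu, sqrt_l = math.sqrt(mu), math.sqrt(big_l)

# Textbook tunings for a quadratic whose curvatures lie in [mu, big_l].
plain = iterations_to_converge(step=2 / (big_l + mu), momentum=0.0)
heavy = iterations_to_converge(
    step=4 / (sqrt_l + sqrt_mu) ** 2,
    momentum=((sqrt_l - sqrt_mu) / (sqrt_l + sqrt_mu)) ** 2,
)
print(f"plain gradient descent: {plain} steps; heavy ball: {heavy} steps")
```

On this toy problem the momentum version should need several times fewer steps, which is the oscillating-spring intuition in numbers.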



For me this question is a symbol of problems that people’s intuitions are good at, but their mathematical models fail at.

  • We can certainly define the domain (set of possible words)
  • and we can define reasonable scalar-ish codomains (number of hit records, rankings by critics, faces on the people outside your show, …)
  • but how would you set up an optimisation problem to answer this question?


It doesn’t just fail because it needs to be parameterised by

  • the history of other bands (“Lady Gaga”)
  • puns or linguistic meaning (“The Beatles”)
  • emotional tenor of the band’s songs (imagine if The BeeGees were instead called Thräsherdëth)


but also because calculus' Really Cool Idea

finds no purchase since any 1-D lineup of all possible band names won’t be 𝒞¹ onto the success of the band.


Like “What should I write about today?”, “What line of business should I get into?”, “What scientific problem should I study?”, “What should I do with my life?”, and a lot of other “broad, open-ended” questions, choosing a band-name is something I don’t think can be mathematised today. It’s also a mental shorthand for me for any question that is going to be answered better by “art” than by “science”.

The dual V* of a vector space V  over ℝ matches lists of reals to linear functionals.

What’s the simplest way to say this? Talk about a number like “5”. Initially I think of it as 5 stones ⬤⬤⬤⬤⬤. But I could also imagine a line through the origin with a slope of 5, representing the verb quintuple.

[Figure: linear maps as multiplication. Pictures of lines through the origin with various slopes; notice that linear mappings are ALL straight lines through the origin.]

Seen as a function, ƒ₅ = quintuple (the-line-through-the-origin-with-a-slope-of-5) is ƒ₅(x)=5•x. That ƒ₅ does things like

  • ƒ₅(■■■)=5+5+5 and
  • ƒ₅(■■■■■■)=5+5+5+5+5+5.

Counting in the dual space ƒ₀,ƒ₁,ƒ₂,… would look like _ / ∕ ...|. Increasing slope from _ to ⁄ to | instead of increasing number from 0 to 1 to ∞. Or I could say zero, id, double, triple, quadruple, quintuple, ….

(Why did I jump so suddenly 0,1,… from _ flat to ⁄ 45°? This just proves that half of ℝ⁺ is stuffed into (0,1) and the other half lies in (1,∞). To jump between the two worlds you use the reciprocal map flip(■)≝1/■. Then you’d be counting id, half, third, fourth, fifth, sixth, seventh… Infinity in a teacup.)


These two things—the five rocks ⬤⬤⬤⬤⬤ and the function ƒ₅—aren’t even the same kind of thing. One is a noun and one is a verb.

But still, for any real number that I “counted” I could match up a function, just like I did with

  • "5 that I counted" and
  • "function ƒ₅ = quintuple"

So these two qualitatively different things are in bijection. (One can hope for insights by viewing things through one lens or the other, noun or verb version.)

This one-dimensional story can be upgraded to a multi-dimensional one where

  • lists of reals (3.1, √2, −2.1852, ..., 6) 

match to

  • many-to-one functions ƒ( list ) = 3.1•list[first] + √2•list[second] − 2.1852•list[third] + ... + 6•list[Nth].

Translating between the noun and verb viewpoints is then called musical isomorphism, represented with ♭ and ♯ symbols. Raising and lowering indices in a tensor is ♯ and ♭.
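
Here is a small sketch of the noun-to-verb translation in code. The helper names `as_functional` and `as_list` are invented for this illustration, and the coefficient list is cut to three entries because the "..." in the example above is left unspecified.

```python
import math


def as_functional(coefficients):
    """Turn a list of reals (the noun) into the linear functional it names (the verb)."""
    coefficients = tuple(coefficients)

    def functional(values):
        values = tuple(values)
        if len(values) != len(coefficients):
            raise ValueError("the list must have one entry per coefficient")
        return sum(c * v for c, v in zip(coefficients, values))

    return functional


def as_list(functional, dimension):
    """Go back from verb to noun by feeding the functional each basis vector in turn."""
    return [
        functional([1.0 if i == j else 0.0 for j in range(dimension)])
        for i in range(dimension)
    ]


# One dimension: the number 5 becomes the verb "quintuple".
quintuple = as_functional([5])
print(quintuple([3]))  # 15

# Several dimensions: a list of reals becomes a many-to-one linear function.
f = as_functional([3.1, math.sqrt(2), -2.1852])
u, v = [1.0, 2.0, 3.0], [0.5, -1.0, 4.0]

# Linearity check: f(2u + v) should equal 2 f(u) + f(v), up to rounding.
lhs = f([2 * a + b for a, b in zip(u, v)])
rhs = 2 * f(u) + f(v)
print(math.isclose(lhs, rhs))

# The round trip recovers the original list, which is the bijection in question.
print(as_list(f, 3))
```

This is the plain Euclidean version of the matching, where coefficients and coordinates line up one for one; the ♭ and ♯ maps do the same job when a metric other than the ordinary dot product is in play.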

Akamai delivers like 20% of all internet traffic.

5 minutes explaining what it is they do.

  • differences between last & first mile of HTTP delivery, versus “the middle mile”
  • TCP is a really chatty technology
  • edge servers
  • bottlenecks
  • optimal delivery path in terms of time, not hops
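
To see why "time, not hops" matters, here is a toy shortest-path sketch. The network, the node names and the latencies are all invented for illustration; nothing here describes how Akamai actually routes traffic. It runs the same Dijkstra search twice, once counting hops and once adding up latency.

```python
import heapq

# Hypothetical network: each edge carries a made-up latency in milliseconds.
NETWORK = {
    "origin": {"X": 80, "A": 20},
    "X": {"user": 90},
    "A": {"B": 25},
    "B": {"user": 30},
}


def shortest_path(graph, source, target, weight):
    """Dijkstra's algorithm; `weight` maps an edge's latency to the cost being minimised."""
    queue = [(0, source, [source])]
    settled = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == target:
            return cost, path
        if node in settled:
            continue
        settled.add(node)
        for neighbour, latency_ms in graph.get(node, {}).items():
            if neighbour not in settled:
                heapq.heappush(
                    queue, (cost + weight(latency_ms), neighbour, path + [neighbour])
                )
    return None  # target unreachable


# Every edge costs 1: fewest hops wins.
by_hops = shortest_path(NETWORK, "origin", "user", weight=lambda ms: 1)
# Every edge costs its latency: fastest route wins.
by_time = shortest_path(NETWORK, "origin", "user", weight=lambda ms: ms)

print("fewest hops:", by_hops)
print("least time: ", by_time)
```

In this made-up network the two-hop route is the slow one, and the three-hop route through A and B arrives in less than half the time.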

The last bit is a sales pitch; minutes 1–3 are more worth watching than 0–1 or 3+.

(by akamaitechnologies)

In the world of constrained optimisation (and let’s be realistic, what effort isn’t constrained?), slack vectors ask: “What happens if I push back the walls?”

The canonical purpose of slack vectors is in sensitivity analysis, but I think the metaphor-that-exists-in-your-head-once-you-understand-slack-vectors applies to everyday life too.

When you build a mathematical model of something—starting with assumptions and working through to conclusions—sensitivity analysis questions the assumptions you made at the outset. “What if I was off by 1% about Factor 7? How screwed would I be? What if I was off by 1% about Factor 23?” The slack vector approach treats the optimisation problem as locally linear. So if the problem curves around 5%, 10%, 50%, you should take that into account. Also, look out for large interaction terms—say getting factors 23, 7, and 318 wrong together is much, much worse than getting any individual or pair wrong.
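
A rough sketch of that kind of questioning, using a deliberately tiny model. The `profit` function and its three assumptions are hypothetical stand-ins for "Factor 7" and friends. The code first asks what a 1% error in each assumption costs on its own, then shows how two large errors together do more damage than the sum of their separate effects.

```python
def effect_of_errors(model, assumptions, errors):
    """Change in the model's output when the named assumptions are off by the given fractions."""
    wrong = {
        name: value * (1 + errors.get(name, 0.0))
        for name, value in assumptions.items()
    }
    return model(**wrong) - model(**assumptions)


# Hypothetical toy model: profit from selling `volume` units.
def profit(price, volume, unit_cost):
    return (price - unit_cost) * volume


assumptions = {"price": 10.0, "volume": 1000.0, "unit_cost": 8.0}

# "What if I was off by 1% about each factor, one at a time?"
small = {name: effect_of_errors(profit, assumptions, {name: 0.01}) for name in assumptions}
print(small)

# Large errors: the locally linear picture stops being the whole story.
price_alone = effect_of_errors(profit, assumptions, {"price": 0.5})
volume_alone = effect_of_errors(profit, assumptions, {"volume": 0.5})
both = effect_of_errors(profit, assumptions, {"price": 0.5, "volume": 0.5})

# Whatever is left after subtracting the separate effects is the interaction term.
interaction = both - (price_alone + volume_alone)
print(price_alone, volume_alone, both, interaction)
```

In this toy model the price assumption is the touchy one, and the interaction term is far from negligible once the errors are large.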


How about elsewhere? I see the slack vector metaphor as being appropriate to GPA. Generic human resources desks the world around want to see “a GPA of at least 3.3 / 4.0” or “a GPA of at least 3.7 / 4.0”, which is an ignorant way to go about things.

Obviously, there’s a tradeoff between how difficult a student’s classes are and how high their GPA is. Someone who challenges herself by switching out Psych 101 for Organic Chemistry is likely to experience a lower GPA — not only in O-Chem, but in other courses as well — as she converts resources that could have gone to other classes into the difficult subject. To summarise a semester (let alone 8 of them) with a single number is to ignore

  • the spread and, more importantly,
  • the difficulty

of the classes someone takes. Incorporating all the data would mean something like considering her place in several grade distributions per semester. (You might need to condition upon the professor or the classmates to understand the distribution across years.)

In other words, the “minimum GPA” approach ignores most of the optimisation problem. (It looks at the length of the Lagrangian without taking account of its direction or the constraints of the choice space.) Is your firm really trying to hire grade-grubbers who take the easiest classes they can? Didn’t think so.


Strategic business thinking can be thought of with slack vectors as well. (Or, if you’re labour rather than capital, substitute “strategic career management”.) Assume that you are working within certain constraints, but over a longer time horizon you have the ability to change things at least a little.

  • Which walls should you push against?
  • Which bonds are most worth your effort trying to loosen?
  • (Conversely, over time the constraints that were working in your favour might turn against you; which risks are most important to prepare for?)
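
"Which walls should you push against?" has a concrete answer in a small linear programme. The sketch below uses an invented two-variable problem and a brute-force solver written for this illustration (it is not a library routine). It pushes each wall back by one unit and reports how much the optimum improves.

```python
from itertools import combinations


def solve_lp_2d(objective, constraints):
    """Maximise objective . (x, y) subject to a*x + b*y <= c for each (a, b, c).

    Brute force: the optimum of a bounded 2-D linear programme sits at a vertex,
    so try every intersection of two walls and keep the best feasible one.
    """
    best = None
    for (a1, b1, c1), (a2, b2, c2) in combinations(constraints, 2):
        det = a1 * b2 - a2 * b1
        if abs(det) < 1e-12:  # parallel walls never meet
            continue
        x = (c1 * b2 - c2 * b1) / det
        y = (a1 * c2 - a2 * c1) / det
        if all(a * x + b * y <= c + 1e-9 for a, b, c in constraints):
            value = objective[0] * x + objective[1] * y
            if best is None or value > best:
                best = value
    return best


def gain_per_wall(objective, constraints, push=1.0):
    """Improvement in the optimum, per unit, from pushing each wall back on its own."""
    base = solve_lp_2d(objective, constraints)
    gains = []
    for i, (a, b, c) in enumerate(constraints):
        relaxed = constraints[:i] + [(a, b, c + push)] + constraints[i + 1:]
        gains.append((solve_lp_2d(objective, relaxed) - base) / push)
    return gains


# Hypothetical problem: maximise 3x + 2y inside four walls.
objective = (3.0, 2.0)
walls = [
    (1.0, 1.0, 4.0),   # x + y <= 4
    (1.0, 0.0, 3.0),   # x <= 3
    (-1.0, 0.0, 0.0),  # x >= 0
    (0.0, -1.0, 0.0),  # y >= 0
]
gains = gain_per_wall(objective, walls)
print(gains)
```

Walls with a gain of zero are not binding, so effort spent loosening them is wasted. The wall with the largest gain is the one most worth pushing, at least while the locally linear picture holds.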

Peter Todd has been misquoted about the mathematics of dating here, here, here (here), here, here, here, here, here, here, here, and in at least five trillion issues of Cosmo. (Surprisingly, this and this did not misquote him.) It’s enough to make me want to write a strongly worded DEAR SIR to the Hearst Tower.

Here is what they say:

  • Only after you’ve dated twelve people, are you ready to decide who’s “The One”!

An even wronger version of the story goes like this:

  • The twelfth guy you date — he’s The One! Science says so! No pressure!!!!!!!

Not only is this wrong, but I’ve heard Peter rant in person, specifically about these misquotations. The problem he studies is known colloquially as “The Search for a Parking Space”.

  1. When you arrive at the movie theatre, you circle around the car park until you see an opening. (Let’s assume it’s below freezing outside.)
  2. When you see that opening, you can immediately tell how far away it is from the theatre. So you know how far you will have to walk in the cold.
  3. At that moment, you have to decide whether to drive on (keep looking for somewhere closer) or accept the probably-imperfect husband — oops, I mean parking space — that you’re staring straight in the face (oops, I mean tarmac).
  4. You can’t back up; you can’t see ahead; all you can do is remember the past, guess about the future, and assess the situation you’re in. That’s all you’ve got to go on. Try to solve that problem optimally.

The paper that’s being referenced (though apparently not read) in these magazines deals with an even stricter problem, known as "The Vizier Wants to Keep His Head":

  1. In this version of the blind forward-search problem, the greedy, vindictive, lazy Prince has to choose a wife.
  2. Being lazy, he tasks the Vizier with solving his problem. Being vindictive, if the Vizier gets it wrong, the Vizier loses his head. Being greedy, the Prince wants the Vizier to find him the wife with the richest dowry.

    (I believe dowry is chosen because it’s seen as a one-dimensional, objectively valuable quantity — as opposed to beauty, which is multifaceted and arguable. If we’re talking about various land holdings, I think dowry would also be multifaceted; that things have a single price is an illusion of simplistic economic thinking.

    Imagine a woman whose family had holdings in modern-day Lebrija, Huelva, Palma del Condado, Aracena, and Ayamonte. Each taxable area will bring in unpredictable revenues year upon year, and the natural beauty of each estate is just as disputatious as a woman’s face. So how is that a one-dimensional value? Oh, well. The point is to assign a scalar to each woman.)

  3. The debutantes enter the Prince’s chamber one at a time; as each enters, a courtier reads her name and family holdings. So the Prince and Vizier assign a scalar to that maiden. Then the Prince either proposes marriage or declines.

  4.  Once an heiress has been declined, the Prince can’t call her back. In other words, even if he thinks to himself: “Crap! B_tch Number 37 had a nice rack and a fabulous estate in Milano. I should have gone with her!”, that’s just too bad. Even a handsome, powerful, jerk of a Prince can’t un-dump a ladyfriend.

  5. So the Vizier is set a similar, but more constrained, problem to the Car Park Dilemma: the Prince can’t circle around the way a driver could.

  6. Also, this is important: exactly one-hundred dames will appear before the prince. The solution changes if an infinite progression of dames (or even just all the singles in your greater metropolitan region of choice) paraded before him.

  7. If a richer girl is to be found in either the post-wife sequence or the pre-wife sequence of heiresses, off with the Vizier’s head.

Given that problem: pick the highest scalar from a forward-blind, one-by-one sequence of scalars, the Vizier maximises his probability of living past the ritual (to something like 30%) with the following strategy:

  1. Observe the wealth / beauty / scalar value of the first 12 women.
  2. Whatever is the highest wealth / beauty / scalar out of that group, becomes your “aspiration level” A.
  3. As soon as you see an heiress with wealth/beauty/scalar ≥A, tell the Prince to marry her.

Again, that strategy doesn’t make the Vizier win (i.e., it doesn’t make you pick the perfect boyfriend every time); it merely maximises the chances of maximisation, within this narrowly specified problem.
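
For anyone who wants to poke at the numbers, here is a sketch that computes the exact chance that a "watch the first few, then take the first one at least as good" rule lands on the single richest heiress. It rests on assumptions that go beyond the story as told above: all 100 dowries are distinct, they arrive in uniformly random order, and "winning" means picking the very best. The function name is made up for this illustration.

```python
from fractions import Fraction


def chance_of_picking_best(n, cutoff):
    """Exact probability that the cutoff rule ends on the single best of n candidates.

    Rule: observe the first `cutoff` candidates without choosing, then take the
    first later candidate who beats everyone seen so far.
    Reasoning: the best sits at position k with chance 1/n, and she gets picked
    exactly when the best of the k-1 before her fell inside the observation window.
    """
    if cutoff == 0:
        return Fraction(1, n)  # take the very first candidate
    return Fraction(cutoff, n) * sum(
        Fraction(1, k - 1) for k in range(cutoff + 1, n + 1)
    )


n = 100
after_twelve = chance_of_picking_best(n, 12)
best_cutoff = max(range(n), key=lambda r: chance_of_picking_best(n, r))
at_best_cutoff = chance_of_picking_best(n, best_cutoff)

print(f"watch 12 first: {float(after_twelve):.3f}")
print(f"best cutoff under these assumptions: {best_cutoff}, chance {float(at_best_cutoff):.3f}")
```

Under these strict textbook assumptions, watching twelve first gives roughly a 26% chance of the very best, and the cutoff that maximises that particular chance comes out near n/e, about 37 out of 100. So the twelve and the "something like 30%" quoted above may reflect a different success criterion than "pick the single best". Either way the point stands: the rule raises the odds, it guarantees nothing.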

So here are the reasons the magazines & blogs are wrong:

  • A boyfriend is not a scalar.
  • Who says that a date equals a sample? I’ve been getting to know the human race my whole life. Every day I spend single, married, or it’s complicated — I am learning more information that can be used to set my aspiration level for a partner.
  • You can go back sometimes — either to rekindle a relationship that, in retrospect, was red-hot, or to revisit a crush you didn’t get far enough with to make things awkward.
  • There aren’t just 100 boys to look through. Let’s face it, there might as well be an infinite number of fish in the sea.
  • Um: a boyfriend is not a scalar. Love depends on you as well; if you could reduce your feelings to a scalar, you’d still want to model the relationship as a 2-equation dynamical system. Interplay; choices; reactions.
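
The "2-equation dynamical system" in the last bullet can be sketched in a few lines. Everything here is an illustrative assumption: the linear form, the coefficient names `a`, `b`, `c`, `d`, the starting feelings and the crude forward-Euler integration. It is a toy, not a model anyone in this post proposed.

```python
def simulate_relationship(a, b, c, d, you=1.0, partner=0.0, dt=0.001, steps=1000):
    """Forward-Euler integration of a two-equation linear system:

        d(you)/dt     = a * you + b * partner
        d(partner)/dt = c * you + d * partner

    Returns the list of (you, partner) feelings at each time step.
    """
    trajectory = [(you, partner)]
    for _ in range(steps):
        # Both feelings update simultaneously from the previous step's values.
        you, partner = (
            you + dt * (a * you + b * partner),
            partner + dt * (c * you + d * partner),
        )
        trajectory.append((you, partner))
    return trajectory


# Hypothetical couple: you warm to their affection, they cool in response to yours.
path = simulate_relationship(a=0.0, b=1.0, c=-1.0, d=0.0, steps=3142)
final_you, final_partner = path[-1]
print(final_you, final_partner)
```

With these made-up coefficients the feelings chase each other around in a loop instead of settling down, which no single scalar per person could capture.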


The original paper is called “Satisficing in Mate Search”. (I couldn’t find it online). Here is much, much more material on both data on dating and the science of thinking smarter by Dr. Todd.

You can also read Simple Heuristics That Make Us Smart (it’s on my to-read list — and it contains “Satisficing in Mate Search”), and if you look at Amazon’s similar books for the title you’ll come across all kinds of fascinating stuff: about Bayes’ rule, thinking from the gut, less is more, why it’s good to be stupid, willpower, and even an intro to game theory. (I haven’t read that particular treatment, but I do recommend reading just-a-little-bit of game theory as an awesome way to expand your imagination.)

You can get instant gratification with a free chapter of each, so these popular treatments are just as candy-like as Wired or Cosmo.


Just like with modern physics, this modern psychological science is super interesting. Way too interesting to justify wasting time on false and farcical narratives that totally miss the point.

To gild refined gold, to paint the lily, to throw perfume on a violet, … is wasteful and ridiculous excess. —King John