Posts tagged with size

population map of Scotland


(Source: Wikipedia)


  • solid — the category FinSet, a sack of wheat, a bag of marbles; atoms; axiom of choice; individuation. The urelemente or wheat-kernels are interchangeable although they’re technically distinct. Yet I can pick out just one and it has a mass.
  • liquid — continuity; probability mass; Lewis’ gunky line; Geoff Hellman; the pre-modern, “continuous” idea of water; Urs Schreiber; Yoshihiro Maruyama; John L Bell
  • gas — Lebesgue measure theory; sizing the image of a Wiener process, or other things in other “smooth” categories. Here I mean again the pre-atomic vision of gas: in some sense it has constant mass, but it might be so de-pressurised that there’s not much in some sub-chamber. The mass might even be so dispersed that not only can you not pick out atoms and expect them to have a size (each point of probability density has “zero” chance of happening), but you might need a “significant pocket” of gas before you get any volume at all. And unlike liquid, the gas’s volume might confuse you without some “pressure”-like concept “squeezing” the stuff to constrain the notion of volume.

The rank-nullity theorem in linear algebra says that dimensions either get

  • thrown in the trash
  • or show up

after the mapping.


By “the trash” I mean the origin—that black hole of linear algebra, the /dev/null, the ultimate crisscross paper shredder, the ashpile, the wormhole to void and cancelled oblivion; that country from whose bourn no traveller ever returns.

The way I think about rank-nullity is this. I start out with all my dimensions lined up—separated, independent, not touching each other, not mixing with each other. ||||||||||||| like columns in an Excel table. I can think of the dimensions as separable, countable entities like this whenever it’s possible to rejigger the basis to make the dimensions linearly independent.


I prefer to always think about the linear stuff in its nicely rejiggered state, and treat how to get there as a separate issue.

abstract vector space

So you’ve got your 172 row × 81 column matrix mapping 172→ separate dimensions into →81 dimensions. I’ll also forget about the fact that some of the resultant →81 dimensions might end up as linear combinations of the input dimensions. Just pretend that each input dimension is getting its own linear λ stretch. Now linear just means multiplication.

linear maps as multiplication
linear mappings -- notice they're ALL straight lines through the origin!

Linear stretches λ affect the entire dimension the same. They turn a list like [1 2 3 4 5] into [3 6 9 12 15] (λ=3). It couldn’t be into [10 20 30 −42856712 50] (λ=10, except not the same stretch=multiplication everywhere).
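This is easy to see numerically — a quick numpy sketch of the λ=3 stretch:

```python
import numpy as np

# A linear map on one dimension is just multiplication by a scalar λ:
# every entry gets the same stretch; no entry gets its own private rule.
v = np.array([1, 2, 3, 4, 5])
lam = 3
stretched = (lam * v).tolist()
print(stretched)  # → [3, 6, 9, 12, 15]
```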


Also remember – everything has to stay centred on 0. (That’s why you always know there will be a zero subspace.) This is linear, not affine. Things stay in place and basically just stretch (or rotate).

So if my entire 18th input dimension [… −2 −1 0 1 2 3 4 5 …] has to get transformed the same, to [… −2λ −λ 0 λ 2λ 3λ 4λ 5λ …], then linearity has simplified this large thing full of possibility and data, into something so simple I can basically treat it as a stick |.

If that’s the case—if I can’t put dimensions together but just have to λ stretch them or nothing, and if what happens to an element of the dimension happens to everybody in that dimension exactly equal—then of course I can’t stick all the 172→ input dimensions into the →81 dimension output space. 172−81 of them have to go in the trash. (effectively, λ=0 on those inputs)

So then the rank-nullity theorem, at least in the linear context, has turned the huge concept of dimension (try to picture 11-D space again would you mind?) into something as simple as counting to 11 |||||||||||.
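The counting can be sketched numerically, using the 172→81 example from above (the matrix here is a random stand-in, not any particular map):

```python
import numpy as np

# A random 81×172 matrix maps 172 input dimensions into at most 81
# output dimensions. Rank-nullity says rank + nullity = 172, so at
# least 172 − 81 = 91 input dimensions must land in the trash (kernel).
rng = np.random.default_rng(0)
A = rng.standard_normal((81, 172))
rank = int(np.linalg.matrix_rank(A))
nullity = A.shape[1] - rank
print(rank, nullity)  # a generic matrix of this shape: rank 81, nullity 91
```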

Oh! This one only took me 17 years or so to figure out. This was a “fact” I had committed to memory in school but never thought about why it was true.


From The Symplectization of Science by Mark Gotay and James Isenberg:


There are some connections to circles and homogeneous coordinates (v/‖v‖) but let’s leave those for another time.

Gotay & Isenberg’s exposition using the metric makes it clear that the
/‖v‖ part of the definition of cosine isn’t where the right-angle concept comes from. It comes from the v₁ w₁ + v₂ w₂.



So if the slope of my starting line is m, why is the slope of its perpendicular line −1/m?

First I could draw some examples.


I drew these on a grid, which is a good place to count out the “rise over run” and “negative run over rise” Δx & Δy distances, to make sure they really do look perpendicular.

The length and the (affine or “shift”) positioning of perpendicular line segments doesn’t matter to their perpendicularity. So to make life easier on myself I’ll centre everything on zero and make the segments equal length.


The metric formula is going to work if let’s say my first vector v is (+1,+1) (one to the right and one up) and my second vector goes one down and one to the right. Then the metric would do:

+1 • +1 (horizontal) + +1 • −1 (vertical)

which cancels.


What if it were a slope of 9.18723 or something I don’t want to think about inverting?

This is a case where it’s probably easier to think in terms of abstractions and deduce, rather than using imagination in the conventional way.

If I went over +a steps to the right and +b steps to the up (slope=b/a), then the metric would do:

a•? + b•¿

What is that missing? If I plugged in (?←−b, ¿←a) or (?←b, ¿←−a), the metric would definitely always cancel.

And in either of those cases, the slope of the question marks (second line) would be −a/b.

So the multiplicative inverse (flipping) corresponds to swapping terms in the metric so that the two parts anti-match. And the additive inverse (sign change) means the anti-matched pairs will “fold in” to zero each other (rather than amplifying=doubling one another).
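The swap-and-flip argument can be checked mechanically. A small sketch, with a deliberately ugly slope (the numbers are arbitrary):

```python
# A direction vector (a, b) has slope b/a. Swapping components and
# flipping one sign gives (-b, a): its slope is -a/b = -1/(b/a),
# and the metric's two terms anti-match and cancel.
def dot(v, w):
    return v[0] * w[0] + v[1] * w[1]

a, b = 3, 9.18723          # slope b/a, chosen to be unpleasant
v = (a, b)
w = (-b, a)                # the "anti-matched" pair
print(dot(v, w))           # the terms fold in to zero each other

slope_v = b / a
slope_w = a / (-b)
print(slope_w, -1 / slope_v)   # same number: the perpendicular slope
```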

OK, not every day. But whenever I shop for packaged retail goods like a coffee or in the grocers.

The Pythagorean theorem demonstrates that a circle whose radius is only slightly larger — √2 ≈ 1.414 times the radius — has twice as much area as the smaller circle.

Pythagorean Theorem

This is how I first really understood the Pythagorean Theorem. The outer circle looks just a little bit larger than the inner circle. But actually, its area is twice as large.

Kind of like the difference between medium and large soda cups, or how a tiny house still requires kind of a lot of timber, for how much air it encloses. If you buy a slightly wider pizza or cake it will serve proportionally more people; and if an inverse-square force (sound, radio power, light brightness) expands a little bit more it will lose a lot of its energy.

Ideas involved here:

  • scaling properties of squared quantities (gravitational force, skin, paint, loudness, brightness)
  • circumcircle & incircle
  • √2

This is also how I first really understood √2, now my favourite number.

(Since the diagonal of that square is √2 long relative to the "1" of the interior radius=leg of the right triangle. So the outer radius=hypotenuse=√2, and √2 squared is 2.)
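A one-line check of the doubling — nothing deep, just π r² with r = √2:

```python
import math

# Inner circle: radius 1 (the triangle's leg). Outer circle: radius √2
# (the hypotenuse). Since area goes as r², the outer area is exactly 2×.
inner = math.pi * 1**2
outer = math.pi * math.sqrt(2)**2
print(outer / inner)  # ≈ 2.0
```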


And some of us know from Volume Integrals in calculus class that a cylinder’s volume = circle area × height — and something like a sausage with a fat middle, or a cup with a wider mouth than base, can be thought of as a “stack” of circle areas; or, in the case of a tapered glass, as a “rectangle minus triangle” (when the circle is collapsed, so you’re just looking at the base-versus-height “camera straight ahead on the table” view).


The shell-or-washer-method volume integral lessons were, I think, supposed to teach about symbolic manipulation, but I got a sense of what shapes turn out to be big or small volume as well.

By integrating dheight-sized slices of circles that make up a larger 3-D shape, I can apply the squared-scaling lesson of the Pythagorean theorem to how real-life “cylinders” or “cylinder-like things” will compare in volume.

  • A regulation Ultimate Frisbee can hold 6 beers. (It’s flat/short, but really wide)
  • The “large” size may not look much bigger, but its volume can in fact be much bigger.
  • Starbucks keeps the base of their Large cups small, I think, to make the large size look noticeably larger (since we apparently perceive the height difference better than the circle difference). (Maybe also so they fit in cup holders in cars.)
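A sketch with made-up dimensions (the centimetres below are illustrative guesses, not measurements of any real frisbee or cup):

```python
import math

def cylinder_volume(radius, height):
    # volume = circle area × height
    return math.pi * radius**2 * height

# Hypothetical dimensions, in cm: a flat wide disc vs a tall narrow cup.
frisbee  = cylinder_volume(radius=13.5, height=3.0)   # short but very wide
tall_cup = cylinder_volume(radius=4.0, height=15.0)   # tall but narrow
print(frisbee, tall_cup)  # the squat wide shape wins: radius enters squared
```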

Cylinder = line-segment × disc

C = | × ●

The “product rule” from calculus works as well with the boundary operator ∂ as with the differentiation operator d.

∂C  =   ∂| × ●   +   | × ∂●
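Reading that boundary product rule numerically: ∂| × ● gives the two end caps, and | × ∂● gives the lateral tube — together they recover the familiar cylinder surface-area formula. A sketch:

```python
import math

# ∂C = ∂| × ●  +  | × ∂●
# boundary of the segment (2 endpoints) × disc  =  two end caps
# segment (length h) × boundary of the disc (circle)  =  lateral tube
r, h = 2.0, 5.0
end_caps = 2 * (math.pi * r**2)     # ∂| × ● : two copies of the disc area
tube     = h * (2 * math.pi * r)    # | × ∂● : length × circumference
total    = end_caps + tube
print(total)  # matches the textbook surface area 2πr(r + h)
```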




Oops. Typo. Sorry, I did this really late at night! cos and sin need to be swapped.










Oops. Another typo. Wrong formula for circumference.



  • It’s easier for me to grok statistical significance (p's and t's) from a scatterplot than magnitude (β's).
  • Even though magnitude can be the most important thing, it’s "hidden" off to the left.

    Note to self: look off to the left more, and for longer.
  • But I’m set up to understand the correlativeness in a sub_i, sub_j sense — which particular countries fit the pattern as well as how closely.


  • Minute __:__ Do each of the dimensions of social problems correlate individually, or is this only a mass effect of the combination?

If it’s true that raising marginal tax rates on the rich lowers crime rates without paying for any anti-crime programmes, that’s almost a free lunch.

UPDATE: Oh, hey, six months after I watched this and 3 days after I put up the story, I see Harvard Business Review has a story corroborating the same effect, instead pointing out how economists don’t look past the p's and t's on a regression table. I feel like I “mentally cross out” any lines with a low t value and then wonder about the F value on a regression with the “worthless” line removed.

In 20th-century abstract mathematics, one builds up ideas and properties—not assuming anything except what one is told. You think 2+3=5? Well in my space that I just made up, e₂⊕e₃ = e₁, and 5 doesn’t even exist!

Concepts are added in incrementally, like

  • ‖A‖ means the “size” of A. size exists
  • ‖A − B‖ means the “distance” between A and B. plus exists & negative exists; or, comparison exists
  • (If zero exists, we could say the size of A = the distance between A and 0: ‖ A − 0 ‖ = ‖A‖.)
  • ⟨ A | B ⟩ means A “times” B. times exists
  • arccos( ⟨A|B⟩ ‖A‖⁻¹ ‖B‖⁻¹ ) means the “angle” between A and B. inverses exist & times exists, so angle exists
  • topology adds in neighbourhood relationships—not necessarily in a way that you can infer size or distance (∵¬□∃ metric), but so that you could talk about paths or connectedness
  • order or ranking — is it a total order? a transitive order? a partial order? a lattice? Order is subordinate to size, to distance, and to linearity.
  • dimensionality — a set containing { ‘a’, ‘b’, the moon, 12, the vector (0 1 1 0 1)∈ℝ⁵, my cat’s hairball } doesn’t inherently have dimensions to it — so structured sets like ℝ² are supposed to explain how their universe breaks down
  • linearity — possibly the scariest word in mathematics class? I’ve tried and will continue to try to explain it elsewhere, but “linear” is an extremely-restrictive-but-not-that-restrictive-because-so-many-things-are-linear-once-you-allow-calculus-and-maps-across-domains-for-example-fourier-transforms property. Linearity presumes monotonicity (order preservation), size, and a kind of “constancy” that tells you: if 2 went to 4, then 13 is going to go to 26. Or “the 26 of the present land”.
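The incremental build-up above can be sketched in code, taking ℝ² as the concrete universe (this is just one model of the axioms, not the general definitions):

```python
import math

def inner(a, b):                  # "times exists"
    return a[0]*b[0] + a[1]*b[1]

def norm(a):                      # "size exists": ‖A‖
    return math.sqrt(inner(a, a))

def dist(a, b):                   # minus exists ⇒ "distance": ‖A − B‖
    return norm((a[0] - b[0], a[1] - b[1]))

def angle(a, b):                  # inverses + times + arccos ⇒ "angle"
    return math.acos(inner(a, b) / (norm(a) * norm(b)))

print(dist((0, 0), (3, 4)))       # size of (3,4) = distance from zero
print(angle((1, 0), (0, 1)))      # perpendicular axes: π/2
```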

Someone GPL’ed this nice (but not comprehensive) chart of two paths through the theory space—starting with a pair (thing, operation) [“magma”—sweet name, right?] and gradually adding more and more axioms until you get to a group.


Mathematical words obtain everyday meaning—sometimes unexpected meaning—in applications. For example

  • "angle" might mean "correlation" — the angle between two pulse-trains would be their correlation; and in recommendation engines the matrix “cosine distance” is a basic measure of similarity
  • "multiplication" — well what if you want to multiply two functions together? You could convolve them. Convolution doesn’t seem very much at all the same action as 3×8 = three groups of eight. Neither do Photoshop blends seem like multiplication, but some of them are.
  • "size" — well maybe I mean "how well the business did" on a slew of different metrics — in which case, are there 20 different conceptions of "size"? I guess so.
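For instance, the recommendation-engine notion of angle-as-correlation — cosine similarity — with made-up rating vectors (the names and numbers are hypothetical):

```python
import math

# cos(angle between v and w) = ⟨v|w⟩ / (‖v‖ ‖w‖)
def cosine_similarity(v, w):
    dot = sum(a * b for a, b in zip(v, w))
    return dot / (math.sqrt(sum(a*a for a in v)) * math.sqrt(sum(b*b for b in w)))

alice = [5, 4, 1, 0, 2]   # hypothetical ratings of five items
bob   = [4, 5, 0, 1, 2]
sim = cosine_similarity(alice, bob)
print(sim)   # close to 1 ⇒ small angle ⇒ similar taste
```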

Could you multiply two trees together? Could you define the angle between two natural numbers? The angle between two business models? Sure. If you know what you’re doing and why, you might even come up with a conclusion that makes sense. It all depends on (a) your ingenuity, (b) domain knowledge of the real-life situation, and (c) mathematical vocabulary.

Sometimes there is more than one interpretation that works with a given set. For example, {0,1} × {0,1} → {0,1} might be joined to operations that define “logical AND” and “logical OR”, or it might be interpreted just as on/off. Or it might be interpreted as the story of unrequited love.
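A minimal sketch of two interpretations sharing one signature:

```python
# Two different operations with the same signature {0,1} × {0,1} → {0,1}.
# The set alone doesn't pick one: the interpretation is extra structure.
AND = {(0, 0): 0, (0, 1): 0, (1, 0): 0, (1, 1): 1}
OR  = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 1}
print(AND[(1, 0)], OR[(1, 0)])  # same inputs, different verdicts
```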


All of that preface is meant to dislodge any notions you might have that ℝ² is somehow a “default” or “standard” paradigm. Sometimes number×number is an appropriate metaphor and sometimes not.

For example in the movie Rogue Trader, Nick Leeson’s boss is portrayed talking about “synergy” and “the information curve”. “Nick has positioned himself right there on the information curve!” It’s a parody and nobody seems to know quite what “the information curve” is (what’s on the axes? why is it curved?) but because Nick appears to be earning 70% of Barings’ profits, nobody questions the information curve.

Your typical crappy airport “business advice” books—Thomas Friedman kind of crap—will throw around 2-D charts that make no sense as well. Please leave some pics in the comments if you know what I’m talking about and examples come to mind. Here are a few dubious 2-D metaphors:



The “political compass” labels reduce the complexity of the world in particular ways that suit the rhetorical aims of these libertarian authors. For example projecting totalitarianism and populism into the same neighbourhood when one could just as well project them onto opposite ends of some other spectrum.

Here are some dubious scales—where either order, linearity, or 1-dimensionality is suspect.

This chart additionally uses way too many significant figures. How is it you gauge "total novelty in the universe" again?


(Remember: {“heroic”, “pragmatic”, “circumspect”, “brazen”} also comprises or belongs to a scale—in the ggplot sense of the word as well as other senses.)


Wow! You mean that losses are bad and earnings are good? That is some insightful business insight.



Crappy reductions needn’t be 2-D. The MBTI is a crappy reduction of personality in 4-D. And here are some in 1-D and other-D:


I like how step 5 leads to step 2. This should be a list rather than a flow.

Bloom's taxonomy is unjustified, both the projections and the order

Order, 1-dimensionality questionable.

Again, a list. This one has a heading. Apparently headings deserve 4 connecting wires whereas list items only deserve 3?

This is just a list of things. There is no “center” or “flow” or “order” or “cycle” relationship. Maybe “give them” and “get them” could have used a two-way arrow between them.


8-D and I just do not understand what these axis labels mean.

I actually spent hours finding the worst graphics evar. Not gonna tell you my google keywords though.


And, not to be critical all the time, here’s a 2-D metaphor that does work:



Stagepiece one: undermine the conceit that ℝ² is a default. Stagepiece two: cruddy graphics from various domains that force a metaphor that doesn’t really work. And now, the main act.

Today, I want to take aim at a highly suspect 2-D chart from the world of psychology:  the affect × intensity description of feelings.




Right away when I look at this, it seems like an overly limiting and not internally valid picture of emotional range. Like so many taxonomies, it gets deeply under my skin in a way that I can’t explain, except to shout: Bad theory! Bad theory!  I mean — how does it make sense to say

  1. that each of these states is a point, as opposed to a spray or splotch or something else
  2. that this precise “point” is the same for all individuals
  3. "delighted" is slightly to the left of "happy" but happy is directly above "pleased"
  4. that “sleepy” is to the right of “tired” instead of the other way around
  5. that tired and sleepy are the same distance from each other as “pleased” and “glad”
  6. WTF is “droopy”? It sounds like a word to be applied to a plant, not a person. I also don’t think it qualifies as an emotion. "Droopy" sounds like a word Good Housekeeping would use to shame a 1950’s American married woman for not being perky! happy! sexy! listening! rubbing his feet! when her husband returns home from work.
  7. Are “sleepy” and “tense” actually moods or emotions? They sound like physical states.
  8. All of these emotions are near the perimeter, but some are closer to the origin than others
  9. sad minus gloomy = satisfied minus calm
All of those assumptions are implicit in the drawings.

Remember what I was outlining at first. In abstract mathematics and in deciding the shape of a theory, we shouldn’t assume anything that doesn’t have to be assumed to explain the results.

I could attack the valence-intensity model in at least two ways.

  1. First would be to exclaim “But you didn’t justify any of that stuff! Linearity? Dimensionality? Order? You skipped it all! Where’s the justification?”
  2. Second, perhaps a little stronger than merely asking for backup, would be to point out flaws. For example if I could find a counterexample showing that emotional states don’t have magnitude, can’t be added, don’t break down on dimensions, or aren’t linear across dimensions.
The easiest critique of type [2] I could think of is to question the existence of a “zero-point” emotion. It might be possible to have low-or-zero activation of an emotion on the intensity axis, but on the valence axis? Could I have high intensity of zero valence? What about high intensity in the negative direction at zero valence? It doesn’t make sense.

I came up with a list—several years ago—of different feelings which all could contend for “emotional zero”.

  • neither happy nor sad
  • neutral
  • feel blank
  • both happy and sad (bittersweet)
  • not sure
  • ambivalent
  • "I feel nothing"
  • kinda sorta
  • middling

That’s just feelings we have the words for. There are lots of nameless emotions (or emotional superpositions) that could contend for the neutral canvas — the origin from which all other emotions are measured.

The fact that so many clearly distinct feelings all contend for the “origin” made me think there is, in fact, no origin. But making the space affine (removing zero) doesn’t fix the problems I had begun to notice with the circumplex view of the emotional spectrum. I think we just have to think of the range of emotions as a totally different kind of space. I don’t know its topology; I do believe there should be some “activation level” (like a scalar) at least sometimes; I do believe that superpositions are possible.

[G]eometry and number[s]…are unified by the concept of a coordinate system, which allows one to convert geometric objects to numeric ones or vice versa. …

[O]ne can view the length |AB| of a line segment AB not as a number (which requires one to select a unit of length), but more abstractly as the equivalence class of all line segments that are congruent to AB.

With this perspective, |AB| no longer lies in the standard semigroup ℝ⁺, but in a more abstract semigroup ℒ (the space of line segments quotiented by congruence), with addition now defined geometrically (by concatenation of intervals) rather than numerically.

A unit of length can now be viewed as just one of many different isomorphisms Φ: ℒ → ℝ⁺ between ℒ and ℝ⁺, but one can abandon … units and just work with ℒ directly. Many statements in Euclidean geometry … can be phrased in this manner.

(Indeed, this is basically how the ancient Greeks…viewed geometry, though of course without the assistance of such modern terminology as “semigroup” or “bilinear”.)
Terence Tao
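A toy model of Tao’s semigroup of lengths — the class below is my own illustrative construction, not anything from the quote. Internally it stores a representative number, but the only operations it exposes are the unit-free ones: concatenation and ratio.

```python
# Lengths as their own semigroup: "addition" is concatenation of
# segments, and a unit of length is just one isomorphism ℒ → ℝ⁺
# among many. Ratios don't depend on which unit the representatives used.
class Length:
    def __init__(self, in_some_unit):
        self._x = in_some_unit       # internal representative only

    def concat(self, other):         # geometric addition
        return Length(self._x + other._x)

    def ratio(self, other):          # unit-free comparison
        return self._x / other._x

a, b = Length(2.0), Length(3.0)
print(a.concat(b).ratio(a))   # (a + b) is 2.5 times a, in any unit
```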


For those not in the know, here’s what mathematicians mean by the word “measurable”:

  1. The problem of measure is to assign a real size ≥ 0 to a set. (The points not necessarily contiguous.) In other words, to answer the question:
    How big is that?
  2. Why is this hard? Well just think about the problem of sizing up a contiguous ℝ subinterval between 0 and 1.
    • It’s obvious that [.4, .6] is .2 long and that
    • [0, .8] has a length of .8.
    • I don’t know what the length of [√2/3, √π/3] is but … it should be easy enough to figure out.
    • But real numbers can go on forever: .2816209287162381682365...1828361...1984...77280278254....
    • Most of them (the transcendentals) we don’t even have words or notation for.
      most of the numbers are black = transcendental
    • So there are a potentially infinite number of digits in each of these real numbers — which is essentially why the real numbers are so f#cked up — and therefore ∃ an infinitely infinite number of numbers just between 0% and 100%.

    Yeah, I said infinitely infinite, and I meant that. More real numbers exist in-between .999999999999999999999999 and 1 than there are atoms in the universe. There are more real numbers just in that teensy sub-interval than there are integers (and there are infinitely many integers).

    In other words, if you filled a set with all of the things between .99999999999999999999 and 1, there would be infinity things inside. And not a nice, tame infinity either. This infinity is an infinity that just snorted a football helmet filled with coke, punched a stripper, and is now running around in the streets wearing her golden sparkly thong and brandishing a chainsaw:
    I think the analogy of ℵ₁ to Patrick Bateman is a solid and indisputable one.

    Talking still of that particular infinity: in a set-theoretic continuum sense, ∃ infinite number of points between Barcelona and Vladivostok, but also an infinite number of points between my toe and my nose. Well, now the simple and obvious has become not very clear at all!
    (R’s eurodist dataset tabulates road distances between 21 European cities, Athens to Vienna; multi-dimensional scaling with cmdscale recovers a rough map of Europe from the distances alone:)

        require(stats)
        loc <- cmdscale(eurodist)
        rx <- range(x <- loc[,1])
        ry <- range(y <- -loc[,2])
        plot(x, y, type="n", asp=1, xlab="", ylab="")
        abline(h = pretty(rx, 10), v = pretty(ry, 10), col = "light gray")
        text(x, y, labels(eurodist), cex=0.8)
    So it’s a problem of infinities, a problem of sets, and a problem of the continuum being such an infernal taskmaster that it took until the 20th century for mathematicians to whip-crack the real numbers into shape.
  3. If you can define “size” on the [0,1] interval, you can define it on the [−535,19^19] interval as well, by extension.

    If you can’t even define “size” on the [0,1] interval — how do you think you’re going to define it on all of ℝ? Punk.
  4. A reasonable definition of “size” (measure) should work for non-contiguous subsets of ℝ, such as “just the rational numbers” or “all solutions to cos² x = 0” (they’re not next to each other), as well.

    Just another problem to add to the heap.
  5. Nevertheless, the monstrosity has more-or-less been tamed. Epsilons, deltas, open sets, Dedekind cuts, Cauchy sequences, well-orderings, and metric spaces had to be invented in order to bazooka the beast into submission, but mostly-satisfactory answers have now been obtained.

    It just takes a sequence of 4-5 university-level maths classes to get to those mostly-satisfactory answers.
    One is reminded of the hypermathematicians from The Hitchhiker’s Guide to the Galaxy who time-warp themselves through several lives of study before they begin their real work.
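For a taste of how the tamed machinery answers item 4’s challenge: the rationals in [0,1] are countable, so you can cover the n-th one with an interval of length ε/2ⁿ⁺¹ and the whole cover still has total length at most ε — which is why their measure is 0. A sketch in exact arithmetic:

```python
from fractions import Fraction

# Total length of the first n_terms covering intervals: ε/2 + ε/4 + …
# The geometric series never exceeds ε, no matter how small ε is.
def cover_length(epsilon, n_terms):
    return sum(Fraction(epsilon) / 2**(n + 1) for n in range(n_terms))

eps = Fraction(1, 1000)
print(cover_length(eps, 50))   # strictly less than ε = 1/1000
```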


For a readable summary of the reasoning & results of Henri Lebesgue's measure theory, I recommend this 4-page PDF by G.H. Meisters. (NB: His weird ∁ symbol means complement.)

That doesn’t cover the measurement of probability spaces, functional spaces, or even more abstract spaces. But I don’t have an equally great reference for those.

Oh, I forgot to say: why does anyone care about measurability? Measure theory is just a highly technical prerequisite to true understanding of a lot of cool subjects — like complexity, signal processing, functional analysis, Wiener processes, dynamical systems, Sobolev spaces, and other interesting, relevant stuff.

It’s hard to do very much mathematics with those sorts of things if you can’t even say how big they are.