Posts tagged with linear

(NB: Actually a weighted sum. But if you just normalise it (divide by the overall total) you’ll get a weighted average.)

  • The Economist's Which MBA? website scores MBA programmes on:

    faculty quality, student quality, student diversity, percentage who found jobs through the careers service, student assessment of career service, percentage in work three months after graduation, increase in salary, potential to network, internationalism of alumni, student rating of alumni effectiveness, and a few other metrics

— and lets you adjust how important each of these factors is to you, determining your own ranking of MBA programmes (their data, your weighting),
    rather than pretending there’s a universal or objective weighting of importance of factors (as the US News & World Report ranking of US undergrad schools does).
  • My friend made a spreadsheet of all the factors that determine which city she wants to move to.
    She scored each city on various factors, then assigned each of those factors an importance, timesed and added, and got a total score for each city. (I don’t think the result is meaningful, because I don’t think the space is linear. But the exercise itself was fun and gave her a reason to do the research.)
  • In my car radio I have knobs for “treble” and “bass”, which weight particular frequency bands (functional forms) more heavily than others.
    [image: equalizer knobs for bass, middle, and treble]
  • When you do a Gaussian Blur in Photoshop
    [images: a Gaussian kernel smooth in 2D vs. a box blur]
    or smooth a time series against a Gaussian kernel, you’re (basically) covectoring against a Normal curve. In other words you weight the neighbours with heights of 2^−distance² = 1/2, 1/16, 1/512, …. (There’s an R sketch of this just after the list.)
    (I actually think of the Gaussian now as an optimal smoother, primarily, instead of as Bell Curve religion. But that’s a story for another time.)
  • The standard “regression beta”—the OLS least-squares minimisation problem—is to adjust a covector—the tilts on the various data columns = properties you’ve observed and quantified (plus a column of ones)—to fit a straight line up against whatever you’ve chosen as y.
    [image: linear mappings -- notice they're ALL straight lines through the origin!]
  • An artist in a coffee shop once told me he had found some great numerical parameters for the particular visual (like a Winamp style one) he was creating. He was clearly thinking about the parameter space as such, but the maximisation procedure he was following was probably not a mechanical one.
  • If polynomials are like decimal sequences—except that instead of being limited to a largest digit of 9 in, say, the hundreds place, the constant in the x·x = x² place can be positive, negative, fractional, whatever—then the constants you line up — whether they have some well-known name or pattern like combinatorial sequence, Sheffer sequence, Schur polynomial, Taylor series, or have no name — are the covector. (This overstretches my simplification that covectors are averages. Here they really need to be sums.)
  • A client wanted customers to be able to browse his wares more easily in his online store. This boils down to bubbling up to the top what they want to see and sorting down what they don’t want to see. One idea he had was to give the customer a number of “sliders” and let them choose which aspects were important to them. So instead of sorting first by price, then sorting alphabetically within that sort, you would catalogue various properties of the stuff in your ecommerce storefront, multiply those by fixed numbers chosen by the customer, add those subscores together to get a total score, and then sort on that total score. That way the list can be mixed. (The customer wants to penalise high prices and non-red dresses, but doesn’t want to see only $2 purse accessories that somehow got parsed by the computer as “dress”.)
    Another way to say this is he wanted to let customers define their own “scoring metric” and sort results based on that.
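(Before moving on: the Gaussian bullet above, as the promised R sketch. The sine-plus-noise series is made up for illustration.)

    # A noisy series to smooth (toy data).
    y <- sin(seq(0, 4*pi, length.out = 200)) + rnorm(200, sd = 0.3)

    d <- -5:5                # distances to neighbours in the smoothing window
    w <- 2^(-d^2)            # heights 1/2, 1/16, 1/512, ... either side of the centre
    w <- w / sum(w)          # normalise, so the weighted sum is a weighted average

    # The covector w, applied to every window of y: a Gaussian-ish blur in 1-D.
    smoothed <- stats::filter(y, w, sides = 2)

(And the regression bullet is the same move in one line: coef(lm(y ~ x1 + x2)), with your own column names, returns the fitted covector; the column of ones shows up as the intercept.)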

All of these are covectors.

In order to not get confused about the meaning of “parameter” versus “variable” — let me just use the concrete examples above. The weighting scheme on the MBA programmes is the covector and the observed properties of each MBA programme are the vector. Multiply the vector for a particular school by the covector (weighting scheme) you’ve chosen, and you get “your score” (a single number). Do this for each school and you can then sort the results to get “your ranking”.

If you change the weighting scheme, you change the covector, i.e. you change the parameters. This is “moving in the dual space” and it outputs a different “your ranking”.
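(Here is the MBA example as a short R sketch; the schools and numbers are invented, only the mechanics matter.)

    # Each row is a school's vector of observed properties (made-up data).
    schools <- rbind(A = c(9, 8, 6),
                     B = c(7, 9, 9),
                     C = c(8, 7, 8))
    colnames(schools) <- c("faculty", "salary.uplift", "networking")

    # The covector: your chosen weighting scheme.
    weights <- c(faculty = .5, salary.uplift = .3, networking = .2)

    scores <- drop(schools %*% weights)   # one number per school: "your score"
    sort(scores, decreasing = TRUE)       # "your ranking"

    # Moving in the dual space: change the covector, get a different ranking.
    sort(drop(schools %*% c(.1, .2, .7)), decreasing = TRUE)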

So the next time someone says to you “Canonically identify a vector space with its dual via g ↦ ∫fg”, that’s basically what they mean.

(By the way, this duality is also used in the reproducing kernel Hilbert space, a key part of machine learning.)




Today’s glossary provoked by Michael Spivak.

(tags: tensors, Spivak, diff geom vol 1, ch 4)

Below I’ve drawn two vector spaces connected by a linear homomorphism ƒ, plus a linear functional λ going to ℝ. After seeing these pictures I hope it’s easier to understand how the pullback ƒ* works.

Here’s one of the main pictures for flavour:

[image: dual space and a linear functional]

Also, you can probably just skim the pictures and get the point (especially the Number Field one and the final one). That’s the fastest way to read this post.



Start with an abstract 𝓥ector space.

[image: an abstract vector space]

I’ll do some violence because I’ll need coordinates in a minute.

[image: inventing a linear functional]
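(The pictures carry this post, but here is the pullback as a toy computation in R; my own made-up matrices, not Spivak’s.)

    # f: R^2 -> R^3 as a matrix; lambda: R^3 -> R as a row covector.
    f      <- matrix(c(1, 0,
                       2, 1,
                       0, 3), nrow = 3, byrow = TRUE)
    lambda <- matrix(c(1, -1, 2), nrow = 1)

    pullback <- lambda %*% f   # f*(lambda) = lambda after f, a covector on R^2

    v <- c(4, 5)
    lambda %*% (f %*% v)       # lambda(f(v)) ... these two numbers agree,
    pullback %*% v             # (f* lambda)(v) ... which is the point of f*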





If your vector space is a shopping cart full of groceries, then the checkout clerk is a linear operator on that space.
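(In R, with made-up groceries: the clerk pairs the price covector with your cart vector and hands back one number.)

    cart   <- c(eggs = 2, milk = 1, bread = 3)         # vector: what's in the cart
    prices <- c(eggs = 3.0, milk = 2.5, bread = 2.0)   # covector: price per unit
    sum(prices * cart)                                 # the clerk's output: the bill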




If the astronomical observations and other quantities on which the computation of orbits were absolutely correct, the elements also, whether deduced from three or four observations, would be strictly accurate (so far indeed as the motion is supposed to take place exactly according to the laws of Kepler), and, therefore, if other observations were used, they might be confirmed but not corrected.

But since all our measurements and observations are nothing more than approximations to the truth, the same must be true of all calculations resting upon them, and the highest aim of all computations made concerning concrete phenomena must be to approximate, as nearly as practicable, to the truth. But this can be accomplished in no other way than by a suitable combination of more observations than the number absolutely requisite for the determination of the unknown quantities. This problem can only be properly understood when an approximate knowledge of the orbit has been already attained, which is afterwards to be corrected so as to satisfy all the observations in the most accurate manner possible.

Johann Carl Friedrich Gauß, Theoria Motus Corporum Cœlestium in Sectionibus Conicis Solem Ambientium, 1809

(translation by C.H. Davis 1963)

(Source: cs.unc.edu)




The rank-nullity theorem in linear algebra says that dimensions either get

  • thrown in the trash
  • or show up

after the mapping.


By “the trash” I mean the origin—that black hole of linear algebra, the /dev/null, the ultimate crisscross paper shredder, the ashpile, the wormhole to void and cancelled oblivion; that country from whose bourn no traveller ever returns.

The way I think about rank-nullity is this. I start out with all my dimensions lined up—separated, independent, not touching each other, not mixing with each other. ||||||||||||| like columns in an Excel table. I can think of the dimensions as separable, countable entities like this whenever it’s possible to rejigger the basis to make the dimensions linearly independent.


I prefer to always think about the linear stuff in its nicely rejiggered state, and to treat how to get it that way as a separate issue.

[image: an abstract vector space]

So you’ve got your 172 row × 81 column matrix mapping 172→ separate dimensions into →81 dimensions. I’ll also forget about the fact that some of the resultant →81 dimensions might end up as linear combinations of the input dimensions. Just pretend that each input dimension is getting its own linear λ stretch. Now linear just means multiplication.

[image: linear maps as multiplication -- notice they're ALL straight lines through the origin!]

Linear stretches λ affect the entire dimension the same way. They turn a list like [1 2 3 4 5] into [3 6 9 12 15] (λ=3). They couldn’t turn it into [10 20 30 −42856712 50] (λ=10 except the stretch=multiplication isn’t the same everywhere).
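(In R terms: 3 * c(1, 2, 3, 4, 5) returns 3 6 9 12 15 — one λ, applied to everybody in the dimension at once.)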


Also remember – everything has to stay centred on 0. (That’s why you always know there will be a zero subspace.) This is linear, not affine. Things stay in place and basically just stretch (or rotate).

So if my entire 18th input dimension [… −2 −1 0 1 2 3 4 5 …] has to get transformed the same, to [… −2λ −λ 0 λ 2λ 3λ 4λ 5λ …], then linearity has simplified this large thing full of possibility and data, into something so simple I can basically treat it as a stick |.

If that’s the case—if I can’t put dimensions together but just have to λ stretch them or nothing, and if what happens to an element of the dimension happens to everybody in that dimension exactly equal—then of course I can’t stick all the 172→ input dimensions into the →81 dimension output space. 172−81 of them have to go in the trash. (effectively, λ=0 on those inputs)

So then the rank-nullity theorem, at least in the linear context, has turned the huge concept of dimension (try to picture 11-D space again, would you?) into something as simple as counting to 11: |||||||||||.
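(A counting check in R, on a small made-up matrix rather than the 172×81 one:)

    # Three input dimensions (columns), but the third is the sum of the first two.
    A <- matrix(c(1, 2, 3,
                  2, 4, 6,
                  0, 1, 1), nrow = 3, byrow = TRUE)

    r <- qr(A)$rank                                      # dimensions that show up: 2
    c(rank = r, nullity = ncol(A) - r, input = ncol(A))  # 2 + 1 = 3
    # rank + nullity = input dimensions: each either shows up or goes in the trash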





[image: hybrid Thatcher/Blair face]

by and © Aude Oliva & Philippe G. Schyns

A hybrid face presenting Margaret Thatcher (in low spatial frequency) and Tony Blair (in high spatial frequency)

[I]f you … defocus while looking at the pictures, Margaret Thatcher should substitute for Tony Blair (if this … does not work, step back … until your percepts change).

(Source: cvcl.mit.edu)






[Karol] Borsuk’s geometric shape theory works well because … any compact metric space can be embedded into the “Hilbert cube” [0,1] × [0,½] × [0,⅓] × [0,¼] × [0,⅕] × [0,⅙] ×  …

A compact metric space is thus an intersection of polyhedral subspaces of n-dimensional cubes …

We relate a category of models A to a category of more realistic objects B which the models approximate. For example polyhedra can approximate smooth shapes in the infinite limit…. In Borsuk’s geometric shape theory, A is the homotopy category of finite polyhedra, and B is the homotopy category of compact metric spaces.

—Jean-Marc Cordier and Timothy Porter, Shape Theory

(I rearranged their words liberally but the substance is theirs.)

in R do: prod( 1 / (1:1e5) ) to see the volume of Hilbert’s cube → 0. (It underflows to exactly 0 long before the 10⁵th factor.)








The Nervous System

  • dissections of live criminals’ brains
  • animal spirits (psychic)
  • neuron νεῦρον is Greek for cord
  • Galen thought the body was networked together by three systems: arteries, veins, and nerves
  • Descartes as the source of the theory of reflexive responses—fire stings hand, νευρώνες tugging on the brain, fluids in the brain tug on some other νευρώνες, and the hand pulls away—automatically.
  • the analogy of a clock (…today we’re much smarter. We think of brains as being like computers, which is definitely not an outgrowth of today’s hot technology!)
  • cogito ergo sum — sensation is what’s distinctive about our brains. How could a clock feel something? (Today again, we’re much smarter: we think it’s the ability to reflect on thought—anything with at least one “meta” term in it must be intelligent.)
  • Muscles fire like bombs exploding (a chemical reaction of two mutually combustible elements)—and the fellow who came up with this theory had been spending a lot of time in the battlefield where bombs were the new technology.
  • autonomic, peripheral, central nervous systems
  • Willis, Harvey, Newton
  • What makes nerves transmit information so fast?
  • Galvani’s theory that electricity is only an organic phenomenon. (Hucksters arise!)
  • The theory of the synapse—it’s the connections that matter.
  • The discovery that nerves aren’t continuous connected strings, but rather made up of billions of individual parts.
  • Activation thresholds—a classic and simple non-linear function!

(Source: BBC)




From C. H. Edwards, Jr., come some examples of linear (vector) spaces less familiar than span{(1,0,0,0,0), (0,1,0,0,0), ..., (0,0,0,0,1)}.

  • The infinite set of functions {1, cos θ, sin θ, ..., cos nθ, sin nθ, ...} is orthogonal in 𝓒[−π,+π]. This is the basis for Fourier series.
  • Let 𝓟 denote the vector space of polynomials, with the inner product of p, q ∈ 𝓟 given by ⟨p,q⟩ = ∫₋₁¹ p(x)·q(x) dx. Applying Gram–Schmidt orthogonalisation gets us within constant factors of the Legendre polynomials 1, x, x²−⅓, x³−⅗x, x⁴−(6/7)x²+3/35, .... (There’s a numerical sketch after this list.)
  • (And, from M. A. Al-Gwaiz:) The set of all infinitely-smooth complex-valued functions that vanish outside some finite interval (i.e., have compact support). These test functions lead to generalised functions (distributions) and to imprecision on purpose.
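(And the Legendre item above, as the promised crude numerical sketch: Gram–Schmidt on 1, x, x², x³ against a Riemann-sum stand-in for the integral ∫₋₁¹ p(x)·q(x) dx.)

    x  <- seq(-1, 1, length.out = 2001)
    ip <- function(p, q) sum(p * q) * (x[2] - x[1])   # rough numerical inner product

    basis <- list(x^0, x^1, x^2, x^3)
    orth  <- list()
    for (p in basis) {
      for (e in orth) p <- p - ip(p, e) / ip(e, e) * e   # strip off earlier directions
      orth[[length(orth) + 1]] <- p
    }

    max(abs(orth[[3]] - (x^2 - 1/3)))   # tiny: we recovered x^2 - 1/3 up to grid error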