I can’t quite decide if LinkedIn is kind of creepy or unacceptably creepy. Perhaps the best argument-silencing reason that I host my own email is that I am certain that I have never sold you out to a social network through my email contacts. You’re welcome. But I am only half of the conversation. Person A and person B may have both sent me emails at some point. Or person A may have sent an email to me and also to person B. Now LinkedIn knows A and B might know each other and/or me.

While I do give companies I deal with unique email addresses so that I can track their activities, I do not do this for friends and other humans. To most people, what LinkedIn knows about you must just seem like the result of modern life. Because I have been scrupulously careful to disclose as little as possible through non-explicit mechanisms, the stuff LinkedIn comes up with for me can be pretty disturbing.

The hard push of this universal replicator to turn raw material into more opportunities to recursively do the same led to some interesting data. My "People You May Know" suggestions seemed mildly interesting, but as I scrolled down I was impressed at how many people they could stitch together with a plausible connection to me from other people’s email contacts. Eventually, exactly 900 potential contacts were recommended.

Remember, these are not my LinkedIn contacts, but people near me in the network graph of email correspondents and LinkedIn connections. I thought this data set was large enough to be significant and insulated from me enough to not contain the biases of my actual contacts (though other biases clearly exist).
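
To make "near me in the network graph" concrete, here is a minimal sketch of friend-of-friend counting. How LinkedIn actually builds its suggestions is not public knowledge here, so this is only an illustration of the idea, not their method. The dictionary-of-sets graph, the function name `shared_contact_counts`, and the single-letter names are all made up for the example.

```python
from collections import Counter

def shared_contact_counts(graph, me):
    """For each person two hops away who is not already a contact,
    count how many of my contacts they are also connected to."""
    mine = graph.get(me, set())
    counts = Counter()
    for contact in mine:
        for candidate in graph.get(contact, set()):
            # Skip myself and anyone I am already connected to.
            if candidate != me and candidate not in mine:
                counts[candidate] += 1
    return counts

# Hypothetical toy graph: each key maps to the set of people it is linked to.
toy_graph = {
    "me": {"a", "b"},
    "a": {"me", "c", "d"},
    "b": {"me", "c"},
    "c": {"a", "b"},
    "d": {"a"},
}

suggestions = shared_contact_counts(toy_graph, "me")
```

In this toy graph, `c` is reachable through both `a` and `b`, so it shares two contacts with `me`, while `d` shares one. Ranking candidates by that count is one plausible way a long list of suggestions could be ordered.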

Lightly scraping these contacts, I was able to extract some interesting observations.

  • 832 (92.4%) shared a contact with me

  • 415 (46.1%) shared 3 or more contacts with me

  • 185 (20.6%) work at some kind of university or institute

  • 160 (17.8%) are engineers of some kind (including network, software, etc.)

  • 112 (12.4%) are professors

  • 110 (12.2%) are software engineers or programmers

  • 82 (9.1%) are scientists

  • 66 (7.3%) work at UCSD (my employer and largest in the region)

  • 45 (5.0%) work at Google

  • 30 (3.3%) work at Microsoft

  • 13 (1.4%) work at Microsoft Research
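
Each percentage above is simply a count divided by the 900 suggestions. A rough sketch of that kind of tally follows, assuming each scraped suggestion is reduced to a one-line headline string. The keyword table and the function names `tally` and `share` are hypothetical and are not the rules actually used to produce the list.

```python
from collections import Counter

# Hypothetical category -> keyword mapping, for illustration only.
KEYWORDS = {
    "engineer": ("engineer",),
    "professor": ("professor",),
    "scientist": ("scientist",),
}

def tally(headlines, keywords=KEYWORDS):
    """Count how many headlines mention each category at least once."""
    counts = Counter()
    for headline in headlines:
        text = headline.lower()
        for category, words in keywords.items():
            if any(word in text for word in words):
                counts[category] += 1
    return counts

def share(count, total):
    """Format a count as 'N (P%)' with one decimal place."""
    return f"{count} ({100 * count / total:.1f}%)"

# Made-up sample headlines; a person can land in more than one category.
sample = [
    "Professor of Physics",
    "Software Engineer at Example Corp",
    "Research Scientist",
    "Professor and Software Engineer",
]
sample_counts = tally(sample)
```

Because the categories are not mutually exclusive, the counts can add up to more than the number of people, which is also why the list above does not sum to 100%. Substring matching like this is crude, so messy entries still need a manual pass.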

The numbers are not perfect since I don’t really know exactly what these people are doing, but I would say the magnitudes are correct based on my analysis and quick correction of some messy data points. I think it’s interesting that LinkedIn has more people for me to consider from Google and Microsoft combined (75) than from my own employer (66), which is the largest (26,000) in San Diego. It’s also interesting that you’re more likely to hit a professor than a software engineer in my vicinity of the network graph, though many are no doubt both. Professors also probably have a high vertex degree.

If this were something serious, it would seem that I’m pretty well connected to the high stratosphere of STEM. My actual contacts and especially my real friends corroborate this. In real life I don’t know what this means or how useful it is. Although this may well imply that I have "knowledge and skills for jobs of the future" at this moment in time, I think it mostly just means that I’m a nerd.