repeated the experiment and I got the same result: my address book was in Path’s hands.” 16
The company was holding detailed information on the friends, families, coworkers, and contacts for all three hundred thousand or so of its users, a list that potentially included tens of millions of people. They quickly issued an apology and a software update. But, in many ways, the damage was already done.
Allan has called this the inevitable result of the increasing market for mobile software among people who don’t understand, and have no desire to learn, how their most cherished devices work. We want our apps to know us, to present customized answers to our problems and questions, but we don’t care how they arrive at those solutions until there’s a problem.
Most people who download Instagram, Twitter, or Facebook to their phone already understand, at least in part, that they’re risking their private information in doing so. But they probably wouldn’t elect to give their grandparents’ contact information and other personal details to some strange company. Given that more than 9 percent of the entire U.S. population is part of a geo-social network (as calculated from the fact that 18 percent of smartphone owners are part of a geo-social network, and well more than 50 percent of the population owns a smartphone), further incidents of data leakage will affect the U.S. population well beyond the smartphone-owning community.
We presume that our personal data is compromised only when we choose to take a certain risky action. Maybe some people find amusement in these silly networks and don’t mind giving away their information to strangers, but that shouldn’t have any bearing on me, goes this line of thinking. But our friends and loved ones create data about life and that data includes us, whether we wish to be tagged or not. This is why we are using the wrong set of words to explain this phenomenon; we think of data leakage as an act of theft but we need to understand it as a contagion event. If you know someone who geo-tags their tweets, Facebook posts, or Instagram photos, you’ve already been infected.
Telemetry, Simulation, and Bayes
Once these signals are sensed, they must be processed if they are to form the basis of a useful prediction. But predictions, like the future itself, spring from the brain. The challenge is getting computers, programs, and systems to make predictions on the basis of continuously sensed information, on the basis of what’s happening now in (sort of) the same way that the brain does. This is an entirely recent problem related to the rise of continuous data streams and all the artifacts of modern information overload. But the mathematical formula to tackle it has actually been around for centuries and can be utilized as easily by a college undergrad as by a roomful of scientists.
Researchers can employ plenty of statistical methods and mathematical tricks, in isolation or combination, to turn data into a prediction. But the one method that allows you to make new predictions and update old predictions on the basis of new information is named after its originator, Thomas Bayes. The theorem in its simplest form is:
$$P(A \mid X) = \frac{P(X \mid A)\,P(A)}{P(X)}$$
In the above, P is probability, A is the outcome we are trying to predict, and X is some condition that could affect P. The theorem solves for the probability of A given ( | ) X. The value you award P when you begin is sometimes called the “prior”; the value you award P after you’ve run the formula is called the “posterior.”
Undeniably, compared with other statistical methods, Bayes won’t always give you the most accurate answer based on the data that you’re looking at. But it does give you a fairly honest answer. A large gap (in value) between the prior and the posterior suggests a small degree of confidence.
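To make the prior-to-posterior step concrete, here is a minimal Python sketch of Bayes’ rule, P(A|X) = P(X|A)P(A) / P(X). The function name `bayes_update` and every number in it are hypothetical, chosen only for illustration; none of them come from the text. The sketch also assumes that P(X) can be expanded over the two cases “A happens” and “A does not happen.”

```python
def bayes_update(prior, p_x_given_a, p_x_given_not_a):
    """Return the posterior P(A|X) from the prior P(A) and two likelihoods.

    prior            -- P(A), the belief held before seeing X
    p_x_given_a      -- P(X|A), how likely X is when A is true
    p_x_given_not_a  -- P(X|not A), how likely X is when A is false
    """
    # P(X), expanded over both cases (law of total probability).
    p_x = p_x_given_a * prior + p_x_given_not_a * (1.0 - prior)
    if p_x == 0:
        raise ValueError("X has zero probability under these inputs")
    return p_x_given_a * prior / p_x


# Hypothetical numbers: a 10 percent prior, and a signal X that shows up
# 80 percent of the time when A is true and 20 percent of the time when it is not.
first = bayes_update(0.10, 0.80, 0.20)
print(f"posterior after one signal:  {first:.3f}")   # 0.308

# The posterior becomes the new prior when the same kind of signal arrives again.
second = bayes_update(first, 0.80, 0.20)
print(f"posterior after two signals: {second:.3f}")  # 0.640
```

Feeding each posterior back in as the next prior is what lets a prediction be revised continuously as new information arrives. With these made-up numbers, the first jump from 0.10 to roughly 0.31 is the kind of gap between prior and posterior that the passage describes.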
Celebrated artificial intelligence (AI) luminary and statistician Judea