
Nexosis @ Work & Play


Data Analysis Will Be As Common As the Internet


Joe is frustrated. He's frustrated that a lot of people don't consider themselves "numbers people." He believes concepts like probability, mean, and variance should be as commonly understood as noun, verb, and adjective. Why? Because we have more data available to us than ever before. Read his explanation below.


I absolutely hate the question "So, Joe, what do you do?" I hate having to tell people that I am a Data Scientist and that I research mathematics for a living. I hate it because it invariably leads to some variation of "Oh, you do math for a living? I was never any good at that. I'm just not a numbers person." This response upsets me every time I hear it. I'm not upset about that person admitting their ignorance of a subject that I happen to love dearly. I'm upset that the declaration is often made with a strange zeal.

I've never encountered a person who, almost to the point of glee, exclaims "Oh, you read for a living? I was never any good at that. I'm just not a words person," and then, glancing around the room, expects everyone around them to have the shared experience of being illiterate. That's absurd. Yet, somehow, we've collectively decided that everyone gets a pass when it comes to math. Math is hard. Math is for nerds. No one cares about what time two trains collide or how many apples you have to split between X friends or whatever dumb trope problem is currently part of the zeitgeist.


I'm not proposing that everyone drop what they are doing and begin a systematic study of calculus and linear algebra. What I do suggest is that concepts like probability, mean, and variance be as commonly understood as noun, verb, and adjective. No one walks around wondering if you can read or write - it's just assumed; literacy is expected of you. We should start expecting some level of numeracy as well.

I think that this is critically important given that, at the time of this writing, data is being produced at a rate of roughly 30 zettabytes (30 billion terabytes) per second worldwide. That is roughly 2.6 million billion trillion bytes of data per day. That is 2,600,000,000,000,000,000,000,000,000 bytes a day.

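If you want to check that arithmetic yourself, it's a one-step conversion from the per-second rate to a per-day total. A quick sketch using the article's own figures (the 30-zettabyte rate is the number quoted above, not independently measured):

```python
# Verify the article's arithmetic: 30 zettabytes per second,
# accumulated over one day.
bytes_per_second = 30 * 10**21        # 30 zettabytes (the article's figure)
seconds_per_day = 24 * 60 * 60        # 86,400 seconds

bytes_per_day = bytes_per_second * seconds_per_day
print(f"{bytes_per_day:.1e} bytes per day")   # 2.6e+27
print(f"{bytes_per_day:,}")
```

Which works out to roughly 2.6 × 10^27 bytes, matching the "2.6 million billion trillion" figure.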

That number will be larger by the time you read this article, and it will only continue to grow, especially considering the projected popularity of IoT (internet of things) devices. Depending on your source, estimates range from 20 to 38 billion IoT devices in use by 2020. That is so much information.

When I was a kid, I remember learning that the closest star to our solar system, Alpha Centauri, was 25 trillion miles away. I was in third grade at the time and Mrs. Bowers might as well have told me it was 25 bajillion miles away. 25 trillion was my first encounter with a truly astronomical number. Yet, it is much, much smaller than the terrestrial 2.6 million billion trillion bytes of data per day that we generate.
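Putting the two numbers from this piece side by side makes the gap concrete. A quick comparison (both values are the figures quoted in the article):

```python
# Compare the article's two big numbers: the distance to Alpha Centauri
# versus the bytes of data generated per day.
miles_to_alpha_centauri = 25 * 10**12   # 25 trillion miles
bytes_per_day = 2.6 * 10**27            # the article's daily data figure

ratio = bytes_per_day / miles_to_alpha_centauri
print(f"{ratio:.0e}")   # 1e+14
```

In other words, the daily byte count is about a hundred trillion times larger than that "truly astronomical" third-grade number.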


Data isn't going anywhere and it is incumbent on all of us to know how to understand it.




Joe Volzer

Joe is one of our data scientists who moonlights as master of hype, pizza enthusiast, and collector of unintentional PhDs (in mathematics, no less).

August 31, 2017 @ 2:11 PM | Musings