When you were in Statistics 101 and the professor said, "OK, we are now going to learn about the Central Limit Theorem," did you tune out? Did you sarcastically think, "when is someone ever going to grab me and demand that I tell them about the Central Limit Theorem?" Come on, admit it, you did. Well, so did I. I was 18 years old and couldn't care less.
Well, you know what? Understanding the Central Limit Theorem has really big implications for big data analytics. Check out this 20-minute video, and you'll see that by applying the Central Limit Theorem and some statistical theory, you can approximate the results of an expensive multi-server implementation for interrogating really large databases.
I'll show you how you can obtain very precise estimates on really large databases by simply applying some basic statistics you should have learned freshman year (but you were too busy partying, weren't you?).
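To make the idea concrete, here is a minimal sketch in Python (not the code from the video, just an illustration of the general trick): instead of scanning every row of a huge table, draw a random sample, compute the statistic on the sample, and use the Central Limit Theorem to put an approximate confidence interval around it. The data and function names below are hypothetical stand-ins.

```python
import random
import statistics

def estimate_mean(values, sample_size=5000, z=1.96):
    """Estimate the mean of a large collection from a random sample.

    By the Central Limit Theorem the sample mean is approximately normal,
    so mean +/- z * s / sqrt(n) gives an approximate confidence interval
    (z = 1.96 for roughly 95% coverage).
    """
    sample = random.sample(values, sample_size)
    m = statistics.mean(sample)          # sample mean
    s = statistics.stdev(sample)         # sample standard deviation
    margin = z * s / sample_size ** 0.5  # z times the standard error
    return m, m - margin, m + margin

# Hypothetical stand-in for a big table column: one million random values.
population = [random.uniform(0, 100) for _ in range(1_000_000)]

est, low, high = estimate_mean(population)
print(f"Estimated mean: {est:.2f}  (approx. 95% CI: {low:.2f} to {high:.2f})")
```

The point is that the margin of error shrinks with the square root of the sample size, not the size of the database, which is why a modest sample can stand in for a very expensive full scan.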
Stay tuned: I'll be coming out with a big data analytics class in the New Year. If you want to learn more about SQL, programming, open source GIS, or Manifold, check out the courses at www.gisadvisor.com.
Powerful video, Art. I remember doing something similar using a Monte Carlo simulation moved to an HP calculator. Using the Central Limit Theorem, I was able to get results as good as those from a computer with far more speed and memory than the calculator had. I got the idea from an article in a technical magazine that built a Monte Carlo simulation using a BASIC program running on a Commodore 64.
Wow, shows my age. Lots of fun in those days.
I wish you had been a professor in my college days. I would have loved your courses.
Thanks, Ron. I actually deleted your first two posts since the video issue is resolved now. And yes, I noted to a friend that a lot of what we are doing today with big data is the same stuff we were doing 30-40 years ago, back when the data was "too big" at 20 MB!! So we have just increased the volume, and find ourselves returning to the same tricks we used in the past.