We live in an era of big data. Our devices are producing terabytes’ worth of information, and the trend is only accelerating. It’s difficult to express how valuable this information can be, but more importantly, it’s valuable to someone. Whether you’re trying to optimize your commute, get a leg up on the competition at work, or scale your business exponentially, there are valuable lessons to be had from analyzing data sets. In fact, as time goes by, it becomes more and more likely that those who aren’t analyzing data sets will not remain competitive for long.
But where to begin? What does it all mean? In this 3-part series, we’ll give you a little taste of the world of data science. In part 1, we lay down our foundational concepts. This includes an introduction to some of the most common terms and concepts surrounding data science, including a few things you might have heard of and a few you probably haven’t.
A section-by-section breakdown might be helpful here. In part 1 of this series, we’ll cover the following:
Data Science Course For Beginners
Terms and Concepts
Introduction to R Programming
1. The Big Picture
Data science deals with data – lots of it.
This means that you need to rely on aggregate information rather than individual instances when making decisions. Relying on intuition alone (often called “gut feeling”) makes it much more likely that you’ll make a poor decision if you haven’t done your research. Even worse, if the data being presented to you has any form of bias or manipulation associated with it, your intuition is unlikely to notice, so even red flags present in the data itself can go undetected. To learn more, see RemoteDBA.com.
2. Critical Thinking
Obviously, critical thinking requires one to think critically.
It’s important for two reasons: 1) Things don’t always add up, because math errors happen all of the time (even at NASA), and 2) Data can be manipulated. We know this sounds like some sort of conspiracy theory nonsense, but it’s an absolute fact. Data is often misinterpreted or misrepresented by those who are trying to sell us something, whether that is a product or their version of the truth. When you see data being presented to you, don’t just take it in: look for reasons why it could have been misinterpreted before accepting it as factual information.
3. The Big Picture
Critical thinking helps answer one big question:
How do I know that the data I’m looking at represents what someone else says it represents? Maybe there are some statisticians out there reading this right now wondering if they’ve messed up somewhere along the line, but others are probably thinking “Who cares? It seems accurate enough.” Actually, that answer is wrong. If you’ve ever been curious about the answer to this question, you’re in luck. The famous mathematician John von Neumann is said to have remarked: “With four parameters I can fit an elephant, and with five I can make him wiggle his trunk.” (He was pointing out that a model with enough free parameters can be made to fit almost any data, as in regression analysis, which is why a close fit proves very little by itself.) These days we have a lot more than 5 parameters at our disposal; we also have access to powerful computers and deep learning algorithms, but there is still no such thing as a perfect model. If someone says that they’ve developed the perfect algorithm for predicting stock prices (or anything else), then take their word for it… unless, of course, it’s your money that’s on the line.
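To make the “fit an elephant” point concrete, here is a minimal sketch in Python using NumPy. Everything in it is an assumption made up for illustration: the “true” relationship (a straight line), the noise level, the seed and the helper name `fit_errors`. It fits the same eight noisy points twice, once with a 2-parameter line and once with an 8-parameter polynomial, and compares how well each matches the observed data versus the real underlying relationship.

```python
import numpy as np


def fit_errors(degree, x, y_observed, y_true):
    """Fit a polynomial of the given degree to the observed data.

    Returns (mean squared error vs. the observed data,
             mean squared error vs. the true underlying signal).
    """
    coeffs = np.polyfit(x, y_observed, degree)
    predicted = np.polyval(coeffs, x)
    fit_error = float(np.mean((predicted - y_observed) ** 2))
    truth_error = float(np.mean((predicted - y_true) ** 2))
    return fit_error, truth_error


# Hypothetical data: a straight line plus random measurement noise.
rng = np.random.default_rng(0)  # fixed seed so the run is repeatable
x = np.linspace(0.0, 1.0, 8)
y_true = 2.0 * x + 1.0
y_observed = y_true + rng.normal(0.0, 0.2, size=x.size)

# Degree 1 has 2 parameters; degree 7 has 8, one per data point.
simple_fit, simple_truth = fit_errors(1, x, y_observed, y_true)
flexible_fit, flexible_truth = fit_errors(7, x, y_observed, y_true)

print("2 parameters: error vs data =", simple_fit, "| error vs truth =", simple_truth)
print("8 parameters: error vs data =", flexible_fit, "| error vs truth =", flexible_truth)
```

With as many parameters as data points, the flexible model passes through every observation, so its error against the data is essentially zero. That looks impressive, but it has simply memorized the noise: measured against the real relationship, it is further off than the plain line. A near-perfect fit is therefore not evidence of a good model.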
4. Data Ethics
This falls into two categories:
1) Ethics in the handling of data (this includes privacy-related issues) 2) Ethics in the interpretation of data (this includes things like misrepresenting or omitting context). Let’s take these one at a time, shall we?
Before you do anything else (including buying this book), make sure that you understand the differences between these terms. There’s a lot to consider when it comes to data science – and we haven’t even covered the most important part yet (in our opinion).