Learning Data Science: Our Favorite Data Science Books

April 18, 2019 Data Science Consulting

Originally Posted Here

Whether you are just breaking into data science or looking to improve your data science skills, books are a great way to get a base-level understanding of specific topics. Now, we personally believe nothing beats experience, but in lieu of that, taking a course or reading a book is a great way to learn about possibilities that you can build on later when you are trying to approach data science in practice.

In data science, there are many topics to cover, so we wanted to focus on several specific ones. This post will cover books on Python, R programming, big data, SQL, and some generally good reads for data scientists.

Heads Up! — This post contains referral links from Amazon Services LLC Associates Program, an affiliate advertising program designed to provide a means for us to earn fees by linking to Amazon.com and affiliated sites.

Data Science Books

As a data scientist, you have a very important role. Your goal is to provide your company with insights into improving the company's bottom or top line. The problem is, we can make data say anything we want. It can be very easy to manipulate data to prove that our feature was effective, and it can be tempting if the company incentivizes that type of behavior.

Thus, a great general read for data scientists (and really anyone in our modern world) is Naked Statistics. It is similar to the much older book How To Lie With Statistics, which you can read for free.

 

We do prefer Naked Statistics because it is a little more modern and covers much more complex statistical debauchery than its much older counterpart. It just goes to show you that numbers are at your whim and you have a lot of responsibility to make sure your numbers are right. If something seems amiss with your data, it probably is. Rather than reporting it out right away, think about how you might unknowingly be misrepresenting the facts.
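As a small illustration of how the same numbers can tell two different stories, here is a minimal sketch in Python. The data is entirely made up for this example (a hypothetical per-user revenue change after a feature launch) and is not taken from either book. It only shows how a single outlier can make the mean and the median point in different directions.

```python
from statistics import mean, median

# Hypothetical data: per-user revenue change (in dollars) after a feature launch.
# One very large account dominates the sample.
revenue_change = [-2, -1, -1, 0, 0, 1, 1, 2, 500]

avg = mean(revenue_change)    # pulled upward by the single large account
mid = median(revenue_change)  # the "typical" user in the middle of the sample

print(f"Mean change per user:   {avg:.2f}")
print(f"Median change per user: {mid:.2f}")
```

Reporting only the mean would suggest the feature helped every user a great deal, while the median suggests the typical user saw no change at all. Neither number is wrong, which is why it is worth checking more than one summary before reporting a result.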

Another similar book is Everybody Lies: Big Data, New Data, and What the Internet Can Tell Us About Who We Really Are.

 

Now, this book takes the perspective of the people creating the data (us) rather than the data scientist. It discusses how surveys, research, and reporting have all produced skewed data because we lie. In contrast, what we tell the internet is often very truthful. We search when we are sick, when we are hungry, when we are in love, etc. We Google great ideas for dates or "am I dying?" We are literally telling the internet what we are thinking about. This book covers that, as well as how it impacts our ability as data scientists to create accurate models based on this data.

Finally, another book we have read and found helpful in our journey is Storytelling with Data. One thing we enjoyed about this book is that it doesn't just cover what to do, but also what not to do. When you first start developing charts and models, it is tempting to overclutter them with every possible feature that Tableau and D3 offer. But, honestly, those features might drown out the impact you are trying to make. This book takes an entire chapter to discuss avoiding clutter, and it is great for those of us who need to remember to hold back.

Read More Here

 

For further reading and videos on data science, SQL and Python:

Learning Data Science: Our Favorite Data Science Resources From Free To Not

How Algorithms Can Become Unethical and Biased

How To Load Multiple Files With SQL

How To Develop Robust Algorithms

Dynamically Bulk Inserting CSV Data Into A SQL Server

4 Must Have Skills For Data Scientists

SQL Best Practices — Designing An ETL Video