Friday, 1 May 2020

10 Books Data Scientists Should Read During Lockdown


Introduction

The whole world has been fighting COVID-19 for a few weeks now, and most people have been locked down for a while. Even though most of us are still working from home, it’s quite easy to get bored in our free time, unsure what to watch next on Netflix.
Personally, I love reading, so I thought I’d recommend some books I’ve read lately that might be useful or entertaining for people who work with data. Some of them are more technical, while some are more amusing, but I’m sure you will find your cup of tea here.

Statistics: a Very Short Introduction
This book is especially useful if you are a data scientist from an IT or business background and want to understand the basics of statistical techniques without getting too much into the details. It covers the basics, from probability distributions to regression analysis and decision trees. As the name says, it is quite short and should take you only a couple of days to read. For a more detailed overview, you can check its summary here.

Lean Analytics
This book is especially recommended if you work with data in a startup, or if you own a startup and want it to develop its data potential. It helps you define the most important metrics for your company according to its business model, and shows how to optimise them without drowning in a pile of useless metrics. You can find a more detailed summary here.

The Man Who Solved the Market
A non-technical book, it tells the story of Jim Simons, a mathematician who started using statistics to trade stocks at a time when everyone else in the market relied only on instinct and traditional fundamental analysis. Unsurprisingly, everyone was skeptical of his methods at first, but after his fund delivered astonishing results year after year, people eventually gave in and started acknowledging the power of the so-called quant hedge funds, which play a huge role in the financial industry these days.

The Business Forecasting Deal
A primer on the art of business forecasting, one of the most traditional ways of using data and statistics in business applications. It really helps if you have some knowledge of statistics and time series and have to produce forecasts at work, such as predicting revenue. It covers the basics and the myths with a very practical approach. Even though the techniques presented can be considered old-fashioned, they usually work surprisingly well in this domain compared to machine learning techniques.

Storytelling with Data
A must-have for anyone who has to use numbers at work to sell an idea, present results or tell a story. It’s more suited to business analysts, but data scientists can also benefit from it by learning data visualisation techniques that will help them showcase their models’ results and plot data more effectively. It’s the kind of book you go back to all the time to review important concepts.

The Book of Why
We are often told that “correlation does not imply causation”. When you think about it, however, the concept of causation is not very clear: what exactly does it mean? This book tells the story of how we see causality from a philosophical perspective and then introduces the mathematical tools and models to understand it. It will change the way you think of cause and effect.

Moneyball
This is the story of Billy Beane and Paul DePodesta, who took the Oakland Athletics, a small baseball team, through an outstanding campaign in Major League Baseball by picking cheap, overlooked players. How did they do it? By using data. This changed the whole way teams choose their players, which was previously done exclusively by scouts and their instincts. The story has also inspired a film of the same name, and both are masterpieces.

Data Strategy
This one is more on the business side of things, and it can help executive managers and even C-level people understand how to unlock the power of data in an organisation. It goes from how to extract valuable insight from data to how to monetise it. If you are a data scientist, it can give you a broader vision of your role in the company and of how you can help it deliver value using data. If you want to learn more, there’s a good article on the subject here.

Feature Engineering for Machine Learning
Although Feature Engineering is one of the most important steps in the data science workflow, it is sometimes overlooked. This book is a good overview of this process, including detailed techniques, caveats and practical applications. It comes with the mathematical explanation and Python code for most methods, so you need a reasonable technical background to follow along. For a brief summary of Feature Engineering methods, you can read this article.

Artificial Intelligence: A Guide for Thinking Humans
I recommend this book not only for data scientists but for anyone interested in AI and its future outcomes. For a book aimed at the general public, it spends a lot of time on the details of computer vision and how computers “think”, giving a clear and broad overview of the subject. It also talks about AI perspectives for the future and what you can expect in the coming years. AI’s potential can sometimes scare us, but we also tend to overestimate its progress in the past few years.

