February 10, 2020

Become a Data Scientist in 2020 with these 5 resources

I am a mechanical engineer by education, and I started my career with a core job in the steel industry.

There I was, in heavy steel-reinforced gumboots and a plastic helmet, venturing around big blast furnaces and rolling mills. Artificial safety measures, to say the least, as I knew that nothing would save me if something untoward happened. Maybe some running shoes would have helped. As for the helmet, I would just say that steel melts at around 1370 degrees C.

My constant fear made me realize that the job was not for me, so sometime around 2011 I made it my goal to move into the analytics and data science space. Since then, MOOCs have been my go-to option for learning new things, and I ended up taking a lot of them. Good ones and bad ones.

Now in 2020, with the data science field changing so rapidly, there is no shortage of resources for learning it. But that often poses a problem for a beginner: where to start, and what to learn? There are a lot of great resources on the internet, but that means there are a lot of bad ones too.

1) Python 3 Programming Specialization

First, you need a programming language. This specialization from the University of Michigan is about learning to use Python and creating things on your own.

You will learn about programming fundamentals like variables, conditionals, and loops, and get to some intermediate material like keyword parameters, list comprehensions, lambda expressions, and class inheritance.
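To give a feel for the intermediate material mentioned above, here is a small sketch of keyword parameters, a list comprehension, a lambda expression, and class inheritance. It is not taken from the specialization; every name in it (`describe`, `Model`, `LinearModel`, and so on) is made up purely for illustration.

```python
# All names here are hypothetical, chosen only to illustrate the concepts.

def describe(name, *, greeting="Hello"):
    """`greeting` is a keyword-only parameter with a default value."""
    return f"{greeting}, {name}!"

# List comprehension: squares of the even numbers below 10.
even_squares = [n * n for n in range(10) if n % 2 == 0]

# Lambda expression used as a sort key (sort words by length).
words = ["pandas", "sql", "numpy"]
by_length = sorted(words, key=lambda w: len(w))

class Model:
    """Base class."""
    def __init__(self, name):
        self.name = name

    def summary(self):
        return f"Model: {self.name}"

class LinearModel(Model):
    """Subclass that extends the inherited summary() method."""
    def summary(self):
        return super().summary() + " (linear)"

print(describe("Ada", greeting="Hi"))
print(even_squares)
print(by_length)
print(LinearModel("ols").summary())
```

If you can read this snippet and predict what each `print` shows before running it, you are at roughly the level this specialization aims for.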

You might also like to go through my Python Shorts posts while going through this specialization.

2) Applied Data Science with Python

This specialization in Applied Data Science with Python gives an intro to many modern machine learning methods that you should know about. Not a thorough grinding, but you will get the tools to build your models.

3) Machine Learning Theory and Fundamentals

After completing the courses above, you will gain the status of what I would like to call a “Beginner.”

Yet, you do not fully understand all the math and grind that goes behind all these models.

You need to understand what goes on behind clf.fit. It's time to face the music. Nobody is going to take you seriously until you understand the math behind your models.
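To make "what goes on behind clf.fit" a little more concrete, here is a minimal sketch of a one-feature linear regression fitted by batch gradient descent on the mean squared error. This is not how scikit-learn or any particular library implements `fit`; the class `TinyLinearRegression`, its parameters, and the toy data are all assumptions made for illustration.

```python
class TinyLinearRegression:
    """Hypothetical one-feature linear model: y is approximated by w * x + b."""

    def __init__(self, lr=0.05, n_iter=2000):
        self.lr = lr          # learning rate (step size), an assumed default
        self.n_iter = n_iter  # number of gradient descent steps
        self.w = 0.0
        self.b = 0.0

    def fit(self, xs, ys):
        n = len(xs)
        for _ in range(self.n_iter):
            # Prediction error for every training point.
            errors = [self.w * x + self.b - y for x, y in zip(xs, ys)]
            # Gradients of the mean squared error with respect to w and b.
            grad_w = (2 / n) * sum(e * x for e, x in zip(errors, xs))
            grad_b = (2 / n) * sum(errors)
            # Step against the gradient.
            self.w -= self.lr * grad_w
            self.b -= self.lr * grad_b
        return self

    def predict(self, xs):
        return [self.w * x + self.b for x in xs]


# Toy data generated from y = 2x + 1, so fit should recover w near 2 and b near 1.
clf = TinyLinearRegression().fit([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
print(round(clf.w, 3), round(clf.b, 3))
```

The point is that `fit` is just an optimization loop over a loss function. The math courses teach you why the gradients look the way they do and when a loop like this converges.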

4) Learn Statistical Inference

You will learn about hypothesis testing, confidence intervals, and statistical inference methods for numerical and categorical data.
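As a rough sketch of what a confidence interval and a hypothesis test look like in code, the example below uses only Python's standard library. It relies on a normal approximation, which is an assumption on my part; for a sample this small a t-distribution would usually be more appropriate. The sample values and the function names are made up for illustration.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

# Made-up measurements, purely illustrative.
sample = [5.1, 4.9, 5.6, 5.8, 5.2, 5.4, 5.0, 5.5, 5.3, 5.7]

def mean_confidence_interval(data, confidence=0.95):
    """Normal-approximation confidence interval for the mean."""
    m = mean(data)
    se = stdev(data) / sqrt(len(data))             # standard error of the mean
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    return m - z * se, m + z * se

def z_test_p_value(data, mu0):
    """Two-sided p-value for the null hypothesis that the true mean is mu0."""
    se = stdev(data) / sqrt(len(data))
    z = (mean(data) - mu0) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))

low, high = mean_confidence_interval(sample)
print(f"95% CI for the mean: ({low:.3f}, {high:.3f})")
print(f"p-value for H0: mean = 5.0 -> {z_test_p_value(sample, 5.0):.4f}")
```

A small p-value says the data would be surprising if the null hypothesis were true. It does not tell you the probability that the hypothesis is true, which is exactly the kind of distinction an inference course drills into you.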

5) Learn SQL Basics for Data Science

While we feel much more accomplished by creating models and coming up with different hypotheses, the role of data munging can’t be overstated.

And with the ubiquity of SQL when it comes to ETL and data preparation tasks, everyone should know a little bit of it to at least be useful.

SQL has also become a de facto standard for working with Big Data tools like Apache Spark. This SQL specialization from UC Davis will teach you about SQL as well as how to use SQL for distributed computing.
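If you have never written SQL, here is a minimal sketch of the kind of data preparation query the basics cover, run from Python against an in-memory SQLite database. SQLite is used only because it ships with Python; it is not the tooling from the specialization, and the `orders` table and its rows are invented for illustration.

```python
import sqlite3

# In-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [
        ("alice", 30.0),
        ("bob", 12.5),
        ("alice", 20.0),
        ("bob", 7.5),
        ("carol", 45.0),
        ("dave", 5.0),
    ],
)

# Aggregate per customer, keep customers who spent at least 20 in total,
# and sort by total spend, highest first.
rows = conn.execute(
    """
    SELECT customer, COUNT(*) AS n_orders, SUM(amount) AS total
    FROM orders
    GROUP BY customer
    HAVING SUM(amount) >= 20
    ORDER BY total DESC
    """
).fetchall()
conn.close()

for customer, n_orders, total in rows:
    print(customer, n_orders, total)
```

The same `SELECT ... GROUP BY ... HAVING` pattern carries over largely unchanged to SQL interfaces on distributed engines, which is part of why a little SQL goes a long way.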