Not Another Article on Learning Data Science!

Dale Smith, Ph.D.
Mar 5, 2021
From OpenMoji — CC Share Alike License https://creativecommons.org/licenses/by-sa/4.0/

Why another guide? I’ll explain. Keep reading.

I have years of experience teaching undergraduate calculus and linear algebra. I also have years of experience keeping up with machine learning and statistical learning as they grew from their infancy. Taking the academic approach — calculus refresher, probability and statistics, linear algebra — is downright hard.

Based on my teaching experience, the main factor that determines success is what you tell yourself. There is no “math gene”, and learning calculus has nothing to do with anyone but you. Your goal is not performing at the highest level on the Mathematics Olympiad or publishing quality research papers.

I trace my epiphany on this topic back to Andrew Ng’s original Coursera class. I realized that inverting the usual order, starting with gradient descent instead of building up to it, would be an effective way to introduce calculus concepts. Calculating the gradient leads us naturally to the idea of approximating a hill by a plane, and the plane tells us which direction is the steepest way down the hill.
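To make the idea concrete, here is a minimal sketch of gradient descent in plain Python. The bowl-shaped function, the starting point, and the names `f`, `gradient`, and `gradient_descent` are my own illustration, not code from Andrew Ng’s class. The two numbers returned by `gradient` are the slopes of the plane that approximates the surface at a point, and stepping against them moves us downhill.

```python
def f(x, y):
    """Height of a bowl-shaped surface whose lowest point is at (1, -2)."""
    return (x - 1) ** 2 + (y + 2) ** 2


def gradient(x, y):
    """Partial derivatives of f: the slopes of the tangent plane at (x, y)."""
    return 2 * (x - 1), 2 * (y + 2)


def gradient_descent(x, y, learning_rate=0.1, steps=100):
    """Repeatedly take a small step in the steepest downhill direction."""
    for _ in range(steps):
        dx, dy = gradient(x, y)
        # Move against the gradient, because the gradient points uphill.
        x -= learning_rate * dx
        y -= learning_rate * dy
    return x, y


# Start somewhere up the side of the bowl and walk down.
x_min, y_min = gradient_descent(x=5.0, y=5.0)
print(x_min, y_min)  # values very close to 1 and -2
```

Try changing `learning_rate` or `steps` and watch how close you get to the bottom. That kind of tinkering teaches the concept faster than a page of derivative drills.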

Concepts, not thirty problems that start with “calculate the derivative”. Who wants to grind through homework problems after work, dinner and clean-up, homework help, and bedtime?

No, there is a different way for anyone who has time constraints and isn’t in college or a boot camp.

And yes, you can do this. The primary problem is you — telling yourself you can’t do it, you can’t learn coding, and you aren’t good at math. All that thinking is in the past. It doesn’t matter that your parent or grandparent told you they weren’t good at math, and let you skate with poor grades. What matters most is the person you are today. Start telling yourself yes, I can. Follow the process I’ve laid out here, and yes, you will.

OpenMoji CAA https://emojipedia.org/openmoji/1.0/party-popper/

Take a Python class. There are plenty of free online classes. Improve your coding first. Learn about source code control. Get your hands dirty and make mistakes. Learn from them. Udemy offers discounts for courses, and Coursera offers tuition assistance. Sign up, watch for the discount or apply for tuition assistance, and jump in.

Start with Python. I’ve learned several languages over the past 30 years, and Python is the most accessible of them for someone learning to code.

This first step — learning coding — is important because it gives you a success that you can build upon.

Why take a class? It gives you a structured format to learn from, and gives you a way to hold yourself accountable. If you can’t work on it every day, then don’t. Give yourself a break. Remember that life can intervene. Don’t be hard on yourself, but do stick to the coursework. Focus on the immediate goal and don’t let your feelings distract you.

Once you have some Python under your belt, you are ready for the next step — Bayesian methods in statistics! But we’re going to do it a different way than your college class by inverting the process. Coding first, then math.

Cam Davidson-Pilon has created an online book, Bayesian Methods for Hackers. Use your new Python skills to start coding, and learn as you code. For now, skip the TensorFlow part; you can come back to it later. The goal is to learn by doing the coding and to build your own confidence.
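To give a taste of what “coding first, then math” looks like, here is a small sketch in plain Python. It is not taken from the book, which uses dedicated libraries. The coin-flip data and the name `posterior_on_grid` are assumptions of mine for illustration. The code asks: after seeing 7 heads and 3 tails, how plausible is each possible value of the coin’s probability of heads?

```python
def posterior_on_grid(heads, tails, grid_size=101):
    """Approximate the posterior for a coin's probability of heads on a grid."""
    # Candidate values for the probability of heads: 0.00, 0.01, ..., 1.00.
    grid = [i / (grid_size - 1) for i in range(grid_size)]
    # Flat prior: before seeing data, every candidate is equally plausible.
    prior = [1.0] * grid_size
    # Likelihood: how probable the observed flips are under each candidate.
    likelihood = [p ** heads * (1 - p) ** tails for p in grid]
    # Bayes' rule: posterior is proportional to prior times likelihood.
    unnormalized = [pr * lk for pr, lk in zip(prior, likelihood)]
    total = sum(unnormalized)
    posterior = [u / total for u in unnormalized]
    return grid, posterior


grid, posterior = posterior_on_grid(heads=7, tails=3)
# Index of the candidate value with the highest posterior plausibility.
best = max(range(len(grid)), key=lambda i: posterior[i])
print("Most plausible probability of heads:", grid[best])
```

Change the number of heads and tails, or replace the flat prior with one that favors a fair coin, and see how the answer moves. The formula for Bayes’ rule will mean more once you have watched it work.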

Your Python skills and knowledge of data science are growing. Take the Data Cleaning challenge. This should take you two to six weeks at most. It’s a necessary step that will improve your knowledge of Python and introduce you to Jupyter Notebooks.

Dale Smith, Ph.D.

Co-Founder and Chief Research Officer — Vallum Software. My interests are in C/C++, machine learning, Python, Pandas, and Jupyter Notebooks.