The ONLY Python learning roadmap you need.
Python is one of the most requested skills in Data Scientist job postings.
But you don’t need a coding bootcamp to learn Python.
———
𝗠𝘆 𝘀𝘁𝗼𝗿𝘆:
I didn't know Python when I started my data journey.
I applied & interviewed for many Data Science jobs.
But I got rejected from all of them.
It made me realize I had to learn Python if I was serious about becoming a Data Scientist.
But I couldn't afford a bootcamp or college courses.
And so, I figured out how to learn Python on my own.
I used a combination of 3 things:
1. This learning roadmap
2. DataCamp for learning: https://lnkd.in/gngM9z8D
3. Jupyter Notebooks to build projects with my new skills
———
𝗛𝗲𝗿𝗲’𝘀 𝗺𝘆 𝗿𝗼𝗮𝗱𝗺𝗮𝗽 𝘁𝗼 𝗹𝗲𝗮𝗿𝗻 𝗣𝘆𝘁𝗵𝗼𝗻 𝗳𝗼𝗿 𝗗𝗮𝘁𝗮 𝗦𝗰𝗶𝗲𝗻𝗰𝗲 👇🏽
1️⃣ Python basics
𝘋𝘰𝘯’𝘵 𝘴𝘬𝘪𝘱 𝘵𝘩𝘪𝘴 𝘴𝘵𝘦𝘱. 𝘐𝘵 𝘴𝘦𝘵𝘴 𝘵𝘩𝘦 𝘧𝘰𝘶𝘯𝘥𝘢𝘵𝘪𝘰𝘯 𝘧𝘰𝘳 𝘦𝘷𝘦𝘳𝘺𝘵𝘩𝘪𝘯𝘨 𝘦𝘭𝘴𝘦.
Variables and data types
→ type()
→ int(), float(), str()
→ list(), dict()
Control structures
→ if, elif, else
→ for & while loops
→ range()
Functions
→ def
→ return
→ arguments (positional, keyword, *args, **kwargs)
List comprehensions
→ [expression for item in iterable if condition]
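Here's a rough sketch of how these basics fit together. The names, scores, and grade cutoffs are made up for illustration.

```python
# Variables and data types: converting between types
count = int("42")            # str -> int
price = float("3.5")         # str -> float
label = str(count)           # int -> str
scores = list((70, 85, 90))  # tuple -> list
person = dict(name="Ada", role="analyst")
print(type(count), type(price), type(label), type(person))


# Functions + control structures: if / elif / else inside a def
def grade(score):
    """Return a letter grade for a numeric score (cutoffs are invented)."""
    if score >= 90:
        return "A"
    elif score >= 80:
        return "B"
    else:
        return "C"


# for loop over a list
grades = []
for s in scores:
    grades.append(grade(s))

# for loop over range(): 1, 2, 3
total = 0
for i in range(1, 4):
    total += i

# while loop: runs until the condition becomes False
n = 3
countdown = []
while n > 0:
    countdown.append(n)
    n -= 1


# *args collects any number of positional arguments into a tuple
def average(*args):
    return sum(args) / len(args)


# List comprehension: [expression for item in iterable if condition]
passing = [s for s in scores if s >= 80]
```

Try changing the inputs and predicting the result before you run the cell. That's the fastest way to check you really understand each piece.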
2️⃣ Data manipulation & cleaning
Data cleaning
→ df.dropna(), df.fillna()
→ df.drop_duplicates()
→ df.replace()
Merging and reshaping data
→ pd.merge()
→ df.pivot()
→ df.melt()
Grouping and aggregation
→ df.groupby()
→ df.agg()
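A minimal sketch of these pandas calls on a tiny, invented sales table. The column names and the -1.0 "missing" marker are assumptions for the example, not part of any real dataset.

```python
import numpy as np
import pandas as pd

# Toy data: one missing value, one duplicate row, one placeholder value (-1.0)
sales = pd.DataFrame({
    "store": ["A", "A", "B", "B", "B"],
    "month": ["Jan", "Feb", "Jan", "Jan", "Feb"],
    "revenue": [100.0, np.nan, 80.0, 80.0, -1.0],
})

# Data cleaning
complete_rows = sales.dropna()                             # drop rows with missing values
clean = sales.drop_duplicates().copy()                     # drop the repeated B/Jan row
clean["revenue"] = clean["revenue"].replace(-1.0, np.nan)  # treat -1.0 as missing
clean = clean.fillna({"revenue": 0.0})                     # fill missing revenue with 0

# Merging: attach a region to each store
regions = pd.DataFrame({"store": ["A", "B"], "region": ["North", "South"]})
merged = pd.merge(clean, regions, on="store", how="left")

# Reshaping: long -> wide with pivot(), then wide -> long with melt()
wide = merged.pivot(index="store", columns="month", values="revenue")
long = wide.reset_index().melt(id_vars="store", var_name="month", value_name="revenue")

# Grouping and aggregation
summary = merged.groupby("region")["revenue"].agg(["sum", "mean"])
print(summary)
```

Filling missing revenue with 0 is only one option. On real data, decide per column whether to drop, fill, or flag missing values.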
📍 𝟭𝘀𝘁 𝗰𝗵𝗲𝗰𝗸𝗽𝗼𝗶𝗻𝘁: do a data cleaning project 📍
3️⃣ Exploratory Data Analysis
You’ll combine your creativity with your tech skills.
Descriptive statistics
→ df.mean(), df.median(), df.mode()
→ df.std(), df.var()
→ df.min(), df.max(), df.quantile()
Data distribution
→ df.hist()
→ stats.normaltest() (from scipy.stats)
Correlation analysis
→ df.corr()
→ plt.imshow() (heatmap of the correlation matrix)
→ stats.pearsonr() (from scipy.stats)
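Here's a small sketch of the numeric side of EDA, assuming `stats` means `scipy.stats`. The height/weight data is randomly generated just for the example, and the plotting calls are left out to keep it short.

```python
import numpy as np
import pandas as pd
from scipy import stats

# Synthetic data: weight is loosely tied to height plus noise
rng = np.random.default_rng(0)
height = rng.normal(170, 10, size=200)
weight = 0.5 * height + rng.normal(0, 5, size=200)
df = pd.DataFrame({"height": height, "weight": weight})

# Descriptive statistics, one row per column
summary = pd.DataFrame({
    "mean": df.mean(),
    "median": df.median(),
    "std": df.std(),
})
quartiles = df["height"].quantile([0.25, 0.75])

# Data distribution: test the null hypothesis that height is normally distributed
stat, p_value = stats.normaltest(df["height"])

# Correlation analysis: full matrix, then a single pair with its p-value
corr_matrix = df.corr()
r, p_r = stats.pearsonr(df["height"], df["weight"])

print(summary)
print(corr_matrix)
```

`df.corr()` uses Pearson correlation by default, so the off-diagonal value matches what `stats.pearsonr()` returns for the same two columns.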
4️⃣ Data visualization with 𝘮𝘢𝘵𝘱𝘭𝘰𝘵𝘭𝘪𝘣
Basic plotting
→ plt.plot() (line plots)
→ plt.scatter() (scatter plots)
→ plt.bar() (bar charts)
Histograms and density plots
→ plt.hist()
→ df.plot.kde() (pandas; matplotlib has no plt.kde())
Box plots
→ plt.boxplot()
Subplots and multiple charts
→ plt.subplots()
→ fig.add_subplot()
Customizing plots
→ plt.xlabel(), plt.ylabel(), plt.title()
→ plt.xscale(), plt.yscale()
→ plt.legend()
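A quick sketch that puts four of these chart types on one figure. The data is random, and I'm using the Axes methods (`ax.set_xlabel()` and friends), which do the same job as `plt.xlabel()` and friends when you work with subplots.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend for scripts; drop this line in a notebook
import matplotlib.pyplot as plt
import numpy as np

# Synthetic data: a noisy straight line
rng = np.random.default_rng(1)
x = np.linspace(0, 10, 50)
y = 2 * x + rng.normal(0, 2, size=x.size)

# Subplots: a 2x2 grid of Axes on one Figure
fig, axes = plt.subplots(2, 2, figsize=(8, 6))

# Line plot with labels, title, and legend
axes[0, 0].plot(x, y, label="noisy line")
axes[0, 0].set_xlabel("x")
axes[0, 0].set_ylabel("y")
axes[0, 0].set_title("Line plot")
axes[0, 0].legend()

# Scatter plot
axes[0, 1].scatter(x, y)
axes[0, 1].set_title("Scatter plot")

# Histogram with 10 bins
axes[1, 0].hist(y, bins=10)
axes[1, 0].set_title("Histogram")

# Box plot
axes[1, 1].boxplot(y)
axes[1, 1].set_title("Box plot")

fig.tight_layout()
```

In a Jupyter Notebook the figure shows up under the cell. In a script, call `fig.savefig()` with a file name of your choice to write it to disk.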
📍 𝟮𝗻𝗱 𝗰𝗵𝗲𝗰𝗸𝗽𝗼𝗶𝗻𝘁: do an EDA project 📍
5️⃣ Machine learning with 𝘴𝘤𝘪𝘬𝘪𝘵-𝘭𝘦𝘢𝘳𝘯
Model training and evaluation
→ model_selection.train_test_split()
→ model.fit(), model.predict() (every scikit-learn model has these)
→ model_selection.cross_val_score()
Regression models
→ linear_model.LinearRegression()
→ metrics.mean_squared_error()
→ metrics.r2_score()
Classification models
→ linear_model.LogisticRegression()
→ metrics.accuracy_score()
→ metrics.confusion_matrix()
Clustering
→ cluster.KMeans()
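Here's a minimal end-to-end sketch with scikit-learn, assuming the module names above refer to `sklearn`. The data is synthetic and the settings (test size, 5 folds, 2 clusters) are just example choices.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LinearRegression, LogisticRegression
from sklearn.metrics import (accuracy_score, confusion_matrix,
                             mean_squared_error, r2_score)
from sklearn.model_selection import cross_val_score, train_test_split

# Synthetic features with one regression target and one binary target
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 2))
y_reg = 3 * X[:, 0] - 2 * X[:, 1] + rng.normal(0, 0.1, size=200)
y_clf = (X[:, 0] + X[:, 1] > 0).astype(int)

# Regression: split, fit, predict, evaluate
X_train, X_test, y_train, y_test = train_test_split(
    X, y_reg, test_size=0.25, random_state=0)
reg = LinearRegression().fit(X_train, y_train)
pred = reg.predict(X_test)
mse = mean_squared_error(y_test, pred)
r2 = r2_score(y_test, pred)

# Classification: same workflow, different model and metrics
Xc_train, Xc_test, yc_train, yc_test = train_test_split(
    X, y_clf, test_size=0.25, random_state=0)
clf = LogisticRegression().fit(Xc_train, yc_train)
yc_pred = clf.predict(Xc_test)
acc = accuracy_score(yc_test, yc_pred)
cm = confusion_matrix(yc_test, yc_pred)

# Cross-validation: 5 accuracy scores, one per fold
cv_scores = cross_val_score(LogisticRegression(), X, y_clf, cv=5)

# Clustering: no target needed, just the features
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)

print(f"MSE={mse:.3f}  R2={r2:.3f}  accuracy={acc:.2f}")
```

Notice the pattern: every model is created, fitted with `.fit()`, and then used. Once that clicks, swapping in a different model is mostly a one-line change.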
📍 𝟯𝗿𝗱 𝗰𝗵𝗲𝗰𝗸𝗽𝗼𝗶𝗻𝘁: build your first ML model in Python 📍
Btw, I learned Python entirely through DataCamp, and I highly recommend it.
You can learn all of the above in one place: https://lnkd.in/gngM9z8D