In the first module, you will be introduced to the basic concepts of data science: the definition and applications of data, the roles of a data scientist, sources and types of data, data storage and retrieval methods, data pipelines, data preparation, exploratory data analysis, and interactive dashboards.
In this part, you’ll learn to install Python and Jupyter Notebook, work with data types and variables, and use conditionals and loops to control the flow of your programs. You’ll harness the power of complex data structures like lists, sets, dictionaries, and tuples to store collections of related data. You’ll define and document your own custom functions, write scripts, and handle errors.
In this module, you will learn about a commonly used data structure in Python for scientific data: NumPy arrays. You will write Python code to import data as NumPy arrays, run calculations on them, and summarize the data they hold.
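As a rough illustration of what importing and summarizing data with NumPy can look like, here is a minimal sketch. The temperature readings and the in-memory CSV text are made up for the example; in practice the data would usually come from a file on disk.

```python
import io

import numpy as np

# Hypothetical CSV content (temperatures in Celsius), used only for illustration.
csv_text = "21.5,22.0,19.5\n20.0,23.5,18.0\n"

# np.loadtxt accepts a file path or a file-like object.
readings = np.loadtxt(io.StringIO(csv_text), delimiter=",")

print(readings.shape)         # (rows, columns)
print(readings.mean())        # mean of all values
print(readings.max(axis=0))   # column-wise maximum
print(readings * 9 / 5 + 32)  # element-wise calculation: Celsius to Fahrenheit
```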
The objective of this module is to get you familiar with the basic plotting functions of the Matplotlib library. It contains several examples that will give you hands-on experience in generating plots in Python.
In this module, you will learn about the basic characteristics of Python dictionaries and learn how to access and manage dictionary data. Once you have finished this module, you should have a good sense of when a dictionary is the appropriate data type to use, and how to do so.
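The basic dictionary operations mentioned above might look like the following sketch. The inventory data is hypothetical and chosen only to keep the example short.

```python
# Hypothetical inventory data, used only for illustration.
stock = {"apples": 12, "pears": 5}

stock["plums"] = 8            # add a new key
stock["apples"] += 3          # update an existing value
removed = stock.pop("pears")  # remove a key and get its value back

# .get() returns a default instead of raising KeyError for a missing key.
kiwi_count = stock.get("kiwis", 0)

for fruit, count in stock.items():
    print(fruit, count)
```

A dictionary is a good fit here because each item is looked up by name rather than by position.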
In this module, you’ll get started with Pandas and get to know the ins and outs of how you can use it to analyse data with Python.
In this module, you’ll see how calculations can be performed on objects in Python. By the end of this module, you will be able to create complex expressions by combining objects and operators.
In this module, you’ll learn about Python control flow structures: programming blocks that evaluate variables and choose which path to take based on specified conditions.
After completing this module, you will be able to explain how indexing works for pandas DataFrames and use indexing and filtering to select data from them.
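A minimal sketch of indexing and filtering a pandas DataFrame is shown below. The city and sales figures are invented for the example.

```python
import pandas as pd

# Hypothetical data, used only for illustration.
df = pd.DataFrame(
    {"city": ["Pune", "Delhi", "Mumbai"], "sales": [120, 340, 210]},
    index=["a", "b", "c"],
)

by_label = df.loc["b", "sales"]     # label-based indexing
by_position = df.iloc[0, 0]         # position-based indexing
high_sales = df[df["sales"] > 200]  # boolean filtering keeps matching rows

print(high_sales)
```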
When you’re working with data in Python, loops can be a powerful tool. But they can also be a little bit confusing when you’re just starting out. In this module, we’re going to dive headfirst into loops and learn how they can be used to do all sorts of interesting things when you’re doing data cleaning or data analysis in Python.
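As one small example of a loop used for data cleaning, the sketch below walks through some raw text values and keeps only the ones that are valid whole numbers. The survey responses are hypothetical.

```python
# Hypothetical raw survey responses with inconsistent formatting.
raw_ages = [" 34", "n/a", "27 ", "", "41"]

clean_ages = []
for value in raw_ages:
    value = value.strip()  # remove surrounding whitespace
    if not value.isdigit():
        continue           # skip entries that are not whole numbers
    clean_ages.append(int(value))

print(clean_ages)
```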
A necessary aspect of working with data is the ability to describe, summarize, and represent data visually. Python statistics libraries are comprehensive, popular, and widely used tools that will assist you in working with data. In this module, you’ll learn:
– What numerical quantities you can use to describe and summarize your datasets
– How to calculate descriptive statistics in pure Python
– How to get descriptive statistics with available Python libraries
– How to visualize your datasets
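For a sense of what descriptive statistics in pure Python can look like, here is a short sketch that uses only the standard library's `statistics` module. The sample values are made up.

```python
import statistics

scores = [2, 4, 4, 5, 7, 9]  # hypothetical sample

mean = statistics.mean(scores)      # arithmetic mean
median = statistics.median(scores)  # middle value (average of the two middle values here)
mode = statistics.mode(scores)      # most frequent value
stdev = statistics.stdev(scores)    # sample standard deviation

print(mean, median, mode, stdev)
```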
In this module, we will explore basic principles behind using data for estimation and for assessing theories. We will analyse both categorical data and quantitative data, starting with one-population techniques and expanding to handle comparisons of two populations. We will learn how to construct confidence intervals. We will also use sample data to assess whether or not a theory about the value of a parameter is consistent with the data. A major focus will be on interpreting inferential results appropriately.
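To make the idea of a confidence interval concrete, the sketch below builds an approximate 95% interval for a population mean. It assumes a normal approximation and uses invented measurements; for a sample this small, a t critical value would usually be preferred and would give a slightly wider interval.

```python
import math
import statistics

sample = [5.1, 4.9, 5.6, 5.8, 5.0, 5.3, 5.4, 5.2]  # hypothetical measurements

n = len(sample)
mean = statistics.mean(sample)
std_err = statistics.stdev(sample) / math.sqrt(n)  # standard error of the mean

# Critical value for a two-sided 95% interval under the normal approximation.
z = statistics.NormalDist().inv_cdf(0.975)

ci_low = mean - z * std_err
ci_high = mean + z * std_err
print(ci_low, ci_high)
```

The interval is read as a range of plausible values for the population mean, not as a range that contains 95% of the individual observations.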
The pandas package is the most important tool at the disposal of Data Scientists and Analysts working in Python today. The powerful machine learning and glamorous visualization tools may get all the attention, but pandas is the backbone of most data projects. Learn some of the most important pandas features for exploring, cleaning, transforming, visualizing, and learning from data.
This module is all about the act of combining—or merging—DataFrames, an essential part of any data scientist's toolbox. You'll hone your pandas skills by learning how to organize, reshape, and aggregate multiple datasets to answer your specific questions.
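A minimal sketch of merging and then aggregating two DataFrames follows. The `customers` and `orders` tables and their column names are hypothetical.

```python
import pandas as pd

# Hypothetical tables, used only for illustration.
customers = pd.DataFrame({"customer_id": [1, 2, 3], "name": ["Asha", "Ben", "Chen"]})
orders = pd.DataFrame({"customer_id": [1, 1, 3], "amount": [250, 100, 400]})

# A left join keeps every customer, including those with no orders.
merged = customers.merge(orders, on="customer_id", how="left")

# Aggregate order amounts per customer; missing amounts are skipped by sum().
totals = merged.groupby("name")["amount"].sum()

print(totals)
```

Choosing `how="left"` rather than the default inner join is a design decision: it keeps customers without orders visible in the result.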
In this module, you’ll learn to:
– Explain what data visualization is and why it is important today
– Understand why Python is considered one of the best data visualization tools
– Describe matplotlib and its data visualization features in Python
– List the types of plots and the steps involved in creating them
In this module, you’ll learn about Seaborn, a Python data visualization library based on Matplotlib. It provides a high-level interface for drawing attractive and informative statistical graphics.
In this module, you will get an introduction to flat files, learn to import data from other file types using Python, and learn to work with relational databases in Python.
In this module, we’ll use Python’s libraries to fix common data problems and get datasets ready for analysis.
In this module, you’ll learn about exploratory data analysis (EDA), an approach within data analysis used to gain a better understanding of aspects of the data such as:
– the main features of the data
– the variables and the relationships that hold between them
– which variables are important for our problem
In this module, we look at machine learning (ML): what it is and how it works. We take a look at a couple of supervised learning algorithms and one unsupervised learning algorithm. No coding is required in this module. You will also get an introduction to deep learning.
Scikit-learn (sklearn) is one of the most widely used and robust libraries for machine learning in Python. It provides a selection of efficient tools for machine learning and statistical modeling, including classification, regression, clustering, and dimensionality reduction, via a consistent interface. The library, which is largely written in Python, is built upon NumPy, SciPy, and Matplotlib.
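The shared interface can be illustrated with a short sketch: two different classifiers are trained and scored through the same `fit` and `score` calls. The choice of the built-in iris dataset, the two models, and the split settings are assumptions made for the example.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Small built-in dataset, chosen here only for illustration.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

scores = {}
for model in (KNeighborsClassifier(n_neighbors=5), LogisticRegression(max_iter=1000)):
    model.fit(X_train, y_train)  # same call for every estimator
    scores[type(model).__name__] = model.score(X_test, y_test)  # accuracy on held-out data

print(scores)
```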
By the end of the course, you’ll apply clustering and dimensionality reduction in machine learning using Python and master unsupervised learning to solve real-world problems!
In this module, you'll learn how to use Python to train decision trees and tree-based models. You'll understand the advantages and shortcomings of trees and demonstrate how ensembling can alleviate these shortcomings, all while practicing on real-world datasets. Finally, you'll also understand how to tune the most influential hyperparameters in order to get the most out of your models.
After completing the module, you will be able to quickly apply various clustering algorithms on data, visualize the clusters formed and analyse results.
Gradient boosting is currently one of the most popular techniques for efficient modeling of tabular datasets of all sizes. XGBoost is a very fast, scalable implementation of gradient boosting, with models using XGBoost regularly winning online data science competitions and being used at scale across different industries. In this module, you'll learn how to use this powerful library to build and tune supervised learning models.
In this module, we will study what dimensionality reduction is and why it matters. We will also cover its components and methods, including Principal Component Analysis (PCA), feature selection, and feature extraction.
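To give a feel for Principal Component Analysis, here is a minimal sketch that computes it with NumPy's singular value decomposition rather than a dedicated library class. The random data and the choice of two components are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))      # hypothetical data: 100 samples, 3 features
X[:, 2] = X[:, 0] + 0.1 * X[:, 2]  # make the third feature nearly redundant

# PCA works on centered data.
X_centered = X - X.mean(axis=0)

# Rows of Vt are the principal directions, ordered by explained variance.
U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)

explained_variance_ratio = S**2 / np.sum(S**2)
X_reduced = X_centered @ Vt[:2].T  # project onto the first two components

print(explained_variance_ratio)
print(X_reduced.shape)
```

Because one feature is almost a copy of another, most of the variance should be captured by the first two components.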
This module covers the basics of how and when to perform data pre-processing. You'll learn how to standardize your data so that it's in the right form for your model, create new features to best leverage the information in your dataset, and select the best features to improve your model fit.
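Standardization, one of the pre-processing steps mentioned above, can be sketched in a few lines of NumPy. The feature values (age and income) are invented, and z-score scaling is only one of several possible approaches.

```python
import numpy as np

# Hypothetical features on very different scales: age (years) and income.
X = np.array(
    [[25, 30000.0], [40, 52000.0], [31, 41000.0], [58, 90000.0]]
)

# Z-score standardization: subtract each column's mean, divide by its standard deviation.
X_scaled = (X - X.mean(axis=0)) / X.std(axis=0)

print(X_scaled.mean(axis=0))  # approximately 0 for each column
print(X_scaled.std(axis=0))   # approximately 1 for each column
```

In a real project the mean and standard deviation would be computed on the training data only and then reused for validation and test data.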
The Time Series Forecasting course provides students with the foundational knowledge to build and apply time series forecasting models in a variety of business contexts. You will learn:
The key components of time series data and forecasting models
How to use ETS (Error, Trend, Seasonality) models to make forecasts
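As a small taste of the ETS family, the sketch below implements simple exponential smoothing, its simplest member (additive error, no trend, no seasonality). The function name, the sales figures, and the fixed smoothing parameter are assumptions for illustration; a full ETS model would also estimate its parameters from the data.

```python
def simple_exp_smoothing(series, alpha):
    """Return the one-step-ahead forecast after the last observation."""
    level = series[0]  # initialise the level with the first observation
    for y in series[1:]:
        # New level is a weighted average of the latest value and the old level.
        level = alpha * y + (1 - alpha) * level
    return level


monthly_sales = [100, 104, 99, 107, 110]  # hypothetical data
forecast = simple_exp_smoothing(monthly_sales, alpha=0.5)
print(forecast)
```

A larger `alpha` makes the forecast react faster to recent observations, while a smaller one smooths out more of the noise.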
In this module, you will learn to create new features to improve the performance of your Machine Learning models.
In this module, we will cover the basics of model validation, discuss various validation techniques, and begin to develop tools for creating validated and high-performing models.
In this module, you'll learn the difference between hyperparameters and parameters, how a model's hyperparameters affect its performance, and how to use grid search, random search, and informed search to try different hyperparameter values.
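Grid search can be sketched with scikit-learn's `GridSearchCV`, which tries every combination in a grid and scores each one with cross-validation. The dataset, the model, and the particular grid values below are assumptions chosen to keep the example small.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)  # small built-in dataset, for illustration only

# Hyperparameters are set before training; the grid lists the values to try.
param_grid = {
    "n_neighbors": [1, 3, 5, 7],
    "weights": ["uniform", "distance"],
}

search = GridSearchCV(KNeighborsClassifier(), param_grid, cv=5)
search.fit(X, y)  # fits one model per combination and cross-validation fold

print(search.best_params_)
print(search.best_score_)
```

Random search follows the same pattern with `RandomizedSearchCV`, which samples a fixed number of combinations instead of trying them all.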
Program Structure
Platform Discussions
Python Installation
LinkedIn Branding & Networking Introduction
Introduction to Data Science
Introduction to Python
Profile Optimisation & Basic Hygiene
Case Study : Bike Sharing
Supervised Learning with scikit-learn - Logistic Regression
Case Study : Telecom Churn Case Study
Resume Building
Numpy
Matplotlib
Dictionaries
Pandas
Overview
Logical Operators
Control Flow
Filtering Pandas DataFrames
Loops
Doubt Clarification Session 1
Creating Content & building credibility
Pandas
Introduction to Importing Data in Python
Introduction to Data Visualisation Using Matplotlib
Growing the Network of TG
Pandas Contd
Introduction to Importing Data in Python
Introduction to Data Visualisation Using Matplotlib
Data Visualization with Seaborn
Doubt Clarification Session 2
Asking for work/job, reading the stats and tracking your growth
Case Study : Movie Reviews - Pandas
Practice Questions
LinkedIn Doubt Clearing Session
Creating a GitHub account
Creating a Repository - Assignment
Creating and Uploading Projects
Descriptive Statistics Using Python
Inferential Statistics Using Python
Inferential Statistics Using Python
Feature Engineering for Machine Learning
Feature Engineering for Machine Learning contd.
Exploratory Data Analysis
Case Study : Credit EDA
Doubt Clarification Session 4
Case Study : Statistics and Hypothesis
Practice Questions
Hackathon 1
Introduction to Machine Learning
Supervised Learning with scikit-learn - K - Nearest Neighbors
Supervised Learning with scikit-learn - KNN Contd.
Supervised Learning with scikit-learn - Linear Regression
Supervised Learning with scikit-learn - Regularised Regression
Doubt Clarification Session 5
Case Study : Bike Sharing
Case Study : US based Housing Company
Practice Questions
Supervised Learning with scikit-learn - Logistic Regression
Case Study : Telecom Churn Case Study
Doubt Clarification Session 6
Case Study : Lead Scoring
Unsupervised Learning - KMeans
Unsupervised Learning - Hierarchical Clustering
Dimensionality Reduction - PCA
Case Study - Clustering of Countries
Practice Questions
Doubt Clarification Session 7
Machine Learning for Time Series Data
Case Study - Time Series
Case Study - GDP Analysis
Doubt Clarification Session 8
GitHub Review
Practice Questions
Resume Building Session
Capstone Project Initiated
Hackathon 2
Interview Preparation Live Session
Written Interview Test 1
Doubt Clarification Session 9
Mock Interviews - Group Practice 1
Mock Interviews - Group Practice 2
Written Interview Test 2
Practice interview with outside experts - 1
Doubt Clarification Session 10
Practice interview with outside experts - 2
Interview Drives for Selected Candidates
Besides being a passionate researcher in his areas of interest, Arka also works in Data Science professionally. He holds a Master's degree in Economics and has worked mostly in niche areas of Machine Learning, Econometric Modeling, and Computational Linguistics (NLP).
A Certified Data Scientist with a demonstrated history of working in the information technology and services industry. Skilled in Data Science, Machine Learning, Deep Learning, and Computer Vision. Strong engineering professional with a Bachelor of Technology (BTech) in Computer Science from SVKM's Narsee Monjee Institute of Management Studies (NMIMS).
A training and development professional and technology enthusiast with 8+ years of experience across multiple sectors in Data Science, Artificial Intelligence, Machine Learning, Deep Learning, and Industry 4.0. Interested in challenging roles involving real-time solutions and research, and in contributing to change-oriented products and services, while continuing to learn and to apply innovation that improves productivity for individuals and industry.
You can share your Course Certificates in the Certifications section of your LinkedIn profile, on printed resumes, CVs, or other documents.