
What is Data Science?

Data Science Course in Kochi | Best Data Science Courses

Data science uses scientific methods, processes and algorithms to extract knowledge from data, whether structured or unstructured.

Why Data Science?

Unlike traditional structured data, most of the data generated today is unstructured or semi-structured. Extracting knowledge from such huge volumes of unstructured data requires more complex, advanced analytical tools and algorithms. This is where data science becomes important.

Who is a Data Scientist?

A data scientist uses strong expertise in certain scientific disciplines to crack complex data problems. They take raw data, in both structured and unstructured forms, and extract and present the information it contains in a more useful form.

Data Science Course in Kochi, Kerala

Data Science Syllabus

Module – 1 (Python Basics) Introduction to Python (22 hours)

  • Welcome To The Course
  • Software Installation
  • Jupyter Notebook Tutorial
  • Comments
  • Variables, Operators, Data Types
  • If-Else, For and While Loops
  • Functions
  • Lambda Expression
  • Taking input from keyboard
  • List
  • Tuple
  • Set
  • Dictionary
  • Modules and Packages
  • Objects and Classes
  • File Handling
  • MySQL

 

  • What is Artificial Intelligence
  • Introduction To Data Science
  • Real Time Use Cases Of Data Science
  • Who is a Data Scientist?
  • Data Science Project Lifecycle
  • Skill sets needed for Data Scientist
  • Difference between Data Engineer, Data Scientist and Data Analyst
  • How to Transition into Data Science from Different Backgrounds
  • Machine Learning
  • Supervised vs Unsupervised
  • Deep Learning vs Machine Learning
  • INTERVIEW QUESTIONS ASSIGNMENT-1

 

Module – 2 (Python Advanced) NumPy, Pandas (8 hours)

  • Introduction to NumPy
  • Creating Arrays
  • arange(), linspace(), etc.
  • Creating Arrays of Random Numbers
  • Basic Operations on an Array
  • Applying Universal functions on an array
  • Linear Algebra operations on an array
  • NumPy Data Types
  • Type Conversion
  • Array Stacking

 

  • Introduction to Pandas
  • Creating DataFrames
  • Reading data from CSV, Excel, etc. into a DataFrame and writing a DataFrame to CSV, Excel
  • Selection and Indexing
  • Conditional Selection
  • Groupby
  • Pivot Table
  • Merging, Joining, Concatenation
  • Missing Value Treatment
  • INTERVIEW QUESTIONS ASSIGNMENT-2

 

Module – 3 (Visualisation) Visualisation: Matplotlib, Seaborn, Plotly (3 hours)

  • Line Plots
  • Scatter Plots
  • Pair Plots
  • Histograms
  • Heat Maps
  • Bar Plots
  • Stacked Bar plot
  • Pie chart
  • Box Plots
  • Swarm Plots

 

Module – 4 (Statistics) Statistics (8 hours)

  • Descriptive vs Inferential Statistics
  • Mean, Median, Mode
  • Central Limit Theorem
  • Measures of Dispersion
  • Interquartile Range
  • Variance
  • Standard Deviation
  • Z score
  • Pearson’s Product-Moment Correlation (r)
  • R-square
  • Adjusted R-square
  • Normal Distribution
  • Standard Normal Distribution
  • Empirical rule of Normal Distribution
  • What is an Outlier
  • Outlier Detection and Removal
  • Exploratory Data Analysis
  • INTERVIEW QUESTIONS ASSIGNMENT-3
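
As a rough illustration of the Z score, Interquartile Range and Outlier Detection topics listed above, here is a minimal sketch using only the Python standard library. The function names, the sample data and the 1.5 × IQR cutoff are assumptions chosen for the example, not material taken from the course.

```python
import statistics


def z_scores(values):
    """Z score of each value: (x - mean) / population standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(x - mean) / std for x in values]


def iqr_outliers(values, k=1.5):
    """Values lying outside [Q1 - k*IQR, Q3 + k*IQR] (k=1.5 is a common convention)."""
    q1, _median, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [x for x in values if x < low or x > high]


# Hypothetical sample with one obviously extreme reading.
data = [10, 12, 11, 13, 12, 11, 95]
print(z_scores(data))
print(iqr_outliers(data))
```

Quartiles can be computed with several interpolation methods, so other tools may report slightly different Q1 and Q3 values for the same data.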

 

Module – 5 (ML-Linear Regression) Linear Regression, Cost Function, Gradient Descent (10 hours)

  • Introduction to Machine Learning
  • Supervised vs Unsupervised
  • Regression vs Classification
  • Bias and Variance tradeoff
  • Cross Validation
  • Linear Regression Theory
  • Gradients/Derivative Theory
  • Assumptions of Linear Regression
  • Cost Function
  • Optimize Cost function using Gradient Descent
  • Mathematical Derivation
  • Multicollinearity
  • MAE
  • MSE
  • RMSE
  • Multiple Linear Regression
  • Polynomial Regression
  • INTERVIEW QUESTIONS ASSIGNMENT-4
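
To give a feel for the Cost Function, Gradient Descent and MAE/MSE/RMSE topics above, the sketch below fits a straight line y = w*x + b by batch gradient descent on the mean squared error. It is a simplified illustration in plain Python; the learning rate, epoch count, function names and toy data are all assumptions made for the example.

```python
import math


def fit_line(xs, ys, lr=0.05, epochs=2000):
    """Fit y = w*x + b by batch gradient descent on the MSE cost."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Prediction error for every point under the current parameters.
        errors = [(w * x + b) - y for x, y in zip(xs, ys)]
        # Partial derivatives of MSE = (1/n) * sum(error^2) w.r.t. w and b.
        grad_w = (2 / n) * sum(e * x for e, x in zip(errors, xs))
        grad_b = (2 / n) * sum(errors)
        # Step against the gradient.
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


def regression_errors(xs, ys, w, b):
    """Return (MAE, MSE, RMSE) of the line y = w*x + b on the given points."""
    residuals = [y - (w * x + b) for x, y in zip(xs, ys)]
    mae = sum(abs(r) for r in residuals) / len(residuals)
    mse = sum(r * r for r in residuals) / len(residuals)
    return mae, mse, math.sqrt(mse)


# Toy data generated from y = 2x + 1, so the fit should approach w=2, b=1.
xs = [0, 1, 2, 3, 4]
ys = [1, 3, 5, 7, 9]
w, b = fit_line(xs, ys)
print(w, b)
print(regression_errors(xs, ys, w, b))
```

A learning rate that is too large makes the updates diverge, and one that is too small needs many more epochs; 0.05 happens to work for this small data set.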

 

Module – 6 (ML-Logistic Regression, Algorithm Validation) Logistic Regression (8 hours)

  • Logistic Regression Theory
  • Logistic function
  • Classification Algorithm Validation
  • Confusion Matrix
  • Classification Report
  • Recall
  • Precision
  • AUC
  • ROC
  • INTERVIEW QUESTIONS ASSIGNMENT-5
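
The validation topics above (Confusion Matrix, Precision, Recall) can be sketched in a few lines of plain Python. The example below is illustrative only: the labels and predictions are made up, the helper names are hypothetical, and class 1 is assumed to be the positive class.

```python
import math


def logistic(z):
    """Logistic (sigmoid) function: maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def confusion_counts(y_true, y_pred, positive=1):
    """Return (TP, FP, FN, TN) for a binary classification problem."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def precision_recall(y_true, y_pred, positive=1):
    """Precision = TP / (TP + FP); recall = TP / (TP + FN). Returns 0.0 when undefined."""
    tp, fp, fn, _tn = confusion_counts(y_true, y_pred, positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Made-up labels and predictions for eight samples.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0, 1, 0]
print(confusion_counts(y_true, y_pred))
print(precision_recall(y_true, y_pred))
print(logistic(0.0))
```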

 

Module – 7 (ML-Naive Bayes, SVM) Naive Bayes, SVM (6 hours)

  • Naive Bayes classification
  • Bayes Theorem
  • Support Vector Machine (SVM)
  • Support Vectors
  • Kernel Trick

 

Module – 8 (Decision Tree, Random Forest) Decision Tree (6 hours)

  • What is ID3 Algorithm
  • Entropy
  • Calculating Information Gain
  • Overfitting, Underfitting, Best fit
  • Random Forest
  • What is Bootstrap
  • Bagging
  • Difference between Random Forest and Decision Tree
  • Feature Selection using Random Forest
  • Hyperparameter tuning
  • INTERVIEW QUESTIONS ASSIGNMENT-6
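
As a small illustration of the Entropy and Information Gain topics above, the following sketch computes both for a categorical feature, which is the quantity ID3 uses to choose a split. The function names and the toy labels are assumptions for the example; this is not a full decision tree implementation.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(labels, feature_values):
    """Entropy of the labels minus the weighted entropy after splitting on the feature."""
    n = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Toy data: the first feature separates the classes perfectly, the second not at all.
labels = ["yes", "yes", "no", "no"]
perfect_feature = ["a", "a", "b", "b"]
useless_feature = ["a", "b", "a", "b"]
print(entropy(labels))
print(information_gain(labels, perfect_feature))
print(information_gain(labels, useless_feature))
```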

 

Module – 9 (KMeans) KMeans Clustering (2 hours)

  • Introduction to Unsupervised Machine Learning
  • KMeans Theory
  • How to decide K in KMeans

 

Module – 10 (PCA, Recommendation Systems) Principal Component Analysis (6 hours)

  • Introduction to Dimensionality Reduction
  • PCA Theory discussion
  • Eigenvalues, Eigenvectors
  • Step-by-Step Detailed Mathematical Derivation
  • Singular Value Decomposition
  • Recommendation Systems
  • Content-Based Filtering
  • Collaborative Filtering
  • INTERVIEW QUESTIONS ASSIGNMENT-7

 

Module – 11 (NLP) Text Mining (3 hours)

  • Introduction to NLP
  • Text Preprocessing Techniques using spaCy and NLTK
  • Word Tokens
  • Document Similarity
  • StopWord Removal
  • Lemmatization
  • Stemming
  • Count Vectorizer
  • Tf-Idf Vectorizer
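
The ideas behind Word Tokens, StopWord Removal, Count Vectorizer and Document Similarity can be sketched without any NLP library. The example below is a simplified stand-in written in plain Python, not the spaCy, NLTK or scikit-learn implementations; the tokenizer regex, the tiny stop-word set and the sample sentences are assumptions made for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text, stop_words=frozenset()):
    """Lowercase the text, split it into alphanumeric tokens and drop stop words."""
    return [t for t in re.findall(r"[a-z0-9]+", text.lower()) if t not in stop_words]


def count_vectors(docs, stop_words=frozenset()):
    """Return (vocabulary, vectors): one term-count vector per document."""
    tokenized = [tokenize(d, stop_words) for d in docs]
    vocab = sorted({t for tokens in tokenized for t in tokens})
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append([counts[term] for term in vocab])
    return vocab, vectors


def cosine_similarity(u, v):
    """Cosine of the angle between two count vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


docs = ["The cat sat on the mat", "The dog sat"]
vocab, vectors = count_vectors(docs, stop_words=frozenset({"the", "on"}))
print(vocab)
print(vectors)
print(cosine_similarity(vectors[0], vectors[1]))
```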