Posts

Showing posts with the label Data Preprocessing

Machine Learning -3 Syllabus

MACHINE LEARNING Syllabus:

UNIT-1 Introduction: Brief Introduction to Machine Learning, Abstraction and Knowledge Representation, Types of Machine Learning Algorithms, Definition of learning systems, Goals and applications of machine learning, Aspects of developing a learning system, Data Types, training data, concept representation, function approximation. Data Pre-processing: Definition, Steps involved in pre-processing, Techniques.

UNIT-2 Performance measurement of models: Accuracy, Confusion matrix, TPR, FPR, FNR, TNR, Precision, Recall, F1-score, Receiver Operating Characteristic (ROC) curve and AUC. Supervised Learning 1: Linear Regression, Multiple Variable Linear Regression, Naïve Bayes Classifiers, Gradient Descent, Multicollinearity, Bias-Variance trade-off.

UNIT-3 Supervised Learning 2: Regularization, Logistic Regression, Squashing function, KNN, Support Vector Machine. Decision Tree Learning: Representing concepts as decision trees, Recursive induction of decisi

4. Data Preprocessing in Machine learning (Handling Missing values)

1. Importing the libraries 2. Importing the datasets

Now we need to import the datasets we have collected for our machine learning project. Before importing a dataset, we need to set the directory that contains it as the working directory.

read_csv() function: To import the dataset, we use the read_csv() function of the pandas library, which reads a CSV file into a DataFrame that we can then operate on. With this function we can read a CSV file from a local path as well as from a URL.

Handling Missing data: The next step of data preprocessing is to handle missing data in the dataset. If the dataset contains missing data, it can cause serious problems for a machine learning model. Hence it is necessary to handle the missing values present in the dataset.

Operating on Null Values: Pandas treats None and NaN as essentially interchangeable for indicating missing or null values. To facilitate this conven
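The import and missing-value steps described in this post can be sketched roughly as follows. This is a minimal illustration, assuming pandas and NumPy are installed; the sample CSV text and its column names (Country, Age, Salary) are made up for the example and are read from an in-memory string rather than a real file. Dropping rows and filling with the column mean are two common options, not necessarily the ones the full post goes on to use.

```python
import io

import numpy as np
import pandas as pd

# Hypothetical sample data standing in for a real CSV file.
# Empty fields are parsed as NaN by read_csv.
csv_text = """Country,Age,Salary
France,44,72000
Spain,,48000
Germany,30,
Spain,38,61000
"""

# read_csv accepts a local path, a URL, or a file-like object.
df = pd.read_csv(io.StringIO(csv_text))

# Count the missing values in each column.
missing_counts = df.isnull().sum()

# pandas reports both None and NaN as missing.
s = pd.Series([1.0, None, np.nan])
n_missing_in_s = int(s.isnull().sum())

# Option 1: drop every row that has at least one missing value.
dropped = df.dropna()

# Option 2: fill numeric gaps with the mean of their column.
filled = df.fillna(df.mean(numeric_only=True))

print(missing_counts)
print(filled)
```

Dropping rows is the simplest choice but discards data; filling with the mean keeps every row at the cost of slightly distorting the column's distribution.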

3. Data Preprocessing in Machine Learning

Data preprocessing is the process of preparing raw data and making it suitable for a machine learning model. It is the first and a crucial step in creating a machine learning model. In a machine learning project we do not always come across clean, well-formatted data, and before doing any operation on the data it must be cleaned and put into a consistent format. The data preprocessing task does this.

Why do we need Data Preprocessing? Real-world data generally contains noise and missing values, and may be in an unusable format that cannot be fed directly to machine learning models. Data preprocessing cleans the data and makes it suitable for a machine learning model, which also increases the model's accuracy and efficiency.

It involves the steps below:
1. Getting the dataset
2. Importing libraries
3. Importing datasets
4. Finding Missing Data
5. Encoding Categorical