vikasvachheta08/Exploratory-Data-Analysis-on-Automobile-Dataset

Exploratory Data Analysis on Automobile Dataset

This EDA is performed using the following tools and techniques :-

  • Performed EDA on an Automobile Dataset.
  • Packages : Pandas and NumPy for Data Manipulation
  • Data Visualization Libraries : Matplotlib and Seaborn
  • IDE : Jupyter Notebook

Table of Contents :-

  • Importing required libraries
  • Preprocessing data in Python
  • Data Cleaning
  • Dealing with Missing Values
  • Data Formatting
  • Univariate Analysis
  • Bivariate Analysis
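
The cleaning, missing-value, and formatting steps listed above could look roughly like the sketch below. The sample rows, the column names, and the use of "?" as the missing-value placeholder are assumptions made for illustration; the notebook itself may handle these steps differently.

```python
import numpy as np
import pandas as pd

# Hypothetical sample rows (assumed column names, invented values).
raw = pd.DataFrame({
    "make": ["toyota", "bmw", "honda", "audi"],
    "num-of-doors": ["four", "two", "?", "four"],
    "price": ["9000", "30000", "?", "17000"],
})

# Data cleaning: treat the assumed "?" placeholder as a missing value.
df = raw.replace("?", np.nan)

# Data formatting: prices arrive as strings, so convert them to numbers.
df["price"] = pd.to_numeric(df["price"])

# Dealing with missing values: fill the numeric gap with the column mean
# and the categorical gap with the most frequent value.
df["price"] = df["price"].fillna(df["price"].mean())
df["num-of-doors"] = df["num-of-doors"].fillna(df["num-of-doors"].mode()[0])

print(df)
```

Filling with the mean and the mode is only one common choice; dropping incomplete rows is another reasonable option when few values are missing.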

Univariate Analysis :-

Univariate analysis is the simplest form of analyzing data. “Uni” means “one”, so the analysis looks at only one variable at a time. It doesn't deal with relationships between variables; its main purpose is to describe. It takes the data, summarizes it, and finds patterns in it.

Vehicle-by-make frequency diagram: Toyota accounts for more vehicles in the dataset than any other make.

About 90% of the cars in the dataset run on gas rather than diesel.
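
Counts and shares like the two above can be obtained with `value_counts`. This is a minimal sketch on invented rows; the column names `make` and `fuel-type` are assumptions, and the numbers it prints are not the dataset's.

```python
import pandas as pd

# Hypothetical sample (assumed column names, invented values).
df = pd.DataFrame({
    "make": ["toyota", "toyota", "toyota", "honda", "honda", "bmw"],
    "fuel-type": ["gas", "gas", "diesel", "gas", "gas", "gas"],
})

# Frequency of each make, most common first.
make_counts = df["make"].value_counts()

# Share of each fuel type as a percentage of all rows.
fuel_share = df["fuel-type"].value_counts(normalize=True) * 100

print(make_counts)
print(fuel_share.round(1))
```

With Matplotlib installed, `make_counts.plot(kind="bar")` would draw the frequency diagram from the same counts.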

Bivariate Analysis :-

Bivariate analysis is one of the simplest forms of quantitative (statistical) analysis. It involves the analysis of two variables (often denoted as X, Y), for the purpose of determining the empirical relationship between them.

Boxplot for make brand and price

Findings: Below are our findings on the make and price of the cars :
  • The most expensive car is manufactured by Mercedes-Benz and the least expensive by Chevrolet.
  • The premium cars costing more than 20000 are BMW, Jaguar, Mercedes-Benz and Porsche.
  • Less expensive cars costing less than 10000 are Chevrolet, Dodge, Honda, Mitsubishi and Plymouth.
  • The rest of the makes fall in the midrange between 10000 and 20000, which has the highest number of cars.
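
The price bands in these findings can also be checked numerically. The sketch below groups invented rows by make and buckets each make by its median price using the 10000 and 20000 thresholds; using the median, the column names, and the tier labels are all assumptions made for illustration.

```python
import pandas as pd

# Hypothetical sample (assumed column names, invented prices).
df = pd.DataFrame({
    "make": ["chevrolet", "chevrolet", "toyota", "toyota", "bmw", "bmw"],
    "price": [5500, 6500, 9000, 15000, 24000, 36000],
})

# Median price per make: the centre line of each box in a boxplot.
median_price = df.groupby("make")["price"].median().sort_values()

# Bucket each make using the 10000 and 20000 thresholds from the findings.
tiers = pd.cut(
    median_price,
    bins=[0, 10000, 20000, float("inf")],
    labels=["budget", "midrange", "premium"],
)

print(tiers)
```

With Seaborn installed, `sns.boxplot(x="make", y="price", data=df)` would draw the corresponding boxplot.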

Positive relationship between price and engine size (as engine size increases, price also increases)
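
One way to quantify a relationship like this is the Pearson correlation coefficient. The sketch below uses invented values and assumed column names, so the coefficient it prints says nothing about the real dataset.

```python
import pandas as pd

# Hypothetical sample (assumed column names, invented values).
df = pd.DataFrame({
    "engine-size": [90, 110, 130, 180, 210],
    "price": [6000, 9000, 13000, 25000, 34000],
})

# Pearson correlation: values near +1 indicate a strong positive
# linear relationship, values near 0 indicate little linear relationship.
r = df["engine-size"].corr(df["price"])

print(round(r, 3))
```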

Drive wheels and City MPG

Drive wheels and Highway MPG
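
A numeric companion to these two plots is the average mileage per drive-wheel type. This is a sketch on invented rows with assumed column names, not the notebook's own code.

```python
import pandas as pd

# Hypothetical sample (assumed column names, invented values).
df = pd.DataFrame({
    "drive-wheels": ["fwd", "fwd", "rwd", "rwd", "4wd"],
    "city-mpg": [30, 32, 20, 22, 24],
    "highway-mpg": [36, 38, 26, 28, 29],
})

# Mean city and highway mileage for each drive-wheel type.
mpg_by_drive = df.groupby("drive-wheels")[["city-mpg", "highway-mpg"]].mean()

print(mpg_by_drive)
```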

Normalized losses based on body style and num of doors

Findings : Normalized loss is the average loss payment per insured vehicle, and it depends on many features of a car, including body style and number of doors. Normalized losses are spread across the different body styles, but two-door cars show higher losses than four-door cars.
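
A comparison like this can be tabulated with a pivot table of mean normalized losses by body style and number of doors. The rows below are invented and shaped only to illustrate the layout; the column names are assumptions.

```python
import pandas as pd

# Hypothetical sample (assumed column names, invented values).
df = pd.DataFrame({
    "body-style": ["sedan", "sedan", "hatchback", "hatchback", "sedan", "hatchback"],
    "num-of-doors": ["two", "four", "two", "four", "four", "two"],
    "normalized-losses": [150, 100, 160, 110, 120, 140],
})

# Mean normalized loss for each (body style, number of doors) pair.
loss_table = df.pivot_table(
    values="normalized-losses",
    index="body-style",
    columns="num-of-doors",
    aggfunc="mean",
)

print(loss_table)
```

Each row of `loss_table` is a body style and each column a door count, so the two-door and four-door averages can be read side by side.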