This project analyzes a dataset of data science job postings to identify the factors that most influence salaries, and builds predictive models that estimate salaries for data science professionals from factors such as experience level, education, and skills.

ShreyaPatil1199/Data_Science_Salary_Prediction

Data_Science_Salary_Prediction

License: CC0-1.0

Objective

This project conducts a regression analysis of data science job salaries. It aims to:

1. Explore and Analyze Data: Collect and preprocess job salary data to gain insights into trends and patterns within the data science job market.

2. Build Regression Models: Develop regression models to predict salaries based on various features, such as job title, location, experience, and skills.

3. Evaluate Algorithms: Compare and evaluate different regression algorithms to identify the most effective models for salary prediction.

4. Provide Insights: Share meaningful insights and conclusions derived from the analysis, helping job seekers, employers, and policymakers make informed decisions.

By achieving these objectives, this project aims to empower stakeholders in the data science job market with valuable insights, enhance predictive modelling skills, and contribute to the broader data science community.
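As a rough illustration of objective 3, the sketch below compares two very simple predictors on a held-out set using mean absolute error (MAE) and root mean squared error (RMSE). It is not the project's modelling code: the helper names, the per-experience-level predictor, and all numbers are hypothetical and made up for the example. A real comparison would swap in actual regression models and the project's data.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean


def mae(actual, predicted):
    """Mean absolute error between two equal-length sequences."""
    return mean(abs(a - p) for a, p in zip(actual, predicted))


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    return sqrt(mean((a - p) ** 2 for a, p in zip(actual, predicted)))


def fit_group_means(levels, salaries):
    """Learn the mean salary per experience level, plus a global fallback."""
    groups = defaultdict(list)
    for level, salary in zip(levels, salaries):
        groups[level].append(salary)
    per_level = {level: mean(values) for level, values in groups.items()}
    return per_level, mean(salaries)


# Made-up training and test data (illustrative only, not from the dataset).
train_levels = ["Junior", "Junior", "Senior", "Senior"]
train_salaries = [50_000, 60_000, 120_000, 140_000]
test_levels = ["Junior", "Senior"]
test_salaries = [58_000, 125_000]

group_means, global_mean = fit_group_means(train_levels, train_salaries)

# Predictor 1: always predict the global training mean.
baseline_pred = [global_mean] * len(test_salaries)
# Predictor 2: predict the training mean of the matching experience level.
group_pred = [group_means.get(level, global_mean) for level in test_levels]

for name, pred in [("global mean", baseline_pred), ("per-level mean", group_pred)]:
    print(f"{name}: MAE={mae(test_salaries, pred):.0f} RMSE={rmse(test_salaries, pred):.0f}")
```

Scoring every candidate on the same held-out rows with the same metrics keeps the comparison fair. A trivial baseline such as the global mean gives a floor that any regression model should beat.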

Prerequisites

To run this analysis, you need:

Python 3

Jupyter Notebook (optional)

Pandas

Matplotlib (for data visualization)

Seaborn (for enhanced data visualization)
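One possible way to confirm that the libraries above are available before opening the notebook is sketched below. It assumes the packages are installed under their usual PyPI distribution names (`pandas`, `matplotlib`, `seaborn`); the helper `installed_versions` is a hypothetical name introduced for this example.

```python
from importlib import metadata

# Distribution names as commonly published on PyPI (an assumption;
# adjust if your environment uses different names).
REQUIRED = ["pandas", "matplotlib", "seaborn"]


def installed_versions(packages):
    """Return {package: version string, or None if it is not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions(REQUIRED).items():
        print(f"{name}: {version or 'NOT INSTALLED'}")
```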

Data Description

1. Unnamed

Data Type: Integer (int64)

Description: An index or identifier for each data record.

2. work_year

Data Type: Integer (int64)

Description: The year in which the job information was recorded or applicable.

3. experience_level

Data Type: Object (String)

Description: The level of experience required or possessed for the job, categorized into different levels (e.g., Junior, Mid-Level, Senior).

4. employment_type

Data Type: Object (String)

Description: The type of employment associated with the job (e.g., Full-Time, Part-Time, Contract, etc.).

5. job_title

Data Type: Object (String)

Description: The title or name of the job position.

6. salary

Data Type: Integer (int64)

Description: The salary associated with the job position, denominated in the local currency.

7. salary_currency

Data Type: Object (String)

Description: The currency in which the salary is denominated.

8. salary_in_usd

Data Type: Integer (int64)

Description: The salary converted to United States dollars (USD), so that records in different currencies can be compared.

9. employee_residence

Data Type: Object (String)

Description: The location or residence of the employee, often specified by country or region.

10. remote_ratio

Data Type: Integer (int64)

Description: The percentage of work done remotely for the job position.

11. company_location

Data Type: Object (String)

Description: The location of the company or employer, often specified by country or city.

12. company_size

Data Type: Object (String)

Description: The size category of the company, typically categorized by the number of employees (e.g., Small, Medium, Large).
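The schema above can be exercised with a short pandas sketch. It makes several assumptions that should be checked against the real file: the index column arrives under a name beginning with `Unnamed` (pandas assigns `Unnamed: 0` to a header cell that is empty), the local-currency column is named `salary` in lowercase, and `salary_in_usd` is the prediction target. The function names and the three sample rows are hypothetical and made up for illustration.

```python
import io

import pandas as pd

# Made-up rows for illustration only; they are NOT taken from the project's dataset.
# The empty first header cell mimics a CSV saved with its index, which pandas
# reads back as a column named "Unnamed: 0".
SAMPLE_CSV = """\
,work_year,experience_level,employment_type,job_title,salary,salary_currency,salary_in_usd,employee_residence,remote_ratio,company_location,company_size
0,2021,Junior,Full-Time,Data Analyst,50000,USD,50000,US,0,US,Small
1,2021,Senior,Full-Time,Data Scientist,120000,USD,120000,US,100,US,Large
2,2022,Mid-Level,Contract,Data Engineer,70000,EUR,75000,DE,50,DE,Medium
"""


def load_salaries(source):
    """Read the CSV and drop the leftover index column(s)."""
    df = pd.read_csv(source)
    keep = ~df.columns.str.startswith("Unnamed")
    return df.loc[:, keep]


def split_features_target(df):
    """Separate the USD salary (assumed target) from the remaining features.

    The local-currency salary and its currency code are dropped as well,
    because together they restate the target (a design choice, not a rule).
    """
    target = df["salary_in_usd"]
    features = df.drop(columns=["salary", "salary_currency", "salary_in_usd"])
    return features, target


if __name__ == "__main__":
    salaries = load_salaries(io.StringIO(SAMPLE_CSV))
    print(salaries.dtypes)
    X, y = split_features_target(salaries)
    print(X.columns.tolist())
```

With a real file, `load_salaries` would be called with the CSV path instead of the in-memory sample, and the printed dtypes can be compared against the data types listed above.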
