Explore multivariate statistics through hands-on university projects. Each project delves into real-world datasets, applying statistical techniques like ANOVA, two-factor analysis, and binary logistic regression. Understand data analysis, interpretation, and modeling with R.

Sahar-dev/Multivariate_statistics

Multivariate Statistics University Projects

This repository contains multivariate statistics projects completed as part of university coursework. Each project applies a statistical technique to a real-world dataset and walks through the analysis and its interpretation. Below are summaries of each project and instructions for exploring them.

TP1: Analysis of Variance (ANOVA) - One Factor

Explore the distribution of lead concentration across four substances using one-way Analysis of Variance (ANOVA). Key steps include:

  • Variable Description: Understand the variable "Lead" in relation to the type of product.
  • Preliminary Tests: Verify ANOVA assumptions through preliminary tests.
  • ANOVA Testing: Examine if there's a significant difference in lead concentration among the four substances.
  • Residual Analysis: Check the model residuals to confirm that the ANOVA assumptions hold.
  • Bonferroni Correction: Identify the substance with the highest lead concentration using Bonferroni-corrected multiple comparisons.
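
The TP1 workflow can be sketched in base R as follows. This is a minimal illustration, not the project's script: the data are simulated, the substance labels `A` to `D` and the column names `substance` and `lead` are hypothetical, and the choice of Bartlett's test and the Shapiro-Wilk test for the assumption checks is an assumption, since the summary above does not name the tests used.

```r
# Simulated data for illustration only; this is NOT the project's dataset.
set.seed(42)
substances <- c("A", "B", "C", "D")  # hypothetical substance labels
lead_data <- data.frame(
  substance = factor(rep(substances, each = 10)),
  lead = c(rnorm(10, mean = 5.0), rnorm(10, mean = 5.5),
           rnorm(10, mean = 8.0), rnorm(10, mean = 6.0))
)

# Preliminary test (assumed choice): homogeneity of variances across groups.
bartlett_res <- bartlett.test(lead ~ substance, data = lead_data)

# One-way ANOVA: does mean lead concentration differ among the substances?
fit <- aov(lead ~ substance, data = lead_data)
anova_table <- summary(fit)[[1]]

# Residual analysis (assumed choice): Shapiro-Wilk normality test on residuals.
shapiro_res <- shapiro.test(residuals(fit))

# Pairwise t-tests with Bonferroni-adjusted p-values.
pairwise_res <- pairwise.t.test(lead_data$lead, lead_data$substance,
                                p.adjust.method = "bonferroni")

# Group means, and the substance with the largest mean.
group_means <- tapply(lead_data$lead, lead_data$substance, mean)
top_substance <- names(which.max(group_means))

print(anova_table)
print(pairwise_res)
```

Reading the group means alongside the adjusted pairwise p-values shows whether the substance with the largest mean differs significantly from each of the others.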

TP2: Two-Factor Analysis of Variance

Exercise 1:

Evaluate responses to three treatments at two different doses each. Key objectives include:

  • Variable Analysis: Describe the "Response" variable and assess its normality.
  • Dose-Response Relationship: Investigate the relationship between "Response" and preparations at different doses.
  • Average Response: Compute the mean response for each treatment and compare the treatments.
  • Model 1 Analysis: Analyze the effect of the preparation on the response without considering the dose.
  • Model 2 Analysis: Test the effects of treatment, dose, and their interaction.
  • Subgroup Analysis: Conduct subgroup analysis based on the dose, determining the most effective treatment using pairwise t-tests with Bonferroni correction.
  • ANCOVA: Perform an analysis of covariance (ANCOVA) considering patient age as a covariate.
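
The objectives above can be sketched in base R along these lines. The sketch runs on simulated data, and the names `trial`, `treatment`, `dose`, `age`, and `response`, as well as the factor levels, are hypothetical stand-ins for the project's actual variables. It fits a one-factor model, a two-factor model with interaction, Bonferroni-adjusted pairwise t-tests within each dose, and an ANCOVA with age as the covariate.

```r
# Simulated data for illustration only; this is NOT the project's dataset.
set.seed(1)
trial <- expand.grid(
  treatment = c("T1", "T2", "T3"),  # hypothetical treatment labels
  dose      = c("low", "high"),     # hypothetical dose labels
  replicate = 1:8
)
trial$age <- round(runif(nrow(trial), min = 20, max = 70))
trial$response <- rnorm(nrow(trial), mean = 10) +
  ifelse(trial$dose == "high", 2, 0) + 0.02 * trial$age

# Normality of the response (Shapiro-Wilk is an assumed choice of test).
normality <- shapiro.test(trial$response)

# Mean response for each treatment-by-dose cell.
cell_means <- aggregate(response ~ treatment + dose, data = trial, FUN = mean)

# One-factor model: treatment only, ignoring dose.
model1 <- aov(response ~ treatment, data = trial)

# Two-factor model: treatment, dose, and their interaction.
model2 <- aov(response ~ treatment * dose, data = trial)

# Subgroup analysis: Bonferroni-adjusted pairwise t-tests within each dose.
pairwise_by_dose <- lapply(split(trial, trial$dose), function(d) {
  pairwise.t.test(d$response, d$treatment, p.adjust.method = "bonferroni")
})

# ANCOVA: age enters first because aov() uses sequential sums of squares.
ancova <- aov(response ~ age + treatment * dose, data = trial)

print(summary(model2))
print(summary(ancova))
```

Placing the covariate first in the ANCOVA formula means the treatment and dose effects are tested after adjusting for age.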

TP3: Binary Logistic Regression

Apply binary logistic regression to the "birthwt" dataset, aiming to identify risk factors associated with low birth weight. Tasks include:

  • Data Preprocessing: Transform variables for regression analysis.
  • Dataset Partitioning: Split the dataset into training and testing sets.
  • Regression Modeling: Apply logistic regression with model selection using AIC and stepwise methods.
  • Influential Values: Identify influential values and determine significant variables affecting low birth weight.
  • Odds Ratios and Confidence Intervals: Calculate odds ratios and their confidence intervals.
  • Performance Evaluation: Assess predictive performance using the "blorr" package.
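
The TP3 tasks can be sketched as below, using the `birthwt` dataset shipped with the MASS package. Several choices here are assumptions rather than the project's settings: the 70/30 split, the seed, the 0.5 classification cutoff, the 4/n rule of thumb for Cook's distance, and Wald confidence intervals via `confint.default()`. The performance step uses a base-R confusion matrix in place of the "blorr" package so that the sketch has no extra dependencies.

```r
# Load the birthwt data from MASS (a recommended package shipped with R).
data("birthwt", package = "MASS")
bw <- birthwt

# Preprocessing: drop raw birth weight (the outcome 'low' is derived from it)
# and recode categorical variables as factors.
bw$bwt <- NULL
bw$race <- factor(bw$race, levels = 1:3, labels = c("white", "black", "other"))
for (v in c("smoke", "ht", "ui")) {
  bw[[v]] <- factor(bw[[v]], levels = 0:1, labels = c("no", "yes"))
}

# Partitioning: assumed 70/30 train/test split.
set.seed(123)
train_idx <- sample(nrow(bw), size = round(0.7 * nrow(bw)))
train <- bw[train_idx, ]
test  <- bw[-train_idx, ]

# Full logistic model, then stepwise selection by AIC.
full_model <- glm(low ~ ., data = train, family = binomial)
step_model <- step(full_model, direction = "both", trace = 0)

# Influential observations: Cook's distance above 4/n (a rule of thumb).
cooks <- cooks.distance(step_model)
influential <- which(cooks > 4 / nrow(train))

# Odds ratios with 95% Wald confidence intervals.
odds_ratios <- exp(cbind(OR = coef(step_model), confint.default(step_model)))

# Performance on the test set at an assumed 0.5 cutoff.
pred_prob  <- predict(step_model, newdata = test, type = "response")
pred_class <- as.integer(pred_prob > 0.5)
confusion  <- table(observed = test$low, predicted = pred_class)
accuracy   <- mean(pred_class == test$low)

print(round(odds_ratios, 2))
print(confusion)
```

An odds ratio above 1 whose interval excludes 1 points to a factor associated with higher odds of low birth weight in the fitted model.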

Each project folder contains the code, methodology, and results for its analysis, so you can follow every step from the raw data to the conclusions.

How to Explore

  1. Navigate to each project folder (TP1, TP2, TP3) for detailed documentation and code.
  2. Review the provided README files for specific instructions and objectives.
  3. Dive into the R scripts and notebooks to understand the methodology and analysis steps.
  4. Explore the datasets associated with each project.
  5. Follow each analysis from data to conclusions to see how the technique is applied in practice.

Feel free to use, modify, and learn from the projects. If you have any questions or suggestions, please don't hesitate to reach out. Happy coding!
