MyAwesomeEDA

This Python module provides a set of tools for exploring and analyzing your dataset. Whether you’re a data scientist, analyst, or enthusiast, this module will help you gain insights into your data quickly and efficiently.

Features

  • Welcome GIF: A fun welcome GIF to kick off your exploration.
  • Basic Dataset Information: Quickly get an overview of the number of observations (rows) and parameters (columns) in your dataset.
  • Data Type Summary: Understand the data types of each column in your dataset.
  • Categorization of Features: Categorize features as numerical, string, or categorical based on a threshold on the number of unique values.
  • Summary Statistics: Get descriptive statistics for numerical features, including mean, standard deviation, minimum, 25th percentile, median, 75th percentile, and maximum values.
  • Outlier Detection: Identify outliers in numerical features using the interquartile range (IQR) method.
  • Missing Values Analysis: Investigate missing values in your dataset, including total missing values, rows with missing values, and columns with missing values.
  • Duplicate Rows Detection: Identify duplicate rows in your dataset.
  • Visualizations: Generate informative visualizations, including a bar plot of missing values by variable, a correlation heatmap for numerical features, and histograms with boxplots for numerical features.
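
To make a few of the checks listed above concrete, here is a minimal sketch of how feature categorization, IQR outlier detection, and the missing-value and duplicate counts could be computed. It assumes the data is held in a pandas DataFrame. The function names (categorize_features, iqr_outliers, missing_and_duplicates), the unique_threshold parameter, and the 1.5 multiplier are illustrative assumptions, not this module's actual API or defaults.

```python
import pandas as pd


def categorize_features(df: pd.DataFrame, unique_threshold: int = 5) -> dict:
    """Split columns into numerical, categorical, and string groups.

    Assumed rule (hypothetical): a column with at most `unique_threshold`
    distinct values is categorical; otherwise numeric dtypes are numerical
    and everything else is treated as string.
    """
    groups = {"numerical": [], "categorical": [], "string": []}
    for col in df.columns:
        if df[col].nunique() <= unique_threshold:
            groups["categorical"].append(col)
        elif pd.api.types.is_numeric_dtype(df[col]):
            groups["numerical"].append(col)
        else:
            groups["string"].append(col)
    return groups


def iqr_outliers(series: pd.Series, k: float = 1.5) -> pd.Series:
    """Return the values lying outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, q3 = series.quantile(0.25), series.quantile(0.75)
    iqr = q3 - q1
    mask = (series < q1 - k * iqr) | (series > q3 + k * iqr)
    return series[mask]


def missing_and_duplicates(df: pd.DataFrame) -> dict:
    """Count missing cells, affected rows and columns, and duplicate rows."""
    na = df.isna()
    return {
        "total_missing": int(na.sum().sum()),
        "rows_with_missing": int(na.any(axis=1).sum()),
        "columns_with_missing": na.columns[na.any(axis=0)].tolist(),
        "duplicate_rows": int(df.duplicated().sum()),
    }


# Small made-up dataset, used only to exercise the helpers above.
df = pd.DataFrame(
    {
        "age": [21, 22, 23, 24, 25, 26, 27, 95],
        "group": ["a", "b", "a", "b", "a", "b", "a", "b"],
        "name": ["ann", "bob", "cy", "dee", "eve", "fay", "gus", None],
    }
)
groups = categorize_features(df, unique_threshold=3)
outliers = iqr_outliers(df["age"])
report = missing_and_duplicates(df)
```

In this toy dataset, age is classified as numerical, group as categorical, and name as string; the value 95 is the only age flagged by the IQR rule, and the single missing name is reported as one missing cell in one row. The categorization rule depends heavily on the chosen threshold, so the value used here should be read as an example rather than a recommendation.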