TITANIC - DATA VISUALIZATION

Allu Daddy Durga Praveen
4 min read · Apr 24, 2021

Introduction: -

The RMS Titanic sank in the early morning hours of 15 April 1912 in the North Atlantic Ocean, four days into her maiden voyage from Southampton to New York City. The largest ocean liner in service at the time, Titanic had an estimated 2,224 people on board when she struck an iceberg at around 23:40 (ship’s time) on Sunday, 14 April 1912. Her sinking two hours and forty minutes later at 02:20 (ship’s time; 05:18 GMT) on Monday, 15 April, resulted in the deaths of more than 1,500 people, making it one of the deadliest peacetime maritime disasters in history.

Titanic sank with over a thousand passengers and crew still on board. Almost all of those who jumped or fell into the water drowned or died within minutes due to the effects of cold shock and incapacitation. RMS Carpathia arrived about an hour and a half after the sinking and rescued all of the survivors by 09:15 on 15 April, some nine and a half hours after the collision. The disaster shocked the world and caused widespread outrage over the lack of lifeboats, lax regulations, and the unequal treatment of the three passenger classes during the evacuation. Subsequent inquiries recommended sweeping changes to maritime regulations, leading to the establishment in 1914 of the International Convention for the Safety of Life at Sea (SOLAS).

Exploratory Data Analysis:-

Let’s start with a simple visualization of this data. The dataset was downloaded from Kaggle and is then represented in various graphs.

First, we import the packages used for data visualization: Pandas, NumPy, Matplotlib, and Seaborn.

Pandas: An open-source Python package widely used for data science, data analysis, and machine learning tasks.

NumPy: It provides support for multi-dimensional arrays and fast numerical operations on them.

Matplotlib: A plotting library. Its pyplot interface creates a figure, creates a plotting area in the figure, plots lines in that area, decorates the plot with labels, etc.

Seaborn: It is a Python data visualization library based on Matplotlib. It provides a high-level interface for drawing attractive and informative statistical graphics.
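
The four packages are usually imported under their conventional aliases. The snippet below is a minimal sketch of that import cell; the `sns.set_theme` call and its `whitegrid` style are an optional assumption, not something the original notebook is known to use.

```python
import numpy as np               # multi-dimensional arrays
import pandas as pd              # tabular data (DataFrame)
import matplotlib.pyplot as plt  # figure / axes level plotting
import seaborn as sns            # statistical plots built on matplotlib

# Optional (assumed): apply seaborn's default theme to all later plots.
sns.set_theme(style="whitegrid")
```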

Reading the Dataset :

We read the Titanic dataset into a pandas data frame.
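
A rough sketch of this step is shown below. To keep it runnable without the download, it reads a few made-up rows from an in-memory string instead of the real file; the column subset and the file name `train.csv` mentioned in the comment are assumptions about the Kaggle download, not taken from the original notebook.

```python
import io
import pandas as pd

# Stand-in for the Kaggle file: made-up rows in a Titanic-like column layout.
# With the real download, pass the file path instead, e.g. pd.read_csv("train.csv").
sample_csv = io.StringIO(
    "PassengerId,Survived,Pclass,Sex,Age,Fare,Cabin,Embarked\n"
    "1,0,3,male,22,7.25,,S\n"
    "2,1,1,female,38,71.28,C85,C\n"
    "3,1,3,female,,7.92,,S\n"
    "4,0,2,male,35,13.00,,\n"
)

df = pd.read_csv(sample_csv)  # empty fields are parsed as NaN

print(df.shape)   # (rows, columns)
print(df.head())  # first few rows
df.info()         # dtypes and non-null counts per column
```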

Finding NaN values : [Heat map]

Using a seaborn heat map, we can spot the null values in the dataset visually.
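
One common way to draw such a heat map is to pass the boolean mask from `isnull()` to `sns.heatmap`, so that each missing cell shows up as a contrasting stripe. The sketch below assumes a tiny made-up data frame; the `viridis` colour map and the hidden colour bar are illustrative choices.

```python
import numpy as np
import pandas as pd
import seaborn as sns

# Made-up sample with some missing cells.
df = pd.DataFrame({
    "Age": [22.0, np.nan, 26.0, 35.0, np.nan],
    "Cabin": [None, "C85", None, None, "E46"],
    "Embarked": ["S", "C", "S", None, "Q"],
    "Fare": [7.25, 71.28, 7.92, 13.0, 8.05],
})

null_mask = df.isnull()  # True where a value is missing

# Each row is a passenger, each column a feature; missing cells stand out.
ax = sns.heatmap(null_mask, cbar=False, yticklabels=False, cmap="viridis")
ax.set_title("Missing values per column")

print(null_mask.sum())  # number of missing values in each column
```

In a notebook the figure renders automatically; in a script, call `matplotlib.pyplot.show()` afterwards.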

Filling NaN values :

We can fill the NA values in the dataset using the column mean, forward fill (ffill), or backward fill (bfill).
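
The three strategies can be sketched as follows. Which column received which strategy in the original notebook is not stated, so the pairing below (mean for `Age`, forward fill for `Embarked`, backward fill for `Cabin`) is only an assumption on made-up data.

```python
import numpy as np
import pandas as pd

# Made-up sample with gaps in every column.
df = pd.DataFrame({
    "Age": [22.0, np.nan, 26.0, np.nan],
    "Embarked": ["S", None, "C", "Q"],
    "Cabin": [None, "C85", None, "E46"],
})

# Mean: replace missing ages with the average of the known ages.
df["Age"] = df["Age"].fillna(df["Age"].mean())

# Forward fill: copy the previous row's value into the gap.
df["Embarked"] = df["Embarked"].ffill()

# Backward fill: copy the next row's value into the gap.
df["Cabin"] = df["Cabin"].bfill()

print(df)
```

Note that forward fill cannot fill a gap in the first row and backward fill cannot fill one in the last row, which is one reason to combine the two.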

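Re-verifying the NA values : [Heat map]

Besides redrawing the heat map, we can check for remaining NA values with “isna().any()”, which reports for each column whether it still contains any NA value.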
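
A small sketch of that check, on a made-up frame in which one column is clean and one still has a gap:

```python
import pandas as pd

# Made-up sample: "Age" is complete, "Cabin" still has a missing value.
df = pd.DataFrame({
    "Age": [22.0, 24.0, 26.0],
    "Cabin": [None, "C85", "E46"],
})

remaining = df.isna().any()  # one boolean per column
print(remaining)

print(df.isna().sum())       # how many values are missing per column
```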

Dropping the unnecessary labels :

By using “drop()” we can remove the unnecessary labels (columns).
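
As an illustration, the sketch below drops a few identifier-like columns from a made-up frame. The list of columns to drop is hypothetical; the original does not say which labels were removed.

```python
import pandas as pd

# Made-up sample with some columns that are not used in the plots.
df = pd.DataFrame({
    "PassengerId": [1, 2],
    "Name": ["Passenger A", "Passenger B"],
    "Ticket": ["T-001", "T-002"],
    "Cabin": [None, "C85"],
    "Sex": ["male", "female"],
    "Age": [22.0, 38.0],
})

# Hypothetical choice of columns to remove.
cols_to_drop = ["PassengerId", "Name", "Ticket", "Cabin"]

# drop() returns a new data frame; the original is left unchanged.
df = df.drop(columns=cols_to_drop)
print(df.columns.tolist())
```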

GRAPHS:-

Sex Vs Age using Bar-Graph :
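
A plot of this kind can be sketched with `sns.barplot`, which by default draws the mean of the y variable for each category together with an error bar. The data below is made up, and putting `Sex` on the x axis and `Age` on the y axis is an assumption about the original figure.

```python
import pandas as pd
import seaborn as sns

# Made-up sample.
df = pd.DataFrame({
    "Sex": ["male", "female", "male", "female", "male", "female"],
    "Age": [22, 38, 35, 26, 54, 4],
})

# One bar per sex; bar height is the mean age in that group.
ax = sns.barplot(data=df, x="Sex", y="Age")
ax.set_title("Mean age by sex")
```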

Sex Vs Age Vs Pclass using Bar-Plot :
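
Adding a third variable is typically done through the `hue` parameter, which splits each bar into one bar per hue level. The sketch below uses made-up data and assumes `Pclass` was the hue variable.

```python
import pandas as pd
import seaborn as sns

# Made-up sample: two passengers for each (Sex, Pclass) combination.
df = pd.DataFrame({
    "Sex": ["male"] * 6 + ["female"] * 6,
    "Pclass": [1, 1, 2, 2, 3, 3] * 2,
    "Age": [40, 52, 30, 36, 20, 27, 35, 48, 28, 33, 18, 24],
})

# Bars are grouped by sex and coloured by passenger class.
ax = sns.barplot(data=df, x="Sex", y="Age", hue="Pclass")
ax.set_title("Mean age by sex and passenger class")
```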

Sex Vs Age Vs Pclass using Violin-Plot :
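
The same three variables can be passed to `sns.violinplot`, which shows the shape of the age distribution in each group rather than only its mean. This is a sketch on made-up data; the axis and hue assignment mirrors the bar plot and is an assumption.

```python
import pandas as pd
import seaborn as sns

# Made-up sample: two passengers for each (Sex, Pclass) combination.
df = pd.DataFrame({
    "Sex": ["male"] * 6 + ["female"] * 6,
    "Pclass": [1, 1, 2, 2, 3, 3] * 2,
    "Age": [40, 52, 30, 36, 20, 27, 35, 48, 28, 33, 18, 24],
})

# One violin per (Sex, Pclass) pair; width reflects the estimated density of ages.
ax = sns.violinplot(data=df, x="Sex", y="Age", hue="Pclass")
ax.set_title("Age distribution by sex and passenger class")
```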

Classifying Age using Distplot :

Survived Vs Age Vs Pclass using Bar-Plot :
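
One possible layout for this plot puts `Pclass` on the x axis, mean `Age` on the y axis, and colours the bars by `Survived`. That assignment is an assumption, since the original figure is not available, and the data is made up.

```python
import pandas as pd
import seaborn as sns

# Made-up sample: Survived is 1 for survivors and 0 otherwise.
df = pd.DataFrame({
    "Pclass": [1, 1, 1, 2, 2, 2, 3, 3, 3, 3],
    "Survived": [1, 1, 0, 1, 0, 0, 1, 0, 0, 1],
    "Age": [38, 45, 60, 29, 34, 41, 19, 25, 30, 8],
})

# For each class, compare the mean age of survivors and non-survivors.
ax = sns.barplot(data=df, x="Pclass", y="Age", hue="Survived")
ax.set_title("Mean age by passenger class and survival")
```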

Result: -

From the above visualizations, we can conclude that the survivors were mostly the elderly and children, passengers in Pclass 1, and females.
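
A conclusion like this can also be checked numerically by computing survival rates per group. The sketch below shows the idea on made-up rows, so its numbers say nothing about the real dataset; the age bands and their cut points are hypothetical.

```python
import pandas as pd

# Made-up sample; the resulting rates are NOT real Titanic figures.
df = pd.DataFrame({
    "Sex": ["male", "female", "male", "female", "male", "female"],
    "Pclass": [3, 1, 2, 3, 1, 2],
    "Age": [22, 38, 35, 4, 65, 28],
    "Survived": [0, 1, 0, 1, 1, 1],
})

# The mean of a 0/1 column is the share of survivors in each group.
by_sex = df.groupby("Sex")["Survived"].mean()
by_class = df.groupby("Pclass")["Survived"].mean()

# Hypothetical age bands: child (up to 12), adult (13 to 60), senior (over 60).
df["AgeBand"] = pd.cut(df["Age"], bins=[0, 12, 60, 120],
                       labels=["child", "adult", "senior"])
by_age = df.groupby("AgeBand", observed=True)["Survived"].mean()

print(by_sex, by_class, by_age, sep="\n\n")
```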
