Week 3 - BALT 4363 - Handling and Cleaning Data with Python Libraries
Chapter 3: Handling and Cleaning Data with Python Libraries

This past week in BALT 4363, I learned about handling and cleaning data with Python libraries. This is a very important topic to understand. During this semester, I have learned how to create nice visuals using RStudio and Python. Creating visuals is not very difficult; the difficulty comes from organizing the data so that visuals can be created from it. The trickiest part is cleaning the data so that the visuals come out better.

Pandas

Pandas is a library that provides easy-to-use, high-performance data structures and data analysis tools. It is very useful for handling large datasets because it offers flexible data manipulation tools. Pandas has two primary data structures: Series and DataFrame. A Series is a one-dimensional labeled array, and a DataFrame is a two-dimensional, table-like data structure.

NumPy

NumPy (Numerical Python) is a library for Python that adds support for large arrays and matrices, while also having a large collection o...
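
To make the difference between these structures more concrete, here is a minimal sketch of a pandas Series, a pandas DataFrame, and a NumPy array. The sample names and numbers are made up for illustration and do not come from the course material.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, invented only for this illustration.
scores = pd.Series([88, 92, 79], name="score")  # one-dimensional

students = pd.DataFrame({                       # two-dimensional (rows and columns)
    "name": ["Ana", "Ben", "Cy"],
    "score": [88, 92, 79],
})

matrix = np.array([[1, 2, 3], [4, 5, 6]])       # NumPy array with 2 rows, 3 columns

print(scores.ndim)     # number of dimensions of the Series
print(students.ndim)   # number of dimensions of the DataFrame
print(matrix.shape)    # (rows, columns) of the NumPy array

# Selecting a single column of a DataFrame gives back a Series.
one_column = students["score"]
print(type(one_column))
```

One way to think about it: a DataFrame behaves like a collection of Series that share the same row labels, which is why picking one column returns a Series.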