Splitting data in Python with PyJanitor's deconcatenate_column, expand_column, and related functions
Preparation:
1. Make sure a Python interpreter and the pip package manager are installed.
2. Run the following command in a terminal or command prompt to install PyJanitor and pandas:
pip install pyjanitor pandas
Dependencies:
- PyJanitor: a Python package for data cleaning and transformation, built on top of pandas.
- pandas: a Python package for data processing and manipulation.
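To confirm that both packages can be imported before running the example, a quick check along these lines may help. This is a minimal sketch that uses only the standard library; the helper name check_packages is hypothetical. Note that PyJanitor is installed as pyjanitor but imported as janitor.

```python
from importlib.util import find_spec


def check_packages(names):
    """Return a dict mapping each top-level import name to True if it is importable."""
    return {name: find_spec(name) is not None for name in names}


if __name__ == "__main__":
    # "janitor" is the import name of the pyjanitor distribution.
    for name, available in check_packages(["pandas", "janitor"]).items():
        status = "installed" if available else "missing"
        print(f"{name}: {status}")
```

If either package is reported as missing, rerunning the pip command above in the same environment usually resolves it.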
Sample data:
Suppose we have a DataFrame with an address column in the format "Street Name, City, State", and we want to split each address into three separate columns: street, city, and state.
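Before involving a DataFrame, it may help to see the split on a single string. The following is a minimal plain-Python sketch, not PyJanitor code; the function name split_address is hypothetical. It splits on the comma and trims the whitespace that follows each comma, which is easy to overlook when choosing a separator.

```python
def split_address(address, sep=","):
    """Split one 'Street Name, City, State' string into three trimmed parts."""
    parts = [part.strip() for part in address.split(sep)]
    if len(parts) != 3:
        raise ValueError(f"expected 3 parts, got {len(parts)}: {address!r}")
    street, city, state = parts
    return {"street": street, "city": city, "state": state}


print(split_address("123 Main St, CityA, StateX"))
# {'street': '123 Main St', 'city': 'CityA', 'state': 'StateX'}
```

Raising an error on an unexpected number of parts is one possible design choice; for messy real-world data, filling the missing parts with empty values may be preferable.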
Complete code example:
python
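import pandas as pd
import janitor  # registers PyJanitor's methods on pandas DataFrames

# Create a DataFrame containing address information
data = pd.DataFrame({
    'address': ['123 Main St, CityA, StateX',
                '456 Elm St, CityB, StateY',
                '789 Oak St, CityC, StateZ']
})

# Split the address column into named columns with deconcatenate_column.
# (expand_column creates indicator columns and does not take new column names.)
data = data.deconcatenate_column(
    'address',
    sep=', ',
    new_column_names=['street', 'city', 'state'],
)
# deconcatenate_column keeps the original column, so drop it
data = data.drop(columns='address')
print(data)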
Output:
        street   city   state
0  123 Main St  CityA  StateX
1   456 Elm St  CityB  StateY
2   789 Oak St  CityC  StateZ
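For comparison, the same three columns can be produced with pandas alone. The sketch below uses Series.str.split with expand=True as a swapped-in technique, not a PyJanitor function, and assumes every row contains exactly three comma-separated parts.

```python
import pandas as pd

data = pd.DataFrame({
    "address": [
        "123 Main St, CityA, StateX",
        "456 Elm St, CityB, StateY",
        "789 Oak St, CityC, StateZ",
    ]
})

# Split on the comma and expand the pieces into separate columns,
# then strip the surrounding whitespace from each piece.
parts = data["address"].str.split(",", expand=True)
parts = parts.apply(lambda col: col.str.strip())
parts.columns = ["street", "city", "state"]

print(parts)
```

Stripping after the split keeps the result the same whether or not a space follows each comma in the input.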
Summary:
PyJanitor's deconcatenate_column function makes it easy to split a delimited column into several named columns, and related functions such as expand_column cover other reshaping needs. Before using them, set up the environment by installing a Python interpreter and the required libraries. PyJanitor's functions can then be called as DataFrame methods and chained to suit the task at hand, which keeps data transformation code short and readable.