I'm trying to figure out how to add multiple columns to a pandas DataFrame simultaneously. I would like to do this in one step rather than in several repeated steps.
import pandas as pd
import numpy as np
data = {'col_1': [0, 1, 2, 3],
        'col_2': [4, 5, 6, 7]}
df = pd.DataFrame(data)
I thought this would work here...
df[['column_new_1', 'column_new_2', 'column_new_3']] = [np.nan, 'dogs', 3]
Answers
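The issue with your attempt lies in how you're assigning multiple values to multiple new columns. In older pandas versions, list-style column assignment expects the right-hand side to be a DataFrame, a list of lists, or a 2D NumPy array whose shape matches the target (one row per existing row, one column per new column). Newer versions may accept the flat list and broadcast each value down its column, so the result depends on the version you have installed.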
To add multiple columns to a pandas DataFrame simultaneously, you can do the following:
- Using a DataFrame: This approach works because the dimensions of the new data match the columns you want to add.
import pandas as pd
import numpy as np
data = {'col_1': [0, 1, 2, 3],
        'col_2': [4, 5, 6, 7]}
df = pd.DataFrame(data)
# Adding multiple columns simultaneously
df[['column_new_1', 'column_new_2', 'column_new_3']] = pd.DataFrame({
    'column_new_1': [np.nan] * len(df),
    'column_new_2': ['dogs'] * len(df),
    'column_new_3': [3] * len(df)
})
print(df)
Explanation:
- You create a DataFrame on the right-hand side with the same number of rows (and the same index) as the original DataFrame, because pandas aligns the new columns on the index. All three columns are then added in a single assignment.
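If your DataFrame does not use the default 0..n-1 index, the right-hand DataFrame needs that same index or the new columns can come out as NaN. Here is a minimal sketch of that case; the letter index and the `new_cols` name are just assumptions for illustration:

```python
import numpy as np
import pandas as pd

# Same data as in the question, but with a non-default (letter) index.
df = pd.DataFrame(
    {'col_1': [0, 1, 2, 3], 'col_2': [4, 5, 6, 7]},
    index=list('abcd'),
)

# Build the new columns on df's own index; scalar values are repeated for every row.
new_cols = pd.DataFrame(
    {'column_new_1': np.nan, 'column_new_2': 'dogs', 'column_new_3': 3},
    index=df.index,
)

# Columns are matched by position, rows are aligned on the index.
df[['column_new_1', 'column_new_2', 'column_new_3']] = new_cols
print(df)
```

Passing `index=df.index` also saves you from writing `[value] * len(df)` for each column.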
- Using NumPy Arrays or Lists: You can also use a list of lists or a 2D NumPy array with the correct shape.
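# Using a 2D NumPy array; dtype=object stops NumPy from converting every value to a string
df[['column_new_1', 'column_new_2', 'column_new_3']] = np.array([
    [np.nan, 'dogs', 3],
    [np.nan, 'dogs', 3],
    [np.nan, 'dogs', 3],
    [np.nan, 'dogs', 3]
], dtype=object)
print(df)
In this case, the NumPy array has shape (4, 3), matching the number of rows in df and the number of columns you're assigning. Without dtype=object, NumPy would coerce the mixed values to a single string dtype, so np.nan and 3 would be stored as the strings 'nan' and '3'.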
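One thing to watch with the array approach is that a plain NumPy array holds a single dtype. If you'd rather have each new column keep its own dtype, `DataFrame.assign` is another option, since it takes one keyword per new column and repeats scalar values for every row. A rough sketch of both points, where the `coerced` name is only for illustration:

```python
import numpy as np
import pandas as pd

# Mixed-type rows in a plain NumPy array share one dtype,
# so NaN and 3 are stored as strings here.
coerced = np.array([[np.nan, 'dogs', 3]] * 4)
print(coerced.dtype.kind)  # 'U' means a Unicode string array

df = pd.DataFrame({'col_1': [0, 1, 2, 3], 'col_2': [4, 5, 6, 7]})

# assign() returns a new DataFrame with the extra columns added in one call;
# each scalar is broadcast down the rows and keeps its own dtype.
df = df.assign(column_new_1=np.nan, column_new_2='dogs', column_new_3=3)
print(df.dtypes)
```

Note that `assign` returns a new DataFrame rather than modifying `df` in place, which is why the result is assigned back to `df`.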