How to add multiple columns to a pandas DataFrame in one assignment

ghz 3months ago ⋅ 36 views

I'm trying to figure out how to add multiple columns to a pandas DataFrame simultaneously. I would like to do this in one step rather than in several repeated steps.

import numpy as np
import pandas as pd

data = {'col_1': [0, 1, 2, 3],
        'col_2': [4, 5, 6, 7]}
df = pd.DataFrame(data)

I thought this would work here...

df[['column_new_1', 'column_new_2', 'column_new_3']] = [np.nan, 'dogs', 3]

Answers

The issue with your attempt lies in the shape of the right-hand side. Depending on your pandas version, assigning a flat list to several new columns may raise an error. The most portable form gives the right-hand side (the values being assigned) as a DataFrame, a list of lists, or a 2D NumPy array with one row per DataFrame row and one column per new column.

To add multiple columns to a pandas DataFrame simultaneously, you can do the following:

  1. Using a DataFrame: This approach works because the dimensions of the new data match the columns you want to add.
import pandas as pd
import numpy as np

data = {'col_1': [0, 1, 2, 3],
        'col_2': [4, 5, 6, 7]}
df = pd.DataFrame(data)

# Adding multiple columns simultaneously
df[['column_new_1', 'column_new_2', 'column_new_3']] = pd.DataFrame({
    'column_new_1': [np.nan] * len(df),
    'column_new_2': ['dogs'] * len(df),
    'column_new_3': [3] * len(df)
})

print(df)

Explanation:

  • The right-hand side is a DataFrame with the same number of rows, and the same default index, as the original DataFrame, so the three new columns are added in a single assignment.
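
One caveat worth illustrating: when the right-hand side is a DataFrame, pandas lines the rows up by index label, not by position. The sketch below assumes a hypothetical frame with a non-default index (the labels 'a' to 'd' are made up for illustration) and reuses df.index on the right-hand side so the rows match. It is a minimal example, not the only way to do this.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with a non-default index, to show index alignment
df = pd.DataFrame({'col_1': [0, 1, 2, 3],
                   'col_2': [4, 5, 6, 7]},
                  index=['a', 'b', 'c', 'd'])

# Scalars are broadcast down the rows because an index is supplied.
# Reusing df.index keeps the new rows aligned with the existing ones.
new_cols = pd.DataFrame({'column_new_1': np.nan,
                         'column_new_2': 'dogs',
                         'column_new_3': 3},
                        index=df.index)

# One assignment adds all three columns
df[list(new_cols.columns)] = new_cols

print(df)
```

If the right-hand side were built with the default 0..3 index instead, none of its labels would match 'a' to 'd' and the new columns would come out as NaN.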
  2. Using NumPy arrays or lists: You can also use a list of lists or a 2D NumPy array with the correct shape.
# Using a NumPy array
# dtype=object stops NumPy from converting NaN and 3 to the strings 'nan' and '3'
df[['column_new_1', 'column_new_2', 'column_new_3']] = np.array([
    [np.nan, 'dogs', 3],
    [np.nan, 'dogs', 3],
    [np.nan, 'dogs', 3],
    [np.nan, 'dogs', 3]
], dtype=object)

print(df)

In this case, the NumPy array has the shape (4, 3) to match the number of rows and columns you're trying to assign.
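
Typing out every row does not scale past a few rows. As a rough sketch, the same (4, 3) array can be built by repeating one row len(df) times. The names row and values are just illustrative, and dtype=object is used here on the assumption that you want NaN and 3 to keep their original types instead of being turned into strings.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'col_1': [0, 1, 2, 3],
                   'col_2': [4, 5, 6, 7]})

# One row of values, repeated once per DataFrame row
row = [np.nan, 'dogs', 3]
values = np.array([row] * len(df), dtype=object)

# Rows match len(df), columns match the number of new column names
print(values.shape)

df[['column_new_1', 'column_new_2', 'column_new_3']] = values

print(df)
```

The new columns end up with object dtype in this version. If you need a numeric dtype for column_new_3, the DataFrame approach from option 1 keeps a separate dtype per column.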