Multiplying a pandas datetime column by a number

ghz yesterday ⋅ 1 view

I have a dataframe with a date column stored as strings, like this:

>>> df2
       date     a    b
0  2020/1/1   8.0  5.0
1  2020/1/2  10.0  7.0
2  2020/1/3   6.0  1.0
3  2020/1/4   6.0  3.0

I want to use its 'date' column to generate a new index of varying length by multiplying it by an array, like this:

>>> idx_list = [2,3,1,2]
>>> df2.date*idx_list

but I got an unexpected result:

>>> df2.date*idx_list
0            2020/1/12020/1/1
1    2020/1/22020/1/22020/1/2
2                    2020/1/3
3            2020/1/42020/1/4

Is there a way to build a new index series in which each date is repeated sequentially, like:

0 2020/1/1
1 2020/1/1
2 2020/1/2
3 2020/1/2
4 2020/1/2
5 2020/1/3
6 2020/1/4
7 2020/1/4

Answer

Multiplying the column will not produce the result you want. Because the date column holds strings, the * operator repeats each string in place (the same way '2020/1/1' * 2 does in plain Python), so every row ends up as one concatenated string instead of several rows. To get each date repeated as separate rows, use np.repeat(), which repeats each element of the date column the number of times given by the matching entry of idx_list.
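To see why the multiplication in the question returned concatenated strings, here is a minimal sketch. It assumes the dates are held as object-dtype strings (the explicit dtype=object and the names dates and product are only for illustration):

```python
import pandas as pd

# Assumption: the column holds plain Python strings (object dtype).
dates = pd.Series(['2020/1/1', '2020/1/2', '2020/1/3', '2020/1/4'], dtype=object)
idx_list = [2, 3, 1, 2]

# Plain Python: a string times an integer repeats the string.
single = '2020/1/1' * 2

# On a Series of strings, * applies the same rule row by row,
# so the result still has 4 rows, each holding one longer string.
product = dates * idx_list

print(single)
print(product.tolist())
print(len(product))
```

The row count never changes under this operation, which is why a function that repeats elements is needed instead.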

Here's a step-by-step solution:

import pandas as pd
import numpy as np

# Create your DataFrame
data = {
    'date': ['2020/1/1', '2020/1/2', '2020/1/3', '2020/1/4'],
    'a': [8.0, 10.0, 6.0, 6.0],
    'b': [5.0, 7.0, 1.0, 3.0]
}
df2 = pd.DataFrame(data)

# Convert 'date' column to datetime
df2['date'] = pd.to_datetime(df2['date'])

# Define the idx_list
idx_list = [2, 3, 1, 2]

# Use np.repeat() to repeat the dates according to idx_list
new_dates = np.repeat(df2['date'].values, idx_list)

# Build a new DataFrame from the repeated columns (it gets a fresh 0..7 index)
new_df = pd.DataFrame({
    'date': new_dates,
    'a': np.repeat(df2['a'].values, idx_list),
    'b': np.repeat(df2['b'].values, idx_list)
})

print(new_df)

Explanation:

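  1. Convert date column to datetime: We first convert the date column to a datetime type. This is optional for the repetition itself, since np.repeat() works on the string values too, but it gives you real datetime values for any later sorting or date arithmetic.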
  2. Use np.repeat(): The np.repeat() function repeats the date column values based on the lengths specified in idx_list. For example, 2 means the date will be repeated twice, 3 means the date will be repeated three times, etc.
  3. Create a new DataFrame: We create a new DataFrame with the repeated dates, as well as the values in columns a and b, which are also repeated based on idx_list.

Output:

        date     a    b
0 2020-01-01   8.0  5.0
1 2020-01-01   8.0  5.0
2 2020-01-02  10.0  7.0
3 2020-01-02  10.0  7.0
4 2020-01-02  10.0  7.0
5 2020-01-03   6.0  1.0
6 2020-01-04   6.0  3.0
7 2020-01-04   6.0  3.0

This will give you a new DataFrame where the date column is repeated according to idx_list, and the a and b values are also repeated accordingly.
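If you would rather stay within pandas and keep the dates in their original string form, a possible alternative (a sketch, not the only way to do it) is Series.repeat() for the single column, or Index.repeat() combined with .loc to repeat every column in one step. The names new_index and expanded are illustrative:

```python
import pandas as pd

df2 = pd.DataFrame({
    'date': ['2020/1/1', '2020/1/2', '2020/1/3', '2020/1/4'],
    'a': [8.0, 10.0, 6.0, 6.0],
    'b': [5.0, 7.0, 1.0, 3.0],
})
idx_list = [2, 3, 1, 2]

# Repeat only the date column; reset_index(drop=True) renumbers the rows 0..7.
new_index = df2['date'].repeat(idx_list).reset_index(drop=True)

# Repeat whole rows: repeat the row labels, then select those labels with .loc.
expanded = df2.loc[df2.index.repeat(idx_list)].reset_index(drop=True)

print(new_index)
print(expanded)
```

This avoids calling np.repeat() once per column, which may be convenient when the DataFrame has many columns.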