I am trying to plot three variables x, y, z on a 2D plot, with x (date) on the x-axis, y (time) on the y-axis, and z (temperature) mapped to a color scale. I have the three variables in a pandas DataFrame and created an extra column with the date number so that matplotlib can work with it.
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
data=pd.DataFrame()
data['datenum']=mdates.date2num(data['Date'])
Example:
         Date  Time      Tgrad   datenum
0  2016-08-01    00  -0.841203  736177.0
1  2016-08-01    01  -0.629176  736177.0
2  2016-08-01    02  -0.623608  736177.0
3  2016-08-01    03  -0.615145  736177.0
4  2016-08-01    04  -0.726949  736177.0
5  2016-08-01    05  -0.788864  736177.0
6  2016-08-01    06  -0.794655  736177.0
7  2016-08-01    07  -0.775724  736177.0
8  2016-08-01    08  -0.677951  736177.0
So far I have not been successful, I think because my input data has the wrong shape. I have tried something like this:
fig, ax = plt.subplots()
ax.imshow(data['Tgrad'], extent = [min(data['datenum']), max(data['datenum']),min(data['Time']), max(data['Time'])], cmap="autumn", aspect = "auto")
ax.xaxis_date()
But I get a ValueError:
ValueError: setting an array element with a sequence
Is it necessary to have the data as numpy array or any other type? And how can I map the data once I have it in a different format?
Answer
The issue comes from the shape and structure of your data. imshow in Matplotlib expects a 2D array of values (a grid of data points), but data['Tgrad'] is a single 1D column, with the matching datenum values (already in the numeric format Matplotlib uses for dates) and Time values (hours) held in separate columns. For imshow to work, you need to reshape the data into a 2D grid in which each cell corresponds to one (datenum, Time) pair.
You can achieve this by converting your DataFrame into a 2D matrix of Tgrad values that correspond to specific datenum (date) and Time (hour) combinations. Here's how to do it:
Step-by-step solution:
- Pivot your DataFrame: Reshape your data into a grid where each column represents a specific date (datenum) and each row represents a specific time (Time). The values in this grid are the Tgrad values.
- Plot the data using imshow: After reshaping the data into a 2D grid, you can use imshow to plot the Tgrad values. You'll also want to format the x and y axes so that they correspond to your dates and times.
Here’s how to implement this:
Code Example:
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
# Sample data
data = pd.DataFrame({
    'Date': ['2016-08-01', '2016-08-01', '2016-08-01', '2016-08-01', '2016-08-01',
             '2016-08-01', '2016-08-01', '2016-08-01', '2016-08-01'],
    'Time': ['00', '01', '02', '03', '04', '05', '06', '07', '08'],
    'Tgrad': [-0.841203, -0.629176, -0.623608, -0.615145, -0.726949,
              -0.788864, -0.794655, -0.775724, -0.677951]
})
# Convert 'Date' to datetime and 'Time' to int
data['Date'] = pd.to_datetime(data['Date'])
data['Time'] = data['Time'].astype(int)
# Convert Date to datenumber
data['datenum'] = mdates.date2num(data['Date'])
# Pivot the DataFrame to create a grid of Tgrad values (dates as columns, times as rows)
pivoted_data = data.pivot(index='Time', columns='datenum', values='Tgrad')
# Create the plot
fig, ax = plt.subplots(figsize=(10, 6))
# Plot the data using imshow
cax = ax.imshow(pivoted_data, aspect='auto', cmap='autumn', interpolation='nearest')
# Set the x and y ticks
ax.set_xticks(range(len(pivoted_data.columns)))
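# Format each date number as a plain YYYY-MM-DD string so the tick labels stay readable
ax.set_xticklabels([mdates.num2date(d).strftime('%Y-%m-%d') for d in pivoted_data.columns], rotation=45)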
ax.set_yticks(range(len(pivoted_data.index)))
ax.set_yticklabels(pivoted_data.index)
# Label the axes
ax.set_xlabel('Date')
ax.set_ylabel('Time (Hour)')
# Add a colorbar to map temperature gradient (Tgrad)
fig.colorbar(cax, label='Temperature Gradient (Tgrad)')
plt.tight_layout()
plt.show()
Explanation:
- Pivoting the Data: The pivot function transforms the DataFrame into a 2D matrix where the rows represent the time (hour of the day), the columns represent the datenum (which corresponds to the date), and the values are the Tgrad (temperature gradient) values.
- Using imshow: imshow is then used to plot this matrix. We set aspect='auto' so that the image stretches to fill the axes instead of forcing square pixels.
- Formatting the Axes: We set the x-axis to correspond to the actual dates by converting the datenum back to date format using mdates.num2date. The y-axis is labeled with the times (hours), which are integers from the Time column.
- Color Mapping: A colorbar is added to indicate the temperature gradient values (Tgrad) corresponding to the colors in the plot.
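On the question of whether the data has to be a NumPy array: as far as I know, imshow accepts any 2D array-like, so the pivoted DataFrame works directly, but converting it explicitly makes the shape easy to check. The sketch below uses made-up values and hypothetical names (long_df, grid, z) purely to show how the shape changes from long to wide format.

```python
import numpy as np
import pandas as pd

# Hypothetical long-format data: 3 days x 4 hours, one row per (Date, Time) pair
dates = pd.date_range("2016-08-01", periods=3, freq="D")
hours = [0, 1, 2, 3]
long_df = pd.DataFrame(
    [(d, h) for d in dates for h in hours], columns=["Date", "Time"]
)
long_df["Tgrad"] = np.linspace(-1.0, 0.0, len(long_df))  # made-up values

# The Tgrad column on its own is 1D, so it cannot be drawn as an image
print(long_df["Tgrad"].shape)  # (12,)

# Pivoting gives one row per hour and one column per date
grid = long_df.pivot(index="Time", columns="Date", values="Tgrad")
print(grid.shape)  # (4, 3)

# Explicit conversion to a 2D NumPy array, ready to pass to imshow
z = grid.to_numpy()
print(type(z).__name__, z.ndim)
```

If the second shape is not (number of hours, number of dates), the pivot did not produce the grid you expect, and that is worth fixing before plotting.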
Result:
This will generate a 2D plot with dates on the x-axis, times (hours) on the y-axis, and the Tgrad values displayed using a color scale. The colorbar will indicate the values of the Tgrad variable, and the x-axis will display the actual dates.
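If you would rather have a real date axis than hand-labelled ticks, the approach from the question (extent plus ax.xaxis_date()) should also work once the data is a 2D grid. The following is a minimal sketch under some assumptions: the data is random and purely illustrative, the dates are exactly one day apart and the hours one hour apart, and the half-cell padding on extent is my own choice to centre each cell on its date and hour.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.dates as mdates

# Hypothetical grid: rows = hours 0..23, columns = 5 consecutive days
dates = pd.date_range("2016-08-01", periods=5, freq="D")
hours = np.arange(24)
rng = np.random.default_rng(0)
z = rng.normal(size=(hours.size, dates.size))  # stand-in for the pivoted Tgrad values

# Image edges in data coordinates: pad by half a cell so that each cell
# is centred on its date (x) and hour (y). Assumes regular spacing.
x = mdates.date2num(dates.to_pydatetime())
extent = [x[0] - 0.5, x[-1] + 0.5, hours[0] - 0.5, hours[-1] + 0.5]

fig, ax = plt.subplots(figsize=(10, 6))
im = ax.imshow(
    z,
    extent=extent,
    origin="lower",        # row 0 (hour 0) at the bottom of the plot
    aspect="auto",
    cmap="autumn",
    interpolation="nearest",
)

# Treat the x-axis values as dates and put one labelled tick per day
ax.xaxis_date()
ax.xaxis.set_major_locator(mdates.DayLocator())
ax.xaxis.set_major_formatter(mdates.DateFormatter("%Y-%m-%d"))
fig.autofmt_xdate()

ax.set_xlabel("Date")
ax.set_ylabel("Time (hour)")
fig.colorbar(im, ax=ax, label="Tgrad")
```

Call plt.show() or fig.savefig(...) afterwards to see the result. Note that origin="lower" puts hour 0 at the bottom, whereas the default used in the main example puts the first row at the top.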
Key Points:
- The pivoting step is crucial, as it reshapes your data from long format (1D) into wide format (2D).
- The imshow function expects a 2D matrix, and you can then map the color scale (via cmap) to the values of Tgrad.
- You can fine-tune the color mapping and axis labels as needed.
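One caveat with the pivoting step: pivot raises a ValueError if the same (date, time) pair appears more than once, and hours with no readings at all simply do not get a row. A possible workaround, sketched here with made-up values, is to aggregate duplicates with pivot_table and then reindex to the full set of hours. Averaging the duplicates and expecting hours 0 to 3 are my assumptions, not something stated in the question.

```python
import numpy as np
import pandas as pd

# Hypothetical readings: hour 1 is recorded twice on the first day,
# hour 2 is missing on the second day, and hour 3 is missing everywhere
df = pd.DataFrame({
    "Date": pd.to_datetime(["2016-08-01"] * 4 + ["2016-08-02"] * 2),
    "Time": [0, 1, 1, 2, 0, 1],
    "Tgrad": [-0.8, -0.6, -0.4, -0.7, -0.5, -0.3],
})

# pivot() would fail on the duplicated (Date, Time) pair;
# pivot_table() combines the duplicates, here by taking their mean
grid = df.pivot_table(index="Time", columns="Date", values="Tgrad", aggfunc="mean")

# Add a row for every expected hour; cells without data become NaN
grid = grid.reindex(range(4))

print(grid)
```

The resulting grid can be passed to imshow in the same way as before. To my knowledge, NaN cells are left blank with the default colormap settings, which makes gaps in the data easy to spot.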