Notes on plots and dataframes

Before we begin, a couple of notes on plots dataframes: this post has been prepared by converting a jupyter notebook into a markdown file using nbconvert. As it turns out, the default output format used by nbconvert to display dataframes isn’t rendered correctly by the Hugo backend I’m using for the blog. I was not able to find a “clean” solution for it, so for this post (and until I’ll find the actual fix) all tables are represented in raw html, rather than in a more decent format. I also need to figure out a better way of saving the plots, which allow to specify a facecolor, so that figures have a white background both in light and dark mode. For the time being, it’s possible to switch to light mode to fix that particular problem.

The nbconvert-generated version of this post is also available here.

Plotting temperature and humidity sensor readings with matplotlib

In this notebook, I will take a look at the readings from a DHT22 temperature-humidity sensor that I have left for a few days in a couple of locations in my house. As I’m writing this, we’re in the middle of the summer and there’s a heatwave going on, so the average temperatures will be pretty high.

By gathering data over the course of different days and by varying some conditions (e.g. opening certain windows), I’m hoping to find what’s the best way of cooling down the house at night, and keeping it cool during the day.

Libraries

Let’s start by talking about the libraries. Nothing pretty fancy here. pandas is there to put all the log readings into a dataframe, matplotlib and seaborn are graph visualization libraries for plotting data and datetime is there because I am working with timeseries. matplotlib.dates is used to customize some of the plotting parameters when handling dates.

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import datetime
import matplotlib.dates as mdates

Part 1: the attic

Reading and cleaning the data

The temperature/humidity readings are stored in a csv file, together with the timestamp of the reading. Let’s start with the readings coming from the sensor left in the attic.

df = pd.read_csv('log.txt', header=0, names=['date','time', 'temperature', 'humidity'])

The next few commands are there for turning the datetime into a more manageable format, which helps with plotting later. In particular, since the date and time columns are read as strings, the following lines convert them back into datetime format.

Then, the two are combined back into a singular datetime object for each row, with the entire timestamp.

df['time'] = df['time'].apply(datetime.time.fromisoformat)
df['date'] = df['date'].apply(datetime.date.fromisoformat)

df['datetime'] = pd.to_datetime([datetime.datetime.combine(a, b) for a,b in zip(df['date'], df['time'])])

Now that I have a unique datetime index for every row, I can reset the index of the dataframe so that it uses the timestamp, rather than the regular ID.

The next step is resampling the timeline in order to reduce the number of readings, and to reduce the effect of anomalous readings. In this example, readings are grouped every 30 minutes, then they are aggregated by using the mean function.

Out of curiosity, and following a procedure I am not completely sure of, I am trying to plot the variance of the measurement over the same interval of time by aggregating with var, then plotting using the fill_between function.

# Resetting the index so that it uses the datetime, then aggregating according to mean and variance. 
df_resampled = df.set_index('datetime').resample('30Min').mean()

df_var = df.set_index('datetime').resample('30Min').var()

Time for the first plots. Here, I am plotting all the readings from the moment I activated the process, to the moment I saved it on disk to study the readings. To better distinguish between days, I am searching for the timestamps with time (0,0) (i.e. midnight), then I’m saving them in the variable midnight.

The command df_resampled.index.time==datetime.time(0,0) looks in the .time attribute of the datetime objects in the index and checks if the value contained therein is the same as datetime.time(0,0).

The first plot will show the timeseries of temperature and humidity, along with their (supposed) error bars. Since I’m using the same code for plotting both temperature and humidity, I wrote a simple function that takes as input the dataframe and the variable of interest, then prepares the resulting plot.

def plot_overall_var(df, tgt_var, df_var=None):

    def get_bounds(df_base, df_var, column):
        '''Preparing the upper and lower bounds by adding the variance
        '''
        y1 = df_base[column] + df_var[column]
        y2 = df_base[column] - df_var[column]

        return y1,y2

    # Setting the size of the figure
    fig = plt.figure(figsize=(16,4))

    # Plotting the main line
    ax = sns.lineplot(x=df_resampled.index, y=df_resampled[tgt_var])
    plt.title(tgt_var)

    # If provided, adding the variance
    if df_var is not None:
        yhigh, ylow = get_bounds(df_resampled, df_var, tgt_var)
        plt.fill_between(df_resampled.index, ylow, yhigh, color='orange')

    # Plotting vertical lines for highlighting the change of date
    midnight = df_resampled.loc[df_resampled.index.time==datetime.time(0,0)].index
    for day in midnight:
        plt.axvline(day)
        
    # Plotting ticks every 3 hours
    _ = ax.xaxis.set_major_locator(mdates.HourLocator(interval=3))
    
    # Rotating the ticks
    ax.xaxis.set_tick_params(rotation=60)
    
    # Tightening the result
    plt.tight_layout()

plot_overall_var(df_resampled, 'temperature', df_var)
plot_overall_var(df_resampled, 'humidity', df_var)

png png

Plotting results by day

In the next section, I will reshape the data with the objective of comparing the readings gathered in different days, to look for potential similarities and differences. To do so, I’ll first split the datetime index into columns date and time, then I will pivot the table so that the new index will be time, and the columns will report temperature and humidity for each day. As you’ll notice, a bunch of values are missing and get with the value NaN: this is due to the fact that I started the data collection around 14h on July 13th.

df_resampled['date'] = df_resampled.index.date
df_resampled['time'] = df_resampled.index.time

df_pivot=df_resampled.pivot(index='time', columns='date')
# Displaying only the first 10 rows
df_pivot.head(10)
temperaturehumidity
date2022-07-132022-07-142022-07-152022-07-162022-07-132022-07-142022-07-152022-07-16
time
00:00:00NaN29.64035131.48363631.187500NaN44.25614141.01272738.969643
00:30:00NaN29.32142831.04107130.917544NaN44.34107140.33214338.903509
01:00:00NaN29.02500030.60000030.540741NaN44.51785740.02456139.196297
01:30:00NaN28.77500030.23818230.076786NaN44.77678639.93636339.655357
02:00:00NaN28.51929829.96140329.866071NaN44.96666740.04561439.594642
02:30:00NaN28.32321429.65714329.710527NaN44.79821439.98392940.035088
03:00:00NaN28.08392929.34000029.500000NaN44.83571439.55818240.560714
03:30:00NaN27.83928629.05438629.285714NaN45.02321439.13157942.167857
04:00:00NaN27.51428628.79464329.000000NaN45.29107239.33035744.741072
04:30:00NaN27.25178628.58181828.656364NaN45.67142840.03818245.830909

Here I’m plotting the values of temperature and humidity recorded on each day.

time_range = pd.timedelta_range(start='1 day', end='2 days', freq='3H')
fig, axs = plt.subplots(2,1, sharex=True, figsize=(10, 8))
df_pivot=df_resampled.pivot(index='time', columns='date')
a=df_pivot['temperature'].plot(title='Temperature', ylabel='Temperature (*C)', ax=axs[0], legend=True)

df_pivot=df_resampled.pivot(index='time', columns='date')
df_pivot['humidity'].plot(ax=axs[1], title='Humidity', ylabel='Humidity (%)', legend=True)

# plt.legend([a.lines])
plt.tight_layout()
plt.savefig('temp-humid-attic.png', transparent=False, facecolor='white')

png

Finding the daily extremes

In this section, I will be parsing the data and look for the daily extremes (maximum and minimum) for temperature and humidity, and the time at which they occurred.

# Simple dictionary for formatting the unit of measurement that will be printed
d_form = {'temperature': '*C', 'humidity': '%'}
# Grouping by day
for idx, day_g in df_resampled.groupby('date'):
    print(f'Date: {idx}')
    
#     Maximum
#     idxmax returns the position of the maximum value
    aggr_i = day_g[['temperature', 'humidity']].aggregate(pd.Series.idxmax)
    aggr_m = day_g[['temperature', 'humidity']].aggregate(pd.Series.max)
#     Putting the values together for pretty printing
    df_t=pd.concat([aggr_i, aggr_m], axis=1)
    for i, val in df_t.iterrows():
        print(f'Max {i:>12} = {val[1]:.1f} {d_form[i]:>2} at {val[0].time()}')
        
#     Minimum
#     idxmax returns the position of the maximum value
    aggr_i = day_g[['temperature', 'humidity']].aggregate(pd.Series.idxmin)
    aggr_m = day_g[['temperature', 'humidity']].aggregate(pd.Series.min)
#     Putting the values together for pretty printing
    df_t=pd.concat([aggr_i, aggr_m], axis=1)
    for i, val in df_t.iterrows():
        print(f'Min {i:>12} = {val[1]:.1f} {d_form[i]:>2} at {val[0].time()}')
Date: 2022-07-13
Max  temperature = 33.5 *C at 18:00:00
Max     humidity = 49.8  % at 14:30:00
Min  temperature = 30.0 *C at 23:30:00
Min     humidity = 42.6  % at 18:30:00
Date: 2022-07-14
Max  temperature = 35.6 *C at 18:00:00
Max     humidity = 47.4  % at 10:30:00
Min  temperature = 26.4 *C at 07:30:00
Min     humidity = 38.8  % at 19:00:00
Date: 2022-07-15
Max  temperature = 36.3 *C at 16:00:00
Max     humidity = 43.7  % at 10:00:00
Min  temperature = 27.6 *C at 07:00:00
Min     humidity = 33.4  % at 16:00:00
Date: 2022-07-16
Max  temperature = 36.0 *C at 18:00:00
Max     humidity = 51.1  % at 09:00:00
Min  temperature = 27.7 *C at 07:30:00
Min     humidity = 38.9  % at 00:30:00

Part 2: the bedroom

In this section, I will be repeating most of what was done in Part 1: the main difference will be the location of the sensor. Rather than keeping it in the attic, in fact, I was keeping this on the desk in my bedroom. In particular, the sensor was exposed to the heat coming from a pretty chonky desktop PC, as well as two monitors, and it is positioned close to a window.

df = pd.read_csv('readings-room.csv', header=0, names=['date','time', 'temperature', 'humidity'])
df['time'] = df['time'].apply(datetime.time.fromisoformat)
df['date'] = df['date'].apply(datetime.date.fromisoformat)

df['datetime'] = pd.to_datetime([datetime.datetime.combine(a, b) for a,b in zip(df['date'], df['time'])])
# Resetting the index so that it uses the datetime, then aggregating according to mean and variance. 
df_resampled = df.set_index('datetime').resample('30Min').mean()

df_var = df.set_index('datetime').resample('30Min').var()

Below, I’m plotting the overall temperature and humidity readings. Unlike in the case of the attic, however, the variance is far more larger, which leads to a plot that is almost unreadable in the case of humidity. For this reason, I am also plotting the graphs without added bands by calling plot_overall_var without the df_var variable.

plot_overall_var(df_resampled, 'temperature', df_var)
plot_overall_var(df_resampled, 'humidity', df_var)

plot_overall_var(df_resampled, 'temperature')
plot_overall_var(df_resampled, 'humidity')

png

png

png

png

Plotting results by day

df_resampled['date'] = df_resampled.index.date
df_resampled['time'] = df_resampled.index.time

df_pivot=df_resampled.pivot(index='time', columns='date')
# Displaying only the first 10 rows
df_pivot.head(10)
temperaturehumidity
date2022-07-062022-07-072022-07-082022-07-062022-07-072022-07-08
time
00:00:00NaNNaN24.601786NaNNaN60.225000
00:30:00NaN26.79285723.985965NaN62.98571561.821053
01:00:00NaN26.93725523.458929NaN62.36862763.941071
01:30:00NaN26.82500022.831579NaN62.90178666.178947
02:00:00NaN26.42105321.867273NaN63.63859669.143637
02:30:00NaN26.39642822.728572NaN64.70535765.946429
03:00:00NaN26.17193022.787719NaN64.52105364.991228
03:30:00NaN25.91896522.864285NaN65.10862165.960714
04:00:00NaN25.97857123.108929NaN64.52500066.139285
04:30:00NaN25.79473623.110527NaN65.34912366.324561
val_idxmax=df_pivot[['temperature', 'humidity']].apply(pd.Series.idxmax)
val_max=df_pivot[['temperature', 'humidity']].apply(pd.Series.max)
df_concat = pd.concat([val_idxmax, val_max],  axis=1).reset_index().pivot(index='date', columns='level_0')
df_concat
01
level_0humiditytemperaturehumiditytemperature
date
2022-07-0618:30:0017:00:0067.53773628.174419
2022-07-0707:30:0017:00:0070.47017528.641072
2022-07-0805:00:0017:30:0070.80526327.100000
mi = pd.MultiIndex.from_product([['humidity', 'temperature'], ['timeat', 'value']])
rows=[]
for idx, row in df_concat.iterrows():
    t = [row[idx][variable]  for variable in ['humidity', 'temperature'] for idx in [0,1] ]
    rows.append([idx] + t)
df_new = pd.DataFrame.from_records(rows)
df_new=df_new.set_index(0)
df_new.index.rename('date', inplace=True)
df_new.columns = mi
df_new
humiditytemperature
timeatvaluetimeatvalue
date
2022-07-0618:30:0067.53773617:00:0028.174419
2022-07-0707:30:0070.47017517:00:0028.641072
2022-07-0805:00:0070.80526317:30:0027.100000
time_range = pd.timedelta_range(start='1 day', end='2 days', freq='3H')
fig, axs = plt.subplots(2,1, sharex=True, figsize=(10, 8))
df_pivot=df_resampled.pivot(index='time', columns='date')
a=df_pivot['temperature'].plot(title='Temperature', ylabel='Temperature (*C)', ax=axs[0], legend=True)

df_pivot=df_resampled.pivot(index='time', columns='date')
df_pivot['humidity'].plot(ax=axs[1], title='Humidity', ylabel='Humidity (%)', legend=True)

# plt.legend([a.lines])
plt.tight_layout()
plt.savefig('temp-humid-bedroom.png', transparent=False, facecolor='white')

png

Finding the daily extremes

# Simple dictionary for formatting the unit of measurement that will be printed
d_form = {'temperature': '*C', 'humidity': '%'}
# Grouping by day
for idx, day_g in df_resampled.groupby('date'):
    print(f'Date: {idx}')
    
#     Maximum
#     idxmax returns the position of the maximum value
    aggr_i = day_g[['temperature', 'humidity']].aggregate(pd.Series.idxmax)
    aggr_m = day_g[['temperature', 'humidity']].aggregate(pd.Series.max)
#     Putting the values together for pretty printing
    df_t=pd.concat([aggr_i, aggr_m], axis=1)
    for i, val in df_t.iterrows():
        print(f'Max {i:>12} = {val[1]:.1f} {d_form[i]:>2} at {val[0].time()}')
        
#     Minimum
#     idxmax returns the position of the maximum value
    aggr_i = day_g[['temperature', 'humidity']].aggregate(pd.Series.idxmin)
    aggr_m = day_g[['temperature', 'humidity']].aggregate(pd.Series.min)
#     Putting the values together for pretty printing
    df_t=pd.concat([aggr_i, aggr_m], axis=1)
    for i, val in df_t.iterrows():
        print(f'Min {i:>12} = {val[1]:.1f} {d_form[i]:>2} at {val[0].time()}')
Date: 2022-07-06
Max  temperature = 28.2 *C at 17:00:00
Max     humidity = 67.5  % at 18:30:00
Min  temperature = 26.7 *C at 21:30:00
Min     humidity = 60.8  % at 22:00:00
Date: 2022-07-07
Max  temperature = 28.6 *C at 17:00:00
Max     humidity = 70.5  % at 07:30:00
Min  temperature = 24.4 *C at 23:00:00
Min     humidity = 58.5  % at 23:30:00
Date: 2022-07-08
Max  temperature = 27.1 *C at 17:30:00
Max     humidity = 70.8  % at 05:00:00
Min  temperature = 21.7 *C at 05:00:00
Min     humidity = 60.2  % at 00:00:00

Part 3: Concluding

Let’s wrap up the post by summarizing what was shown, then making some observations.

Overall, this was an interesting exercise to play around with timeseries, which is a format I haven’t really considered during my PhD. I had to figure out how to handle them using the functions provided by pandas, then how to use matplotlib to prepare the graphs while getting around the shortcomings of my data-gathering setup. In particular, seaborn was far more stubborn than I expected and I wasn’t able to find a proper way of wrangling the correct plots with the library.

Regarding the actual data, it’s interesting to see the differences between the readings coming from the two locations. They were gathered over different time periods, and it was far hotter while I was recording the temperature in the attic: this makes direct comparisons not very meaningful, due to how different the external conditions were. Still, the difference in readings between the room and the attic can be explained by the fact that there is no activity in the attic; on the other hand, the sensor in the bedroom picked up on both my activity there: this involved the use of a desktop PC and multiple monitors and additional devices, coupled with the fact that windows can generate a noticeable breeze that goes through the entire top floor of the building. This goes to show that a good ventilato helps a lot with cooling down a hot interior, and that electronic devices work directly against that!

I’m still working on building a different temperature sensor, and possibly something more complex that would allow to gather further information (a weather station maybe?), so I will definitely come back to this subject at some point in the future.

Appendix: Other (unsuccessful) attempts

In this section I’m gathering some additional methods I tried while working on the points shown above. I’m adding these both as a note to myself of what doesn’t work, and for readers to see what can happen while trying stuff out.

From my understanding, part of the reason why these methods did not work quite as well as I hoped is due to the fact that I’m working with datetime objects, rather than “regular” indexing. This results in some funky behavior in some cases.

Plotting struggles

The pandas.DataFrame.melt function allows to obtain a “long form” version of a dataframe that pivots around some columns, while keeping other columns as variables. This can be extremely useful for plotting different variables on the same plot using libraries such as seaborn, which is what I will be trying in a moment.

df_melted = df_resampled.melt(id_vars=['time', 'date'], value_vars=['temperature'])
df_melted['time'] = df_melted['time'].apply(datetime.time.isoformat)
display(df_melted)

# df_melted = df_melted.sort_values(['date', 'time'])
timedatevariablevalue
014:30:002022-07-13temperature31.266667
115:00:002022-07-13temperature32.048214
215:30:002022-07-13temperature32.547368
316:00:002022-07-13temperature32.926316
416:30:002022-07-13temperature33.198245
...............
14716:00:002022-07-16temperature35.548214
14816:30:002022-07-16temperature35.810526
14917:00:002022-07-16temperature35.903573
15017:30:002022-07-16temperature35.987719
15118:00:002022-07-16temperature36.000000

152 rows × 4 columns

A “melted” dataframe can be plotted fairly easily using a command such as the one below. Besides the x- and y-axes, it is possible to distinguish the lines by hue (in this case, different colors for different dates) and by style (different line style for different variables). Here, it does not make a lot of sense to plot humidity and temperature in the same plot, but it is useful for the sake of the example.

Something else to note in the plot below is that the x-axis starts at 14:30, then continues for 24 hours until 14:00 the next day: this is due to the fact that the data captured by the log is incomplete for the day of 2022-07-13. After searching around for a while, I wasn’t able to reset the ticks on the x-axis to make them work properly, so I had to look for some workarounds.

Indeed, the current plot shows a pretty major inconsistency at 0:00, which is when the date shifts from July 13th to July 14th, which is also why the orange line “jumps” below and seems to be a continuation of the blue line: this is exactly what that is!

fig = plt.figure(figsize=(12,4))
ax = sns.lineplot(data = df_melted, x='time', y='value', hue='date', style='variable')
ax.tick_params(rotation=90)

png

It is possible to build a plot by using the full datetime object, rather than splitting by day, and the result is shown below:

df_melted = df_resampled.reset_index().melt(id_vars=['datetime'], value_vars=['temperature'])
display(df_melted)

fig = plt.figure(figsize=(12,4))
ax = sns.lineplot(data = df_melted, x='datetime', y='value')
ax.tick_params(rotation=90)
datetimevariablevalue
02022-07-13 14:30:00temperature31.266667
12022-07-13 15:00:00temperature32.048214
22022-07-13 15:30:00temperature32.547368
32022-07-13 16:00:00temperature32.926316
42022-07-13 16:30:00temperature33.198245
............
1472022-07-16 16:00:00temperature35.548214
1482022-07-16 16:30:00temperature35.810526
1492022-07-16 17:00:00temperature35.903573
1502022-07-16 17:30:00temperature35.987719
1512022-07-16 18:00:00temperature36.000000

152 rows × 3 columns

png

Now, this plot looks much clearer, however it’s not exactly what I wanted. For this reason, I decided to get rid of the offending data series (i.e. the incomplete series of readings from the first day) and plot the data for the remaining days. This isn’t really the best way of doing it, which is why I went for the solution in the first section of this post, but it gets the job done at least.

So, now my objective is selecting only the readings that start after the first day has ended, which can be done using either one of the two commands below, whose output is exactly the same.

df_resampled.index returns the index of the dataframe, which is in datetime format: this allows to compare it with a different datetime object that I am declaring (either fromisoformat(...) or date(...)). The comparison allows to find the datetime indices that occur after the given value, and keep only those.

df_1 =df_resampled.loc[df_resampled.index >= datetime.datetime.fromisoformat('2022-07-14 00:00:00')]

df_2 =df_resampled.loc[df_resampled.index.date > datetime.date(2022, 7,13)]

# Checking that the two indices are equal
all(df_1.index == df_2.index)
True

Now that there is a cleaner dataframe to plot, the usual lineplot function can be used to prepare the temperature and humidity comparisons. |

df_melted = df_1.melt(id_vars=['time', 'date'], value_vars=['temperature'], value_name='temp')
df_melted['time'] = df_melted['time'].apply(datetime.time.isoformat)
df_melted = df_melted.sort_values(['date', 'time'])

fig = plt.figure(figsize=(10,4))
ax = sns.lineplot(data = df_melted, x='time', y='temp', hue='date')
ax.tick_params(rotation=90)

png

df_melted = df_1.melt(id_vars=['time', 'date'], value_vars=['humidity'], value_name='hum')
df_melted['time'] = df_melted['time'].apply(datetime.time.isoformat)
df_melted = df_melted.sort_values(['date', 'time'])

fig = plt.figure(figsize=(10,4))
ax = sns.lineplot(data = df_melted, x='time', y='hum', hue='date')
ax.tick_params(rotation=90)

png

Extracting extremes

After splitting the readings on a day-by-day basis using pivot, I looked for a way of getting the extremes using the same table. Turns out, this was far more complex than I thought, especially for printing them in an easy-to-read way.

Let’s start by taking a look at df_pivot again:

display(df_pivot.head())
temperaturehumidity
date2022-07-132022-07-142022-07-152022-07-162022-07-132022-07-142022-07-152022-07-16
time
00:00:00NaN29.64035131.48363631.187500NaN44.25614141.01272738.969643
00:30:00NaN29.32142831.04107130.917544NaN44.34107140.33214338.903509
01:00:00NaN29.02500030.60000030.540741NaN44.51785740.02456139.196297
01:30:00NaN28.77500030.23818230.076786NaN44.77678639.93636339.655357
02:00:00NaN28.51929829.96140329.866071NaN44.96666740.04561439.594642

This dataframe has a two-level header, with one half of the column covering the temperature on different days, and the other half recording humidity in the same periods of time. The index is a timeseries (not a datetime anymore, since the date is recorded in the columns).

# Extracting the extremes and saving them in dataframes
val_idxmax=df_pivot[['temperature', 'humidity']].apply(pd.Series.idxmax)
val_max=df_pivot[['temperature', 'humidity']].apply(pd.Series.max)
# Concatenating dataframes
df_concat = pd.concat([val_idxmax, val_max],  axis=1).reset_index().pivot(index='date', columns='level_0')
df_concat
01
level_0humiditytemperaturehumiditytemperature
date
2022-07-1314:30:0018:00:0049.77451033.494737
2022-07-1410:30:0018:00:0047.35818235.562499
2022-07-1510:00:0016:00:0043.69636436.289285
2022-07-1609:00:0018:00:0051.06071536.000000

The result above kind of works, but is not very understandable: on the left, we have the time at which the extreme was measured and on the right we have the actual measured value.

It would be nicer to have a format similar to the original df_pivot, with temperature and humidity on top and the two variables timeat and measured value below. This can be done using a MultiIndex, which in this case I am building with the function .from_product. This function executes the cartesian product of the iterables passed as inputs and produces a MultiIndex object where each iterable represents a level in the multiindex.

mi = pd.MultiIndex.from_product([['humidity', 'temperature'], ['timeat', 'value']])
mi
MultiIndex([(   'humidity', 'timeat'),
            (   'humidity',  'value'),
            ('temperature', 'timeat'),
            ('temperature',  'value')],
           )

At this point, I was looking for a way of massaging the columns and the index in order to make them match the multiindex, and this is what I came up with. Pretty ugly, but it works.

For each row in df_concat, I select the two columns of the multiindex (timeat and value, with labels 0 and 1 respectively), and then the specific variable (either humidity or temperature), so that the content of the list is:

[row[timeat][humidity], row[value, humidity], row[timeat][temperature], row[value][temperature]]

Finally, I add the index (i.e. the date) to the list, then I build the dataframe starting from those records. In the new dataframe, the index is then set to use the date column added to the list.

This is one way of printing the values, however I later realized that this makes actual pretty printing far more difficult, and the entire process thus far was quite clunky and “unpythonic”, which is why I decided to something else.

rows=[]
for idx, row in df_concat.iterrows():
    t = [row[i_][variable]  for variable in ['humidity', 'temperature'] for i_ in [0,1] ]
    rows.append([idx] + t)
df_new = pd.DataFrame.from_records(rows)
df_new=df_new.set_index(0)
df_new.index.rename('date', inplace=True)
df_new.columns = mi
df_new
humiditytemperature
timeatvaluetimeatvalue
date
2022-07-1314:30:0049.77451018:00:0033.494737
2022-07-1410:30:0047.35818218:00:0035.562499
2022-07-1510:00:0043.69636416:00:0036.289285
2022-07-1609:00:0051.06071518:00:0036.000000