Defines several methods for analyzing, plotting, and exporting wearable data, including a Pandas accessor for wearable dataframes
Overview
The circadian.readers module contains several methods for working with wearable data such as step counts, heart rate, and sleep. It also defines a Pandas accessor called WearableData to standardize and validate wearable dataframes.
Loading wearable data
The circadian.readers module provides functionality to import files in several formats, including raw CSV counts, JSON files, and data coming from Actiwatch readers in CSV format. For example, to load a CSV file with heart rate data we can do:
from circadian.readers import load_csv
file_path = 'circadian/sample_data/hr_data.csv'
df_hr = load_csv(file_path, timestamp_col='timestamp')
| | heartrate | timestamp | datetime |
|---|---|---|---|
| 0 | 79.0 | 4.688359e+07 | 1971-06-27 15:13:12.693424232 |
| 1 | 80.0 | 4.688329e+07 | 1971-06-27 15:08:09.693448064 |
| 2 | 81.0 | 4.688306e+07 | 1971-06-27 15:04:20.692736632 |
| 3 | 80.0 | 4.688273e+07 | 1971-06-27 14:58:46.686474800 |
| 4 | 85.0 | 4.688257e+07 | 1971-06-27 14:56:08.187120912 |
| ... | ... | ... | ... |
| 99995 | 97.0 | 3.271680e+07 | 1971-01-14 15:59:56.779711960 |
| 99996 | 95.0 | 3.271679e+07 | 1971-01-14 15:59:49.779711960 |
| 99997 | 95.0 | 3.271679e+07 | 1971-01-14 15:59:48.779711960 |
| 99998 | 95.0 | 3.271678e+07 | 1971-01-14 15:59:43.779711960 |
| 99999 | 93.0 | 3.271677e+07 | 1971-01-14 15:59:34.779711960 |

100000 rows × 3 columns
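The datetime values in the output above are consistent with reading the timestamp column as seconds since the Unix epoch (UTC). A minimal standard-library sketch of that conversion, independent of load_csv (the helper name unix_to_datetime is hypothetical, and the sample value is only chosen to be close to the first row):

```python
from datetime import datetime, timezone


def unix_to_datetime(seconds: float) -> datetime:
    """Convert seconds since the Unix epoch to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


# Hypothetical value close to the first row above (4.688359e+07 seconds).
first = unix_to_datetime(46_883_592)
print(first.isoformat())  # 1971-06-27T15:13:12+00:00
```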
By indicating which column contains the Unix timestamp information, load_csv automatically generates a new column with the datetime information. If no timestamp column is provided, load_csv assumes that the file contains a column named ‘datetime’ (or columns named ‘start’ and ‘end’). For data specified via time intervals, such as step counts, no new column is generated and the user can choose how to process the data. For example, to load a CSV file with step counts we can do:
Here df_dict is a dictionary containing one dataframe per stream, keyed by stream name. For example, to access the dataframe with the wake data we can do:
df_wake = df_dict['wake']
| | start | end | wake |
|---|---|---|---|
| 0 | 1970-02-03 04:49:01.000000 | 1970-02-03 09:01:00.000000 | 0 |
| 1 | 1970-02-03 09:02:00.000000 | 1970-02-03 11:25:00.000000 | 0 |
| 2 | 1970-02-04 04:51:01.000000 | 1970-02-04 12:35:00.000000 | 0 |
| 3 | 1970-02-04 12:36:00.000000 | 1970-02-04 12:37:00.000000 | 0 |
| 4 | 1970-02-04 12:38:00.000000 | 1970-02-04 12:39:00.000000 | 0 |
| ... | ... | ... | ... |
| 2750 | 1971-06-27 07:38:31.105829 | 1971-06-27 08:01:01.105829 | 0 |
| 2751 | 1971-06-27 08:03:01.105829 | 1971-06-27 08:55:31.105829 | 0 |
| 2752 | 1971-06-27 09:05:31.105829 | 1971-06-27 09:07:01.105829 | 0 |
| 2753 | 1971-06-27 09:08:01.105829 | 1971-06-27 12:06:01.105829 | 0 |
| 2754 | 1971-06-27 12:08:01.105829 | 1971-06-27 12:15:31.105829 | 0 |

2755 rows × 3 columns
The circadian.readers module only accepts specific column names for wearable data. The accepted column names are stored in VALID_WEARABLE_STREAMS:
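As an illustration of this kind of column-name check, here is a minimal sketch written without the library. The set of stream names is an assumption: it copies the names listed for the name parameter of resample_df (steps, heartrate, wake, light_estimate, activity) and may not match VALID_WEARABLE_STREAMS exactly. The helper check_stream_columns and the set of time columns are hypothetical.

```python
# Assumed stream names, taken from the resample_df parameter description;
# the real VALID_WEARABLE_STREAMS constant may differ.
ASSUMED_STREAMS = {"steps", "heartrate", "wake", "light_estimate", "activity"}

# Time columns mentioned in the text; treated here as always allowed (assumption).
TIME_COLUMNS = {"datetime", "start", "end", "timestamp"}


def check_stream_columns(columns):
    """Return the column names that are neither time columns nor assumed streams."""
    return sorted(set(columns) - ASSUMED_STREAMS - TIME_COLUMNS)


print(check_stream_columns(["start", "end", "wake"]))         # []
print(check_stream_columns(["datetime", "hr", "heartrate"]))  # ['hr']
```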
Note that load_actiwatch automatically generates a new column with the datetime information and standardizes column names.
Resampling wearable data
The circadian.readers module provides functionality to resample data specified either via time intervals or via timestamps. For example, to resample a dataframe with step counts we can do:
name = 'steps'
resample_freq = '1D'
agg_method = 'sum'
resampled_steps = resample_df(df_steps, name, resample_freq, agg_method)
| | datetime | steps |
|---|---|---|
| 0 | 1970-01-01 | 847.000000 |
| 1 | 1970-01-02 | 1097.000000 |
| 2 | 1970-01-03 | 1064.000000 |
| 3 | 1970-01-04 | 2076.000000 |
| 4 | 1970-01-05 | 2007.000000 |
| ... | ... | ... |
| 538 | 1971-06-23 | 9372.098478 |
| 539 | 1971-06-24 | 10137.802450 |
| 540 | 1971-06-25 | 14977.306682 |
| 541 | 1971-06-26 | 5644.161346 |
| 542 | 1971-06-27 | 3823.642766 |

543 rows × 2 columns
Here resample_freq is a string giving the resampling frequency in pandas offset-alias notation, name specifies the column to be resampled, and agg_method indicates how to aggregate the data.
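To make the interval case concrete, here is a minimal standard-library sketch of one way to turn interval counts into daily totals: each interval's count is spread uniformly over its duration, and the share falling on each calendar day is summed. This is an assumption about the general approach, not the implementation of resample_df; the helper daily_totals and the sample intervals are hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def daily_totals(intervals):
    """Sum interval counts per calendar day, splitting intervals that cross midnight.

    `intervals` is an iterable of (start, end, count) tuples with start < end.
    """
    totals = defaultdict(float)
    for start, end, count in intervals:
        rate = count / (end - start).total_seconds()  # count per second
        cursor = start
        while cursor < end:
            # Midnight following `cursor`, i.e. the end of the current day bin.
            day_start = datetime.combine(cursor.date(), datetime.min.time())
            segment_end = min(end, day_start + timedelta(days=1))
            totals[cursor.date()] += rate * (segment_end - cursor).total_seconds()
            cursor = segment_end
    return dict(sorted(totals.items()))


steps = [
    (datetime(1970, 1, 1, 10, 0), datetime(1970, 1, 1, 10, 30), 300),
    # Crosses midnight: 60 of the 120 minutes fall on each day.
    (datetime(1970, 1, 1, 23, 0), datetime(1970, 1, 2, 1, 0), 120),
]
for day, total in daily_totals(steps).items():
    print(day, round(total, 2))
```

Splitting proportionally is one explanation for why daily sums of interval data can be fractional, as in the last rows of the output above.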
Combining wearable data
We can combine wearable data from different streams into a single dataframe. To achieve this we can use the combine_wearable_dataframes method which resamples and aggregates data to produce a dataframe with a single datetime index and columns for each stream. For example, to combine all the loaded dataframes from the previous section we would do:
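As a rough illustration of the resulting shape (one row per time bin, one column per stream), here is a minimal standard-library sketch. It assumes each stream has already been resampled onto the same bins; combine_streams and the sample values are hypothetical, and this is not the implementation of combine_wearable_dataframes.

```python
from datetime import datetime


def combine_streams(streams):
    """Merge {stream_name: {bin_start: value}} into rows sorted by bin start.

    Bins missing from a stream are filled with None.
    """
    all_bins = sorted({t for series in streams.values() for t in series})
    names = list(streams)
    return [
        {"datetime": t, **{name: streams[name].get(t) for name in names}}
        for t in all_bins
    ]


streams = {
    "steps": {datetime(1970, 1, 1, 0, 0): 12.0, datetime(1970, 1, 1, 0, 10): 0.0},
    "heartrate": {datetime(1970, 1, 1, 0, 10): 71.5},
}
for row in combine_streams(streams):
    print(row)
```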
resample_df
Resample a wearable dataframe. If data is specified in intervals, returns the density of the quantity per minute.

| | Type | Default | Details |
|---|---|---|---|
| df | DataFrame | | dataframe to be resampled |
| name | str | | name of the wearable data to resample (one of steps, heartrate, wake, light_estimate, or activity) |
| freq | str | | frequency to resample to. String must be a valid pandas frequency string (e.g. ‘1min’, ‘5min’, ‘1H’, ‘1D’). See https://pandas.pydata.org/pandas-docs/stable/user_guide/timeseries.html#offset-aliases |
| agg_method | str | | aggregation method to use when resampling |
| initial_datetime | Timestamp | None | initial datetime to use when resampling. If None, the minimum datetime in the dataframe is used |
| final_datetime | Timestamp | None | final datetime to use when resampling. If None, the maximum datetime in the dataframe is used |
combine_wearable_dataframes
Combine a dictionary of wearable dataframes into a single dataframe with resampling

| | Type | Default | Details |
|---|---|---|---|
| df_dict | Dict | | dictionary of wearable dataframes |
| resample_freq | str | | resampling frequency (e.g. ‘10min’ for 10 minutes, see Pandas Offset aliases: https://pandas.pydata.org/pandas-docs/stable/user_guide/timeseries.html#offset-aliases) |