6 Preparing data for analysis
There are multiple ways of preparing data for analysis.
- create_custom_interpolation : The simplest approach is to merge the data together at a specified time resolution, with or without interpolation.
- create_rolling_window : It’s also possible to derive summary statistics using a rolling window, which progresses across the timeseries to make calculations.
- create_summary_statistics : Where a particular pattern needs to be extracted from the data, such as sustained pressure change or activity, this function derives summary statistics for those periods.
6.1 Merge sensor data together
Because data from different sensors are collected at different temporal resolutions (e.g. 5 minutes, 30 minutes or 4 hours), reducePAM formats data to the same time intervals as a specified variable (e.g. pressure) by summarising finer resolution data (median, sum or skip) and interpolating (or not) lower resolution data.
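# Load the package and the example dataset
library(pamlr)
data(bee_eater)
# Crop the data to the period of interest
start = as.POSIXct("2015-08-01", format = "%Y-%m-%d", tz = "UTC")
end = as.POSIXct("2016-06-21", format = "%Y-%m-%d", tz = "UTC")
PAM_data = create_crop(bee_eater, start, end)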
6.1.1 Interpolation
Format the data at 30-minute intervals: variables recorded at longer intervals are interpolated, and variables recorded at shorter intervals are summarised by their median.
date | pressure | light | pit | act | temperature | gX | gY | gZ | mX | mY | mZ |
---|---|---|---|---|---|---|---|---|---|---|---|
2015-08-01 00:00:00 | 1004 | 0 | 24 | 0 | 33 | 796.00 | -1993.000 | -4741 | -2016.000 | 11156 | 12528.0 |
2015-08-01 00:30:00 | 1004 | 0 | 24 | 0 | 33 | 854.75 | -1630.875 | -4661 | -2759.125 | 10517 | 11372.5 |
2015-08-01 01:00:00 | 1004 | 0 | 24 | 0 | 33 | 913.50 | -1268.750 | -4581 | -3502.250 | 9878 | 10217.0 |
2015-08-01 01:30:00 | 1004 | 0 | 24 | 0 | 33 | 972.25 | -906.625 | -4501 | -4245.375 | 9239 | 9061.5 |
2015-08-01 02:00:00 | 1004 | 0 | 24 | 0 | 33 | 1031.00 | -544.500 | -4421 | -4988.500 | 8600 | 7906.0 |
2015-08-01 02:30:00 | 1004 | 0 | 24 | 0 | 33 | 1089.75 | -182.375 | -4341 | -5731.625 | 7961 | 6750.5 |
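The interpolated columns in the table can be checked by hand. The following is a minimal base R sketch, assuming plain linear interpolation between two gX values taken from the table above (00:00 and 02:00). It uses `approx()` from the stats package and is not the package's internal code.

```r
# Two gX values taken from the table above
known_time <- as.POSIXct(c("2015-08-01 00:00:00", "2015-08-01 02:00:00"),
                         tz = "UTC")
known_gX <- c(796, 1031)

# Target grid: every 30 minutes between the two known points
grid <- seq(known_time[1], known_time[2], by = "30 min")

# approx() works on numbers, so the times are converted to seconds
gX_interp <- approx(x = as.numeric(known_time),
                    y = known_gX,
                    xout = as.numeric(grid))$y

data.frame(date = grid, gX = gX_interp)
```

The intermediate values (854.75, 913.50 and 972.25) agree with the gX column of the table.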
6.1.2 No interpolation
Format the data at 5-minute intervals without interpolating anything.
6.2 Rolling window
Interpolation (especially linear interpolation) is not always advisable. An alternative for formatting data for analysis is to use a rolling window with create_rolling_window, which progresses across the whole timeseries and creates summary statistics for the data contained within a window of a given duration.
Derived variables include:
- median : Median
- sd : Standard deviation
- sum : Cumulative sum of values
- min : Minimum
- max : Maximum
- range : Range (i.e. maximum - minimum)
- cumu_diff : Cumulative difference (i.e. sum of absolute differences)
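To make these definitions concrete, here is a minimal base R sketch of a rolling window. The helper names `window_stats` and `rolling_stats` are hypothetical and the input values are invented for illustration. It follows the definitions listed above but is not the code used by create_rolling_window.

```r
# Summary statistics for the values inside one window
window_stats <- function(x) {
  c(median    = median(x),
    sd        = sd(x),
    sum       = sum(x),
    min       = min(x),
    max       = max(x),
    range     = max(x) - min(x),
    cumu_diff = sum(abs(diff(x))))
}

# Slide a window of `width` points along x, moving `step` points each time
rolling_stats <- function(x, width, step = 1) {
  starts <- seq(1, length(x) - width + 1, by = step)
  t(sapply(starts, function(i) window_stats(x[i:(i + width - 1)])))
}

# Invented readings at 15-minute intervals; 8 points correspond to a 2 h window
pit <- c(24, 24, 24, 24, 23, 24, 24, 24, 24, 24)
res <- rolling_stats(pit, width = 8)
res
```

Each row of `res` describes one window position. A single reading of 23 among 24s gives a range of 1 but a cumulative difference of 2, because the series steps down and then back up.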
6.2.1 Interpolation
Create a 2h window with summary statistics every 15 minutes. Because sensors such as the magnetometer record every 4 hours, we can avoid spaces in the dataset by interpolating between points (linearly) and then calculating summary statistics for these interpolated points.
date | pressure | light | pit | act | temperature | gX | gY | gZ | mX | mY | mZ | median_pressure | median_light | median_pit | median_act | median_temperature | median_gX | median_gY | median_gZ | median_mX | median_mY | median_mZ | sd_pressure | sd_light | sd_pit | sd_act | sd_temperature | sd_gX | sd_gY | sd_gZ | sd_mX | sd_mY | sd_mZ | sum_pressure | sum_light | sum_pit | sum_act | sum_temperature | sum_gX | sum_gY | sum_gZ | sum_mX | sum_mY | sum_mZ | min_pressure | min_light | min_pit | min_act | min_temperature | min_gX | min_gY | min_gZ | min_mX | min_mY | min_mZ | max_pressure | max_light | max_pit | max_act | max_temperature | max_gX | max_gY | max_gZ | max_mX | max_mY | max_mZ | cumu_diff_pressure | cumu_diff_light | cumu_diff_pit | cumu_diff_act | cumu_diff_temperature | cumu_diff_gX | cumu_diff_gY | cumu_diff_gZ | cumu_diff_mX | cumu_diff_mY | cumu_diff_mZ | range_pressure | range_light | range_pit | range_act | range_temperature | range_gX | range_gY | range_gZ | range_mX | range_mY | range_mZ |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
2015-08-01 00:45:00 | 1004 | 0 | 24 | 0 | 33 | 884.125 | -1449.8125 | -4621 | -3130.688 | 10197.5 | 10794.75 | 1004 | 0 | 24 | 0 | 33 | 898.8125 | -1359.2812 | -4601 | -3316.469 | 10037.75 | 10505.875 | 0.0000000 | 0 | 0.3535534 | 0 | 0 | 71.95376 | 443.5107 | 97.97959 | 910.1385 | 782.612 | 1415.193 | 8032.0 | 0 | 191 | 0 | 264 | 7190.5 | -10874.25 | -36808 | -26531.75 | 80302 | 84047 | 1004.0 | 0 | 23 | 0 | 33 | 796.000 | -1993.000 | -4741 | -4616.938 | 8919.5 | 8483.75 | 1004 | 0 | 24 | 0 | 33 | 1001.625 | -725.5625 | -4461 | -2016.000 | 11156.0 | 12528.00 | 0.0 | 0 | 2 | 0 | 0 | 205.625 | 1267.438 | 280 | 2600.938 | 2236.5 | 4044.25 | 0.0 | 0 | 1 | 0 | 0 | 205.625 | 1267.438 | 280 | 2600.938 | 2236.5 | 4044.25 |
2015-08-01 01:00:00 | 1004 | 0 | 23 | 0 | 33 | 913.500 | -1268.7500 | -4581 | -3502.250 | 9878.0 | 10217.00 | 1004 | 0 | 24 | 0 | 33 | 928.1875 | -1178.2188 | -4561 | -3688.031 | 9718.25 | 9928.125 | 0.0000000 | 0 | 0.3535534 | 0 | 0 | 71.95376 | 443.5107 | 97.97959 | 910.1385 | 782.612 | 1415.193 | 8032.0 | 0 | 191 | 0 | 264 | 7425.5 | -9425.75 | -36488 | -29504.25 | 77746 | 79425 | 1004.0 | 0 | 23 | 0 | 33 | 825.375 | -1811.938 | -4701 | -4988.500 | 8600.0 | 7906.00 | 1004 | 0 | 24 | 0 | 33 | 1031.000 | -544.5000 | -4421 | -2387.562 | 10836.5 | 11950.25 | 0.0 | 0 | 2 | 0 | 0 | 205.625 | 1267.438 | 280 | 2600.938 | 2236.5 | 4044.25 | 0.0 | 0 | 1 | 0 | 0 | 205.625 | 1267.438 | 280 | 2600.938 | 2236.5 | 4044.25 |
2015-08-01 01:15:00 | 1004 | 0 | 24 | 0 | 33 | 942.875 | -1087.6875 | -4541 | -3873.812 | 9558.5 | 9639.25 | 1004 | 0 | 24 | 0 | 33 | 957.5625 | -997.1562 | -4521 | -4059.594 | 9398.75 | 9350.375 | 0.0000000 | 0 | 0.3535534 | 0 | 0 | 71.95376 | 443.5107 | 97.97959 | 910.1385 | 782.612 | 1415.193 | 8032.0 | 0 | 191 | 0 | 264 | 7660.5 | -7977.25 | -36168 | -32476.75 | 75190 | 74803 | 1004.0 | 0 | 23 | 0 | 33 | 854.750 | -1630.875 | -4661 | -5360.062 | 8280.5 | 7328.25 | 1004 | 0 | 24 | 0 | 33 | 1060.375 | -363.4375 | -4381 | -2759.125 | 10517.0 | 11372.50 | 0.0 | 0 | 2 | 0 | 0 | 205.625 | 1267.438 | 280 | 2600.938 | 2236.5 | 4044.25 | 0.0 | 0 | 1 | 0 | 0 | 205.625 | 1267.438 | 280 | 2600.938 | 2236.5 | 4044.25 |
2015-08-01 01:30:00 | 1004 | 0 | 24 | 0 | 33 | 972.250 | -906.6250 | -4501 | -4245.375 | 9239.0 | 9061.50 | 1004 | 0 | 24 | 0 | 33 | 986.9375 | -816.0938 | -4481 | -4431.156 | 9079.25 | 8772.625 | 0.0000000 | 0 | 0.3535534 | 0 | 0 | 71.95376 | 443.5107 | 97.97959 | 910.1385 | 782.612 | 1415.193 | 8032.0 | 0 | 191 | 0 | 264 | 7895.5 | -6528.75 | -35848 | -35449.25 | 72634 | 70181 | 1004.0 | 0 | 23 | 0 | 33 | 884.125 | -1449.812 | -4621 | -5731.625 | 7961.0 | 6750.50 | 1004 | 0 | 24 | 0 | 33 | 1089.750 | -182.3750 | -4341 | -3130.688 | 10197.5 | 10794.75 | 0.0 | 0 | 2 | 0 | 0 | 205.625 | 1267.438 | 280 | 2600.938 | 2236.5 | 4044.25 | 0.0 | 0 | 1 | 0 | 0 | 205.625 | 1267.438 | 280 | 2600.938 | 2236.5 | 4044.25 |
2015-08-01 01:45:00 | 1004 | 0 | 24 | 0 | 33 | 1001.625 | -725.5625 | -4461 | -4616.938 | 8919.5 | 8483.75 | 1004 | 0 | 24 | 0 | 33 | 1016.3125 | -635.0312 | -4441 | -4802.719 | 8759.75 | 8194.875 | 0.1767767 | 0 | 0.3535534 | 0 | 0 | 71.95376 | 443.5107 | 97.97959 | 910.1385 | 782.612 | 1415.193 | 8031.5 | 0 | 191 | 0 | 264 | 8130.5 | -5080.25 | -35528 | -38421.75 | 70078 | 65559 | 1003.5 | 0 | 23 | 0 | 33 | 913.500 | -1268.750 | -4581 | -6103.188 | 7641.5 | 6172.75 | 1004 | 0 | 24 | 0 | 33 | 1119.125 | -1.3125 | -4301 | -3502.250 | 9878.0 | 10217.00 | 0.5 | 0 | 1 | 0 | 0 | 205.625 | 1267.438 | 280 | 2600.938 | 2236.5 | 4044.25 | 0.5 | 0 | 1 | 0 | 0 | 205.625 | 1267.438 | 280 | 2600.938 | 2236.5 | 4044.25 |
2015-08-01 02:00:00 | 1004 | 0 | 24 | 0 | 33 | 1031.000 | -544.5000 | -4421 | -4988.500 | 8600.0 | 7906.00 | 1004 | 0 | 24 | 0 | 33 | 1045.6875 | -453.9688 | -4401 | -5174.281 | 8440.25 | 7617.125 | 0.3720119 | 0 | 0.0000000 | 0 | 0 | 71.95376 | 443.5107 | 97.97959 | 910.1385 | 782.612 | 1415.193 | 8030.5 | 0 | 192 | 0 | 264 | 8365.5 | -3631.75 | -35208 | -41394.25 | 67522 | 60937 | 1003.0 | 0 | 24 | 0 | 33 | 942.875 | -1087.688 | -4541 | -6474.750 | 7322.0 | 5595.00 | 1004 | 0 | 24 | 0 | 33 | 1148.500 | 179.7500 | -4261 | -3873.812 | 9558.5 | 9639.25 | 1.0 | 0 | 0 | 0 | 0 | 205.625 | 1267.438 | 280 | 2600.938 | 2236.5 | 4044.25 | 1.0 | 0 | 0 | 0 | 0 | 205.625 | 1267.438 | 280 | 2600.938 | 2236.5 | 4044.25 |
6.2.2 No interpolation
However, interpolation relies on assumptions (e.g. that the data really change linearly between measurements). One option is to make the window larger than the coarsest data resolution (in this case more than 4 hours). Another is to simply leave the NAs in the data using interp = FALSE.
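The effect of window size can be illustrated without the package. This base R sketch uses invented values and a hypothetical helper, `count_obs`. It places one observation every 4 hours on a 15-minute grid, leaves the rest as NA, and counts how many real observations fall inside each window.

```r
# 12 hours on a 15-minute grid (49 points), with one observation every 4 hours
x <- rep(NA_real_, 49)
x[seq(1, 49, by = 16)] <- c(10, 20, 30, 40)

# Number of non-missing values in each window of `width` points
count_obs <- function(x, width) {
  starts <- seq_len(length(x) - width + 1)
  vapply(starts, function(i) sum(!is.na(x[i:(i + width - 1)])), numeric(1))
}

n_short <- count_obs(x, width = 8)   # 8 points: a 2 h window
n_long  <- count_obs(x, width = 17)  # 17 points: just over 4 h

any(n_short == 0)  # some 2 h windows contain no observation at all
all(n_long >= 1)   # every longer window contains at least one
```

With the shorter window some positions hold only NAs, so no statistic can be calculated for them unless the gaps are interpolated.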
6.3 Extracting statistics for specific data patterns
If working with bird data, pamlr offers some predefined functions for classifying behaviour.
Flight bouts can be characterised by:
- continuous high activity, which can be extracted from the data using create_summary_statistics( ... , method = "flap")
- endurance activity using create_summary_statistics( ... , method = "endurance")
- a pressure change greater than the background pressure changes due to weather using create_summary_statistics( ... , method = "pressure")
- a period of continuous light using create_summary_statistics( ... , method = "light")
Incubation bouts can be characterised by:
- periods of darkness using create_summary_statistics( ... , method = "darkness")
- periods of resting using create_summary_statistics( ... , method = "rest")
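# Estimate twilight times from the light data
twl = GeoLight::twilightCalc(PAM_data$light$date, PAM_data$light$obs,
                             LightThreshold = 2, ask = FALSE)
# Extract periods of sustained high activity and summarise them
TOclassify = create_summary_statistics(dta = PAM_data,
                                       method = "flap",
                                       twl = twl)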
date | start | end | duration | total_daily_duration | total_daily_event_number | cum_pressure_change | cum_altitude_change | cum_altitude_up | total_daily_P_change | P_dep_arr | pressure_range | altitude_range | mean_night_P | sd_night_P | mean_nextnight_P | sd_nextnight_P | night_P_diff | median_activity | sum_activity | prop_resting | prop_active | mean_night_act | sd_night_act | sum_night_act | mean_nextnight_act | sd_nextnight_act | sum_nextnight_act | night_act_diff | median_pitch | sd_pitch | median_light | nightime | median_gX | median_gY | median_gZ | median_mX | median_mY | median_mZ | median_temp | sd_temp | cum_temp_change |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
2015-08-01 | 2015-08-01 12:05:00 | 2015-08-01 12:30:00 | 0.4166667 | 0.9166667 | 4 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1001.938 | 0.4425306 | 1004.938 | 0.4425306 | 3.00 | 16 | 100 | 0.1666667 | 0.8333333 | 0.0210526 | 0.1443214 | 2 | 0.0107527 | 0.1036952 | 1 | 0.0102999 | 20 | 7.842194 | 9984 | 0 | NA | NA | NA | NA | NA | NA | 41 | NA | 0 |
2015-08-01 | 2015-08-01 15:10:00 | 2015-08-01 15:20:00 | 0.1666667 | 0.9166667 | 4 | 0 | 0 | 0 | 0 | 1 | NA | NA | 1001.938 | 0.4425306 | 1004.938 | 0.4425306 | 3.00 | 27 | 56 | 0.0000000 | 1.0000000 | 0.0210526 | 0.1443214 | 2 | 0.0107527 | 0.1036952 | 1 | 0.0102999 | 36 | 10.392305 | 9984 | 0 | NA | NA | NA | NA | NA | NA | NA | NA | 0 |
2015-08-01 | 2015-08-01 04:30:00 | 2015-08-01 04:40:00 | 0.1666667 | 0.9166667 | 4 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1001.938 | 0.4425306 | 1004.938 | 0.4425306 | 3.00 | 30 | 76 | 0.3333333 | 0.6666667 | 0.0210526 | 0.1443214 | 2 | 0.0107527 | 0.1036952 | 1 | 0.0102999 | 20 | 9.451631 | 424 | 0 | NA | NA | NA | NA | NA | NA | 34 | NA | 0 |
2015-08-01 | 2015-08-01 10:00:00 | 2015-08-01 10:10:00 | 0.1666667 | 0.9166667 | 4 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1001.938 | 0.4425306 | 1004.938 | 0.4425306 | 3.00 | 43 | 113 | 0.3333333 | 0.6666667 | 0.0210526 | 0.1443214 | 2 | 0.0107527 | 0.1036952 | 1 | 0.0102999 | 13 | 4.509250 | 9984 | 0 | NA | NA | NA | NA | NA | NA | 39 | NA | 0 |
2015-08-02 | 2015-08-02 11:00:00 | 2015-08-02 11:10:00 | 0.1666667 | 1.4166667 | 6 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1004.938 | 0.4425306 | 1001.188 | 0.5439056 | 3.75 | 19 | 39 | 0.3333333 | 0.6666667 | 0.0107527 | 0.1036952 | 1 | 0.0319149 | 0.2296387 | 3 | 0.0211622 | 21 | 8.144528 | 9984 | 0 | NA | NA | NA | NA | NA | NA | 40 | NA | 0 |
2015-08-02 | 2015-08-02 11:30:00 | 2015-08-02 11:50:00 | 0.3333333 | 1.4166667 | 6 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1004.938 | 0.4425306 | 1001.188 | 0.5439056 | 3.75 | 23 | 109 | 0.2000000 | 0.8000000 | 0.0107527 | 0.1036952 | 1 | 0.0319149 | 0.2296387 | 3 | 0.0211622 | 30 | 6.913754 | 9984 | 0 | NA | NA | NA | NA | NA | NA | 39 | NA | 0 |
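Each row of the table is one event with a start and an end. As a rough illustration of how runs of sustained activity can be located, here is a base R sketch using `rle()`. The activity values, `threshold` and `min_length` are invented assumptions, and the logic is not taken from create_summary_statistics.

```r
# Invented activity readings every 5 minutes
time <- seq(as.POSIXct("2015-08-01 12:00:00", tz = "UTC"),
            by = "5 min", length.out = 12)
act <- c(0, 0, 12, 14, 18, 26, 0, 0, 30, 0, 0, 0)

threshold  <- 10  # hypothetical cut-off for "high" activity
min_length <- 3   # hypothetical minimum number of consecutive readings

# Run-length encode the high/low sequence and locate each run
runs   <- rle(act > threshold)
ends   <- cumsum(runs$lengths)
starts <- ends - runs$lengths + 1

# Keep only runs of high activity that last long enough
keep <- runs$values & runs$lengths >= min_length

bouts <- data.frame(start = time[starts[keep]], end = time[ends[keep]])
bouts$duration <- as.numeric(difftime(bouts$end, bouts$start, units = "hours"))
bouts
```

Here the isolated high reading is discarded because it is shorter than `min_length`, leaving a single bout.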
These functions also calculate summary statistics for each event (e.g. flight bout).
These include:
- date : Date (without time)
- start : Start time and date of the event, in POSIXct format
- end : Time and date that the event finished, in POSIXct format
- duration : How long it lasted (in hours)
- total_daily_duration : The total duration of all the events that occurred that day (in hours)
- total_daily_event_number : The total number of events which occurred that day
- cum_pressure_change : The cumulative change in atmospheric pressure during that event (in hectopascals)
- cum_altitude_change : The cumulative change in altitude during that event (in metres)
- cum_altitude_up : The cumulative number of metres that the bird went upwards during that event
- total_daily_P_change : The cumulative change in atmospheric pressure for all the events for that date (in hectopascals)
- P_dep_arr : The difference between atmospheric pressure at the start of the event, and at the end (in hectopascals)
- pressure_range : The total range of the atmospheric pressure during that event (maximum minus minimum, in hectopascals)
- altitude_range : The total altitude range during that event (maximum minus minimum, in metres)
- mean_night_P : The mean pressure during the night before the event took place (in hectopascals)
- sd_night_P : The standard deviation of pressure the night before the event took place (in hectopascals)
- mean_nextnight_P : The mean pressure the night after the event took place (in hectopascals)
- sd_nextnight_P : The standard deviation of pressure the night after the event took place (in hectopascals)
- night_P_diff : The difference between the mean pressures of the night before and the night after the event took place (in hectopascals)
- median_activity : The median activity during that event
- sum_activity : The sum of the activity during that event
- prop_resting : The proportion of time during that event where activity = 0
- prop_active : The proportion of time during that event where activity > 0
- mean_night_act : The mean activity during the night before the event took place
- sd_night_act : The standard deviation of activity the night before the event took place
- sum_night_act : The summed activity during the night before the event took place
- mean_nextnight_act : The mean activity the night after the event took place
- sd_nextnight_act : The standard deviation of activity the night after the event took place
- sum_nextnight_act : The summed activity the night after the event took place
- night_act_diff : The difference between the mean activity of the night before and the night after the event took place
- median_pitch : The median pitch during that event
- sd_pitch : The standard deviation of pitch during that event
- median_light : The median light recordings during that event
- nightime : Whether or not it was night during the majority of the event (1 = night, 0 = day)
- median_gX : Median raw acceleration on the x axis during the event
- median_gY : Median raw acceleration on the y axis during the event
- median_gZ : Median raw acceleration on the z axis during the event
- median_mX : Median raw magnetic field on the x axis during the event
- median_mY : Median raw magnetic field on the y axis during the event
- median_mZ : Median raw magnetic field on the z axis during the event
- median_temp : Median temperature during the event (in degrees Celsius)
- sd_temp : Standard deviation of temperature during the event (in degrees Celsius)
- cum_temp_change : Cumulative absolute difference in temperature during the event (in degrees Celsius)
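A few of these statistics are simple enough to compute by hand. The sketch below uses base R on invented readings for a single event, with activity values chosen so that several results coincide with the first row of the table above. It follows the definitions in the list and is not the package's own code.

```r
# Invented readings for one event, sampled every 5 minutes
event <- data.frame(
  date     = seq(as.POSIXct("2015-08-01 12:05:00", tz = "UTC"),
                 by = "5 min", length.out = 6),
  act      = c(0, 12, 14, 18, 26, 30),
  pressure = c(1004, 1004, 1003, 1003, 1004, 1004)
)

event_stats <- data.frame(
  start           = min(event$date),
  end             = max(event$date),
  # Time between first and last reading, in hours
  duration        = as.numeric(difftime(max(event$date), min(event$date),
                                        units = "hours")),
  median_activity = median(event$act),
  sum_activity    = sum(event$act),
  # Share of readings with no activity, and with any activity
  prop_resting    = mean(event$act == 0),
  prop_active     = mean(event$act > 0),
  # Maximum minus minimum pressure, in hectopascals
  pressure_range  = max(event$pressure) - min(event$pressure)
)
event_stats
```

The duration (about 0.417 hours), median activity (16), summed activity (100) and the resting and active proportions (1/6 and 5/6) agree with the first event in the table. The pressure values are purely illustrative.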