I have a dataset consisting of timestamp, person_id and acceleration. The data is regular csv data. I am having trouble figuring out how to only work with the last minute of data.
I have around 25 accelerometer data points per second. I want to count how many times those data points are outside a threshold, within the last recorded minute. I cant filter the time, as I don’t have any set times for recording. So for each person_id, I want to get roughly the last 1500 rows, based on timestamp and count how many times acc > 0.3 for example.
Welcome to QuickSight Community , sounds like interesting use case. Could you please upload sample data and sample visual in the QuickSight Arena and share it to us to work together on this usecase.
You want to find out the latest recorded datapoints for every person_id and for each person-id there will be many datapoints with the same latest recorded timestamp. Wondering is the threshold is static number(0.3) or any logic needs to be applied and computed for every person-id?
To count the number of times the acceleration deviates (don’t be confused be the name. It doesn’t count pr. second here). As you can see, it aggregates over the whole day. I want it over the last minute, or 250 measurements.
You can see what I mean in the above sheet.
So the green might be the latest 5 measurements, and because 3/5 are over 0.3, the patient is moving. The yellow might be the previous 5 measurements and as 3/5 is under 0.3, the patient is classified as non moving.
I want to continuously evaluate the newest 5 measurements and set “is_moving” based on that.
Thank you for reaching out to Quicksight Community.
In this case you can use countOver function.
Since you would want to use two conditions here :
calculate only for last one minute
calculate acc > 0. 3
you can create two calculated fields (ifelse function) for above two.
Eg : 1. acc_condition = ifelse( acc > 0.3,1,0)
2. is_one_minute = ifelse condition based on date.
Now you can use these two calculated fields as partition by under first countOver function.
Let us know if this works.
Step 1: Calculate minute against all records based on timestamp field. You can truncate your timestamp to the nearest minute.
Step 2: Identify the latest minute. Since you are looking for a dynamic approach, you can use maxover to fetch the latest minute timestamp. You should use the person id as partition since you intend to calculate for each person.
Step 3: write the conditions as @rajencho mentioned above. ifelse( truncated timestamp = max timestamp and deviation>0.2, return person_id, else NULL). Since you have calculated deviation already, have not broken down that step.
Step 4. Now to calculate no. of times deviated, you can do count of new field from step 3.
Also, a sample dataset upload in the Arena can help us share with you the solution.
I got over my lazyness and tried out arena. It seems good to learn, so thanks for the push haha. Here is a link: last_minute_of_data_sample
The table I included has all the times in the dataset and all the people, but as mentioned above, I want one row for each person id, with a label “has_moved” (or something like that). This label should be true if “g_deviation” > 0.3 50% of the time within the last minute available for the person (just as a concrete example. I might tweek the specifics).
I tried the countOver, but it seems like I’ve used it wrong.
I would also love if you could explain how I truncate time and calculate over time in more detail, this is the part I’m struggling with.
Oh and what if i want the last 5 minutes for example. I can see that the last minute in my data is never really a full minute, as I will almost always stop recording data in the middle of a minute.
EDIT - Solved:
If I want to take the last 5 min i just add a dateDiff check.
last_5_min_deviation_count:
count(
ifelse(
dateDiff(truncDate('MI', time), latest_minute_by_patient, 'MI') <= 5
and g_deviation > g_deviation_threshold,
time,
NULL
)
)