A scheduled dashboard (weekly & monthly) with automatic dates for queries and title

Hello,

Newbie to AWS, I would really appreciate any help, advice, links to tutorials/doc, … on the following task:

Create a scheduled dashboard (weekly & monthly, with basically counts and avg values) with automatic dates for queries and title .

I tried to find information on AWS web, Google, but didn’t found any useful/understandable tips yet.

So far my pipeline is manual and on demand as follow:

  • Data in S3 bucket , constantly updated, are conversation metrics, nested JSON format and partitioned (YYYY/MM/DD/HH).
    I’m only interested in one field of the JSON: “detail”, a structure containing a list of records for some of which I need to calculate some avg and counts.
  • A glue crawler (on demand so far, I know how to schedule it) builds the meta-table of the data (which structure can change with time).
  • Athena to test my SQL queries: ok
  • QuickSight : direct SQL queries, because apparently QuickSight can’t read JSON with a structure containing a list of record (so bad!)

Any possibility to get the nested JSON structure “detail” directly to Spice?

  • Analysis and dashboard on demand, ok

But, how can I schedule it with automatic dates (last week & month) & title?

Ideally, I would like to QuickSight to publish a dashboard on a weekly basis with automatic Glue Crawler, date-queries & title.

More specifically:

  • Is it possible to trigger Glue-crawler update when QuickSight is actualized?
  • How to pass to QuickSight SQL queries with var date?
  • How to input var date into QuickSight dashboard title?

Maybe my pipeline is not adequate, which would be best?

I’m actually looking at Lambda function documentation.

Bonus question: Timestamp

All my data are UTC time stamped, but my questions should be us-Oregon: What is the best way to deal with that.

PS: okay for coding in python, SQL, but hate JSON Java {}

Best and thanks,

Lionel

Create a scheduled dashboard (weekly & monthly, with basically counts and avg values) with automatic dates for queries and title .

Here is a way to automate sending reports. Scheduling and sending reports by email - Amazon QuickSight

For displaying dashboards on the week / month you can use filters. Filtering data in Amazon QuickSight - Amazon QuickSight

In regards to automatic filters and titles you can look into parameters and rolling dates.

For your pipeline why don’t you create the tables / views in Athena? Athena can parse JSON. By doing that you can structure the data into a tabular format that quicksight can read and use it in SPICE.

https://prestodb.io/docs/current/functions/json.html

If you have nested arrays you can look to this.

This way you can get the nested JSON into quicksight.

Lastly, in regards to your timestamp question you can handle that in Athena / SQL.

Here is a field that we use to cast to America/Chicago time.

date_add(‘hour’,timezone_hour(now() AT TIME ZONE ‘America/Chicago’),date_field_from_table) date_field_chicago

Hope this helps

2 Likes

Thank you for your help @Max, much appreciated. If I may ask,

how can I create Athena tables/views that dynamically grabs dates/timestamp for the S3 data partitioned by day and hour. As new data hits S3, to have it in a way that QuickSight can refresh and update visuals with new data.

What options are available to achieve this?

Hello Max,

Thanks a lot for your insightful answers.

Even if I’m a newbie to AWS, I already checked “scheduling & sending reports by email”, “Filtering data in QuickSight”. I guess I’m still missing a global understanding of QuickSight possibilities. Step by step learning!

“Parameters in QuickSight” seems to be what I’m looking for, checking your link and doc.

For my pipeline, I’m actually using a direct SQL (so implicitly Athena) query (and calculation of count & avg) to get the JSON part I need. I’m cheeking if it’s better/easier to do the calculations in my SQL query or directly in QuikSight/SPICE.

Thanks for the timestamp tip.

Best regards,

Lionel

Are your tables partitioned by day / hour?

In that case you can select day from your table.

Select day,hour FROM table

Here is a helpful guide.

1 Like

I always prefer to move as much logic as you can into SQL.

In regards to where you run that SQL I prefer creating custom sql in quicksight because it’s easier for me to debug / test out new queries in QuickSight instead of having to go between two different services.

Hope that helps!

Thank you, my tables are partitioned by day and hour. what do you mean by select day,hour FROM table?

If they are partitioned you can just select by the values that you partioned by it.

If your tables ends like this

PARTITIONED BY (day string, hour string)

Then you can select from day / hour.

SELECT day, hour FROM table

Thank you @Max , my question is how can the Athena table automatically detect new S3 folders/partitions?

Also, Is there a way to pass the date from QuickSight to Athena table, so that the table queries S3 using the date passed from QuickSight.

Something like:
The following will be part of Athena table or View

WHERE day={FILTER COMING FORM QUICKSIGHT} AND hour={FILTER COMING FORM QUICKSIGHT}

Thank you for your responses!

1 Like

For new ones there are a couple of ways.

If you set up your folders correctly you can run this command to automatically add partitions to all folders.

However, this would need to be scheduled in some sort of way.

I would suggest using partition projects to automatically add new partitions.


Next, for your question of passing to a query, you can do that now! But only with direct query. This is in the edit dataset section.

Other options are setting filters on your dashboard / analysis to be rolling / static.

Hi Max,

Something I’d like to do!
But is that in QuickSight / DataSet / Edit SQL query, as your snapshot seems to show. In mine (dataset direct SQL query) I don’t have the same view:

imagen

I’m missing something or just lost because newbie?

Thanks,
Lionel

And also I thought that Athena do not actually support timezome

Athena does support time zones. However, for my use case I was looking to change the time zone based on a user selection as that is not available in quicksight yet.

In regards to you not seeing it, I believe this is only for the beta / preview version of quicksight. You should be able to reach out to your account rep for QuickSight to get you access to this.

Ok Thanks for the Athena “AT TIME ZONE” info.

I’m actually also looking to change the time zone based on a user selection in QuickSight:

  • I was trying to change the UTC times to a user selected timezome in QuickSight, but no solution yet: I now i can calculate a new field from my UTC time, adding/subtracting hours, but it’s not correct as UTC doesn’t have summer/winter time!
  • I will give a go with passing parameters (time zone) from QuickSight (beta) to direct SQL query.

Thanks for your useful advices.

1 Like