Sankey diagram node is in the middle of the graph

I created a Sankey diagram to show the redirections between sources and targets. The dataset looks like this:

In QuickSight, I use the Redirected From column as the Source and the Redirected To column as the Destination. However, one node stays in the middle of the graph, as in the screenshot below (the red node):

Is there any way to format the node so that it is displayed separately on the source side and on the destination side?

Thanks!

Hi @kiko, this is due to the cyclic nature of the data. Please refer to this documentation. If that does not help, please create a sample dashboard with a sample dataset that shows the problem using Arena and post it here. Details on using Arena can be found here - QuickSight Arena

Working with cyclical data

Sometimes, the data that you use for a Sankey diagram contains cycles. For example, suppose that you’re visualizing user traffic flows between pages on a website. You might discover that users who come to page A move to page E, and then come back to page A. An entire flow might look something like A-E-A-B-A-E-A.

When your data contains cycles, the nodes in each cycle are repeated in QuickSight. For example, if your data contains the flow A-E-A-B-A-E-A, the following Sankey diagram is created.

[Image: sankey-diagram-5]
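To see why a node ends up between the two sides, it can help to break the example flow into its source/destination pairs. The following is a minimal Python sketch, not anything QuickSight itself runs; the helper name `flow_to_edges` is made up for illustration. It splits A-E-A-B-A-E-A into consecutive pairs and lists the nodes that occur on both sides.

```python
from collections import Counter


def flow_to_edges(flow: str, sep: str = "-") -> list[tuple[str, str]]:
    """Split a flow such as 'A-E-A' into consecutive (source, destination) pairs."""
    pages = flow.split(sep)
    return list(zip(pages, pages[1:]))


# The example flow from the documentation excerpt above.
edges = flow_to_edges("A-E-A-B-A-E-A")

# How often each direct transition occurs.
edge_counts = Counter(edges)

# Nodes that appear both as a source and as a destination.
sources = {src for src, _ in edges}
destinations = {dst for _, dst in edges}
nodes_in_both = sources & destinations

for (src, dst), count in edge_counts.items():
    print(f"{src} -> {dst}: {count}")
print("In both columns:", sorted(nodes_in_both))
```

In this flow every node (A, B, and E) appears as both a source and a destination, which is the situation the documentation describes as cyclical.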

We hope this solution worked for you. Let us know if this is resolved. And if it is, please help the community by marking this answer as a “Solution.” (click the check box under the reply)

Hi @Xclipse, thank you for your reply! Yes, the data is cyclical, but is there any way to show only A-E and E-B separately if the data is A-E-B? We only want to consider the Redirected From and Redirected To columns separately.

Hi @kiko,

To focus on displaying only the direct connections like A-E and E-B in your Sankey diagram, while excluding any indirect cycles, you can consider adjusting your dataset to strictly follow the flow from “Redirected From” to “Redirected To” without tracking the return to a previous node. Here are a few steps you can take to achieve this in QuickSight:

  1. Data Preparation: Make sure your dataset is structured in such a way that it only includes direct redirects. Each row should represent a direct connection from one page to another without any intermediary steps back to a previous page. For example, if your raw data includes paths like A-E-B and B-A, modify it to only include A-E and E-B.
  2. Use Calculated Fields: You can create a calculated field to identify and filter out any rows where a cyclical pattern is detected. For instance, create a flag in your data preparation process that checks if a destination (current row) becomes a source (next row) for the same node.
  3. Adjust the Visualization Settings: In the settings of your Sankey diagram, make sure to arrange the nodes in a manner that emphasizes the flow from start to finish without loops. You might need to manually adjust the order of the nodes or tweak the visualization properties to make the direct paths more apparent.
  4. Apply Filters in QuickSight: Apply filters directly in your QuickSight analysis to exclude any data points that represent return paths. This can be done by setting up filters that compare the source and destination fields to ensure they do not form a loop within the same session or user journey.

By following these steps, your Sankey diagram should more clearly reflect the direct transitions between pages, making it easier to interpret the flow of user movements without the distraction of cyclical data. If you encounter any specific issues while implementing these changes, feel free to ask for more detailed guidance!
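As a rough illustration of the data preparation and filtering steps above, here is a minimal Python sketch that could run before the data is loaded into QuickSight. It is not QuickSight calculated-field syntax, and the function names (`path_to_pairs`, `reaches`, `drop_return_edges`) are hypothetical. It assumes the raw data is available as path strings such as A-E-B, expands them into direct pairs, and drops any pair that would lead back to a node already upstream of it.

```python
def path_to_pairs(path, sep="-"):
    """Expand a path such as 'A-E-B' into direct pairs: [('A', 'E'), ('E', 'B')]."""
    pages = path.split(sep)
    return list(zip(pages, pages[1:]))


def reaches(graph, start, goal):
    """Return True if goal can be reached from start along the kept edges."""
    stack, seen = [start], set()
    while stack:
        node = stack.pop()
        if node == goal:
            return True
        if node in seen:
            continue
        seen.add(node)
        stack.extend(graph.get(node, ()))
    return False


def drop_return_edges(pairs):
    """Keep pairs in order, skipping any pair that would close a cycle.

    Repeated forward pairs are kept, so counts per transition are preserved.
    """
    graph = {}
    kept = []
    for src, dst in pairs:
        # A pair is a return path if dst already leads back to src.
        if src == dst or reaches(graph, dst, src):
            continue
        graph.setdefault(src, set()).add(dst)
        kept.append((src, dst))
    return kept


# Example from step 1: raw paths A-E-B and B-A.
raw_paths = ["A-E-B", "B-A"]
pairs = [pair for path in raw_paths for pair in path_to_pairs(path)]
direct = drop_return_edges(pairs)
print(direct)
```

With the example from step 1, the pair B-A is dropped and only A-E and E-B remain. Which pair counts as the return path depends on row order in this sketch, so the input would need to be sorted in a meaningful order (for example by timestamp, if one exists) before filtering.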

Hi @Xclipse Thank you for your response. It worked for me. Thanks!