Based on the AWS documentation, to estimate SPICE size, each date and numeric value uses 12 bytes × no. of rows, and each text value uses (24 bytes + text length) × no. of rows.
My questions are:
For text length, is that the declared size of the field or the size of the actual value in the field?
If I have two data sources linked in a dataset, one with 2 text columns (varchar 10) and 1,000 rows, joined to a dataset with 2 text columns (varchar 20) and 100 rows, how do I calculate the estimated size? Is it (34 × 2 × 1,000) + (44 × 2 × 100), or (34 × 2 + 44 × 2) × 1,000? i.e., is all the data in a dataset combined into a single tabular file?
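For reference, assuming each varchar is filled to its declared length (so a varchar(10) value costs 24 + 10 = 34 bytes and a varchar(20) value costs 24 + 20 = 44 bytes), the first reading works out to (34 × 2 × 1,000) + (44 × 2 × 100) = 68,000 + 8,800 = 76,800 bytes, and the single-table reading to (34 + 34 + 44 + 44) × 1,000 = 156,000 bytes.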
This should have been posted in the Q&A section. The following looks like the formula; I am not sure what to put for the UTF-8 encoded character length per field.
The number of rows in your dataset depends on your join, and the number of columns in your example likewise depends on which columns you selected to keep in the dataset.
Total logical row size in bytes =
    (Number of Numeric Fields × 8 bytes per field)
  + (Number of Date Fields × 8 bytes per field)
  + (Number of Text Fields × (24 bytes + UTF-8 encoded character length per field))

Total bytes of data = Number of rows × Total logical row size in bytes
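As a minimal sketch (not an official calculator), here is that formula in Python, applied to the example above. It assumes the join produces 1,000 rows, all four text columns are kept, and each varchar value uses its full declared length; the function name and parameters are just for illustration.

```python
def logical_row_size(numeric_fields=0, date_fields=0, text_lengths=()):
    """Bytes per logical row: 8 per numeric field, 8 per date field,
    and 24 + UTF-8 character length per text field."""
    return (numeric_fields * 8
            + date_fields * 8
            + sum(24 + length for length in text_lengths))

# The joined dataset from the question: two varchar(10) columns plus
# two varchar(20) columns in a single 1,000-row tabular result.
row_bytes = logical_row_size(text_lengths=(10, 10, 20, 20))  # 34 + 34 + 44 + 44 = 156
total_bytes = 1_000 * row_bytes
print(total_bytes)  # 156000 bytes (~152 KB)
```

If the join instead kept the two sources as separate tables, you would run the same per-row calculation once per table with that table's own row count, which is the difference between the two readings in the question.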