Manage storage of Flow tables

Tables created in the Athena warehouse are external tables. Whenever a user drops and recreates a table, a new directory is created in the S3 bucket and the old directory is not deleted. As a result, the number of directories grows each time a table is recreated, which increases storage usage in the S3 bucket.

To manage table storage, Sprinkle provides the following option:

  • When creating the table, explicitly define its storage path in the query.

  • Enter the same location used in the query in the Cleanup tab of the flow settings. Before each run, the flow checks the Cleanup tab; if a path is provided, it deletes that path and then creates the table again.

  • If a flow uses managed data loading, the issue affects only the staging table, not the final table, so the user needs to follow the same procedure for the staging table as for normal flows.
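The procedure above can be sketched as follows. This is a minimal illustration, not Sprinkle's implementation: it assumes the flow query is an Athena CTAS statement that sets the path through the `external_location` property, and the bucket, table names, and helper functions (`normalize_s3_path`, `build_ctas`, `matches_cleanup_path`) are hypothetical. It shows an explicit path in the query and a check that the path entered in the Cleanup tab points to the same location.

```python
def normalize_s3_path(path):
    """Return the S3 path with exactly one trailing slash (hypothetical helper)."""
    if not path.startswith("s3://"):
        raise ValueError("expected an s3:// path")
    return path.rstrip("/") + "/"


def build_ctas(table, select_sql, location):
    """Build a CTAS statement with an explicit storage path.

    Assumption: the flow creates its table with Athena CTAS, where
    external_location sets the directory the table data is written to.
    """
    location = normalize_s3_path(location)
    return (
        f"CREATE TABLE {table}\n"
        f"WITH (external_location = '{location}', format = 'PARQUET')\n"
        f"AS {select_sql}"
    )


def matches_cleanup_path(query_location, cleanup_path):
    """Check that the Cleanup tab path is the same location as in the query."""
    return normalize_s3_path(query_location) == normalize_s3_path(cleanup_path)


# Hypothetical bucket and table names, for illustration only.
TABLE_LOCATION = "s3://example-bucket/flows/daily_orders"

ddl = build_ctas("daily_orders", "SELECT * FROM raw_orders", TABLE_LOCATION)
print(ddl)

# The value entered in the Cleanup tab should resolve to the same path.
print(matches_cleanup_path(TABLE_LOCATION, "s3://example-bucket/flows/daily_orders/"))
```

Because the same path is cleaned before every run, each recreation of the table reuses one directory instead of adding a new one.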
