We have a table using TWCS with a 5-day TTL and a 1-day gc_grace_seconds.
compaction = {'class': 'org.apache.cassandra.db.compaction.TimeWindowCompactionStrategy', 'compaction_window_size': '12', 'compaction_window_unit': 'HOURS', 'max_threshold': '32', 'min_threshold': '4'}
This would mean the table should have 10 SSTables, since the compaction window size is 12 hours (2 SSTables per day with a 5-day TTL).
But when we run sstablemetadata we see 19 SSTables on disk, and an SSTable is not deleted even when its maxDeletionTime shows yesterday's date. What could be the reason for this?
There are a couple of potential obvious reasons this can happen:
unsafe_aggressive_sstable_expiration is an option here, but it should not be used without properly understanding whether it applies and whether it is safe in your specific instance.
It's possible that you've got data mixed between SSTables. Do you have read repairs enabled, or do you potentially mix your writes by issuing updates? If so, you may have timestamp overlaps in your SSTables, which will cause Cassandra to block dropping expired SSTables. You can check if this is the case with sstableexpiredblockers.
https://cassandra.apache.org/doc/stable/cassandra/tools/sstable/sstableexpiredblockers.html
You can also directly check for overlap between your SSTables by comparing the min and max timestamps. Check this post from The Last Pickle that uses sstablemetadata.
https://thelastpickle.com/blog/2016/12/08/TWCS-part1.html
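To make the overlap check concrete, here is a rough sketch of the comparison in Python. It is a simplified model, not Cassandra's actual code: the SSTableInfo class, the find_blockers function and the sample numbers are all made up for illustration. You would fill in the minimum timestamp, maximum timestamp and max deletion time by hand from the sstablemetadata output for each file. It assumes an SSTable counts as fully expired once its max deletion time is older than gc_before (taken here as now minus gc_grace_seconds), and that a live SSTable blocks it when the live one's minimum timestamp is not newer than the expired one's maximum timestamp.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SSTableInfo:
    """Hypothetical record holding values copied from sstablemetadata output."""
    name: str
    min_timestamp: int      # oldest write timestamp in the SSTable
    max_timestamp: int      # newest write timestamp in the SSTable
    max_deletion_time: int  # latest expiry time of any cell, seconds since epoch


def find_blockers(sstables, gc_before):
    """Map each fully expired SSTable name to the live SSTables that overlap it.

    Simplified model (an assumption, not Cassandra's code): an SSTable is
    treated as fully expired when max_deletion_time < gc_before, and a live
    SSTable blocks it when live.min_timestamp <= expired.max_timestamp.
    """
    expired = [s for s in sstables if s.max_deletion_time < gc_before]
    live = [s for s in sstables if s.max_deletion_time >= gc_before]
    blocked = {}
    for old in expired:
        blockers = [s.name for s in live if s.min_timestamp <= old.max_timestamp]
        if blockers:
            blocked[old.name] = blockers
    return blocked


if __name__ == "__main__":
    # Made-up sample values; real ones come from your own SSTables.
    tables = [
        SSTableInfo("sstable-0", 10, 90, 500),
        SSTableInfo("sstable-1", 100, 200, 1_000),
        SSTableInfo("sstable-2", 201, 300, 2_000),
        SSTableInfo("sstable-3", 150, 400, 9_000),  # still live, overlaps 1 and 2
    ]
    for name, blockers in find_blockers(tables, gc_before=5_000).items():
        print(name, "is blocked by", blockers)
```

In this sample, sstable-0 is expired and has no overlap, so it could be dropped, while sstable-1 and sstable-2 are expired but held back by sstable-3. If your real numbers show the same pattern, overlap is the likely reason the expired files are still on disk.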
As mentioned by Andrew, you can have Cassandra ignore the overlap by setting unsafe_aggressive_sstable_expiration, which will delete old SSTables once they expire. I've found this very helpful for managing TWCS tables, but it can cause deleted data to reappear, so make sure you fully understand it before enabling it.