The first step in Amazon S3 monitoring is to check the current state of your S3 buckets and how fast they grow. You can easily get this information from the CloudWatch console, or by running an AWS CLI command or an AWS SDK script.
Bucket Size
Here is an example of an AWS CLI command that gets the size of a bucket for every day within the --start-time and --end-time date range:
    aws cloudwatch get-metric-statistics \
      --metric-name BucketSizeBytes --namespace AWS/S3 \
      --start-time 2018-10-01T00:00:00Z --end-time 2018-10-08T00:00:00Z \
      --statistics Maximum --unit Bytes --region us-east-1 \
      --dimensions Name=BucketName,Value=cloudsqale Name=StorageType,Value=StandardStorage \
      --period 86400 --query 'Datapoints[*].[Timestamp, Maximum]' \
      --output text | sort | python cloudwatch_s3_metrics.py
I use a simple Python script, cloudwatch_s3_metrics.py, to format the data and calculate the bucket growth per day:
    import sys

    prev = None
    print("Date\tBucket Size\tGrowth per Day")

    for i in sys.stdin:
        line = i.split('\t')
        # Using 1000 instead of 1024 to match CloudWatch metrics
        cur = float(line[1]) / (1000 * 1000 * 1000 * 1000)
        # The first row has no previous day, so its growth is reported as nan
        diff = (cur - prev) if prev is not None else float('nan')
        prev = cur
        print(line[0][0:10] + '\t' + "%.3f TB" % cur + '\t' + "%.3f TB" % diff)
So the sample output is as follows:
    Date          Bucket Size    Growth per Day
    2018-10-01    2087.301 TB    nan
    2018-10-02    2099.817 TB    12.516 TB
    2018-10-03    2117.809 TB    17.992 TB
    2018-10-04    2138.358 TB    20.549 TB
    2018-10-05    2158.940 TB    20.582 TB
    2018-10-06    2179.499 TB    20.559 TB
    2018-10-07    2203.798 TB    24.299 TB
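The day-over-day calculation done by the formatting script can also be wrapped in a small reusable function, which makes it easier to test without calling AWS. The following is only a minimal Python 3 sketch: the function name `daily_growth` and its `divisor` parameter are illustrative assumptions, not part of any AWS tool, and the input is assumed to be already sorted by timestamp (as `| sort` does in the pipeline above).

```python
import math


def daily_growth(lines, divisor=1.0):
    """Turn 'timestamp<TAB>value' lines into (date, value, growth) tuples.

    divisor rescales the raw value, e.g. 1000**4 converts bytes to TB.
    Lines are assumed to be sorted by timestamp already.
    The first row has no previous day, so its growth is NaN.
    """
    rows = []
    prev = None
    for raw in lines:
        raw = raw.strip()
        if not raw:
            continue  # skip blank lines
        timestamp, value = raw.split("\t")[:2]
        cur = float(value) / divisor
        diff = cur - prev if prev is not None else math.nan
        rows.append((timestamp[:10], cur, diff))
        prev = cur
    return rows


if __name__ == "__main__":
    # Made-up byte values, chosen to mirror the first two days of the sample output
    sample = [
        "2018-10-01T00:00:00Z\t2087301000000000.0",
        "2018-10-02T00:00:00Z\t2099817000000000.0",
    ]
    for date, size, growth in daily_growth(sample, divisor=1000 ** 4):
        print(f"{date}\t{size:.3f} TB\t{growth:.3f} TB")
```

Passing `divisor=1000 ** 4` reproduces the TB conversion used for the bucket size, while the default `divisor=1.0` leaves raw counts untouched, so the same function could serve both metrics.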
Number of Objects
Now let’s check the number of objects in the bucket:
    aws cloudwatch get-metric-statistics \
      --metric-name NumberOfObjects --namespace AWS/S3 \
      --start-time 2018-10-01T00:00:00Z --end-time 2018-10-08T00:00:00Z \
      --statistics Maximum --unit Count --region us-east-1 \
      --dimensions Name=BucketName,Value=cloudsqale Name=StorageType,Value=AllStorageTypes \
      --period 86400 --query 'Datapoints[*].[Timestamp, Maximum]' \
      --output text | sort | python cloudwatch_s3_metrics_obj.py
Again, I use a simple Python script, cloudwatch_s3_metrics_obj.py, to format the data and calculate the growth of the number of objects per day:
    import sys

    prev = None
    print("Date\tNumber of Objects\tGrowth per Day")

    for i in sys.stdin:
        line = i.split('\t')
        cur = float(line[1])
        # The first row has no previous day, so its growth is reported as nan
        diff = (cur - prev) if prev is not None else float('nan')
        prev = cur
        print(line[0][0:10] + '\t' + "%.0f" % cur + '\t' + "%.0f" % diff)
Here is my sample result:
    Date          Number of Objects    Growth per Day
    2018-10-01    8851954              nan
    2018-10-02    8912936              60982
    2018-10-03    8975252              62315
    2018-10-04    9031277              56025
    2018-10-05    9078046              46768
    2018-10-06    9129067              51020
    2018-10-07    9170534              41467
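The same two metrics can also be read from an AWS SDK script instead of the CLI. Below is a minimal sketch using boto3's `get_metric_statistics` call, with the same namespace, dimensions, period and statistic as the commands above. The helper names `fetch_daily_metric` and `main` are my own illustrations, and the sketch assumes boto3 is installed and AWS credentials are configured.

```python
from datetime import datetime, timezone


def fetch_daily_metric(client, bucket, metric_name, storage_type, unit, start, end):
    """Return [(timestamp, maximum), ...] sorted by time for one S3 storage metric.

    client is expected to behave like a boto3 CloudWatch client.
    """
    response = client.get_metric_statistics(
        Namespace="AWS/S3",
        MetricName=metric_name,
        Dimensions=[
            {"Name": "BucketName", "Value": bucket},
            {"Name": "StorageType", "Value": storage_type},
        ],
        StartTime=start,
        EndTime=end,
        Period=86400,  # one datapoint per day
        Statistics=["Maximum"],
        Unit=unit,
    )
    # Datapoints are not returned in chronological order, so sort them,
    # which is the role `| sort` plays in the CLI pipeline
    points = sorted(response["Datapoints"], key=lambda p: p["Timestamp"])
    return [(p["Timestamp"], p["Maximum"]) for p in points]


def main():
    """Hypothetical entry point; call it once AWS credentials are configured."""
    import boto3  # third-party AWS SDK, imported here so the helper stays testable

    cloudwatch = boto3.client("cloudwatch", region_name="us-east-1")
    start = datetime(2018, 10, 1, tzinfo=timezone.utc)
    end = datetime(2018, 10, 8, tzinfo=timezone.utc)
    points = fetch_daily_metric(
        cloudwatch, "cloudsqale", "NumberOfObjects", "AllStorageTypes", "Count", start, end
    )
    for timestamp, value in points:
        print(f"{timestamp:%Y-%m-%d}\t{value:.0f}")
```

Passing the client in as an argument keeps the helper independent of real AWS access, so it can be exercised with a stub object. For the bucket size, the same helper would be called with `"BucketSizeBytes"`, `"StandardStorage"` and `"Bytes"`.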
Once you know how your S3 storage behaves overall, it is time to see what actually drives this growth.