Data Integrity checking
With S3 Browser you can reliably upload and download your files to and from Amazon S3.
S3 Browser supports data integrity checks that verify data was not corrupted while traversing the network.
Data Integrity checking for Uploads
To enable data integrity checking for uploads:
1. Click Tools, Options
2. The Options dialog will open. Switch to the Data Integrity tab.
3. Tick the Perform data integrity check when uploading files checkbox.
4. Click Save Changes.
When this option is turned on, S3 Browser calculates an SHA256 hash for each file you upload
and sends this hash to Amazon S3. Amazon S3 calculates the hash on the server side
and compares it with the hash provided by S3 Browser. If the hashes do not match, Amazon S3 returns an error
and the file is not written to Amazon S3.
S3 Browser also writes the calculated hash to a custom metadata header (x-amz-meta-sha256)
so that data integrity can be checked when you download the file.
For this reason, enabling this option is strongly recommended if you would like to
verify data integrity when downloading files (see below).
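The client-side part of the upload check can be sketched as follows. This is a minimal illustration, not S3 Browser's actual implementation: the helper names sha256_hex and upload_metadata are hypothetical, and it assumes the hash is stored in x-amz-meta-sha256 as a lowercase hexadecimal string.

```python
import hashlib


def sha256_hex(path, chunk_size=1024 * 1024):
    """Return the SHA-256 digest of a file as a lowercase hex string.

    The file is read in chunks so that large files do not need to fit in memory.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def upload_metadata(path):
    """Build the custom metadata header that carries the hash.

    Hypothetical helper: the hex encoding of the value is an assumption.
    """
    return {"x-amz-meta-sha256": sha256_hex(path)}
```

The resulting header would be sent along with the upload request, so that the same value can be read back when the file is downloaded.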
Data Integrity checking for Downloads
To enable data integrity checking for downloads:
1. Click Tools, Options
2. The Options dialog will open. Switch to the Data Integrity tab.
3. Tick the Perform data integrity check when downloading files checkbox.
4. Click Save Changes.
When this option is turned on, S3 Browser calculates the hash of each downloaded file and compares it with the hash
returned by the server (Amazon S3). If the hashes do not match, S3 Browser reports an error and the file is not written to the local disk.
There are two modes of data integrity checking for downloaded files:
In Flexible mode, S3 Browser performs the data integrity test only for files for which a valid hash is provided
by the server. Files with unknown or missing hashes are not checked and are treated as valid.
In Strict mode, S3 Browser checks all files. Files with unknown or missing hashes are treated as corrupted.
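The difference between the two modes can be expressed as a small decision function. This is only a sketch of the behavior described above: the function name check_download and the mode strings "flexible" and "strict" are assumptions made for illustration.

```python
def check_download(expected_hash, actual_hash, mode="flexible"):
    """Return True if the downloaded file should be accepted.

    expected_hash is the hash provided by the server, or None when the
    hash is unknown or missing. actual_hash is the hash calculated
    locally from the downloaded data.
    """
    if mode not in ("flexible", "strict"):
        raise ValueError("mode must be 'flexible' or 'strict'")
    if expected_hash is None:
        # No usable hash from the server: Flexible mode accepts the file,
        # Strict mode treats it as corrupted.
        return mode == "flexible"
    # A hash is available: both modes require an exact match.
    return expected_hash.lower() == actual_hash.lower()
```

For example, check_download(None, "ab12", mode="strict") rejects the file, while the same call with mode="flexible" accepts it.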
More details about data integrity checking
S3 Browser uses the following approach to get the hash of a remote file:
First, it attempts to read the hash from the x-amz-meta-sha256 custom metadata header.
If the header contains a 256-bit hash string, S3 Browser calculates an SHA256 hash of the downloaded file and compares the hashes.
If no valid SHA256 hash is found, it attempts to read the hash from the x-amz-meta-md5
custom metadata header (this header was used by old versions of S3 Browser to store an MD5 hash of the file).
If the header contains a 128-bit hash string, S3 Browser calculates an MD5 hash of the downloaded file and compares the hashes.
If the x-amz-meta-sha256 and x-amz-meta-md5 headers are missing or do not contain valid hashes,
it falls back to the ETag header. Amazon S3 uses this header to store an MD5 hash for files uploaded via
single-part uploads. If a valid hash is found, S3 Browser calculates an MD5 hash of the downloaded file and compares the hashes.
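The lookup order described above can be sketched as follows. This is an illustration under stated assumptions, not S3 Browser's code: the function name pick_remote_hash is hypothetical, hashes are assumed to be hexadecimal strings (64 characters for SHA256, 32 for MD5), and the response headers are assumed to be available as a plain dictionary with exactly these key spellings.

```python
import re

_HEX = re.compile(r"[0-9a-fA-F]+")


def _is_hex(value, length):
    """True if value is a hex string of exactly the given length."""
    return (
        isinstance(value, str)
        and len(value) == length
        and _HEX.fullmatch(value) is not None
    )


def pick_remote_hash(headers):
    """Return (algorithm, hex_hash) for the remote file, or None.

    The order follows the description: x-amz-meta-sha256 first,
    then x-amz-meta-md5, then the ETag header.
    """
    sha256 = headers.get("x-amz-meta-sha256")
    if _is_hex(sha256, 64):
        return ("sha256", sha256.lower())

    md5 = headers.get("x-amz-meta-md5")
    if _is_hex(md5, 32):
        return ("md5", md5.lower())

    # ETag values are usually wrapped in double quotes. A multipart ETag
    # carries a "-<part count>" suffix and therefore fails the hex test.
    etag = (headers.get("ETag") or "").strip('"')
    if _is_hex(etag, 32):
        return ("md5", etag.lower())

    return None
```

A return value of None corresponds to the "unknown or missing hash" case, which is then handled according to the selected mode.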
However, in some situations no valid hash of the remote file is available, so the data integrity test cannot be performed:
Amazon S3 doesn't write a valid MD5 hash into the ETag header for files uploaded via the Multipart Upload feature.
Amazon S3 compatible storage services may use other hashing algorithms and data formats for storing
file hashes, or may not provide file hashes at all.
The file may have been uploaded by third-party software, or by S3 Browser with data integrity checking turned off.
You can choose how files with unknown or missing hashes are handled: Flexible mode treats them as valid,
while Strict mode treats them as corrupted (see the description of the two modes above).
How to avoid the "missing hash" issue
First, you can enable data integrity checking when uploading files. When this option is turned on,
S3 Browser writes the hash to the file metadata during the upload (Amazon S3 allows you to store
custom information for each file).
Later, when you download the file, S3 Browser can extract the hash from the metadata
and compare it with the hash calculated from the downloaded data.
Another way is to disable multipart uploads in Tools, Options, General, but this could reduce the speed of your uploads.
In addition, without multipart uploads you can only upload files up to 5 GB in size. For larger files you need to enable multipart uploads.
How to fix the hashes for already uploaded files
The method described below allows you to fix file hashes for files up to 5 GB in size.
There is no way to fix hashes for files over 5 GB in size. You can only re-upload them with
data integrity checking for uploads enabled, or select the Flexible mode for data integrity checking for downloads.
To fix the hashes for files up to 5 GB in size:
- Start S3 Browser and choose the bucket
- Select one or multiple files
- Open the Http Headers tab and click Apply
S3 Browser will update the ETag header with the hash through a simple COPY request.
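For readers who want to see what such a COPY request might look like outside S3 Browser, the sketch below builds the arguments for copying an object onto itself. It is an assumption-laden illustration: the helper name self_copy_args is hypothetical, the exact request S3 Browser sends is not documented here, and the argument names follow the CopyObject operation as exposed by the boto3 library. Amazon S3 rejects a copy of an object onto itself unless something such as the metadata changes, so the sketch uses the REPLACE metadata directive and expects the caller to pass the metadata and content type the object should keep.

```python
def self_copy_args(bucket, key, metadata, content_type="binary/octet-stream"):
    """Build CopyObject arguments for copying an object onto itself.

    Hypothetical helper. With MetadataDirective="REPLACE" the metadata
    given here replaces the object's existing metadata, so pass the
    values that should be preserved.
    """
    return {
        "Bucket": bucket,
        "Key": key,
        "CopySource": {"Bucket": bucket, "Key": key},
        "Metadata": dict(metadata),
        "ContentType": content_type,
        "MetadataDirective": "REPLACE",
    }
```

Assuming boto3 is installed and credentials are configured, the result could be passed as boto3.client("s3").copy_object(**self_copy_args("my-bucket", "report.pdf", {})), where the bucket and key names are placeholders. A single CopyObject request is limited to 5 GB, which matches the size limit stated above.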