Commit ac3c85d1 authored by Milan

Change S3 service description

parent 52c54d7d
@@ -16,10 +16,10 @@ Data Management Services is a portfolio of services allowing to facilitate the w
 **S3** is a general service suitable for most use cases (archives, backups, special applications...). It also allows you to share your data with other users or publicly via a link.
-[:octicons-arrow-right-24: Overview of S3 Service](./object-storage/s3-service.md)
-[:octicons-arrow-right-24: Favourite S3 Clients](./object-storage/rclone.md)
-[:octicons-arrow-right-24: Advanced S3 Functions](./object-storage/rclone.md)
-[:octicons-arrow-right-24: Veeam setup against S3](./object-storage/rclone.md)
+[:octicons-arrow-right-24: Overview of S3 Service](./object-storage/s3-service.md)<br/>
+[:octicons-arrow-right-24: Favourite S3 Clients](./object-storage/rclone.md)<br/>
+[:octicons-arrow-right-24: Advanced S3 Functions](./object-storage/rclone.md)<br/>
+[:octicons-arrow-right-24: Veeam setup against S3](./object-storage/rclone.md)<br/>
......
@@ -2,6 +2,10 @@ site_name: "storage"
 nav:
   - Data Storage Services: index.md
-  - Object Storage Services:
+  - S3 Service:
     - S3 Overview: object-storage/s3-service.md
-    - General Storage Guides: object-storage.md
+    - Favourite S3 clients: object-storage/s3-clients.md
+    - Advanced S3 features: object-storage/s3-features.md
+    - Veeam backup over S3: object-storage/veeam-backup.md
+  - RBD Service:
+    - RBD Service: object-storage.md
---
languages:
- en
- cs
---
# Favourite S3 service clients
In the following section you can find recommended S3 clients.
## AWS-CLI (Linux, Windows)
[AWS CLI](https://aws.amazon.com/cli/) - Amazon Web Services Command Line Interface - is a standardized tool supporting the S3 interface. Using this tool you can handle your data and set up your S3 data storage. You can use it interactively from the command line or incorporate it into your automated scripts. [Tutorial for AWS CLI](aws-cli.md).
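
As a minimal sketch of using AWS CLI from an automated script, the helper below only assembles the command line for an S3-compatible endpoint. The endpoint URL, profile name, and bucket name are hypothetical placeholders; `--endpoint-url` and `--profile` are standard AWS CLI options.

```python
import shlex


def aws_s3_command(action, *args, endpoint_url, profile="default"):
    """Build the argument list for an `aws s3 <action>` call against a custom endpoint."""
    return ["aws", "s3", action, *args,
            "--endpoint-url", endpoint_url,
            "--profile", profile]


# Hypothetical values: replace with the endpoint and profile you were given.
cmd = aws_s3_command("ls", "s3://my-bucket",
                     endpoint_url="https://s3.example.org",
                     profile="my-s3-profile")
print(shlex.join(cmd))  # shell-safe string, handy for logging
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` on a machine where AWS CLI is installed and the profile is configured.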
## Rclone (Linux, Windows)
The tool [Rclone](https://rclone.org/downloads/) is suitable for data synchronization and data migration between multiple endpoints (even between different data storage providers). Rclone preserves timestamps and verifies checksums. It is written in Go. Rclone is available for multiple platforms (GNU/Linux, Windows, macOS, BSD and Solaris). In the following guide, we demonstrate its usage on Linux and Windows systems. [Rclone guide](rclone.md).
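
For illustration, an Rclone remote for an S3-compatible service is an INI-style section in `rclone.conf`. The sketch below generates such a section; the remote name, endpoint, and keys are hypothetical, and `provider = Other` is an assumption (the provider value appropriate for this service is not stated here).

```python
import configparser
import io


def rclone_s3_section(remote, endpoint, access_key, secret_key):
    """Return the text of an rclone.conf section for an S3-compatible remote."""
    # interpolation=None so that '%' characters in keys are written verbatim.
    cfg = configparser.ConfigParser(interpolation=None)
    cfg[remote] = {
        "type": "s3",
        "provider": "Other",          # assumption: generic S3-compatible provider
        "access_key_id": access_key,
        "secret_access_key": secret_key,
        "endpoint": endpoint,
    }
    buf = io.StringIO()
    cfg.write(buf)
    return buf.getvalue()


# Hypothetical values for demonstration only.
print(rclone_s3_section("my-s3", "https://s3.example.org",
                        "ACCESS_KEY", "SECRET_KEY"))
```

With such a section in place, a command like `rclone sync /local/dir my-s3:my-bucket` would synchronize a local directory into a bucket.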
object-storage/s3-service-screeshots/direct_upload.png

70.5 KiB

object-storage/s3-service-screeshots/s3_backup.png

71.3 KiB

object-storage/s3-service-screeshots/s3_distribution.png

206 KiB

@@ -7,21 +7,43 @@ languages:
 S3 service is a general service suited for most use cases. It can be used for basic data storage, automated backups, or various types of data-handling applications.
-S3 service uses a naming convention similar to AWS S3. The convention is “bucket.domain.cz”. The tenant is the unique identifier and the domain is s3.clX.du.cesnet.cz. If you do not explicitly mention the tenant, it should be recognized automatically. The recognition is performed based on the access key and secret key, so it should be sufficient to use the following format: s3.clX.du.cesnet.cz/bucket
-If your client treats the endpoint as native AWS, you have to switch to an S3-compatible endpoint. Most clients can process both formats automatically. However, in some cases it is necessary to specify the format explicitly.
+Access to the service is controlled by virtual organizations and corresponding groups. S3 is suitable for sharing data between individual users and groups that may have members from different institutions. Tools for managing groups and users are provided by the e-infrastructure. Users with access to S3 can be people as well as "service accounts", for example for backup machines (a number of modern backup tools natively support S3 connections). Data in S3 is organized into buckets. It is usually appropriate to link individual buckets to the logical structure of your data workflow, for example to different stages of data processing. Data can be stored in the service in plain form or, for sensitive data, in buckets encrypted on the client side, so that even the storage administrator does not have access to the data. Client-side encryption also means that the transmission of data over the network is encrypted, and in case of eavesdropping during transmission, the data cannot be decrypted.
 ???+ note "How to get S3 service?"
 To connect to S3 service you have to contact Data Storage support at:
 `du-support@cesnet.cz`
-Once you obtain your credentials, you can continue to the connection itself using one of the following S3 clients.
+----
## S3 Elementary use cases
In the following section you can find the description of elementary use cases related to S3 service.
### Automated backup of large datasets using tools that natively support the S3 service
If you use specialized automated backup tools such as Veeam, Bacula, or restic, most of them can use the S3 service natively as a backup target, so you don't have to connect block devices or similar to your infrastructure. You only need to request an S3 storage setup and reconfigure your backup. This can be combined with the WORM (write once, read many) model as protection against unwanted overwriting or ransomware attacks.
![](s3-service-screenshots/s3_backup.png){ style="display: block; margin: 0 auto" }
### Data sharing across your laboratory or over multiple institutions
If you manage multiple research groups whose users need to share data, for example for data collection and post-processing, you can use S3. The S3 service allows you to share data within a group or between users. This use case assumes that each user has their own access to the repository. It is also suitable if you need to share sensitive data between organizations and do not have a secure VPN. You can use encrypted buckets (client-side encryption) within the S3 service. Client-side encryption also means that the transmission of data over the network is encrypted, and in case of eavesdropping during transmission, the data cannot be decrypted.
![](s3-service-screenshots/s3_distribution.png){ style="display: block; margin: 0 auto" }
### Live systems handling data - Learning Management Systems, Catalogues, Repositories
You have large data and you operate an application in the e-infrastructure that serves data to your users. This use case is particularly relevant to applications that distribute large data (raw scans, large videos, large scientific data sets for computing environments...) to end users. The S3 service can be used for this use case as well. The advantage of using S3 for these applications is that there is no need to upload data to the application server; the end user can upload/download data directly to/from object storage using S3 presigned requests.
![](s3-service-screenshots/direct_upload.png){ style="display: block; margin: 0 auto" }
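
The presigned requests mentioned above can be sketched as follows. This is a minimal, standard-library illustration of AWS Signature Version 4 query-string signing, not the service's own tooling; in practice an S3 SDK would normally generate such URLs. The host, bucket, object key, credentials, and the `us-east-1` region are hypothetical assumptions.

```python
import datetime
import hashlib
import hmac
from urllib.parse import quote


def _sign(key: bytes, msg: str) -> bytes:
    return hmac.new(key, msg.encode("utf-8"), hashlib.sha256).digest()


def presign_url(method, host, path, access_key, secret_key,
                region="us-east-1", expires=3600, now=None):
    """Return a presigned URL (SigV4, path-style) valid for `expires` seconds."""
    now = now or datetime.datetime.now(datetime.timezone.utc)
    amz_date = now.strftime("%Y%m%dT%H%M%SZ")
    datestamp = now.strftime("%Y%m%d")
    scope = f"{datestamp}/{region}/s3/aws4_request"

    canonical_uri = quote(path, safe="/-_.~")
    params = {
        "X-Amz-Algorithm": "AWS4-HMAC-SHA256",
        "X-Amz-Credential": f"{access_key}/{scope}",
        "X-Amz-Date": amz_date,
        "X-Amz-Expires": str(expires),
        "X-Amz-SignedHeaders": "host",
    }
    canonical_query = "&".join(
        f"{quote(k, safe='-_.~')}={quote(v, safe='-_.~')}"
        for k, v in sorted(params.items())
    )
    # Only the Host header is signed; the payload itself is left unsigned.
    canonical_request = "\n".join([
        method, canonical_uri, canonical_query,
        f"host:{host}\n", "host", "UNSIGNED-PAYLOAD",
    ])
    string_to_sign = "\n".join([
        "AWS4-HMAC-SHA256", amz_date, scope,
        hashlib.sha256(canonical_request.encode("utf-8")).hexdigest(),
    ])

    # Derive the signing key: date -> region -> service -> "aws4_request".
    key = _sign(("AWS4" + secret_key).encode("utf-8"), datestamp)
    for part in (region, "s3", "aws4_request"):
        key = _sign(key, part)
    signature = hmac.new(key, string_to_sign.encode("utf-8"),
                         hashlib.sha256).hexdigest()
    return f"https://{host}{canonical_uri}?{canonical_query}&X-Amz-Signature={signature}"


# Hypothetical example: a one-hour download link handed to an end user.
print(presign_url("GET", "s3.example.org", "/my-bucket/raw-scan.tif",
                  "ACCESS_KEY", "SECRET_KEY"))
```

The application server only signs the URL; the data transfer then happens directly between the end user and the object storage. Passing `"PUT"` as the method produces an upload link in the same way.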
## S3 Data Reliability (Data Redundancy) - replicated vs erasure coding
The section below describes two approaches to data redundancy applied to the object storage pool. The S3 service can be equipped with **replicated** or **erasure code (EC)** redundancy.
### Replicated
Your data is stored in three copies in the data center. In case one copy is corrupted, the original data is still readable in an undamaged form, and the damaged data is restored in the background. Using a service with the replicated flag also allows for faster reads, as it is possible to read from all replicas at the same time. Using a service with the replicated flag reduces write speed because the write operation waits for write confirmation from all three replicas.
???+ note "Suitable for?"
Suitable for smaller volumes of live data with a preference for reading speed (not very suitable for large data volumes).
### Erasure Coding (EC)
Erasure coding (EC) is a data protection method similar to the dynamic RAID known from disk arrays. Data is divided into individual fragments, which are then stored with some redundancy across the data storage. Therefore, if some disks (or an entire storage server) fail, the data is still accessible and will be restored in the background. Your data never resides on a single disk whose failure would cause data loss.
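
The idea can be illustrated with the simplest possible erasure code: two data fragments plus one XOR parity fragment. This is only a sketch under that assumption; the actual EC profile of the service is not specified here, and production systems typically use more general codes with `k` data and `m` coding fragments. The last two helpers compare raw-capacity overhead of three-copy replication with a hypothetical `k + m` profile.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def encode(data: bytes) -> list:
    """Split data into two fragments and add one XOR parity fragment (k=2, m=1)."""
    if len(data) % 2:
        data += b"\x00"  # pad to even length; a real system records the true size
    half = len(data) // 2
    d0, d1 = data[:half], data[half:]
    return [d0, d1, xor_bytes(d0, d1)]


def recover(fragments: list) -> list:
    """Rebuild at most one missing fragment (marked as None) from the other two."""
    missing = [i for i, f in enumerate(fragments) if f is None]
    if len(missing) > 1:
        raise ValueError("k=2, m=1 tolerates the loss of only one fragment")
    repaired = list(fragments)
    if missing:
        a, b = (f for f in fragments if f is not None)
        repaired[missing[0]] = xor_bytes(a, b)
    return repaired


def overhead_replicated(copies: int = 3) -> float:
    """Raw bytes stored per byte of user data with full replicas."""
    return float(copies)


def overhead_ec(k: int, m: int) -> float:
    """Raw bytes stored per byte of user data with k data + m coding fragments."""
    return (k + m) / k


fragments = encode(b"object data!")
damaged = [fragments[0], None, fragments[2]]   # one disk is lost
assert recover(damaged) == fragments           # the fragment is rebuilt
print(overhead_replicated(3), overhead_ec(2, 1))
```

The overhead comparison shows the trade-off: replication stores every byte three times, while erasure coding reaches redundancy with less raw capacity at the cost of reconstruction work when a fragment is lost.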
## S3 service clients
In the following section you can find recommended S3 clients.
### AWS-CLI (Linux, Windows)
[AWS CLI](https://aws.amazon.com/cli/) - Amazon Web Services Command Line Interface - is a standardized tool supporting the S3 interface. Using this tool you can handle your data and set up your S3 data storage. You can use it interactively from the command line or incorporate it into your automated scripts. [Tutorial for AWS CLI](aws-cli.md).
### Rclone (Linux, Windows)
The tool [Rclone](https://rclone.org/downloads/) is suitable for data synchronization and data migration between multiple endpoints (even between different data storage providers). Rclone preserves timestamps and verifies checksums. It is written in Go. Rclone is available for multiple platforms (GNU/Linux, Windows, macOS, BSD and Solaris). In the following guide, we demonstrate its usage on Linux and Windows systems. [Rclone guide](rclone.md).