diff --git a/faq.md b/faq.md
new file mode 100644
index 0000000000000000000000000000000000000000..906193e4840dd404628520c801fc27c833462adf
--- /dev/null
+++ b/faq.md
@@ -0,0 +1,8 @@
+---
+languages:
+  - en
+  - cs
+---
+
+# FAQ - Frequently Asked Questions
+Frequently asked questions can be found in the [Data Storage documentation](FAQ - Frequently Asked Questions).
diff --git a/index.md b/index.md
index 0f10e0a46cdf168bd0eb42a4d8fbff4d81ac6297..717365cef276b993bb4ac4b54f0cc5ab0b51f037 100644
--- a/index.md
+++ b/index.md
@@ -3,20 +3,35 @@ hide:
   - toc
 ---
 
-# Data Management Services
+# Data Storage Services
 
-Data Management Services is a portfolio of services allowing to facilitate the whole data workflow needed for research and academic communities.
+Data Storage Services is a portfolio of services facilitating the whole data workflow needed by research and academic communities.
 
 <div class="grid cards" markdown>
 
-- :fontawesome-solid-server:{ .lg .middle } __Data Storage Services__
+
+- :fontawesome-solid-server:{ .lg .middle } __S3 Service__
+
+    ---
+
+    **S3** is a general service suitable for most use cases (archives, backups, special applications...). It also allows you to share your data with other users, or publicly via a link, and it can be accessed from all over the world.
+
+    [:octicons-arrow-right-24: Overview of S3 Service](./object-storage/s3-service.md)<br/>
+    [:octicons-arrow-right-24: Favourite S3 Clients](./object-storage/s3-clients.md)<br/>
+    [:octicons-arrow-right-24: Advanced S3 Functions](./object-storage/s3-features.md)<br/>
+    [:octicons-arrow-right-24: Veeam setup against S3](./object-storage/veeam-backup.md)<br/>
+
+
+- :fontawesome-solid-server:{ .lg .middle } __RBD Service__
 
     ---
 
-    Do you need common **Data Storage Services**?
+    **RBD** is the Rados Block Device service. The prerequisite for this service is a Linux machine with a public IPv4 address.
+
+    [:octicons-arrow-right-24: Overview of RBD Service](./object-storage/rbd-service.md)<br/>
+    [:octicons-arrow-right-24: Setup of RBD Service](./object-storage/rbd-setup.md)<br/>
 
-    [:octicons-arrow-right-24: Object Data Storage](https://du.cesnet.cz/en/navody/object_storage/start)
-    [:octicons-arrow-right-24: Filesystem Data Storage](https://du.cesnet.cz/en/navody/sluzby/start)
 <!---
 [:octicons-arrow-right-24: Account properties and lifecycle](/account/properties)
 --->
@@ -27,20 +42,21 @@ Data Management Services is a portfolio of services allowing to facilitate the w
 
     Do you need to cooperate with your colleagues, edit documents and share data?
 
-    [:octicons-arrow-right-24: Owncloud](https://du.cesnet.cz/en/navody/owncloud/start)
-    [:octicons-arrow-right-24: Onlyoffice](https://du.cesnet.cz/en/navody/onlyoffice/start)
+    [:octicons-arrow-right-24: ownCloud](https://du.cesnet.cz/en/navody/owncloud/start)
+
+    [:octicons-arrow-right-24: ONLYOFFICE](https://du.cesnet.cz/en/navody/onlyoffice/start)
 
 <!---
 [:octicons-arrow-right-24: Account properties and lifecycle](/account/properties)
 --->
 
-- :fontawesome-solid-server:{ .lg .middle } __Long Tail Data Preservation__
+- :fontawesome-solid-server:{ .lg .middle } __Long-term Data Preservation__
 
     ---
 
-    Do you need to archive your data in the binary reliable data storage?
+    Do you need to archive your data in highly reliable data storage?
-    [:octicons-arrow-right-24: Longtail Preservation - CZ only](https://du.cesnet.cz/cs/navody/ltp/start)
+    [:octicons-arrow-right-24: Long-term Preservation - CZ only](https://du.cesnet.cz/cs/navody/ltp/start)
 
 <!---
 [:octicons-arrow-right-24: Account properties and lifecycle](/account/properties)
 --->
diff --git a/mkdocs.yml b/mkdocs.yml
index 22b6adb37b8619b839cef0d1887cd92426773c60..799b01e057cb6dfad038e8d176bfd56aeaaa6ad6 100755
--- a/mkdocs.yml
+++ b/mkdocs.yml
@@ -1,5 +1,13 @@
 site_name: "storage"
 nav:
-  - Object Storage Guides: ./index.md
-  - Object Storage Guides: ./object-storage.md
+  - Data Storage Services: index.md
+  - S3 Service:
+    - S3 Overview: object-storage/s3-service.md
+    - Favourite S3 clients: object-storage/s3-clients.md
+    - Advanced S3 features: object-storage/s3-features.md
+    - Veeam backup over S3: object-storage/veeam-backup.md
+  - RBD Service:
+    - RBD Overview: object-storage/rbd-service.md
+    - RBD Setup: object-storage/rbd-setup.md
+  - FAQ: faq.md
diff --git a/object-storage/aws-cli.md b/object-storage/aws-cli.md
new file mode 100644
index 0000000000000000000000000000000000000000..316c32a7bd1067761e6cd2f7fe4aed601a1f8451
--- /dev/null
+++ b/object-storage/aws-cli.md
@@ -0,0 +1,135 @@
+---
+languages:
+  - en
+  - cs
+---
+
+# AWS CLI tool for command line usage
+
+AWS CLI is a common tool for controlling the S3 service from the command line. The AWS CLI tool is written in Python.
+
+## AWS CLI installation
+
+To install AWS CLI we recommend following the [official AWS documentation](https://docs.aws.amazon.com/cli/latest/userguide/getting-started-install.html). There you can find guides for installing AWS CLI on both Linux and Windows.
+
+???+ note "AWS-CLI in virtual environment"
+    If you need to install AWS CLI in a virtual environment, you can use [this guide](https://docs.aws.amazon.com/cli/latest/userguide/install-virtualenv.html).
+
+## Configuration of AWS CLI
+
+???+ note "User profile"
+    To configure AWS CLI we recommend using the option `--profile`, which allows you to define multiple user profiles with different user credentials. Of course, you can also use the settings without the option `--profile`. All commands will be the same, you will just omit the option `--profile`; AWS will then use the **default** settings.
+
+!!! warning
+    In the configuration wizard, it is necessary to hit the space bar at the **Default region name** prompt. If you do not put the space into "Default region name", the config file will not contain the **region** parameter, and you will then get an **InvalidLocationConstraint** error when using **aws s3**.
+
+In the following, we will demonstrate the AWS CLI configuration. The example commands below use the `--profile` option.
+
+    aws configure --profile test_user
+    AWS Access Key ID [None]: xxxxxxxxxxxxxxxxxxxxxx
+    AWS Secret Access Key [None]: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
+    Default region name [None]:
+    Default output format [None]: text
+
+_AWS Access Key ID_ - access key, obtained from the data storage administrator<br/>
+_Secret Access Key_ - secret key, obtained from the data storage administrator<br/>
+_Default region name_ - here just press the space bar! Some software tools can have special requirements, e.g. Veeam; in that case, insert `storage`<br/>
+_Default output format_ - choose the output format (json, text, table)<br/>
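+
+???+ note "Resulting configuration files"
+    A minimal sketch of the two files that `aws configure` writes (the profile name follows the example above; the keys are placeholders). `~/.aws/credentials`:
+
+        [test_user]
+        aws_access_key_id = xxxxxxxxxxxxxxxxxxxxxx
+        aws_secret_access_key = xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
+
+    and `~/.aws/config`:
+
+        [profile test_user]
+        region =
+        output = text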
+
+???+ note "Endpoint URL"
+    For smooth operation it is necessary to use the option `--endpoint-url` with the particular S3 endpoint address provided by CESNET.
+
+!!! warning
+    **Multipart S3 upload - without multipart, a single uploaded file is limited to 5 GB.** It's a best practice to use aws s3 commands (such as aws s3 cp) for multipart uploads and downloads, because these aws s3 commands automatically perform multipart uploading and downloading based on the file size. By comparison, **aws s3api** commands, such as aws s3api create-multipart-upload, should be used only when aws s3 commands don't support a specific upload need, such as when the multipart upload involves multiple servers, a multipart upload is manually stopped and resumed later, or when the aws s3 command doesn't support a required request parameter. More information can be found on the [AWS websites](https://aws.amazon.com/premiumsupport/knowledge-center/s3-multipart-upload-cli/).
+
+## Controls of AWS CLI - high-level (s3)
+
+To show the help (available commands), run the command below. The **aws s3** tool allows you to use several advanced functions, see below.
+
+    aws s3 help
+
+### Operation with buckets
+???+ note "Unique name of the bucket"
+    The bucket name has to be unique within the tenant. It should contain lowercase letters, numbers, dashes, and dots. The bucket name should begin only with a letter or number and cannot contain dots followed by a dash, dots preceded by a dash, or multiple consecutive dots. We also recommend not using a slash in the bucket name, as using a slash will prevent the bucket from being used via the API.
+
+**Bucket creation**
+
+    aws s3 --profile test_user --endpoint-url https://s3.cl2.du.cesnet.cz mb s3://test1
+
+**Bucket listing**
+
+    aws s3 --profile test_user --endpoint-url https://s3.cl2.du.cesnet.cz ls
+    2019-09-18 13:30:17 test1
+
+**Bucket deletion**
+
+    aws s3 --profile test_user --endpoint-url https://s3.cl2.du.cesnet.cz rb s3://test1
+
+### Operation with files
+**File upload**
+
+    aws s3 --profile test_user --endpoint-url https://s3.cl2.du.cesnet.cz cp C:/Users/User/Desktop/test_file.zip s3://test1
+    upload: Desktop\test_file.zip to s3://test1/test_file.zip
+
+**File download**
+
+    aws s3 --profile test_user --endpoint-url https://s3.cl2.du.cesnet.cz cp s3://test1/test_file.zip C:\Users\User\Downloads\
+    download: s3://test1/test_file.zip to Downloads\test_file.zip
+
+**File deletion**
+
+    aws s3 --profile test_user --endpoint-url https://s3.cl2.du.cesnet.cz rm s3://test1/test_file.zip
+    delete: s3://test1/test_file.zip
+
+### Directory/Folder operation
+???+ note ""
+    The content of the source folder is always copied when using the following command, regardless of whether the source path ends with a slash; in this respect, the behavior of **aws** differs from rsync. If you wish to have the source directory itself in the destination, add the name of the source directory to the destination path. **The AWS tool will create the directory in the destination while copying the data**, see the example commands below. The same is valid for directory downloads and for synchronization via **aws s3 sync**.
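+
+For illustration, a sketch of the difference (the bucket and folder names are just the examples used in this guide): the first command copies the folder content directly into the bucket root, while the second keeps it under a `test_dir/` prefix.
+
+    aws s3 --profile test_user --endpoint-url https://s3.cl2.du.cesnet.cz cp C:\Users\User\Desktop\test_dir s3://test1/ --recursive
+    aws s3 --profile test_user --endpoint-url https://s3.cl2.du.cesnet.cz cp C:\Users\User\Desktop\test_dir s3://test1/test_dir/ --recursive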
+
+**Upload the directory**
+
+    aws s3 --profile test_user --endpoint-url https://s3.cl2.du.cesnet.cz cp C:\Users\User\Desktop\test_dir s3://test1/test_dir/ --recursive
+
+**Download the directory**
+
+    aws s3 --profile test_user --endpoint-url https://s3.cl2.du.cesnet.cz cp s3://test1/test_dir C:\Users\User\Downloads\test_dir\ --recursive
+
+**Directory deletion**
+
+    aws s3 --profile test_user --endpoint-url https://s3.cl2.du.cesnet.cz rm s3://test1/test_dir --recursive
+
+**Directory sync -> upload to cloud**
+
+    aws s3 --profile test_user --endpoint-url https://s3.cl2.du.cesnet.cz sync C:\Users\User\Desktop\test_sync s3://test1/test_sync/
+
+**Directory sync -> download from cloud**
+
+    aws s3 --profile test_user --endpoint-url https://s3.cl2.du.cesnet.cz sync s3://test1/test_sync/ C:\Users\User\Downloads\test_sync
+
+## Controls of AWS CLI - api-level (s3api)
+
+The **aws** tool also allows the usage of the **aws s3api** module. This module provides advanced functions to control the S3 service, see below. The configuration of credentials and connections is the same as for **aws** at the beginning of this guide.
+
+The set of available commands can be obtained by running the following command with the option **help**. Alternatively, the complete list is available on the [AWS website](https://docs.aws.amazon.com/cli/latest/reference/s3api/index.html).
+
+    aws s3api help
+
+## Exemplary configuration file for AWS-CLI
+After successful configuration, the configuration file should be created; you can find an example below. The credentials file is located in the same path.
+???+ note "Config file"
+    Windows: C:/Users/User/.aws/config<br/>
+    Linux: /home/user/.aws/config<br/>
+    <br/>[profile test-user]<br/>
+    region =<br/>
+    output = text<br/>
+
+
+## Special functions of AWS-CLI
+There are several advanced functions in AWS-CLI for sharing or versioning the data.
+
+### Presign URLs
+For an object in the S3 service you can generate a presigned URL to allow your colleagues to download the data. You can find more information in the section dedicated to [advanced S3 features](s3-features.md).
+
+### Bucket policies
+To share your data you can set up so-called bucket policies. You can share a specific bucket with a specific group (tenant) or make your bucket publicly readable. You can find more information in the section dedicated to [advanced S3 features](s3-features.md).
+
+### Bucket versioning
+You can set up object versioning in your buckets. Then you can restore any previous version of an object (file). You can find more information in the section dedicated to [advanced S3 features](s3-features.md).
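+
+For a quick illustration only (a sketch; the profile, endpoint, bucket, and object names follow the examples above), the corresponding commands look roughly like this:
+
+    # generate a presigned URL valid for one hour
+    aws s3 --profile test_user --endpoint-url https://s3.cl2.du.cesnet.cz presign s3://test1/test_file.zip --expires-in 3600
+
+    # enable and inspect versioning on a bucket
+    aws s3api --profile test_user --endpoint-url https://s3.cl2.du.cesnet.cz put-bucket-versioning --bucket test1 --versioning-configuration Status=Enabled
+    aws s3api --profile test_user --endpoint-url https://s3.cl2.du.cesnet.cz get-bucket-versioning --bucket test1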
diff --git a/object-storage/cloudberry-screenshots/cloudberry1.png b/object-storage/cloudberry-screenshots/cloudberry1.png
new file mode 100644
index 0000000000000000000000000000000000000000..26797c8107590c59451bce2a07b21d12410692ae
Binary files /dev/null and b/object-storage/cloudberry-screenshots/cloudberry1.png differ
diff --git a/object-storage/cloudberry-screenshots/cloudberry2.png b/object-storage/cloudberry-screenshots/cloudberry2.png
new file mode 100644
index 0000000000000000000000000000000000000000..0d146f2c394187ab4b7f4bebe53d99236d1bdae0
Binary files /dev/null and b/object-storage/cloudberry-screenshots/cloudberry2.png differ
diff --git a/object-storage/cloudberry-screenshots/cloudberry3.png b/object-storage/cloudberry-screenshots/cloudberry3.png
new file mode 100644
index 0000000000000000000000000000000000000000..2a76fd1f53b3dd854eedace1109e680d44cea901
Binary files /dev/null and b/object-storage/cloudberry-screenshots/cloudberry3.png differ
diff --git a/object-storage/cloudberry-screenshots/cloudberry4.png b/object-storage/cloudberry-screenshots/cloudberry4.png
new file mode 100644
index 0000000000000000000000000000000000000000..1f4d30b31871c6d29d46a6a2faf0f04e204d9b3f
Binary files /dev/null and b/object-storage/cloudberry-screenshots/cloudberry4.png differ
diff --git a/object-storage/cloudberry-screenshots/cloudberry5.png b/object-storage/cloudberry-screenshots/cloudberry5.png
new file mode 100644
index 0000000000000000000000000000000000000000..8b0475cfbd85023441ed6d15ce226393d4070b3f
Binary files /dev/null and b/object-storage/cloudberry-screenshots/cloudberry5.png differ
diff --git a/object-storage/cloudberry.md b/object-storage/cloudberry.md
new file mode 100644
index 0000000000000000000000000000000000000000..f5f575fdaa17c3ec7426b965e9c91026ec442274
--- /dev/null
+++ b/object-storage/cloudberry.md
@@ -0,0 +1,37 @@
+---
+languages:
+  - en
+  - cs
+---
+
+# CloudBerry Explorer for Amazon S3
+
+[CloudBerry Explorer](https://cloudberry-explorer-for-amazon-s3.en.softonic.com/) is an intuitive file explorer that helps you manage your S3 account as if it were another folder on your local drive. The program has a double-pane interface and acts as an FTP client, with each window dedicated to a single folder. These locations are not fixed and can be switched to suit your current task: a local computer and a remote S3 server, two local folders, or even two S3 accounts.
+
+## Cloudberry Installation
+You can use the exe installer from [the official website of CloudBerry](https://cloudberry-explorer-for-amazon-s3.en.softonic.com/). When you start the program, it will always inform you about the registration options. Registration is free; you then receive the key via e-mail, and all pop-ups are avoided.
+
+!!! warning
+    CloudBerry in the FREE version does not support Multipart Upload and Multithreading, which means that it cannot work with files larger than 5 GB. Encryption and compression are only available in the PRO version.
+
+## Cloudberry Configuration
+Storage configuration can be done via the **1. File** menu, where you select **2. Add New Account**. Do not select the Amazon S3 Accounts option, as it does not offer entering a custom service point (endpoint) etc.!
+
+{ style="display: block; margin: 0 auto" }
+
+In the next window select Cloud Storage - the **1. S3 Compatible** option.
+ +{ style="display: block; margin: 0 auto" } + +In the next step you have to fill in S3 credentials including the S3 endpoint. + +{ style="display: block; margin: 0 auto" } + +Then you can start to upload your data. From the **1. Source selector** you will select your **2. S3 account**, which has been previously configured. + +{ style="display: block; margin: 0 auto" } + +First you need to **1. Create new bucket** and then you can upload your data into it. + +{ style="display: block; margin: 0 auto" } + diff --git a/object-storage/cyberduck-screenshots/cyberduck1en.png b/object-storage/cyberduck-screenshots/cyberduck1en.png new file mode 100644 index 0000000000000000000000000000000000000000..89917d294940fcd60168504714eae0220b7aaa56 Binary files /dev/null and b/object-storage/cyberduck-screenshots/cyberduck1en.png differ diff --git a/object-storage/cyberduck-screenshots/cyberduck2en.png b/object-storage/cyberduck-screenshots/cyberduck2en.png new file mode 100644 index 0000000000000000000000000000000000000000..f3f22f6dc556348b7ed655d82884c092668d2866 Binary files /dev/null and b/object-storage/cyberduck-screenshots/cyberduck2en.png differ diff --git a/object-storage/cyberduck-screenshots/cyberduck3en.png b/object-storage/cyberduck-screenshots/cyberduck3en.png new file mode 100644 index 0000000000000000000000000000000000000000..77c36f7bae69af88af14192916919e95ac3cb7cf Binary files /dev/null and b/object-storage/cyberduck-screenshots/cyberduck3en.png differ diff --git a/object-storage/cyberduck-screenshots/cyberduck4en.png b/object-storage/cyberduck-screenshots/cyberduck4en.png new file mode 100644 index 0000000000000000000000000000000000000000..18a47f021ca50b7acc5bb7259597e6bdbde185f3 Binary files /dev/null and b/object-storage/cyberduck-screenshots/cyberduck4en.png differ diff --git a/object-storage/cyberduck.md b/object-storage/cyberduck.md new file mode 100644 index 0000000000000000000000000000000000000000..9d32907dcd6a481d89a92a689fad0038412814f8 --- /dev/null +++ b/object-storage/cyberduck.md @@ -0,0 +1,30 @@ +--- +languages: + - en + - cs +--- + +# CyberDuck tool + +[CyberDuck](https://cyberduck.io/) is a swiss knife tool for various cloud storage providers. It supports FTP, SFTP, WebDAV, OpenStack, OneDrive, Google Drive, Dropbox, etc. + +## Installation +You can download the exe installer from the [CybeDuck webpage](https://cyberduck.io/) and follow the installation steps. + +## Configuration + +Setup of new storage can be done via button **New connection** in the left menu. + +{ style="display: block; margin: 0 auto" } + +In the following window you can select **Amazon S3** and then insert the URL of the server s3.clX.du.cesnet.cz, where `X` is number asociated with your S3 account (e.g. `cl4`). Then please insert the `acces_key` and `secret_key`. Then you can click on the **Connection** button. + +{ style="display: block; margin: 0 auto" } + +The you can create a bucket - in the main directory can be only directories (buckets). + +{ style="display: block; margin: 0 auto" } + +While creating the bucket keep default region. 
+ +{ style="display: block; margin: 0 auto" } diff --git a/object-storage/os-index.md b/object-storage/os-index.md new file mode 100644 index 0000000000000000000000000000000000000000..a0b43f5324fd3b25dad40a9b455ceb196f78aa7a --- /dev/null +++ b/object-storage/os-index.md @@ -0,0 +1,39 @@ +--- +# template: home.html +hide: + - toc +--- +# Object Storage Services + +Object Storage Services is a portfolio of services allowing to facilitate your archive and backup data. + +<div class="grid cards" markdown> + +- :fontawesome-solid-server:{ .lg .middle } __S3 Service__ + + --- + + **S3** is a general service suitable for most of the usecases (archives, backups, special applications...). It also allows to share your data with other users or publicly via link. + + [:octicons-arrow-right-24: Overview of S3 Service](./s3-service.md) + [:octicons-arrow-right-24: Favourite S3 Clients](./rclone.md) + [:octicons-arrow-right-24: Advanced S3 Functions](./rclone.md) + [:octicons-arrow-right-24: Veeam stup against S3](./rclone.md) +<!--- +[:octicons-arrow-right-24: Account properties and lifecycle](/account/properties) +---> + +- :fontawesome-solid-server:{ .lg .middle } __RBD Service__ + + --- + + Do you need to cooperate with your colleagues, edit documents and share data? + + [:octicons-arrow-right-24: Owncloud](https://du.cesnet.cz/en/navody/owncloud/start) + +<!--- +[:octicons-arrow-right-24: Account properties and lifecycle](/account/properties) +---> + +</div> + diff --git a/object-storage/rbd-service-screenshots/central_backup.png b/object-storage/rbd-service-screenshots/central_backup.png new file mode 100644 index 0000000000000000000000000000000000000000..ae0e6b21094e2013aec4e899433b54efb0c5019a Binary files /dev/null and b/object-storage/rbd-service-screenshots/central_backup.png differ diff --git a/object-storage/rbd-service-screenshots/shared_distribution.png b/object-storage/rbd-service-screenshots/shared_distribution.png new file mode 100644 index 0000000000000000000000000000000000000000..b239a54a9a67d888640a25482347ed039011e8fc Binary files /dev/null and b/object-storage/rbd-service-screenshots/shared_distribution.png differ diff --git a/object-storage/rbd-service.cs.md b/object-storage/rbd-service.cs.md new file mode 100644 index 0000000000000000000000000000000000000000..346acaed9067e0c347023cb23445e0947aedfeb9 --- /dev/null +++ b/object-storage/rbd-service.cs.md @@ -0,0 +1,54 @@ +--- +languages: + - en + - cs +--- +# Služba RBD + +Rados Block Device **RBD** je blokové zařízení, které si můžete připojit do vaší infrastruktury. Připojení je nutné provést pomocí linuxového stroje (připojení RBD do Windows není v současné době provozně stabilní, proto jej nedoporučujeme). Následně si můžete připojené blokové zařízení reexportovat kamkoliv v rámci vašich systémů (samba remount do vaší sítě). RBD je vhodné zejména pro použití v centralizovaných zálohovacích systémech. RBD je velmi úzce specializovaná služba, která vyžaduje na straně uživatele širší zkušenosti se správou linuxových zařízení. Služba je určena řádově pro větší objemy dat - vyšší stovky TB. Blokové zařízení je možné rovněž na vaší straně opatřit šifrováním (client side) pomocí LUKS. Šifrování na straně klienta rovněž znamená, že přenos dat po síti je šifrován a v případě odposlechnutí během přenosu není možné data dešifrovat. + +!!! warning + Připojení RBD je možné pouze z dedikovaných IPv4 adres, které jsou povoleny na firewallu. 
Pokud má stroj, na který chcete připojit RBD, pouze IPv6 adresu, **NE**bude možné RBD připojit a budete muset využít službu S3. RBD obraz je možné následně připojit pouze na jednom zařízení; není možné, aby si každý z vašich uživatelů připojil stejné RBD na svoji pracovní stanici - za předpokladu, že RBD není použito pro clusterovaný souborový systém. Použití clusterovaných souborových systémů nad RBD je potřeba nejdříve konzultovat s podporou Datových úložišť CESNET.
+
+???+ note "Jak získám službu RBD?"
+    Pro získání služby RBD prosím kontaktujte náš support:
+    `support@cesnet.cz`
+
+----
+## Základní případy užití služby RBD
+V následujících sekcích naleznete základní případy užití týkající se služby RBD.
+
+### Zálohování velkých datasetů vyžadujících lokální filesystém
+Pokud máte centralizovaný zálohovací systém (sada skriptů, bacula, BackupPC…) vyžadující lokální filesystém, pak vám doporučujeme použití [služby RBD](rbd-setup.md), viz níže. RBD obraz je možné připojit přímo ke stroji, kde běží centrální zálohovací systém, jako blokové zařízení. RBD je možné opatřit snapshoty (viz popis služeb) jako ochranu proti nechtěnému přepsání anebo ransomware útoku.
+
+{ style="display: block; margin: 0 auto" }
+
+### Centrální share pro vnitřní potřeby instituce
+Pokud ukládáte živá data a potřebujete na úložiště pouštět jednotlivé uživatele, pak můžete využít [službu RBD](rbd-setup.md), kterou si připojíte k vám do infrastruktury pomocí linuxového stroje. Na připojeném blokovém zařízení si můžete udělat souborový systém, případně jej opatřit šifrováním, a dále jej reexportovat dovnitř vaší infrastruktury například pomocí samba, NFS, ftp, ssh aj. (možno i formou kontejnerů zajišťujících distribuci protokolů do vaší interní sítě). Šifrování na straně klienta rovněž znamená, že přenos dat po síti je šifrován a v případě odposlechnutí během přenosu není možné data dešifrovat. Výhodou je, že si můžete vytvářet skupiny a spravovat práva zcela dle vašich preferencí, případně použít vaši lokální databázi uživatelů a skupin. Blokové zařízení RBD je dále možné opatřit snapshoty na úrovni RBD, tudíž pokud dojde k nechtěnému odmazání dat, je možné se vrátit například ke snapshotu z předchozího dne.
+
+{ style="display: block; margin: 0 auto" }
+
+## Jak je řešena redundance dat - replicated vs erasure coding?
+Níže jsou popsány druhy konfigurace služby RBD, které řeší redundanci dat nad úložným poolem. Služba RBD může být vybavena **replicated** nebo **erasure code (EC)** redundancí a dále **synchronní nebo asynchronní geografickou replikou**.
+
+### Replicated
+Vaše data jsou na úložišti uložena ve třech kopiích. V případě poškození dat v jedné kopii jsou původní data stále čitelná v nepoškozené formě a na pozadí dojde k obnově poškozených dat. Použití služby s příznakem replicated rovněž umožňuje rychlejší čtení, protože je možné číst ze všech replik najednou. Naopak rychlost zápisu je nižší, protože operace zápisu čeká na potvrzení zápisu ze všech tří replik.
+
+???+ note "Vhodné pro?"
+    Vhodné pro menší objemy živých dat s preferencí rychlosti čtení (ne příliš vhodné pro velké datové objemy).
+
+### Erasure Coding (EC)
+Erasure coding (EC) je metoda ochrany dat, jedná se o obdobu dynamického RAID známého z diskových polí. Jde o metodu, kde jsou data rozdělena na jednotlivé fragmenty, které jsou následně uloženy s určitou redundancí napříč datovým úložištěm.
Pokud tedy dojde k selhání některých disků (nebo celého storage serveru), jsou data stále přístupná a na pozadí dojde k jejich obnovení. Není tedy možné, aby vaše data ležela na jednom disku, který se poškodí, a vy o data přijdete. Tato technologie je vhodná pro živější data (rychlejší zápis), která nevyžadují časté čtení. Zároveň je tato technologie úspornější co do množství obsazeného místa.
+
+???+ note "Vhodné pro?"
+    Vhodné např. spíše pro ukládání velkých datových objemů.
+
+### RBD snapshoty
+Na úrovni RBD (replikované/erasure coding) je možné použít snapshoty. Ovládání snapshotů se provádí z klientské strany. [RBD snapshotování](rbd-setup.md) je jedna z možností náhrady za `tape_tape` politiku v případě mirroringu snapshotů do jiné geografické lokality.
+
+### Synchronní geografická replika
+Synchronní geografická replika chrání před výpadkem datového centra. Zhoršuje však rychlost zápisu, protože systém čeká na úspěšné potvrzení zápisu v obou geografických lokalitách. Pokud máte dojem, že potřebujete tuto službu, ozvěte se nám.
+
+### Asynchronní geografická replika
+Asynchronní geografická replika chrání před výpadkem datového centra pouze částečně (může dojít ke ztrátě určitých dat mezi jednotlivými asynchronními synchronizacemi z důvodu časové prodlevy). U asynchronní geografické repliky je však v případě poškození dat (ransomware) čas zasáhnout a přerušit synchronizaci. Pokud máte dojem, že potřebujete tuto službu, ozvěte se nám.
diff --git a/object-storage/rbd-service.md b/object-storage/rbd-service.md
new file mode 100644
index 0000000000000000000000000000000000000000..191f9f9c80078e08192cb6f5247a9997de448c79
--- /dev/null
+++ b/object-storage/rbd-service.md
@@ -0,0 +1,53 @@
+---
+languages:
+  - en
+  - cs
+---
+# RBD Service
+
+The Rados Block Device **RBD** is a block device that you can connect into your infrastructure. The connection must be made using a **Linux machine** (RBD connection to Windows is not yet implemented in a reliable manner). Subsequently, you can re-export the connected block device anywhere within your systems (samba remount to your network). RBD is particularly suitable for use in centralized backup systems. RBD is a very specialized service that requires the user to have extensive experience in managing Linux devices. The service is intended for larger volumes of data - hundreds of TB. The block device can also be encrypted on your side (client side) using LUKS. Client-side encryption also means that the transmission of data over the network is encrypted, and in case of eavesdropping during transmission, the data cannot be decrypted. Access to the service is controlled by virtual organizations and corresponding groups.
+
+!!! warning
+    RBD connection is only possible from dedicated IPv4 addresses that are enabled on the firewall in our Data Centers. An RBD image can only be mounted on **ONE** machine at a time; it is not possible for each of your users to mount the same RBD on their workstation - unless the RBD is used for a clustered file system. Usage of clustered file systems over RBD must first be consulted with Data Care support.
+
+???+ note "How to get RBD service?"
+    To connect to the RBD service you have to contact support at:
+    `support@cesnet.cz`
+
+----
+## RBD elementary use cases
+In the following section you can find the description of elementary use cases related to the RBD service.
+
+### Large dataset backups requiring local filesystem
+If you have a centralized backup system (script suite, bacula, BackupPC…) requiring a local file system, then we recommend using the [RBD service](rbd-setup.md), see the figure below. The RBD image can be connected directly to the machine where the central backup system is running, as a block device. RBD can then be equipped with snapshots, see the service description, as protection against unwanted overwriting or ransomware attacks.
+
+{ style="display: block; margin: 0 auto" }
+
+### Centralized shared storage for internal redistribution
+If you need to store live data and provide the storage to individual users, then you can use the [RBD](rbd-setup.md) service, which you can connect to your infrastructure using a Linux machine. You can create a file system on the connected block device, or equip it with encryption, and then re-export it inside your infrastructure using, for example, samba, NFS, ftp, ssh, etc. (also in the form of containers ensuring the distribution of protocols to your internal network). Client-side encryption also means that the data transmission over the network is encrypted, and in case of eavesdropping during transmission, the data cannot be decrypted. The advantage is that you can create groups and manage rights according to your preferences, or use your local database of users and groups. The RBD block device can also be equipped with snapshots at the RBD level, so if data is accidentally deleted, it is possible to return to a snapshot from the previous day, for example.
+
+{ style="display: block; margin: 0 auto" }
+
+## RBD Data Reliability (Data Redundancy) - replicated vs erasure coding
+The section below describes the approaches to data redundancy applied to the storage pool. The RBD service can be equipped with **replicated** or **erasure code (EC)** redundancy, and with **synchronous/asynchronous geographical replication**.
+
+### Replicated
+Your data is stored in three copies in the data center. In case one copy is corrupted, the original data is still readable in an undamaged form, and the damaged data is restored in the background. Using a service with the replicated flag also allows for faster reads, as it is possible to read from all replicas at the same time. Using a service with the replicated flag reduces write speed, because the write operation waits for write confirmation from all three replicas.
+
+???+ note "Suitable for?"
+    Suitable for smaller volumes of live data with a preference for reading speed (not very suitable for large data volumes).
+
+### Erasure Coding (EC)
+Erasure coding (EC) is a data protection method, similar to the dynamic RAID known from disk arrays. It is a method where data is divided into individual fragments, which are then stored with some redundancy across the data storage. Therefore, if some disks (or an entire storage server) fail, the data is still accessible and will be restored in the background. It is thus not possible for your data to sit on a single disk that gets damaged, causing you to lose your data.
+
+???+ note "Suitable for?"
+    Suitable, for example, for storing large data volumes.
+
+### RBD snapshots
+Snapshots can be used at the RBD (replicated/erasure coding) level. Snapshots are controlled from the client side. [RBD snapshotting](rbd-setup.md) is one of the replacement options for the `tape_tape` policy - snapshots mirrored to another geographic location, see below.
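+
+As an illustration only (a sketch; the pool and image names come from the credentials you received, and the snapshot name is arbitrary), client-side snapshot control with the standard `rbd` tool looks roughly like this:
+
+    rbd --id rbd_user snap create pool_name/image_name@snap-2022-08-25
+    rbd --id rbd_user snap ls pool_name/image_name
+    rbd --id rbd_user snap rollback pool_name/image_name@snap-2022-08-25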
+
+### Synchronous geographical replication
+Synchronous geographical replication protects against data center failure. It degrades write speed, because the system waits for a successful write confirmation at both geographic locations. If you feel that you need this service, please contact us.
+
+### Asynchronous geographical replication
+Asynchronous geographical replication partially protects against data center failure (certain data may be lost between individual asynchronous synchronizations due to the time lag). However, with asynchronous geographical replication, in case of data corruption (ransomware) there is time to intervene, interrupt the replication, and save your data. If you feel that you need this service, please contact us.
diff --git a/object-storage/rbd-setup.md b/object-storage/rbd-setup.md
new file mode 100644
index 0000000000000000000000000000000000000000..dd032be0b8ed934a9a1a34d99f52c807591bb963
--- /dev/null
+++ b/object-storage/rbd-setup.md
@@ -0,0 +1,245 @@
+---
+languages:
+  - en
+  - cs
+---
+# Connecting and configuring Ceph RBD using a Linux client
+Ceph RBD (RADOS Block Device) provides users with a network block device that looks like a local disk on the system where it is connected. The block device is fully managed by the user. A user can create a file system there and use it according to their needs.
+
+???+ note "Advantages of RBD"
+    * Possibility to enlarge the image of the block device.
+    * Import / export of the block device image.
+    * Striping and replication within the cluster.
+    * Possibility to create read-only snapshots and restore snapshots (if you need snapshots on the RBD level, you must contact us).
+    * Possibility to connect using a Linux or QEMU KVM client.
+
+## Setup of RBD client (Linux)
+
+!!! warning
+    To connect RBD, it is recommended to have a newer kernel version on your system. In older kernel versions the appropriate RBD connection modules are outdated, so not all advanced features are supported. The developers even recommend kernel version 5.0 or higher; however, some functionality has been backported to the CentOS 7 kernel.
+
+???+ note "Ceph client version"
+    For proper functioning it is highly desirable to use the same version of the Ceph tools as the version currently operated on our clusters. Currently it is version 16, code name Pacific. So we will set up the appropriate repositories, see below.
+
+### CentOS setup
+First, install the release.asc key for the Ceph repository.
+
+    sudo rpm --import 'https://download.ceph.com/keys/release.asc'
+
+In the directory **/etc/yum.repos.d/** create a text file **ceph.repo** and fill in the record for the Ceph tools.
+
+    [ceph]
+    name=Ceph packages for $basearch
+    baseurl=https://download.ceph.com/rpm-nautilus/el7/$basearch
+    enabled=1
+    priority=2
+    gpgcheck=1
+    gpgkey=https://download.ceph.com/keys/release.asc
+
+Some packages from the Ceph repository also require third-party libraries for proper functioning, so add the EPEL repository.
+
+CentOS 7
+
+    sudo yum install -y epel-release
+
+CentOS 8
+
+    sudo dnf install -y epel-release
+
+RedHat 7
+
+    sudo yum install -y https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm
+
+Finally, install the basic tools for Ceph, which also include RBD support.
+
+CentOS 7
+
+    sudo yum install ceph-common
+
+CentOS 8
+
+    sudo dnf install ceph-common
+
+### Ubuntu/Debian setup
+Ubuntu/Debian includes all necessary packages natively, so you can just run the following command.
+
+    sudo apt install ceph
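+
+To verify which client version you ended up with (and compare it with the cluster version mentioned above), a quick check is:
+
+    rbd --version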
+
+## RBD configuration and its mapping
+
+Use the credentials which you received from the system administrator to configure and connect the RBD. These are the following:
+
+ * pool name: **rbd_vo_poolname**
+ * image name: **vo_name_username**
+ * keyring: **[client.rbd_user] key = key_hash ==**
+
+In the directory **/etc/ceph/** create the text file **ceph.conf** with the following content.
+
+???+ note "CL1 Data Storage"
+    [global]<br/>
+    fsid = 19f6785a-70e1-45e8-a23a-5cff0c39aa54<br/>
+    mon_host = [v2:78.128.244.33:3300,v1:78.128.244.33:6789],[v2:78.128.244.37:3300,v1:78.128.244.37:6789],[v2:78.128.244.41:3300,v1:78.128.244.41:6789]<br/>
+    auth_client_required = cephx
+
+???+ note "CL2 Data Storage"
+    [global]<br/>
+    fsid = 3ea58563-c8b9-4e63-84b0-a504a5c71f76<br/>
+    mon_host = [v2:78.128.244.65:3300/0,v1:78.128.244.65:6789/0],[v2:78.128.244.69:3300/0,v1:78.128.244.69:6789/0],[v2:78.128.244.71:3300/0,v1:78.128.244.71:6789/0]<br/>
+    auth_client_required = cephx
+
+???+ note "CL3 Data Storage"
+    [global]<br/>
+    fsid = b16aa2d2-fbe7-4f35-bc2f-3de29100e958<br/>
+    mon_host = [v2:78.128.244.240:3300/0,v1:78.128.244.240:6789/0],[v2:78.128.244.241:3300/0,v1:78.128.244.241:6789/0],[v2:78.128.244.242:3300/0,v1:78.128.244.242:6789/0]<br/>
+    auth_client_required = cephx
+
+???+ note "CL4 Data Storage"
+    [global]<br/>
+    fsid = c4ad8c6f-7ef3-4b0e-873c-b16b00b5aac4<br/>
+    mon_host = [v2:78.128.245.29:3300/0,v1:78.128.245.29:6789/0] [v2:78.128.245.30:3300/0,v1:78.128.245.30:6789/0] [v2:78.128.245.31:3300/0,v1:78.128.245.31:6789/0]<br/>
+    auth_client_required = cephx
+
+Further, in the directory **/etc/ceph/** create the text file **ceph.keyring** and save the keyring in that file, see the example below.
+
+    [client.rbd_user]
+    key = sdsaetdfrterp+sfsdM3iKY5teisfsdXoZ5==
+
+!!! warning
+    If the location of the files `ceph.conf` and `username.keyring` differs from the default directory **/etc/ceph/**, the corresponding paths must be specified during mapping, see below.
+
+        sudo rbd -c /home/username/ceph/ceph.conf -k /home/username/ceph/username.keyring --id rbd_user device map name_pool/name_image
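+
+With the files in the default location, the mapping itself is simply (a sketch; `rbd_user`, `name_pool`, and `name_image` stand for the credentials you received):
+
+    sudo rbd --id rbd_user device map name_pool/name_image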
+
+Then check the connection in kernel messages.
+
+    dmesg
+
+Now check the status of RBD.
+
+    sudo rbd device list | grep "name_image"
+
+## Encrypting and creating a file system
+
+The next step is to encrypt the mapped image. Use **cryptsetup-luks** for encryption.
+
+    sudo yum install cryptsetup-luks
+
+Then encrypt the device.
+
+    sudo cryptsetup -s 512 luksFormat --type luks2 /dev/rbdX
+
+Finally, check the settings.
+
+    sudo cryptsetup luksDump /dev/rbdX
+
+In order to perform further actions on an encrypted device, it must be decrypted first.
+
+    sudo cryptsetup luksOpen /dev/rbdX luks_rbdX
+
+???+ note ""
+    We recommend using XFS instead of EXT4 for larger images, or for those that will need to be enlarged to more than 200 TB over time, because EXT4 has a limit on the number of inodes.
+
+Now create a file system on the device; here is an example with XFS.
+
+    sudo mkfs.xfs -K /dev/mapper/luks_rbdX
+
+!!! warning
+    If you use XFS, do not use the `nobarrier` option while mounting, it could cause data loss!
+
+Once the file system is ready, we can mount the device in a pre-created folder in /mnt/.
+
+    sudo mount /dev/mapper/luks_rbdX /mnt/rbd
+
+## Ending work with RBD
+
+Unmount the volume.
+
+    sudo umount /mnt/rbd/
+
+Close the encrypted volume.
+
+    sudo cryptsetup luksClose /dev/mapper/luks_rbdX
+
+Unmap the volume.
+
+    sudo rbd --id rbd_user device unmap /dev/rbdX
+
+???+ note ""
+    To get better performance, choose an appropriate size of the `read_ahead` cache depending on your memory size.
+
+    Example for 8GB:<br/>
+
+        echo 8388608 > /sys/block/rbd0/queue/read_ahead_kb
+
+    Example for 512MB:<br/>
+
+        echo 524288 > /sys/block/rbd0/queue/read_ahead_kb
+
+    To apply the changes, you have to unmap the image and map it again.
+
+    The approach described above is not persistent (it won't survive a reboot). To make it persistent, you have to add the following line into the "/etc/udev/rules.d/50-read-ahead-kb.rules" file.
+
+        # Setting specific kernel parameters for a subset of block devices (Ceph RBD)
+        KERNEL=="rbd[0-9]*", ENV{DEVTYPE}=="disk", ACTION=="add|change", ATTR{bdi/read_ahead_kb}="524288"
+
+## Permanent mapping of RBD
+This section covers settings for automatic RBD connection, including LUKS encryption and filesystem mounting, and for proper disconnection (in reverse order) when the machine is switched off in a controlled manner.
+
+### RBD image
+Edit the configuration file in the path `/etc/ceph/rbdmap` by inserting the following lines.
+
+    # RbdDevice Parameters
+    #poolname/imagename id=client,keyring=/etc/ceph/ceph.client.keyring
+    pool_name/image_name id=rbd_user,keyring=/etc/ceph/ceph.keyring
+
+### LUKS
+Edit the configuration file in the path `/etc/crypttab` by inserting the following lines.
+
+    # <target name> <source device> <key file> <options>
+    rbd_luks_pool /dev/rbd/pool_name/image_name /etc/ceph/luks.keyfile luks,_netdev
+
+where **/etc/ceph/luks.keyfile** is the LUKS key.
+
+???+ note ""
+    The path to the block device ("<source device>") is generally `/dev/rbd/$POOL/$IMAGE`.
+
+### fstab file
+Edit the configuration file in the path `/etc/fstab` by inserting the following lines.
+
+    # <file system> <mount point> <type> <options> <dump> <pass>
+    /dev/mapper/rbd_luks_pool /mnt/rbd_luks_pool btrfs defaults,noatime,auto,_netdev 0 0
+
+???+ note ""
+    The path to the LUKS container ("<file system>") is generally `/dev/mapper/$LUKS_NAME`,
+    where `$LUKS_NAME` is defined in `/etc/crypttab` (as "<target name>").
+
+### systemd unit
+Edit the configuration file in the path `/etc/systemd/system/systemd-cryptsetup@rbd_luks_pool.service.d/10-deps.conf` by inserting the following lines.
+
+    [Unit]
+    After=rbdmap.service
+    Requires=rbdmap.service
+    Before=mnt-rbd_luks_pool.mount
+
+???+ note ""
+    In one case, on Debian 10, the systemd unit was for some reason named `ceph-rbdmap.service` instead of `rbdmap.service` (the `After=` and `Requires=` lines must be adjusted accordingly).
+
+----
+
+### Manual connection
+If the dependencies of the systemd units are correct, this performs the RBD map, unlocks LUKS, and mounts all the automatic filesystems dependent on rbdmap that the specified .mount unit needs (it mounts both images in the described configuration).
+
+    systemctl start mnt-rbd_luks_pool.mount
+
+### Manual disconnection
+If the dependencies are set correctly, this command performs `umount`, LUKS `close`, and RBD unmap.
+
+    systemctl stop rbdmap.service
+
+(alternatively `systemctl stop ceph-rbdmap.service`)
+
+### Image resize
+When resizing an encrypted image, you need to follow this order; the key step is the line with cryptsetup `--verbose resize image_name`.
+ + rbd resize rbd_pool_name/image_name --size 200T + cryptsetup --verbose resize image_name + mount /storage/rbd/image_name + xfs_growfs /dev/mapper/image_name diff --git a/object-storage/rclone-screenshots/rclone-cmd-encrypted1.png b/object-storage/rclone-screenshots/rclone-cmd-encrypted1.png new file mode 100644 index 0000000000000000000000000000000000000000..6ba56c8b93a44fd3b5c6b0a565a6673938453b2a Binary files /dev/null and b/object-storage/rclone-screenshots/rclone-cmd-encrypted1.png differ diff --git a/object-storage/rclone-screenshots/rclone-cmd-encrypted2.png b/object-storage/rclone-screenshots/rclone-cmd-encrypted2.png new file mode 100644 index 0000000000000000000000000000000000000000..1fb7335ad8e48424bd828efc0a487b7d1c625f17 Binary files /dev/null and b/object-storage/rclone-screenshots/rclone-cmd-encrypted2.png differ diff --git a/object-storage/rclone-screenshots/rclone-cmd-encrypted3.png b/object-storage/rclone-screenshots/rclone-cmd-encrypted3.png new file mode 100644 index 0000000000000000000000000000000000000000..45418b3fc9c7b3405d8b3e7a8517c20871132b4a Binary files /dev/null and b/object-storage/rclone-screenshots/rclone-cmd-encrypted3.png differ diff --git a/object-storage/rclone-screenshots/rclone-cmd-encrypted4.png b/object-storage/rclone-screenshots/rclone-cmd-encrypted4.png new file mode 100644 index 0000000000000000000000000000000000000000..a18669e8fe0033fd76c883a2cf62d381f4dff3d1 Binary files /dev/null and b/object-storage/rclone-screenshots/rclone-cmd-encrypted4.png differ diff --git a/object-storage/rclone-screenshots/rclone-cmd-encrypted5.png b/object-storage/rclone-screenshots/rclone-cmd-encrypted5.png new file mode 100644 index 0000000000000000000000000000000000000000..4d0b17420da3fe4146957321b2657b10f464fed3 Binary files /dev/null and b/object-storage/rclone-screenshots/rclone-cmd-encrypted5.png differ diff --git a/object-storage/rclone-screenshots/rclone-cmd-encrypted6.png b/object-storage/rclone-screenshots/rclone-cmd-encrypted6.png new file mode 100644 index 0000000000000000000000000000000000000000..e2aea1a6f1ca640546eb19b38b93d365c89151f3 Binary files /dev/null and b/object-storage/rclone-screenshots/rclone-cmd-encrypted6.png differ diff --git a/object-storage/rclone-screenshots/rclone-cmd-encrypted7.png b/object-storage/rclone-screenshots/rclone-cmd-encrypted7.png new file mode 100644 index 0000000000000000000000000000000000000000..d1305df6f54c29feb8d6abf53e26525bcf4145d9 Binary files /dev/null and b/object-storage/rclone-screenshots/rclone-cmd-encrypted7.png differ diff --git a/object-storage/rclone-screenshots/rclone-cmd-encrypted8.png b/object-storage/rclone-screenshots/rclone-cmd-encrypted8.png new file mode 100644 index 0000000000000000000000000000000000000000..d36865ace52feb0cf2fb5c66217fa4a02809240f Binary files /dev/null and b/object-storage/rclone-screenshots/rclone-cmd-encrypted8.png differ diff --git a/object-storage/rclone-screenshots/rclone-cmd1.png b/object-storage/rclone-screenshots/rclone-cmd1.png new file mode 100644 index 0000000000000000000000000000000000000000..543e2fe8947d6a8ab47879c9fa7014b7ac689f0f Binary files /dev/null and b/object-storage/rclone-screenshots/rclone-cmd1.png differ diff --git a/object-storage/rclone-screenshots/rclone-cmd10.png b/object-storage/rclone-screenshots/rclone-cmd10.png new file mode 100644 index 0000000000000000000000000000000000000000..5d9feade197895f2ef0f3bca3b12516a655fb74d Binary files /dev/null and b/object-storage/rclone-screenshots/rclone-cmd10.png differ diff --git 
a/object-storage/rclone-screenshots/rclone-cmd11.png b/object-storage/rclone-screenshots/rclone-cmd11.png new file mode 100644 index 0000000000000000000000000000000000000000..1162e0bad142c6e5684447f739f09041fa58524f Binary files /dev/null and b/object-storage/rclone-screenshots/rclone-cmd11.png differ diff --git a/object-storage/rclone-screenshots/rclone-cmd12.png b/object-storage/rclone-screenshots/rclone-cmd12.png new file mode 100644 index 0000000000000000000000000000000000000000..855a78213a1f2684f686215b0ee96a43a55fa90f Binary files /dev/null and b/object-storage/rclone-screenshots/rclone-cmd12.png differ diff --git a/object-storage/rclone-screenshots/rclone-cmd13.png b/object-storage/rclone-screenshots/rclone-cmd13.png new file mode 100644 index 0000000000000000000000000000000000000000..9b37a81143400227d321ebcbe397dce070f5441a Binary files /dev/null and b/object-storage/rclone-screenshots/rclone-cmd13.png differ diff --git a/object-storage/rclone-screenshots/rclone-cmd2.png b/object-storage/rclone-screenshots/rclone-cmd2.png new file mode 100644 index 0000000000000000000000000000000000000000..2d9bf7f7cfb5a445696ad4b137d25f778a253ec0 Binary files /dev/null and b/object-storage/rclone-screenshots/rclone-cmd2.png differ diff --git a/object-storage/rclone-screenshots/rclone-cmd3.png b/object-storage/rclone-screenshots/rclone-cmd3.png new file mode 100644 index 0000000000000000000000000000000000000000..611be7ea01f8424eb260676ee506e53c4a05eabc Binary files /dev/null and b/object-storage/rclone-screenshots/rclone-cmd3.png differ diff --git a/object-storage/rclone-screenshots/rclone-cmd4.png b/object-storage/rclone-screenshots/rclone-cmd4.png new file mode 100644 index 0000000000000000000000000000000000000000..f4cbf916f6ae2edd8f0cedf4b7c5208477fb0c69 Binary files /dev/null and b/object-storage/rclone-screenshots/rclone-cmd4.png differ diff --git a/object-storage/rclone-screenshots/rclone-cmd5.png b/object-storage/rclone-screenshots/rclone-cmd5.png new file mode 100644 index 0000000000000000000000000000000000000000..2b7b2caa61fcfcdd6cc1a54820eebb821ec0c293 Binary files /dev/null and b/object-storage/rclone-screenshots/rclone-cmd5.png differ diff --git a/object-storage/rclone-screenshots/rclone-cmd6.png b/object-storage/rclone-screenshots/rclone-cmd6.png new file mode 100644 index 0000000000000000000000000000000000000000..a347efbcb0b459e82fcbfa2fefdb527cdcd70104 Binary files /dev/null and b/object-storage/rclone-screenshots/rclone-cmd6.png differ diff --git a/object-storage/rclone-screenshots/rclone-cmd7.png b/object-storage/rclone-screenshots/rclone-cmd7.png new file mode 100644 index 0000000000000000000000000000000000000000..3f08d1145e5fbeffc7b87ec6aa32a355dc588229 Binary files /dev/null and b/object-storage/rclone-screenshots/rclone-cmd7.png differ diff --git a/object-storage/rclone-screenshots/rclone-cmd8.png b/object-storage/rclone-screenshots/rclone-cmd8.png new file mode 100644 index 0000000000000000000000000000000000000000..b38766bd1225ccdc326c2859b38abd446f89af53 Binary files /dev/null and b/object-storage/rclone-screenshots/rclone-cmd8.png differ diff --git a/object-storage/rclone-screenshots/rclone-cmd9.png b/object-storage/rclone-screenshots/rclone-cmd9.png new file mode 100644 index 0000000000000000000000000000000000000000..63fe3153335c7aec0281bb66c6e4622689a8cea9 Binary files /dev/null and b/object-storage/rclone-screenshots/rclone-cmd9.png differ diff --git a/object-storage/rclone-screenshots/rclone-gui-encrypted1.png b/object-storage/rclone-screenshots/rclone-gui-encrypted1.png 
new file mode 100644 index 0000000000000000000000000000000000000000..a2cfa4b31fb892b3bf3ba24a7af147f8ab178039 Binary files /dev/null and b/object-storage/rclone-screenshots/rclone-gui-encrypted1.png differ diff --git a/object-storage/rclone-screenshots/rclone-gui-encrypted2.png b/object-storage/rclone-screenshots/rclone-gui-encrypted2.png new file mode 100644 index 0000000000000000000000000000000000000000..9a8c24d14ef854ce1b60b405464096d15b595f67 Binary files /dev/null and b/object-storage/rclone-screenshots/rclone-gui-encrypted2.png differ diff --git a/object-storage/rclone-screenshots/rclone-gui-encrypted3.png b/object-storage/rclone-screenshots/rclone-gui-encrypted3.png new file mode 100644 index 0000000000000000000000000000000000000000..a32999ba3f2dc427e18d0772bc06481c25798417 Binary files /dev/null and b/object-storage/rclone-screenshots/rclone-gui-encrypted3.png differ diff --git a/object-storage/rclone-screenshots/rclone-gui-encrypted4.png b/object-storage/rclone-screenshots/rclone-gui-encrypted4.png new file mode 100644 index 0000000000000000000000000000000000000000..721118639fc8cea1f6c16346d332551cba1cc051 Binary files /dev/null and b/object-storage/rclone-screenshots/rclone-gui-encrypted4.png differ diff --git a/object-storage/rclone-screenshots/rclone-gui-encrypted5.png b/object-storage/rclone-screenshots/rclone-gui-encrypted5.png new file mode 100644 index 0000000000000000000000000000000000000000..cd107d65122f5ba9af989170468edd7d2b7bdb00 Binary files /dev/null and b/object-storage/rclone-screenshots/rclone-gui-encrypted5.png differ diff --git a/object-storage/rclone-screenshots/rclone-gui-encrypted6.png b/object-storage/rclone-screenshots/rclone-gui-encrypted6.png new file mode 100644 index 0000000000000000000000000000000000000000..f4dfb9c74c95f93be84ed08ea41c76cd375662bf Binary files /dev/null and b/object-storage/rclone-screenshots/rclone-gui-encrypted6.png differ diff --git a/object-storage/rclone-screenshots/rclone-gui-encrypted7.png b/object-storage/rclone-screenshots/rclone-gui-encrypted7.png new file mode 100644 index 0000000000000000000000000000000000000000..1c7fba3a64bc9dee1584b0933503ab139dcc5ef0 Binary files /dev/null and b/object-storage/rclone-screenshots/rclone-gui-encrypted7.png differ diff --git a/object-storage/rclone-screenshots/rclone-gui-encrypted8.png b/object-storage/rclone-screenshots/rclone-gui-encrypted8.png new file mode 100644 index 0000000000000000000000000000000000000000..6247951a2e0d568d9e64647ad5915c7070076191 Binary files /dev/null and b/object-storage/rclone-screenshots/rclone-gui-encrypted8.png differ diff --git a/object-storage/rclone-screenshots/rclone-gui-s3-1.png b/object-storage/rclone-screenshots/rclone-gui-s3-1.png new file mode 100644 index 0000000000000000000000000000000000000000..f31f58bca298ed00ea8223b2cb847b692f552e3f Binary files /dev/null and b/object-storage/rclone-screenshots/rclone-gui-s3-1.png differ diff --git a/object-storage/rclone-screenshots/rclone-gui-s3-2.png b/object-storage/rclone-screenshots/rclone-gui-s3-2.png new file mode 100644 index 0000000000000000000000000000000000000000..3fe9cd36267f8b7fb9850b8ea5ec0cff01a1b711 Binary files /dev/null and b/object-storage/rclone-screenshots/rclone-gui-s3-2.png differ diff --git a/object-storage/rclone-screenshots/rclone-gui1.png b/object-storage/rclone-screenshots/rclone-gui1.png new file mode 100644 index 0000000000000000000000000000000000000000..acfeb7e5ba7eec2d3061dc31674c4e34fdd1d9e1 Binary files /dev/null and b/object-storage/rclone-screenshots/rclone-gui1.png differ diff 
--git a/object-storage/rclone-screenshots/rclone-gui4.png b/object-storage/rclone-screenshots/rclone-gui4.png
new file mode 100644
index 0000000000000000000000000000000000000000..7a89c2b1ea39eb22625e7d38e0a963d53c9f6534
Binary files /dev/null and b/object-storage/rclone-screenshots/rclone-gui4.png differ
diff --git a/object-storage/rclone-screenshots/rclone-gui_upload.png b/object-storage/rclone-screenshots/rclone-gui_upload.png
new file mode 100644
index 0000000000000000000000000000000000000000..5b4fce170aa9568d0760ef6d6b8031aec5226bc7
Binary files /dev/null and b/object-storage/rclone-screenshots/rclone-gui_upload.png differ
diff --git a/object-storage/rclone-screenshots/rclone-path-win1.png b/object-storage/rclone-screenshots/rclone-path-win1.png
new file mode 100644
index 0000000000000000000000000000000000000000..d94c207931b553249df8aa0588cc6293720e3b55
Binary files /dev/null and b/object-storage/rclone-screenshots/rclone-path-win1.png differ
diff --git a/object-storage/rclone-screenshots/rclone-path-win2.png b/object-storage/rclone-screenshots/rclone-path-win2.png
new file mode 100644
index 0000000000000000000000000000000000000000..ec6befc2e64209b9de45a6fbf2dc4ba54ad6f690
Binary files /dev/null and b/object-storage/rclone-screenshots/rclone-path-win2.png differ
diff --git a/object-storage/rclone-screenshots/rclone-path-win3.png b/object-storage/rclone-screenshots/rclone-path-win3.png
new file mode 100644
index 0000000000000000000000000000000000000000..dfb8cb585d6dfcc149a902bb7582dd1d0505b83d
Binary files /dev/null and b/object-storage/rclone-screenshots/rclone-path-win3.png differ
diff --git a/object-storage/rclone-screenshots/rclone-path-win4.png b/object-storage/rclone-screenshots/rclone-path-win4.png
new file mode 100644
index 0000000000000000000000000000000000000000..e3021259061ea91ac48759fd93ff80a45fb59722
Binary files /dev/null and b/object-storage/rclone-screenshots/rclone-path-win4.png differ
diff --git a/object-storage/rclone.md b/object-storage/rclone.md
new file mode 100644
index 0000000000000000000000000000000000000000..af2d31273c300153a9e00832e44c018a0efb3b7b
--- /dev/null
+++ b/object-storage/rclone.md
@@ -0,0 +1,492 @@
+---
+languages:
+  - en
+  - cs
+---
+# Rclone
+This guide covers the configuration of the rclone tool. Rclone is a swiss-army-knife tool for connecting to multiple types of storage; however, the following guides are limited to Object Storage, namely the S3 service. The guides are suited for Windows as well as Linux users. Using rclone with S3 enables creating buckets, syncing folders, uploading and downloading files, and much more.
+
+## Downloading and installation of rclone tool
+First you need to download and unzip the desired [rclone](https://rclone.org/downloads/) version according to the system you operate.
+
+!!! warning
+    **We strongly recommend** using up-to-date versions of the rclone tool available on the [rclone websites](https://rclone.org/downloads/). Using the rclone version from system repositories can cause errors due to it being outdated. In case of a manual installation into the user profile, see below, you can use the following command for an update:
+
+        rclone selfupdate
+
+----
+
+### Linux - manual installation into the user profile
+
+We need to unzip the [rclone archive](https://rclone.org/downloads/) after download.
+
+    unzip rclone-v1.59.1-linux-amd64.zip
+
+Then we need to copy the rclone binary into the pre-prepared bin folder in the user's home directory.
+
+Then we need to copy the rclone binary into the pre-prepared `bin` folder in the user profile/home.
+
+    cp ./rclone-v1.59.1-linux-amd64/rclone /home/user/bin/
+
+In the last step, we need to add the path to the rclone binary to the PATH environment variable.
+
+    PATH=/home/user/bin:$PATH
+
+???+ note "Persistent setup"
+    For a **persistent presence** of the rclone binary path in the PATH variable, you can add the following line to the **.bashrc file**:
+
+        echo 'PATH=/home/user/bin:$PATH' >> .bashrc
+
+Alternatively, you can place the rclone binary into the system path:
+
+    /usr/local/bin
+
+In the end, we can quickly check the PATH variable to see whether the desired path is present.
+
+    echo $PATH
+
+!!! warning
+    If you installed rclone using the steps above, you can update it with the following command:<br/>
+    **```rclone selfupdate```**<br/>
+    2022/08/25 11:54:07 NOTICE: Successfully updated rclone from version v1.59.0 to version v1.59.1
+
+----
+
+### Windows - manual installation into the user profile
+
+First, you need to prepare a `bin` directory in your user profile, where we will place the rclone.exe file. To open your user profile, you can paste the following into the file browser:
+
+    %USERPROFILE%
+
+In the displayed directory, right-click and choose **New** -> **Folder**. The directory should be named **bin**. Then move the **rclone.exe** file into it. The file **rclone.exe** is present in the unzipped rclone archive, which you downloaded from the [rclone website](https://rclone.org/downloads/).
+
+In the next step, click **Start (1)** and search for **Edit the system environment variables (2)**.
+
+{ style="display: block; margin: 0 auto" }
+
+Then click **Environment variables (1)** in the displayed window.
+{ style="display: block; margin: 0 auto" }
+
+In the section User variables for **UserXY (1)**, select the line with the variable **Path (2)** and then click the **Edit (3)** button.
+
+{ style="display: block; margin: 0 auto" }
+<!-- { width="800" height="600" style="display: block; margin: 0 auto" } -->
+
+You can add the new path by clicking the **New (1)** button; then insert the path to the pre-prepared **"bin" folder (2)**, see below. Confirm the setup by clicking the **OK (3)** button.
+
+    %USERPROFILE%\bin
+
+{ style="display: block; margin: 0 auto" }
+
+In the end, click **OK** and **Apply**.
+
+!!! warning
+    If you installed rclone using the steps above, you can update it with the following command:<br/>
+    **```rclone selfupdate```**<br/>
+    2022/08/25 11:54:07 NOTICE: Successfully updated rclone from version v1.59.0 to version v1.59.1
+
+## Basic configuration of rclone
+Below you can find two guides for the elementary configuration of the rclone tool: the first describes configuration using the command line, the second using the graphical user interface.
+
+----
+
+### Rclone configuration using the command line
+!!! warning
+    To be able to configure the rclone tool using this guide, **first you have to download, unzip and install rclone**; the guide can be found in the [first section](#downloading-and-installation-of-rclone-tool).
+
+
+Rclone has a configuration wizard, which will guide you step-by-step through the configuration of your S3 data storage.
+
+???+ note "Command line in Windows and Linux"
+    **Windows users** need to run **Command Prompt** and then run the command below.
+    **Linux users** can open the **terminal window** and then run the command below.
+
+    **```rclone config```**
+
+{ style="display: block; margin: 0 auto" }
+
+From the list of options, we will choose **New remote** by typing the letter **n**. Then we need to type the name of our connection/storage, e.g. `cesnet_s3`. Then we will choose the **Option Storage**, here **Amazon S3 Compliant Storage Providers…**
+
+{ style="display: block; margin: 0 auto" }
+
+In the next step, we need to choose **Option provider**, here **Ceph Object Storage**.
+
+{ style="display: block; margin: 0 auto" }
+
+Then it is necessary to choose how to enter S3 credentials. **Here we choose Enter AWS credentials in the next step**.
+
+{ style="display: block; margin: 0 auto" }
+
+In the next steps we enter our credentials **access_key_id** and **secret_access_key**, which we obtained from the administrator or which we generated.
+
+{ style="display: block; margin: 0 auto" }
+
+Then we need to select **Option region**. We will leave **this option empty** and continue with **Enter**.
+
+{ style="display: block; margin: 0 auto" }
+
+Then we need to insert **Option endpoint** according to the data center where we generated the credentials, here for example **s3.cl2.du.cesnet.cz**.
+
+{ style="display: block; margin: 0 auto" }
+
+Next is **Option location_constraint**. We will **leave this option empty** and continue with **Enter**.
+
+{ style="display: block; margin: 0 auto" }
+
+The next step is **Option acl**; here we can either choose **Owner gets FULL_CONTROL**, or we can leave this option empty and continue with **Enter**.
+
+{ style="display: block; margin: 0 auto" }
+
+In the next step, we need to choose **Option server_side_encryption**; here we pick the option **None**.
+
+{ style="display: block; margin: 0 auto" }
+
+Then we can choose **Option sse_kms_key_id**; here we pick the option **None**.
+
+{ style="display: block; margin: 0 auto" }
+
+In the next step, we can choose the option **Edit advanced config**; here we choose **No**, option **n**.
+
+{ style="display: block; margin: 0 auto" }
+
+In the last step, we check the configuration and confirm it by typing the letter **y**.
+
+{ style="display: block; margin: 0 auto" }
+
+----
+
+### Rclone configuration using graphical user interface
+
+!!! warning
+    To be able to configure the rclone tool using this guide, **first you have to download, unzip and install rclone**; the guide can be found in the [first section](#downloading-and-installation-of-rclone-tool).
+
+First, you need to run the GUI. **Windows users** need to open **Command Prompt** and run the command below. **Linux users** just need to open the **terminal window** and run the command below.
+
+    rclone rcd --rc-web-gui
+
+The next steps are identical for Windows and Linux.
+
+After starting up the graphical interface, we click in the left menu on the **Configs (1)** button and then on the **Create a New Config (2)** button.
+
+{ style="display: block; margin: 0 auto" }
+
+In the displayed window, we insert the **connection name**, in our example `cesnet_s3cl2`, and then we choose the option **Amazon S3 Compliant Storage Providers**.
+
+{ style="display: block; margin: 0 auto" }
+
+In the next step, we need to insert the **credentials (1)** which we obtained from the administrator or which we generated. You have to insert **Ceph** into the line denoted **Choose your S3 provider**. Then we need to insert the **S3 endpoint address (2)**. Then we can click **Next**.
+
+{ style="display: block; margin: 0 auto" }
+
+
+!!! warning
+    Please be careful when making modifications in the **Configs** section. The rclone GUI sometimes **does not save the changes** in the configuration. We strongly recommend cross-checking the **[configuration file](#configuration-file)** after saving.
+
+
+**Uploading the data from your local machine**
+
+After the configuration, we can start to transfer the data.
+
+In the left menu, click on the **Explorer (1)** button. Then select **the name of the configuration (2)**, for example `cesnet_s3cl2`, and click on the **Open (3)** button. A window should appear with the buckets and files from the configured data storage.
+
+{ style="display: block; margin: 0 auto" }
+
+!!! warning
+    The graphical user interface of rclone **DOES NOT SUPPORT** the creation of empty buckets and directories. If you copy your data from the local machine, you have to copy a directory with data. Alternatively, you can prepare empty buckets using the [command line](#rclone-basic-controls).
+
+If you wish to upload your data, then in the displayed window click on the **upload icon (1)**. Then you can select the data from your disk or drag and drop them into the window.
+
+{ style="display: block; margin: 0 auto" }
+----
+### Configuration file
+!!! warning
+    The configuration file can be found in the location described below. The configuration file stores the credentials and all selected options.
+
+???+ note "Windows config file"
+    C:\Users\DedaLebeda\AppData\Roaming\rclone\rclone.conf<br/>
+    <br/>[cesnet_s3]<br/>
+    type = s3<br/>
+    provider = Ceph<br/>
+    access_key_id = my-access-key<br/>
+    secret_access_key = my-secret-key<br/>
+    endpoint = s3.cl2.du.cesnet.cz<br/>
+    acl = private
+
+???+ note "Linux config file"
+    ~/.config/rclone/rclone.conf<br/>
+    <br/>[cesnet_s3]<br/>
+    type = s3<br/>
+    provider = Ceph<br/>
+    access_key_id = my-access-key<br/>
+    secret_access_key = my-secret-key<br/>
+    endpoint = s3.cl2.du.cesnet.cz<br/>
+    acl = private<br/>
+
+## Rclone basic controls
+
+!!! warning
+    All available commands for rclone can be listed using the command
+
+        rclone help
+
+    Alternatively, you can find the rclone guide on the [rclone website](https://rclone.org/commands/). Below are described selected commands to control buckets, directories and files.
+
+### Listing buckets and directories
+
+**Listing of the available profiles/connections.**
+
+    rclone listremotes
+    cesnet_s3_encrypted:
+    cesnet_s3cl2:
+    sftp_du4:
+    sftp_du5:
+
+**Listing of buckets of the selected profile/connection.**
+
+    rclone lsd cesnet_s3cl2:
+    -1 2020-11-11 08:53:48 -1 111
+    -1 2022-07-28 10:03:20 -1 test
+
+### Creation of the bucket, copying, deletion...
+
+**Creation of a new bucket.**
+
+    rclone mkdir cesnet_s3cl2:test-bucket
+
+**Deletion of a bucket.**
+
+    rclone rmdir cesnet_s3cl2:test-bucket
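+
+**Checking the size of a bucket.**
+
+A quick way to see how much data and how many objects a bucket holds is the `size` command; a small sketch reusing the example bucket name from above:
+
+    rclone size cesnet_s3cl2:test-bucket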
+
+**Copying the files.**
+
+!!! warning
+    Rclone cannot create empty folders, see the error below.
+
+        rclone mkdir cesnet_s3cl2:test-bucket/empty-directory
+        2022/08/24 12:18:36 NOTICE: S3 bucket test-bucket path empty-directory: Warning: running mkdir on a remote which can't have empty directories does nothing
+
+    The solution is to use a full path inside the bucket, including a non-existing directory, during copying. **If you type a non-existing directory, rclone will create it**, see the example below.
+
+Let's copy the files:
+
+    rclone copy /home/user/test_file1.pdf cesnet_s3cl2:test-bucket/new-dir1/new-dir2/
+
+Then we can check the files using the ls command, where we can see that the folders have been created, namely new-dir1 and new-dir2:
+
+    rclone ls cesnet_s3cl2:test-bucket
+    3955866 new-dir1/new-dir2/test_file1.pdf
+
+**File deletion.**
+
+To delete a particular file, we can use either the command **deletefile**, or the command **delete** to remove all files in the given path.
+
+    rclone deletefile cesnet_s3cl2:test-bucket/new-dir1/new-dir2/test_file1.pdf
+
+!!! warning
+    If you delete the only file (object) in a directory, leaving behind an **empty directory structure**, the empty directories will be deleted as well! In object storage, directories exist only as part of the names of particular objects (files), so the deletion of empty directories is expected behavior.
+
+### Directory syncing
+
+To sync directories you can use the `sync` command. Synchronization affects the content only on the target side; no changes are performed on the source side.
+
+Below is an exemplary rclone sync command. The command contains recommended options, which are described below.
+
+    rclone sync --dry-run --progress --fast-list /home/user/source-dir cesnet_s3cl2:test-bucket/
+
+???+ note "Syncing process"
+    The command above always syncs the content of the source directory, whether or not you put a slash at the end of the source directory. **The behavior of rclone is, in this respect, different from the behavior of rsync.**
+
+The command above contains several recommended options.
+
+The option dry-run performs a trial sync and lists the potential changes.
+
+    --dry-run
+
+The option progress shows the continuous progress of the sync.
+
+    --progress
+
+The option fast-list limits the number of API requests, which can speed up transfers of larger datasets. It uses one request to read information about 1000 objects and stores it in memory.
+
+    --fast-list
+
+The option interactive lets you interactively decide which change (on the target data storage) to accept or reject.
+
+    --interactive
+
+### Data integrity checks
+
+???+ note "Enhancing the speed of checking"
+    All commands related to data integrity checks should contain the `--fast-list` option, see above. Using the `--fast-list` option will enhance the speed of the integrity checks.
+
+Rclone allows testing the integrity of transferred data.
+
+    rclone check --fast-list C:/Users/Alfred/source-for-sync/my-local-data cesnet_s3cl2:test-sync
+
+The command checks the checksums on the source side as well as on the target side. For fast checks, you can use the `--size-only` option, which compares only file sizes.
+
+    rclone check --fast-list --size-only C:/Users/Alfred/source-for-sync/my-local-data cesnet_s3cl2:test-sync
+
+!!! warning
+    To check data integrity on encrypted buckets, please use the `cryptcheck` command, which is described [in the guides related to encrypted buckets](#check-of-encrypted-data-integrity). Using the check command on an encrypted volume forces a download of all data in the checked path. Forced downloads are unnecessary and can stall your client.
+
+## Configuration and controls of encrypted bucket
+
+This section describes the configuration and controls of encrypted buckets using the rclone tool. This is client-side encryption. Below are the guides for setup using the command line and for setup using the graphical user interface.
+
+### Configuration using the command line
+
+!!! warning
+    To be able to configure the rclone tool using this guide, **first you have to download, unzip and install rclone**; the guide can be found in the [first section](#downloading-and-installation-of-rclone-tool).
+
+Rclone has a wizard that eases the setup of an encrypted bucket.
+
+**Windows users** need the **Command Prompt tool**, where they can directly start the rclone configuration using the command below.
+
+**Linux users** just need to open the **Terminal window** and continue with the same rclone command.
+
+    rclone config
+
+{ style="display: block; margin: 0 auto" }
+
+In the displayed list of options, we select **New remote** by typing **n**. Then we insert the name of our data storage, for instance `cesnet_s3_encrypted`. Then we select **Option Storage**, here **Encrypt/Decrypt a remote**.
+
+{ style="display: block; margin: 0 auto" }
+
+In the next step, we have to define **Option remote**. Here we need to select an **existing S3 profile/connection** and define the name of the bucket where rclone will create the encrypted space. We have to use the format **s3-profile:bucket-name**.
+
+{ style="display: block; margin: 0 auto" }
+
+Then we need to select **Option filename_encryption**. There we can select **Encrypt the filenames**; alternatively, we can keep it empty if we do not wish to encrypt the filenames.
+
+{ style="display: block; margin: 0 auto" }
+
+Then we can select **Option directory_name_encryption**. There we can select **Encrypt directory names**; alternatively, we can keep it empty if we do not wish to encrypt the directory names.
+
+{ style="display: block; margin: 0 auto" }
+
+In the next step, **Option password**, we have to choose an encryption password.
+
+{ style="display: block; margin: 0 auto" }
+
+Furthermore, we recommend setting **Option password2**. This password will be used as a salt for the subsequent encryption.
+
+{ style="display: block; margin: 0 auto" }
+
+The option **Edit advanced config** can be skipped, option **n**.
+
+{ style="display: block; margin: 0 auto" }
+
+The configuration is completed now. In the next step, we can confirm the option **Keep this encrypted config remote** using option **y**.
+
+{ style="display: block; margin: 0 auto" }
+
+The last step is to check the encryption. First, we need to list the available configurations/connections.
+
+    rclone listremotes
+    cesnet_s3_encrypted:
+    cesnet_s3cl2:
+
+Then, using the [sync command](#directory-syncing), we can upload three pictures via the encrypted remote.
+
+    rclone sync --progress --fast-list /home/user/source-dir cesnet_s3_encrypted:
+
+Now we can list the encrypted remote, which shows the decrypted view of the three uploaded pictures.
+
+    rclone ls cesnet_s3_encrypted:
+    256805 DSC_0004.jpg
+    337491 DSC_0006.jpg
+    251493 DSC_0005.jpg
+
+In the end, we can list the underlying bucket, where we can see the three encrypted files.
+
+    rclone ls cesnet_s3cl2:test-encryption
+    256901 1er0np7kppc9jvkt7kr8f9sn90
+    337619 cuqqkkhsklbnf1eegkujfkrcl4
+    251589 pelqqer8osssa4k8uon95a4o6c
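+
+If you ever need to map such an encrypted object name back to its plain name, the `cryptdecode` command can help; a small sketch reusing the remote and one file name from the listing above:
+
+    rclone cryptdecode cesnet_s3_encrypted: 1er0np7kppc9jvkt7kr8f9sn90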
+
+### Configuration of the encrypted bucket using the graphical user interface
+
+!!! warning
+    To be able to configure the rclone tool using this guide, **first you have to download, unzip and install rclone**; the guide can be found in the [first section](#downloading-and-installation-of-rclone-tool).
+
+First, you need to deploy the graphical user interface. **Windows users** need the **Command Prompt** tool and then run the command below. The command should open your web browser with the rclone GUI. The same process is valid for **Linux users**, who need to open the **Terminal window** and run the command listed below.
+
+    rclone rcd --rc-web-gui
+
+The following steps are identical for Windows as well as for Linux users.
+
+After GUI startup, we click in the left menu on the **Configs (1)** button and then on the **Create a New Config (2)** button.
+
+{ style="display: block; margin: 0 auto" }
+
+First, we need to type the **Name of this drive (1)** and then select from the menu the option **Encrypt/Decrypt a remote (1)**. Then we click on the **Next** button.
+
+{ style="display: block; margin: 0 auto" }
+
+In the next step, we need to specify **Remote to encrypt/decrypt (1)**. Here it is **important** to define an already existing S3 profile/connection and the bucket name where we wish to create the encrypted space. The input must be in the following format: **s3-profile:bucket-name**. If you choose a **non-existing bucket**, rclone will create it. Then we choose the **Password for encryption (2)** and also the recommended **Password for salt (2)**.
+
+{ style="display: block; margin: 0 auto" }
+
+Then we need to click on the **Explorer (1)** button. Now we are in browser mode, and by clicking on **+ (2)** we can open a new tab with the encrypted bucket.
+
+{ style="display: block; margin: 0 auto" }
+
+Then we need to click in the field **Type the name of remote you want to open (1)** and select the corresponding name of the encrypted bucket **(1)**. Then we can continue by clicking on the **Open (2)** button.
+
+{ style="display: block; margin: 0 auto" }
+
+At this moment, we can start to upload the data which we wish to be encrypted. Just click on the **Upload (1)** icon; then you can select the data from the local disk, or you can drag-and-drop your data using the **interactive window (2)**.
+
+{ style="display: block; margin: 0 auto" }
+
+In the example below, we have uploaded three pictures **(1)** via the encrypted remote. We can check the upload in the explorer by opening the remote S3 storage in a tab **(2)**.
+
+{ style="display: block; margin: 0 auto" }
+
+Now we can have a look into the encrypted bucket **(1)**.
+
+{ style="display: block; margin: 0 auto" }
+
+Indeed, we can see that our three pictures **(1)** have been encrypted.
+
+{ style="display: block; margin: 0 auto" }
+
+???+ note "Configuration files for encrypted volumes"
+    The configuration file for encrypted volumes can be found in the [previous section](#configuration-file).
+
+### Check of encrypted data integrity
+
+To check encrypted data integrity, it is necessary to use the **cryptcheck** command, see below. Using the common workflow for data integrity checks causes significant difficulties on an encrypted bucket: it can result in a forced download of all data from the remote site, which can stall your client.
+
+    rclone cryptcheck --fast-list C:\Users\Albert\Desktop\test_sync shared_encrypted:dir01/
+
+    2022/08/29 16:57:45 NOTICE: Encrypted drive 'shared_encrypted:dir01/': 0 differences found
+    2022/08/29 16:57:45 NOTICE: Encrypted drive 'shared_encrypted:dir01/': 14 matching files
+
+???+ note "Enhancing the speed of checking"
+    While using cryptcheck, we recommend using the `--fast-list` option. It reads information about up to 1000 objects within one request and caches it in memory, which rapidly accelerates the checks.
+
+### Sharing of encrypted buckets
+
+Buckets can be shared within a mutual space called the tenant, or between users using a bucket policy. If you wish to share a bucket containing an encrypted volume, you need to share the credentials for the encrypted volume with your colleagues. A shared bucket has to have a properly set up [bucket policy](aws-cli.md).
+
+Once you configure the encryption in your bucket, you just need to share the bucket name and the encryption passwords you used during the encrypted bucket creation with your colleague. Your colleague can use the guide above to configure a corresponding encrypted remote on his/her machine using the shared passwords; a sketch of the resulting configuration is shown below.
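+
+For illustration, the colleague's `rclone.conf` would then contain a crypt entry of roughly the following shape. The remote and bucket names are placeholders; the password lines hold obscured strings that rclone generates from the shared plaintext passwords (e.g. via `rclone obscure`), so they may differ between machines even though the plaintext passwords match:
+
+    [shared_encrypted]
+    type = crypt
+    remote = cesnet_s3cl2:test-encryption
+    password = <obscured string generated by rclone>
+    password2 = <obscured string generated by rclone>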
+
+!!! warning
+    Please be aware of the next section, describing what to do in case of a **change of encryption passwords, or loss of encryption passwords**.
+
+### Compromise of encryption passwords vs. loss of encryption passwords
+
+**In case of compromise or leakage** of your encryption passwords, or if you need to change the passwords, the only option is to create a new encrypted volume with new encryption passwords. All data has to be transferred to the new encrypted volume and the old one should be deleted.
+
+Here you have two general options. The first option is to upload your data from the local machine to the new encrypted volume, if you have them locally. Then you can delete the old encrypted volume.
+
+The second option is to transfer the data using rclone. You can use rclone to copy the data from the old encrypted volume to the new encrypted volume. The advantage of this method is that you don't have to download all data locally to your machine and then upload it again; see the example below.
+
+    rclone copy old_encrypted_drive:dir01 new_encrypted_drive:dir01
+
+**In case of loss of the encryption passwords, you lose your data as well!**
+
+!!! warning
+    Encrypted buckets use client-side encryption. If you lose your encryption passwords, the administrators have **NO WAY** to restore your encrypted data.
+
+    **Loss of encryption passwords always means data loss!!!**
diff --git a/object-storage/s3-clients.md b/object-storage/s3-clients.md
new file mode 100644
index 0000000000000000000000000000000000000000..06aa117d7d7ddb766c548728351a69bff57bd5cb
--- /dev/null
+++ b/object-storage/s3-clients.md
@@ -0,0 +1,44 @@
+---
+languages:
+  - en
+  - cs
+---
+# Favourite S3 service clients
+In the following section, you can find recommended S3 clients. All S3 clients require the S3 credentials `access_key` and `secret_key` and an S3 endpoint address, see below.
+
+???+ note "Available S3 endpoints"
+    cl1 - https://s3.cl1.du.cesnet.cz<br/>
+    cl2 - https://s3.cl2.du.cesnet.cz<br/>
+    cl3 - https://s3.cl3.du.cesnet.cz<br/>
+    cl4 - https://s3.cl4.du.cesnet.cz<br/>
+
+## S3 Browser (GUI Windows)
+[S3 Browser](https://s3browser.com/) is a freeware tool for Windows to manage your S3 storage and upload and download data. You can manage up to two user accounts (S3 accounts) for free. [The guide for S3 Browser](s3browser.md).
+
+## CloudBerry Explorer for Amazon S3 (GUI Windows)
+[CloudBerry Explorer](https://cloudberry-explorer-for-amazon-s3.en.softonic.com/) is an intuitive file browser for your S3 storage. It has two panes: in one you can see the local disk, and in the other the remote S3 storage. Between these two panes, you can drag and drop your files. [The guide for CloudBerry Explorer](cloudberry.md).
+
+## AWS-CLI (command line, Linux, Windows)
+[AWS CLI](https://aws.amazon.com/cli/) - Amazon Web Services Command Line Interface - is a standardized tool supporting the S3 interface. Using this tool, you can handle your data and set up your S3 data storage. You can use it from the command line, or you can incorporate AWS CLI into your automated scripts. [The guide for AWS-CLI](aws-cli.md).
+
+## Rclone (command line + GUI, Linux, Windows)
+The tool [Rclone](https://rclone.org/downloads/) is suitable for data synchronization and data migration between multiple endpoints (even between different data storage providers). Rclone preserves the time stamps and checks the checksums. It is written in the Go language. Rclone is available for multiple platforms (GNU/Linux, Windows, macOS, BSD and Solaris). In the following guide, we demonstrate its usage on Linux and Windows systems. [The guide for rclone](rclone.md).
+
+## s3cmd (command line Linux)
+[S3cmd](https://s3tools.org/download) is a free command line tool for uploading and downloading your data. You can also control the setup of your S3 storage via this tool. S3cmd is written in Python. It is an open-source project available under the GNU Public License v2 (GPLv2) for both personal and commercial usage. [The guide for s3cmd](s3cmd.md).
+
+## s5cmd for very fast transfers (command line Linux)
+In case you have a connection of 1-2 Gbps and you wish to optimize the transfer throughput, you can use the s5cmd tool. S5cmd is available in the form of precompiled binaries for Windows, Linux and macOS. It is also available in the form of source code or docker images. The final solution always depends on the system where you wish to use s5cmd. A complete overview can be found at the [Github project](https://github.com/peak/s5cmd). [The guide for s5cmd](s5cmd.md).
+
+## WinSCP (GUI Windows)
+[WinSCP](https://winscp.net/eng/index.php) is a popular SFTP and FTP client for Microsoft Windows. It transfers files between your local computer and remote servers using the FTP, FTPS, SCP, SFTP, WebDAV or S3 file transfer protocols. [The guide for WinSCP](winscp.md)
+
+## CyberDuck (GUI Windows)
+[CyberDuck](https://cyberduck.io/s3/) is a multifunctional tool for various types of data storage (FTP, SFTP, WebDAV, OpenStack, OneDrive, Google Drive, Dropbox, etc.). Cyberduck provides only elementary functionality; most of the advanced functions are paid. [The guide for CyberDuck](cyberduck.md)
diff --git a/object-storage/s3-features.md b/object-storage/s3-features.md
new file mode 100644
index 0000000000000000000000000000000000000000..db8cb869dbe8f3d4a4e8e60ba3c1c114f9d4c511
--- /dev/null
+++ b/object-storage/s3-features.md
@@ -0,0 +1,419 @@
+---
+languages:
+  - en
+  - cs
+---
+
+# Advanced S3 features
+
+In the following sections, you can find a basic description of advanced S3 features that can enhance the effectiveness of your data workflow.
+
+## Sharing S3 object using (presigned) URL
+
+!!! warning
+    To be able to generate URL links for objects stored on the S3 storage, you have to set up the **[aws tool first](aws-cli.md)**.
+
+All objects and buckets are private by default. A pre-signed URL is a reference to a Ceph S3 object, which allows anyone who receives the URL to retrieve the S3 object with an HTTP GET request.
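+
+Such a link can then be fetched with any plain HTTP client, without S3 credentials. A minimal sketch: the URL below is an illustrative placeholder, and the exact query parameters of a real pre-signed URL generated by the commands that follow depend on the signature version used:
+
+    curl -o file "https://s3.cl2.du.cesnet.cz/bucket/file?AWSAccessKeyId=...&Expires=...&Signature=..."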
+
+The following presigning command generates a pre-signed URL for a specified bucket and key that is valid for one hour:
+
+    aws s3 --profile myprofile presign s3://bucket/file
+
+If you want to create a pre-signed URL with a custom lifetime that links to an object in an S3 bucket, you have to use:
+
+    aws s3 --profile myprofile presign s3://bucket/file --expires-in 2419200
+
+This will create a URL accessible for a month. The parameter `--expires-in` is given in seconds.
+
+When a pre-signed URL has expired, you will see something like the following:
+
+    This XML file does not appear to have any style information associated with it. The document tree is shown below.
+    <Error>
+    <Code>AccessDenied</Code>
+    <RequestId>tx0000000000000000f8f26-00sd242d-1a2234a7-storage-cl2</RequestId>
+    <HostId>1aasd67-storage-cl2-storage</HostId>
+    </Error>
+
+???+ note "Changing the URL lifetime"
+    Once you generate a pre-signed URL, you can't change its lifetime; you have to generate a new pre-signed URL. This applies to both expired and non-expired URLs.
+
+## S3 Object versioning
+
+Object versioning is used to store multiple copies of an object within the same bucket. Each of these copies corresponds to the content of the object at a specific moment in the past. This functionality can be used to protect the objects of a bucket against overwriting or accidental deletion.
+
+This functionality, which keeps a historical record of the objects in a bucket, must be enabled at the bucket level, giving rise to three different bucket states: 'unversioned', 'versioning enabled' or 'versioning suspended'.
+
+When a bucket is created, it is always in the 'unversioned' state.
+
+When the functionality is enabled, the bucket can switch between the states 'versioning enabled' and 'versioning suspended', but cannot return to the 'unversioned' state; that is, you cannot disable the versioning of a bucket once it is enabled. It can only be suspended.
+
+Each version of an object is identified through a VersionID. When the bucket is not versioned, the VersionID will be a null value. In a versioned bucket, updating an object through a PUT request will store a new object with a unique VersionID.
+
+A version of an object in a bucket can be accessed through its name, or through a combination of name and VersionID. When accessing by name only, the most recent version of the object will be retrieved.
+
+If an object in a versioned bucket is deleted, access attempts through GET requests will return an error unless a VersionID is included. To restore a deleted object, it is not necessary to download and upload the object; it is sufficient to issue a COPY operation including a specific VersionID. We will show this in this guide.
+
+To test the versioning of objects, we can use the AWS CLI, an open source tool that provides commands to interact with AWS services from a terminal program. Specifically, we will use the AWS CLI's API-level commands, contained in the s3api command set.
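+
+For reference, suspending versioning later uses the same `put-bucket-versioning` call as enabling it (shown in the walkthrough below), only with a different status value; a sketch with the same placeholder names:
+
+    aws s3api put-bucket-versioning --bucket "bucket name" --versioning-configuration Status=Suspended --profile "profile name" --endpoint-url=https://s3.cl2.du.cesnet.cz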
+
+### Versioning the bucket
+
+For a non-versioned bucket, if an object with the same key is uploaded, it overwrites the existing object. For a versioned bucket, if an object with the same key is uploaded, the newly uploaded object becomes the current version and the previous object becomes a non-current version:
+
+!!! warning
+    For proper functionality, it is necessary to use the `--endpoint-url` option, with the relevant S3 address of the services operated by the CESNET association, in all commands.
+
+???+ note "Bucket name restrictions"
+    The bucket name must be unique within the tenant and should contain only uppercase and lowercase letters, numbers, dashes and periods. The bucket name must start with a letter or number and must not contain periods next to dashes or multiple consecutive periods. We also recommend **NOT using** `/` and `_` in the name, as this would make it impossible to use the bucket via the API.
+
+First we need to create the bucket where we will set up the versioning.
+
+    aws s3api create-bucket --bucket "bucket name" --profile "profile name" --endpoint-url=https://s3.cl2.du.cesnet.cz
+
+Then we can check whether versioning is enabled.
+
+    aws s3api get-bucket-versioning --bucket "bucket name" --profile "profile name" --endpoint-url=https://s3.cl2.du.cesnet.cz
+
+Now we will enable the versioning.
+
+    aws s3api put-bucket-versioning --bucket "bucket name" --versioning-configuration Status=Enabled --profile "profile name" --endpoint-url=https://s3.cl2.du.cesnet.cz
+
+If we check the status of versioning again, we can see that it is enabled.
+
+    aws s3api get-bucket-versioning --bucket "bucket name" --profile "profile name" --endpoint-url=https://s3.cl2.du.cesnet.cz
+
+    {
+        "Status": "Enabled",
+        "MFADelete": "Disabled"
+    }
+
+### Adding the object
+Now we will put a new object into the created bucket.
+
+    aws s3api put-object --key "file name" --body "file path 1" --bucket "bucket name" --profile "profile name" --endpoint-url=https://s3.cl2.du.cesnet.cz
+
+    {
+        "ETag": "\"5ec0f1a7fc3a60bf9360a738973f014d\"",
+        "VersionId": "KdS5Yl0d06bBSYriIddtVb0h5gofiNX"
+    }
+
+Now we can change the file by updating the body.
+
+    aws s3api put-object --key "file name" --body "file path 2" --bucket "bucket name" --profile "profile name" --endpoint-url=https://s3.cl2.du.cesnet.cz
+
+    {
+        "ETag": "\"5ec0f1a7fc3a60bf9360a738973f014d\"",
+        "VersionId": "xNQC4pIgMYx59digj5.gk15WC4efOOa"
+    }
+
+Now we can list the object versions.
+
+    aws s3api list-object-versions --bucket "bucket name" --profile "profile name" --endpoint-url=https://s3.cl2.du.cesnet.cz
+
+    {
+        "Versions": [
+            {
+                "ETag": "\"5ec0f1a7fc3a60bf9360a738973f014d\"",
+                "Size": 13,
+                "StorageClass": "STANDARD",
+                "Key": "test-key-1",
+                "VersionId": "xNQC4pIgMYx59digj5.gk15WC4efOOa",
+                "IsLatest": true,
+                "LastModified": "2020-05-18T10:34:05.072Z",
+                "Owner": {
+                    "DisplayName": "Testing",
+                    "ID": "user$tenant"
+                }
+            },
+            {
+                "ETag": "\"5ec0f1a7fc3a60bf9360a738973f014d\"",
+                "Size": 13,
+                "StorageClass": "STANDARD",
+                "Key": "test-key-1",
+                "VersionId": "KdS5Yl0d06bBSYriIddtVb0h5gofiNX",
+                "IsLatest": false,
+                "LastModified": "2020-05-18T10:33:53.066Z",
+                "Owner": {
+                    "DisplayName": "Testing",
+                    "ID": "user$tenant"
+                }
+            }
+        ]
+    }
+
+### Retrieve an object
+For a versionless bucket, an object lookup always returns the single available object.
+For a bucket with versioning, the lookup returns the current object:
+
+    aws s3api list-object-versions --bucket "bucket name" --profile "profile name" --endpoint-url=https://s3.cl2.du.cesnet.cz
+
+    {
+        "Versions": [
+            {
+                "ETag": "\"5ec0f1a7fc3a60bf9360a738973f014d\"",
+                "Size": 13,
+                "StorageClass": "STANDARD",
+                "Key": "test-key-1",
+                "VersionId": "xNQC4pIgMYx59digj5.gk15WC4efOOa",
+                "IsLatest": true,
+                "LastModified": "2020-05-18T10:34:05.072Z",
+                "Owner": {
+                    "DisplayName": "Testing",
+                    "ID": "user$tenant"
+                }
+            },
+            {
+                "ETag": "\"5ec0f1a7fc3a60bf9360a738973f014d\"",
+                "Size": 13,
+                "StorageClass": "STANDARD",
+                "Key": "test-key-1",
+                "VersionId": "KdS5Yl0d06bBSYriIddtVb0h5gofiNX",
+                "IsLatest": false,
+                "LastModified": "2020-05-18T10:33:53.066Z",
+                "Owner": {
+                    "DisplayName": "Testing",
+                    "ID": "user$tenant"
+                }
+            }
+        ]
+    }
+
+Now we can retrieve the desired object.
+
+    aws s3api get-object --key "file name" "file name.out" --bucket "bucket name" --profile "profile name" --endpoint-url=https://s3.cl2.du.cesnet.cz
+
+    {
+        "AcceptRanges": "bytes",
+        "LastModified": "Mon, 18 May 2020 10:34:05 GMT",
+        "ContentLength": 13,
+        "ETag": "\"5ec0f1a7fc3a60bf9360a738973f014d\"",
+        "VersionId": "xNQC4pIgMYx59digj5.gk15WC4efOOa",
+        "ContentType": "binary/octet-stream",
+        "Metadata": {}
+    }
+
+For a versioned bucket, non-current objects can be retrieved by specifying the Version ID:
+
+    aws s3api list-object-versions --bucket "bucket name" --profile "profile name" --endpoint-url=https://s3.cl2.du.cesnet.cz
+    {
+        "Versions": [
+            {
+                "ETag": "\"5ec0f1a7fc3a60bf9360a738973f014d\"",
+                "Size": 13,
+                "StorageClass": "STANDARD",
+                "Key": "test-key-1",
+                "VersionId": "xNQC4pIgMYx59digj5.gk15WC4efOOa",
+                "IsLatest": true,
+                "LastModified": "2020-05-18T10:34:05.072Z",
+                "Owner": {
+                    "DisplayName": "Testing",
+                    "ID": "user$tenant"
+                }
+            },
+            {
+                "ETag": "\"5ec0f1a7fc3a60bf9360a738973f014d\"",
+                "Size": 13,
+                "StorageClass": "STANDARD",
+                "Key": "test-key-1",
+                "VersionId": "KdS5Yl0d06bBSYriIddtVb0h5gofiNX",
+                "IsLatest": false,
+                "LastModified": "2020-05-18T10:33:53.066Z",
+                "Owner": {
+                    "DisplayName": "Testing",
+                    "ID": "user$tenant"
+                }
+            }
+        ]
+    }
+
+Now we can retrieve a particular version by passing its VersionID to get-object.
+
+    aws s3api get-object --key "file name" "file name.out" --version-id KdS5Yl0d06bBSYriIddtVb0h5gofiNX --bucket "bucket name" --profile "profile name" --endpoint-url=https://s3.cl2.du.cesnet.cz
+
+    {
+        "AcceptRanges": "bytes",
+        "LastModified": "Mon, 18 May 2020 10:33:53 GMT",
+        "ContentLength": 13,
+        "ETag": "\"5ec0f1a7fc3a60bf9360a738973f014d\"",
+        "VersionId": "KdS5Yl0d06bBSYriIddtVb0h5gofiNX",
+        "ContentType": "binary/octet-stream",
+        "Metadata": {}
+    }
+
+### An object removal
+For a versionless bucket, the object is permanently deleted and cannot be recovered. For a versioned bucket, all versions remain in the bucket and RGW inserts a delete marker, which becomes the current version:
+
+    aws s3api delete-object --key "file name" --bucket "bucket name" --profile "profile name" --endpoint-url=https://s3.cl2.du.cesnet.cz
+
+Now we can check the object versions again.
+
+    aws s3api list-object-versions --bucket "bucket name" --profile "profile name" --endpoint-url=https://s3.cl2.du.cesnet.cz
+
+    {
+        "Versions": [
+            {
+                "ETag": "\"5ec0f1a7fc3a60bf9360a738973f014d\"",
+                "Size": 13,
+                "StorageClass": "STANDARD",
+                "Key": "test-key-1",
+                "VersionId": "xNQC4pIgMYx59digj5.gk15WC4efOOa",
+                "IsLatest": false,
+                "LastModified": "2020-05-18T10:34:05.072Z",
+                "Owner": {
+                    "DisplayName": "Testing",
+                    "ID": "user$tenant"
+                }
+            }
+        ],
+        "DeleteMarkers": [
+            {
+                "Owner": {
+                    "DisplayName": "Testing",
+                    "ID": "user$tenant"
+                },
+                "Key": "test-key-1",
+                "VersionId": "hxV8on0vry4Oz0FNcgsz88aDcQoZO.y",
+                "IsLatest": true,
+                "LastModified": "2020-05-18T11:21:57.544Z"
+            }
+        ]
+    }
+
+In the case of a versioned bucket, if an object with a specific VersionID is deleted, it is permanently deleted:
+
+    aws s3api delete-object --key "file name" --version-id KdS5Yl0d06bBSYriIddtVb0h5gofiNX --bucket "bucket name" --profile "profile name" --endpoint-url=https://s3.cl2.du.cesnet.cz
+
+    {
+        "VersionId": "KdS5Yl0d06bBSYriIddtVb0h5gofiNX"
+    }
+
+Now we can check the object versions again.
+
+    aws s3api list-object-versions --bucket "bucket name" --profile "profile name" --endpoint-url=https://s3.cl2.du.cesnet.cz
+
+    {
+        "DeleteMarkers": [
+            {
+                "Owner": {
+                    "DisplayName": "Testing",
+                    "ID": "user$tenant"
+                },
+                "Key": "test-key-1",
+                "VersionId": "ZfT16FPCe2xVMjTh-6qqfUzhQnLQMfg",
+                "IsLatest": true,
+                "LastModified": "2020-05-18T11:22:48.482Z"
+            }
+        ]
+    }
+
+### An object restoration
+To restore an object, the recommended approach is to copy the previous version of the object into the same bucket. The copied object becomes the current version of the object, and all versions of the object are preserved:
+
+    aws s3api list-object-versions --bucket "bucket name" --profile "profile name" --endpoint-url=https://s3.cl2.du.cesnet.cz
+
+    {
+        "Versions": [
+            {
+                "ETag": "\"5ec0f1a7fc3a60bf9360a738973f014d\"",
+                "Size": 13,
+                "StorageClass": "STANDARD",
+                "Key": "test-key-1",
+                "VersionId": "xNQC4pIgMYx59digj5.gk15WC4efOOa",
+                "IsLatest": false,
+                "LastModified": "2020-05-18T10:34:05.072Z",
+                "Owner": {
+                    "DisplayName": "Testing",
+                    "ID": "user$tenant"
+                }
+            }
+        ],
+        "DeleteMarkers": [
+            {
+                "Owner": {
+                    "DisplayName": "Testing",
+                    "ID": "user$tenant"
+                },
+                "Key": "test-key-1",
+                "VersionId": "hxV8on0vry4Oz0FNcgsz88aDcQoZO.y",
+                "IsLatest": true,
+                "LastModified": "2020-05-18T11:21:57.544Z"
+            }
+        ]
+    }
+
+Now we can restore the particular version of the object.
+
+    aws s3api copy-object --bucket "bucket name" --profile "profile name" --endpoint-url=https://s3.cl2.du.cesnet.cz --copy-source "bucket name"/"file name"?versionId=xNQC4pIgMYx59digj5.gk15WC4efOOa --key "file name"
+
+    {
+        "CopyObjectResult": {
+            "ETag": "5ec0f1a7fc3a60bf9360a738973f014d",
+            "LastModified": "2020-05-18T13:28:52.553Z"
+        }
+    }
+
+And check the object versions.
+
+    aws s3api list-object-versions --bucket "bucket name" --profile "profile name" --endpoint-url=https://s3.cl2.du.cesnet.cz
+    {
+        "Versions": [
+            {
+                "ETag": "\"5ec0f1a7fc3a60bf9360a738973f014d\"",
+                "Size": 13,
+                "StorageClass": "STANDARD",
+                "Key": "test-key-1",
+                "VersionId": "EYXgE1z-28VkVS4zTD55SetB7Wdwk1V",
+                "IsLatest": true,
+                "LastModified": "2020-05-18T13:28:52.553Z",
+                "Owner": {
+                    "DisplayName": "Testing",
+                    "ID": "strnad$strnad"
+                }
+            },
+            {
+                "ETag": "\"5ec0f1a7fc3a60bf9360a738973f014d\"",
+                "Size": 13,
+                "StorageClass": "STANDARD",
+                "Key": "test-key-1",
+                "VersionId": "xNQC4pIgMYx59digj5.gk15WC4efOOa",
+                "IsLatest": false,
+                "LastModified": "2020-05-18T10:34:05.072Z",
+                "Owner": {
+                    "DisplayName": "Testing",
+                    "ID": "user$tenant"
+                }
+            }
+        ],
+        "DeleteMarkers": [
+            {
+                "Owner": {
+                    "DisplayName": "Testing",
+                    "ID": "user$tenant"
+                },
+                "Key": "test-key-1",
+                "VersionId": "hxV8on0vry4Oz0FNcgsz88aDcQoZO.y",
+                "IsLatest": false,
+                "LastModified": "2020-05-18T11:21:57.544Z"
+            }
+        ]
+    }
+
+## Setup bucket policies for sharing (AWS-CLI S3 plugin)
+Coming soon...
diff --git a/object-storage/s3-service-screenshots/direct_upload.png b/object-storage/s3-service-screenshots/direct_upload.png
new file mode 100644
index 0000000000000000000000000000000000000000..2a1616aedd12e904a8aab58f1ed580d49f7a28c5
Binary files /dev/null and b/object-storage/s3-service-screenshots/direct_upload.png differ
diff --git a/object-storage/s3-service-screenshots/s3_backup.png b/object-storage/s3-service-screenshots/s3_backup.png
new file mode 100644
index 0000000000000000000000000000000000000000..9f64fdfad9893dbdcc15a619e6be05f0b0f24390
Binary files /dev/null and b/object-storage/s3-service-screenshots/s3_backup.png differ
diff --git a/object-storage/s3-service-screenshots/s3_distribution.png b/object-storage/s3-service-screenshots/s3_distribution.png
new file mode 100644
index 0000000000000000000000000000000000000000..9ddab8443c98f035f8ec0bfe724e33b2c8d33171
Binary files /dev/null and b/object-storage/s3-service-screenshots/s3_distribution.png differ
diff --git a/object-storage/s3-service.cs.md b/object-storage/s3-service.cs.md
new file mode 100644
index 0000000000000000000000000000000000000000..aaab98309911156b9d6895c98f4d3c13eaf7ab12
--- /dev/null
+++ b/object-storage/s3-service.cs.md
@@ -0,0 +1,10 @@
+---
+languages:
+  - en
+  - cs
+---
+
+# Object Storage
+
+Detailed documentation for Object Storage services can be found at [du.cesnet.cz](https://du.cesnet.cz/en/navody/object_storage/start)
+
diff --git a/object-storage/s3-service.md b/object-storage/s3-service.md
new file mode 100644
index 0000000000000000000000000000000000000000..d2849ce26443677afc06e39634dcf1d3c085c94f
--- /dev/null
+++ b/object-storage/s3-service.md
@@ -0,0 +1,57 @@
+---
+languages:
+  - en
+  - cs
+---
+# S3 Service
+
+The S3 service is a general service suited for most use cases. It can be used for elementary data storage, automated backups, or various types of data handling applications.
+
+Access to the service is controlled by virtual organizations and corresponding groups. S3 is suitable for sharing data between individual users and groups that may have members from different institutions. Tools for managing groups and users are provided by the e-infrastructure. Users with access to S3 can be people as well as "service accounts", for example for backup machines (a number of modern backup tools natively support an S3 connection). In S3, data is organized into buckets.
+It is usually appropriate to map individual buckets to the logical structure of your data workflow, for example to different stages of data processing. Data can be stored in the service in open form, or, in the case of sensitive data, it is possible to use buckets encrypted on the client side, where even the storage administrator does not have access to the data. Client-side encryption also means that the transmission of data over the network is encrypted, and even if the transfer is eavesdropped on, the data cannot be decrypted.
+
+???+ note "How to get S3 service?"
+    To connect to the S3 service, you have to contact support at:
+    `support@cesnet.cz`
+
+----
+## S3 Elementary use cases
+In the following section, you can find descriptions of elementary use cases related to the S3 service.
+
+### Automated backup of large datasets using the tools natively supporting S3 service
+If you use specialized automated tools for backup, such as Veeam, Bacula or restic, most of these tools support the S3 service for backup natively, so you don't have to deal with connecting block devices etc. to your infrastructure. You only need to request an S3 storage setup and reconfigure your backup. This can be combined with the WORM model as protection against unwanted overwriting or ransomware attacks.
+
+{ style="display: block; margin: 0 auto" }
+
+### Data sharing across your laboratory or over multiple institutions
+If you manage multiple research groups whose users need to share data, such as collected data and its post-processing, you can use S3. The S3 service allows you to share data within a group or between users. This use case assumes that each user has their own access to the storage. It is also suitable if you need to share sensitive data between organizations and do not have a secure VPN: you can use encrypted buckets (client-side encryption) within the S3 service. Client-side encryption also means that the transmission of data over the network is encrypted, and even if the transfer is eavesdropped on, the data cannot be decrypted.
+
+{ style="display: block; margin: 0 auto" }
+
+### Live systems handling data - Learning Management Systems, Catalogues, Repositories
+You have large data and you operate an application in the e-infrastructure that serves data to your users. This use case is particularly relevant to applications that distribute large data (raw scans, large videos, large scientific datasets for computing environments...) to end users. For this use case, the S3 service can be used again. The advantage of using S3 for these applications is that there is no need to upload data to the application server; the end user can upload/download data directly to/from the object storage using S3 presign requests.
+
+{ style="display: block; margin: 0 auto" }
+
+### Personal space for your data
+This case is similar to the VO storage service. It is a personal space in the S3 service just for your data, which does not allow sharing with a specific user. [Public reading](s3-features.md) can be set for buckets, or [presign URL requests](s3-features.md) can be used.
+
+### Dedicated S3 endpoint for special applications
+This is a special service for selected customers/users. A dedicated S3 endpoint can be used for critical systems as protection against DDoS attacks. The endpoint would be hidden from other users; only insiders would know about it.
+
+### Any other application
+**If you need a combination of the services listed above, or if you have an idea for some other application of object storage services, do not hesitate to contact us.**
+
+## S3 Data Reliability (Data Redundancy) - replicated vs erasure coding
+The section below describes the approaches to data redundancy applied to the object storage pool. The S3 service can be equipped with **replicated** or **erasure code (EC)** redundancy.
+### Replicated
+Your data is stored in three copies in the data center. In case one copy is corrupted, the original data is still readable in an undamaged form, and the damaged copy is restored in the background. Using a service with the replicated flag also allows for faster reads, as it is possible to read from all replicas at the same time. On the other hand, it reduces write speed, because a write operation waits for write confirmation from all three replicas.
+
+???+ note "Suitable for?"
+    Suitable for smaller volumes of live data with a preference for read speed (not very suitable for large data volumes).
+
+### Erasure Coding (EC)
+Erasure coding (EC) is a data protection method similar to the dynamic RAID known from disk arrays. With erasure coding, data is divided into individual fragments, which are then stored with some redundancy across the data storage. Therefore, if some disks (or an entire storage server) fail, the data is still accessible and is restored in the background. It is thus not possible for your data to sit on a single disk whose failure would mean losing your data.
+
+???+ note "Suitable for?"
+    Suitable, for example, for storing large data volumes.
+
diff --git a/object-storage/s3browser-screenshots/s3b-multipart1.png b/object-storage/s3browser-screenshots/s3b-multipart1.png
new file mode 100644
index 0000000000000000000000000000000000000000..89fe2d2acfbe985aa46aa572455ef7a8aaa73e35
Binary files /dev/null and b/object-storage/s3browser-screenshots/s3b-multipart1.png differ
diff --git a/object-storage/s3browser-screenshots/s3b-multipart2.png b/object-storage/s3browser-screenshots/s3b-multipart2.png
new file mode 100644
index 0000000000000000000000000000000000000000..4378e42d78895303a5b506ee442489d441edd0a2
Binary files /dev/null and b/object-storage/s3browser-screenshots/s3b-multipart2.png differ
diff --git a/object-storage/s3browser-screenshots/s3browser1.png b/object-storage/s3browser-screenshots/s3browser1.png
new file mode 100644
index 0000000000000000000000000000000000000000..9596e9cd3263748011b2e77a6064e84e974bf990
Binary files /dev/null and b/object-storage/s3browser-screenshots/s3browser1.png differ
diff --git a/object-storage/s3browser-screenshots/s3browser2.png b/object-storage/s3browser-screenshots/s3browser2.png
new file mode 100644
index 0000000000000000000000000000000000000000..fd136cb9f43a79fa33c34a596b7b1e0a81f258cb
Binary files /dev/null and b/object-storage/s3browser-screenshots/s3browser2.png differ
diff --git a/object-storage/s3browser-screenshots/s3browser3.png b/object-storage/s3browser-screenshots/s3browser3.png
new file mode 100644
index 0000000000000000000000000000000000000000..0566fef844235b95058dc0aa2499078b3df14c0b
Binary files /dev/null and b/object-storage/s3browser-screenshots/s3browser3.png differ
diff --git a/object-storage/s3browser-screenshots/s3browser4.png b/object-storage/s3browser-screenshots/s3browser4.png
new file mode 100644
index 0000000000000000000000000000000000000000..0d7a018beab4c1f131d50e845bdf1faa167b5ddb
Binary files /dev/null and b/object-storage/s3browser-screenshots/s3browser4.png differ
diff --git a/object-storage/s3browser.md b/object-storage/s3browser.md
new file mode 100644
index 0000000000000000000000000000000000000000..206fee4d39221604010b5798e5b40e9086927167
--- /dev/null
+++ b/object-storage/s3browser.md
@@ -0,0 +1,45 @@
+---
+languages:
+  - en
+  - cs
+---
+
+# S3 Browser
+
+[S3 Browser](https://s3browser.com/) is a freeware, powerful and easy-to-use Windows client for S3 storage. You can manage up to two S3 accounts for free.
+
+For installation, please use the official package from the [S3 Browser webpages](https://s3browser.com/download.aspx).
+
+## Basic configuration
+
+Storage settings are made via the **Accounts** button in the left part of the program window.
+
+{ style="display: block; margin: 0 auto" }
+
+Then select **Add new account**
+
+{ style="display: block; margin: 0 auto" }
+
+In the following window, select **S3 Compatible Storage**
+
+{ style="display: block; margin: 0 auto" }
+
+Then fill in the **Display name**, which is your connection name for better orientation if you have multiple accounts. Then fill in the **server s3.clX.du.cesnet.cz (clX - X according to the provided storage)** and the keys: **Access Key ID = access_key** and **Secret Access Key = secret_key**. By clicking on **Add new account**, the settings will be saved.
+
+{ style="display: block; margin: 0 auto" }
+
+## Multipart upload/download configuration
+
+If you need to upload and download large objects (typically larger than 5GB), you need to configure so-called multipart uploads/downloads. A large object is divided into multiple parts, which are then uploaded/downloaded. This functionality can also optimize the data throughput. On the data storage system, the parts are represented as one object again.
+
+Open the S3 Browser tool and click in the main menu on **1. Tools** and then on **2. Options**.
+
+{ style="display: block; margin: 0 auto" }
+
+Then click on the tab **1. General**. Tick the box **2. Enable multipart uploads** and define the `part` size for upload. Then tick the box **3. Enable multipart downloads** and define the `part` size for download. In the end, click on the button **4. Save changes**.
+
+{ style="display: block; margin: 0 auto" }
+
diff --git a/object-storage/s3cmd.md b/object-storage/s3cmd.md
new file mode 100644
index 0000000000000000000000000000000000000000..926574309f5d6cbe49e8648b0efb44e237db856e
--- /dev/null
+++ b/object-storage/s3cmd.md
@@ -0,0 +1,115 @@
+---
+languages:
+  - en
+  - cs
+---
+
+# s3cmd command line tool
+
+[S3cmd](https://s3tools.org/download) is a free command line tool. It allows you to upload and download your data to the S3 object storage. S3cmd is written in Python. It is an open-source project available under the GNU Public License v2 (GPLv2) and it is free for personal as well as commercial usage.
+
+!!! warning
+    We recommend you **preferably use [AWS CLI](aws-cli.md)**. We encountered some issues while using s3cmd. For instance, bucket names cannot begin with numbers or capital letters.
+
+## Installation of s3cmd tool
+
+S3cmd is available in the system repositories for CentOS, RHEL and Ubuntu. You can install it via the following guide.
+
+**On CentOS/RHEL**
+
+    sudo yum install s3cmd
+
+**On Ubuntu/Debian**
+
+    sudo apt install s3cmd
+
+## Configuration of s3cmd tool
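+
+If your s3cmd build provides the interactive wizard, running it is a convenient alternative to hand-editing the file shown below; it asks for the keys and writes `~/.s3cfg` for you:
+
+    s3cmd --configure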
+
+Alternatively, insert the following lines into the config file located at **/home/user/.s3cfg** manually.
+
+    [default]
+    host_base = https://s3.clX.du.cesnet.cz
+    use_https = True
+    access_key = xxxxxxxxxxxxxxxxxxxxxx
+    secret_key = xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
+    host_bucket = s3.clX.du.cesnet.cz
+
+`host_base` and `host_bucket` are the S3 endpoint URL, which you should have received, together with `access_key` and `secret_key`, via email during the S3 account creation.
+
+**Config file with GPG encryption**
+
+    [default]
+    host_base = https://s3.clX.du.cesnet.cz
+    use_https = True
+    access_key = xxxxxxxxxxxxxxxxxxxxxx
+    secret_key = xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
+    host_bucket = s3.clX.du.cesnet.cz
+    gpg_command = /usr/bin/gpg
+    gpg_decrypt = %(gpg_command)s -d --verbose --no-use-agent --batch --yes --passphrase-fd %(passphrase_fd)s -o %(output_file)s %(input_file)s
+    gpg_encrypt = %(gpg_command)s -c --verbose --no-use-agent --batch --yes --passphrase-fd %(passphrase_fd)s -o %(output_file)s %(input_file)s
+    gpg_passphrase = xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
+
+## Basic s3cmd commands
+
+S3cmd commands support elementary operations with buckets - creation, listing, and deletion.
+
+### Bucket operations
+
+???+ note "Bucket name"
+    The bucket name should be unique within the tenant and should contain only lowercase letters, uppercase letters, numerals, dashes, and dots. The bucket name must begin only with a letter or numeral, and it cannot contain dots next to dashes or multiple consecutive dots.
+
+**Listing all s3 buckets**
+
+    s3cmd ls
+
+**Creation of new s3 bucket**
+
+    s3cmd mb s3://newbucket
+
+**Removing s3 bucket**
+
+    s3cmd rb s3://newbucket
+
+_Only an emptied bucket can be removed!_
+
+**Listing s3 bucket size**
+
+    s3cmd du s3://newbucket/
+
+### Files and directories operation
+
+**Listing of s3 bucket**
+
+    s3cmd ls s3://newbucket/
+
+**File upload**
+
+    s3cmd put file.txt s3://newbucket/
+
+**Upload of encrypted files**
+
+    s3cmd put -e file.txt s3://newbucket/
+
+**Directory upload**
+
+    s3cmd put -r directory s3://newbucket/
+
+_Please make sure that you did not add a trailing slash (e.g.: directory/); a trailing slash means uploading only the content of the desired directory._
+
+**Download file from s3 bucket**
+
+    s3cmd get s3://newbucket/file.txt
+
+**Data deletion from s3 bucket**
+
+    s3cmd del s3://newbucket/file.txt
+
+    s3cmd del s3://newbucket/directory
+
+**Data sync into s3 bucket from local machine**
+
+    s3cmd sync /local/path/ s3://newbucket/backup/
+
+**Data sync from s3 bucket to local machine**
+
+    s3cmd sync s3://newbucket/backup/ ~/restore/
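+
+**Setting object ACL (public/private)**
+
+s3cmd can also toggle an object's access control; a brief sketch reusing the example bucket from above (making an object world-readable should be done deliberately):
+
+    s3cmd setacl s3://newbucket/file.txt --acl-public
+    s3cmd setacl s3://newbucket/file.txt --acl-private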
+
+    [default]
+    aws_access_key_id = xxxxxxxxxxxxxxxxxxxxxx
+    aws_secret_access_key = xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
+    max_concurrent_requests = 200
+    max_queue_size = 20000
+    multipart_threshold = 128MB
+    multipart_chunksize = 32MB
+
+`aws_access_key_id` and `aws_secret_access_key` were provided by the administrators when your S3 account was created.
+
+**Listing all buckets**
+
+    s5cmd --endpoint-url=https://s3.clX.du.cesnet.cz ls
+
+**Simple file upload**
+
+    s5cmd --endpoint-url=https://s3.clX.du.cesnet.cz cp myfile s3://bucket
+
+???+ note "How to achieve high transfer speed?"
+    To achieve higher transfer speeds, you need to tune the following parameters, in particular the number of concurrent parts (`-c`) and the part size (`-p`), see below.<br/>
+
+        s5cmd --endpoint-url=https://s3.clX.du.cesnet.cz cp -c=8 -p=5000 /directory/big-file s3://bucket
+
+
diff --git a/object-storage/template.md b/object-storage/template.md
new file mode 100644
index 0000000000000000000000000000000000000000..7df09ba8ba2b76253a1766a7aca85efe49783af1
--- /dev/null
+++ b/object-storage/template.md
@@ -0,0 +1,20 @@
+---
+languages:
+  - en
+  - cs
+---
+
+# Object Storage
+
+Detailed documentation for Object Storage services can be found at [du.cesnet.cz](https://du.cesnet.cz/en/navody/object_storage/start).
+
+{ style="display: block; margin: 0 auto" }
+
+!!! warning
+    To configure the rclone tool using this guide, **you first have to download, unzip, and install rclone**; the guide can be found in the [first section](#downloading-and-installation-of-rclone-tool).
+
+???+ note "Command line in Windows and Linux"
+    **Windows users** need to run the **Command Prompt** and then run the command below.
+    **Linux users** can open a **terminal window** and then run the command below.
+
+
diff --git a/object-storage/veeam-backup.md b/object-storage/veeam-backup.md
new file mode 100644
index 0000000000000000000000000000000000000000..57d4b8e206b8c9d3197267aa60050df43ba3049c
--- /dev/null
+++ b/object-storage/veeam-backup.md
@@ -0,0 +1,12 @@
+---
+languages:
+  - en
+  - cs
+---
+
+# Veeam backup suite
+
+!!! warning
+    This guide is under construction.
+
+
diff --git a/object-storage/winscp-screenshots/winscp_setup1en.png b/object-storage/winscp-screenshots/winscp_setup1en.png
new file mode 100644
index 0000000000000000000000000000000000000000..37e8ba2c610100467bb0e9b4a05ea2f048727e9b
Binary files /dev/null and b/object-storage/winscp-screenshots/winscp_setup1en.png differ
diff --git a/object-storage/winscp-screenshots/winscp_setup2en.png b/object-storage/winscp-screenshots/winscp_setup2en.png
new file mode 100644
index 0000000000000000000000000000000000000000..7c41b47be187ace71fc86c4eacf1cd8e170ebf24
Binary files /dev/null and b/object-storage/winscp-screenshots/winscp_setup2en.png differ
diff --git a/object-storage/winscp-screenshots/winscp_setup3en.png b/object-storage/winscp-screenshots/winscp_setup3en.png
new file mode 100644
index 0000000000000000000000000000000000000000..bc83d1aa6a24734c4e34772f9c82a62c8645c741
Binary files /dev/null and b/object-storage/winscp-screenshots/winscp_setup3en.png differ
diff --git a/object-storage/winscp-screenshots/winscp_setup4en.png b/object-storage/winscp-screenshots/winscp_setup4en.png
new file mode 100644
index 0000000000000000000000000000000000000000..164c185aee42bd872e800bc5857e545b7150e65f
Binary files /dev/null and b/object-storage/winscp-screenshots/winscp_setup4en.png differ
diff --git a/object-storage/winscp-screenshots/winscp_setup5en.png b/object-storage/winscp-screenshots/winscp_setup5en.png
new file mode 100644
index 0000000000000000000000000000000000000000..aed039383c3689fbc713bc802b817850ab2fbe8a
Binary files /dev/null and b/object-storage/winscp-screenshots/winscp_setup5en.png differ
diff --git a/object-storage/winscp-screenshots/winscp_setup6en.png b/object-storage/winscp-screenshots/winscp_setup6en.png
new file mode 100644
index 0000000000000000000000000000000000000000..5d9d409492d28b157d5a377b077cc9536245e12e
Binary files /dev/null and b/object-storage/winscp-screenshots/winscp_setup6en.png differ
diff --git a/object-storage/winscp.md b/object-storage/winscp.md
new file mode 100644
index 0000000000000000000000000000000000000000..862f808a452c649739aa9a9d8eb9fa8580290d2e
--- /dev/null
+++ b/object-storage/winscp.md
@@ -0,0 +1,40 @@
+---
+languages:
+  - en
+  - cs
+---
+
+# WinSCP - S3 usage
+
+[WinSCP](https://winscp.net/eng/index.php) is a popular SFTP and FTP client for Microsoft Windows. It can transfer files between your local computer and remote servers using the FTP, FTPS, SCP, SFTP, WebDAV, or S3 transfer protocols.
+
+## WinSCP installation
+Please use the package directly from [WinSCP](https://winscp.net/eng/download.php) for installation. The installation is straightforward and requires no special settings.
+
+## WinSCP configuration
+
+Run the tool.
+
+{ style="display: block; margin: 0 auto" }
+
+The storage connection is made via **Session** in the main menu and then **New session**.
+
+{ style="display: block; margin: 0 auto" }
+
+In the drop-down menu **File protocol**, select **Amazon S3**.
+
+{ style="display: block; margin: 0 auto" }
+
+Then insert the **Host name** `s3.clX.du.cesnet.cz`, replacing **X** with the number of the cluster associated with your S3 account. Port `443` will be pre-filled automatically. Copy `access_key` into the **Access key ID** field and `secret_key` into the **Secret access key** field; you received both of these keys encrypted from the administrators.
+
+{ style="display: block; margin: 0 auto" }
+
+Select the **Advanced** settings.
+
+{ style="display: block; margin: 0 auto" }
+
+Leave the **Default region** field blank! Set the URL style to **Path** and confirm with **OK**.
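+
+???+ note "Why the Path URL style?"
+    With the **Path** style the bucket name is part of the URL path, while the virtual-hosted style puts it into the hostname. An illustration with a hypothetical bucket named `mybucket`:
+
+        Path style:           https://s3.clX.du.cesnet.cz/mybucket/object.txt
+        Virtual-hosted style: https://mybucket.s3.clX.du.cesnet.cz/object.txt
+
+    Virtual-hosted addressing generally requires wildcard DNS records on the endpoint, so the **Path** style is the safe choice here.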
+Then click **Connect**.
+
+{ style="display: block; margin: 0 auto" }
+
+The storage is now connected and you will see a list of your buckets on the right side, or you can start creating your own buckets.
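+
+???+ note "Scripting WinSCP transfers (optional)"
+    WinSCP also ships with a command-line interface (`WinSCP.com`) that can automate transfers once the interactive session works. A minimal sketch, assuming a hypothetical bucket `mybucket`, a local file `C:\data\file.txt`, and your real keys in place of the placeholders; the easiest way to get a known-good command for your session is **Session > Generate Session URL/Code** in the WinSCP GUI:
+
+        WinSCP.com /command ^
+            "open s3://ACCESS_KEY:SECRET_KEY@s3.clX.du.cesnet.cz/" ^
+            "put C:\data\file.txt /mybucket/" ^
+            "exit"
+
+    Special characters in the keys must be URL-encoded; see the WinSCP scripting documentation for details.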