30.03.2021 yahe administration linux security
Every once in a while I get asked whether a certain backup scheme is a good idea, and oftentimes the suggested backup solution is beyond what I would use myself. Duplicity, its simplification Duply, and the not-so-dissimilar contenders Borg and Restic are among the solutions that are mentioned most often, with solutions like Bacula and its offspring Bareos coming up much later.
Unfortunately, I would not trust any of these tools further than I could throw a hard drive containing a backup created with them. The reason for this is somewhat simple: in my opinion, all of these solutions are too complex in a worst-case scenario.
As soon as I mention this opinion, most people I talk to want to know how I do backups and whether I have ever tried those integrated solutions. Yes, I used Duplicity for years, until a system of mine broke down. I had an up-to-date backup of that system but still lost a lot of (fortunately not so important) data because the Duplicity backup had become inconsistent over time without notice. I was able to manually extract some data out of the backup, but it was not worth the time. That was the moment I decided that I did not want to be reliant on such software again.
There are several design goals that I wanted to achieve with my personal backup solution:
There are also some non-goals that are not so important for me:
Let us start with the encryption. For this I have chosen the FUSE wrapper GoCryptFS, which is available on GitHub. It is developed by @rfjakob, who was one of the most active maintainers of the well-known EncFS encryption layer back in 2015-2018. The "project was inspired by EncFS and strives to fix its security issues while providing good performance", and it looks like he achieved that goal.
Using GoCryptFS is pretty simple. After downloading the static binary from the GitHub repository you can create the required folders and initialize a so-called reverse repository. A reverse repository takes an unencrypted source directory and provides an encrypted version of the contained files through a second directory. This way you can encrypt files in-memory on-the-fly instead of requiring additional storage space for an encrypted copy of the files. The ad-hoc encryption and decryption of GoCryptFS will come in handy for restores as well.
For our purposes we will create three folders:

- ./unencrypted will contain our source material to be backed up
- ./encrypted will contain the ad-hoc encrypted files to be backed up
- ./decrypted will contain the ad-hoc decrypted files of the backup

### create the local folders
mkdir -p ./unencrypted ./encrypted ./decrypted
### initialize the reverse encryption
gocryptfs --init --reverse ./unencrypted
### you can use the --plaintextnames parameter
### if the file names are not confidential
# gocryptfs --init --reverse --plaintextnames ./unencrypted
### mount the unencrypted folder in reverse mode
gocryptfs --reverse ./unencrypted ./encrypted
After initializing the reverse repository in the ./unencrypted folder you will find a new file called ./unencrypted/.gocryptfs.reverse.conf. This file contains relevant encryption parameters that are required to be able to encrypt the files. When the reverse repository is mounted into the ./encrypted folder you will find a file called ./encrypted/gocryptfs.conf, which is an exact copy of the previous ./unencrypted/.gocryptfs.reverse.conf file. It is required to be able to decrypt the files again. You must not lose this file!
There are two possibilities to prevent this: you can store a copy of the ./encrypted/gocryptfs.conf file in a second location, or you can print it out on paper. As the configuration can only be used in conjunction with the corresponding password, it should be safe to store this paper-based backup somewhere even if someone should be able to read it.
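A minimal sketch of both safeguards; the USB mount point is just an example and lpr assumes a locally configured printer:

### keep a copy of the configuration in a second location
### (the target path is just an example)
cp ./encrypted/gocryptfs.conf /mnt/usb/gocryptfs.conf.backup
### print the configuration for a paper-based backup
### (assumes a printer has been set up for lpr)
lpr ./encrypted/gocryptfs.conf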
If you did not use the --plaintextnames option you should also find a new file called ./encrypted/gocryptfs.diriv. By default GoCryptFS does not only encrypt the file contents but also the file names. The directory initialization vectors (dirivs) contained in the gocryptfs.diriv files are required to be able to decrypt the file names again. If you cannot risk losing file names, or if your file names are not confidential anyway, it might be better for you to disable the file name encryption.
I use a FUSE wrapper to mount the remote storage as if it were a local device. Typically your backup software would have to be able to upload files to the remote storage itself, unnecessarily complicating things. The FUSE wrapper takes this complexity away from the actual backup tool. Using a FUSE wrapper will also come in handy later on when restoring data.
Typical FUSE wrappers include:
I personally use SSHFS, as my backups are stored on a remote VM that is reachable via SFTP. This also reduces the amount of transferred data for the restore test, as we will see later.
### create the remote folder
mkdir ./backup
### mount the remote storage
sshfs backup@backup.example.com:/backup ./backup
### you should use additional parameters
### if you run into problems
# sshfs backup@backup.example.com:/backup ./backup -o ServerAliveInterval=15 -o idmap=user -o uid=$(id -u) -o gid=$(id -g) -o rw
### create the remote subfolders
mkdir ./backup/checksums ./backup/files ./backup/snapshots
Now we are ready to create the actual backup. For this we will use rsync, which is used far and wide for such tasks and has some nice benefits: in combination with the --backup-dir= and --delete parameters, rsync provides rather simple versioning of files. Files that have changed or that have been deleted between synchronizations are moved to the provided backup directory path and can easily be accessed there. If you need more storage you can delete the backup directories of earlier synchronizations.

### copy over files and keep modified and deleted files
rsync -abEP "--backup-dir=../snapshots/$(date '+%Y%m%d-%H%M%S')/" --delete ./encrypted/ ./backup/files/
### you should add the --chmod=+w parameter
### if created folders in the backup target are not writable
# rsync -abEP "--backup-dir=../snapshots/$(date '+%Y%m%d-%H%M%S')/" --chmod=+w --delete ./encrypted/ ./backup/files/
You do not have a proper backup unless you have successfully tried to restore it. However, restore-testing remote backups can be resource-intensive. The way I do it is to calculate checksums of the local files and of the remote files, which are then compared to make sure that the remote copy is identical to the local copy.
### enter the encrypted directory
cd ./encrypted
### create checksums of all files
find . -type f -print0 | xargs -0 sha1sum > ../original
### copy the checksums over to the remote server
cp ../original ../backup/checksums/original
### leave the encrypted directory
cd ..
Calculating the checksums of the remote files through the FUSE wrapper is possible. Unfortunately, this would mean downloading the whole backup in the background. As the checksum calculation is separated from the checksum comparison, we can optimize things a bit: given that you have SSH access to the remote target, you can log into the remote server, calculate the checksums there and only transfer the checksum file to compare it with the checksums of the original files. This greatly reduces the amount of data that has to be transferred for the restore test.
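The following commands are therefore meant to be run on the remote server; a sketch, assuming the backup folder resides at /backup there (matching the SSHFS mount above):

### log into the remote server first so that the
### checksums are calculated remotely
# ssh backup@backup.example.com
# cd /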
### enter the backup files directory
cd ./backup/files
### create checksums of all files
find . -type f -print0 | xargs -0 sha1sum > ../checksums/backup
### leave the backup files directory
cd ../..
Comparing checksums that have been written to files has one caveat: the files might be sorted differently. You have to remember this and sort the checksum files before diffing them, otherwise you might find a lot of deviations.
### sort the checksums
sort ./backup/checksums/backup > ./backup/checksums/backup.sorted
sort ./backup/checksums/original > ./backup/checksums/original.sorted
### compare the checksums
diff ./backup/checksums/original.sorted ./backup/checksums/backup.sorted
The suggested restore test has one small imperfection. It may speed up the comparison of the local and remote copy, but this is only true for the encrypted files. Normally you would want to make sure that the decrypted files are identical to the original unencrypted files as well. There are two different approaches to achieve this:
The solution that I chose is a bit different: Thanks to the regular restore test I already know that the local encrypted files and the remote files are identical. That also means that the local encrypted files will decrypt to the exact same result as the remote files. So, I can just mount the local encrypted files via GoCryptFS which can then be compared to the original unencrypted files. As the encryption and decryption happen in-memory on-the-fly it is not necessary to keep a second copy of the data around.
### mount the encrypted folder in forward mode
gocryptfs ./encrypted ./decrypted
When calculating the checksums of the original unencrypted files we have to ignore the .gocryptfs.reverse.conf file as it will not be present after the decryption.
### enter the unencrypted directory
cd ./unencrypted
### create checksums of all files
### but ignore the gocryptfs config file
find . -type f ! -path "./.gocryptfs.reverse.conf" -print0 | xargs -0 sha1sum > ../unencrypted
### leave the unencrypted directory
cd ..
Calculating the checksums of the decrypted files might take a bit longer. Remember that in this case each file is read from disk, encrypted and then decrypted before calculating the actual checksum.
### enter the decrypted directory
cd ./decrypted
### create checksums of all files
find . -type f -print0 | xargs -0 sha1sum > ../decrypted
### leave the decrypted directory
cd ..
After sorting the checksum files we can finally compare them.
### sort the checksums
sort ./decrypted > ./decrypted.sorted
sort ./unencrypted > ./unencrypted.sorted
### compare the checksums
diff ./unencrypted.sorted ./decrypted.sorted
This whole process does not necessarily have to be done for each and every backup. It is primarily used to make sure that the encryption layer works as expected. After you are done, do not forget to unmount the decryption folder.
### unmount the directory
fusermount -u ./decrypted
Thanks to the usage of the FUSE wrapper it is pretty easy to restore files from the remote backup. For other applications the FUSE mount looks like any local storage device which means that you can also mount the remote backup directly via GoCryptFS.
### mount the backup folder in forward mode
gocryptfs ./backup/files ./decrypted
Now it is possible to browse the backup and search for the files that you want to restore. Once you have found them you can just copy them over. During the copy process the files will be downloaded and decrypted in-memory on-the-fly. After you are done, just unmount the decryption folder.
### unmount the directory
fusermount -u ./decrypted
There we have it. By combining some tools that each do their own job well, we have created a backup solution that is - in my opinion - easy to understand and use. Together those tools make up a solution that is better than the single parts alone.
Best of all: those tools are independent of each other. Most of them could be replaced should it become necessary. I hope that you can see the benefits of this approach to simple encrypted remote backups.
So finally, as a last step, do not forget to unmount the used folders. 😃
### unmount the directories
fusermount -u ./backup
fusermount -u ./encrypted
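To tie everything together, here is a minimal sketch of the whole pipeline as a single script. It relies on the folder layout and the remote target used throughout this article and is meant as a starting point, not as a drop-in solution; note that gocryptfs will ask for the repository password interactively unless you provide it by other means:

#!/bin/sh
### simple encrypted remote backup pipeline (sketch)
set -e
### mount the ad-hoc encrypted view of the source
gocryptfs --reverse ./unencrypted ./encrypted
### mount the remote storage
sshfs backup@backup.example.com:/backup ./backup
### copy over files and keep modified and deleted files
rsync -abEP "--backup-dir=../snapshots/$(date '+%Y%m%d-%H%M%S')/" --delete ./encrypted/ ./backup/files/
### create checksums of the local encrypted files
(cd ./encrypted && find . -type f -print0 | xargs -0 sha1sum) > ./original
cp ./original ./backup/checksums/original
### calculate the remote checksums on the server itself
### (assumes the backup folder resides at /backup there)
ssh backup@backup.example.com 'cd /backup/files && find . -type f -print0 | xargs -0 sha1sum' > ./backup/checksums/backup
### compare the sorted checksums and warn on deviations
sort ./backup/checksums/backup > ./backup/checksums/backup.sorted
sort ./backup/checksums/original > ./backup/checksums/original.sorted
if ! diff ./backup/checksums/original.sorted ./backup/checksums/backup.sorted; then
  echo "WARNING: the remote backup deviates from the local copy!"
fi
### unmount the directories
fusermount -u ./backup
fusermount -u ./encrypted

Once such a script has proven itself in manual runs it could also be scheduled, for example via cron.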
16.11.2020 yahe publicity security
Nearly a year ago I wrote that I had an extensive look into the server side encryption that is provided by the Default Encryption Module of Nextcloud. I also mentioned that I have written some helpful tools and an elaborate description for people that have to work with its encryption.
What I did not write about at that time was that I had also discovered several cryptographic vulnerabilities. After a full year, these have now finally been fixed, the corresponding HackerOne reports have been disclosed and so I think it is about time to also publish the whitepaper that I have written about these vulnerabilities.
The paper is called "Cryptographic Vulnerabilities and Other Shortcomings of the Nextcloud Server Side Encryption as implemented by the Default Encryption Module" and is available through the Cryptology ePrint Archive as report 2020/1439. The vulnerabilities presented in this paper have received their own CVEs, namely:
Having such an in-depth look into the implementation of a real-world application has been a lot of fun. However, I am also relieved that this project now finally comes to an end. I am eager to start with something new. 😃
A few weeks ago I wrote about the new cryptographic basis of the Shared-Secrets service. What I did not write about was that one user asked if a file-sharing option could be added. I declined, because sharing files is not what the service is meant for. But I still tried whether sharing files would be possible.
To share the files I needed a place to store them, but I did not feel like storing them on the actual server. So I searched for an easy file-storage service. The first service that came to my mind was Dropbox. I had never used Dropbox nor the Dropbox API, but felt that it should not be all too difficult. What I did not expect was how easy the Dropbox API actually is to use. Because of this simplicity I decided to write a short blog post about it. So here are 5 simple steps to use the Dropbox API...
First of all you need to register a Dropbox account. For this you should use a valid e-mail address that you have access to because you will have to verify that e-mail address later on.
After logging in with your new account you can visit the app creation page where you are requested to verify your e-mail address. You have to proceed with the verification in order to create your own Dropbox App.
Now that you have verified your e-mail address you can visit the app creation page again to create the actual app. Select the Dropbox API, choose to store the files in a separate App folder, pick a name for your App and you are ready to go.
When you have created the app the app details will be shown to you. Scroll down a bit and you will find a button labeled "Generate" in the "OAuth 2" section that will generate your individual access token. The Access Token displayed below the "Generated access token" heading is needed to identify your Dropbox Account.
Using the Dropbox API itself is relatively simple as most actions can be done with a single REST API call. Here are some PHP examples that illustrate the usage of the Dropbox API. First of all we define the Access Token which is needed by all API calls:
// see https://blogs.dropbox.com/developers/2014/05/generate-an-access-token-for-your-own-account/
define("DROPBOX_ACCESS_TOKEN", "YOUR DROPBOX ACCESS TOKEN");
Now, in order to store a given string ($content) in Dropbox we can use the following function. On failure it returns NULL and on success it returns a random identifier through which the content can be retrieved again:
function dropbox_upload($content) {
$result = null;
// store content in memory
if ($handler = fopen("php://memory", "rw")) {
try {
if (strlen($content) === fwrite($handler, $content)) {
if (rewind($handler)) {
// get random filename
$filename = bin2hex(openssl_random_pseudo_bytes(32, $strong_crypto));
if ($curl = curl_init("https://content.dropboxapi.com/2/files/upload")) {
try {
$curl_headers = ["Authorization: Bearer ".DROPBOX_ACCESS_TOKEN,
"Content-Type: application/octet-stream",
"Dropbox-API-Arg: {\"path\":\"/$filename\"}"];
curl_setopt($curl, CURLOPT_HTTPHEADER, $curl_headers);
curl_setopt($curl, CURLOPT_PUT, true);
curl_setopt($curl, CURLOPT_CUSTOMREQUEST, "POST");
curl_setopt($curl, CURLOPT_INFILE, $handler);
curl_setopt($curl, CURLOPT_INFILESIZE, strlen($content));
curl_setopt($curl, CURLOPT_RETURNTRANSFER, true);
$response = curl_exec($curl);
if (200 === curl_getinfo($curl, CURLINFO_RESPONSE_CODE)) {
$result = $filename;
}
} finally {
curl_close($curl);
}
}
}
}
} finally {
fclose($handler);
}
}
return $result;
}
To retrieve the stored content the following function can be used. It requires the random identifier ($filename) as the input parameter:
function dropbox_download($filename) {
$result = null;
if ($curl = curl_init("https://content.dropboxapi.com/2/files/download")) {
try {
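// the file to retrieve is addressed via the Dropbox-API-Arg header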
$curl_headers = ["Authorization: Bearer ".DROPBOX_ACCESS_TOKEN,
"Content-Type: application/octet-stream",
"Dropbox-API-Arg: {\"path\":\"/$filename\"}"];
curl_setopt($curl, CURLOPT_HTTPHEADER, $curl_headers);
curl_setopt($curl, CURLOPT_POST, true);
curl_setopt($curl, CURLOPT_RETURNTRANSFER, true);
$response = curl_exec($curl);
if (200 === curl_getinfo($curl, CURLINFO_RESPONSE_CODE)) {
$result = $response;
}
} finally {
curl_close($curl);
}
}
return $result;
}
In order to delete the stored content the following function can be used. It again requires the random identifier ($filename) as the input parameter:
function dropbox_delete($filename) {
$result = null;
if ($curl = curl_init("https://api.dropboxapi.com/2/files/delete_v2")) {
try {
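// the file to delete is addressed via a JSON-encoded request body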
$curl_fields = json_encode(["path" => "/$filename"]);
$curl_headers = ["Authorization: Bearer ".DROPBOX_ACCESS_TOKEN,
"Content-Type: application/json"];
curl_setopt($curl, CURLOPT_HTTPHEADER, $curl_headers);
curl_setopt($curl, CURLOPT_POST, true);
curl_setopt($curl, CURLOPT_POSTFIELDS, $curl_fields);
curl_setopt($curl, CURLOPT_RETURNTRANSFER, true);
$response = curl_exec($curl);
if (200 === curl_getinfo($curl, CURLINFO_RESPONSE_CODE)) {
$result = $response;
}
} finally {
curl_close($curl);
}
}
return $result;
}
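To see the three functions in action, here is a quick usage sketch (the content string is just an example):

// upload some content and remember its identifier
$identifier = dropbox_upload("Hello, Dropbox!");
if (null !== $identifier) {
  // retrieve the content again and verify it
  var_dump("Hello, Dropbox!" === dropbox_download($identifier));
  // clean up afterwards
  dropbox_delete($identifier);
}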
And there we have it. With a few lines of code we are able to store, retrieve and delete content in Dropbox. In my case I also encrypted the content before storing it in Dropbox, something that you should consider as well if you are handling confidential content. Other than that, the functions shown above are most of what you will need in the beginning.
17.12.2019 yahe code linux security
About 3 years ago I wrote about a tool called Shared-Secrets that I had written. Its purpose is to share secrets through encrypted links that can only be retrieved once. Back then I made the decision to base the application on GnuPG encryption, but over the last couple of years I had to learn that this was not the best of all choices. Here are some of the problems that I have found in the meantime:
As this last issue is a problem with the GnuPG message format itself, its solution required either changing or completely replacing the cryptographic basis of Shared-Secrets. After thinking about the possible alternatives I decided to design simple message formats and completely rewrite the cryptographic foundation. This new version was published a few weeks ago and a running instance is also available at secrets.syseleven.de.
This new implementation should solve the previous problems for good and will in the future allow me to implement fundamental improvements when they become necessary, as I now have a much deeper insight into the used cryptographic algorithms and the design of the message formats.
02.12.2019 yahe administration code security update
At the beginning of the year we ran into a strange problem with our server-side encrypted Nextcloud installation at work. Files got corrupted when they were moved between folders. We had found another problem with the Nextcloud Desktop client just recently and therefore thought that this was also related to synchronization problems within the Nextcloud Desktop client. Later in the year we bumped into this problem again, but this time it occurred while using the web frontend of Nextcloud. Now we understood that the behaviour did not have anything to do with the client but with the server itself. Interestingly, another user had opened a GitHub issue about this problem at around the same time. As these corruptions led to quite some workload for the restores, I decided to dig deeper to find an adequate solution.
After I had found out how to reproduce the problem it was important for us to know whether corrupted files could still be decrypted at all. I wrote a decryption script and proved that corrupted files could in fact be decrypted even when Nextcloud said that they were broken. With this in mind I tried to find out what happened during the encryption and what broke files while they were being moved. Doing all the research about the server-side encryption of Nextcloud, debugging the software, creating a potential bugfix and coming up with a temporary workaround took about a month of interrupted work.
Even more important than the actual bugfix (as we are currently living with the workaround) is the knowledge we gained about the server-side encryption. Based on this knowledge I developed a bunch of scripts that have been published as nextcloud-tools on GitHub. These scripts can help you to rescue your server-side encrypted files in cases when your database was corrupted or completely lost.
I also wrote an elaborate description about the inner workings of the server-side encryption and tried to get it added to the documentation. It took some time but in the end it worked! For about a week now you can find my description of the Nextcloud Server-Side Encryption Details in the official Nextcloud documentation.
Due to popular demand I wrote the decrypt-all-files.php script that helps you to decrypt all files that have been encrypted with the server-side encryption. It is accompanied by a somewhat extensive description of how to use it.