[TASK] Disabling file deduplication by default.

File hashing is an intensive task for both CPU and disk.
This feature is not a must-have and can produce unwanted side effects.
Very large files can take a long time to hash at the end of a transfer, producing potential timeouts and user frustration.

Users can still enable the file deduplication feature by setting `file_hash` to `md5`.
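
A minimal sketch of re-enabling deduplication, assuming the installation reads local overrides from a file such as `lib/config.local.php` (that path is an assumption, not something this commit states):

```php
<?php
// Hypothetical local override file (path assumed, adjust to your install).
// 'md5' hashes the whole file, which is what makes deduplication possible.
// The new default, 'random', skips hashing and therefore never deduplicates.
$cfg['file_hash'] = 'md5';
```

Full-file hashing costs CPU and disk time at the end of each upload, so this is probably only worth enabling where duplicate uploads are common and files are small enough to hash quickly.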

Signed-off-by: Jerome Jutteau <jerome@jutteau.fr>
Jerome Jutteau 2022-07-05 13:46:16 +02:00
parent aec88112ff
commit b0d7e17277
4 changed files with 10 additions and 7 deletions

@@ -13,7 +13,8 @@
 # version 4.5.0
 - Support for dark theme
-- Fix side effects of setting too high values in php configuration.
+- Fix side effects of setting too high values in php configuration
+- Change default `file_hash` option to `random`
 New configuration items:
 - `max_upload_chunk_size_bytes` option

@@ -273,7 +273,7 @@ Check [issues](https://gitlab.com/mojo42/Jirafeau/issues) to check open bugs and
 ### What about this file deduplication thing?
-Jirafeau uses a very simple file level deduplication for storage optimization.
+Jirafeau can use a very simple file level deduplication for storage optimization.
 This mean that if some people upload several times the same file, this will only store one time the file and increment a counter.
@@ -283,9 +283,11 @@ When the counter falls to zero, the file is destroyed.
 In order to know if a newly uploaded file already exist, Jirafeau will hash the file using md5 by default but other methods are available (see `file_hash` documentation in `lib/config.original.php`).
+This feature is disabled by default and can be enabled through the `file_hash` option.
 ### What is the difference between "delete link" and "delete file and links" in admin interface?
-As explained in the previous question, files with the same hash are not duplicated and a reference counter stores the number of links pointing to a single file.
+When file deduplication feature is enabled, files with the same hash are not duplicated and a reference counter stores the number of links pointing to a single file.
 So:
 - The button "delete link" will delete the reference to the file but might not destroy the file.
 - The button "delete file and links" will delete all references pointing to the file and will destroy the file.

@@ -36,7 +36,7 @@ Available options:
 - `ADMIN_PASSWORD`: setup a specific admin password. If not set, a random password will be generated.
 - `WEB_ROOT`: setup a specific domain to point at when generating links (e.g. 'jirafeau.mydomain.com/').
 - `VAR_ROOT`: setup a specific path where to place files. default: '/data'.
-- `FILE_HASH`: can be set to `md5` (default), `partial_md5` or `random`.
+- `FILE_HASH`: can be set to `md5`, `partial_md5` or `random` (default).
 - `PREVIEW`: set to 1 or 0 to enable or disable preview.
 - `TITLE`: set Jirafeau instance title.
 - `ORGANISATION`: set organisation (in ToS).

@@ -158,7 +158,7 @@ $cfg['proxy_ip'] = array();
 /* File hash
  * In order to make file deduplication work, files can be hashed through different methods.
- * By default, files are hashed through md5 but other methods are available.
+ * To enable file deduplication feature, set this option to `md5`.
  *
  * Possible values are 'md5', 'md5_outside' and 'random'.
  *
@@ -168,9 +168,9 @@ $cfg['proxy_ip'] = array();
  * - md5 of the last part of the file and
  * - file's size.
  * This method offer file deduplication at minimal cost but can be dangerous as files with the same partial hash can be mistaken.
- * With 'random' option, file hash is set to a random value and file deduplication cannot work anymore but it is fast and safe.
+ * With 'random' option, file hash is set to a random value and file deduplication cannot work but it is fast and safe.
  */
-$cfg['file_hash'] = 'md5';
+$cfg['file_hash'] = 'random';
 /* Work around that LiteSpeed truncates large files when downloading.
  * Only for use with the LiteSpeed web server!