Allow configuration of lucene query term limit

fixes #2834
Carsten Brandt 2017-11-23 13:22:05 +01:00
parent a5559da7a9
commit 708fc8299f
GPG Key ID: BE4F41DE1DEEEED0
3 changed files with 51 additions and 1 deletions


@ -4,6 +4,7 @@ HumHub Change Log
-------------------------
- Fix: Added `ManageSpaces` and SystemAdmin check to `UserGroupAccessValidator`.
- Fix: Only include content with `stream_channel = default` into spacechooser update count.
- Enh: Make lucene search term limit configurable via `ZendLuceneSearch::$searchItemLimit`.
1.2.3 (October 23, 2017)
-------------------------


@ -43,4 +43,32 @@ You can modify the default search directory in the [configuration](advanced-conf
]
// ...
];
```
### Limitations
The Zend Lucene engine runs inside the PHP process, so it is bound by the
memory and execution time limits of the PHP environment.
By default, Zend Lucene limits the number of terms in a search query,
which in turn limits the number of items a search term can match.
For the space search, this limit must be at least as high as the number of spaces.
In general, the required limit depends on how many items a search term can match, so it
depends heavily on your content. To make sure all searches work, set it higher than the
number of spaces, users, and content items you have.
Setting it to 0 removes the limit, but search queries may then
fail because of high memory usage.
You can [configure](advanced-configuration.md) the limit by setting `searchItemLimit` on the `search` application component:
```php
return [
    'components' => [
        'search' => [
            'searchItemLimit' => 10000,
        ],
    ],
];
```
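
As a hedged illustration of the `0` setting described above, a configuration that disables the limit could look like the following. This is only a sketch: it assumes the same configuration file as the previous example, and that the PHP memory limit is high enough for the size of your index.

```php
return [
    'components' => [
        'search' => [
            // 0 disables the term limit. Queries that expand to many terms
            // may then fail because of high memory usage.
            'searchItemLimit' => 0,
        ],
    ],
];
```

If searches start failing after this change, going back to a finite value above the number of spaces, users, and content items is the safer choice.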


@ -15,6 +15,7 @@ use humhub\modules\search\libs\SearchResultSet;
use humhub\modules\space\models\Space;
use ZendSearch\Lucene\Document\Field;
use yii\helpers\VarDumper;
use ZendSearch\Lucene\Lucene;
/**
* ZendLuceneSearch Engine
@ -30,6 +31,24 @@ class ZendLuceneSearch extends Search
*/
public $index = null;
/**
* @var integer sets Lucene's `termsPerQueryLimit` setting.
* This limits the number of terms in a search query, which in turn limits
* the number of items a search term can match.
*
* This property should be at least as high as the number of items a search can match.
* Configure it according to the number of items stored in the
* HumHub database.
*
* It can be set to 0 for no limit, but search queries may then
* fail because of high memory usage.
*
* Defaults to 2048, which is twice the default value set by Lucene.
*
* @see Lucene::getTermsPerQueryLimit()
*/
public $searchItemLimit = 2048;
/**
* @inheritdoc
*/
@ -285,6 +304,8 @@ class ZendLuceneSearch extends Search
\ZendSearch\Lucene\Analysis\Analyzer\Analyzer::setDefault(new \ZendSearch\Lucene\Analysis\Analyzer\Common\Utf8Num\CaseInsensitive());
\ZendSearch\Lucene\Search\QueryParser::setDefaultOperator(\ZendSearch\Lucene\Search\QueryParser::B_AND);
\ZendSearch\Lucene\Lucene::setTermsPerQueryLimit($this->searchItemLimit);
try {
$index = \ZendSearch\Lucene\Lucene::open($this->getIndexPath());
} catch (\ZendSearch\Lucene\Exception\RuntimeException $ex) {