Skip arbitrary features in the index

Description:

In censhare 4.8.5, there is now an option to switch off indexing for certain features. Furthermore, you can now prevent the values for word frequency and document length from being applied in a full-text index. censhare administrators should take this into consideration to prevent the memory requirements (database cache) and the database size from growing uncontrollably.

The "use-frequency" option is disabled only for the metadata indices (text.meta and text.name). For these two indices, relevance has little to do with the ratio of hits to document length, so there are no real disadvantages; instead, the results become more predictable.

The censhare full-text index, in which "DocLength" is significantly more important, continues to use frequency. All censhare versions prior to 4.8 always indexed with frequency on, so the default value remains "true".
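
To illustrate why frequency data matters less for short metadata fields, here is a minimal Python sketch. It is a toy model and not censhare's actual relevance formula; the function name score, its use_frequency parameter, and the sample asset names are assumptions made for illustration only.

```python
def score(term, doc_tokens, use_frequency=True):
    """Toy relevance score (hypothetical, NOT censhare's real formula)."""
    hits = doc_tokens.count(term)
    if hits == 0:
        return 0.0
    if not use_frequency:
        # Without frequency data, every matching document scores the same.
        return 1.0
    # With frequency data, hits are weighted against the document length.
    return hits / len(doc_tokens)


# Two hypothetical asset names that both contain the search term once.
short_name = "annual report".split()
long_name = "annual report 2014 final approved print version".split()

for use_freq in (True, False):
    print(use_freq,
          score("report", short_name, use_freq),
          score("report", long_name, use_freq))
```

With frequency on, the longer name ranks lower only because it contains more words, which says little about how well it matches. With frequency off, both names rank equally, which is the more predictable behavior described above for text.meta and text.name.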

Configuration:

Edit the XML file "/app/services/assetstore/config.xml" and make the adjustments required for your specific case.

<fulltext><index ... use-frequency="false"/>
<index name="..." disabled="true"/>
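
The same adjustment could be scripted. The following Python sketch sets the two attributes on an in-memory sample. The attribute names use-frequency and disabled and the index names come from this article; the SAMPLE document, its nesting, and the helper adjust_indexes are assumptions, since the real config.xml contains more elements and may be structured differently.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified stand-in for config.xml (layout is assumed).
SAMPLE = """<config>
  <fulltext>
    <index name="text.meta"/>
    <index name="text.name"/>
    <index name="comment"/>
  </fulltext>
</config>"""


def adjust_indexes(xml_text, no_frequency=(), disabled=()):
    """Return the XML with the requested index attributes set."""
    root = ET.fromstring(xml_text)
    for index in root.iter("index"):
        name = index.get("name")
        if name in no_frequency:
            # Stop applying word frequency and document length.
            index.set("use-frequency", "false")
        if name in disabled:
            # Switch the index off entirely.
            index.set("disabled", "true")
    return ET.tostring(root, encoding="unicode")


result = adjust_indexes(
    SAMPLE,
    no_frequency=("text.meta", "text.name"),
    disabled=("comment",),
)
print(result)
```

Work on a backup copy of the real file when trying this, and keep in mind that a rebuild of the CDB is still required after the configuration changes.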

You can disable a feature index in the Admin-Client (Configuration -> Embedded Database) via the "Disable this index" check box.

Because these changes affect the database configuration, a rebuild of the CDB is required. The relevance of search results is then calculated slightly less accurately, but this is tolerable.

Figure (2453587.png): The "comment" index has been set to inactive.

See also Wikipedia: Approximate string matching