Incomplete data in the Cached Tables results in missing Master Data

Unexpected behavior in censhare due to missing Master Data

In environments with remote servers (without their own database connection), users may lose access to some domains, so assets in those domains cannot be reached. In other cases, users who try to assign a new feature to an asset find that the feature (type: value list) does not display all the values that were added in the past. When the administrator reviews the Master Data in the Admin client, those values are missing there as well. The missing Master Data values reappear if the administrator performs a "data cache refresh" on the Master Server or if a feature is saved, with or without changes.

Symptoms

These behaviors are just symptoms and they might be intermittent. For example, users could lose access to certain domains, or lose rights, for a few minutes and then regain them. In the Admin client, a user may vanish temporarily, although it is still possible to log in with that user. It is also important to note:

  • This issue occurs only in environments with remote servers that don't have their own database connection.

  • It happens after someone changes Master Data on one of those remote servers without their own database connection (not recommended).

  • It only affects Master Data tables which have more than 10,000 records. This means it might not affect all Master Data at once, or it could affect Master Data other than Domains, Permissions, Users and Features.

Troubleshooting

You can increase the log level on the Master Server for the CacheService (com.censhare.support.cache.CacheService) from INFO to FINER by using the Admin-Client module Configuration / Modules / Logger Manager. A restart is not necessary for the change to take effect.

Now with the increased log level you can see when cached_tables are removed or newly added:

corpus@server:~$ grep -3 "CacheService: no-context: updating cached table by event: party_role" ~/work/logs/server-0.?.log|grep -E "removed|new"
/opt/corpus/work/logs/server-0.1.log-2018.03.15-15:25:53.945 FINE   : T062: CacheService: no-context: removed: <party_role party_id="6976" role="export" domain="root.techRoot.cdc." domain2="root." enabled="1" tcn="0" rowid="AAAamLAAXAAABt2ABh"/>
/opt/corpus/work/logs/server-0.1.log-2018.03.15-15:26:01.883 FINE   : T034: CacheService: no-context: removed: <party_role party_id="6166" role="export" domain="root.techRoot.mediaPool.cdc.flat." domain2="root." enabled="1" tcn="2" rowid="AAAamLAAaAAKiUDABI"/>
/opt/corpus/work/logs/server-0.1.log-2018.03.15-15:26:01.883 FINE   : T034: CacheService: no-context: new: <party_role corpus:dto_flags="pt" party_id="6748" role="layout" domain="root.techRoot.launchPacks." domain2="root." enabled="1" tcn="1" rowid="AAAamLAAXAAABt3AAd"/>
/opt/corpus/work/logs/server-0.2.log-2018.03.15-15:25:39.877 FINE   : T078: CacheService: no-context: new: <party_role corpus:dto_flags="pt" party_id="6748" role="layout" domain="root.techRoot.launchPacks." domain2="root." enabled="1" tcn="1" rowid="AAAamLAAXAAABt3AAd"/>
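On a busy server the grep output can be long, so it may help to total the entries per table. The following is a minimal Python sketch, not part of censhare. The regular expression is inferred from the sample lines above and is an assumption about the log format; the sample lines are shortened copies of that output.

```python
import re
from collections import Counter

# Pattern inferred from the sample log lines in this article (assumption);
# adjust it if your server logs use a different layout.
ENTRY = re.compile(r"CacheService: no-context: (removed|new): <(\w+) ")

def summarize(lines):
    """Count 'removed' and 'new' cache entries per cached table."""
    counts = Counter()
    for line in lines:
        match = ENTRY.search(line)
        if match:
            action, table = match.groups()
            counts[(table, action)] += 1
    return counts

# Shortened sample lines taken from the grep output above.
sample = [
    'server-0.1.log-2018.03.15-15:25:53.945 FINE   : T062: CacheService: no-context: removed: <party_role party_id="6976" role="export"/>',
    'server-0.1.log-2018.03.15-15:26:01.883 FINE   : T034: CacheService: no-context: removed: <party_role party_id="6166" role="export"/>',
    'server-0.1.log-2018.03.15-15:26:01.883 FINE   : T034: CacheService: no-context: new: <party_role corpus:dto_flags="pt" party_id="6748" role="layout"/>',
]

summary = summarize(sample)
for (table, action), count in sorted(summary.items()):
    print(f"{table}: {action} = {count}")
```

A large number of "removed" entries for one table shortly after a Master Data change on a remote server would be consistent with the behavior described in this article.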

However, this still does not tell you which (remote) server or action updates the cached tables. For that, additional logging in censhare-Server/java/source/com/censhare/manager/cachemanager/CacheServiceImpl.java would be necessary (this requires a server downtime and a rebuild). Here is an example:

Afterwards, the logging could look like the output below, and you would at least know which server the event came from. method=refresh indicates a Master Data update.

2018.03.19-15:07:03.057 FINE : T040: CacheService: no-context: update cached table resource_asset by event: [serverName=remote, target=CacheManager, method=refresh], entries removed: 0, entries new: 0, entries modified: 1
2018.03.19-15:07:03.058 FINE : T040: CacheService: no-context: updating cached table by event: resource_asset
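If such extended logging is in place, the originating server and the entry counts could be extracted with a small script. This is a sketch only: the line format is taken from the single example above, which itself depends on custom logging, so the pattern and the function name parse_event are assumptions.

```python
import re

# Pattern modeled on the example log line in this article (assumption).
EVENT = re.compile(
    r"update cached table (?P<table>\w+) by event: \[(?P<event>[^\]]*)\], "
    r"entries removed: (?P<removed>\d+), entries new: (?P<new>\d+), "
    r"entries modified: (?P<modified>\d+)"
)

def parse_event(line):
    """Return table, server, method and entry counts, or None if no match."""
    match = EVENT.search(line)
    if match is None:
        return None
    # The event part looks like "serverName=remote, target=..., method=refresh".
    fields = dict(pair.split("=", 1) for pair in match["event"].split(", "))
    return {
        "table": match["table"],
        "server": fields.get("serverName"),
        "method": fields.get("method"),
        "removed": int(match["removed"]),
        "new": int(match["new"]),
        "modified": int(match["modified"]),
    }

line = (
    "2018.03.19-15:07:03.057 FINE : T040: CacheService: no-context: "
    "update cached table resource_asset by event: "
    "[serverName=remote, target=CacheManager, method=refresh], "
    "entries removed: 0, entries new: 0, entries modified: 1"
)
info = parse_event(line)
print(info)
```

Filtering the parsed results for method == "refresh" and a non-zero "removed" count would narrow the search to Master Data updates that dropped cache entries.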

Cause

This issue is caused by the limit the Database Service applies when fetching cached data. This is the "Max rows for remote query" setting, remote-max-resultset-size="10000", which limits the Cached Tables to 10,000 records. For instance:

Suppose the party_role table has 10,123 records. Any change to or addition of a user on that Remote Server causes 123 records to be temporarily removed from the Cached Tables because the limit is set to 10,000 records.
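The arithmetic can be illustrated with a short Python model. It is a simplified sketch, not censhare code: it assumes the capped query simply returns the first rows up to the limit, whereas the article does not state which rows are dropped in practice.

```python
# Value of remote-max-resultset-size as given in this article.
REMOTE_MAX_RESULTSET_SIZE = 10_000

def truncated_fetch(rows, limit=REMOTE_MAX_RESULTSET_SIZE):
    """Model a remote query that is cut off at the result set limit.

    Assumption: the first `limit` rows are returned and the rest are dropped.
    """
    return rows[:limit]

# A party_role table with 10,123 records, as in the example above.
party_role = [f"row-{i}" for i in range(10_123)]

cached = truncated_fetch(party_role)
missing = set(party_role) - set(cached)

print(len(cached), len(missing))  # prints: 10000 123
```

Tables at or below the limit are returned in full in this model, which matches the observation that only Master Data tables with more than 10,000 records are affected.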

Solution

A fix for this issue has been included in the following censhare versions:

  • 2018.3.0 and higher

This fix cannot be backported to lower versions due to its complexity.

Workaround

The easiest workaround is to execute the "Reload Data Cache" action on the Master Server. However, this only works until another change to the Master Data is made from the Remote Server. It is recommended to always perform Master Data maintenance from the Master Server.

It is also possible to increase the remote-max-resultset-size limit (e.g. remote-max-resultset-size="15000") for the Master Server and then restart the censhare Application Servers. This approach is very costly performance-wise, and the limit cannot be raised repeatedly.

Additional information

Q: Why is there a limit on the Max rows for remote query? How is it helpful?

A: The reason for the limit is that queries to the DataObjectService are performed in a single round trip to the Master Server instead of reading each row separately. An unlimited query has the potential to crash the server with a very large XML result. There is an optimization planned for future censhare Server versions which repeatedly queries until all data is read.
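The idea of querying repeatedly until all data is read can be sketched as a generic paging loop. This Python example only illustrates the general technique and is not the censhare implementation; fetch_all, fetch_page and the in-memory table are hypothetical names introduced for the example.

```python
def fetch_all(fetch_page, page_size=10_000):
    """Read a table in bounded pages until a short page signals the end.

    fetch_page(offset, limit) is a hypothetical callback that returns at
    most `limit` rows starting at `offset`.
    """
    rows = []
    offset = 0
    while True:
        page = fetch_page(offset, page_size)
        rows.extend(page)
        if len(page) < page_size:
            # A page shorter than the limit means nothing is left to read.
            return rows
        offset += len(page)

# In-memory stand-in for a Master Data table with 10,123 records.
table = list(range(10_123))

def fetch_page(offset, limit):
    return table[offset:offset + limit]

all_rows = fetch_all(fetch_page)
print(len(all_rows))  # prints: 10123
```

Each round trip stays bounded by the page size, so no single result grows without limit, while the cache still receives every row.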