Performance Analysis
Introduction
This guide provides a comprehensive, step-by-step approach to analyzing performance issues.
Step 1: Initial Server Performance Check (15 Minutes)
1.1 Memory-Related Checks
Check for OutOfMemory Errors:
Run the following command to search for OutOfMemory errors in the server logs:
grep OutOfMemory ~/work/logs/server-0.*.log
Check if there are any heap dump files created with a recent timestamp:
ls -ltr /opt/corpus/*.hprof
If OutOfMemory errors are found:
Identify the command that caused the OutOfMemory error from the logs.
Send the automatically created heap dump file (*.hprof) and the relevant server logs to your support team for further analysis.
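The two checks above can be combined into a pair of small helpers. This is a minimal sketch: the function names count_oom and recent_heap_dumps and the 7-day default are illustrative assumptions, not part of any censhare tooling. The paths you pass in are the ones used in this guide.

```shell
# Hypothetical helpers; names and defaults are illustrative only.

# Print "<file>:<count>" for every log file that contains OutOfMemory lines.
count_oom() {
  local f n
  for f in "$@"; do
    [ -f "$f" ] || continue
    # grep -c prints 0 and exits non-zero when nothing matches, hence "|| true".
    n=$(grep -c 'OutOfMemory' "$f" || true)
    if [ "$n" -gt 0 ]; then
      printf '%s:%s\n' "$f" "$n"
    fi
  done
  return 0
}

# List heap dumps in a directory that were modified within the last N days (default 7).
recent_heap_dumps() {
  local dir=$1 days=${2:-7}
  find "$dir" -maxdepth 1 -name '*.hprof' -mtime "-$days" -print
}
```

Typical calls would be count_oom ~/work/logs/server-0.*.log and recent_heap_dumps /opt/corpus. An empty result from both suggests that memory exhaustion is not the cause.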
1.2 CPU-Related Checks
Check for High CPU Usage:
Log in via SSH.
Execute the Unix top command and watch it for a few seconds.
Identify if any threads are hogging the CPU by using the top command:
top -H -p <censhareServerPID>
If a thread is consuming excessive CPU:
Create 5 thread dumps using cssjstack:
for i in {1..5}; do cssjstack <censhareServerPID> > thread_dump_$i.txt; sleep 5; done
Send the thread dumps, relevant server logs, and a screenshot of the Admin Client to your support team for further analysis.
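To match a busy thread from top -H with its entry in a thread dump, the thread ID usually has to be converted. The standard JDK jstack on Java 8 and 11 prints the native thread ID in hexadecimal (nid=0x...), while top shows it in decimal. The sketch below assumes that cssjstack produces jstack-style output, which this guide does not confirm, so check one dump first. The function names are hypothetical.

```shell
# Convert a decimal thread ID (PID column of `top -H`) to the hex nid form.
tid_to_nid() {
  printf '0x%x\n' "$1"
}

# Print the header line of the matching thread from each given dump file.
# Assumption: dumps contain jstack-style "nid=0x..." fields followed by a space.
find_thread() {
  local nid
  nid=$(tid_to_nid "$1")
  shift
  grep -H "nid=$nid " "$@"
}
```

For example, find_thread 12345 thread_dump_*.txt would search all five dumps for nid=0x3039. A thread that shows the same stack in every dump is a likely candidate for the CPU load.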
1.3 Check censhare Diagrams
Log into the censhare Admin Client.
Go to Status | Diagrams, open it, and take a screenshot (#analysis data).
Check for peaks (interpreting them takes some experience).
Hint: Peaks come in two types. If the line rises above normal and comes back down after some time, there was a problem, but the censhare Server has recovered. If the line rises above normal and stays there, the problem still exists and may require a server restart.
1.4 Documentation and Next Steps
If server performance is OK:
Communicate with the customer or partner using a predefined template to gather more detailed information about the performance issue.
Step 2: Gather Detailed Information from the Customer
2.1 Understand the Problem
What exactly is slow?
Login / logout
Search functions, edit metadata (CDB or Oracle database)
Asset operations (e.g., check-in, check-out)
Network, Storage, Oracle database, triggered post-processes
Preview generation (parallelism, 3rd-party command forking)
Special module issues (e.g., attribute inheritance, archiving/de-archiving)
When exactly is it slow?
Specific recent timeframes (e.g., for the last xx hours, or at specific times)
Has something been changed?
User load changes (e.g., more users than before)
Internal configuration changes (e.g., JVM/GC memory configuration, CDB configuration changes, new file system added)
Oracle database changes (e.g., decreased SGA size, Archivelog mode activated)
Operating system changes (e.g., updates, swap removed)
Storage changes (e.g., firmware update, NFS parameters)
Network changes (e.g., DNS, bandwidth, latency)
Where does the performance issue occur?
Only within the censhare Web-Client
Only within the censhare Client
Client-independent issue (occurs on all workstations with all logged-in users)
2.2 Collect Environment Details
System Information:
Hardware specifications (CPU, RAM, Disk) for servers and clients.
Operating System and version for all components.
Network Information:
Network configuration and bandwidth.
Any recent changes to network settings or infrastructure.
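The basic system facts listed above can be captured on a Linux server in one pass. The following is a minimal sketch under stated assumptions: collect_env_info is a hypothetical function name, and the tools it calls (uname, nproc, free, df, ip) are standard Linux utilities that may be missing on some hosts, in which case they are skipped.

```shell
# Write basic host facts to a report file. Missing tools are skipped.
collect_env_info() {
  local out=$1
  {
    echo "== Collected: $(date) =="
    echo "== Kernel and OS =="
    uname -a
    if [ -r /etc/os-release ]; then cat /etc/os-release; fi
    echo "== CPU count =="
    if command -v nproc >/dev/null 2>&1; then nproc; fi
    echo "== Memory (MB) =="
    if command -v free >/dev/null 2>&1; then free -m; fi
    echo "== Disk usage =="
    df -h || true
    echo "== Network interfaces =="
    if command -v ip >/dev/null 2>&1; then ip addr || true; fi
  } > "$out" 2>&1
  return 0
}
```

Running collect_env_info env-report.txt on each server gives a file that can be attached to the analysis data and compared before and after a change.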
2.3 Software and Configuration Details
censhare Version:
Exact version of censhare Server, Client, Web application, ServiceClient, and Render Client.
Third-Party Integrations:
Details about any integrated third-party services or applications.
Configuration Settings:
Custom settings or configurations in censhare and its components.
Database details (type, version, configuration).
Step 3: Analyze Logs and Metrics
3.1 Review System Logs
censhare Server Logs:
Check for error messages, warnings, or unusual activity.
censhare Client Logs:
Review logs for any performance-related entries.
censhare Web Logs:
Look at logs from the AngularJS/Angular application for any issues.
ServiceClient and Render Client Logs:
Analyze logs for errors or performance bottlenecks.
Database Logs:
Look for slow queries, locks, or errors.
Application Server Logs:
Check for performance-related entries and resource utilization.
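When reviewing logs, it often helps to see when errors cluster, so they can be matched against the timeframe reported by the customer. The sketch below counts matching lines per hour. It assumes that each log line starts with a timestamp of the form "YYYY-MM-DD HH:MM:SS", which may not match the actual log format of your installation; adjust the substring length if it differs. The function name hits_per_hour is hypothetical.

```shell
# Count lines matching an extended regex, grouped by the first 13 characters
# of each line ("YYYY-MM-DD HH" under the assumed timestamp format).
hits_per_hour() {
  local pattern=$1
  shift
  grep -hE "$pattern" "$@" |
    awk '{ c[substr($0, 1, 13)]++ } END { for (h in c) print h, c[h] }' |
    sort
}
```

A call such as hits_per_hour 'ERROR|WARN' ~/work/logs/server-0.*.log prints one line per hour with the number of hits, which makes sudden spikes easy to spot.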
3.2 Monitor System Resources
CPU and Memory Usage:
Identify processes consuming excessive CPU or memory.
Disk I/O:
Check for high disk read/write operations.
Network Activity:
Monitor network traffic for any unusual spikes.
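For a quick snapshot of CPU and memory consumers, the output of ps can be filtered against thresholds. This is a sketch only: flag_heavy is a hypothetical helper, and the default limits of 80 percent are arbitrary assumptions that should be tuned to the system.

```shell
# Read "PID %CPU %MEM COMMAND" rows on stdin and print the rows whose CPU or
# memory percentage is at or above the given limits (defaults: 80 and 80).
flag_heavy() {
  local cpu_limit=${1:-80} mem_limit=${2:-80}
  awk -v cpu="$cpu_limit" -v mem="$mem_limit" '
    NR == 1 && $1 == "PID" { next }   # skip the ps header row
    $2 + 0 >= cpu || $3 + 0 >= mem { print }
  '
}
```

Usage: ps -eo pid,pcpu,pmem,comm | flag_heavy 80 50. Note that ps reports CPU usage averaged over the lifetime of a process, so top remains the better tool for short spikes.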
Step 4: Reproduce the Issue
4.1 Attempt to Reproduce
Try replicating the issue in a controlled environment using the same steps described by the customer.
Use similar data and configuration settings to simulate the customer’s environment as closely as possible.
4.2 Document Findings
Record the steps to reproduce the issue.
Note any differences in behavior between the test environment and the customer’s environment.
Step 5: Analyze Database Performance
5.1 Check Database Health
Analyze Database Performance Metrics:
Query response times.
Transaction logs.
Index usage and fragmentation.
Identify slow-running queries and consider optimization.
5.2 Database Maintenance
Ensure regular database maintenance tasks are performed:
Index rebuilding or reorganization.
Statistics updates.
Cleanup of old or unnecessary data.
Step 6: Application and Code Review
6.1 Review Application Configuration
Check for optimal configuration settings in the censhare Server, Client, and Web application.
Ensure that application settings align with best practices for performance.
6.2 Code Analysis
Inspect custom code or scripts for inefficiencies in censhare Server, Client, Web, ServiceClient, and Render Client.
Review recent code changes that might impact performance.
Step 7: Network and Infrastructure Assessment
7.1 Network Performance
Conduct network tests to check for latency, packet loss, or bandwidth issues.
Verify network configuration and routing settings.
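A simple latency and packet-loss test can be run with ping and condensed into one line for the analysis notes. The sketch below parses the summary lines that Linux ping prints ("... % packet loss" and "rtt min/avg/max/mdev = ..."); other ping implementations may format these lines differently. The function name ping_summary is hypothetical.

```shell
# Read ping output on stdin and print "loss=<pct>% avg_ms=<value>".
ping_summary() {
  awk '
    /packet loss/ {
      # The loss figure is the field that ends with a percent sign.
      for (i = 1; i <= NF; i++) if ($i ~ /%$/) { loss = $i; sub(/%/, "", loss) }
    }
    /min\/avg\/max/ {
      split($0, halves, " = ")
      split(halves[2], parts, "/")
      avg = parts[2]
    }
    END {
      if (avg == "") avg = "n/a"   # no replies received, so no rtt line
      printf "loss=%s%% avg_ms=%s\n", loss, avg
    }
  '
}
```

Usage: ping -c 5 <databaseHost> | ping_summary, where <databaseHost> is a placeholder for the host to test. Running it from the censhare Server towards the database and storage hosts, and from a client towards the server, helps narrow down which network segment is slow.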
7.2 Infrastructure Review
Evaluate the infrastructure supporting censhare:
Load balancers.
Firewalls.
Storage systems.
Step 8: Document and Involve Other Teams
8.1 Documentation
Document what has already been checked:
Detail all the steps taken and findings from each phase of the analysis.
Include logs, metrics, screenshots, and any other relevant data.
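The collected logs, thread dumps, and screenshots are easier to hand over as a single archive. The following is a minimal sketch: bundle_findings and the analysis-<timestamp>.tar.gz naming scheme are assumptions for illustration, not an established convention.

```shell
# Pack the given files into <outdir>/analysis-<timestamp>.tar.gz
# and print the path of the created archive.
bundle_findings() {
  local outdir=$1 archive
  shift
  archive="$outdir/analysis-$(date +%Y%m%d-%H%M%S).tar.gz"
  tar -czf "$archive" "$@" && printf '%s\n' "$archive"
}
```

Usage: bundle_findings /tmp thread_dump_*.txt env-report.txt. Heap dumps can be very large, so it is usually better to transfer them separately.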
8.2 Involve Other Teams
Communicate with Other Teams:
Share the documented findings with relevant teams (e.g., network, database, application).
Ensure that other teams know what has been checked and what needs further investigation.
Step 9: Engage with the Customer
9.1 Communicate Findings
Share initial findings and potential solutions with the customer.
Provide recommendations for immediate steps to mitigate the issue.
9.2 Follow Up
Schedule follow-up meetings to track progress.
Ensure the customer has implemented recommended changes and monitor the impact.
Conclusion
By following this structured approach, you can systematically analyze and address performance issues reported by customers. Thorough documentation, effective communication, and continuous monitoring are key to ensuring customer satisfaction and maintaining optimal performance of censhare systems.