by Wade Lockhart
February 21, 2017
In pt.1 of this blog I discussed an example of how to measure low performance using SQL queries both through Informatica and run manually against the repository database. In pt.2 I will be going over a few hints for improving repository performance along with their source.
You should not try any of these hints without first consulting Informatica Support. You might also wish to take a full repository backup as well as a backup of the server environment before you begin the cleanup.
Running some of these commands against a repository with large amounts of data can cause performance issues as well and, in some cases, may ‘hang’. As a good rule of thumb to avoid this type of issue is to run the commands manually on the command line as opposed to using the Informatica client tools.
Hint 1: Update the Repository Statistics
There is a pmrep command that will help improve the repository performance – UpdateStatistics. To explain how it works, let me just paste a section from an Informatica KB article:
Using the pmrep UpdateStatistics command will return “completed successfully” or “failed”. If it fails, then contact Informatica Support and get them to help you out right away.
It is best to run this command in conjunction (and before) you look at the hint 2 command to truncate logs below.
Hint 2: Delete Old Logs
As workflows and sessions are executed against the repository database, logs about statistics and history will begin to build up. If they are allowed to grow without ever being cleaned up once in a while, then that can lead to poor performance. By using the pmrep truncateLog command (on the command line) this will clear out the historical statistics and keep only the most recent.
Usage for the command is: pmrep truncateLog –f <folder_name> -w <workflow_name>. This will also return “completed successfully” or “failed”. If it fails make sure your arguments are correct, like a valid folder name, and that the workflow exists in that folder.
There is also a way to delete log records directly against the database. There are 3 tables that store historical log information:
If you have direct access to the repository database then you can do something like run:
SELECT count (*) from OPB_SESS_TASK_LOG
This will return the number of records of historical logs. You could then run SQL to selectively delete these records.
Hint 3: Purging Old Object Versions
The repository maintains all of the old versions for every single object. That means, as time goes by and newer versions are added, the really old versions are taking up unwanted space and bloating the repository. One strategy to get rid of older versions of objects is to use the pmrep purgeversion command on the command line.
An example of using this command would be to run it against a specific folder in the repository and delete all versions (of objects) older than the newest one. Like…
Pmrep purgeversion –f <folder_name> -n <number_of_latest_versions_to_keep>
Tip: Cleanup the infa_shared Directories
Managing the storage of the Informatica server can be just as important as the repository. I understand that this blog is about repository performance, but thought it worth mentioning since it can go overlooked.
Each time an Informatica job executes it writes files into the SessLogs, WorkflowLogs, Temp, and Cache directories. Sometimes instead of replacing these files each run, new ones are created which can take up a lot of space and also make the directories harder to access for the OS. This can mean a loss of performance as the OS tries to access the growing volume of files in those directories.
One method to test the OS access to a directory with large amounts of file data is to perform the ls command and see how long it takes to return the information. If it takes any longer than a few seconds and you have yourself a bottleneck.
A good practice is to maintain a schedule to review the contents of these directories and remove old files as required.