RACF Performance Through the Eyes of Support
From a RACF perspective, good performance is all about I/O and the ENQs that serialize that work.
By John Reale III, 11/17/2020
What ENQs does RACF use for database I/O? On each system, RACF uses MAJOR=SYSZRAC2,MINOR=data.set.name,SCOPE=SYSTEM to control access to the cache buffers it has allocated in CSA. Once that ENQ is obtained, MAJOR=SYSZRACF,MINOR=data.set.name,SCOPE=SYSTEMS is used to control access to the database itself. If there is contention on SYSZRAC2, it usually means there is contention on SYSZRACF somewhere across the GRS system complex (a.k.a. the sysplex). For this article, we are specifically concerned with the SYSZRACF ENQ.
Any time a profile is updated or added, this ENQ is obtained for Exclusive use and could create a bottleneck if the work does not complete quickly. There are several steps you can take to reduce RACF’s need to read from or write to the RACF database datasets.
RACLIST
One of the easiest things you can do is tell RACF to load profile data for a Class into a dataspace. This is done via SETROPTS RACLIST(class). When a Class is in this state, a request for a profile from this Class is satisfied from the dataspace, thereby avoiding I/O and the ENQ. When profile changes are made, they happen on DASD and do not reach the dataspace on their own. You would then schedule a time for a SETROPTS RACLIST(class) REFRESH to build a completely fresh dataspace and delete the old one. Virtually all RACF Classes can be RACLISTed. A Class is a poor candidate for RACLIST if it is subject to frequent updates. The TAPEVOL class comes to mind.
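To make the timing concrete, here is a minimal Python sketch of the behavior described above. It is a toy model, not RACF's implementation: the class name RaclistedClass, its methods, and the sample profile name are all hypothetical, and plain dictionaries stand in for the database on DASD and for the dataspace.

```python
class RaclistedClass:
    """Toy model of one RACLISTed class (hypothetical, not RACF code)."""

    def __init__(self, dasd_profiles):
        self.dasd = dasd_profiles              # stands in for the database on DASD
        self.dataspace = dict(dasd_profiles)   # copy loaded by SETROPTS RACLIST(class)

    def lookup(self, name):
        # Reads are satisfied from the dataspace: no database I/O, no ENQ.
        return self.dataspace.get(name)

    def alter(self, name, access):
        # Profile changes are made on DASD only.
        self.dasd[name] = access

    def refresh(self):
        # SETROPTS RACLIST(class) REFRESH: build a fresh copy, drop the old one.
        self.dataspace = dict(self.dasd)


facility = RaclistedClass({"SAMPLE.PROFILE": "READ"})
facility.alter("SAMPLE.PROFILE", "NONE")
before_refresh = facility.lookup("SAMPLE.PROFILE")   # still the old value
facility.refresh()
after_refresh = facility.lookup("SAMPLE.PROFILE")    # now the changed value
```

The point of the sketch is the gap between alter and refresh: lookups keep returning the old profile until the REFRESH runs, which is why frequently updated Classes are awkward to RACLIST.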
CF Structure
Another way RACF can cache profile data is in a CF Structure. When a CF Structure is available, RACF first checks whether the needed profile block is in the CF. If not, it reads the block from the database and saves a copy to the CF Structure. When the profile block is updated, RACF invalidates the saved copy in the CF Structure and lets the next read of that block repopulate the CF Structure. Reading from the CF Structure now includes using the ENQ (thanks to APAR OA53679), so you might notice a difference after its PTF is installed. Even so, reading from the CF is much faster than an I/O.
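The read and update flow just described is a read-through cache with invalidation on write. A minimal Python sketch of that pattern follows. It is only an illustration: the class BlockCache, its attribute names, and the block contents are hypothetical, a dictionary stands in for the database, and the ENQ serialization is left out entirely.

```python
class BlockCache:
    """Toy read-through cache standing in for the CF Structure (hypothetical)."""

    def __init__(self, database):
        self.database = database   # dict of block id -> contents; stands in for DASD
        self.cached = {}           # blocks currently held in the "CF Structure"
        self.io_reads = 0          # how many reads had to go to the database

    def read(self, block_id):
        # First check the cache; only on a miss go to the database.
        if block_id in self.cached:
            return self.cached[block_id]
        self.io_reads += 1
        contents = self.database[block_id]
        self.cached[block_id] = contents   # save a copy for later readers
        return contents

    def update(self, block_id, contents):
        # Write to the database and invalidate the cached copy;
        # the next read of this block repopulates the cache.
        self.database[block_id] = contents
        self.cached.pop(block_id, None)


cache = BlockCache({7: "profile block v1"})
cache.read(7)                      # miss: one database read
cache.read(7)                      # hit: no database read
cache.update(7, "profile block v2")
latest = cache.read(7)             # miss again after invalidation
```

Invalidating on update rather than rewriting the cached copy keeps the update path short; the cost is one extra database read the next time the block is needed.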
When is a CF Structure available? If all of the RACF systems that share a given RACF database are members of the same sysplex, and no other RACF database in the sysplex is already using the CF Structure, you can create the Structures and set RACF to use them. You would first need to IPL each sharing system into RACF Sysplex Communication (denoted by a flag in the ICHRDSNT). Once that is complete, you can issue the RVARY DATASHARE command (in TSO). Do not forget to update the ICHRDSNT again so that future IPLs will enter Data Sharing Mode (DSM) automatically.
A note here about the ENQ. When in DSM, this ENQ is a pure ENQ processed within the sysplex. This is why there cannot be any systems from outside the sysplex using the RACF database. When not using DSM, the ENQ is connected to a hardware RESERVE on the volume. This controls serialization among systems that are not in a common sysplex, or any sysplex.
False Contention
Only one RACF database can be in DSM in the sysplex. This is a limitation to help drive toward one database across the whole sysplex, although there can be reasons why you cannot consolidate to one database. If you are in that situation, please do not give your RACF database datasets the same names. The DSN is part of the ENQ resource (MINOR name), so common DSNames will create “false contention” between the two (or more) databases.
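Why identical names collide can be shown with a short Python sketch. It assumes only what is stated above, that the ENQ resource is the pair of MAJOR name and data set name; the function names, the database labels, and the data set names are hypothetical.

```python
from collections import defaultdict

MAJOR = "SYSZRACF"


def enq_resource(dsn):
    # The resource name is just (MAJOR, MINOR); nothing identifies
    # which physical database the data set name belongs to.
    return (MAJOR, dsn)


def shared_resources(databases):
    """Return ENQ resources used by more than one database.

    databases: dict of database label -> data set name.
    """
    users = defaultdict(list)
    for label, dsn in databases.items():
        users[enq_resource(dsn)].append(label)
    return {res: labels for res, labels in users.items() if len(labels) > 1}


same_names = {"production": "SAMPLE.RACF.PRIM", "test": "SAMPLE.RACF.PRIM"}
distinct_names = {"production": "SAMPLE.RACF.PROD", "test": "SAMPLE.RACF.TEST"}

colliding = shared_resources(same_names)
clean = shared_resources(distinct_names)
```

With identical names, both databases serialize on the single resource ("SYSZRACF", "SAMPLE.RACF.PRIM"), so an Exclusive request for one database delays work against the other. With distinct names the result is empty.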
Another situation has also recently surfaced that would create “false contention.” It has to do with the Lock Structure used by GRS and how XES monitors Locks. Long story short, XES took PE APAR OA60394 to note a problem with the fix of APAR OA59122. (Learn more about how a Lock Structure can create “false contention” when it is too small.)
Caching in VLF
When a user logs on to the system, RACF builds an ACEE block from their USER profile (and other profiles) for some of the data and access criteria. This data usually does not change much, so caching the ACEE allows subsequent logins to run much faster. To do this, you update VLF options in the COFVLFxx PARMLIB member to include CLASS NAME(IRRACEE). To also cache much of the Unix-based data, you will want to include NAMEs IRRUMAP, IRRGMAP and IRRSMAP. Depending on the age of your system, you will want to verify a few APARs are installed. For RACF: OA51204, OA52117, OA52226, OA52291, OA57821; for VLF: OA51218, OA54909.
INITSTATS DAILY
Even with the ACEE being cached, RACF at each logon will want to write the “Last Access” date and time back to the USER profile. The first time this timestamp is updated on any given day, RACF will get Exclusive use of the ENQ. Subsequent logins may be able to get Shared use, under certain conditions. Whether they do or not, it would be an improvement if the update did not happen at all. Avoiding the update was first made available for Unix activity via the SESSION=OMVSSRV parameter on the RACROUTE REQUEST=VERIFY,ENVIR=CREATE call.
It has since been made available to all other applications that use the APPL=applname parameter. All you need to do is add some APPLDATA to the covering profile in the APPL Class, e.g.:
RALTER APPL applname APPLDATA('RACF-INITSTATS(DAILY)')
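To get a feel for the savings, the following Python sketch counts “Last Access” updates for one user. It rests on an assumption: that with RACF-INITSTATS(DAILY) only the first logon of a given day through that application writes the statistics. The function name and the sample dates are hypothetical, and this is a counting model, not RACF logic.

```python
from datetime import date


def count_profile_updates(logon_dates, daily):
    """Count 'Last Access' writes for one user over a sequence of logon dates."""
    updates = 0
    last_access = None
    for day in logon_dates:
        if daily and last_access == day:
            continue          # already recorded today: skip the update
        last_access = day
        updates += 1
    return updates


# Three logons on one day, two on the next.
logons = [date(2020, 11, 16)] * 3 + [date(2020, 11, 17)] * 2

every_logon = count_profile_updates(logons, daily=False)
once_per_day = count_profile_updates(logons, daily=True)
```

For a user or server ID that authenticates thousands of times a day, the same arithmetic removes all but one database update, and one ENQ request for that update, per day.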
ICHRIX01
What about an application that does not use the APPL=applname parameter? You’re in luck, if you have decent assembler programming skills. You can create or update your RACF VERIFY Pre-Processing exit, ICHRIX01, to check for a blank or null APPL value and then insert your own applname, for example, NOAPPLNM. Then you can create a covering profile and let everyone in your shop avoid a plethora of RACF updates:
RDEFINE APPL NOAPPLNM APPLDATA('RACF-INITSTATS(DAILY)') UACC(READ)
This can be important if a TSO logon happens to lose the CPU in the middle of this RACF activity and doesn’t have the WLM definition to get it back before the system starts to process at a snail’s pace.
A note of warning: From an auditing perspective, the LAST-ACCESS DATE/TIME in the user profile has never been a proper method to know a user’s last login. SMF type 80 records should be used for tracking logins.
Other Issues
There are a few other things one might come across that have negative effects on performance. If one happens to issue SETROPTS AUDIT(DIRSRCH(SUCCESS)), you will start to get millions of SMF type 80 records that will bring your system to its knees.
DSS & HSM are DFSMS components that could also cause an issue with RACF. You absolutely do not want the RACF volume to be managed by them. They will hold a RESERVE of the volume, and if RACF is not in DSM, it will also use RESERVE. This will keep RACF blocked out until the volume is copied. If RACF is in DSM, the copy will be worthless because RACF will be making updates at the same time the copy is being made. (While I am on the topic, do not allow DSS to do a DEFRAG of the volume. Just don’t.)
Other Applications
Lastly, many products have their own methods of caching security data and limiting how often they call RACF. You can use RACF’s GTF/SAF Trace to track the ASID(s) that make RACROUTE calls the most and see if they have any options to reduce the RACF workload. An example would be CICS’ USRDELAY parm.
Avoiding a Performance-Security Conflict
I hope this handful of tidbits keeps you out of a performance-security conflict, especially one that might require assistance from Support on a high-volume morning. But if none of this helps your situation, or you have already implemented all of it, one last option is to split your database across multiple dataset pairs. This ultimately causes RACF to use multiple ENQ resource names, because the dataset name is part of the resource name. Doing a RACF split is a project in and of itself: you must analyze profile usage so you can create a proper ICHRRNG range table for the IRRUT400 utility. And once you have multiple dataset pairs, you will also need a plex-wide IPL to incorporate the updated ICHRDSNT & ICHRRNG.
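The idea behind a split can be sketched in a few lines of Python. This is a simplified illustration, not the ICHRRNG format: the table contents, data set names, and function names are hypothetical, each range is mapped straight to a data set name for brevity, and Python's string ordering stands in for the EBCDIC collating sequence, which sorts some characters differently.

```python
import bisect

# Hypothetical range table, sorted by starting profile name.
# A profile belongs to the last entry whose start is <= its name.
RANGE_TABLE = [
    ("", "SAMPLE.RACF.DB1"),    # everything below "M"
    ("M", "SAMPLE.RACF.DB2"),   # "M" and above
]


def dataset_for(profile_name, table=RANGE_TABLE):
    """Return the data set that holds the given profile name."""
    starts = [start for start, _ in table]
    index = bisect.bisect_right(starts, profile_name) - 1
    return table[index][1]


def enq_minor_names(profile_names, table=RANGE_TABLE):
    """Return the distinct ENQ MINOR names a set of updates would touch."""
    return {dataset_for(name, table) for name in profile_names}


first = dataset_for("ALICE")
second = dataset_for("MIKE")
touched = enq_minor_names(["ALICE", "MIKE"])
```

Updates to ALICE and MIKE now serialize on two different resource names instead of one, which is where the relief comes from. It also shows why the usage analysis matters: if most of the busy profiles fall into the same range, the split buys very little.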
May you and your systems run at peak performance!
John Reale III is a senior software engineer at IBM and is a leader in z/OS Support. John joined IBM in 1989 and Support in 1994. His career has covered SMP/E, RMF, the RACF+ team and now the LE+ team.