ORA-29702 during Flashback or reverting database upgrade

ORA-29702 during Flashback or reverting database upgrade

Mike Dietrich blogged about this issue but I received some feedback/questions that I thought it is best to provide additional details on this error.

and

As mentioned earlier, Mike has covered this topic in depth and this is my attempt to provide additional information along with snippets from the trace files to explain the root cause analysis process, so in future DBA's can perhaps reuse this information to better understand ORA-29702 errors

ORA-29702

An ORA-29702 error is typically signaled when the Database instance attempts to talk to the Clusterware (ocssd process). The ocssd process is the Oracle Cluster Synchronization Services daemon responsible for Node management services. The most common cause of 29702 error is typically some misconfiguration that breaks the authentication required by the database process. A simple orachk will check for required permission and provide hints on what is broken (hint: check oradism configuration). I have also covered some more details in this slide share "Smart Monitoring: How does Oracle RAC manage Resources and State".

Ocssd process has steadily added more and more features over the years. One such new feature is the increase in maximum number of members that CSSD can support to 2048 from 512). In order to benefit from this change, all entities that were earlier (prior to the upgrade) connected to the CSSD needs to be register with the new maximum members value.

So any GI and database release greater than 19.3 would work without signaling the ORA-29702 error as the 19.3 database would now use the new 2048 value for MaxMembers during registration. The value of maximum members that can connect to the CSSD is persisted to storage so that future database start up can reuse that information. The problem occurs when a customer either reverts the upgrade or performs flashback operation

Lets understand this with an example

  1. GI version 11.2.0.4 running with 11.2.0.4 release of RDBMS with one database called Sales
  2. Maxmembers used by Sales is 512
  3. GI and Database is subsequently upgraded to 19.3
  4. When database sales starts, it requests MaxMembers to be set to 2048
  5. This value is persistently stored in the GI
  6. Now either Sales DB is flashed back to 11204 or the upgrade was reverted to 11204
  7. When Sales DB in 11204 version attempts to connect to the 19.3 GI, its Maxmember requested is 512 whereas the persistent value in the GI is 2048

Lmon trace file snippet

Batching factor: cache replay 114 size per lock 72
kjxggin: CGS tickets = 1000
kjxgrdmpcpu: CPU Total 14 Core 7 Socket 2 OCPU 14
kjxgrdmpcpu: High load threshold 17920
2020-06-03 13:52:41.714: [ CSSCLNT]clssgsGroupJoin: bad server response(-24)
kgxgnreg: error: status 1 (0 )
kjxgmjoin: can not join the group (DAALL_DB_XXXXXX) with id 0 (inst 1)
kjxgmjoin: kgxgn error 16
2020-06-03 13:52:41.714585 : IMR recording device closed, terminating IMR
kjfmreg: Joining the cluster failed with err code 16

Cssd trace file snippet

2020-06-15 14:52:31.537 : CSSD:487659264: [ INFO] clssgmCheckGrpAttrCompat: Change was rejected due to incompatible MaxMembers value grock DAALL_DB_XXXXX, ID 219071, current value 2048, requested value 512

As seen above, CSSD rejects the request by the RDBMS instance as the MaxMembers current value is 2048 and the requested value by the older version of the RDBMS is 512

The solution as Mike suggested is already in the latest RU's. Additionally it is possible to bypass this error by shutting the GI on all the nodes.

Somdyuti Paul

Database Management Specialist-Google

4y

Anil Nair- good to know.

Like
Reply
Aritra Kundu

Senior Manager at Oracle

4y

This is addressed in lastest RU, workaround is cold crs restart

JESUS BASTIDAS BRICEÑO

DBA Architect | Senior DBA | Administrador de base de datos Oracle | Database Administrator | SQL SERVER DBA | Azure Database Administrator |53X OCI Certified|8X AZURE Certified|2X GCP|2X AWS

4y

Molto bene. Grazie mille. Benedizioni.

Like
Reply

To view or add a comment, sign in

Insights from the community

Others also viewed

Explore topics