One of our customers had issues over the past several months with Oracle Active Data Guard replication to a standby server. The database on the standby server would intermittently fall out of sync with the database on the primary server, and log files were not consistently shipping to the standby server.
The Oracle database version was 126.96.36.199 running on Oracle Linux 6. The firewall was a Cisco ASA-5585-SSP-40 running ASA version 9.6(4)8.
TNS tracing showed: CORRUPTION DETECTED: In redo blocks starting at block #…
By the time I got involved, the firewall administrators had already implemented all the recommended firewall changes to disable the following:
- SQLNet fixup protocol
- Deep Packet Inspection (DPI)
- SQLNet packet inspection
- SQL Fixup
The following messages appeared in the primary database alert log:
- ORA-16055: FAL request rejected
- ARC6: Standby redo logfile selected for thread 2 sequence 33351 for destination LOG_ARCHIVE_DEST_2
- ARC6: Attempting destination LOG_ARCHIVE_DEST_2 network reconnect (12152)
- ARC6: Destination LOG_ARCHIVE_DEST_2 network reconnect abandoned
The following messages appeared in the standby database alert log:
- CORRUPTION DETECTED: In redo blocks starting at block #…
- RFS: Possible network disconnect with primary database
- Error 1017 received logging on to the standby
- FAL[client, USER]: Error 16191 connecting to …
- ORA-16191: Primary log shipping client not logged on standby
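Because the problem was intermittent, it can help to count how often each of these signatures shows up in an alert log over time. The following is a minimal Python sketch, not the tooling used in this engagement. The patterns are copied from the messages listed above; the names `SIGNATURES` and `count_signatures` are hypothetical, and real alert logs may word these messages differently depending on the database version.

```python
import re
from collections import Counter

# Patterns copied from the alert-log messages quoted in this post.
# The labels on the left are illustrative names, not Oracle terminology.
SIGNATURES = {
    "ORA-16055": re.compile(r"ORA-16055"),
    "ORA-16191": re.compile(r"ORA-16191|Error 16191"),
    "redo corruption": re.compile(r"CORRUPTION DETECTED: In redo blocks"),
    "network disconnect": re.compile(r"Possible network disconnect with primary database"),
    "reconnect abandoned": re.compile(r"network reconnect abandoned"),
    "error 1017": re.compile(r"Error 1017 received logging on to the standby"),
}


def count_signatures(lines):
    """Count how many lines match each known Data Guard error signature."""
    counts = Counter()
    for line in lines:
        for name, pattern in SIGNATURES.items():
            if pattern.search(line):
                counts[name] += 1
    return counts
```

To use it, pass any iterable of lines, for example an open file handle for the alert log (the path depends on your diagnostic destination), and compare the counts for the primary and standby logs.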
The root cause of the problem turned out to be a bug in the Cisco firewall. For reasons unknown, whenever the primary and standby database listeners used port 1521, the firewall ignored the settings the admins had implemented for the Oracle Data Guard connections and reverted to its default settings. As a workaround, we moved the listeners to a different port.
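After moving a listener to a new port, a quick sanity check is to confirm that the port is reachable through the firewall from the other server. The sketch below is one way to do that in Python, using only the standard library; the function name `port_reachable` is hypothetical. It only confirms that a TCP connection can be opened. It does not exercise redo transport, so it would not by itself reveal the kind of inspection problem described above.

```python
import socket


def port_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and completes the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

Run it from the primary against the standby host and the new listener port (and in the reverse direction), substituting your own host name and port.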
If you are experiencing intermittent or hard-to-diagnose database issues in your environment, contact Buda Consulting.