Oracle Database is designed to address the causes of both planned and unplanned downtime.
The following excerpt from the Oracle manual summarizes the various causes of system downtime and the corresponding solutions.
1. Causes of Downtime
Category | Outage Type | Description | Examples |
---|---|---|---|
Unplanned | Computer failure | A computer failure outage occurs when the system running the database becomes unavailable because it has shut down or is no longer accessible. | Database system hardware failure; Operating system failure; Oracle instance failure; Network interface failure |
Unplanned | Storage failure | A storage failure outage occurs when the storage holding some or all of the database contents becomes unavailable because it has shut down or is no longer accessible. | Disk drive failure; Disk controller failure; Storage array failure |
Unplanned | Human error | A human error outage occurs when unintentional or malicious actions are committed that cause data within the database to become logically corrupt or unusable. The service level impact of a human error outage can vary significantly depending on the amount and critical nature of the affected data. | Dropped database object; Inadvertent data changes; Malicious data changes |
Unplanned | Data corruption | A data corruption outage occurs when a hardware or software component causes corrupt data to be read or written to the database. The service level impact of a data corruption outage may vary, from a small portion of the database (down to a single database block) to a large portion of the database (making it essentially unusable). | Operating system or storage device driver, host bus adapter, disk controller, or volume manager error causing bad disk reads or writes; Stray writes by operating system or other application software |
Unplanned | Site failure | A site failure outage occurs when an event causes all or a significant portion of an application to stop processing or slow to an unusable service level. A site failure may affect all processing at a data center, or a subset of applications supported by a data center. | Extended site-wide power failure; Site-wide network failure; Natural disaster making a data center inoperable; Terrorist or malicious attack on operations or the site |
Planned | System changes | Planned system changes occur when performing routine and periodic maintenance operations and new deployments. Planned system changes include any scheduled changes to the operating environment that occur outside the organizational data structure within the database. The service level impact of a planned system change varies significantly depending on the nature and scope of the planned outage, the testing and validation efforts made prior to implementing the change, and the technologies and features in place to minimize the impact. | Adding/removing processors to/from an SMP server; Adding/removing nodes to/from a cluster; Adding/removing disk drives or storage arrays; Changing configuration parameters; Upgrading/patching system hardware and software; Upgrading/patching Oracle software; Upgrading/patching application software; System platform migration; Database relocation |
Planned | Data changes | Planned data changes occur when there are changes to the logical structure or physical organization of Oracle database objects. The primary objective of these changes is to improve performance or manageability. | Table definition changes; Adding table partitioning; Creating and rebuilding indexes |
2. Oracle High Availability Solutions for Unplanned Downtime
Outage Type | Oracle Solution | Benefits | Recovery Time |
---|---|---|---|
Computer failures | Fast-Start Fault Recovery | Tunable and predictable cache recovery | Minutes to hours (Footnote 1) |
Computer failures | RAC | Automatic recovery of failed nodes and instances, fast connection failover, and service failover | No downtime (Footnote 2) |
Computer failures | Data Guard | Fast Start Failover and fast connection failover | < 1 minute |
Computer failures | Oracle Streams | Online replica database | No downtime (Footnote 2) |
Storage failures | ASM | Mirroring and online automatic rebalance | No downtime |
Storage failures | RMAN with flash recovery area | Fully managed database recovery and managed disk-based backups | Minutes to hours |
Storage failures | Data Guard | Fast Start Failover and fast connection failover | < 1 minute |
Storage failures | Oracle Streams | Online replica database | No downtime (Footnote 2) |
Human errors | Oracle security features | Restrict user access as prevention | No downtime |
Human errors | Oracle Flashback technology | Fine-grained and database-wide rewind capability | < 30 minutes (Footnote 3) |
Human errors | LogMiner | Log analysis | Minutes to hours |
Data corruptions | HARD | Corruption prevention within a storage array | No downtime |
Data corruptions | RMAN with flash recovery area | Online block media recovery and managed disk-based backups | Minutes to hours |
Data corruptions | Data Guard | Automatic validation of redo blocks before they are applied, execute fast failover to an uncorrupted standby database | < 1 minute |
Data corruptions | Oracle Streams | Online replica database | No downtime (Footnote 2) |
Site failures | RMAN | Fully managed database recovery and integration with tape management vendors | Hours to days |
Site failures | Data Guard | Fast Start Failover and fast connection failover | Seconds to 5 minutes (Footnote 4) |
Site failures | Oracle Streams | Online replica database | Seconds to 5 minutes (Footnote 4) |
Footnote 1 Recovery time consists largely of the time it takes to restore the failed system.
Footnote 2 The database is still available, but the portion of the application connected to the failed system is affected.
Footnote 3 Recovery time for human errors depends primarily on detection time. If it takes seconds to detect a malicious DML or DDL transaction, it typically requires only seconds to flash back the appropriate transactions. Longer detection time usually leads to longer recovery time required to repair the appropriate transactions. An exception is undropping a table, which is literally instantaneous regardless of detection time.
Footnote 4 Recovery time indicated applies to database and existing connection failover. Network connection changes and other site-specific failover activities may lengthen overall recovery time.
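The two Flashback cases mentioned in Footnote 3 (undropping a table and rewinding a table to a point before the erroneous change) can be sketched as a small Python helper that only builds the SQL text. This is a minimal sketch, not the manual's procedure: the function names and the `orders` table are hypothetical, the statements are not executed against any database, and `FLASHBACK TABLE ... TO TIMESTAMP` is assumed to run on a table that already has row movement enabled.

```python
import re
from datetime import datetime

# Hypothetical helper: accept only simple [schema.]table identifiers so that
# nothing unexpected is interpolated into the statement text.
_NAME = re.compile(r"^[A-Za-z][A-Za-z0-9_$#]*(\.[A-Za-z][A-Za-z0-9_$#]*)?$")


def _check_name(table: str) -> str:
    if not _NAME.match(table):
        raise ValueError(f"not a simple table name: {table!r}")
    return table


def flashback_to_before_drop(table: str) -> str:
    """Build the statement that restores a dropped table from the recycle bin."""
    return f"FLASHBACK TABLE {_check_name(table)} TO BEFORE DROP"


def flashback_to_timestamp(table: str, when: datetime) -> str:
    """Build the statement that rewinds a table to its state at `when`.

    Assumption: row movement is already enabled on the table.
    """
    ts = when.strftime("%Y-%m-%d %H:%M:%S")
    return (
        f"FLASHBACK TABLE {_check_name(table)} TO TIMESTAMP "
        f"TO_TIMESTAMP('{ts}', 'YYYY-MM-DD HH24:MI:SS')"
    )


if __name__ == "__main__":
    # "orders" and the timestamp are illustrative values only.
    print(flashback_to_before_drop("orders"))
    print(flashback_to_timestamp("orders", datetime(2014, 7, 23, 9, 30, 0)))
```

The timestamp passed in should be the last moment known to precede the error, which is why detection time dominates the overall recovery time.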
3. Oracle High Availability Solutions for Planned Downtime
Maintenance Type | Oracle Solution | Description | Recovery Time | Considerations |
---|---|---|---|---|
System and hardware upgrades | RAC | To avoid downtime: [...] | No downtime | Check for system restrictions. Check whether the database and clusterware versions are certified with the new system and hardware changes. |
Operating system upgrade | RAC | To avoid application downtime: [...] | No downtime | Check whether the database and clusterware versions are certified for both operating system patch releases. |
Oracle one-off patches | RAC | "One-off" patches (or interim patches) to database software are usually applied to implement known fixes for software problems, or to apply diagnostic patches to gather information on a problem. Such patch application is often performed during a scheduled maintenance outage. Oracle provides the capability to do rolling patch upgrades with RAC with little or no database downtime using the [...] A RAC rolling upgrade enables at least some instances of the RAC installation to be available during the scheduled outage required for patch upgrades. Only the RAC instance that is currently being patched needs to be disabled. The other instances can continue to remain available. This means that the impact on the application downtime required for scheduled outages is further reduced. Oracle's [...] | No downtime | Rolling upgrade is only available for patches that are certified for rolling upgrades. Typically, patches that can be installed in a rolling upgrade include: [...] RAC cannot be used for rolling upgrade of patch sets. |
Storage migration (Footnote 1) | ASM | ASM enables you to add all disks in one storage array and subsequently drop all disks from another array. ASM will automatically rebalance and migrate data to the new storage while the database remains operational. | No downtime | Before removing the source storage array, ensure that the rebalancing is complete. |
System and cluster upgrades | Data Guard | For system upgrades that are not rolling upgradable with RAC due to system restrictions or cluster firmware upgrades that require downtime, leverage Data Guard to switch over to a physical or logical standby database: [...] | Seconds to minutes | For fastest switchover, the standby database should be using real-time apply and synchronized prior to the switchover operation. |
Patchset and database upgrades | Data Guard using SQL Apply | Leverage Data Guard using SQL Apply to upgrade an Oracle database: [...] | Seconds to minutes | Only supported for Oracle database versions 10.1.0.3 and higher. SQL Apply has some data type restrictions. For more information, see Oracle Data Guard Concepts and Administration. |
Database upgrades and platform migration | Transportable tablespace | Transporting a database only requires copying datafiles and integrating the tablespace structural information. Tablespaces can even be transported between databases from different releases. With Oracle database 10g, tablespaces can be transported across platforms. To perform a database upgrade or platform migration: [...] If the target database resides on a separate host but on the same platform, create a physical standby database from the initial primary database co-located with the target database. After a Data Guard switchover, transport the tablespaces from the source to the target without incurring the file transfer time as part of the downtime. (Footnote 2) | Minutes to hours | Transportable tablespace has limitations and restrictions in regard to character sets, opaque types, and system tablespace objects. Unlike previous solutions, the steps are not automated. Transportable tablespaces do provide the following benefits: [...] |
Database upgrades and platform migration | Oracle Streams | Like Data Guard using SQL Apply, Oracle Streams can capture database changes, propagate them to destinations, and apply the changes at these destinations. Oracle Streams is optimized for replicating data and can capture changes locally in the online redo log as it is written. The captured changes can then be propagated asynchronously to replica databases. This optimization can reduce latency and enable the replicas to lag the primary database by no more than a few seconds. Unlike Data Guard using SQL Apply, Oracle Streams enables updates on the replica and provides support for heterogeneous platforms with different database releases. Therefore, Oracle Streams may provide the fastest approach for database upgrades and platform migration. | Seconds to minutes to hours | Oracle Streams also has data type limitations and restrictions, such as for advanced queue and object types. Oracle Streams implementations will require additional investment for setup and configuration since it is designed to be a more flexible architecture. |
Footnote 1 An example is migration from traditional storage to low-cost storage.
Footnote 2 For more information, refer to the best practices white papers available at http://www.oracle.com/technology/deploy/availability/htdocs/maa.htm
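The ASM storage migration row describes adding the new array's disks, dropping the old array's disks, and waiting for the rebalance to finish before removing the source array. A minimal Python sketch of that sequence follows. It only builds SQL text and interprets a query result. The function names, disk paths, and disk names are hypothetical, and combining `ADD DISK` and `DROP DISK` in a single `ALTER DISKGROUP` statement with an optional `REBALANCE POWER` clause is an assumption to verify against the documentation for your release.

```python
import re

# Hypothetical validation: disk group and disk names must be simple identifiers.
_IDENTIFIER = re.compile(r"^[A-Za-z][A-Za-z0-9_$#]*$")


def _identifier(name):
    if not _IDENTIFIER.match(name):
        raise ValueError(f"not a simple identifier: {name!r}")
    return name


def _disk_path(path):
    # Paths are emitted as quoted literals, so reject embedded quotes.
    if "'" in path:
        raise ValueError(f"disk path must not contain a quote: {path!r}")
    return f"'{path}'"


def asm_migration_statement(diskgroup, new_disk_paths, old_disk_names, power=None):
    """Build one statement that adds the new disks and drops the old ones."""
    if not new_disk_paths or not old_disk_names:
        raise ValueError("need at least one disk to add and one disk to drop")
    adds = ", ".join(_disk_path(p) for p in new_disk_paths)
    drops = ", ".join(_identifier(n) for n in old_disk_names)
    stmt = f"ALTER DISKGROUP {_identifier(diskgroup)} ADD DISK {adds} DROP DISK {drops}"
    if power is not None:
        stmt += f" REBALANCE POWER {int(power)}"
    return stmt


# Query to run on the ASM instance; V$ASM_OPERATION lists operations in progress.
REBALANCE_STATUS_QUERY = (
    "SELECT group_number, operation, state, est_minutes FROM v$asm_operation"
)


def rebalance_complete(rows):
    """Treat an empty V$ASM_OPERATION result as 'no rebalance still running'."""
    return len(rows) == 0


if __name__ == "__main__":
    # Illustrative values only.
    print(asm_migration_statement(
        "data", ["/dev/new1", "/dev/new2"], ["DATA_0000", "DATA_0001"], power=4))
    print(REBALANCE_STATUS_QUERY)
```

Polling the status query until `rebalance_complete` returns true corresponds to the table's consideration that the source array should only be removed after rebalancing has finished.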