Backup and Disaster Recovery

 Data Backup

  • Data backup is the process of storing additional copies of your data in physical or virtual locations distinct from your data files in storage. Typically, backup data includes all the data documents, media files, configuration and registry files, machine images, etc. required to perform the workload on your server. In essence, any data that you desire to keep can be saved as backup data.
  • The main goal of backup is to create a copy of the data that can be recovered if the primary data fails. Failures include hardware or software faults, data corruption, and human-initiated events such as an attack (virus or malware) or accidental data deletion.
  • Backing up data against possible loss and putting secure mechanisms in place to restore it is known as data backup and recovery. The process copies and preserves data so that it remains available after loss or damage. A backup lets you recover data only as it existed at the time the backup was taken, so changes made afterward are lost. Data backup is one component of disaster recovery and should be included in every disaster recovery plan.
  • Backup copies should be made on a regular, frequent basis for the best results, since this reduces the amount of data lost between backups. When recovering from a backup, the longer the gap between backup copies, the greater the risk of data loss.
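
The link between backup frequency and data loss can be made concrete with a little arithmetic. The following is a minimal sketch, not a prescribed method: the function name data_loss_window and the example dates are assumptions made for illustration. It measures how much work is lost when a failure occurs, given the times at which backups were taken.

```python
from datetime import datetime, timedelta

def data_loss_window(backup_times, failure_time):
    """Return the time between the latest usable backup and the failure.

    Everything changed inside this window is lost after a restore.
    Returns None when no backup was taken before the failure.
    """
    earlier = [t for t in backup_times if t <= failure_time]
    if not earlier:
        return None
    return failure_time - max(earlier)

# Hypothetical failure at 17:30 on an arbitrary day.
failure = datetime(2024, 1, 1, 17, 30)
daily = [datetime(2024, 1, 1, 0, 0)]                        # one backup at midnight
hourly = [datetime(2024, 1, 1, h, 0) for h in range(24)]   # one backup every hour

print(data_loss_window(daily, failure))    # 17:30:00
print(data_loss_window(hourly, failure))   # 0:30:00
```

With a single nightly backup, seventeen and a half hours of changes are lost; with hourly backups, only thirty minutes are.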


Note - Even if you back up your data, you may not be able to restore all of your system's data and settings.

Types of Backups:


1. Full Backup:

  • As the name implies, a full backup backs up every file and folder on the system (all hard drives and more). Because the backup is complete, it takes longer and uses more space than the other backup types, but recovering lost data from it is much faster.

  • The primary advantage of performing a full backup during every operation is that a complete copy of all data is available with a single media set.

  • Because full backups are slow and storage-intensive, they are typically run only periodically. Data centers with a small amount of data may choose to run a full backup daily or even more often in some cases. Typically, backup operations employ a full backup in combination with either incremental or differential backups.
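
As an illustration only, a full backup of a single directory could be sketched with Python's standard tarfile module as shown below. The function name full_backup, the archive naming scheme, and the example paths are assumptions; real backup tools also handle open files, permissions, and verification, which this sketch does not.

```python
import tarfile
from datetime import datetime
from pathlib import Path

def full_backup(source_dir, backup_dir):
    """Archive every file and folder under source_dir into one compressed file."""
    source = Path(source_dir)
    dest = Path(backup_dir)
    dest.mkdir(parents=True, exist_ok=True)

    # A timestamp in the file name keeps successive full backups apart.
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    archive = dest / f"full-{stamp}.tar.gz"

    with tarfile.open(archive, "w:gz") as tar:
        # Adding the top-level directory adds its contents recursively.
        tar.add(source, arcname=source.name)
    return archive

# Example call (hypothetical paths):
# full_backup("/home/user/documents", "/mnt/backup")
```

Because the archive holds a complete copy, restoring needs only this one file, which is why recovery from a full backup is fast.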


2. Differential Backup:

  • These backups begin with a full backup, which saves all of your files. Differential backups are then performed, each saving only the data that has changed since the previous full backup, including any newly created files. This saves a lot of time and resources compared to running a full backup every time. A differential backup also restores faster than an incremental one, although it requires more storage space.

  • A differential backup operation is similar to an incremental the first time it is performed, in that it will copy all data changed from the previous backup. However, each time it is run afterward, it will continue to copy all data changed since the previous full backup. Therefore, it will store more backed up data than an incremental on subsequent operations, although typically far less than a full backup.

  • Differential backups require more space and time to complete than incremental backups, although less than full backups. From these three primary types of backup, it is possible to develop an approach for comprehensive data protection. An organization often uses one of the following backup settings:


Full daily

Full weekly + differential daily

Full weekly + incremental daily
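
To compare the three settings above, the amount of data written each day can be estimated with a simplified model. This sketch assumes a data set of constant size in which a fixed, non-overlapping amount of data changes every day; the function name and the figures (100 GB total, 5 GB changed per day) are illustrative assumptions, not measurements.

```python
def weekly_backup_sizes(total_gb, daily_change_gb, strategy):
    """Return the GB written on each of 7 days; day 0 is the weekly full backup."""
    sizes = []
    for day in range(7):
        if day == 0 or strategy == "full":
            sizes.append(total_gb)                  # complete copy
        elif strategy == "differential":
            sizes.append(daily_change_gb * day)     # all changes since the full backup
        elif strategy == "incremental":
            sizes.append(daily_change_gb)           # changes since yesterday's backup
        else:
            raise ValueError(f"unknown strategy: {strategy}")
    return sizes

for name in ("full", "differential", "incremental"):
    sizes = weekly_backup_sizes(100, 5, name)
    print(name, sizes, sum(sizes))
# full         -> 700 GB written over the week
# differential -> 205 GB written over the week
# incremental  -> 130 GB written over the week
```

Under these assumptions the differential backups grow each day, while the incremental backups stay the same size.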


3. Incremental Backup:

  • An incremental backup is similar to a differential backup, but it includes only the data that has changed since the last backup of any type. The differential backup, on the other hand, includes all data changed since the previous full backup. Although incremental backups need the least storage space, they can take longer to restore, because the last full backup and every subsequent incremental backup must be applied in order during a recovery. However, because they are significantly smaller than full or differential backups, they usually take less time to create.

  • An incremental backup operation will result in copying only the data that has changed since the last backup operation of any type. An organization typically uses the modified timestamp on files and compares them to the last backup timestamp.

  • Backup applications track and record the date and time that backup operations occur to track files modified since these operations. Because an incremental backup will only copy data since the last backup of any type, an organization may run it as often as desired, with only the most recent changes stored.

  • The benefit of an incremental backup is that it copies a smaller amount of data than a full. Thus, these operations will have a faster backup speed and require fewer media to store the backup.
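
The timestamp comparison described above can be sketched as follows. This is a minimal illustration that assumes file modification times are trustworthy; the function name files_changed_since is hypothetical, and real backup applications often use additional signals such as archive bits or change journals.

```python
from pathlib import Path

def files_changed_since(source_dir, last_backup_time):
    """Return the files under source_dir modified after last_backup_time.

    last_backup_time is a POSIX timestamp (seconds since the epoch).
    """
    changed = []
    for path in Path(source_dir).rglob("*"):
        # Compare each file's modified timestamp with the recorded backup time.
        if path.is_file() and path.stat().st_mtime > last_backup_time:
            changed.append(path)
    return sorted(changed)
```

Passing the time of the last backup of any type selects the files for an incremental backup; passing the time of the last full backup selects the files for a differential backup.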


4. Network Backup:

  •  It backs up a file system from one machine onto a backup device connected to another machine. It is referred to as a remote or network backup.

  • Note: A full backup restores fastest but is the slowest to create. An incremental backup is the fastest to create and needs the least storage space, but it is the slowest to restore.
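
The difference in restoration speed comes from how many backup sets must be read. The sketch below is an illustration under stated assumptions: the backup history is a simple list of type names in chronological order, and the function name restore_chain is hypothetical. It returns the positions of the backups needed to restore the latest state.

```python
def restore_chain(history):
    """Return the indices of the backups required to restore the latest state.

    history lists backup types in chronological order, using the strings
    "full", "differential", and "incremental".
    """
    fulls = [i for i, kind in enumerate(history) if kind == "full"]
    if not fulls:
        raise ValueError("cannot restore without a full backup")

    start = fulls[-1]            # every restore begins with the latest full backup
    chain = [start]

    # A differential holds all changes since the full backup,
    # so only the most recent one is needed.
    diffs = [i for i in range(start + 1, len(history))
             if history[i] == "differential"]
    if diffs:
        start = diffs[-1]
        chain.append(start)

    # Every incremental taken after that point must be applied, in order.
    chain.extend(i for i in range(start + 1, len(history))
                 if history[i] == "incremental")
    return chain

print(restore_chain(["full", "differential", "differential", "differential"]))  # [0, 3]
print(restore_chain(["full", "incremental", "incremental", "incremental"]))     # [0, 1, 2, 3]
```

A weekly full backup with daily differentials never needs more than two sets, whereas daily incrementals need the full backup plus one set for every day since.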


5. Mirror Backup:

  • A mirror backup is comparable to a full backup. This backup type creates an exact copy of the source data set, but only the latest data version is stored in the backup repository with no track of different versions of the files. All the different backed up files are stored separately like they are in the source.

  • One of the benefits of mirror backup is a fast data recovery time. It's also easy to access individual backed up files.

  • One of the main drawbacks, though, is the amount of storage space required. With that extra storage, organizations should be wary of cost increases and maintenance needs. If there's a problem in the source data set, such as corruption or deletion, the mirror backup experiences the same problem. As a result, it is best not to rely on mirror backups for all data protection needs and to keep other backup types as well.

  • One specific kind of mirror, disk mirroring, is also known as RAID 1. This process replicates data to two or more disks. Disk mirroring is a strong option for data that needs high availability because of its quick recovery time. It's also helpful for disaster recovery because of its immediate failover capability. Disk mirroring requires at least two physical drives. If one hard drive fails, an organization can use the mirror copy. While disk mirroring offers comprehensive data protection, it requires a lot of storage capacity.
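
A file-level mirror backup can be sketched as below. This is a simplified illustration rather than how any particular product works: the function name mirror_sync is an assumption, the code requires Python 3.8 or later for the dirs_exist_ok argument, and it ignores special cases such as symbolic links.

```python
import shutil
from pathlib import Path

def mirror_sync(source_dir, mirror_dir):
    """Make mirror_dir an exact copy of source_dir, keeping no older versions."""
    source, target = Path(source_dir), Path(mirror_dir)
    target.mkdir(parents=True, exist_ok=True)

    # Copy every file from the source, overwriting what the mirror already holds.
    shutil.copytree(source, target, dirs_exist_ok=True)

    # Remove anything that no longer exists in the source.
    # Reverse-sorted order visits children before their parent directories.
    for path in sorted(target.rglob("*"), reverse=True):
        if not (source / path.relative_to(target)).exists():
            if path.is_dir():
                path.rmdir()      # already emptied by the earlier iterations
            else:
                path.unlink()

# Example call (hypothetical paths):
# mirror_sync("/home/user/projects", "/mnt/mirror/projects")
```

The deletion step is what makes the copy an exact mirror, and it is also why a file deleted or corrupted in the source is lost from the mirror at the next run.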


6. Smart Backup:

  • Smart backup is a backup type that combines the full, differential and incremental backup types with cleanup operations to efficiently manage the backup settings and the free disk space in the destination. The Smart backup type starts with a full backup.

  • The advantage is that there is no need to worry about how many backups will fit on the destination drive or which backup version to clean or merge, as the backup software (Backup4all, in this case) takes care of that.


Backup Devices

You can back up your data to any of the following devices:


1. CD and DVD: Because they have a small capacity, ranging from roughly 700 MB for a CD to a few GB for a DVD, they are used mainly at home or for personal purposes, where users save their own or office-related documents.


2. USB sticks: USB sticks are small in size and low in cost, yet large in storage capacity; sticks of 128 gigabytes or more are readily available. They also offer a good transfer speed.


3. USB Drives: External USB hard drives typically range in size from about 500 GB to 2 TB or more, are compatible with most computers, and normally include backup and recovery software. Many models offer encryption, convenient automatic backups, and a cloud backup option. The cost is comparatively high.


4. Solid-state drives (SSDs): They are more expensive than hard drives, but they're also more reliable, smaller, and faster, and they consume less power. SSDs are ideal for applications where a speed improvement is worthwhile, such as system files or multimedia production.


5. NAS (Network-Attached Storage): A network-attached storage device is a file storage device that delivers centralized, consolidated disk storage to local-area network (LAN) users via a normal Ethernet connection. NAS allows a network with servers to add more hard disk storage capacity without having to shut the servers down for maintenance and updates. You can use it solely for backups, or for file sharing and multimedia streaming as well.


6. Storage Area Network (SAN): A storage area network, or SAN, is a specialized, fast network that gives storage devices network access. Hosts, switches, storage components, and storage devices make up standard SAN configurations. These components are connected to one another via a range of technologies, topologies, and protocols. In simple terms, a SAN is a network of disks that is accessed by a network of servers. SANs have become increasingly popular over the years, and today they are among the most widely used storage technologies in a variety of enterprise computing applications.


A SAN is a form of block-based storage that uses a fast architecture to link servers to their logical unit numbers (LUNs). SAN benefits include speed, scalability, and fault tolerance. A SAN typically uses Fibre Channel connectivity, while NAS typically ties to the network through a standard Ethernet connection. A SAN stores data at the block level, while NAS accesses data as files. A SAN typically appears as a disk to a client OS and exists as its own separate network of storage devices, while NAS appears as a file server. A SAN is associated with structured workloads such as databases, while NAS is generally associated with unstructured data such as video and medical images.


7. Cloud Backup

Cloud backup, also known as online backup or remote backup, is a technique for transferring a copy of digital data (documents, images, multimedia, system files, databases, etc.) to a secondary, off-site location for preservation in case of equipment failure or catastrophe. Google Drive, iCloud, and Government Cloud are examples of today's cloud services.


One advantage of cloud backup is that the cloud service provider manages and configures the high-tech equipment, such as servers, networks, storage, and firewalls, so users don't have to worry about on-site hardware or capital costs. Because it offers affordable, secure, remote data storage that is accessible from anywhere via the Internet (so the data can be restored in the event of a disaster), cloud backup has become one of the most widely used backup systems today. Moreover, as the cloud provider handles everything about the hardware, the user can concentrate solely on the functional aspects of the system and data.


Data Recovery

Data recovery, often known as a restore, is required when data of any sort is no longer readable or has been corrupted by a malicious alteration. Recovery is the act or process of retrieving data following inadvertent loss or corruption. Recovering data can be costly.


Causes of Data Loss:

Businesses might lose data in a variety of ways. Technical faults can sometimes result in data loss that is irreversible. Other times, hackers manage to breach or evade a company's defenses, taking data for their malevolent purposes or draining the company's resources. The following are the most prevalent reasons for data loss:

1. Virus, spyware, or malware attacks

2. Natural calamities

3. Hardware failure

4. Human error

5. Software manipulation or tampering


Backup for Disaster Recovery

  • As stated earlier, data backup is the process of replicating data or files so that they are kept on different storage drives or machines at a specified place. A disaster recovery backup is used to restore those files following a catastrophe. Disaster recovery itself is broader: it describes the strategy and procedures for quickly restoring access to IT resources such as applications and data after an outage.

  • To prevent data loss and keep IT applications running after a major disaster, such as an earthquake, flood, or nuclear attack, that brings down all the hardware infrastructure and systems at a site, there should be a Disaster Recovery Center (DRC) located at least 50 km from the primary data center and in a different seismic zone.

  • Today, cloud backup is also regarded as an appropriate backup scheme for disaster recovery, because a reputable cloud service provider manages disaster recovery on the client's behalf.

  • The Government of Nepal has also set up a DRC in Hetauda. Its primary data center, the Government Integrated Data Center (GIDC), is in Singha Durbar, Kathmandu. The two centers are connected through a high-speed optical fiber link.

