Data corruption
From open-encyclopedia.com - the free encyclopedia.
Data corruption refers to computer data that, when it arrives at its destination, differs from what was transmitted from the source. This difference often makes the data unusable to those at the destination.
This corruption can have a wide variety of causes. One is an interruption in the transmission of data, which leaves the data incomplete or with holes in it. Environmental conditions can also interfere with data transmission, especially with wireless transmission methods: heavy clouds can sometimes block satellite transmissions, and wireless networks are susceptible to interference from devices such as microwave ovens.
Data corruption can occur during storage as well as during transmission of data.
In cases where data corruption behaves as a Poisson process, in which each individual bit of data independently has some low probability of being corrupted, it can generally be detected by the use of checksums. This relies on the checksum being evaluated over intervals short enough that there is a negligibly small probability of multiple bits being corrupted in a way that has no net effect on the checksum. The longer the checksum, the smaller this probability becomes. The simplest form of checksum is a single parity bit, which can detect a single flipped bit (or any odd number of flipped bits) in a given set of bits (typically a byte), but cannot detect two (or any even number of) bit-flips.
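The parity behaviour described above can be illustrated with a minimal Python sketch. The helper name parity_bit and the sample values are assumptions chosen for illustration; CRC-32 from the standard zlib module stands in for "a longer checksum" and is not a method named in this article.

```python
import zlib

def parity_bit(byte: int) -> int:
    """Return 1 if the byte has an odd number of set bits, else 0."""
    return bin(byte & 0xFF).count("1") % 2

original = 0b01101001                 # illustrative byte with four set bits
stored_parity = parity_bit(original)  # parity recorded alongside the data

one_flip = original ^ 0b00000100      # one bit corrupted
two_flips = original ^ 0b00010100     # two bits corrupted

# A single flip changes the parity; a double flip leaves it unchanged.
single_detected = parity_bit(one_flip) != stored_parity
double_detected = parity_bit(two_flips) != stored_parity

# A longer checksum (here CRC-32) catches the double flip that parity misses.
message = b"account balance: 1000"    # hypothetical payload
crc_before = zlib.crc32(message)
damaged = bytearray(message)
damaged[0] ^= 0b00000011              # flip two adjacent bits in the first byte
crc_detected = zlib.crc32(bytes(damaged)) != crc_before

print(single_detected, double_detected, crc_detected)
```

In this sketch the single flip is detected by parity and the double flip is not, while the CRC-32 comparison flags the two-bit change.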
When data corruption is detected, the data can often be re-transmitted (as occurs in TCP) or re-copied from backups. A special case is RAID disk arrays, where parity bits are commonly computed and stored (summed over the disk set for each given offset) and can be used to reconstruct the lost data when a single, known disk fails.
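The parity reconstruction used by such arrays can be sketched as follows. This is a simplified illustration, not the layout of any particular RAID level: the function name xor_blocks and the three tiny "disks" are assumptions, and the sum over the disk set is taken as a bytewise XOR.

```python
from functools import reduce

def xor_blocks(blocks):
    """Bytewise XOR of equally sized byte blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Three hypothetical data disks, each holding one 4-byte block.
disks = [b"DATA", b"CORR", b"UPT!"]

# Parity block: XOR over the disk set at each byte offset.
parity = xor_blocks(disks)

# Suppose disk 1 is known to have failed. XORing the surviving
# disks with the parity block recovers its contents.
failed = 1
survivors = [d for i, d in enumerate(disks) if i != failed]
rebuilt = xor_blocks(survivors + [parity])

print(rebuilt == disks[failed])
```

The reconstruction works only because the failed disk is known; parity alone cannot say which disk holds the bad data.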
Therefore, if appropriate mechanisms are employed to detect and remedy data corruption, the effects can be minimised. This is particularly important in banking, where an undetected bit-flip in a highly significant position could drastically affect an account balance, and in the use of encrypted or compressed data, where a bit-flip can make an extensive dataset unusable.
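As a rough illustration of why the position of a flipped bit matters, the sketch below flips one low-order and one high-order bit of an integer. The balance value and the choice of bit 30 are assumptions made for the example, not details of any real banking system.

```python
balance = 1000                        # hypothetical account balance

# Flip the least significant bit: the value changes by 1.
low_flip = balance ^ (1 << 0)

# Flip bit 30: the value changes by 2**30.
high_flip = balance ^ (1 << 30)

print(low_flip - balance)             # difference caused by the low-order flip
print(high_flip - balance)            # difference caused by the high-order flip
```

A single flipped bit is the same physical event in both cases; only its position decides whether the error is negligible or drastic.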