What is erasure coding in Ceph?
Erasure coding is a method of storing an object durably in the Ceph storage cluster: the erasure code algorithm breaks the object into data chunks (k) and coding chunks (m), and stores those chunks on different OSDs. For example, 3 data chunks and 2 coding chunks use about 1.67x the storage space of the original object, since (3 + 2) / 3 ≈ 1.67.
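The space cost of a k/m layout follows from the ratio (k + m) / k. The snippet below is a minimal sketch of that arithmetic; the function name `storage_overhead` is illustrative and is not part of Ceph.

```python
from fractions import Fraction


def storage_overhead(k: int, m: int) -> Fraction:
    """Raw space consumed per unit of original data: (k + m) / k."""
    if k <= 0 or m < 0:
        raise ValueError("k must be positive and m must be non-negative")
    return Fraction(k + m, k)


# 3 data chunks + 2 coding chunks: 5 chunks stored for every 3 chunks of data.
ec_overhead = storage_overhead(3, 2)
print(f"k=3, m=2: {float(ec_overhead):.2f}x")

# For comparison, keeping 3 full copies behaves like k=1, m=2.
replica_overhead = storage_overhead(1, 2)
print(f"3 copies: {float(replica_overhead):.2f}x")
```

Both layouts tolerate the loss of two chunks, which is why erasure coding is usually described as the more space-efficient option.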
How does erasure code work?
Erasure coding works by splitting a unit of data, such as a file or object, into multiple fragments (data blocks) and then creating additional fragments (parity blocks) that can be used for data recovery. If some fragments are lost, for example because a drive fails, the parity fragments can be used to rebuild the data unit without data loss.
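The split-and-rebuild idea can be sketched with the simplest possible code: a single XOR parity fragment (m = 1), which can recover any one lost fragment. This is an illustration only; production systems such as Ceph typically use Reed-Solomon style codes that support m > 1, and the helper names `split` and `xor_parity` are made up for this example.

```python
def split(data: bytes, k: int) -> list[bytes]:
    """Split data into k equal-size fragments, zero-padding the tail."""
    size = -(-len(data) // k)  # ceiling division
    padded = data.ljust(size * k, b"\0")
    return [padded[i * size:(i + 1) * size] for i in range(k)]


def xor_parity(fragments: list[bytes]) -> bytes:
    """XOR equal-length fragments together, byte by byte."""
    parity = bytearray(len(fragments[0]))
    for frag in fragments:
        for i, byte in enumerate(frag):
            parity[i] ^= byte
    return bytes(parity)


data = b"erasure coding in ceph"
fragments = split(data, k=3)    # data blocks
parity = xor_parity(fragments)  # one parity block

# Simulate losing fragment 1, then rebuild it from the survivors plus parity.
survivors = [f for i, f in enumerate(fragments) if i != 1]
rebuilt = xor_parity(survivors + [parity])
assert rebuilt == fragments[1]
```

XOR-ing the surviving fragments with the parity cancels everything except the missing fragment, which is why one parity block covers exactly one loss.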
What are the requirements of erasure coding?
Requirements for erasure coding
- Objects larger than 1 MB in size.
- Long-term or cold storage for infrequently retrieved content.
- High data availability and reliability.
- Protection against complete site and node failures.
- Storage efficiency.
What is meant by erasure coding?
In coding theory, an erasure code is a forward error correction (FEC) code under the assumption of bit erasures (rather than bit errors), which transforms a message of k symbols into a longer message (code word) with n symbols such that the original message can be recovered from a subset of the n symbols.
What is the meaning of the shingled erasure code?
Shingled Erasure Code (SHEC or original SHEC) [1] is a recovery-efficient and highly configurable erasure code.
Can SHEC be extended to multiple SHEC (mSHEC)?
Yes. SHEC has been extended to multiple SHEC (mSHEC), whose layout is automatically combined from several original SHEC layouts according to the durability and space efficiency each user specifies. As a result, recovery efficiency improves over the original SHEC. Takeshi Miyamae (Fujitsu Laboratories Ltd.)
What do k, m, and c mean in SHEC?
SHEC(k,m,c) denotes a SHEC layout with k data chunks, m parity chunks, and durability estimator c. The durability estimator is the average number of parity chunks that cover each data chunk. SHEC has several advantages.
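To make the durability estimator concrete, the sketch below computes c directly from its definition for a given coverage map. The layout used here (k = 6, m = 3, each parity chunk covering a window of 4 data chunks) is a hypothetical example chosen for illustration; it is not claimed to be the layout that the SHEC algorithm itself would generate, and `durability_estimator` is an illustrative name.

```python
def durability_estimator(k: int, coverage: list[set[int]]) -> float:
    """Average number of parity chunks that cover each data chunk.

    coverage[j] is the set of data chunk indices used to compute parity j.
    """
    counts = [sum(d in cov for cov in coverage) for d in range(k)]
    return sum(counts) / k


# Hypothetical shingled layout: overlapping windows of 4 data chunks.
k = 6
coverage = [
    {0, 1, 2, 3},  # parity chunk 0
    {2, 3, 4, 5},  # parity chunk 1
    {4, 5, 0, 1},  # parity chunk 2
]
m = len(coverage)
c = durability_estimator(k, coverage)
print(f"SHEC({k},{m},{c:g})")
```

Because each parity chunk reads only part of the data chunks, rebuilding a lost chunk touches fewer chunks than a code in which every parity covers all k data chunks.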