Digital long-term archiving – what about the future of your data?
Archiving documents over long retention periods places special demands on processes, technologies and file formats. In an overview, we clarify what to look out for in digital long-term archiving. Take a look behind the scenes of digital imperishability today, because: A well-informed life is more relaxed.
The old-fashioned floppy disk illustrates the challenges of long-term archiving (LTA). Floppy disks were still common and widely used 25 years ago, yet today it is hard to find a machine with the right disk drive to read them. And that is before we even consider the file formats from back then and how difficult they might be to read today. In other words, a digital long-term archive is in constant tension with technical developments over time. Since you already use GoBD-compliant archiving with specific retention periods for your digital business documents: What is the difference between this archiving and long-term archiving? What should be taken into consideration for the “well-being” of your bits and bytes over the long term?
What is not considered to be long-term archiving?
When it comes to electronic business records, the statutory retention periods are six and ten years. Even the longer ten-year period does not count as long-term digital archiving. Nevertheless, taxable enterprises are required to archive these records in an audit-compliant manner for these periods, in accordance with the principles of proper accounting (GoBD) applicable in Germany.
Definition: What is considered to be long-term digital archiving?
For certain business documents, the law sets retention periods of 20 or 30 years, and these do qualify as long-term archiving. To clarify:
- A 20-year retention period applies to business records required to calculate input tax and property for own use (see: Art. 70(3) MWSTG [VAT Act])
- A period of 30 years applies to judgments, dunning notices and court documents as well as:
  - Outpatient and inpatient records in hospitals (medical history)
  - Pension fund documents, documents relating to pension provisions, pension documents
  - Radiation protection applications (X-ray treatment as per X-Ray Ordinance [RöV])
This requirement for long-term archiving is clearly aimed at organizations and companies that deal with these matters professionally.
In addition, there are documents that must even be retained for life. This requirement applies to natural persons. Such documents include the following:
- Civil documents such as passports, birth and marriage certificates, certificates of inheritance, divorce certificates
- Pension and social security documents
- Health insurance documents
- Documents on real estate purchases and excerpts from the land register
What are the requirements for long-term archiving?
The long-term archive must keep all data and documents available at all times over long periods, in a form that is readable, true to the original and therefore unaltered. These requirements should not be underestimated, given the 30-year retention periods mentioned above or even the lifelong retention obligation for the data and documents of natural persons.
What distinguishes a digital long-term archive?
- Long-term storage: A long-term archive is designed to securely store information over long periods of time. The goal is the long-term availability and accessibility of archived content.
- Data integrity: Long-term archives ensure the integrity of archived data over time. Mechanisms such as data verification, error correction and regular verification are used to ensure that the data remains unchanged and uncorrupted – i.e. as it was on the first day of storage.
- Long-term readability: Long-term archiving uses open and documented file formats. This approach ensures that the archived data can be read and interpreted in the future. The archive also takes technological developments into account and ensures that data remains accessible on new systems and platforms.
- Metadata management: A long-term archive should contain extensive metadata for the archived documents. Metadata facilitates searching for, identifying and managing archived data over long periods of time.
- Security: Long-term archives protect archived data from loss, damage or unauthorized access. They include security measures such as redundant storage, encryption and access controls to ensure the confidentiality, integrity and availability of data.
- Scalability: The long-term archive should be able to manage and store any amount of data – if only because the amount of information to be archived can grow over time.
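The data integrity point in the list above can be illustrated with a simple fixity check: record a checksum for every file at ingest, then recompute and compare the checksums at regular intervals. The following is a minimal sketch, not a description of any particular archive product. The function names (`sha256_of`, `build_manifest`, `verify_manifest`) and the choice of SHA-256 are assumptions made for illustration.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(archive_dir: Path) -> dict[str, str]:
    """Map each file's relative path to its checksum at ingest time."""
    return {
        path.relative_to(archive_dir).as_posix(): sha256_of(path)
        for path in sorted(archive_dir.rglob("*"))
        if path.is_file()
    }


def verify_manifest(archive_dir: Path, manifest: dict[str, str]) -> list[str]:
    """Return the relative paths that are missing or no longer match."""
    damaged = []
    for relative_path, expected in manifest.items():
        candidate = archive_dir / relative_path
        if not candidate.is_file() or sha256_of(candidate) != expected:
            damaged.append(relative_path)
    return damaged
```

In practice the manifest itself would have to be stored separately and protected against change; otherwise a corrupted manifest could hide corrupted data. A check like this only detects damage. Repairing it requires a redundant copy.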
Digital formats for long-term data storage
The big challenge with file formats for long-term archiving is that they have to survive several generations of hardware and software. In other words, they must remain readable. To ensure this, file formats should be not only widely used but also well-documented. Good documentation is essential because it makes it possible to write applications that can display a given file format.
The following file format has proven to be suitable for long-term digital archiving of text documents:
- PDF/A: the Portable Document Format for archiving. This file format was designed specifically for the long-term archiving of digital documents. It is ISO-standardized, well-documented and holds layout and structure information in a single file. It is based on Adobe’s PDF format from the early 1990s. The PDF Association has been maintaining and improving this format since 2006.
There are also many other formats for long-term archiving, such as SGML, XML, Office Open XML, etc. These are also well-documented and very widely used. The choice depends on the type of information to be archived.
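As a rough illustration of working with PDF/A at ingest, the sketch below reads the conformance part that a file declares about itself. It assumes the PDF/A identification is present as uncompressed XMP metadata under the `pdfaid:part` property, and the function name `claimed_pdfa_part` is hypothetical. Note that this only reads a claim. Confirming that a file actually conforms requires a dedicated PDF/A validator.

```python
import re
from typing import Optional

# XMP can carry the PDF/A part either as an attribute (pdfaid:part="1")
# or as an element (<pdfaid:part>1</pdfaid:part>); accept both spellings.
_PDFA_PART = re.compile(rb"pdfaid:part\s*(?:=\s*[\"'](\d)[\"']|>\s*(\d)\s*<)")


def claimed_pdfa_part(data: bytes) -> Optional[int]:
    """Return the PDF/A part a file declares (1, 2, 3, ...), or None.

    Heuristic only: it checks the PDF header and searches the raw bytes
    for the identification property; it does not validate conformance.
    """
    if not data.startswith(b"%PDF-"):
        return None
    match = _PDFA_PART.search(data)
    if match is None:
        return None
    return int(match.group(1) or match.group(2))
```

A typical use would be `claimed_pdfa_part(Path("invoice.pdf").read_bytes())` with `pathlib.Path`, where the file name is only an example. Files that return `None` could then be routed to a conversion step before they enter the archive.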
Which data media are suitable for long-term archiving?
Commercially available storage media immediately come to mind: hard disks (HD/SSD), USB sticks, optical storage media (CD/DVD), etc. Unfortunately, painful experience has shown that the shelf life of the latter two, USB sticks and optical media, is limited. This means they are not readily suitable for long-term archiving.
Over the past two decades, a number of storage systems developed solely for long-term archiving have become established. The storage media they use are magnetic tapes (in tape drives) and hard disks (rotating and solid state). These systems are set apart by the close integration of the hardware, that is, the data media, with the software. Two of these storage systems are worth mentioning:
- Content-Addressed Storage (CAS): CAS is distinguished by its use of unique identifiers that are derived from the content of a data object rather than from its storage location. This way, CAS systems enable high data integrity and immutability. They also offer efficient storage, rapid data retrieval and scalability. CAS is especially well-suited for long-term archiving of immutable data such as compliance documents, medical records or legal documents.
- Integrated Content-Addressed Storage (iCAS): These storage systems combine content-addressed storage (CAS) with additional functions and extensions for long-term archiving. This way, iCAS systems offer integrated data management and archiving functions, including comprehensive metadata management and an advanced search function. They enable seamless integration with existing IT infrastructures and applications to ensure efficient data access and easy management of archived content. iCAS also provides advanced security features such as encryption and access control to ensure the confidentiality and integrity of archived data.
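The principle behind content addressing can be shown in a few lines: the identifier of an object is a hash of its content, so identical content always receives the same address and any later change to the stored bytes becomes detectable. The class below is a toy in-memory sketch built on that assumption. The name `ContentAddressedStore` and the use of SHA-256 are illustrative and do not describe how any commercial CAS or iCAS product is implemented.

```python
import hashlib


class ContentAddressedStore:
    """Toy in-memory store that addresses objects by their SHA-256 digest."""

    def __init__(self) -> None:
        self._objects: dict[str, bytes] = {}

    def put(self, content: bytes) -> str:
        """Store content and return its address (the hex digest)."""
        address = hashlib.sha256(content).hexdigest()
        # Identical content maps to the same address, so it is stored once.
        self._objects.setdefault(address, content)
        return address

    def get(self, address: str) -> bytes:
        """Return the content for an address, verifying it on the way out."""
        content = self._objects[address]
        if hashlib.sha256(content).hexdigest() != address:
            raise ValueError(f"stored object {address} no longer matches its address")
        return content

    def __len__(self) -> int:
        return len(self._objects)
```

Because the address depends only on the content, an application that holds the address can always check that what it reads back is exactly what it stored. This is the property that makes the approach attractive for immutable records.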
Only long-term data storage protects data from being forgotten and fading away
In a world full of digital data and the seemingly endless expanse of the internet, long-term archiving is a safe haven for valuable information. It protects data from being forgotten and fading away. As a “digital retirement home” for data, a long-term archive ensures that memories can be found and played back at any time. But amid all the bits and bytes, we shouldn’t forget that long-term archiving also takes maintenance. Regularly checking the integrity of data and documents is essential, so it’s a good thing that managed service providers for archiving are happy to take on this job.