Arquivo.pt preserves online data from European H2020 projects

Arquivo.pt, a service managed by FCCN, the FCT's Scientific Computing Unit, has recently preserved 197 million web files documenting research and development projects funded by the European program Horizon 2020. This preservation effort safeguards about 17 terabytes of information that might otherwise have been lost forever.

After identifying and preserving the websites of research and development projects funded by the European Union under the FP4, FP5, FP6 and FP7 programs (1994 to 2013), Arquivo.pt has now saved valuable online information from the Horizon 2020 program (2014 to 2021) that was at risk of disappearing.

In recent years, the use of websites for documenting research project activities has been increasing. These websites provide relevant scientific information that complements published literature, such as open datasets, presentations at events, or developed software. With the end of the projects, this information was in danger of being irretrievably lost.

Identifying the research projects involved several methodologies and the use of the European Union's open data portal. However, this portal does not provide all the information, and many project records omit the website address. It was therefore necessary to use tools developed by Arquivo.pt to fill in the missing information. For example, the website of the Extended Model of Organic Semiconductors (EXTMOS) project, which was available at extmos.eu, was already inactive. Its content is now fully accessible via Arquivo.pt.
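As a rough illustration of the gap described above, the following Python sketch separates project records that list a usable website from those that do not and would need to be completed by other means. It is not the tooling Arquivo.pt used. The record layout, the field names `acronym` and `website`, the helper names, and the `DEMO-A` and `DEMO-B` entries are all hypothetical; only EXTMOS and extmos.eu come from the text.

```python
from urllib.parse import urlparse


def normalize_website(raw):
    """Return a URL with a scheme, or None if the field is empty or unusable.

    Hypothetical helper: assumes the website field is free text that may be
    missing, blank, or written without a scheme.
    """
    if raw is None:
        return None
    value = raw.strip()
    if not value:
        return None
    if "://" not in value:
        value = "http://" + value  # assume plain HTTP when no scheme is given
    host = urlparse(value).netloc
    if "." not in host:
        return None  # placeholders such as "n/a" do not yield a real host name
    return value


def split_by_website(records):
    """Partition records into (with_site, missing) by their website field."""
    with_site, missing = [], []
    for record in records:
        url = normalize_website(record.get("website"))
        if url:
            with_site.append({**record, "website": url})
        else:
            missing.append(record)
    return with_site, missing


# Illustrative records only; DEMO-A and DEMO-B are made-up placeholders.
projects = [
    {"acronym": "EXTMOS", "website": "extmos.eu"},
    {"acronym": "DEMO-A", "website": ""},
    {"acronym": "DEMO-B", "website": None},
]

with_site, missing = split_by_website(projects)
print(len(with_site), "with a website;", len(missing), "to be completed")
```

Records that end up in `missing` are the ones for which the website would have to be discovered by other means.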

Arquivo.pt provides more information about this work and continues to invite all users to suggest sites for preservation.

Arquivo.pt is a free public service, open to all web users. Every day millions of pages are published on the web, but 80% of this information disappears and becomes inaccessible one year after its publication. The purpose of Arquivo.pt is to counteract this trend and enable the search and retrieval of information from old sites.