1. ASEP Data Repository
ASEP is the institutional data repository of the Czech Academy of Sciences (CAS). It stores data files, together with their descriptive metadata, created by authors from CAS institutes. It supports versioning of data files and linking of data records to bibliographic records. The repository is registered in the re3data registry of research data repositories.
The data stored in the ASEP database are published in the online catalogue and have persistent identifiers: a handle (all records in ASEP) and a DOI (data records stored directly in ASEP). The metadata description of each record is based on the UNIMARC standard and is also compatible with the DataCite metadata schema. The metadata of all records is freely accessible.
2. Dataset Deposit Workflow into the ASEP Repository
All CAS authors can deposit their datasets into the ASEP Repository. Alternatively, the deposit can be made by another authorized person, the institutional ASEP administrator. The author/depositor uploads the metadata records and datasets via an individual account.
- Before creating a data record and storing the dataset, the author must define some basic details of the dataset (structure, file format, license, terms of data publication, etc.) and, if the data are to be published, make sure that the co-authors have given their consent.
- The depositor applies for an individual myASEP account by filling in a registration form and waits for a confirmation e-mail.
- The depositor logs into the myASEP account, creates a record (metadata), attaches a dataset, chooses a license and saves the record.
- Once the depositor decides to publish the data in the online catalogue, he or she moves the record to the institutional data administrator's account for further checking.
- After that, a persistent identifier (handle) is assigned and the dataset becomes accessible in the online catalogue.
- ASEP allows 4 options for accessing stored files:
  1. Publicly accessible: the data files in the online catalogue are accessible to everyone.
  2. Accessible for the institute: the files are accessible from the IP address of the institute that uploaded them.
  3. Publicly accessible with embargo: the files are publicly accessible after a specified period.
  4. Publicly inaccessible (on request): the files are inaccessible. It is possible to request access to the files from the depositor.
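The four access options can be read as a small decision rule. The Python sketch below is illustrative only: the names `AccessLevel` and `can_download`, the representation of the institute network as a CIDR range, and the treatment of the embargo end date as the first public day are all assumptions, not a description of how ASEP implements access control.

```python
from datetime import date
from enum import Enum
from ipaddress import ip_address, ip_network


class AccessLevel(Enum):
    PUBLIC = "publicly accessible"
    INSTITUTE = "accessible for the institute"
    EMBARGO = "publicly accessible with embargo"
    ON_REQUEST = "publicly inaccessible (on request)"


def can_download(level, client_ip, today, institute_net=None, embargo_end=None):
    """Return True if a file with the given access level may be downloaded."""
    if level is AccessLevel.PUBLIC:
        return True
    if level is AccessLevel.INSTITUTE:
        # Assumption: the institute is identified by one network range.
        return (institute_net is not None
                and ip_address(client_ip) in ip_network(institute_net))
    if level is AccessLevel.EMBARGO:
        # Assumption: files become public on the embargo end date itself.
        return embargo_end is not None and today >= embargo_end
    # ON_REQUEST: access has to be requested from the depositor.
    return False
```

For example, `can_download(AccessLevel.EMBARGO, "203.0.113.5", date(2024, 1, 1), embargo_end=date(2025, 1, 1))` evaluates to `False` because the embargo has not yet passed.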
In the ASEP Data Repository, the depositor can choose from the Creative Commons licenses.
Alternatively, the depositor can use a license of his or her own. In this case, the text of the license, in a file named licence.pdf, must be attached to the data files. If a Creative Commons or other open license is applied to the record, the stored files must be publicly available.
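The two licensing rules above (a custom license needs an attached licence.pdf, and openly licensed files must be public) can be checked before a record is submitted. The following is a minimal sketch; the function name `check_license_rules`, the access label `"public"`, and the convention that Creative Commons license names start with "CC" are assumptions made for illustration and are not part of ASEP.

```python
def check_license_rules(license_name, access, attached_files, is_open_license=False):
    """Return a list of problems; an empty list means both rules are met."""
    problems = []
    # Assumption: Creative Commons licenses are named "CC ..." (e.g. "CC BY 4.0").
    is_cc = license_name.upper().startswith("CC")
    if not is_cc and "licence.pdf" not in attached_files:
        problems.append("custom license requires an attached licence.pdf")
    if (is_cc or is_open_license) and access != "public":
        problems.append("openly licensed files must be publicly available")
    return problems
```

A record with a Creative Commons license but institute-only access would be reported as inconsistent, while a custom license with licence.pdf attached passes.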
3. Agreement and licensing
Only employees of CAS institutes that have signed an agreement with the Library of the Czech Academy of Sciences are allowed to deposit data into the ASEP Repository. The list of institutes that have signed the agreement is here.
The dataset deposit agreement for the ASEP Data Repository is here.
The depositor should verify the following before the data deposit:
- Do you have all the rights to make the data available?
- Have you received permission from all other right-holders?
- Do you have data citations ready?
- Have you sufficiently anonymised your data, or obtained explicit consent from any data subjects whose identity could be revealed from the data?
- Are you aware of, and comfortable with, the rights you are passing on to the repository (the Library of the Czech Academy of Sciences)?
The depositor has two options at the licensing stage: either choose a suitable Creative Commons license or, if a special license that is not currently available is needed, contact arl@knav.cz.
4. Linking
There are various possibilities of linking:
Data records can be saved together with datasets in the ASEP Repository, or they can include references to datasets stored in another repository.
Data records can include references to bibliographic records in the ASEP Repository.
Bibliographic records can include references to datasets in the ASEP Repository as well as datasets stored in another repository.
5. Size limits and formats
The current size limit is 50 GB per file. A depositor who wants to deposit larger files to share online should contact arl@knav.cz to discuss the options.
Data files should be stored compressed as a zip archive. License files (licence.pdf), and files containing documentation do not need to be compressed.
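Packing the data files and checking the result against the size limit can be done with the Python standard library. This is a sketch under stated assumptions: the helper name `pack_dataset` is hypothetical, and interpreting "50 GB" as 50 × 10^9 bytes is a guess, since the text does not say whether decimal or binary gigabytes are meant.

```python
import zipfile
from pathlib import Path

# Assumption: 50 GB is read as decimal gigabytes (50 * 10**9 bytes).
MAX_FILE_BYTES = 50 * 10**9


def pack_dataset(data_files, archive_path):
    """Compress data files into one zip archive and check the size limit."""
    archive_path = Path(archive_path)
    with zipfile.ZipFile(archive_path, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for f in map(Path, data_files):
            # Store each file under its bare name, without local directories.
            zf.write(f, arcname=f.name)
    size = archive_path.stat().st_size
    if size > MAX_FILE_BYTES:
        raise ValueError(f"{archive_path.name} is {size} bytes, over the 50 GB limit")
    return archive_path
```

The license file and documentation can be left out of `data_files` and uploaded separately, since they do not need to be compressed. Files with identical names from different directories would collide in this sketch and should be renamed first.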
The depositor should ensure that the files are labelled as explicitly as possible, so that peers can access the data easily. Files should be future-proof and, if possible, not dependent on proprietary software formats. Here is the list of file formats that the ASEP Repository supports and recommends. The depositor may use other formats; most of them are widely used and we will likely be able to preserve them, but we cannot guarantee it.
File formats and naming
Saved files should be appropriately named, so that the name represents the content of the file. File names must not contain diacritics or spaces, should use underscores as separators, and must be no longer than 127 characters.
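The naming rules can be checked, and a conforming name suggested, with a few lines of Python. This is a minimal sketch: the function names are hypothetical, and the set of allowed characters (ASCII letters, digits, underscore, hyphen and dot) is an assumption, because the rules above only exclude diacritics and spaces.

```python
import re
import unicodedata

MAX_NAME_LENGTH = 127
# Assumed allowed characters: ASCII letters, digits, underscore, hyphen and dot.
ALLOWED = re.compile(r"[A-Za-z0-9._-]+")


def suggest_file_name(name):
    """Return a name without diacritics or spaces, using underscores as separators."""
    # Decompose accented letters and drop the combining marks (e.g. "č" -> "c").
    decomposed = unicodedata.normalize("NFKD", name)
    ascii_name = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Replace each run of whitespace with a single underscore.
    ascii_name = re.sub(r"\s+", "_", ascii_name.strip())
    # Drop anything still outside the assumed character set.
    return re.sub(r"[^A-Za-z0-9._-]", "", ascii_name)


def is_valid_file_name(name):
    """Check the length limit and the assumed character set."""
    return len(name) <= MAX_NAME_LENGTH and ALLOWED.fullmatch(name) is not None
```

For example, `suggest_file_name("měření teploty 2023.zip")` returns `"mereni_teploty_2023.zip"`. The sketch does not shorten names that exceed 127 characters; those are only reported as invalid by `is_valid_file_name`.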
To ensure long-term access and usability of the data in the repository, it is advisable to use standard formats for files that guarantee long-term protection. Formats suitable for long-term protection are primarily those that are open, well-mapped and widely supported by software producers. When selecting appropriate formats, it is best to follow community recommendations for good practice and generally accepted standards.
6. Technical specifications and data security
Software
ASEP is based on the "Publication Activity Record" module of the Advanced Rapid Library (ARL) system.
Metadata security is handled at the level of the IRIS database, with access rights controlled from the ARL application.
User data is encrypted in the database, which is automatically backed up according to settings in scheduled tasks. ARL complies with GDPR standards.
The digital content of the repository (files stored on the Content server) is secured at the application level by routing access through the Content server gateway, which is controlled by configurable rules. At the storage level, the content is secured by file system and directory permission settings in the operating system and by the backup policy of the virtual server administrator, which means it is covered by Veeam backups.
Archiving is ensured by keeping the entire version history of stored files on the Content server; files are never deleted.
Data transfer between the client browser and the ARL server is secured by the encrypted HTTPS protocol, with automatic renewal of Let’s Encrypt certificates.
The virtual servers running the live installation and shadow copies (geographically separated databases) are backed up by the standard means of the Academy of Sciences Library backup infrastructure – particularly the VeeamBackup tool.