Example of robust hypertext and authentic data

by Gérald Jean Francis Banon
December 2023
Updated in October 2024

Introduction

Definitions of Almost Fully Persistent Hyperlinks and Fully Persistent Hyperlinks are presented, respectively, in [1] and [2].

Persistent Hyperlinks are important because they solve the problem of Web resources relocating.

This note introduces one more hyperlink definition, the so-called Robust Hyperlink, which is the last stage of the proposed advanced hyperlink types. It contributes to solving the problem of the continued existence of Web resources, which is, after the problem of resource relocation, the second and last problem to be solved.

It is not uncommon for a digital object (information item) to be composed of parts of other digital objects distributed across the Web. This way of proceeding guarantees the authenticity of reused data and, consequently, prevents their deliberate or unintentional distortion.

Robust Hyperlinks are especially important for preserving the integrity of such a digital object in the Long Term, which may extend indefinitely.

In other words, a robust implementation of the Single source of truth principle, which provides data that are authentic and referable, must rely on Robust Hyperlinks.

Following the terminology of the Reference Model for an Open Archival Information System (OAIS) [3], this note is the Data Object (Digital Object) of an Archival Information Package (AIP) preserved in an Archive which is a member of a Federation of Archives.

Currently, this Federation has the following members (the first is the one hosting the present HTML page):

Definitions

URLib is the acronym for Uniform Repository for a Library, a computing platform for hosting robust hypertexts in an organized way.

doc is the standard name of a directory containing a hypertext (see illustration in Fig. 1).

Figure 1. A doc directory.


Source: [4]

– A Uniform Repository is a set of 4 successive directories which form a unique persistent path to a doc directory (see illustration in Fig. 2).

Figure 2. Example of a uniform repository.


The repository name is: dpi.inpe.br/ronei/1996/11.20.18.56
Source: [4]
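
As a minimal sketch of the definition above, the following Python code splits a repository name into its four successive directories and builds a path to the doc directory. The function names, the field names and the assumed layout (the doc directory sitting directly under the fourth directory, below a local collection root called col_root) are illustrative assumptions, not part of the URLib platform.

```python
from typing import NamedTuple


class RepositoryName(NamedTuple):
    """The four successive directories of a uniform repository name."""
    first: str
    second: str
    third: str
    fourth: str


def parse_repository_name(name: str) -> RepositoryName:
    """Split a repository name into its four directory components."""
    parts = name.strip("/").split("/")
    if len(parts) != 4 or any(not part for part in parts):
        raise ValueError(f"expected 4 non-empty directories, got {name!r}")
    return RepositoryName(*parts)


def doc_path(col_root: str, name: str) -> str:
    """Build the doc directory path (assumed layout: col_root/<4 dirs>/doc)."""
    return "/".join([col_root.rstrip("/"), *parse_repository_name(name), "doc"])
```

For instance, under these assumptions, doc_path("col", "dpi.inpe.br/ronei/1996/11.20.18.56") yields col/dpi.inpe.br/ronei/1996/11.20.18.56/doc.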

When convenient, a repository is represented as a single directory (see illustration in Fig. 3).

Figure 3. Convention.


A repository R1
Source: [4]

col is the standard name of a directory containing a local collection of uniform repositories stored in an Archive (see illustration in Fig. 4).

Figure 4. A col – local collection – directory.


A local collection col with two repositories R1 and R2.
Source: [4]

– The federation collection is the set of all the federated Archive local collections (see illustration in Fig. 5).

Figure 5. The federation collection.

The set of all the distributed federated Archive local collections forms the overall collection of the Federation.
Source: [4]

A Robust Hyperlink (from a Source Object - SO to a Destination Object - DO, both belonging to a Federation of Archives) is a Fully Persistent Hyperlink (from the SO to the DO, both belonging to a Federation of Archives) [2] whose DO Context Information documents the number of times the Fully Persistent Hyperlink has been activated from the SO.

A Robust Hypertext is a hypertext whose hyperlinks are robust.

Important Observation: Robust Hyperlinks (from a Source Object - SO to a Destination Object - DO, both belonging to a Federation of Archives) can contribute significantly to solving the problem of the continued existence of the DO. Because the DO Context Information records that there exists a digital object, namely the SO, citing the DO, it is possible to disable any attempt to remove the DO and, in this way, to preserve the hyperlink functionality.

Example

This note is an example of a digital object using Robust Hyperlinks to incorporate parts of two other digital objects. All these objects belong to the Federation described in the Introduction and, consequently, the two incorporations below are resolved without the need for the IBI global resolver urlib.net.

  1. First incorporation

    Figure 6. Incorporation of a PDF page with its menu bar.


    Source: [5]

    Incorporation source code: <iframe src="fullypersistenthref/ibi/83LX3pFwXQZeBBx/65ij/vestiges/urlibServicePage1995.pdf?ibiurl.backgroundlanguage=en&shortmenu=yes"></iframe>

    Observation 1, about the source code:

    1. The hyperlink consists of a relative URL: the value of the src attribute doesn't contain the resolver domain name;
    2. The long name fullypersistenthref, which appears at the beginning of the relative URL, has been chosen in the URLib platform to avoid a possible directory name conflict in the doc directory;
    3. The name ibi specifies the identification name space (IBI stands for Internet Based Identifier);
    4. 83LX3pFwXQZeBBx/65ij is the IBI identifier of the Destination object (DO) in the opaque name space;
    5. vestiges is a directory name in the doc directory of the Destination object (DO);
    6. urlibServicePage1995.pdf is the name of the file, in the vestiges directory, to be displayed;
    7. ibiurl.backgroundlanguage=en (optional) ensures that the horizontal bar menu above the displayed PDF will be written in English;
    8. shortmenu=yes is to display a shorter horizontal bar menu.
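
The anatomy listed in Observation 1 can be sketched as a small parser. The Python code below is only an illustration: the names IncorporationLink and parse_incorporation_link are hypothetical, and the parser assumes that an opaque IBI always occupies exactly two path segments, as it does in the examples of this note.

```python
from typing import NamedTuple
from urllib.parse import parse_qs, urlsplit


class IncorporationLink(NamedTuple):
    prefix: str      # reserved leading name, e.g. "fullypersistenthref"
    name_space: str  # identification name space, e.g. "ibi"
    identifier: str  # opaque IBI of the DO, e.g. "83LX3pFwXQZeBBx/65ij"
    file_path: str   # path inside the doc directory of the DO
    options: dict    # query parameters such as shortmenu=yes


def parse_incorporation_link(relative_url: str) -> IncorporationLink:
    """Decompose a relative URL of the kind used in the src attribute."""
    split = urlsplit(relative_url)
    segments = split.path.split("/")
    if len(segments) < 4:
        raise ValueError(f"not enough path segments in {relative_url!r}")
    prefix, name_space = segments[0], segments[1]
    # Assumption: the opaque identifier spans exactly two segments.
    identifier = "/".join(segments[2:4])
    file_path = "/".join(segments[4:])
    # Keep only the first value of each query parameter.
    options = {key: values[0] for key, values in parse_qs(split.query).items()}
    return IncorporationLink(prefix, name_space, identifier, file_path, options)
```

Applied to the src value of the first incorporation, this returns the identifier 83LX3pFwXQZeBBx/65ij, the file path vestiges/urlibServicePage1995.pdf, and the two options ibiurl.backgroundlanguage and shortmenu.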

    Observation 2: The horizontal bar menu above the displayed PDF is part of the display of the DO. It indicates the IBI of the DO (shown as a tooltip text), its state (here the DO is the original), the license (here CC BY-NC-ND), its Metadata and, finally, a link to a list of the files that comprise the DO (one can verify in this list the presence of the file vestiges/urlibServicePage1995.pdf, which is displayed below the horizontal menu). The Context Information in the Allied Materials Area of the Metadata page (see Fig. 7) shows the value of the Citing Item List field. This field is automatically updated and contains the repository name urlib.net/www/2023/12.25.14.57 of the SO (this page). To its right is the counter of the number of times the DO has been accessed from the SO (this page); this count includes the PDF incorporation and any click on the anchor <ibi:83LX3pFwXQZeBBx/65ij> of [5] in the References.

    Figure 7. Snapshot of the Citing Item List of DO (see red arrow).

    Observation 3: The moment this note is loaded in the browser, the relative URL in the src attribute of the iframe tag is activated and the browser rebuilds the full URL, which becomes: fullypersistenthref/ibi/83LX3pFwXQZeBBx/65ij/vestiges/urlibServicePage1995.pdf?ibiurl.backgroundlanguage=en&shortmenu=yes. From the rebuilt URL, the repository name urlib.net/www/2023/12.25.14.57 of the SO is passed along by the resolver to the Archive that owns the DO, which is then able to update the Context Information of the DO with the information that an SO (in this case, the SO with repository name urlib.net/www/2023/12.25.14.57) has requested the DO or part of it.

    Observation 4: Once the Context Information of the DO has been updated with the information that at least one SO has requested the DO or part of it, the URLib platform delete procedure of the DO is automatically disabled, as shown in Fig. 8.

    Figure 8. The Delete Button (see red arrow) is disabled while the DO with IBI 83LX3pFwXQZeBBx/65ij has at least one citing item.

    Observation 5: If, for some reason, the SO (this page) is deleted from the URLib platform, then the value of the Citing Item List field of the DO metadata is automatically updated, removing the SO from the citing item list.
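
The bookkeeping described in Observations 2 to 5 can be pictured with a small in-memory model. The Python sketch below is not URLib code: the class DestinationObject and its method names are hypothetical, and the Citing Item List is assumed to be a simple mapping from the repository name of each SO to its access counter.

```python
from collections import Counter


class DestinationObject:
    """Toy model of a DO and the Citing Item List of its Context Information."""

    def __init__(self, ibi: str) -> None:
        self.ibi = ibi
        # Maps the repository name of each citing SO to its access count.
        self.citing_items: Counter = Counter()

    def record_access(self, so_repository: str) -> None:
        """Register one activation of the hyperlink from the given SO."""
        self.citing_items[so_repository] += 1

    def remove_citing_item(self, so_repository: str) -> None:
        """Drop an SO from the list, e.g. after the SO itself was deleted."""
        self.citing_items.pop(so_repository, None)

    def can_delete(self) -> bool:
        """Deletion is allowed only while no SO cites this DO."""
        return not self.citing_items
```

In this model, a first call to record_access("urlib.net/www/2023/12.25.14.57") makes can_delete() return False, mirroring the disabled Delete Button, and remove_citing_item with the same repository name makes it return True again.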


  2. Second incorporation

    Figure 9. Incorporation of a bar chart and a link to the Metadata of its original digital object.


    Metadata
    Source: [6]

    Incorporation source code for the bar chart: <img src="fullypersistenthref/ibi-/8JMKD3MGPCW/3HHLNUH/thesisVivaPublicationYearBar.jpg">

    Source code to access the Metadata: <a href="fullypersistenthref/ibi-/8JMKD3MGPCW/3HHLNUH:">Metadata</a>

    Observation 1, about the two source codes:

    1. The hyperlinks are relative URLs: the values of the src and href attributes don't contain the resolver domain name;
    2. The long name fullypersistenthref, which appears at the beginning of the relative URL, has been chosen in the URLib platform to avoid a possible directory name conflict in the doc directory;
    3. The name ibi- specifies the identification name space (IBI stands for Internet Based Identifier). The hyphen (-) after the word ibi indicates that the bar chart and the metadata are to be displayed without the horizontal menu bar;
    4. 8JMKD3MGPCW/3HHLNUH is the IBI identifier of the Destination object (DO) in the opaque name space;
    5. thesisVivaPublicationYearBar.jpg in the first source code is the file name to be displayed;
    6. : (colon) after the identifier, in the second source code, indicates that the metadata of the digital object will be returned.

    Observation 2: The moment this note is loaded in the browser, the relative URL in the src attribute of the img tag is activated and the browser rebuilds the full URL, which becomes: fullypersistenthref/ibi-/8JMKD3MGPCW/3HHLNUH/thesisVivaPublicationYearBar.jpg. Because ibi- is used instead of ibi, the repository name urlib.net/www/2023/12.25.14.57 of the SO will NOT be passed along by the resolver to the Archive that owns the DO and, consequently, the Context Information of the DO will NOT be updated as it was when using ibi in the first incorporation. In this case, to ensure that the repository name urlib.net/www/2023/12.25.14.57 of the SO is passed along, at least one click must be made on the anchor <ibi:8JMKD3MGPCW/3HHLNUH> of [6] in the References.

    Observation 3, about the second source code: To program the Metadata access, it is sufficient to append a colon (:) to the opaque IBI of the Destination object (DO). No file name needs to be appended.
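
The conventions of the second incorporation (the ibi- name space and the trailing colon) can be summarized in a short Python sketch. The function name describe_request and the keys of the returned dictionary are hypothetical, and the two-segment opaque identifier is again an assumption based on the examples of this note.

```python
def describe_request(relative_url: str) -> dict:
    """Summarize what a fullypersistenthref relative URL asks for."""
    segments = relative_url.split("?", 1)[0].split("/")
    if len(segments) < 4:
        raise ValueError(f"not enough path segments in {relative_url!r}")
    name_space = segments[1]
    if name_space not in ("ibi", "ibi-"):
        raise ValueError(f"unknown name space {name_space!r}")
    # A colon right after the identifier requests the metadata of the DO.
    wants_metadata = segments[3].endswith(":")
    identifier = segments[2] + "/" + segments[3].rstrip(":")
    return {
        "identifier": identifier,
        "file_path": "/".join(segments[4:]),
        "wants_metadata": wants_metadata,
        # With ibi- the horizontal menu bar is omitted and, as stated in
        # Observation 2, the SO repository name is not passed along.
        "shows_menu_bar": name_space == "ibi",
        "records_citing_item": name_space == "ibi",
    }
```

Under these assumptions, the href value fullypersistenthref/ibi-/8JMKD3MGPCW/3HHLNUH: is reported as a metadata request with an empty file path, no menu bar and no update of the Citing Item List.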

References

[1] BANON, G. J. F. Example of two almost fully persistent HTML hyperlinks. [S.l.] deposited in the URLib collection, 2023. IBI: . Available from: <ibi:QABCDSTQQW/49884CP>.

[2] BANON, G. J. F. Example of two fully persistent HTML hyperlinks. [S.l.] deposited in the URLib collection, 2023. IBI: . Available from: <ibi:QABCDSTQQW/4A86BJH>.

[3] THE CONSULTATIVE COMMITTEE FOR SPACE DATA SYSTEMS (CCSDS). Reference Model for an Open Archival Information System (OAIS) - CCSDS 650.0-M-2. Reston: CCSDS, 2012. 135 p. (CCSDS 650.0-M-2). Available from: <https://public.ccsds.org/pubs/650x0m2.pdf>.

[4] BANON, G. J. F. Uniform repositories for a digital library - URLib. [S.l.] deposited in the URLib collection, 1998. An earlier version of this work has been presented at the VI Seminário sobre Automação em Bibliotecas e Centros de Documentação, September 9-10, 1997, Águas de Lindóia, SP, Brazil. IBI: <83LX3pFwXQZeBBx/aa6dE>. Available from: <ibi:83LX3pFwXQZeBBx/aa6dE>.

[5] BANON, G. J. F. Uniform Repository Service Version 1.1. [S.l.] deposited in the URLib collection, 1998. IBI: <83LX3pFwXQZeBBx/65ij>. Available from: <ibi:83LX3pFwXQZeBBx/65ij>.

[6] PEREIRA, C. M.; BANON, G. J. F. Breve relatório referente ao ano de 2014 com dados oriundos da Memória Científica do INPE. São José dos Campos: INPE, versão: 2014-12-26. IBI: <8JMKD3MGPCW/3HHLNUH>. Available from: <ibi:8JMKD3MGPCW/3HHLNUH>.

[7] COMISSÃO-DE-ESTUDOS ABNT/CB08/SC010/CE70. System for IBI generation. São José dos Campos: Comissão-de-Estudo ABNT/CB08/SC010/CE70, version: 2021-11-14. 48 p. IBI: . Available from: <ibi:J8LNKB5R7W/3NSP3DL>.
 

The objects involved in these hyperlinks should be interpreted as Archival Information Packages (AIPs).

The ibi name space consists of two sub name spaces [7]. Each information item deposited in the URLib platform is assigned two IBI identifiers, one in the repository name space and the other in an opaque name space.