(registered 2016-11-28, last updated 2016-11-28) Scheme name: pwid Status: Provisional Applications/protocols that use this scheme name: None yet Contact: Name: Eld Zierau Addr.: The Royal Library, Digital Preservation Dept. P.O. Box 2149, 1016 Copenhagen K, Denmark Phone: (+45) 9132 4690 Email: elzi&kb.dk Change controller: Name: Eld Zierau Addr.: The Royal Library, Digital Preservation Dept. P.O. Box 2149, 1016 Copenhagen K, Denmark Phone: (+45) 9132 4690 Email: elzi&kb.dk Description: An RFC for the pwid URI will be submitted before end of 2016 with more information. Below is given a short description of the pwids history and the motivation for creating it. The suggested pwid (Persistent Web IDentifier) URI scheme described here origins from analysis and description in an iPres 2016 paper [IPRES] "Persistent Web References - Best Practices and New Suggestions", and reflects the various comments and inputs that resulted from presentation of this paper. The pwid URI was originally named wpid, but the named has changed as a result of comments to the paper and the presentation (as described below). The motivation for defining a pwid is the growing challenge of references to web resources, which are poorly supported in citation guidelines. Citation guidelines generally don’t cover general and persistent referencing techniques for non-controlled web resources. Controlled web references that are registered by Persistent Identifier systems (like DOI [DOI]) may be covered well. However, an increasing number of references point to resources that only has existed on the web, and are not covered by controlled resources. Such web referencing is highly relevant and crucial for various research fields. For example blogs that shows out to have a historical impact. Today there are different ways to refer to non-controlled references: * By specifying http/https address and date it was visited * Via Citation services constructs citation http/https addresses that are not following a general scheme and that will change or be lost in case the citation service change domain, service * Via Web Archive http/https addresses can only be used for open web archives. Furthermore, Web archive http/https addresses are not following a general scheme, and they will change or be lost in case the web archive service changes domain, or changes path to the web archive service Http/https address and date is in no way persistent, and is the main reason for studies showing that a large percentage of links in research studies are dead after a relatively short period. Citation services can sometimes be used, but responsibility of preservation and collection is not fully clear, and they often use http/https address shorteners for access, which complicates preservation of source and metadata for source even more. Finally, there are the web archives that offer access openly or locally, but where access http/https addresses depends on domains for the web archive as well as differing paths to their access service. The ’pwid’ URI Scheme is another step in facilitate, support, and standardize the problem of persistent web references to resources in web archives. Accessing a referenced web resource will require APIs from web archives no matter whether they are open web archive or not. There are different solutions for resolving of a pwid URI, which needs to be investigated and implemented as use and support of the ’pwid’ URI evolves. According to RFC 3986 [RFC3986], a Uniform Resource Identifier (URI) is "a compact sequence of characters that identifies an abstract or physical resource". The pwid URI Scheme defined in this document identifies web archive resources (abstract resources) in a general, global, stainable, humanly readable and technology agnostic way. An example of such a pwid URI follows: pwid:archive.org:2016-01-22T11.20.29Z_page:http://www.dr.dk In this example the domain of the archive has been used as identifier. However, an archive identifier does NOT need to be a domain. The choice in the example is only used in order to have a short archive identifier that is already associated with the archive. For the sake of usability and sustainability, the definition of the pwid URI scheme is focused on only having the minimum required information in order to precisely identify a resource in an arbitrary web archive. The pwid URI scheme allows identification of web archive resources in a general, global, sustainable, humanly readable and technology agnostic way. No matter whether it will become resolvable, it can be used at any time to identify the web archive resource, as long as the material exists. The pwid is defined as a URI as there are great potentials for making it resolvable. This means it could function as a URN, but is not defined as such as the ambition is to make it resolvable. At the same time the pwid definition can enjoy the same common syntactic, semantic, and shared language benefits that the URI presentation confers. The interest for this new pwid URI scheme has already been shown, a long paper about the invention of the pwid URI "Persistent Web References - Best Practices and New Suggestions" was accepted for the iPres 2016 conference and nominated as best paper. There have been a lot of good feedback to the paper which is incorporated in this scheme specification. The most notable changes are: * The name of the scheme is changed from "wpid" to "pwid" for the following reasons - There was a concern that wpid was interpreted as a PID related to a PID-system as e.g. the DOI. All though PID does not have a precise definition that makes it wrong to call it a "wpid", the danger is that it is confused with PID systems which is not the intension - As the title of the iPres 2016 conference paper suggests, the name "Persistent Web Identifier" is more straightforward than "Web Persistent Identifier" * As commented at the presentation of the paper, there may be other types of web archive materials than just URIs, e.g. snapshots, web recordings and corpus specifications. Therefore the pwid RI scheme has been extended in order to cover such cases as well * In order to increase the readability there are some changes in the syntax of a pwid URI. There is no question about the need for a standard to address web materials meeting precision and persistency issues on par precision in with traditional references for analogue material. The interest for the pwid URI indicates that this is a recognized issue, and that the pwid URI can fill a gap. The pwid URI will benefit from becoming resolvable to some extent, but as a start it has a value even without being resolvable. References: [IPRES] Zierau, E., Nyvang, C., and T. Kromann, "Persistent Web References - Best Practices and New Suggestions", October 2016, . In: proceedings of the 13th International Conference on Preservation of Digital Objects (iPres) 2016, pp. 237-246 [DOI] International DOI Foundation, "The DOI System", 2016, . pwid:archive.org:2016-10-20_22.26.35_page:https://www.doi. org/ [RFC3986] Berners-Lee, T., Fielding, R., and L. Masinter, "Uniform Resource Identifier (URI): Generic Syntax", STD 66, RFC 3986, DOI 10.17487/RFC3986, January 2005, .