License statement best practices

Cross post from h5p/h5p-editor-php-library#124.

I just saw at HFP-3047 that you are collecting links to license statements. There are already some "best practices" that I want to make you aware of, two in particular that both stem from the IIIF Presentation API 3.0, specifically its rights definition.

1. Proper URIs for works in copyright

I saw that you currently don't have a good link for content in copyright; you just link to a Wikipedia page. This is actually what RightsStatements.org is for: it provides a vocabulary to mark the copyright status of works in a machine-readable way. While they have a plethora of statements, I specifically suggest the use of In Copyright.

Note that they write "it is not a License, and should not be used to license your Work." In Copyright is a legal conclusion; it's not possible to choose it as license, but it is the result of not choosing a license.

For this reason I am also not sure whether your distinction between Public Domain Mark 1.0 and Public domain is meaningful: the public domain is not a license, and the Public Domain Mark is only a statement to mark works that are already in the public domain. Maybe drop the entry Public domain? Also, many jurisdictions (Germany, for example) do not let authors give up their copyright, so I'd be very careful with allowing users to place their work in the public domain; it might not be possible after all. It would be better to allow only CC0 1.0, which puts a work in the public domain where that is possible and has a fallback license for the other cases.
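
To make the distinction above concrete, here is a minimal sketch of how a list of rights options could separate entries an author can actively choose from entries that only describe a status. The class `RightsOption`, the field `choosable`, and the function `selectable_for_new_work` are hypothetical names for illustration, not part of H5P. The In Copyright URI assumes the RightsStatements.org vocabulary is the one meant above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RightsOption:
    """One entry in a (hypothetical) rights selection list."""
    label: str
    uri: str
    choosable: bool  # False: a status statement, not something an author can pick


# Illustrative entries only; URIs use the http scheme of the canonical form.
OPTIONS = [
    # Legal conclusion, the result of not choosing a license (assumed vocabulary).
    RightsOption("In Copyright", "http://rightsstatements.org/vocab/InC/1.0/", False),
    # Marks works that are already in the public domain.
    RightsOption("Public Domain Mark 1.0", "http://creativecommons.org/publicdomain/mark/1.0/", False),
    # Dedication with a fallback license where dedication is not possible.
    RightsOption("CC0 1.0", "http://creativecommons.org/publicdomain/zero/1.0/", True),
]


def selectable_for_new_work(options):
    """Return only the options an author can actively apply to a new work."""
    return [option for option in options if option.choosable]


if __name__ == "__main__":
    for option in selectable_for_new_work(OPTIONS):
        print(option.label, option.uri)
```

With a split like this, the status statements could still be shown for existing third-party material while being hidden from the "license my own work" dialog.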

2. Canonical license URIs often start with http, not https

All CC licenses have a canonical URI that starts with http://. This URI is used when describing the license status of works in a machine-readable way, and the scheme is part of it. This can be verified by looking at the RDF index of CC licenses, which contains all canonical URIs. See creativecommons/ and creativecommons/cc.licenserdf#7 for the relevant discussion at CC.

This is the same for the GNU licenses, which also have the http scheme as part of their canonical URI, see gpl-3.0.rdf.

And it's also the same for the RightsStatements.org URIs, except that (a) they display their canonical URIs more prominently in the first place and (b) the use of http is described in more detail in the EDM Mapping Guidelines, section 4.2 + 4.3.

Also, see IIIF/api#1874 if you are unsure about this, because the IIIF (International Image Interoperability Framework) community already had a discussion on this topic.
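
As a rough illustration of the point about schemes, the following sketch rewrites a browser-style CC link into the http form used for machine-readable metadata. The function name `to_canonical_uri` and the host set are assumptions for this example. It only covers creativecommons.org, and it assumes the canonical form has no `www.` prefix, no query string and no fragment. The link shown to users can of course stay https.

```python
from urllib.parse import urlsplit, urlunsplit

# Hosts assumed (for this sketch only) to use http in their canonical URIs.
HTTP_CANONICAL_HOSTS = {"creativecommons.org"}


def to_canonical_uri(url: str) -> str:
    """Return the http-scheme form of a known license URL, else the URL unchanged."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[len("www."):]
    if host in HTTP_CANONICAL_HOSTS:
        # Keep the path, force the http scheme, drop query and fragment.
        return urlunsplit(("http", host, parts.path, "", ""))
    return url


if __name__ == "__main__":
    print(to_canonical_uri("https://creativecommons.org/licenses/by/4.0/"))
```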

And as I'm reading through your list, a third best practice comes to mind, this one for choosing a license for new works:

3. Use SPDX identifiers to identify licenses and state "only" or "later"

With the SPDX License Identifier there exists a standard for referring to licenses, such as the GPLv3, that can be easily recognized by computers. If you browse through the SPDX license list you will notice that many versioned licenses have multiple identifiers. E.g. for the GPL:

  • GPL-1.0-only
  • GPL-1.0-or-later
  • GPL-2.0-only
  • GPL-2.0-or-later
  • GPL-3.0-only
  • GPL-3.0-or-later

It is best practice to clearly state whether only a specific version or also any later version of a license applies. More details here. While H5P users don't need to know about SPDX, they should be presented with SPDX-conformant choices (and you can use it internally to refer to the user's choice). Also note that the RightsStatements.org statements don't have SPDX license identifiers, because they aren't licenses, but only statements about the copyright status (see spdx/license-list-XML#989).
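
The identifiers listed above follow a regular shape, so the user's choice could be stored and checked internally along these lines. This is only a sketch: `parse_versioned_id` and `build_versioned_id` are hypothetical helpers, and the pattern only covers identifiers of the form family-version-suffix as in the GPL list, not the full SPDX license list or SPDX license expressions.

```python
import re

# Matches identifiers shaped like the GPL examples, e.g. "GPL-3.0-or-later".
VERSIONED_ID = re.compile(
    r"^(?P<family>[A-Za-z]+)-(?P<version>\d+\.\d+)-(?P<scope>only|or-later)$"
)


def parse_versioned_id(spdx_id: str):
    """Split an identifier into (family, version, scope) or raise ValueError."""
    match = VERSIONED_ID.match(spdx_id)
    if match is None:
        raise ValueError(f"not a versioned 'only'/'or-later' identifier: {spdx_id!r}")
    return match.group("family"), match.group("version"), match.group("scope")


def build_versioned_id(family: str, version: str, allow_later: bool) -> str:
    """Build an identifier from the user's explicit 'only' vs. 'later' choice."""
    scope = "or-later" if allow_later else "only"
    return f"{family}-{version}-{scope}"


if __name__ == "__main__":
    print(parse_versioned_id("GPL-3.0-or-later"))
    print(build_versioned_id("GPL", "2.0", allow_later=False))
```

Forcing the `allow_later` argument means the "only" vs. "later" decision can never be left implicit.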

The "only" vs. "later" choice is currently rarely made for CC licenses, although CC writes that it is possible.

As a side note, SPDX license identifiers are also used by the larger REUSE Specification, which "defines a standardized method for declaring copyright and licensing for software projects".