Metadata in H5P (especially the OER Hub)


This post describes our plans for metadata in the first version of the H5P OER Hub. As we get feedback and usage data, we will make adjustments.

H5P’s vision to “Empower everyone to create, share and reuse interactive content” is guiding most of our decisions, and should also direct us when picking metadata for H5P.

The key parts of the vision are the words “everyone” and “interactive content.” We’ll need metadata that works for a wide range of content. Some example usages of H5P include:

  • OER for all levels of formal education.
  • Professional training within companies large and small.
  • Training material for sports and leisure activities like chess, knitting, and kung fu. A significant amount of content has been created for these types of subjects, and the willingness to share is probably significant.
  • Digital marketing and sales; for example, H5P is integrated with Salesforce.

H5P is currently most widely adopted within higher education, and we will aim to ensure that the metadata works well for our largest user group, but will also make sure it works for all the others.

The widely different user groups also imply that we must be careful about requiring too much metadata for content.

Types of metadata needed

Based on interviews with users around the world, we have created personas for potential users of the H5P Hub. Using these personas, we have identified the need for metadata that fulfills the following purposes:

  1. Makes clear which uses of the content are allowed and which are not.
  2. Explains what kind of content it is.
  3. Explains what subject(s) the content covers.
  4. Explains how advanced the audience should be regarding the subject matter in order to successfully consume the content.
  5. Aligns the content to various organizational, national, or other non-global taxonomies.
  6. Indicates the quality of the content, and promotes high-quality content.
  7. Defines the primary language of the content.

1. Rights of use

H5P already has a comprehensive system for collecting and displaying rights of use information and we will continue to use the same system.

2. Content type

More information about the types of content will probably be needed going forward, especially as some of our content types are very flexible. An Interactive Book may, for instance, be a full interactive course or an essay. Initially, however, we will use only the H5P content types to indicate what type of content it is.

In the future, we will consider adding information about educational use and/or learning type, and then populate these fields automatically based on the structure of the content.

3. Disciplines

Authors may pick academic disciplines from the bepress disciplines taxonomy.

For all other uses of H5P, we will encourage using tags as it will be very difficult to create a subject vocabulary that covers all possible subjects H5P is used for.

4. Target audience age/level

We’ll need to allow authors to describe how advanced the target audience should be within the subject(s) covered. For formal education and many other subjects, “typical age” might be a good indicator of the audience’s prior knowledge about the subject. “Type of school” or “school year” may also be good indicators of the audience’s level for formal education, but these work poorly in most other contexts. Some researchers focus on how the target audience is able to apply their knowledge (Bloom’s Taxonomy, for instance), but this will also quickly become too academic to apply to everyone.

For our purposes, we’ll need something simple that users can apply without first reading up on it, and that applies to many different use cases and subjects. We have decided to specify three levels, “Beginner”, “Intermediate” and “Advanced”, to be used in addition to “typical age.” These levels will be more appropriate than “typical age” for on-the-job training and similar uses.

5. Subject

To allow authors to indicate the content’s subject and its alignment with non-global metadata standards (like Common Core), we are exploring ways to let the community create and maintain its own vocabularies for subject metadata. Initially, we’ll offer the ability to “free-tag” the content, which is very flexible.

6. Content quality

The H5P Hub should promote high quality content. To achieve this we intend to create several mechanisms to promote content which is popular and/or from a verified publisher. These mechanisms include: 

  • Use the name of the publisher. Publishers will be able to apply for a verified identity like on Twitter and we’ll give priority to content from verified publishers.
  • Register how often the content has been downloaded.
  • Use reviews / star-ratings (to be added in version 1.1).
  • Manually scan all content for spam and flag content as reviewed once scanned for spam. By default, users will only see “reviewed” content. This process will become more and more automated as usage grows and we gain experience with what sort of spam to look for.
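As a rough illustration of how these mechanisms could combine into a default listing, here is a minimal sketch. The `HubEntry` class, its field names, and the ordering rule (verified publishers first, then download count) are assumptions made for this example; the plan above does not specify how the signals are weighted.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class HubEntry:
    """Hypothetical record for one piece of shared content."""
    title: str
    publisher_verified: bool  # publisher has a verified identity
    downloads: int            # how often the content has been downloaded
    reviewed: bool            # flagged as scanned for spam


def default_listing(entries: List[HubEntry],
                    include_unreviewed: bool = False) -> List[HubEntry]:
    """Hide unreviewed content by default, then rank what remains."""
    visible = [e for e in entries if e.reviewed or include_unreviewed]
    # Assumed ordering: verified publishers first, then most downloaded.
    return sorted(visible, key=lambda e: (not e.publisher_verified, -e.downloads))
```

Star ratings, planned for version 1.1, could later be added as another element of the sort key.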

7. Language

H5P already has a language setting to indicate the content’s language and we will continue to use the same system.

General descriptive information

All content will have a descriptive title, a short and a long description, and multiple screenshots.

List of our metadata categories

Required metadata

The following metadata will be required. All of it will be prepopulated, as authors already fill it in for all content, but it can be changed during the sharing process:

  • Title
  • Rights of use (built in already)
  • Author / publisher
  • Language
  • Whether it has been reviewed (set by us)
  • Content type (automatic)

Optional metadata

  • Typical age (Possible input formats separated by commas: 1, 34-45, -50, 59-)
  • Level (Beginner, Intermediate or Advanced)
  • Academic discipline
  • Subject (free tagging)
  • Short description
  • Long description
  • Screenshots
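The “typical age” input formats listed above could be parsed along the following lines. This is a sketch only: the function name is hypothetical, and it assumes that “1” means a single age, “34-45” a closed range, “-50” any age up to 50, and “59-” age 59 and above.

```python
from typing import List, Optional, Tuple

# (lowest age, highest age); None means the range is open at that end.
AgeRange = Tuple[Optional[int], Optional[int]]


def parse_typical_age(text: str) -> List[AgeRange]:
    """Parse input such as '1, 34-45, -50, 59-' into a list of age ranges."""
    ranges: List[AgeRange] = []
    for part in text.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate empty items such as a trailing comma
        if "-" in part:
            low_text, _, high_text = part.partition("-")
            low_text, high_text = low_text.strip(), high_text.strip()
            if not low_text and not high_text:
                raise ValueError("a range needs at least one bound")
            low = int(low_text) if low_text else None
            high = int(high_text) if high_text else None
        else:
            low = high = int(part)  # single age
        if low is not None and high is not None and low > high:
            raise ValueError(f"lower bound exceeds upper bound: {part!r}")
        ranges.append((low, high))
    return ranges
```

Applied to the example input `"1, 34-45, -50, 59-"`, this yields `[(1, 1), (34, 45), (None, 50), (59, None)]`. Malformed items such as `"abc"` or `"3-4-5"` raise `ValueError`.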

What others are doing

H5P is currently most widely used within higher education, so we have gone through specifications for metadata used within the higher education sector and identified various types of metadata that we will consider adding in the future. Those that stood out the most include educational use (assignment, group work, etc.) and learning type (active, expositive, or mixed, for instance). In some cases the content type may provide the same information, but for general-purpose content types like Interactive Book, Course Presentation, Interactive Video, and others, knowing the content type is not enough.

Time required to go through the content also seems like a strong candidate for future inclusion. We will adjust the metadata as the OER Hub is used and we see what people add, what they search for, and what feedback we get.

What are your thoughts on the initial plan for the OER Hub metadata?

 

I have provided some thoughts on the topic “Rights of use” here: License statement best practices