A meaningful way to migrate H5P materials from Drupal 7 to Drupal 8+

Is there a meaningful way to migrate H5P materials from a Drupal 7 portal to a Drupal 8+ portal (or what would be the best way to do it that ensures no materials end up broken)? This includes moving content from one instance to another one that already has different data in it.

The task at hand is to move about 6300 H5P materials from a portal running Drupal 7 to another one that is running Drupal 8+. The general logic of a material is the same as it was in Drupal 7:

  • material is a Node
  • material has an H5P field that only allows a single value

We do have access to both instances on the server level, which grants us full access to everything, including writing custom commands and the like.

There are currently three approaches that we've considered:

  1. Using a drush command to somehow add the data directly into the database. This leaves open the question of proper statistics about used libraries and much more that is normally handled by the module codebase.
  2. Using a drush command to create materials from the contents of the .h5p packages directly, as H5P-specific entities with corresponding material Nodes for them. This leaves open questions about adding missing content libraries (sometimes older versions of them) and probably something else.
  3. Using a utility like Puppeteer to upload the packages through the interface and let the system handle all the logic, upgrades and everything else.

We will have to keep the authors and other data for the materials, but that part is doable; the main issue is moving the H5P portion of the data.

The third approach seems to be the most promising, although it will take a lot of time to complete and involve plenty of trial and error to get right. The first one seems to be the quickest, but it also has the potential to create the most issues. The second one would take care of most of the issues, but we're not sure how a package could be processed and fed into the H5P entity creation process as if it had been uploaded through the creation form (the upload process does some additional checking and updates the libraries to the latest versions).
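For context on the third approach: if the upload were scripted over plain HTTP rather than through a real browser, the core of each upload is a multipart/form-data POST of the .h5p file. A minimal sketch of building such a request body follows; the form field name is an assumption for illustration, not an actual Drupal form detail:

```python
import uuid

def build_multipart(field_name: str, filename: str, payload: bytes):
    """Build a multipart/form-data body and the matching Content-Type header."""
    boundary = uuid.uuid4().hex
    body = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field_name}"; '
        f'filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n"
        "\r\n"
    ).encode() + payload + f"\r\n--{boundary}--\r\n".encode()
    return body, f"multipart/form-data; boundary={boundary}"

# "files[h5p]" is a hypothetical field name; the real form would need inspection.
body, ctype = build_multipart("files[h5p]", "material.h5p", b"PK...")
```

A real browser (Puppeteer) still has the advantage that form tokens, redirects and JS-driven validation are handled for free.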

There is also another issue that has come up during the manual uploading of .h5p packages: a Gateway Timeout (504) response. I assume that this is related to checking with the external H5P Hub service to determine the latest content library versions, or something of that nature. Is there a way to solve that one?
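So far the only workaround we can see for the 504 is raising the timeouts on the web server and PHP side, so the proxy does not give up before H5P finishes its processing. A minimal sketch, assuming an nginx + PHP-FPM stack (the values are illustrative, not tuned):

```nginx
# nginx: give PHP-FPM more time before answering with 504
fastcgi_read_timeout 300s;
# if nginx proxies to another web server instead of PHP-FPM directly:
proxy_read_timeout 300s;
# and in php.ini / the FPM pool config:
#   max_execution_time = 300
```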

What would be the best way to get that done?


Hi pjotr,

I am aiming for the first approach. Which one did you use?


As it was impossible to make sure that everything would be as it should with the first two approaches, we decided to go with the third one and emulate the manual upload of materials.

We used a data file with titles and other metadata from the D7 system, placed the downloadable packages in the local file system and used a script (a few scripts, to be honest) to upload the materials to the D9 system. The solution tracked progress, captured and reported issues and produced a final report based on the results of the individual uploads. It did take a long time to complete the process and required a lot of testing and multiple rounds of full uploads to determine all the issues and iron them out, but the end result was something along the lines of 6.1k total materials with 11 failures. Most of those failures were related to errors on the server side and database deadlocks caused by multiple simultaneous uploaders.
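The driver logic behind those scripts can be sketched roughly as follows. This is an illustrative reconstruction, not the actual scripts we ran: `uploader` stands in for the real upload step (the browser automation against the D9 form), and the retry count is an arbitrary example value:

```python
def run_batch(packages, uploader, retries=2):
    """Upload every package, retrying transient failures, and build a report.

    `packages` is an iterable of (path, metadata) pairs; `uploader` is the
    function that performs one real upload (e.g. drives the D9 upload form).
    """
    report = {"ok": [], "failed": []}
    for path, meta in packages:
        last_err = None
        for _attempt in range(retries + 1):
            try:
                uploader(path, meta)
                report["ok"].append(str(path))
                last_err = None
                break
            except Exception as err:  # e.g. 504 responses or DB deadlocks
                last_err = err
        if last_err is not None:
            report["failed"].append({"path": str(path), "error": str(last_err)})
    return report
```

In practice the progress also has to be persisted between runs, so that an interrupted batch can resume without re-uploading materials that already succeeded.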

This is the only solution where the H5P codebase handles the process in full, making sure that everything is as it should be, including migration to the latest versions of content types (note that H5P content types can only use JS files to define their migrations). This was the main reason to scrap a partial solution based on one of the other approaches and focus on mimicking the manual upload process.