Migrating Sitecore Media Library to Content Hub. Part 1: Exporting Assets from Media Library

Introduction

This is the first part of a three-part series on scripted content migration from the Sitecore XM/XP Media Library to Sitecore Content Hub DAM. Having carried out this task for several clients, I've found that although every migration has its own requirements, a consistent and effective approach has always been vital.

The migration process comes down to the following steps:

  1. Generating an export file from the Sitecore Media Library (covered in this post).
  2. Importing this export file into Sitecore Content Hub.
  3. Updating Sitecore content items so that image and file fields point to Content Hub instead of the Media Library.
  4. (Optional) Deleting all items in the Sitecore Media Library.

Generating the Export File

For this process, we use Sitecore PowerShell Extensions (SPE), which must be installed on your Sitecore instance for the scripts to run.

The script provided here is a template that can be adapted to suit your specific migration needs.

How the Export Script works:

  • We loop through all content items in the Media Library path.
  • For each item, we extract essential attributes and generate its URL as it would appear on the CD (or CM, if it’s public). Note that this URL must be publicly accessible to facilitate the import into the Content Hub, as detailed later in this series.
  • All retrieved data is then shown in a list view, from which it can be exported to an Excel or CSV file. This export file is the input for the next step. [Link to be added]
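
The URL step from the list above can be sketched in isolation. This is a minimal illustration in plain PowerShell, not part of the export script itself: the helper name Convert-ToPublicMediaUrl and the sample URL are hypothetical, and it assumes the media URL generated on the CM starts with '/sitecore/shell/', which is the prefix the export script replaces.

```powershell
# Hypothetical helper (not an SPE or Sitecore API): swaps the '/sitecore/shell/'
# prefix of a media URL for the public site root.
function Convert-ToPublicMediaUrl {
    param(
        [Parameter(Mandatory)] [string] $MediaUrl,
        [Parameter(Mandatory)] [string] $SiteUrl
    )

    # Make sure the site root ends with exactly one slash
    $root = $SiteUrl.TrimEnd('/') + '/'

    # Anchor the pattern so only a leading prefix is replaced
    return ($MediaUrl -replace '^/sitecore/shell/', $root)
}

# Illustrative values only
Convert-ToPublicMediaUrl -MediaUrl '/sitecore/shell/-/media/Images/logo.ashx' -SiteUrl 'https://www.example.com'
# → https://www.example.com/-/media/Images/logo.ashx
```

Anchoring the pattern with ^ is a small safety measure: it keeps the replacement from touching the same text if it ever appears later in the URL.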

A few things to note:

  • SitecoreItemId, the Sitecore item ID, is crucial for the third step of the migration, as explained in Parts 2 and 3 of this series. [Link to be added]
  • SitecorePath, the item's path in the Media Library, is often carried over to Content Hub because it helps in understanding how an asset was organized before the migration. It typically mirrors the asset's location in the Sitecore content tree.
  • These two custom fields must be added to the M.Asset schema in the Content Hub for successful import, which we will explore in Part 2. [Link to be added]
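
To make the SitecorePath value concrete, here is a small sketch of how the folder path can be derived from an item's full path. The function name Get-MediaSubPath and the sample path are hypothetical, and the sketch assumes you want the folder path relative to the Media Library root with its trailing slash kept.

```powershell
# Hypothetical helper: derives the folder path stored in SitecorePath
# from an item's full Sitecore path and its name.
function Get-MediaSubPath {
    param(
        [Parameter(Mandatory)] [string] $FullPath,
        [Parameter(Mandatory)] [string] $ItemName
    )

    # Drop the Media Library root from the start of the path
    $relative = $FullPath -replace '^/sitecore/media library/', ''

    # Drop the item name from the end only; escape it because -replace takes a regex
    $folder = $relative -replace ([regex]::Escape($ItemName) + '$'), ''

    return $folder.Trim()
}

# Illustrative path: the item 'News' sits in a folder that is also called 'News'
Get-MediaSubPath -FullPath '/sitecore/media library/Images/Project/Newsroom/News/News' -ItemName 'News'
# → Images/Project/Newsroom/News/
```

Escaping and anchoring the item name matters when a folder shares part of its name with the item: an unanchored replace would also strip "News" out of "Newsroom".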

PowerShell Script

# Your public site Url - it has to be public in order to be visible to Content Hub, so it can read assets from it
$siteUrl = 'http://[YOUR SITE URL]/'
# The lifecycle status determines whether assets show up on the Assets, Review or Create pages in Content Hub
$lifeCycleStatus = 'M.Final.LifeCycle.Status.Approved'
$mediaItems = [System.Collections.Generic.List[psobject]]::new()

# Change the directory to the Media Library path
cd 'master:/media library/Images/Project/Newsroom'

# Get every asset in the Media Library (an item that is not a folder) recursively.
$itemsToProcess = Get-ChildItem -Recurse . | Where-Object { $_.TemplateName -ne "Media folder" }

if ($null -ne $itemsToProcess) {
    $itemsToProcess | ForEach-Object {
        if ($_.Paths -and $_.Paths.FullPath -and $_.ID) {
            $item = $_
            # Fetch the item in each language
            $_.Languages | ForEach-Object {
                $li = Get-Item -Path "master:" -ID $item.ID -Language $_.Name
                if ($null -ne $li) {
                    # Do some URL massaging to turn it into a public URL of the asset
                    $url = [Sitecore.Resources.Media.MediaManager]::GetMediaUrl($li) -replace '/sitecore/shell/', $siteUrl
                    # Folder path relative to the Media Library root; the item name is escaped because -replace takes a regex
                    $path = $li.Paths.FullPath -replace '^/sitecore/media library/', '' -replace ([regex]::Escape($li.Name) + '$'), '' -replace '//', '/'
                    $extension = $li["Extension"]
                    $itemName = $li.Name

                    if ($itemName.EndsWith(" " + $extension)) {
                        $itemName = $itemName.Substring(0, $itemName.Length - $extension.Length - 1)
                    }

                    # Example of deriving a custom field for the migration file (here, a SKU embedded in the item name)
                    # $sku = ""
                    # if ($li.Name -match '(ca|us)-\d+-') {
                    #     $sku = $Matches[0].Substring(3, $Matches[0].Length - 4)
                    # }

                    # The below fields made sense in this particular case. Your choice of fields to include might be different
                    $title = $li["Alt"]
                    if ([string]::IsNullOrEmpty($title)) {
                        $title = $itemName -replace '-', ' '
                    }

                    if ($title.EndsWith(" " + $extension)) {
                        $title = $title.Substring(0, $title.Length - $extension.Length - 1)
                    }

                    if (-not [string]::IsNullOrEmpty($extension)) {
                        $fileName = $itemName + "." + $extension
                    } else {
                        $fileName = $itemName
                    }

                    $mediaItems.Add([pscustomobject]@{
                        ItemName = $itemName
                        SitecoreItemId = $li.ID
                        FileName = $fileName
                        SitecorePath = $path.Trim()
                        Title = $title
                        Url = $url
                        Updated = $li.Updated
                        LocalizationToAsset = "M.Localization." + $li.Language
                    })
                }
            }
        }
    }

    # Display the media items in a list view
    $mediaItems | Show-ListView -Title "Media Item URLs" -PageSize 1000 -Property @{
        Name = "FileName"; Expression = { $_.FileName }
    },
    @{
        Name = "SitecoreItemID"; Expression = { $_.SitecoreItemId }
    },
    @{
        Name = "SitecorePath"; Expression = { $_.SitecorePath }
    },
    @{
        Name = "Title"; Expression = { $_.Title }
    },
    @{
        Name = "File"; Expression = { $_.Url }
    },
    @{
        Name = "Description"; Expression = { $_.Description } # Note: not populated above; the column stays empty unless you add a Description property to the object
    },
    @{
        Name = "LocalizationToAsset"; Expression = { $_.LocalizationToAsset }
    },
    @{
        Name = "FinalLifeCycleStatusToAsset"; Expression = { $lifeCycleStatus }
    }
}
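
The script above shows the results in a list view. If you would rather write the file directly, the standard Export-Csv cmdlet can do it. The following is a minimal sketch under stated assumptions: the sample row stands in for the $mediaItems list built by the script, all of its values are illustrative, and the output path is a hypothetical location that you should adjust.

```powershell
# Stand-in for the $mediaItems list built by the export script (illustrative values only)
$mediaItems = [System.Collections.Generic.List[psobject]]::new()
$mediaItems.Add([pscustomobject]@{
    FileName            = 'logo.png'
    SitecoreItemId      = '{00000000-0000-0000-0000-000000000000}'
    SitecorePath        = 'Images/Project/Newsroom/'
    Title               = 'logo'
    Url                 = 'https://www.example.com/-/media/Images/Project/Newsroom/logo.ashx'
    LocalizationToAsset = 'M.Localization.en'
})

$lifeCycleStatus = 'M.Final.LifeCycle.Status.Approved'

# Hypothetical output location; pick any folder the process can write to
$exportPath = Join-Path ([System.IO.Path]::GetTempPath()) 'media-library-export.csv'

# Select the same columns as the list view and write them to a CSV file
$mediaItems |
    Select-Object FileName, SitecoreItemId, SitecorePath, Title,
        @{ Name = 'File'; Expression = { $_.Url } },
        LocalizationToAsset,
        @{ Name = 'FinalLifeCycleStatusToAsset'; Expression = { $lifeCycleStatus } } |
    Export-Csv -Path $exportPath -NoTypeInformation -Encoding UTF8
```

The calculated properties rename Url to File and add the lifecycle status as a constant column, mirroring the columns defined for the list view.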

Useful Links