How to extract data from a Microsoft Content Management Server (MCMS) database

Mat*_*ser 1 sql-server asp.net sitecore mcms-2000 mcms

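I need to extract a large amount of data (> 1000 pages) from a Microsoft Content Management Server (MCMS) database for use in a Sitecore website.

I can see two main options:

  1. Migrate the data to a new, simplified database and display that information in the new website.

  2. Convert the MCMS solution to SharePoint and use the SharePoint connector module available for Sitecore to display this information.

I would prefer to go down the first route, as there are no plans to use SharePoint to manage data/content in future, and I would rather store this information in a simple SQL Server database so that it is easier to search.

I have had a look at the database in question and think the main tables I am interested in are Node, NodePlaceholder and NodePlaceholderContent, but I am struggling to find as much as I would expect. Can anyone give me some explanation of this database schema? Or tell me whether I should be trying to migrate the data this way at all?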

Mik*_*keM 5

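I have just recently been through a similar process, exporting content pages out of MCMS 2002 (migrating to Wordpress).

I am not saying this is the 100% correct way to get at the data, but it worked for me.

Here is the process I went through to get the page content out of the database.

As you have already seen, the tables that store most of the data are Node and NodePlaceholderContent.

1.) To get an idea of what the Node table contains, you can look at its content grouped by type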

SELECT
    [Type]
    ,CASE [Type] 
        WHEN      1 THEN 'Server'
        WHEN      4 THEN 'Channel'
        WHEN     16 THEN 'Post/Page'
        WHEN     64 THEN 'Resource Gallery'
        WHEN    256 THEN 'Resource Gallery Item (images/documents)'
        WHEN  16384 THEN 'Template Gallery'
        WHEN  65536 THEN 'Template' END as [Description]
    ,COUNT([Type]) as [Count]
FROM        dbo.Node
GROUP BY    [Type]
ORDER BY    [Count] DESC
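If the rows are pulled out of SQL Server and post-processed in a script, the same Type lookup can live in code. The following is only a minimal sketch: the codes and labels are copied from the CASE expression above, while the helper names `describe_node_type` and `count_by_type` are hypothetical and not part of MCMS.

```python
from collections import Counter

# Type codes copied from the CASE expression above; nothing else is assumed
# about the MCMS schema here.
NODE_TYPES = {
    1: "Server",
    4: "Channel",
    16: "Post/Page",
    64: "Resource Gallery",
    256: "Resource Gallery Item (images/documents)",
    16384: "Template Gallery",
    65536: "Template",
}


def describe_node_type(type_code):
    """Return the label for a Node.Type value, flagging codes not listed above."""
    return NODE_TYPES.get(type_code, f"Unknown ({type_code})")


def count_by_type(type_codes):
    """Tally Type values, most frequent first, like the GROUP BY query."""
    counts = Counter(type_codes)
    return [(code, describe_node_type(code), n) for code, n in counts.most_common()]


if __name__ == "__main__":
    # Made-up sample values standing in for the [Type] column.
    for code, label, n in count_by_type([16, 16, 4, 256, 16]):
        print(code, label, n)
```

Codes that are not in the list are reported as unknown rather than guessed at, since the answer only documents these seven values.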

2.) Pages (and posts; posts are covered further down) are Type = 16, but to get pages (and not posts) we need to filter on IsShortcut = 0

SELECT * FROM dbo.Node WHERE [Type] = 16 AND IsShortcut = 0

3.) I only want published pages, so filter on ApprovalStatus = 1

-- Get all published pages
SELECT * 
FROM dbo.Node WHERE [Type] = 16 
AND IsShortcut = 0
AND ApprovalStatus = 1 

4.) Next, determine who created/modified each page (using the username)

-- Get published pages & author/editor
SELECT 
    [page].Id
    ,[page].NodeGuid
    ,[page].Name
    ,[created].Username as 'CreatedBy'
    ,[page].CreatedWhen
    ,[modified].Username as 'ModifiedBy'
    ,[page].ModifiedWhen
FROM        dbo.Node [page]
-- add JOIN on created by user
INNER JOIN  dbo.ClientAccount [created] ON [created].UserId = [page].CreatedByUserId
-- add JOIN on modified by user
INNER JOIN  dbo.ClientAccount [modified] ON [modified].UserId = [page].ModifiedByUserId
WHERE [page].[Type] = 16 
AND [page].IsShortcut = 0
AND [page].ApprovalStatus = 1 

5.) Next, determine where the page sits in the hierarchy by using the Node.ParentGUID column

SELECT 
    [page].Id
    ,[page].NodeGuid
    ,[page].Name
    ,[pageParent].Name as 'ParentName' -- add page parent Name
    ,[created].Username as 'CreatedBy'
    ,[page].CreatedWhen
    ,[modified].Username as 'ModifiedBy'
    ,[page].ModifiedWhen
FROM        dbo.Node [page]
INNER JOIN  dbo.ClientAccount [created] ON [created].UserId = [page].CreatedByUserId
INNER JOIN  dbo.ClientAccount [modified] ON [modified].UserId = [page].ModifiedByUserId
-- add JOIN on Node using ParentGUID
INNER JOIN  dbo.Node [pageParent] ON [pageParent].NodeGUID = [page].ParentGUID
WHERE [page].[Type] = 16
AND [page].IsShortcut = 0
AND [page].ApprovalStatus = 1 

This query showed me that the pages sit in parent nodes named Folders or Archive Folder

6.) Go up another level (get the parent of the parent)

SELECT 
    [page].Id
    ,[page].NodeGuid
    ,[page].Name
    ,[pageParent].Name as 'ParentName'
    ,[pageParent2].Name as 'GrandparentName' -- add parent of parent name
    ,[created].Username as 'CreatedBy'
    ,[page].CreatedWhen
    ,[modified].Username as 'ModifiedBy'
    ,[page].ModifiedWhen
FROM        dbo.Node [page]
INNER JOIN  dbo.ClientAccount [created] ON [created].UserId = [page].CreatedByUserId
INNER JOIN  dbo.ClientAccount [modified] ON [modified].UserId = [page].ModifiedByUserId
INNER JOIN  dbo.Node [pageParent] ON [pageParent].NodeGUID = [page].ParentGUID
-- add another JOIN on Node using ParentGUID (parent of parent)
INNER JOIN  dbo.Node [pageParent2] ON [pageParent2].NodeGUID = [pageParent].ParentGUID
WHERE [page].[Type] = 16
AND [page].IsShortcut = 0
AND [page].ApprovalStatus = 1 

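The parent of the parent is Server (the root level), so my conclusion at this point is that a page's parent is one of:

  • Folders - an active page
  • Archive Folder - a previous version of another page

I only want active pages, so I will only join on the parent where it is Folders

7.) Now, what about the markup? In our MCMS templates there was only one placeholder area. The NodePlaceholder table would identify the names of the placeholders if a template had more than one placeholder area. I only join on NodePlaceholderContent for simplicity.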

SELECT 
    [page].Id
    ,[page].NodeGuid
    ,[page].Name
    /* remove parent names */
    ,[created].Username as 'CreatedBy'
    ,[page].CreatedWhen
    ,[modified].Username as 'ModifiedBy'
    ,[page].ModifiedWhen
    ,html.PropValue as 'HTML' -- add the markup
FROM        dbo.Node [page]
INNER JOIN  dbo.ClientAccount [created] ON [created].UserId = [page].CreatedByUserId
INNER JOIN  dbo.ClientAccount [modified] ON [modified].UserId = [page].ModifiedByUserId
-- change alias to "folders"
INNER JOIN  dbo.Node [folders] ON [folders].NodeGUID = [page].ParentGUID AND [folders].Name = 'Folders'
-- join on PlaceholderContent to get the HTML
-- this table will also have references to any static files contained in the page (such as images) so we filter those out by PropName = 'HTML'
INNER JOIN  dbo.NodePlaceholderContent html ON html.NodeId = [page].Id AND html.PropName = 'HTML' 
WHERE [page].[Type] = 16
AND [page].IsShortcut = 0
AND [page].ApprovalStatus = 1 

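8.) At this point I got a bit stuck trying to determine where the page lives in the system (i.e. its relative path, or the channel it is in). Going back to steps 1 and 2: Type = 16 can be either a post or a page (not the same thing, but they are related). So now we join the page to its post record to determine the path.

After some googling I stumbled upon an excerpt from Microsoft Content Management Server 2002: A Complete Guide, which really helped with the rest (and identified the Node.Type enum)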

SELECT 
    [page].Id
    ,[page].NodeGuid
    ,[page].Name
    ,[post].DisplayName as 'Title' -- add page Title from the post record
    ,[created].Username as 'CreatedBy'
    ,[page].CreatedWhen
    ,[modified].Username as 'ModifiedBy'
    ,[page].ModifiedWhen
    ,html.PropValue as 'HTML'
FROM        dbo.Node [page]
INNER JOIN  dbo.ClientAccount [created] ON [created].UserId = [page].CreatedByUserId
INNER JOIN  dbo.ClientAccount [modified] ON [modified].UserId = [page].ModifiedByUserId
INNER JOIN  dbo.Node [folders] ON [folders].NodeGUID = [page].ParentGUID AND [folders].Name = 'Folders'
INNER JOIN  dbo.NodePlaceholderContent html ON html.NodeId = [page].Id AND html.PropName = 'HTML' 
-- join using followGUID to get the posting
INNER JOIN  dbo.Node [post] ON [post].FollowGUID = [page].NodeGUID
WHERE [page].[Type] = 16
AND [page].IsShortcut = 0
AND [page].ApprovalStatus = 1 
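To make the page/post relationship concrete, here is a minimal sketch that reproduces the FollowGUID join against a tiny in-memory SQLite table. It is an illustration under stated assumptions, not the real MCMS schema: the table holds only the columns used in this answer, the GUIDs and names are made up, the `Type` of the Folders node is left empty because it is not stated here, and `build_demo_db` and `pages_with_posts` are hypothetical helper names.

```python
import sqlite3


def build_demo_db():
    """Create a toy Node table holding one page and the post that points at it."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE Node (
               Id INTEGER PRIMARY KEY, NodeGUID TEXT, ParentGUID TEXT,
               FollowGUID TEXT, Name TEXT, DisplayName TEXT,
               Type INTEGER, IsShortcut INTEGER)"""
    )
    rows = [
        # Id, NodeGUID, ParentGUID, FollowGUID, Name, DisplayName, Type, IsShortcut
        (1, "g-server", None, None, "Server", None, 1, 0),
        (2, "g-folders", "g-server", None, "Folders", None, None, 0),
        (3, "g-company", "g-server", None, "Company", None, 4, 0),
        # The page itself lives under Folders.
        (4, "g-page", "g-folders", None, "about-us", None, 16, 0),
        # The post lives in a channel and follows the page.
        (5, "g-post", "g-company", "g-page", "about-us", "About Us", 16, 1),
    ]
    conn.executemany("INSERT INTO Node VALUES (?, ?, ?, ?, ?, ?, ?, ?)", rows)
    return conn


def pages_with_posts(conn):
    """Join each page to its post via FollowGUID and report the post's channel."""
    sql = """
        SELECT page.Name, post.DisplayName, channel.Name
        FROM Node AS page
        JOIN Node AS post
          ON post.FollowGUID = page.NodeGUID AND post.IsShortcut = 1
        JOIN Node AS channel
          ON channel.NodeGUID = post.ParentGUID
        WHERE page.Type = 16 AND page.IsShortcut = 0
    """
    return conn.execute(sql).fetchall()


if __name__ == "__main__":
    for row in pages_with_posts(build_demo_db()):
        print(row)
```

The page supplies the content and the post supplies the title and the place in the channel tree, which is why the path has to be walked from the post rather than from the page.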

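9.) The final step is to keep walking up the post's parent hierarchy, which results in several LEFT JOINs that climb the ParentGUID chain. This query uses those LEFT JOINs to give a visual representation of the hierarchy.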

SELECT 
    CASE WHEN postParent9.Name IS NULL THEN '' ELSE postParent9.Name + ' > ' END +
    CASE WHEN postParent8.Name IS NULL THEN '' ELSE postParent8.Name + ' > ' END +
    CASE WHEN postParent7.Name IS NULL THEN '' ELSE postParent7.Name + ' > ' END +
    CASE WHEN postParent6.Name IS NULL THEN '' ELSE postParent6.Name + ' > ' END +
    CASE WHEN postParent5.Name IS NULL THEN '' ELSE postParent5.Name + ' > ' END +
    CASE WHEN postParent4.Name IS NULL THEN '' ELSE postParent4.Name + ' > ' END +
    CASE WHEN postParent3.Name IS NULL THEN '' ELSE postParent3.Name + ' > ' END +
    CASE WHEN postParent2.Name IS NULL THEN '' ELSE postParent2.Name + ' > ' END +
    CASE WHEN postParent1.Name IS NULL THEN '' ELSE postParent1.Name + ' > ' END +
    page.Name as [Path]
    ,page.Name + '.htm' as [PageName]
    ,post.DisplayName as [PageTitle]
    ,CASE page.[Type] 
        WHEN      1 THEN 'Server'
        WHEN      4 THEN 'Channel'
        WHEN     16 THEN 'Post/Page'
        WHEN     64 THEN 'Resource Gallery'
        WHEN    256 THEN 'Resource Gallery Item (images/documents)'
        WHEN  16384 THEN 'Template Gallery'
        WHEN  65536 THEN 'Template' END as [Type]
    ,page.CreatedWhen as 'Created'
    ,page.ModifiedWhen as 'Modified'
    ,html.PropValue as 'HTML'
FROM        dbo.Node page
INNER JOIN  dbo.Node folders ON folders.NodeGUID = page.ParentGUID AND folders.Name = 'Folders'
INNER JOIN  dbo.NodePlaceholderContent html ON html.NodeId = page.Id AND html.PropName = 'HTML'
INNER JOIN  dbo.Node post ON post.FollowGUID = page.NodeGUID AND post.IsShortcut = 1
LEFT JOIN   dbo.Node postParent1 ON postParent1.NodeGuid = post.ParentGUID
LEFT JOIN   dbo.Node postParent2 ON postParent2.NodeGuid = postParent1.ParentGUID
LEFT JOIN   dbo.Node postParent3 ON postParent3.NodeGuid = postParent2.ParentGUID
LEFT JOIN   dbo.Node postParent4 ON postParent4.NodeGuid = postParent3.ParentGUID
LEFT JOIN   dbo.Node postParent5 ON postParent5.NodeGuid = postParent4.ParentGUID
LEFT JOIN   dbo.Node postParent6 ON postParent6.NodeGuid = postParent5.ParentGUID
LEFT JOIN   dbo.Node postParent7 ON postParent7.NodeGuid = postParent6.ParentGUID
LEFT JOIN   dbo.Node postParent8 ON postParent8.NodeGuid = postParent7.ParentGUID
LEFT JOIN   dbo.Node postParent9 ON postParent9.NodeGuid = postParent8.ParentGUID
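The nine LEFT JOINs above cap the path at nine levels. As an alternative technique (not what I used), a recursive common table expression can climb the ParentGUID chain to any depth. The sketch below runs against a made-up in-memory SQLite table so that it is self-contained. The column subset, the sample rows and the helper names `build_demo_db` and `post_paths` are assumptions for illustration. On SQL Server the same idea needs version 2005 or later, `WITH` without the `RECURSIVE` keyword, `+` instead of `||`, and most likely a CAST of the path column to a wide nvarchar in both branches so that their types match.

```python
import sqlite3


def build_demo_db():
    """Create a toy Node table: Server > Channels > Company > about-us (post)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE Node (
               NodeGUID TEXT, ParentGUID TEXT, Name TEXT, IsShortcut INTEGER)"""
    )
    rows = [
        ("g-server", None, "Server", 0),
        ("g-channels", "g-server", "Channels", 0),
        ("g-company", "g-channels", "Company", 0),
        ("g-post", "g-company", "about-us", 1),
    ]
    conn.executemany("INSERT INTO Node VALUES (?, ?, ?, ?)", rows)
    return conn


def post_paths(conn):
    """Return (post name, full path) by walking ParentGUID with a recursive CTE."""
    sql = """
        WITH RECURSIVE chain(LeafGUID, ParentGUID, Path, Depth) AS (
            -- anchor: the direct parent of every post
            SELECT post.NodeGUID, parent.ParentGUID, parent.Name, 1
            FROM Node AS post
            JOIN Node AS parent ON parent.NodeGUID = post.ParentGUID
            WHERE post.IsShortcut = 1
          UNION ALL
            -- recursive step: prepend the next ancestor to the path
            SELECT chain.LeafGUID, up.ParentGUID,
                   up.Name || ' > ' || chain.Path, chain.Depth + 1
            FROM chain
            JOIN Node AS up ON up.NodeGUID = chain.ParentGUID
        )
        SELECT post.Name, chain.Path || ' > ' || post.Name
        FROM Node AS post
        JOIN chain ON chain.LeafGUID = post.NodeGUID
        -- keep only the longest chain per post, i.e. the one that reached the top
        WHERE chain.Depth = (SELECT MAX(c2.Depth) FROM chain AS c2
                             WHERE c2.LeafGUID = chain.LeafGUID)
        ORDER BY post.Name
    """
    return conn.execute(sql).fetchall()


if __name__ == "__main__":
    for name, path in post_paths(build_demo_db()):
        print(name, "|", path)
```

This sketch ends the path with the post's own name rather than the page's, and it assumes the ParentGUID chain contains no cycles.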

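As an aside, my task did not involve exporting the resource gallery content (images/documents/etc.), but if you do need those pieces there should be enough information here to give you a good start.

I hope this helps others who are migrating away from MCMS 2002...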