Unserializing PHP serialized data in T-SQL

mtt*_*son 5 php t-sql serialization magento

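I am trying to extract gift card codes from Magento orders. Some other code uses the Magento API to retrieve order information from Magento as XML and inserts that XML into MS SQL Server records. Using T-SQL, I can parse the XML retrieved from the Magento API with the XML functions and get nearly everything I need, but the only place the actual card codes are stored is the gift_cards field, which happens to be a PHP serialized string.

Examples: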

a:1:{i:0;a:5:{s:1:"i";s:1:"1";s:1:"c";s:12:"00XCY8S3ZXCU";s:1:"a";d:119;s:2:"ba";d:119;s:10:"authorized";d:119;}}
a:3:{i:0;a:5:{s:1:"i";s:2:"10";s:1:"c";s:12:"045EMJJWRCF1";s:1:"a";d:100;s:2:"ba";d:100;s:10:"authorized";d:100;}i:1;a:5:{s:1:"i";s:2:"11";s:1:"c";s:12:"06DUQ7Z5GVT7";s:1:"a";d:101;s:2:"ba";d:101;s:10:"authorized";d:101;}i:2;a:5:{s:1:"i";s:2:"12";s:1:"c";s:12:"07A6MRYW511J";s:1:"a";d:102;s:2:"ba";d:102;s:10:"authorized";d:102;}}

The gift card codes are the values stored under the "c" key in each inner array, e.g.: 00XCY8S3ZXCU 045EMJJWRCF1 06DUQ7Z5GVT7 07A6MRYW511J
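For this particular payload a general parser is not strictly required to reach the codes. The following is a minimal sketch, not a general solution: it assumes the key is always serialized exactly as `s:1:"c";`, that it is immediately followed by a string value of the form `s:<length>:"<value>";`, and that no other string value happens to contain that marker text. The variable names and the `@codes` table are illustrative only.

```sql
-- Sketch: collect every value stored under the "c" key by scanning for
-- the marker  s:1:"c";s:  and then reading <length> characters of value.
declare @s varchar(max) = 'a:1:{i:0;a:5:{s:1:"i";s:1:"1";s:1:"c";s:12:"00XCY8S3ZXCU";s:1:"a";d:119;s:2:"ba";d:119;s:10:"authorized";d:119;}}';
declare @marker varchar(20) = 's:1:"c";s:';
declare @codes table (code varchar(255));
declare @pos int = charindex(@marker, @s);
declare @len_start int, @colon int, @len int;

while @pos > 0
begin
    set @len_start = @pos + len(@marker);          -- first digit of the declared length
    set @colon = charindex(':', @s, @len_start);   -- colon that ends the length
    set @len = cast(substring(@s, @len_start, @colon - @len_start) as int);

    -- the value starts right after the colon and the opening quote
    insert into @codes (code) values (substring(@s, @colon + 2, @len));

    -- continue searching after the value just read
    set @pos = charindex(@marker, @s, @colon + 2 + @len);
end

select code from @codes;
```

Because the value is read using its declared length rather than by searching for the closing quote, a code containing `"` or `;` would not confuse the scan. PHP counts that length in bytes, so this only lines up with `varchar` character positions for single-byte data such as these codes.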

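I am currently trying to solve this by parsing the value with T-SQL functions, which feels like trying to drive a nail with a screwdriver. Apparently this has been asked before, and the only suggestion was to build a parser from scratch in T-SQL, with the caveat that using PHP to unserialize the data would be the better option.

It would be nice if Magento didn't store PHP serialized data in its database and then still serve it serialized through its web services, but this is what I have to work with. I would consider converting it with C# and storing the result in a separate field in the database, but being able to parse the data in T-SQL would be more convenient. If I did use C# to parse and unserialize the PHP object, I would probably store it in the database as XML, since that is a better format for exchanging data.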

mtt*_*son 5

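Here is what I was able to come up with on my own. I was inspired by a post about parsing JSON and decided to work out a solution for serialized PHP objects. I took a completely different approach, though.

Updated code samples are now posted on GitHub.

The serialized PHP string: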

a:3:{
  i:0;
  a:5:{
    s:1:"i";
    s:2:"10";

    s:1:"c";
    s:12:"045EMJJWRCF1";

    s:1:"a";
    d:100;

    s:2:"ba";
    d:100;

    s:10:"authorized";
    d:100;
  }

  i:1;
  a:5:{
    s:1:"i";
    s:2:"11";

    s:1:"c";
    s:12:"06DUQ7Z5GVT7";

    s:1:"a";
    d:101;

    s:2:"ba";
    d:101;

    s:10:"authorized";
    d:101;
  }

  i:2;
  a:5:{
    s:1:"i";
    s:2:"12";

    s:1:"c";
    s:12:"07A6MRYW511J";

    s:1:"a";
    d:102;

    s:2:"ba";
    d:102;

    s:10:"authorized";
    d:102;
  }
}
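Each scalar element in the string above has the shape `type:value;`, and strings additionally carry a length: `s:<length>:"<value>";`. As a small illustration of how one string element can be taken apart with plain T-SQL string functions, here is a sketch for a single element; the variable names are chosen for this example and are not taken from the function further down.

```sql
-- Sketch: decode one serialized string element of the form s:<length>:"<value>";
declare @element varchar(max) = 's:12:"045EMJJWRCF1";';

-- the type is everything before the first colon
declare @type_end int = charindex(':', @element);
declare @var_type varchar(50) = substring(@element, 1, @type_end - 1);

-- the length sits between the first and second colon
declare @length_end int = charindex(':', @element, @type_end + 1);
declare @var_length int =
    cast(substring(@element, @type_end + 1, @length_end - @type_end - 1) as int);

-- skip the second colon and the opening quote, then read @var_length characters
declare @value_string varchar(max) = substring(@element, @length_end + 2, @var_length);

select @var_type as var_type, @var_length as var_length, @value_string as value_string;
```

Reading the value by its declared length, instead of searching for the next `;`, is one way to stay correct when a string value itself contains a semicolon or a quote.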

My query to get the results:

select *
from parsePhpSerializedString('a:3:{i:0;a:5:{s:1:"i";s:2:"10";s:1:"c";s:12:"045EMJJWRCF1";s:1:"a";d:100;s:2:"ba";d:100;s:10:"authorized";d:100;}i:1;a:5:{s:1:"i";s:2:"11";s:1:"c";s:12:"06DUQ7Z5GVT7";s:1:"a";d:101;s:2:"ba";d:101;s:10:"authorized";d:101;}i:2;a:5:{s:1:"i";s:2:"12";s:1:"c";s:12:"07A6MRYW511J";s:1:"a";d:102;s:2:"ba";d:102;s:10:"authorized";d:102;}}')

Query results:

element_id  parent_id  var_name    var_type  var_length  value_int  value_string  value_decimal
----------  ---------  ----------  --------  ----------  ---------  ------------  -------------
1           0          NULL        a         3           NULL       NULL          NULL
2           1          0           a         5           NULL       NULL          NULL
3           1          1           a         5           NULL       NULL          NULL
4           1          2           a         5           NULL       NULL          NULL
5           2          i           s         2           NULL       10            NULL
6           2          c           s         12          NULL       045EMJJWRCF1  NULL
7           2          a           d         NULL        NULL       NULL          100
8           2          ba          d         NULL        NULL       NULL          100
9           2          authorized  d         NULL        NULL       NULL          100
10          3          i           s         2           NULL       11            NULL
11          3          c           s         12          NULL       06DUQ7Z5GVT7  NULL
12          3          a           d         NULL        NULL       NULL          101
13          3          ba          d         NULL        NULL       NULL          101
14          3          authorized  d         NULL        NULL       NULL          101
15          4          i           s         2           NULL       12            NULL
16          4          c           s         12          NULL       07A6MRYW511J  NULL
17          4          a           d         NULL        NULL       NULL          102
18          4          ba          d         NULL        NULL       NULL          102
19          4          authorized  d         NULL        NULL       NULL          102
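Since every key/value pair of one gift card shares the same parent_id, the flat result can be folded back into one row per card with conditional aggregation. This is a sketch only: the column aliases are illustrative, and reading "a" as the amount and "ba" as the base amount is an assumption about Magento's key names rather than something the data itself states.

```sql
-- Sketch: one row per gift card, built from the flat element list.
select  parent_id as card_element_id,
        max(case when var_name = 'c'  then value_string  end) as code,
        max(case when var_name = 'a'  then value_decimal end) as amount,      -- assumed meaning
        max(case when var_name = 'ba' then value_decimal end) as base_amount  -- assumed meaning
from dbo.parsePhpSerializedString('a:1:{i:0;a:5:{s:1:"i";s:1:"1";s:1:"c";s:12:"00XCY8S3ZXCU";s:1:"a";d:119;s:2:"ba";d:119;s:10:"authorized";d:119;}}')
where   parent_id != 0 and
        var_name in ('c', 'a', 'ba')
group by parent_id;
```

The filter on var_name also drops the rows that represent the inner arrays themselves, whose names are the numeric indexes 0, 1, 2.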

If I only want the gift card codes, I can write a query like this:

select value_string
from parsePhpSerializedString('a:3:{i:0;a:5:{s:1:"i";s:2:"10";s:1:"c";s:12:"045EMJJWRCF1";s:1:"a";d:100;s:2:"ba";d:100;s:10:"authorized";d:100;}i:1;a:5:{s:1:"i";s:2:"11";s:1:"c";s:12:"06DUQ7Z5GVT7";s:1:"a";d:101;s:2:"ba";d:101;s:10:"authorized";d:101;}i:2;a:5:{s:1:"i";s:2:"12";s:1:"c";s:12:"07A6MRYW511J";s:1:"a";d:102;s:2:"ba";d:102;s:10:"authorized";d:102;}}')
where   parent_id != 0 and
        var_name = 'c'

Results:

value_string
-------------
045EMJJWRCF1
06DUQ7Z5GVT7
07A6MRYW511J
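In practice the serialized string would come from a column rather than a literal. A table-valued function can be applied to each row with CROSS APPLY; the sketch below assumes a hypothetical table dbo.orders with columns order_id and gift_cards, neither of which appears in the original post.

```sql
-- Sketch: list the gift card codes for every order.
-- dbo.orders, order_id and gift_cards are hypothetical names.
select  o.order_id,
        p.value_string as gift_card_code
from dbo.orders as o
cross apply dbo.parsePhpSerializedString(o.gift_cards) as p
where   p.parent_id != 0 and
        p.var_name = 'c';
```

With CROSS APPLY, orders whose gift_cards value yields no "c" elements simply produce no rows; OUTER APPLY, with the filter moved accordingly, would be the variant to consider if those orders should still be listed.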

Here is the T-SQL function for parsing the serialized PHP string:

IF OBJECT_ID (N'dbo.parsePhpSerializedString') IS NOT NULL
   DROP FUNCTION dbo.parsePhpSerializedString
GO
CREATE FUNCTION dbo.parsePhpSerializedString( @phpSerialized VARCHAR(MAX))
RETURNS @results table 
    (
        element_id int identity(1,1) not null, /* internal surrogate primary key gives the order of parsing and the list order */
        parent_id int, /* if the element has a parent then it is in this column. */
        var_name varchar(50), /* the name or key of the element in a key/value array list */
        var_type varchar(50),
        var_length int,
        value_int int,
        value_string varchar(max),
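        value_decimal numeric /* note: numeric without precision/scale means numeric(18,0), so fractional parts are rounded away */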
    )
AS
BEGIN

    /*
    Built by Matt Johnson (matt@evdat.com) 2012-08-14
    */

    -- we use this table later for collecting auto generated
    -- identity values when inserting records into @results
    declare @insertedIds table (
        element_id int
    )

    -- define variables
    declare @element_start int
    declare @var_type_end int
    declare @var_type varchar(50)
    declare @element_end int
    declare @chunk varchar(max)
    declare @var_length_start int
    declare @var_length_end int
    declare @var_length_string varchar(max)
    declare @var_length int
    declare @value_start int
    declare @value_end int
    declare @value_string varchar(max)
    declare @value_int int
    declare @value_decimal numeric
    declare @array_level int
    declare @value_string_position int
    declare @next_open int
    declare @next_close int
    declare @parent_id int
    declare @element_id int
    declare @key_element_id int
    declare @inserted_element_id int
    declare @var_name varchar(50)

    --initialize variables
    set @parent_id = 0


    --loop through the supplied @phpSerialized string until it's empty
    while 1=1 begin
        set @element_start = null
        set @var_type_end = null
        set @var_type = null
        set @element_end = null
        set @chunk = null
        set @var_length_start = null
        set @var_length_end = null
        set @var_length_string = null
        set @var_length = null
        set @value_start = null
        set @value_end = null
        set @value_string = null
        set @value_int = null
        set @value_decimal = null
        set @array_level = null
        set @value_string_position = null
        set @next_open = null
        set @next_close = null
        set @var_name = null

        --confirm that there is an element to parse and define its starting point
        --patindex will return a value of 1 if the pattern is found and this pattern
        --will only match if the element starting point is the first character in the
        --supplied string. If it is encapsulated in quotes or anything else it will not match
        set @element_start = patindex('[asid]:%[;}]', @phpSerialized)

        if @element_start <= 0 begin
            --if the supplied string is now empty check the existing results table
            --for any nested elements in any array elements

            --reset the value of @element_id to be safe
            set @element_id = null

            --only retrieve the first element found containing sub elements to parse
            select  top 1 
                    @phpSerialized = value_string,
                    @element_id = element_id    
            from @results 
            where   var_type = 'a' and 
                    value_string is not null

            --set the parent_id to the array's element_id
            set @parent_id = @element_id

            --if there were no results found then that means there either
            --were no arrays to parse, or all arrays have already been parsed
            --so break the continuous loop because we are completely done now
            if @element_id is null break

            --set the @element_start again now that we 
            --have a new string to parse for elements
            set @element_start = patindex('[asid]:%[;}]', @phpSerialized)
        end

        --find the end of the type of the element then extract the variable type from the string
        set @var_type_end = patindex('%:%', @phpSerialized)
        set @var_type = substring(@phpSerialized, @element_start, @var_type_end-@element_start)

        --generate an error if a variable type is supplied that hasn't been coded for.
        if @var_type not like '[asid]' begin
            /*
            print @var_type
            RAISERROR (N'Error parsing php serialized string. Variable type found that has not been defined to parse for.', -- Message text.
                       16, -- Severity,
                       1 -- State
                       )
            */

            --apparently errors can't be raised within a function so skip the element
            break
        end

        --array elements contain sub elements so we use different methods for parsing
        --sub elements than we do for parsing individual elements.
        if @var_type != 'a' begin
            --element has no sub elements

            --determine the end of this individual element and then extract 
            --only this individual element from the string
            set @element_end = patindex('%;%', @phpSerialized)+1
            set @chunk = substring(@phpSerialized, @element_start, @element_end-@element_start)

            --strings are serialized differently than numeric elements
            if @var_type = 's' begin
                --element has var length

                --find the starting and ending positions for the var_length and then extract the length
                set @var_length_start = @var_type_end+1
                set @var_length_end = patindex('%:%', substring(@chunk, @var_length_start, len(@chunk))) + @var_length_start - 1
                set @var_length_string = substring(@chunk, @var_length_start, @var_length_end-@var_length_start)
                if @var_length_string not like '%[^0-9]%' begin
                    --verify this contains only digits before casting it as a number
                    --(the pattern needs the surrounding % to test every character)
                    set @var_length = cast(@var_length_string as int)
                end

                --find the starting and ending positions for the value and then extract the value
                set @value_start = @var_length_end+1
                set @value_end = patindex('%;%', @chunk)
                --a string value is quoted so remove quotes in start and end of substring for value
                --we set the substring starting position +1 just past the start of the quote and then
                --set the length of the extracted string -2 to account for both the starting quote and 
                --ending quote to be removed from the extracted string.
                set @value_string = substring(@chunk, @value_start+1, @value_end-@value_start-2)

            end else begin
                --element does not have a var length

                --find the starting and ending positions for the value and then extract the value as a string
                set @value_start = @var_type_end+1
                set @value_end = patindex('%;%', @chunk)
                set @value_string = substring(@chunk, @value_start, @value_end-@value_start)

                --determine what value type the string should be converted to
                if @var_type = 'i' begin
                    --only cast when every character is a digit or a minus sign
                    if @value_string not like '%[^0-9-]%' begin
                        set @value_int = cast(@value_string as int)
                        --clear the value_string because the element's value has been converted to its appropriate type
                        set @value_string = null
                    end
                end else if @var_type = 'd' begin
                    --only cast when every character is a digit, a dot or a minus sign;
                    --anything else (e.g. exponent notation) is left in value_string
                    if @value_string not like '%[^0-9.-]%' begin
                        set @value_decimal = cast(@value_string as numeric)
                        --clear the value_string because the element's value has been converted to its appropriate type
                        set @value_string = null
                    end
                end

            end


        end else begin
            --element is array and has sub elements

            --we are going to chop up the string to try and determine its end so we'll
            --first set the string to a variable we can destroy in this process
            set @chunk = @phpSerialized

            --find the starting and ending positions for the var_length and then extract the length
            --arrays use this to state how many elements this array contains
            set @var_length_start = @var_type_end+1
            set @var_length_end = patindex('%:%', substring(@chunk, @var_length_start, len(@chunk))) + @var_length_start - 1
            set @var_length_string = substring(@chunk, @var_length_start, @var_length_end-@var_length_start)
            if @var_length_string not like '%[^0-9]%' begin
                set @var_length = cast(@var_length_string as int)
            end

            --find the value starting position
            --later we will find the true end of the value
            set @value_start = @var_length_end+1

            -- to determine the ending positio