Obfuscating data in Postgres

MrD*_*erp 2 sql postgresql postgresql-9.1

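I want to obfuscate the data in specific columns in Postgres 9.1.

For example, I want to give everyone a "random" first and last name.

I can generate a pool of names to draw from: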

select name_first into first_names from people order by random() limit 500;
select name_last into last_names from people order by random() limit 500;

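Both of these queries run in about 400 ms (and since they only need to run once, that works fine for me!).

A regular UPDATE statement doesn't work: each subselect is evaluated only once, so everyone ends up with the same name: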

update people
    SET name_last=(SELECT * from last_names order by random() limit 1),
    name_first=(SELECT * from first_names order by random() limit 1)
    where business_id=1;

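How can I give each person a random name in Postgres? I'd really rather not do this in Ruby on Rails; I assume a pure SQL approach would be faster. That said, speed isn't much of a concern, since I have all night to process this business case.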

wil*_*ser 5

        -- Invent some data
CREATE TABLE persons
        ( id SERIAL NOT NULL PRIMARY KEY
        , last_name varchar
        );

INSERT INTO persons(last_name)
SELECT 'Name_' || gs::text
FROM generate_series(1,10) gs
        ;

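        -- The update: pair each row with a randomly chosen last_name.
        -- Matching on new_id = id assumes ids run 1..N without gaps
        -- (true for the test data above). row_number() never produces ties.
WITH swp AS (
        SELECT last_name AS new_last_name
        , row_number() OVER (ORDER BY random() ) AS new_id
        FROM persons
        )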
UPDATE persons dst
SET last_name = swp.new_last_name
FROM swp
WHERE swp.new_id = dst.id
        -- optional condition: skip rows that would keep the same value
AND swp.new_last_name <> dst.last_name
        ;

SELECT * FROM persons
        ;

Result:

 id | last_name 
----+-----------
  1 | Name_6
  2 | Name_4
  3 | Name_8
  4 | Name_2
  5 | Name_1
  6 | Name_10
  7 | Name_5
  8 | Name_7
  9 | Name_3
 10 | Name_9
(10 rows)
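The same row-numbering idea can probably be applied to the name pool from the question, so that names come from last_names rather than being shuffled within the table. The following is only a sketch under assumptions: it presumes that people has a unique key column called id (the question never names one) and that last_names is non-empty. Because it joins on row numbers instead of matching against id values, gaps in id should not matter, and the modulo cycles through the pool when there are more people than pooled names.

```sql
-- Sketch: assign each person of one business a name from the pool.
-- Assumption: people.id is a unique key (hypothetical column name).
WITH pool AS (
    -- number the pooled names in random order: 1..n
    SELECT name_last, row_number() OVER (ORDER BY random()) AS rn
    FROM last_names
), cnt AS (
    -- pool size, used to wrap around when people outnumber names
    SELECT count(*) AS n FROM last_names
), target AS (
    -- number the rows to be updated in random order: 1..m
    SELECT id, row_number() OVER (ORDER BY random()) AS rn
    FROM people
    WHERE business_id = 1
)
UPDATE people p
SET name_last = pool.name_last
FROM target, pool, cnt
WHERE p.id = target.id
  AND pool.rn = ((target.rn - 1) % cnt.n) + 1;
```

Running the same statement against first_names and name_first would handle first names. Since the join replaces the scalar subselect, each row gets its own pool entry instead of one value shared by all rows.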