SSM*_*SMK 5 sql postgresql google-bigquery
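I have a table like the one shown below.
I want to create two new binary columns indicating whether a subject has had steroids and aspirin. I would like to do this in PostgreSQL and Google BigQuery.
I tried the following, but it doesn't work: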
select subject_id
case when lower(drug) like ('%cortisol%','%cortisone%','%dexamethasone%')
then 1 else 0 end as steroids,
case when lower(drug) like ('%peptide%','%paracetamol%')
then 1 else 0 end as aspirin,
from db.Team01.Table_1
SELECT
db.Team01.Table_1.drug
FROM `table_1`,
UNNEST(table_1.drug) drug
WHERE REGEXP_CONTAINS( db.Team01.Table_1.drug,r'%cortisol%','%cortisone%','%dexamethasone%')
I would like my output to look like the following.
Mik*_*ant 16
Below is for BigQuery Standard SQL:
#standardSQL
SELECT
subject_id,
SUM(CASE WHEN REGEXP_CONTAINS(LOWER(drug), r'cortisol|cortisone|dexamethasone') THEN 1 ELSE 0 END) AS steroids,
SUM(CASE WHEN REGEXP_CONTAINS(LOWER(drug), r'peptide|paracetamol') THEN 1 ELSE 0 END) AS aspirin
FROM `db.Team01.Table_1`
GROUP BY subject_id
If applied to the sample data from your question, the result is:
Row subject_id steroids aspirin
1 1 3 1
2 2 1 1
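The query above returns match counts per subject (for example, 3 for steroids), while the question asks for binary columns. If strict 0/1 flags are needed, one possible variation is to aggregate with MAX instead of SUM. This is only a sketch that reuses the table and column names from the question and has not been run against the real data:

```sql
#standardSQL
-- Sketch: same patterns as above, but MAX collapses the per-row
-- 1/0 values into a single 0/1 flag per subject instead of a count.
SELECT
  subject_id,
  MAX(CASE WHEN REGEXP_CONTAINS(LOWER(drug), r'cortisol|cortisone|dexamethasone') THEN 1 ELSE 0 END) AS steroids,
  MAX(CASE WHEN REGEXP_CONTAINS(LOWER(drug), r'peptide|paracetamol') THEN 1 ELSE 0 END) AS aspirin
FROM `db.Team01.Table_1`
GROUP BY subject_id
```

With MAX, a subject with three matching steroid rows would get 1 rather than 3.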
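Note: instead of a simple LIKE, which would end up as lengthy and redundant text (one LIKE per pattern, joined with OR), I am using "LIKE on steroids", which is REGEXP_CONTAINS.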
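The question also asks about PostgreSQL, where REGEXP_CONTAINS is not available. A roughly equivalent sketch can use PostgreSQL's case-insensitive regex match operator `~*`. The unqualified table name `table_1` is an assumption here, since the three-part name `db.Team01.Table_1` from the question would need to be adapted to your schema:

```sql
-- PostgreSQL sketch; table_1 is an assumed table name, adjust to your schema.
SELECT
  subject_id,
  -- ~* is a case-insensitive regular expression match, so LOWER() is not needed
  SUM(CASE WHEN drug ~* 'cortisol|cortisone|dexamethasone' THEN 1 ELSE 0 END) AS steroids,
  SUM(CASE WHEN drug ~* 'peptide|paracetamol' THEN 1 ELSE 0 END) AS aspirin
FROM table_1
GROUP BY subject_id;
```

As in the BigQuery version, SUM yields counts per subject; replacing SUM with MAX should give 0/1 flags instead.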