Background:
I have an nvarchar(max) JSON column named 'questions'; here is a real example of what it looks like in one row...
{"211":0,"212":0,"213":0,"214":0,"215":0,"216":0,"217":0,"218":0,"219":0,"220":"1","221":"1","222":"1","223":"1","224":"1","225":"1","226":"1","227":"1","228":"1","229":"1","230":"1","231":"1","232":"1"}
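Note the mixed value types in that column: unanswered questions are stored as the number 0, while "yes" answers are stored as the string "1". As a quick illustration of the extraction logic (plain Python for clarity, not the T-SQL solution itself, and using the row above as sample data), the "yes" question ids can be picked out like this:

```python
import json

# One row's 'questions' value, as stored in the nvarchar(max) column
# (abbreviated from the example above)
row = '{"211":0,"212":0,"219":0,"220":"1","221":"1","232":"1"}'

questions = json.loads(row)

# Keys whose value is the string "1" are the "yes" answers;
# keys with the number 0 are skipped
yes_ids = [k for k, v in questions.items() if v == "1"]
print(yes_ids)  # → ['220', '221', '232']
```

Since Python 3.7 `dict` preserves insertion order, the ids come out in the order they appear in the stored JSON.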
I'm trying to generate this sample JSON snippet for a sample 'call'...
[
  {
    "call": {
      "id": 200643,
      "yes_answers": [
        { "question_id": "220" },
        { "question_id": "221" },
        { "question_id": "222" },
        { "question_id": "223" },
        { "question_id": "224" },
        { "question_id": "225" },
        { "question_id": "226" },
        { "question_id": "227" },
        { "question_id": "228" },
        { "question_id": "229" },
        { "question_id": "230" },
        { "question_id": "231" },
        { "question_id": "232" }
      ]
    }
  }
]
..using this query...
select c.call_id as [call.id],
(
select x.[key] …

I have a large JSON file, around 5 million records and about 32 GB on disk, that I need to load into our Snowflake data warehouse. I need to break this file into chunks of roughly 200k records each (about 1.25 GB per file). I'd like to do this in Node.js or Python so it can be deployed as an AWS Lambda function; unfortunately I haven't written any of the code yet. I have C# and a lot of SQL experience, and learning both Node and Python is on my to-do list, so why not dive right in, right!?
My first question is: which language is better suited for this task, Python or Node.js?
I know I don't want to read this entire JSON file into memory (or even the smaller output files). I need to be able to "stream" it in and out into a new file based on a record count (200k), properly close up the JSON objects, and continue into a new file for the next 200k, and so on. I know Node can do this, but if Python can also do this, I feel like it would be …
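Python can stream this just as well as Node. A minimal sketch, assuming the input is newline-delimited JSON (one record per line, which is also a format Snowflake's COPY INTO loads directly) — if the file is instead one giant JSON array, an incremental parser such as `ijson` would be needed. The function and file names here are hypothetical:

```python
import os


def split_ndjson(in_path, out_dir, records_per_file=200_000):
    """Stream a newline-delimited JSON file into chunks of
    records_per_file records without loading it all into memory."""
    os.makedirs(out_dir, exist_ok=True)
    out = None
    count = 0       # total records written
    file_no = 0     # number of chunk files created
    with open(in_path, "r", encoding="utf-8") as src:
        for line in src:        # reads one line at a time, O(1) memory
            if not line.strip():
                continue        # skip blank lines
            if count % records_per_file == 0:
                # current chunk is full (or this is the first record):
                # close it and start a new one
                if out:
                    out.close()
                file_no += 1
                out = open(
                    os.path.join(out_dir, f"chunk_{file_no:04d}.json"),
                    "w", encoding="utf-8",
                )
            out.write(line)
            count += 1
    if out:
        out.close()
    return file_no, count
```

Because each chunk is itself valid newline-delimited JSON, there is nothing extra to "close up" at a file boundary; that bookkeeping only becomes necessary if the records live inside a single top-level array.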