How to build a front end for querying a Redshift database (hopefully with Rails)

joh*_*ser 11 activerecord ruby-on-rails amazon-web-services amazon-redshift

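I have a Redshift database with enough tables that I think it is worth my time to build a front end that makes querying a little easier than just typing in SQL commands.

Ideally, I would do this by connecting the database to a Rails app (since I have some experience with Rails). I am not sure how to connect a remote Redshift database to a local Rails application, or how to make ActiveRecord work with Redshift.

Does anyone have suggestions or resources to help me get started? I am open to other ways of connecting the Redshift database to a front end if a pre-made option is easier than Rails.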

har*_*sjb 15

#app/models/data_warehouse.rb
class DataWarehouse < ActiveRecord::Base                      
  establish_connection "redshift_staging"
  #or, if you want to have a db per environment
  #establish_connection "redshift_#{Rails.env}"
end

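Note that we connect on port 5439 rather than the PostgreSQL default of 5432, so I specify the port explicitly. I also specify a schema, beta, which is the one we use for unstable aggregates. As noted above, you can either set up a different database per environment, or use several schemas and include them in ActiveRecord's search path.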

#config/database.yml
redshift_staging:                                                          
  adapter: postgresql                                                      
  encoding: utf8                                                           
  database: db03                                                         
  port: 5439                                                               
  pool: 5                                                                  
  schema_search_path: 'beta'                                                                                          
  username: admin                                                        
  password: supersecretpassword                                               
  host: db03.myremotehost.us  #your remote host here, might be an aws url from Redshift admin console 
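
The per-environment naming mentioned above can be sketched in plain Ruby. This is only an illustration, assuming a database.yml shaped like the one above. The `connection_config_for` helper, the inline YAML string, and the `redshift_production` entry are hypothetical; they stand in for the lookup Rails performs when a model calls `establish_connection "redshift_#{Rails.env}"`.

```ruby
require "yaml"

# Inline stand-in for config/database.yml (placeholder values;
# the redshift_production entry is invented for illustration).
DATABASE_YML = <<~YAML
  redshift_staging:
    adapter: postgresql
    database: db03
    port: 5439
    schema_search_path: 'beta'
    host: db03.myremotehost.us
  redshift_production:
    adapter: postgresql
    database: db03
    port: 5439
    schema_search_path: 'public'
    host: db03.myremotehost.us
YAML

# Hypothetical helper: find the "redshift_<env>" entry and fail loudly
# when the environment has no Redshift configuration.
def connection_config_for(env, yaml = DATABASE_YML)
  configs = YAML.safe_load(yaml)
  configs.fetch("redshift_#{env}") do
    raise KeyError, "no redshift_#{env} entry in database.yml"
  end
end

config = connection_config_for("staging")
puts config["port"]
puts config["schema_search_path"]
```

Keeping one entry per environment means the model code stays identical everywhere, and only the YAML decides which cluster and schema are used.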

### Option 2: direct PG connection

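  require 'pg'

  class DataWarehouse < ActiveRecord::Base

    attr_accessor :conn

    def initialize
      # PG.connect takes libpq keywords: dbname/user, not database/username.
      # pool and schema_search_path are ActiveRecord settings, not libpq options.
      @conn = PG.connect(
        dbname: 'db03',
        port: 5439,
        user: 'admin',
        password: 'supersecretpassword',
        host: 'db03.myremotehost.us'
      )
      # Select the schema explicitly, in place of schema_search_path.
      @conn.exec('SET search_path TO beta')
    end
  end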


[DEV] main:0> redshift = DataWarehouse
E, [2014-07-17T11:09:17.758957 #44535] ERROR -- : PG::InsufficientPrivilege: ERROR:  permission denied to set parameter "client_min_messages" to "notice" : SET client_min_messages TO 'notice'
(pry) output error: #<ActiveRecord::StatementInvalid: PG::InsufficientPrivilege: ERROR:  permission denied to set parameter "client_min_messages" to "notice" : SET client_min_messages TO 'notice'>   

Update:

I ended up going with Option 1, but I now use this adapter, for several reasons:

https://github.com/fiksu/activerecord-redshift-adapter

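Reason 1: The ActiveRecord postgresql adapter sets client_min_messages, which produces the permission error shown above.

Reason 2: The adapter also tries to set the time zone, which Redshift does not allow (http://docs.aws.amazon.com/redshift/latest/dg/c_redshift-and-postgres-sql.html).

Reason 3: Even if you change the ActiveRecord code behind the first two errors, you run into further errors complaining that Redshift is using PostgreSQL 8.0. At that point I switched to the adapter. I will revisit and update this if I find something better later.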

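I renamed my table to base_aggregates_redshift_tests (note the plural) so that ActiveRecord finds it by convention. If you cannot rename your tables in Redshift, set the table name explicitly in the model with self.table_name, as in Option 1 below.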
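
The naming convention behind that rename can be approximated in a few lines of plain Ruby. This is a simplified stand-in, not ActiveSupport's real inflector (which also handles irregular plurals and acronyms); `naive_table_name` is a hypothetical helper that only underscores the class name and appends "s".

```ruby
# Simplified approximation of how a model class name maps to a table name.
# Assumption: regular plural formed by appending "s"; real Rails uses
# ActiveSupport's inflection rules instead.
def naive_table_name(class_name)
  underscored = class_name
    .gsub(/([A-Z]+)([A-Z][a-z])/, '\1_\2') # split runs of capitals, e.g. "SQLTable"
    .gsub(/([a-z\d])([A-Z])/, '\1_\2')     # split camel-case word boundaries
    .downcase
  "#{underscored}s"
end

puts naive_table_name("BaseAggregatesRedshiftTest")
```

This is why a model called BaseAggregatesRedshiftTest looks for a plural table unless the table name is set by hand.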

#Gemfile:
gem 'activerecord4-redshift-adapter', github: 'aamine/activerecord4-redshift-adapter'

Option 1

#config/database.yml
redshift_staging:                                                                                                             
  adapter: redshift                                                                                                           
  encoding: utf8                                                                                                              
  database: db03                                                                                                           
  port: 5439                                                                                                                  
  pool: 5                                                                                                                     
  username: admin                                                                                                
  password: supersecretpassword                                                                                                  
  host: db03.myremotehost.us                                                                                                       
  timeout: 5000   

#app/models/base_aggregates_redshift_test.rb
#self.table_name points the model at the existing Redshift table, so the class name does not have to match the table

class BaseAggregatesRedshiftTest < ActiveRecord::Base
  establish_connection "redshift_staging"
  self.table_name = "beta.base_aggregates_v2"
end

In the console, using self.table_name. Note that it queries the correct table, so you can name the model whatever you like:

[DEV] main:0> redshift = BaseAggregatesRedshiftTest.first                                                                    
D, [2014-07-17T15:31:58.678103 #43776] DEBUG -- :   BaseAggregatesRedshiftTest Load (45.6ms)  SELECT "beta"."base_aggregates_v2".* FROM "beta"."base_aggregates_v2" LIMIT 1            
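
The schema-qualified query in the log above comes from splitting the dotted table name and quoting each part. A rough illustration in plain Ruby, not the adapter's actual implementation; `quote_qualified_name` and `first_row_sql` are hypothetical helpers:

```ruby
# Quote each dot-separated part of a table name as an SQL identifier,
# doubling any embedded double quotes.
def quote_qualified_name(name)
  name.split(".").map { |part| '"' + part.gsub('"', '""') + '"' }.join(".")
end

# Build the kind of statement that Model.first produced in the log above.
def first_row_sql(table_name)
  quoted = quote_qualified_name(table_name)
  "SELECT #{quoted}.* FROM #{quoted} LIMIT 1"
end

puts first_row_sql("beta.base_aggregates_v2")
```

Because the schema is part of the table name here, the query does not depend on the connection's search path.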

Option 2

#app/models/base_aggregates_redshift_test.rb
class BaseAggregatesRedshiftTest < ActiveRecord::Base
  # Uncomment if the Redshift table cannot be renamed to match the model:
  # self.table_name = "beta.base_aggregates_v2"

  # Called on the model itself so only this class uses the Redshift connection
  establish_connection(
    adapter: 'redshift',
    encoding: 'utf8',
    database: 'staging',
    port: 5439,
    pool: 5,
    username: 'admin',
    password: 'supersecretpassword',
    schema_search_path: 'beta',
    host: 'db03.myremotehost.us',
    timeout: 5000
  )

end

#In the console, abbreviated example of the first record. No table_name is set, so the model queries the renamed Redshift table, base_aggregates_redshift_tests

[DEV] main:0> redshift = BaseAggregatesRedshiftTest.first
D, [2014-07-17T15:09:39.388918 #11537] DEBUG -- :   BaseAggregatesRedshiftTest Load (45.3ms)  SELECT "base_aggregates_redshift_tests".* FROM "base_aggregates_redshift_tests" LIMIT 1
#<BaseAggregatesRedshiftTest:0x007fd8c4a12580> {
                                                :truncated_month => Thu, 31 Jan 2013 19:00:00 EST -05:00,
                                                :dma => "Cityville",
                                                :group_id => 9712338,
                                                :dma_id => 9999 
                                                }

Good luck, @johncorser!