Mocking multiple database calls made with pd.read_sql

Mik*_*ike 4 python sql mocking

I have a function that uses pd.read_sql to run two SQL queries directly from Python, as follows:

import pandas as pd


def get_sql_queries():
    source_data_query = """
        SELECT
              cf."fund" as 'Fund'
            , cf."Symbol"
        FROM
            sql_table
    """

    transactions = pd.read_sql(
        sql=source_data_query,
        con=DEFAULT_CONNECTION,
    )

    other_source_data_query = """
        SELECT
              cf."fund" as 'Fund'
            , cf."Symbol"
        FROM
            other_sql_table
    """
    marks = pd.read_sql(
        sql=other_source_data_query,
        con=DEFAULT_CONNECTION,
    )
    return transactions, marks


This works fine when I run it against the database.

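I now want to mock these database calls for testing, so that when source_data_query is run it does not hit the database but reads in a test DataFrame instead. The same goes for other_source_data_query.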

Update: edited based on aaron's suggestion, as follows:

import unittest
from unittest import mock
import pandas as pd
from functions import get_transaction_data


class GetSQLQueriesTest(unittest.TestCase):
    @mock.patch('pd.read_sql')
    def test_get_sql_queries(self, mock_read_sql):
        transaction_data = pd.DataFrame(columns=['Fund', 'Symbol'], data=[['Fund 1', 'Symbol 1']])
        mark_data = pd.DataFrame(columns=['Fund', 'Symbol'], data=[['Fund 2', 'Symbol 2']])

        mock_read_sql.side_effect = (transaction_data, mark_data)  # If the order is fixed

        output = get_transaction_data.get_transactions_between_two_dates()
        self.assertEqual(output, (transaction_data, mark_data))

I get the following error:

FAILED tests/unit_tests/functions/test.py::GetSQLQueriesTest::test_get_sql_queries - ModuleNotFoundError: No module named 'pd'

pandas is definitely installed in my environment.

aar*_*ron 7

Use side_effect to mock the return value of pd.read_sql based on sql:

import unittest
from unittest import mock

import pandas as pd

from mymodule import get_sql_queries


class GetSQLQueriesTest(unittest.TestCase):

    @mock.patch('pandas.read_sql')  # Or @mock.patch('mymodule.pd.read_sql')
    def test_get_sql_queries(self, mock_read_sql):
        transaction_data = pd.DataFrame(columns=['Fund', 'Symbol'], data=[['Fund 1', 'Symbol 1']])
        mark_data = pd.DataFrame(columns=['Fund', 'Symbol'], data=[['Fund 2', 'Symbol 2']])

        # mock_read_sql.side_effect = (transaction_data, mark_data)  # If the order is fixed
        mock_read_sql.side_effect = lambda sql, con: (
            transaction_data if ' sql_table' in sql else
            mark_data if 'other_sql_table' in sql else
            None
        )

        output = get_sql_queries()
        self.assertEqual(output, (transaction_data, mark_data))
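For the fixed-order variant mentioned in the commented-out line above, here is a minimal self-contained sketch. It assumes a simplified stand-in for get_sql_queries (the queries and the con=None default are hypothetical, not the asker's real code) and compares the results with pandas.testing.assert_frame_equal, which checks the frames' contents rather than relying on the mock handing back the very same objects.

```python
from unittest import mock

import pandas as pd
from pandas.testing import assert_frame_equal


def get_sql_queries(con=None):
    """Hypothetical stand-in for the function in the question."""
    transactions = pd.read_sql(sql="SELECT * FROM sql_table", con=con)
    marks = pd.read_sql(sql="SELECT * FROM other_sql_table", con=con)
    return transactions, marks


def run_check():
    transaction_data = pd.DataFrame({"Fund": ["Fund 1"], "Symbol": ["Symbol 1"]})
    mark_data = pd.DataFrame({"Fund": ["Fund 2"], "Symbol": ["Symbol 2"]})

    with mock.patch("pandas.read_sql") as mock_read_sql:
        # An iterable side_effect returns one item per call, in call order.
        mock_read_sql.side_effect = [transaction_data, mark_data]
        transactions, marks = get_sql_queries()

    # Compare frame contents explicitly.
    assert_frame_equal(transactions, transaction_data)
    assert_frame_equal(marks, mark_data)

    # Return the SQL strings the mock received, in call order.
    return [call.kwargs["sql"] for call in mock_read_sql.call_args_list]


queries = run_check()
```

The list form only holds up while the function keeps issuing its queries in the same order; the lambda above is the safer choice if that order might change.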

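  • For anyone who lands here in the future: I believe using @mock.patch('mymodule.pd.read_sql'), as @aaron put in the code comment, fixes the OP's exception about not being able to find a module named 'pd'. (2 upvotes)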
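A small sketch of why the original target string failed: mock.patch imports the dotted path it is given, and pd is only a local alias created by `import pandas as pd`, not an importable module name. The snippet below reproduces the error and then patches via the real module name; the query string and con=None are placeholders for illustration only.

```python
from unittest import mock

import pandas as pd

alias_error = None
try:
    # 'pd' exists only as a name in this file, so importing it fails.
    with mock.patch("pd.read_sql"):
        pass
except ModuleNotFoundError as exc:
    alias_error = str(exc)

# Using the real module name works. 'mymodule.pd.read_sql' would also work,
# because mymodule is importable and has an attribute named pd.
with mock.patch("pandas.read_sql") as mock_read_sql:
    mock_read_sql.return_value = pd.DataFrame({"Fund": ["Fund 1"]})
    result = pd.read_sql(sql="SELECT 1", con=None)  # placeholder arguments

patched_call_count = mock_read_sql.call_count
```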