Why does JPA/EclipseLink fire redundant database queries even when the BatchFetchType.IN hint is used?

Rah*_*ani 8 optimization performance jpa eclipselink

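Summary:

I am trying to minimize the number of database queries issued by my JPA-based Java application. I specified the @BatchFetch(BatchFetchType.IN) optimization hint, but I still see some extra queries that I consider redundant and unnecessary.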

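Details:

Consider a simple domain model: an invoice management system. An Invoice has a OneToOne relationship with an Order. There is also a Customer, which has a OneToMany relationship with Order. (Customer 1 -> M Order 1 <- 1 Invoice.) Further details and the complete source code were linked from the original post. These are the entity definitions as they currently stand: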

Customer.java (getters and setters omitted):

  @Entity(name = "CUSTOMER") 
  public class Customer {
    @Id //signifies the primary key
    @Column(name = "CUST_ID", nullable = false)
    @GeneratedValue(strategy = GenerationType.AUTO)
    private long custId;

    @Column(name = "FIRST_NAME", length = 50)
    private String firstName;

    @OneToMany(mappedBy="customer",targetEntity=Order.class,
            fetch=FetchType.LAZY)
    private Collection<Order> orders;
    }

Order.java (getters and setters omitted):

 @Entity(name = "ORDERS")
 public class Order {
    @Id
    @Column(name = "ORDER_ID", nullable = false)
    @GeneratedValue(strategy = GenerationType.AUTO)
    private long orderId;

    @Column(name = "TOTAL_PRICE", precision = 2)
    private double totPrice;

    @OneToOne(fetch = FetchType.LAZY, optional = false, cascade = CascadeType.ALL,  mappedBy = "order")
    private Invoice invoice;

    @ManyToOne(optional = false)
    @JoinColumn(name = "CUST_ID", referencedColumnName = "CUST_ID")
    private Customer customer;

    @ManyToMany(fetch = FetchType.LAZY)
    @JoinTable(name = "ORDER_DETAIL", joinColumns = @JoinColumn(name = "ORDER_ID",    referencedColumnName = "ORDER_ID"), inverseJoinColumns = @JoinColumn(name = "PROD_ID",    referencedColumnName = "PROD_ID"))
    private List<Product> productList;

 }

Invoice.java (getters and setters omitted):

@Entity(name = "ORDER_INVOICE")
public class Invoice {

    @Id
    // signifies the primary key
    @Column(name = "INVOICE_ID", nullable = false)
    @GeneratedValue(strategy = GenerationType.AUTO)
    private long invoiceId;

    @Column(name = "AMOUNT_DUE", precision = 2)
    private double amountDue;

    @OneToOne(optional = false, fetch = FetchType.LAZY)
    @JoinColumn(name = "ORDER_ID")
    private Order order;

 }

With this model in place, I ran a simple test that fetches all orders of a customer.

    EntityManagerFactory entityManagerFactory = Persistence.createEntityManagerFactory("testjpa");
    EntityManager em = entityManagerFactory.createEntityManager();

    Customer customer = em.find(Customer.class, 100L);

    Collection<Order> orders = customer.getOrders();

    for(Order order: orders){
        System.out.println(order.getInvoice().getInvoiceId());
    }

    em.close();

Since everything is lazy, we get four queries, as shown below:

1398882535950|1|1|statement|SELECT CUST_ID, APPT, CITY, EMAIL, FIRST_NAME, LAST_NAME, STREET, LAST_UPDATED_TIME, ZIP_CODE FROM CUSTOMER WHERE (CUST_ID = ?)|SELECT CUST_ID, APPT, CITY, EMAIL, FIRST_NAME, LAST_NAME, STREET, LAST_UPDATED_TIME, ZIP_CODE FROM CUSTOMER WHERE (CUST_ID = 100)

1398882535981|0|1|statement|SELECT ORDER_ID, OREDER_DESC, ORDER_DATE, TOTAL_PRICE, LAST_UPDATED_TIME, CUST_ID FROM ORDERS WHERE (CUST_ID = ?)|SELECT ORDER_ID, OREDER_DESC, ORDER_DATE, TOTAL_PRICE, LAST_UPDATED_TIME, CUST_ID FROM ORDERS WHERE (CUST_ID = 100)

1398882535995|1|1|statement|SELECT INVOICE_ID, AMOUNT_DUE, DATE_CANCELLED, DATE_RAISED, DATE_SETTLED, LAST_UPDATED_TIME, ORDER_ID FROM ORDER_INVOICE WHERE (ORDER_ID = ?)|SELECT INVOICE_ID, AMOUNT_DUE, DATE_CANCELLED, DATE_RAISED, DATE_SETTLED, LAST_UPDATED_TIME, ORDER_ID FROM ORDER_INVOICE WHERE (ORDER_ID = 111)

1398882536004|0|1|statement|SELECT INVOICE_ID, AMOUNT_DUE, DATE_CANCELLED, DATE_RAISED, DATE_SETTLED, LAST_UPDATED_TIME, ORDER_ID FROM ORDER_INVOICE WHERE (ORDER_ID = ?)|SELECT INVOICE_ID, AMOUNT_DUE, DATE_CANCELLED, DATE_RAISED, DATE_SETTLED, LAST_UPDATED_TIME, ORDER_ID FROM ORDER_INVOICE WHERE (ORDER_ID = 222)

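Since I do not want N + 1 calls to fetch the invoices, I decided to use batch fetching to reduce the total to 3 queries (with a single query fetching the invoices for all of the customer's orders). To do that, I updated my Order entity as follows:

Updated Order.java, adding BatchFetch for Invoice (getters and setters omitted):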

 @Entity(name = "ORDERS")
 public class Order {
    @Id
    @Column(name = "ORDER_ID", nullable = false)
    @GeneratedValue(strategy = GenerationType.AUTO)
    private long orderId;

    @Column(name = "TOTAL_PRICE", precision = 2)
    private double totPrice;

    @BatchFetch(BatchFetchType.IN)
    @OneToOne(fetch = FetchType.LAZY, optional = false, cascade = CascadeType.ALL,  mappedBy = "order")
    private Invoice invoice;

    @ManyToOne(optional = false)
    @JoinColumn(name = "CUST_ID", referencedColumnName = "CUST_ID")
    private Customer customer;

    @ManyToMany(fetch = FetchType.LAZY)
    @JoinTable(name = "ORDER_DETAIL", joinColumns = @JoinColumn(name = "ORDER_ID",    referencedColumnName = "ORDER_ID"), inverseJoinColumns = @JoinColumn(name = "PROD_ID",    referencedColumnName = "PROD_ID"))
    private List<Product> productList;

 }

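I ran the same test again, expecting 3 queries to fetch the data (one for the customer, one for the orders, and one batch query for the invoices). However, EclipseLink generates 5 queries. Here they are: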

1398883197009|1|1|statement|SELECT CUST_ID, APPT, CITY, EMAIL, FIRST_NAME, LAST_NAME, STREET, LAST_UPDATED_TIME, ZIP_CODE FROM CUSTOMER WHERE (CUST_ID = ?)|SELECT CUST_ID, APPT, CITY, EMAIL, FIRST_NAME, LAST_NAME, STREET, LAST_UPDATED_TIME, ZIP_CODE FROM CUSTOMER WHERE (CUST_ID = 100)

1398883197030|0|1|statement|SELECT ORDER_ID, OREDER_DESC, ORDER_DATE, TOTAL_PRICE, LAST_UPDATED_TIME, CUST_ID FROM ORDERS WHERE (CUST_ID = ?)|SELECT ORDER_ID, OREDER_DESC, ORDER_DATE, TOTAL_PRICE, LAST_UPDATED_TIME, CUST_ID FROM ORDERS WHERE (CUST_ID = 100)

1398883197037|1|1|statement|SELECT INVOICE_ID, AMOUNT_DUE, DATE_CANCELLED, DATE_RAISED, DATE_SETTLED, LAST_UPDATED_TIME, ORDER_ID FROM ORDER_INVOICE WHERE (ORDER_ID IN (?,?))|SELECT INVOICE_ID, AMOUNT_DUE, DATE_CANCELLED, DATE_RAISED, DATE_SETTLED, LAST_UPDATED_TIME, ORDER_ID FROM ORDER_INVOICE WHERE (ORDER_ID IN (111,222))

1398883197042|1|1|statement|SELECT ORDER_ID, OREDER_DESC, ORDER_DATE, TOTAL_PRICE, LAST_UPDATED_TIME, CUST_ID FROM ORDERS WHERE (ORDER_ID = ?)|SELECT ORDER_ID, OREDER_DESC, ORDER_DATE, TOTAL_PRICE, LAST_UPDATED_TIME, CUST_ID FROM ORDERS WHERE (ORDER_ID = 222)

1398883197045|0|1|statement|SELECT INVOICE_ID, AMOUNT_DUE, DATE_CANCELLED, DATE_RAISED, DATE_SETTLED, LAST_UPDATED_TIME, ORDER_ID FROM ORDER_INVOICE WHERE (ORDER_ID = ?)|SELECT INVOICE_ID, AMOUNT_DUE, DATE_CANCELLED, DATE_RAISED, DATE_SETTLED, LAST_UPDATED_TIME, ORDER_ID FROM ORDER_INVOICE WHERE (ORDER_ID = 222)
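As a rough cross-check of the expected counts, here is a minimal sketch of the arithmetic. The class and method names are hypothetical, and the IN-list size is treated as a parameter (256 is only an assumed value here, not something shown in the question).

```java
// Hypothetical helper: counts the SELECT statements one would expect in theory.
class QueryCountSketch {

    // Lazy 1:1 without batching: 1 customer + 1 orders + one invoice query per order.
    static int lazyQueries(int orders) {
        return 1 + 1 + orders;
    }

    // Batch IN: 1 customer + 1 orders + ceil(orders / batchSize) invoice queries.
    static int batchInQueries(int orders, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int invoiceQueries = (orders + batchSize - 1) / batchSize; // integer ceiling
        return 1 + 1 + invoiceQueries;
    }

    public static void main(String[] args) {
        // Two orders, as in the logs above (ORDER_ID 111 and 222).
        System.out.println("lazy:     " + lazyQueries(2));
        System.out.println("batch IN: " + batchInQueries(2, 256)); // 256 is an assumed IN-list size
    }
}
```

For the two orders in the logs this gives 4 statements without batching and 3 with it, so the 5 observed statements are 2 more than the batch-fetch expectation.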

I do not understand why the last two queries are generated. Any help explaining what is going on would be appreciated.

Thanks!

Chr*_*ris 12

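This looks like a bug/issue in EclipseLink caused by eager relationships being traversed in the object model, which allows the second Invoice in the "IN" batch to be loaded before the Order that references it has finished loading. That forces the Invoice to query the database for its Order instead of finding it in the cache.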
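The ordering effect can be pictured with a toy identity cache. This is only an illustrative model under simplifying assumptions, not EclipseLink's actual code; every name in it is hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only: a cache keyed by order id, with a counter for simulated database hits.
class ToyIdentityCache {
    private final Map<Long, String> cache = new HashMap<>();
    private int databaseQueries = 0;

    // Called once an Order has been fully built.
    void register(long orderId, String order) {
        cache.put(orderId, order);
    }

    // Resolves an Invoice's back-reference: cache first, simulated database query on a miss.
    String resolveOrder(long orderId) {
        String cached = cache.get(orderId);
        if (cached != null) {
            return cached;
        }
        databaseQueries++; // stands in for "SELECT ... FROM ORDERS WHERE ORDER_ID = ?"
        String loaded = "Order#" + orderId;
        cache.put(orderId, loaded);
        return loaded;
    }

    int databaseQueries() {
        return databaseQueries;
    }

    // Eager back-reference: the Invoice resolves its Order before the Order is registered.
    static int eagerBackReference() {
        ToyIdentityCache session = new ToyIdentityCache();
        session.resolveOrder(222L);
        session.register(222L, "Order#222");
        return session.databaseQueries();
    }

    // Lazy back-reference: resolution is deferred until the Order is already in the cache.
    static int lazyBackReference() {
        ToyIdentityCache session = new ToyIdentityCache();
        session.register(222L, "Order#222");
        session.resolveOrder(222L);
        return session.databaseQueries();
    }
}
```

In this toy model the eager path costs one extra lookup against the database, while the deferred path is served from the cache.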

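You can work around this by using lazy fetching on the Invoice-to-Order relationship. The deferral allows EclipseLink to finish building the object model, so the Order is already in the cache by the time it is accessed. The code in the question shows this relationship marked as LAZY, but that is only a hint to the JPA provider, and it cannot take effect in EclipseLink without an agent or byte code weaving, as described here: https://wiki.eclipse.org/EclipseLink/UserGuide/JPA/Advanced_JPA_Development/Performance/Weaving https://wiki.eclipse.org/EclipseLink/UserGuide/JPA/Advanced_JPA_Development/Performance/Weaving/Dynamic_Weaving

Lazy collections do not require weaving; it is only needed for 1:1 relationships and other optimizations.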
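One way to check whether weaving actually happened is to look for the marker interfaces that EclipseLink adds to woven classes. The sketch below assumes those interfaces live under org.eclipse.persistence.internal.weaving and start with PersistenceWeaved. That is an internal naming detail which may vary between versions, so treat it as an assumption. WeavingCheck and isWoven are hypothetical names.

```java
// Reflection-only check, so it compiles without EclipseLink on the classpath.
class WeavingCheck {
    // Assumed prefix of EclipseLink's internal marker interfaces (PersistenceWeaved, PersistenceWeavedLazy, ...).
    static final String WEAVED_PREFIX = "org.eclipse.persistence.internal.weaving.PersistenceWeaved";

    // Returns true if the class or any superclass directly implements a matching marker interface.
    static boolean isWoven(Class<?> entityClass) {
        for (Class<?> type = entityClass; type != null; type = type.getSuperclass()) {
            for (Class<?> iface : type.getInterfaces()) {
                if (iface.getName().startsWith(WEAVED_PREFIX)) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // A plain JDK class is never woven; replace it with an entity class such as Invoice.class.
        System.out.println("woven: " + isWoven(String.class));
    }
}
```

If a call such as isWoven(Invoice.class) returns false at runtime, the LAZY setting on the 1:1 mapping is probably not being honored, and the weaving setup described in the links above is worth revisiting.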