I am using Tableau with MarkLogic. I have the following XML structure:
<CustomerInformation CustomerId="1">
  <CustomerBasicInformation>
    <CustomerTitle></CustomerTitle>
    <CustomerFirstName></CustomerFirstName>
    <CustomerMiddleName></CustomerMiddleName>
    <CustomerLastName></CustomerLastName>
  </CustomerBasicInformation>
  <CustomerEmplyomentDetails>
    <CustomerEmployer>
      <EmployerName IsCurrentEmployer=""></EmployerName>
      <CustomerDesignation></CustomerDesignation>
      <EmployerLocation></EmployerLocation>
      <CustomerTenure></CustomerTenure>
    </CustomerEmployer>
  </CustomerEmplyomentDetails>
  <PolcyDetails>
    <Policy PolicyId="">
      <PolicyName></PolicyName>
      <PolicyType></PolicyType>
      <PolicyCategory></PolicyCategory>
      <QuoteNumber></QuoteNumber>
      <PolicyClaimDetails>
        <PolicyClaim ClaimId="">
          <PolicyClaimedOn></PolicyClaimedOn>
          <PolicyClaimType></PolicyClaimType>
          <PolicyClaimantName></PolicyClaimantName>
        </PolicyClaim>
      </PolicyClaimDetails>
      <PolicyComplaintDetails>
        <PolicyComplaint ComplaintId="">
          <PolicyComplaintStatus></PolicyComplaintStatus>
          <PolicyComplaintOn></PolicyComplaintOn>
        </PolicyComplaint>
      </PolicyComplaintDetails>
      <BillingDetails>
        <Billing BillingId="">
          <BillingAmount></BillingAmount>
          <BillingMode></BillingMode>
        </Billing>
      </BillingDetails>
    </Policy>
    <Policy PolicyId="">
      <PolicyName></PolicyName>
      <PolicyType></PolicyType>
      <PolicyCategory></PolicyCategory>
      <QuoteNumber></QuoteNumber>
      <PolicyClaimDetails>
        <PolicyClaim ClaimId="">
          <PolicyClaimedOn></PolicyClaimedOn>
          <PolicyClaimType></PolicyClaimType>
          <PolicyClaimantName></PolicyClaimantName>
        </PolicyClaim>
      </PolicyClaimDetails>
      <PolicyComplaintDetails>
        <PolicyComplaint ComplaintId="">
          <PolicyComplaintStatus></PolicyComplaintStatus>
          <PolicyComplaintOn></PolicyComplaintOn>
        </PolicyComplaint>
      </PolicyComplaintDetails>
      <BillingDetails>
        <Billing BillingId="">
          <BillingAmount></BillingAmount>
          <BillingMode></BillingMode>
        </Billing>
      </BillingDetails>
    </Policy>
  </PolcyDetails>
</CustomerInformation>
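I have created a view on the structure above. Initially I created a single view covering all elements, but in Tableau I got duplicate values as well as Cartesian join results. To solve this, I used the fragment root approach. Since a single customer can have multiple PolicyDetails, I created a fragment root on Policy. Similarly, claims, complaints, billing entries, and quotes can each occur multiple times for a single policy, so I have created a fragment root on each of these. …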
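To make the duplicate-value symptom concrete, here is a minimal Java sketch that counts rows for one policy. It is not MarkLogic or Tableau code: the class name, method names, and sample IDs (C1, K1, B1, and so on) are made up for illustration. It assumes that a single flattened view pairs every claim with every complaint and every billing entry, which is what a Cartesian join does.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class FragmentRowsDemo {

    // One flattened view over sibling repeating groups: every claim is paired
    // with every complaint and every billing entry (a Cartesian product).
    public static List<String> flattenedRows(List<String> claims,
                                             List<String> complaints,
                                             List<String> billings) {
        List<String> rows = new ArrayList<>();
        for (String claim : claims) {
            for (String complaint : complaints) {
                for (String billing : billings) {
                    rows.add(claim + "|" + complaint + "|" + billing);
                }
            }
        }
        return rows;
    }

    // One view per repeating group: each element appears in exactly one row.
    public static int separateRowCount(List<String> claims,
                                       List<String> complaints,
                                       List<String> billings) {
        return claims.size() + complaints.size() + billings.size();
    }

    public static void main(String[] args) {
        // Hypothetical IDs for a single policy.
        List<String> claims = Arrays.asList("C1", "C2");
        List<String> complaints = Arrays.asList("K1", "K2", "K3");
        List<String> billings = Arrays.asList("B1", "B2");

        System.out.println("flattened rows: "
                + flattenedRows(claims, complaints, billings).size());
        System.out.println("separate rows:  "
                + separateRowCount(claims, complaints, billings));
    }
}
```

With 2 claims, 3 complaints, and 2 billing entries, the flattened view has 2 × 3 × 2 = 12 rows, and each claim ID is repeated 6 times. Keeping each repeating group in its own view gives 2 + 3 + 2 = 7 rows with no repetition.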
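I am reading sample CSV data and then writing it to a MarkLogic database using the Hadoop connector API. The problem is that only some of the data is written to the database, seemingly at random.
For example, suppose I store 10 records, so the MarkLogic database should receive 10 inserts. What I get instead is that only a few of the records are written, and those are written multiple times. Can someone explain why this happens?
Here is the mapper code: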
public static class CSVMapper extends Mapper<LongWritable, Text, DocumentURI, Text> {
    static int i = 1;

    public void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        // TODO Auto-generated method stub
        ObjectMapper mapper = new ObjectMapper();
        String line = value.toString(); // line contains one line of your csv file.
        System.out.println("line value is - " + line);
        String[] singleData = line.split("\n");
        for (String lineData : singleData) {
            String[] fields = lineData.split(",");
            Sample sd = new Sample(fields[0], fields[1], fields[2].trim(), fields[3]);
            String jsonInString …