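I am trying to search under '/doc/story/content' for nodes that contain the text 'Yahoo'. My query returns the 'content' node, but I need the exact text node that contains 'Yahoo', or its parent.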
<doc>
<story>
<content id="201009281450332423">
<ul>MSW NYNES NYPG1 DILMA</ul>
<p> <k> Yahoo, made </k> it nice </p>
<p>
<author>-v-</author>
</p>
</content>
</story>
</doc>
XPath: "/doc/story/content[contains(., 'Yahoo')]"
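The predicate above tests the string value of 'content' itself, which includes all descendant text, so 'content' is what comes back. In XPath 1.0, one way to reach the text node is "/doc/story/content//text()[contains(., 'Yahoo')]", and its parent element is "/doc/story/content//*[text()[contains(., 'Yahoo')]]". The sketch below is not an XPath engine: it uses only Python's standard library to mirror the second expression by checking each descendant's own direct text nodes. The helper name `elements_with_own_text` is made up for illustration.

```python
import xml.etree.ElementTree as ET

XML = """<doc>
<story>
<content id="201009281450332423">
<ul>MSW NYNES NYPG1 DILMA</ul>
<p> <k> Yahoo, made </k> it nice </p>
<p>
<author>-v-</author>
</p>
</content>
</story>
</doc>"""


def elements_with_own_text(root, needle):
    """Return descendants of root with a direct text node containing needle.

    Hypothetical helper mirroring //*[text()[contains(., needle)]].
    """
    matches = []
    for elem in root.iter():
        if elem is root:
            continue  # descendants only, like content//*
        # In ElementTree, an element's direct text nodes are its own .text
        # plus the .tail of each of its children.
        own_text = [elem.text or ""] + [child.tail or "" for child in elem]
        if any(needle in chunk for chunk in own_text):
            matches.append(elem)
    return matches


root = ET.fromstring(XML)
content = root.find("./story/content")
hits = elements_with_own_text(content, "Yahoo")
for elem in hits:
    print(elem.tag, repr(elem.text))
```

Here only the 'k' element matches, because 'Yahoo' sits in its own text; the enclosing 'p' and 'content' contain it only through descendants.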
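I have been trying to use Scrapy (XPath) to extract the data inside a script tag from Kbb's HTML. My main problem is identifying the correct div and script tags. I am new to XPath and would greatly appreciate any help!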
<script type="text/javascript" src="http://s1.kbb.com/combine/IncentivesPilotJs/949332058"></script>
<input type="hidden" id="ResaleValueUrl" value="/ymmt/resalevalue/?vehicleid=392396" />
<input type="hidden" id="Intent" value="buy-used" />
<!--[if lt IE 9]>
<script>
window.FlashCanvasOptions = {
swfPath: "/js/canvas/FlashCanvas/UCMarketMeter/"
};
</script>
<script type="text/javascript" src="http://s1.kbb.com/combine/YmmtMarketMeterFlashCanvasJs/795892638"></script>
<![endif]-->
<script type="text/javascript" src="http://s1.kbb.com/combine/YMMTOverview/1527402533"></script>
<script type="text/javascript" src="http://s1.kbb.com/combine/YmmtPricingOverviewBuyUsedJs/-1416499456"></script>
<script language="javascript" type="text/javascript">
$(document).ready(function() {
KBB.Vehicle.Pages.PricingOverview.Buyers.setup({
//Workaround until we get cross domain working for Flash
imageDir: window.FlashCanvasOptions ? "/Content/images" : "http://file.kelleybluebookimages.com/kbb/images/marketmeter",
vehicleId: "392396",
zipCode: "78701",
mileage: "10000",
intent: "buy-used",
priceType: "retail",
condition: "good",
options: "392396|53635|78701|100|10|",
price: "17074",
manufacturer: "Nissan",
model: "Altima", …
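The inline script has no id or src, so one way to identify it is by its content. In Scrapy, an XPath such as "//script[contains(., 'KBB.Vehicle.Pages.PricingOverview.Buyers.setup')]/text()" would presumably select its body. The sketch below shows the same idea with only the standard library, so it runs without Scrapy: collect each script body, pick the one that contains the setup call, and read the simple `key: "value"` lines with a regular expression. The sample HTML is a shortened copy of the snippet above, left truncated where the original is; the class name `InlineScripts` and the regex are assumptions for illustration, not Scrapy or Kbb APIs.

```python
import re
from html.parser import HTMLParser

# Shortened copy of the page snippet; the script body is truncated
# in the original, so it is left unfinished here too.
HTML = '''<script type="text/javascript" src="http://s1.kbb.com/combine/YMMTOverview/1527402533"></script>
<script language="javascript" type="text/javascript">
$(document).ready(function() {
KBB.Vehicle.Pages.PricingOverview.Buyers.setup({
imageDir: window.FlashCanvasOptions ? "/Content/images" : "http://file.kelleybluebookimages.com/kbb/images/marketmeter",
vehicleId: "392396",
zipCode: "78701",
price: "17074",
manufacturer: "Nissan",
model: "Altima",
</script>'''


class InlineScripts(HTMLParser):
    """Hypothetical helper: collect the text body of every <script> tag."""

    def __init__(self):
        super().__init__()
        self.scripts = []
        self._in_script = False

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self._in_script = True
            self.scripts.append("")

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_script = False

    def handle_data(self, data):
        if self._in_script:
            self.scripts[-1] += data


parser = InlineScripts()
parser.feed(HTML)
parser.close()

# Identify the script by a string that only it contains.
MARKER = "KBB.Vehicle.Pages.PricingOverview.Buyers.setup"
target = next(body for body in parser.scripts if MARKER in body)

# Match lines of the form  key: "value",  and skip anything more complex
# (for example the imageDir line, whose value is an expression).
PAIR = re.compile(r'^[ \t]*(\w+):[ \t]*"([^"]*)",?[ \t]*$', re.MULTILINE)
data = dict(PAIR.findall(target))
print(data)
```

A regular expression is enough for flat string pairs like these; if the object contained nested values, a JavaScript-aware parser would be a safer choice.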