python - How can I assign a value to a null node in XPath?


This is the part I need to crawl using Scrapy XPath:

<tr class="o"><td>alabama</td><td><code>us.al</code></td><td><code>us01</code></td><td>ala.</td><td>-6~</td><td class="n">4,779,736</td><td class="n">133,916</td><td class="n">51,705</td><td>2</td><td>montgomery</td><td>alabamian</td><td>350-369</td></tr>
<tr class="e"><td>alaska</td><td><code>us.ak</code></td><td><code>us02</code></td><td></td><td>-9~</td><td class="n">710,231</td><td class="n">1,530,700</td><td class="n">591,007</td><td>6</td><td>juneau</td><td>alaskan</td><td>995-999</td></tr>

My XPath expression is:

response.xpath('//tr[@class="o" or @class="e"][2]/descendant::*').extract() 

But there is a null (empty) node in the "alaska" row: the <td> right after the <code> containing "us02" has no content. This does not happen in the "alabama" row.

When I use the expression:

response.xpath('//tr[@class="o" or @class="e"][2]/descendant::*/text()').extract() 

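to extract the text, the empty node is ignored, so the "alaska" row yields one value fewer than the "alabama" row.

But I have to comply with a fixed format. How can I set the empty node to a space?

By the way, is there a better solution for crawling this page in Scrapy?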

http://www.statoids.com/uus.html

I would be as explicit as possible here and extract the data in a "by column" fashion:

for state in response.xpath('//tr[@class="o" or @class="e"]'):
    item = State()
    item["hasc"] = state.xpath(".//td[2]/code/text()").extract()
    ...
    yield item

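where State is an item class. Note that extract() returns a list, so an empty cell gives an empty list for that field instead of shifting the other columns. Consider using an Item Loader with the TakeFirst or Join processor to have string values inside the item fields.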
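To illustrate why addressing each cell by position keeps the columns aligned, here is a minimal sketch that runs without Scrapy. It is not the Scrapy code itself: it uses the standard library's xml.etree.ElementTree as a stand-in parser, and the helper name row_cells, the placeholder parameter, and the wrapping <table> element are assumptions made for this example. In a spider the same idea would be a loop over row.xpath("./td") with the text taken per cell.

```python
import xml.etree.ElementTree as ET

# The two rows from the question, wrapped in a <table> (an assumption made
# here only so that the snippet has a single root element to parse).
HTML = (
    '<table>'
    '<tr class="o"><td>alabama</td><td><code>us.al</code></td>'
    '<td><code>us01</code></td><td>ala.</td><td>-6~</td>'
    '<td class="n">4,779,736</td><td class="n">133,916</td>'
    '<td class="n">51,705</td><td>2</td><td>montgomery</td>'
    '<td>alabamian</td><td>350-369</td></tr>'
    '<tr class="e"><td>alaska</td><td><code>us.ak</code></td>'
    '<td><code>us02</code></td><td></td><td>-9~</td>'
    '<td class="n">710,231</td><td class="n">1,530,700</td>'
    '<td class="n">591,007</td><td>6</td><td>juneau</td>'
    '<td>alaskan</td><td>995-999</td></tr>'
    '</table>'
)


def row_cells(row, placeholder=" "):
    """Return one string per <td>; empty cells become `placeholder`."""
    cells = []
    for td in row.findall("td"):
        # itertext() also picks up text nested inside <code>.
        text = "".join(td.itertext()).strip()
        cells.append(text if text else placeholder)
    return cells


root = ET.fromstring(HTML)
rows = [
    row_cells(tr)
    for tr in root.iter("tr")
    if tr.get("class") in ("o", "e")
]

for cells in rows:
    print(len(cells), cells)
```

Because every <td> produces exactly one entry, both rows end up with the same number of columns, and the empty cell in the "alaska" row becomes a single space rather than disappearing.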

