Python - Scrapy: convert HTML string to HtmlResponse object
I have a raw HTML string that I want to convert to a Scrapy HtmlResponse object so that I can use the css and xpath selectors on it, just as with Scrapy's response. How can I do it?
First of all, if this is for debugging or testing purposes, you can use the Scrapy shell:
$ cat index.html
<div id="test">
    test text
</div>
$ scrapy shell index.html
>>> response.xpath('//div[@id="test"]/text()').extract()[0].strip()
u'test text'
Several objects are available in the shell during the session, such as response and request.
Or, you can instantiate the HtmlResponse class and provide the HTML string in body:
>>> from scrapy.http import HtmlResponse
>>> response = HtmlResponse(url="my HTML string", body='<div id="test">test text</div>')
>>> response.xpath('//div[@id="test"]/text()').extract()[0].strip()
u'test text'