python - Webscrape with bad wifi: Can I make my scrape 'go online' again?


I'm doing a big Python scrape of 10,000+ webpages, and it's taking me several hours to do. If I disconnect from the internet during the process, the script stalls and doesn't reconnect when the wifi is back again.

Is there a way to insert an 'if the internet stops, pick up where you left off'?

There is a framework for building scrapers: Scrapy. It has such capabilities: it can save the execution state and resume crawling from that point (a year later, for example).

Or, if you want to build it from scratch, you need to implement saving the state of the crawler yourself. I think it is a bad idea to try to save the interpreter state; instead, you need to design your crawler in such a way that its state can be serialized. For example, Scrapy is designed this way: a crawler has methods, and one method generates the initial requests. Each request has a callback. Each callback can generate additional requests, and so on. Scrapy calls the callbacks, enqueues the requests they produce, and then calls the callbacks for those. This design lets Scrapy save the request queue to disk and resume execution from the last request(s).
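If you go the from-scratch route, here is a minimal sketch of what "serializable state" could look like, using only the standard library. It is not how Scrapy does it internally; the class name `ResumableCrawl`, the state file layout, and the `fetch`/`handle` callables are all hypothetical names made up for illustration. The idea is that the only state is a queue of pending URLs plus a set of finished ones, both written to a JSON file after every page, and that a network error leaves the URL at the front of the queue and waits before trying again.

```python
import json
import os
import time
import urllib.request
from collections import deque


class ResumableCrawl:
    """Hypothetical helper: crawl state that can be saved to disk and reloaded."""

    def __init__(self, state_path, start_urls):
        self.state_path = state_path
        if os.path.exists(state_path):
            # Resume: reload the pending queue and the finished URLs.
            with open(state_path, encoding="utf-8") as f:
                saved = json.load(f)
            self.pending = deque(saved["pending"])
            self.done = set(saved["done"])
        else:
            self.pending = deque(start_urls)
            self.done = set()

    def save(self):
        # Write to a temp file first, then swap it in, so a crash
        # in the middle of a write cannot corrupt the saved state.
        tmp = self.state_path + ".tmp"
        state = {"pending": list(self.pending), "done": sorted(self.done)}
        with open(tmp, "w", encoding="utf-8") as f:
            json.dump(state, f)
        os.replace(tmp, self.state_path)

    def run(self, fetch, handle, retry_delay=30.0, max_retries=None,
            sleep=time.sleep):
        """fetch(url) returns the page body and raises OSError when offline;
        handle(url, body) returns an iterable of new URLs to enqueue."""
        failures = 0
        while self.pending:
            url = self.pending[0]  # peek only; remove after a successful fetch
            try:
                body = fetch(url)
            except OSError:
                failures += 1
                if max_retries is not None and failures > max_retries:
                    raise
                sleep(retry_delay)  # wait for the connection to come back
                continue
            failures = 0
            self.pending.popleft()
            self.done.add(url)
            for new_url in handle(url, body):
                if new_url not in self.done and new_url not in self.pending:
                    self.pending.append(new_url)
            self.save()


def fetch_with_urllib(url):
    # urllib's URLError and socket timeouts are OSError subclasses, so a
    # dropped connection lands in the retry branch above. Note that HTTPError
    # (e.g. a 404) is also an OSError; real code may want to skip those.
    with urllib.request.urlopen(url, timeout=10) as resp:
        return resp.read()
```

You would call something like `ResumableCrawl("state.json", start_urls).run(fetch_with_urllib, handle)`, where `handle` is your own parsing function. If the script is killed, running the same line again picks up from the saved queue, because the start URLs are only used when no state file exists. The explicit `timeout` matters for the stalling problem: without one, a request on a dead connection can hang indefinitely instead of raising.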

