Please read HTTP Batch Source first to grasp the core principles of pagination, format parsing, etc.

...

So for such cases I propose to load pages for at most batchInterval seconds before returning them as an RDD, and to continue from that place on the next HttpInputDStream#compute call.
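The idea can be illustrated with a minimal, language-agnostic sketch. It is not the actual HttpInputDStream code and does not use the Spark API: the class name `ResumablePageLoader`, the `fetch_page` callback, and the injectable `clock` are all hypothetical names introduced only for this example. The `compute` method here stands in for one HttpInputDStream#compute call: it loads pages until the time budget runs out, returns what it collected, and remembers the next page so the following call resumes from there.

```python
import time
from typing import Callable, List, Optional


class ResumablePageLoader:
    """Hypothetical helper: loads pages within a time budget and
    remembers where it stopped, so the next call can resume."""

    def __init__(
        self,
        fetch_page: Callable[[int], Optional[List[str]]],
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        # fetch_page(i) is assumed to return the records of page i,
        # or None when there are no more pages.
        self._fetch_page = fetch_page
        self._clock = clock
        self._next_page = 0      # position to resume from on the next call
        self.exhausted = False   # True once the source reports no more pages

    def compute(self, batch_interval_seconds: float) -> List[str]:
        """Load pages for at most batch_interval_seconds, then return them."""
        deadline = self._clock() + batch_interval_seconds
        records: List[str] = []
        # The deadline is checked between pages, so a slow fetch that is
        # already in progress can still overrun the budget slightly.
        while not self.exhausted and self._clock() < deadline:
            page = self._fetch_page(self._next_page)
            if page is None:
                self.exhausted = True
                break
            records.extend(page)
            self._next_page += 1
        return records
```

Each call to `compute` returns only the records loaded during that call; in the real plugin these would be wrapped into the RDD for the current batch. Passing the clock in as a parameter is a choice made for the sketch so the time budget can be exercised deterministically.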

...