top | item 41696601

halifaxbeard | 1 year ago

I had to solve a similar problem in Go once, partially to scratch an itch and partially to speed things up.

The AWS S3 SDK supports chunked/parallel downloads if you can supply an io.WriterAt [1] implementation.

  type WriterAt interface {
      WriteAt(p []byte, off int64) (n int, err error)
  }

The AWS SDK ships an implementation with a Bytes() method that returns the contents, so you can download faster, but you can't stream the results.

I created a WriteAtReader that implements io.WriterAt and io.Reader. You can read up to the first byte that hasn't yet been written to the internal buffers. If there's nothing new to read but writes are still pending, Read blocks and returns as soon as the next WriteAt call lands. As bytes are consumed through Read, the WriterAtReader frees the memory associated with them.

It provides the benefits/immediate results you get with a streaming implementation and the performance of parallel download streams (at the cost of increased memory usage at the outset).

The first "chunk" can be a little slow, but the second chunk is almost ready to read by the time the first one finishes. Zoom zoom.

[1] https://pkg.go.dev/io#WriterAt
