Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

API to retrieve column blob into slice of arbitrary length #116

Open
israel-lugo opened this issue Mar 27, 2021 · 2 comments
Open

API to retrieve column blob into slice of arbitrary length #116

israel-lugo opened this issue Mar 27, 2021 · 2 comments
Labels
enhancement New feature or request

Comments

@israel-lugo
Copy link
Contributor

israel-lugo commented Mar 27, 2021

Hi,

First of all, thank you for creating this library. I've been using database/sql for my personal projects for a while now, and having to jump through hoops to get any kind of concurrency working, and living without goodies such as save points. This library seems to provide an easy and lightweight alternative that maps much better to SQLite.

I have a column that holds binary protos. These are control messages with scalar fields, so I'm not worried about unbounded growth and don't need streaming blobs. The natural choice to access them would be (*Stmt).ColumnBytes.

The problem is that the only way to use (*Stmt).ColumnBytes correctly when you don't know the size of the blob in advance seems to be by doing:

n := stmt.ColumnLen(col)
buf := make([]byte, n)
stmt.ColumnBytes(col, buf)

This is wasteful, since (*Stmt).ColumnBytes is already calling (*Stmt).ColumnLen internally. It's also rather verbose, and makes reusing a previously allocated slice somewhat cumbersome (we'd have to possibly extend the slice so its length matches n).

Would you consider changing this to be similar to e.g. crypto/hash.(Hash).Sum? I.e. instead of copying the result of (*Stmt).columnBytes into the user's slice, append it.

// ColumnBytes appends the query result into b and returns the resulting slice.
func (*Stmt) ColumnBytes(col int, b []byte) []byte

// Example 1, user gets a newly allocated slice, fine for single use:
buf := stmt.ColumnBytes(col, nil)

// Example 2, reuse a previously allocated slice in a tight loop:
var buf []byte
for {
  if hasRow, err := stmt.Step(); err != nil {
    // handle error
  } else if !hasRow {
    break
  }
  buf = stmt.ColumnBytes(col, buf[:0])
  process(buf)
}

You may not want to modify an existing API to avoid breaking users. This could be a new method, e.g. (*Stmt).ColumnBytesAppend or some better sounding name.

@AdamSLevy
Copy link
Collaborator

AdamSLevy commented Apr 8, 2021

As I understand it, the API was designed this way to avoid passing the user a slice of C memory which will become invalidated after the Stmt is reset. What you're proposing both offers a convenience and avoids calling ColumnLen twice. However the breaking API change is slightly problematic, however we are still below v1. It just needs to be announced clearly.

Really ColumnBytes behaves like io.Reader.Read. I wonder if it should be called ColumnBytesRead instead, and ColumnBytes should append as you describe. The current implementation of ColumnLen could also be improved to cache the results of the C call in a columnLen map[int]int.

I know that you said you don't need streaming blobs but if you want to avoid redundant ColumnLen calls with the current code then you could just use ColumnReader with ioutil.ReadAll (now io.ReadAll in 1.16). Memory/CPU wise this is probably technically slightly less performant than the append implementation you suggested, but completely inconsequential compared to a C call.

@AdamSLevy
Copy link
Collaborator

sqlite/sqlite.go

Lines 938 to 975 in 6c1d4ad

// ColumnBytes reads a query result into buf.
// It reports the number of bytes read.
//
// Column indices start at 0.
//
// https://www.sqlite.org/c3ref/column_blob.html
func (stmt *Stmt) ColumnBytes(col int, buf []byte) int {
return copy(buf, stmt.columnBytes(col))
}
// ColumnReader creates a byte reader for a query result column.
//
// The reader directly references C-managed memory that stops
// being valid as soon as the statement row resets.
func (stmt *Stmt) ColumnReader(col int) *bytes.Reader {
// Load the C memory directly into the Reader.
// There is no exported method that lets it escape.
return bytes.NewReader(stmt.columnBytes(col))
}
func (stmt *Stmt) columnBytes(col int) []byte {
p := C.sqlite3_column_blob(stmt.stmt, C.int(col))
if p == nil {
return nil
}
n := stmt.ColumnLen(col)
slice := struct {
data unsafe.Pointer
len int
cap int
}{
data: unsafe.Pointer(p),
len: n,
cap: n,
}
return *(*[]byte)(unsafe.Pointer(&slice))
}

@AdamSLevy AdamSLevy added the enhancement New feature or request label Nov 8, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

2 participants