API to retrieve column blob into slice of arbitrary length #116

israel-lugo · 2021-03-27T17:24:20Z

Hi,

First of all, thank you for creating this library. I've been using database/sql for my personal projects for a while now, and having to jump through hoops to get any kind of concurrency working, and living without goodies such as save points. This library seems to provide an easy and lightweight alternative that maps much better to SQLite.

I have a column that holds binary protos. These are control messages with scalar fields, so I'm not worried about unbounded growth and don't need streaming blobs. The natural choice to access them would be (*Stmt).ColumnBytes.

The problem is that the only way to use (*Stmt).ColumnBytes correctly when you don't know the size of the blob in advance seems to be by doing:

n := stmt.ColumnLen(col)
buf := make([]byte, n)
stmt.ColumnBytes(col, buf)

This is wasteful, since (*Stmt).ColumnBytes is already calling (*Stmt).ColumnLen internally. It's also rather verbose, and makes reusing a previously allocated slice somewhat cumbersome (we'd have to possibly extend the slice so its length matches n).

Would you consider changing this to be similar to e.g. crypto/hash.(Hash).Sum? I.e. instead of copying the result of (*Stmt).columnBytes into the user's slice, append it.

// ColumnBytes appends the query result into b and returns the resulting slice.
func (*Stmt) ColumnBytes(col int, b []byte) []byte

// Example 1, user gets a newly allocated slice, fine for single use:
buf := stmt.ColumnBytes(col, nil)

// Example 2, reuse a previously allocated slice in a tight loop:
var buf []byte
for {
  if hasRow, err := stmt.Step(); err != nil {
    // handle error
  } else if !hasRow {
    break
  }
  buf = stmt.ColumnBytes(col, buf[:0])
  process(buf)
}

You may not want to modify an existing API to avoid breaking users. This could be a new method, e.g. (*Stmt).ColumnBytesAppend or some better sounding name.

The text was updated successfully, but these errors were encountered:

AdamSLevy · 2021-04-08T19:21:50Z

As I understand it, the API was designed this way to avoid passing the user a slice of C memory which will become invalidated after the Stmt is reset. What you're proposing both offers a convenience and avoids calling ColumnLen twice. However the breaking API change is slightly problematic, however we are still below v1. It just needs to be announced clearly.

Really ColumnBytes behaves like io.Reader.Read. I wonder if it should be called ColumnBytesRead instead, and ColumnBytes should append as you describe. The current implementation of ColumnLen could also be improved to cache the results of the C call in a columnLen map[int]int.

I know that you said you don't need streaming blobs but if you want to avoid redundant ColumnLen calls with the current code then you could just use ColumnReader with ioutil.ReadAll (now io.ReadAll in 1.16). Memory/CPU wise this is probably technically slightly less performant than the append implementation you suggested, but completely inconsequential compared to a C call.

AdamSLevy · 2021-04-08T19:29:24Z

sqlite/sqlite.go

Lines 938 to 975 in 6c1d4ad

    
           // ColumnBytes reads a query result into buf. 
        
           // It reports the number of bytes read. 
        
           // 
        
           // Column indices start at 0. 
        
           // 
        
           // https://www.sqlite.org/c3ref/column_blob.html 
        
           func (stmt *Stmt) ColumnBytes(col int, buf []byte) int { 
        
           	return copy(buf, stmt.columnBytes(col)) 
        
           } 
        
           // ColumnReader creates a byte reader for a query result column. 
        
           // 
        
           // The reader directly references C-managed memory that stops 
        
           // being valid as soon as the statement row resets. 
        
           func (stmt *Stmt) ColumnReader(col int) *bytes.Reader { 
        
           	// Load the C memory directly into the Reader. 
        
           	// There is no exported method that lets it escape. 
        
           	return bytes.NewReader(stmt.columnBytes(col)) 
        
           } 
        
           func (stmt *Stmt) columnBytes(col int) []byte { 
        
           	p := C.sqlite3_column_blob(stmt.stmt, C.int(col)) 
        
           	if p == nil { 
        
           		return nil 
        
           	} 
        
           	n := stmt.ColumnLen(col) 
        
           	slice := struct { 
        
           		data unsafe.Pointer 
        
           		len  int 
        
           		cap  int 
        
           	}{ 
        
           		data: unsafe.Pointer(p), 
        
           		len:  n, 
        
           		cap:  n, 
        
           	} 
        
           	return *(*[]byte)(unsafe.Pointer(&slice)) 
        
           }

AdamSLevy added the enhancement New feature or request label Nov 8, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

API to retrieve column blob into slice of arbitrary length #116

API to retrieve column blob into slice of arbitrary length #116

israel-lugo commented Mar 27, 2021 •

edited

Loading

AdamSLevy commented Apr 8, 2021 •

edited

Loading

AdamSLevy commented Apr 8, 2021

API to retrieve column blob into slice of arbitrary length #116

API to retrieve column blob into slice of arbitrary length #116

Comments

israel-lugo commented Mar 27, 2021 • edited Loading

AdamSLevy commented Apr 8, 2021 • edited Loading

AdamSLevy commented Apr 8, 2021

israel-lugo commented Mar 27, 2021 •

edited

Loading

AdamSLevy commented Apr 8, 2021 •

edited

Loading