Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

issue with searching many key values with v1.4.0 #1

Closed
guycipher opened this issue Oct 14, 2024 · 6 comments
Closed

issue with searching many key values with v1.4.0 #1

guycipher opened this issue Oct 14, 2024 · 6 comments
Assignees
Labels
good first issue Good for newcomers

Comments

@guycipher
Copy link
Owner

guycipher commented Oct 14, 2024

Previous versions brought all sstable keys into memory, then did a binary search which made Get faster. With the implementation of a pager there seems to be an odd case where inserting many keys doesn't get a result. I believe it to be the sstable iterator. This obviously needs to be investigated and corrected.

package main

import (
	"fmt"
	"github.com/guycipher/lsmt"
	"log"
	"time"
)

func main() {
	l, err := lsmt.New("my_lsm_tree", 0755, 100000, 2, 10)
	if err != nil {
		log.Fatal(err)
	}

	if l == nil {
		log.Fatal("expected non-nil lmst")
	}

	defer l.Close()

	t := time.Now()

	// Insert 1000000 key-value pairs
	for i := 0; i < 1000000; i++ {
		err = l.Put([]byte(string(fmt.Sprintf("%d", i))), []byte(string(fmt.Sprintf("%d", i))))
		if err != nil {
			log.Fatal(err)
		}
	}

	log.Println("Write 1000000 elements in", time.Since(t))

	t = time.Now()

	// Get a key
	value, err := l.Get([]byte("834332"))
	if err != nil {
		log.Fatal(err)
	}

	if string(value) != "834332" {
		log.Fatalf("expected 834332, got %s", string(value))
	}

	log.Println("Get key 834332 in", time.Since(t))

}

@guycipher guycipher added the good first issue Good for newcomers label Oct 14, 2024
@guycipher guycipher pinned this issue Oct 14, 2024
@harshgsharma1501
Copy link

assign it to me @guycipher

@guycipher
Copy link
Owner Author

No problem

@harshgsharma1501
Copy link

can u please describe the issue more

@guycipher
Copy link
Owner Author

guycipher commented Oct 14, 2024

Yes, when we insert 1 million key values as in above example. Then we search we cannot find key 834332 which should have been inserted in the previous insert operations. Flushing whats in-memory every 100000 keys. The Get() would either rely on the memtable or searching each sstable sequentially starting from the latest one. The issue lies somewhere there. You wont see the issue when inserting a small amount of key value pairs, only when inserting many. Set your options similar to mine:

l, err := lsmt.New("my_lsm_tree", 0755, 100000, 2, 10)

Then insert 1,000,000 key value pairs

// Insert 1000000 key-value pairs
	for i := 0; i < 1000000; i++ {
		err = l.Put([]byte(string(fmt.Sprintf("%d", i))), []byte(string(fmt.Sprintf("%d", i))))
		if err != nil {
			log.Fatal(err)
		}
	}

Then try to get key 834,332

// Get a key
	value, err := l.Get([]byte("834332"))
	if err != nil {
		log.Fatal(err)
	}

	if string(value) != "834332" {
		log.Fatalf("expected 834332, got %s", string(value))
	}

Currently we get:

2024/10/14 04:20:28 Write 1000000 elements in 8.544535782s
2024/10/14 04:20:28 key not found

@guycipher
Copy link
Owner Author

I've done some minor changes and put a new test in the lsmt_test.go for what I'm talking about. The changes I made were trying to fix the issue but it didn't work out :P

@guycipher
Copy link
Owner Author

I found the issue. Coming next version in a couple minutes. It was the pager and not initializing the pager within the in-memory sstable.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
good first issue Good for newcomers
Projects
None yet
Development

No branches or pull requests

2 participants