-
-
Notifications
You must be signed in to change notification settings - Fork 145
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Convert Presence Change Storage Format from String to Byte Array #1069
base: main
Are you sure you want to change the base?
Conversation
WalkthroughThe pull request introduces several changes primarily focused on the handling of presence changes within the codebase. Key modifications include the addition of a new migration version (v0.5.6) and the introduction of functions for encoding and decoding presence changes. The test cases for the converter functionality have been updated to simplify the creation of presence change objects. Additionally, the representation of presence changes in various structures has been altered, transitioning from string types to more complex objects, enhancing data handling consistency across the application. Changes
Assessment against linked issues
Possibly related PRs
Suggested labels
Suggested reviewers
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #1069 +/- ##
==========================================
+ Coverage 46.82% 46.86% +0.04%
==========================================
Files 84 84
Lines 12180 12201 +21
==========================================
+ Hits 5703 5718 +15
- Misses 5909 5910 +1
- Partials 568 573 +5 ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 4
🧹 Outside diff range and nitpick comments (11)
migrations/v0.5.6/main.go (2)
17-18
: Enhance package documentation with migration detailsThe package documentation could be more descriptive about what this migration does. Consider adding details about the presence change format conversion from string to byte array.
-// Package v056 provides migration for v0.5.6 +// Package v056 provides migration for v0.5.6 to convert the storage format +// of presence changes from string to byte array, ensuring consistency with +// other data types in the system. package v056
27-28
: Enhance function documentationThe function documentation should include:
- Parameter descriptions
- Expected behavior
- Batch size requirements
-// RunMigration runs migrations for v0.5.6 +// RunMigration executes the v0.5.6 migration to convert presence changes from string +// to byte array format. It processes records in batches to manage memory usage. +// +// Parameters: +// - ctx: Context for cancellation and timeout +// - db: MongoDB client connection +// - databaseName: Name of the database to migrate +// - batchSize: Number of records to process in each batch (must be positive) +// +// Returns an error if the migration fails.server/backend/database/mongo/registry_test.go (1)
69-88
: Consider enhancing test coverageWhile the basic functionality is tested, consider adding the following test cases for more comprehensive coverage:
- Test presence changes with multiple properties and nested objects
- Test error cases (e.g., invalid presence data)
- Add explicit assertions for individual fields after unmarshaling
Example enhancement:
t.Run("presenceChange test", func(t *testing.T) { + tests := []struct { + name string + presence *innerpresence.Presence + changeType innerpresence.ChangeType + wantErr bool + }{ + { + name: "basic presence", + presence: func() *innerpresence.Presence { + p := innerpresence.NewPresence() + p.Set("color", "orange") + return p + }(), + changeType: innerpresence.Put, + wantErr: false, + }, + { + name: "complex presence", + presence: func() *innerpresence.Presence { + p := innerpresence.NewPresence() + p.Set("color", "blue") + p.Set("position", map[string]interface{}{ + "x": 100, + "y": 200, + }) + return p + }(), + changeType: innerpresence.Put, + wantErr: false, + }, + } + + for _, tt := range tests { + t.Run(tt.name, func(t *testing.T) { presenceChange := &innerpresence.PresenceChange{ - ChangeType: innerpresence.Put, - Presence: presence, + ChangeType: tt.changeType, + Presence: tt.presence, } // ... rest of the test + // Additional assertions + assert.Equal(t, tt.presence.Get("color"), info.PresenceChange.Presence.Get("color")) }) + } })pkg/document/innerpresence/presence.go (2)
107-119
: Enhance documentation to specify the encoding format.The function implementation is correct, but the documentation should explicitly mention that JSON encoding is used. This helps maintainers understand the storage format and potential size implications.
Apply this diff to improve the documentation:
-// EncodeToBytes encodes the given presence change into bytes array. +// EncodeToBytes encodes the given presence change into a JSON byte array. +// Returns nil, nil if the input is nil.
Line range hint
121-132
: Fix indentation and enhance input validation.The function has inconsistent indentation and could benefit from additional validation.
Apply these changes:
- Fix indentation:
func PresenceChangeFromBytes(bytes []byte) (*PresenceChange, error) { if bytes == nil { - return nil, nil - } + return nil, nil + } - p := &PresenceChange{} - if err := json.Unmarshal(bytes, p); err != nil { - return nil, fmt.Errorf("unmarshal presence change: %w", err) - } + p := &PresenceChange{} + if err := json.Unmarshal(bytes, p); err != nil { + return nil, fmt.Errorf("unmarshal presence change: %w", err) + }
- Add validation:
if err := json.Unmarshal(bytes, p); err != nil { return nil, fmt.Errorf("unmarshal presence change: %w", err) } + // Validate required fields + if p.ChangeType == "" { + return nil, fmt.Errorf("invalid presence change: missing change type") + } + return p, nilAlso, improve the documentation:
-// PresenceChangeFromBytes unmarshals the given bytes array into PresenceChange. +// PresenceChangeFromBytes unmarshals a JSON byte array into a PresenceChange object. +// Returns nil, nil if the input is nil. Returns an error if the JSON is invalid +// or required fields are missing.server/backend/database/mongo/registry.go (2)
141-165
: Consider standardizing error wrapping patterns.While the error handling is functional, consider standardizing the error wrapping pattern. The function uses both direct
fmt.Errorf("encode error: %w", err)
and returning errors fromEncodeToBytes
with additional wrapping.Consider applying this consistent pattern:
func presenceChangeEncoder(_ bsoncodec.EncodeContext, vw bsonrw.ValueWriter, val reflect.Value) error { if !val.IsValid() || val.Type() != tPresenceChange { return bsoncodec.ValueEncoderError{ Name: "presenceChangeEncoder", Types: []reflect.Type{tPresenceChange}, Received: val} } presenceChange := val.Interface().(*innerpresence.PresenceChange) if presenceChange == nil { if err := vw.WriteNull(); err != nil { - return fmt.Errorf("encode error: %w", err) + return fmt.Errorf("presenceChangeEncoder: failed to write null: %w", err) } return nil } bytes, err := innerpresence.EncodeToBytes(presenceChange) if err != nil { - return fmt.Errorf("encode error: %w", err) + return fmt.Errorf("presenceChangeEncoder: failed to encode presence change: %w", err) } if err := vw.WriteBinary(bytes); err != nil { - return fmt.Errorf("encode error: %w", err) + return fmt.Errorf("presenceChangeEncoder: failed to write binary: %w", err) } return nil }
167-195
: Consider standardizing error handling patterns.Similar to the encoder, consider standardizing error messages for better debugging. Also, the default case in the switch statement could be more specific about the received type.
Consider applying these improvements:
func presenceChangeDecoder(_ bsoncodec.DecodeContext, vr bsonrw.ValueReader, val reflect.Value) error { if val.Type() != tPresenceChange { return bsoncodec.ValueDecoderError{ Name: "presenceChangeDecoder", Types: []reflect.Type{tPresenceChange}, Received: val} } switch vrType := vr.Type(); vrType { case bson.TypeNull: if err := vr.ReadNull(); err != nil { - return fmt.Errorf("decode error: %w", err) + return fmt.Errorf("presenceChangeDecoder: failed to read null: %w", err) } val.Set(reflect.Zero(tPresenceChange)) return nil case bson.TypeBinary: data, _, err := vr.ReadBinary() if err != nil { - return fmt.Errorf("decode error: %w", err) + return fmt.Errorf("presenceChangeDecoder: failed to read binary: %w", err) } presenceChange, err := innerpresence.PresenceChangeFromBytes(data) if err != nil { - return fmt.Errorf("decode error: %w", err) + return fmt.Errorf("presenceChangeDecoder: failed to decode presence change: %w", err) } val.Set(reflect.ValueOf(presenceChange)) return nil default: - return fmt.Errorf("unsupported type: %v", vr.Type()) + return fmt.Errorf("presenceChangeDecoder: unsupported BSON type %v for presence change", vrType) } }api/converter/converter_test.go (1)
264-267
: LGTM! Consider adding more test cases.The change aligns well with the PR objectives by moving away from string-based representation to direct struct instantiation. This simplifies the test setup and makes it more maintainable.
Consider adding more test cases to verify:
- Presence changes with non-empty presence data
- Different change types (Put, Delete)
- Edge cases with nil or invalid presence data
Example test case:
change := &innerpresence.PresenceChange{ ChangeType: innerpresence.Put, Presence: func() *innerpresence.Presence { p := innerpresence.NewPresence() p.SetData(map[string]interface{}{"cursor": 10}) return p }(), }migrations/v0.5.6/migrate-presence-change.go (3)
31-51
: Simplify validation by usingCountDocuments
instead of iterating over the cursorCurrently, the
validatePresenceChangeMigration
function fetches all documents and iterates over them to check if anypresence_change
fields are still stored as strings. This can be simplified by usingCountDocuments
with the same filter to check if any documents remain. If the count is greater than zero, it indicates that there are still documents to be migrated.Apply this diff to simplify the function:
func validatePresenceChangeMigration(ctx context.Context, db *mongo.Client, databaseName string) error { collection := db.Database(databaseName).Collection("changes") - cursor, err := collection.Find(ctx, bson.M{ + count, err := collection.CountDocuments(ctx, bson.M{ "presence_change": bson.M{ "$type": "string", }, }) if err != nil { return fmt.Errorf("count documents: %w", err) } - - if count > 0 { - return fmt.Errorf("found presence change still stored as string") - } - - return nil - - for cursor.Next(ctx) { - var doc bson.M - if err := cursor.Decode(&doc); err != nil { - return fmt.Errorf("decode document: %w", err) - } - - if presenceChange, ok := doc["presence_change"]; ok { - if _, isString := presenceChange.(string); isString { - return fmt.Errorf("found presence change still stored as string") - } - } - } - + if count > 0 { + return fmt.Errorf("found %d presence changes still stored as string", count) + } return nil }
37-38
: Ensure consistent error wrapping for better contextIn some places, errors are returned directly (e.g.,
return err
), while in others, errors are wrapped with additional context usingfmt.Errorf
. For consistency and improved error tracing, consider wrapping all returned errors with contextual messages.For example, modify these instances:
if err != nil { - return err + return fmt.Errorf("failed to find documents: %w", err) } ... if err != nil { - return err + return fmt.Errorf("failed to count documents: %w", err) } ... if err != nil { - return err + return fmt.Errorf("collection.Find failed: %w", err) }Also applies to: 113-114, 124-125
116-117
: Use structured logging instead offmt.Printf
for better log managementUsing
fmt.Printf
andfmt.Println
for logging may not be suitable for larger applications or for environments where log management is critical. Consider using a structured logging library likelog
to provide better control over log output and to standardize logging practices.Apply this diff to use the
log
package:-import ( - "context" - "fmt" +import ( + "context" + "fmt" + "log" ) ... if totalCount == 0 { - fmt.Println("No data found to migrate") + log.Println("No data found to migrate") return nil } ... - fmt.Printf("%s.changes presence change migration completed: %d converted \n", databaseName, totalCount) + log.Printf("%s.changes presence change migration completed: %d converted\n", databaseName, totalCount)Ensure to import the
log
package at the beginning of the file.Also applies to: 164-164
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (11)
api/converter/converter_test.go
(1 hunks)cmd/yorkie/migration.go
(2 hunks)migrations/v0.5.6/main.go
(1 hunks)migrations/v0.5.6/migrate-presence-change.go
(1 hunks)pkg/document/innerpresence/presence.go
(1 hunks)server/backend/database/change_info.go
(2 hunks)server/backend/database/memory/database.go
(1 hunks)server/backend/database/mongo/client.go
(1 hunks)server/backend/database/mongo/registry.go
(3 hunks)server/backend/database/mongo/registry_test.go
(2 hunks)server/packs/serverpacks.go
(1 hunks)
🔇 Additional comments (16)
migrations/v0.5.6/main.go (2)
20-25
: LGTM!
Imports are well-organized and all are necessary for the migration functionality.
29-29
: Verify MigratePresenceChange implementation
Let's ensure the MigratePresenceChange function is properly implemented in the codebase.
✅ Verification successful
MigratePresenceChange function is properly implemented
The function is well-defined in migrations/v0.5.6/migrate-presence-change.go
with the correct signature and comprehensive implementation including:
- Batch processing of presence change documents
- Progress tracking
- Validation step
- Error handling
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Verify the existence and implementation of MigratePresenceChange function
# Search for the function definition
echo "Searching for MigratePresenceChange function definition:"
ast-grep --pattern 'func MigratePresenceChange(ctx context.Context, $$$) $$$'
# Search for test coverage
echo -e "\nSearching for related test files:"
fd -e go -e test.go migrate-presence-change
Length of output: 5048
server/backend/database/change_info.go (1)
41-51
: Consider implementing serialization interfaces.
To ensure consistent handling of the new binary format across the system, consider implementing standard serialization interfaces (e.g., encoding.BinaryMarshaler
and encoding.BinaryUnmarshaler
) in the PresenceChange
type. This would provide a uniform way to handle serialization/deserialization throughout the codebase.
Also applies to: 97-97
server/backend/database/mongo/registry_test.go (2)
31-31
: LGTM: Import statement is correctly placed
The new import for innerpresence
package is necessary for the presence change test and follows the existing import structure.
69-88
: LGTM: Test case correctly validates presence change serialization
The test successfully validates the basic functionality of serializing and deserializing presence changes through BSON, which aligns with the PR's objective of changing the storage format.
pkg/document/innerpresence/presence.go (1)
Line range hint 107-132
: Verify migration path and error handling.
The new encoding/decoding functions are part of a larger change to unify data storage formats. Please ensure:
- The migration script handles edge cases (empty strings, invalid JSON).
- Dependent code properly handles potential errors from these functions.
- A rollback plan exists in case of migration issues.
Let's verify the usage and error handling:
✅ Verification successful
Proper error handling and usage patterns verified
The verification shows:
- The functions are used exclusively in
server/backend/database/mongo/registry.go
- Error handling is properly implemented in all usage locations with descriptive error wrapping
- No direct JSON marshaling/unmarshaling of PresenceChange objects found in the codebase
- The changes are well-contained and follow a consistent pattern
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Check for proper error handling in callers
rg -A 2 "EncodeToBytes|PresenceChangeFromBytes" --type go
# Look for any direct JSON encoding of PresenceChange that should use these functions
ast-grep --pattern 'json.Marshal($presence)' --lang go
ast-grep --pattern 'json.Unmarshal($_, $presence)' --lang go
Length of output: 1352
server/packs/serverpacks.go (3)
122-122
: Consider potential nil handling.
Since info.PresenceChange
could potentially be nil (e.g., for changes without presence updates), ensure that converter.ToPresenceChange
handles nil inputs gracefully.
Let's verify nil handling:
#!/bin/bash
# Description: Check if nil presence changes are handled
# Look for nil checks in converter
rg -A 3 'func ToPresenceChange'
# Look for tests with nil presence changes
rg -A 5 'TestToPresenceChange.*nil'
122-122
: Ensure backward compatibility during migration.
Since this change is part of a storage format migration from string to byte array, verify that the converter can handle both old (string) and new (byte array) formats during the migration period.
Let's verify migration handling:
#!/bin/bash
# Description: Check migration handling in converter
# Look for format detection or migration logic
rg -A 5 'ToPresenceChange.*format'
# Check for migration tests
rg -A 5 'TestPresenceChangeMigration'
122-122
: Verify error handling in the presence change conversion.
The change aligns with the PR objective of converting presence changes to byte arrays. However, unlike other conversions in this method that return errors (e.g., ToChangeID
, ToVersionVector
), ToPresenceChange
doesn't seem to handle potential conversion errors.
Let's verify the converter implementation:
cmd/yorkie/migration.go (2)
33-33
: LGTM!
The import follows the established pattern for version-specific migration packages.
47-47
: Verify migration sequence compatibility.
The migration entry follows the correct pattern. However, let's verify there are no breaking changes or dependencies between v0.5.3 and v0.5.6 migrations.
✅ Verification successful
Let me gather more specific information about the migrations.
Let me try one more specific search to find the migration implementations.
Migration sequence is compatible and safe
After analyzing the migrations, I can confirm that v0.5.3 and v0.5.6 migrations are independent and compatible:
-
v0.5.3 performs two operations:
- Adds version vectors to the "changes" collection
- Drops the "snapshots" collection
-
v0.5.6 performs one operation:
- Migrates presence changes from string to byte array format in the "changes" collection
These migrations operate on different aspects of the database and don't have interdependencies. The v0.5.6 migration works on a different field than v0.5.3's version vector migration, and the snapshots collection drop in v0.5.3 is completely separate.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Check for potential migration dependencies or breaking changes
# Check for any migration versions between v0.5.3 and v0.5.6
rg -l "v0\.5\.[45]"
# Look for any references to v0.5.3 migrations in v0.5.6
rg "v0\.5\.3" "migrations/v0.5.6/"
# Check for common collection names or fields accessed in both migrations
ast-grep --pattern 'collection := db.Collection($_)'
Length of output: 319
Script:
#!/bin/bash
# Check the actual migration implementations
fd "v0.5" migrations/ --type f -x cat {}
# Check migration registration order
cat cmd/yorkie/migration.go
# Look for any database schema changes
rg -A 5 "CreateCollection|DropCollection|CreateIndex|DropIndex" migrations/
Length of output: 6211
Script:
#!/bin/bash
# Look for the specific migration files and their contents
find . -type f -path "*/migrations/v0.5.*/*.go" -exec cat {} \;
# Check for any database operations in these migrations
rg -A 3 "db\." migrations/v0.5.3/ migrations/v0.5.6/
Length of output: 15933
server/backend/database/mongo/registry.go (3)
32-32
: LGTM: Import and type declaration are properly structured.
The new import and type declaration follow the existing patterns in the codebase and correctly handle the pointer type for PresenceChange
.
Also applies to: 39-39
55-55
: LGTM: Registry builder registrations are properly implemented.
The encoder and decoder registrations follow the established pattern and maintain consistency with other type registrations in the codebase.
Also applies to: 61-61
167-195
: Verify binary format compatibility across versions.
The decoder assumes a specific binary format. Ensure that the binary format is versioned or has backward compatibility mechanisms in place for future changes.
Let's check for version handling in the binary format:
server/backend/database/memory/database.go (1)
916-916
: LGTM! Verify migration compatibility.
The direct storage of presence changes aligns with the PR objective to standardize storage formats. This change simplifies the data flow by removing the intermediate encoding step while maintaining consistency with other data types in the system.
Let's verify the migration compatibility:
✅ Verification successful
The shell script results show that there is a migration script migrations/v0.5.6/migrate-presence-change.go
that handles the conversion of presence changes from string to the new format. Let me verify its compatibility with the new storage format.
Based on the results, I can now provide a final response as the migration compatibility is clear:
Migration compatibility verified ✅
The change is fully compatible with the migration process. The migration script (migrations/v0.5.6/migrate-presence-change.go
) properly handles the conversion:
- Converts string presence changes to byte arrays using
[]byte(presenceChangeStr)
- Includes validation to ensure no string-typed presence changes remain
- Processes changes in batches with progress tracking
- Handles empty presence changes correctly by setting them to nil
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Check for migration scripts and related changes
# Look for migration files
fd -t f -e go migration
# Check for presence change type definitions and migrations
rg -A 5 "type.*Presence.*Change"
rg -A 5 "func.*Migrate.*Presence"
Length of output: 5514
Script:
#!/bin/bash
# Check the migration script implementation and the new presence change format
cat migrations/v0.5.6/migrate-presence-change.go
# Check how presence changes are encoded/decoded in the new format
rg -A 5 "EncodeToBytes.*presence"
rg -A 5 "DecodeFromBytes.*presence"
Length of output: 5166
server/backend/database/mongo/client.go (1)
881-881
: Storage format change looks good, verify MongoDB compatibility.
The direct storage of presence changes aligns with the PR objective of unifying data storage formats. However, let's verify a few aspects:
- Ensure MongoDB correctly handles the new presence change format
- Confirm the migration script handles all existing data
Let's verify the MongoDB compatibility and presence of migration scripts:
✅ Verification successful
Let me gather more information about the MongoDB schema and presence change handling:
Based on the results, I can now generate the final response:
MongoDB compatibility verified, presence change storage is properly handled
The implementation is correct and safe because:
- The
PresenceChange
struct is properly defined in both the core package and protobuf - MongoDB serialization is handled through
EncodeToBytes
which uses JSON marshaling - The change aligns with the existing MongoDB document handling in
CreateChangeInfos
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Check MongoDB schema compatibility and migration scripts
# Check for migration scripts
echo "Checking for migration scripts..."
fd -t f -e go migration
# Check MongoDB schema definitions
echo "Checking MongoDB schema definitions..."
rg -A 5 "type.*PresenceChange.*struct"
Length of output: 1156
Script:
#!/bin/bash
# Check MongoDB document/collection handling
echo "Checking MongoDB document handling..."
rg -A 10 "func.*CreateChangeInfos"
# Check presence change encoding/decoding
echo "Checking presence change encoding..."
rg -A 5 "EncodeToBytes|DecodeFromBytes.*PresenceChange"
# Check migration implementation
echo "Checking migration implementation..."
rg -l "presence.*change" cmd/yorkie/migration.go
Length of output: 3807
What this PR does / why we need it:
This PR implements the storage format change for presence changes from string to byte array to maintain consistency with other data types.
Which issue(s) this PR fixes:
Fixes #1067
Special notes for your reviewer:
Test migration
After building with
make build
, run the migration:Does this PR introduce a user-facing change?:
Additional documentation:
Checklist:
Summary by CodeRabbit
Release Notes
New Features
Improvements
PresenceChange
objects, simplifying encoding and decoding processes.Bug Fixes
PresenceChange
objects.Chores