Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reevaluate use of Firestore Database / Backend Redesign #307

Open
willhuff0 opened this issue Nov 4, 2024 · 1 comment
Open

Reevaluate use of Firestore Database / Backend Redesign #307

willhuff0 opened this issue Nov 4, 2024 · 1 comment

Comments

@willhuff0
Copy link
Contributor

willhuff0 commented Nov 4, 2024

Problem

Firestore

Since message don't need to persist for future retrieval, I propose that we safely eliminate Firestore form the workflow for messages (keeping Firestore only for storing user data). In the current architecture, messages take the following path: client1 -> server -> Firebase -> server -> client2. Removing the Firebase step will reduce latency: client1 -> server -> client2.

Regions

Additionally, I would like to propose methods to optimize distance checking over large regions. Currently, 'isWithinRadiusMeter' (distance check) is called once for each active user for every message sent on the platform. This could potentially lead to executing thousands of unnecessary distance calculations depending on our active user count. For example, if someone in Florida sends a message, it will be distance checked against some active user in California. Situations like this can be avoided.

Potential Solutions

Grid of Squares

For some view distance r (where r is small enough so that curvature of the earth is not a factor, assume 100 meters for this example), divide the world into a grid of squares of side length r. Then, when a message is sent, calculate which grid cell it came from and run the distance check for all users in that and the surrounding 8 cells. This solution compartmentalizes, reducing unnecessary distance checks over long distances. This is the solution I choose when developing my own app, which shares a similar concept to this project.

In the graphic, assume a message was just sent, the world is divided into cells, the green dots are users, and the red circle shows the radius at which the message can be viewed (view distance).

  • User A: nothing happens (not in surrounding cells)
  • User B: distance check, message not received (not in view distance)
  • User C: distance check, message received

Square_Grid_Visualization

Time per message (server): u * d
u => user count in the 9 regions overlapping with the view circle of the message
d => duration of distance check calculation

Time per user location update: 0

Pros: probably easier to implement, more accurate
Cons: may be more computationally intense

Grid of Circles

Divide the world into overlapping circles of radius r (view distance). When a user's location changes, assign them to the nearest grid circle. When a message is received, forward it to all users assigned to any of the grid circles which contain the message point. There are potential problems with this design such as messages being missed even when they are within range. This error can be improved by decreasing the distance between circles (grid size), with the tradeoff being more distance calculations per message.

In the graphic, assume a message was just sent, the world is divided into overlapping circles, the green dots are users, the arrows point to the grid circle the user is assigned to (closest), and the red circle shows the radius at which the message can be viewed (view distance).

  • User A: nothing happens (not assigned to an overlapping cell)
  • User B: nothing happens (not assigned to an overlapping cell)
  • User C: message received

Circle_Grid_Visualization

Time per message (server): c * d
c => number of overlapping circles (1 to 4, depending on grid size)
d => duration of distance check calculation

Time per user location update: d
d => duration of distance check calculation

Pros: less (or less dependent on user count) computational intensity
Cons: may be less accurate

@willhuff0
Copy link
Contributor Author

I think we've settled on the grid of squares based approach, but modified to use geohashes instead of lat lon regions. I'll begin working on this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant