Here is an interesting article, *The Hunt for the Missing Data Type*, that talks about the challenges in developing a graph library. A few quotes:
> [G]raph algorithm performance is a serious consideration and the special cases really matter. Kelly raised the example of maximum weight matching. If you know that your graph is "bipartite", you can use a particular fast algorithm to find a matching, while for other graphs you need to use a slow, more general algorithm.

This is exactly what the `Guarantee` trait and "restricted encapsulations" on top of the storage try to help with.
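A minimal sketch of the idea in Rust (illustrative names, not the library's actual API): a marker trait encodes a structural guarantee, and an algorithm can require that guarantee to unlock the specialized fast path at compile time.

```rust
/// Minimal core interface, just enough for the sketch.
trait Graph {
    fn node_count(&self) -> usize;
}

/// Marker trait: implementors promise the graph is bipartite.
/// (Hypothetical name; it stands in for a guarantee mechanism.)
trait Bipartite: Graph {}

/// A storage that is bipartite by construction.
struct TwoColoredGraph {
    left: usize,
    right: usize,
}

impl Graph for TwoColoredGraph {
    fn node_count(&self) -> usize {
        self.left + self.right
    }
}

// Bipartite by construction, so it can opt in to the guarantee.
impl Bipartite for TwoColoredGraph {}

/// Only callable on graphs that guarantee bipartiteness, so a fast
/// matching algorithm (e.g. Hopcroft–Karp) could be used safely here.
fn maximum_matching<G: Bipartite>(g: &G) -> usize {
    // ... the specialized algorithm would go here ...
    g.node_count() / 2
}
```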
> Different graph operations have different performance characteristics on different representations. Take a directed graph with 100 nodes and 200 edges. If we use an adjacency matrix representation, we need a 100×100 matrix containing 200 ones and 9,800 zeros. If we instead use an edge list we need only 200 pairs of nodes. Depending on your PL and level of optimizations that could be a memory difference of 20x or more.
>
> Now instead take a graph with 100 nodes and 8,000 edges and try to find whether an edge exists between node 0 and node 93. In the matrix representation, that’s an O(1) lookup on graph[0][93]. In the edge list representation, that’s an O(|edge|) iteration through all 8,000 edges.
That is why it is possible to choose the exact representation for storing the graph while the algorithms remain storage-agnostic. It is definitely true, though, that a generic interface required of all representations is limiting, and some features at which specific representations excel might not be generalizable.
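For concreteness, here is a minimal sketch (hypothetical types, not the library's interface) of two storages behind a single trait: the same `has_edge` query costs O(1) on the matrix and O(|E|) on the edge list, while callers stay storage-agnostic.

```rust
trait EdgeSet {
    fn has_edge(&self, u: usize, v: usize) -> bool;
}

/// Dense |V|×|V| table: O(|V|^2) memory, O(1) edge lookup.
struct AdjMatrix {
    n: usize,
    bits: Vec<bool>, // row-major, length n * n
}

impl EdgeSet for AdjMatrix {
    fn has_edge(&self, u: usize, v: usize) -> bool {
        self.bits[u * self.n + v]
    }
}

/// Plain list of (u, v) pairs: O(|E|) memory, O(|E|) edge lookup.
struct EdgeList {
    edges: Vec<(usize, usize)>,
}

impl EdgeSet for EdgeList {
    fn has_edge(&self, u: usize, v: usize) -> bool {
        self.edges.iter().any(|&(a, b)| a == u && b == v)
    }
}

/// Storage-agnostic caller: works with either representation.
fn connected_directly<G: EdgeSet>(g: &G, u: usize, v: usize) -> bool {
    g.has_edge(u, v)
}
```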
> Supporting many representations has a serious downside: you have to do a lot more work to add algorithms. If you write a separate version of the algorithm for each graph representation, you’re tripling or quadrupling the maintenance burden.
I believe the impact of this problem is reduced by implementing only a set of core features for each storage and building the higher-level operations on top of them in a generic manner, but...
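Something like the following sketch (names are illustrative): each storage implements only a small core, and the higher-level operations are written once against that core, so adding a new representation does not multiply the algorithm count.

```rust
/// The small per-storage core.
trait Core {
    fn node_count(&self) -> usize;
    fn neighbors(&self, v: usize) -> Box<dyn Iterator<Item = usize> + '_>;
}

/// One concrete storage implementing only the core.
struct AdjList {
    adj: Vec<Vec<usize>>,
}

impl Core for AdjList {
    fn node_count(&self) -> usize {
        self.adj.len()
    }
    fn neighbors(&self, v: usize) -> Box<dyn Iterator<Item = usize> + '_> {
        Box::new(self.adj[v].iter().copied())
    }
}

/// High-level operations written once, for every storage.
fn out_degree<G: Core>(g: &G, v: usize) -> usize {
    g.neighbors(v).count()
}

fn is_isolated<G: Core>(g: &G, v: usize) -> bool {
    out_degree(g, v) == 0
}
```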
> If you instead write a generic abstraction over polymorphic types, then your library is less performant.
I suppose this is true. It would be interesting to create some benchmarks to quantify this claim.
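A rough shape such a benchmark could take (a toy harness using `std::time::Instant`; for trustworthy numbers one would reach for a harness like `criterion` and guard against the optimizer): the same lookup driven once through static generics and once through a trait object, to measure the cost of dynamic dispatch.

```rust
use std::time::Instant;

trait EdgeSet {
    fn has_edge(&self, u: usize, v: usize) -> bool;
}

struct AdjMatrix {
    n: usize,
    bits: Vec<bool>,
}

impl EdgeSet for AdjMatrix {
    fn has_edge(&self, u: usize, v: usize) -> bool {
        self.bits[u * self.n + v]
    }
}

/// Static dispatch: monomorphized, calls can be inlined.
fn count_static<G: EdgeSet>(g: &G, n: usize) -> usize {
    let mut hits = 0;
    for u in 0..n {
        for v in 0..n {
            if g.has_edge(u, v) {
                hits += 1;
            }
        }
    }
    hits
}

/// Dynamic dispatch: every call goes through the vtable.
fn count_dyn(g: &dyn EdgeSet, n: usize) -> usize {
    let mut hits = 0;
    for u in 0..n {
        for v in 0..n {
            if g.has_edge(u, v) {
                hits += 1;
            }
        }
    }
    hits
}

fn main() {
    let n = 1_000;
    let g = AdjMatrix { n, bits: vec![true; n * n] };

    let t = Instant::now();
    let a = count_static(&g, n);
    println!("static:  {a} hits in {:?}", t.elapsed());

    let t = Instant::now();
    let b = count_dyn(&g, n);
    println!("dynamic: {b} hits in {:?}", t.elapsed());
}
```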
> [W]hat if the graph is simply too big to work with? He gave the example of finding a solution to the 15 puzzle. This is done by running an A* search on the state space. A state space with over 20 trillion states.
That is why we aim for convenient support for implicit graphs... or for using a custom graph storage, as long as the common interface is implemented for it.
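As an illustration of what an implicit graph buys you here (a sketch, not the library's API): the 15-puzzle state space is never materialized; successor states are generated on demand by sliding a tile into the blank, so an A* search can walk a graph of roughly 20 trillion states without ever storing it. The `Hash`/`Eq` derives are what a search would use for its visited set.

```rust
#[derive(Clone, PartialEq, Eq, Hash)]
struct State {
    tiles: [u8; 16], // row-major 4x4 board, 0 marks the blank
}

impl State {
    /// Successor states: every legal single-tile slide into the blank.
    /// This *is* the graph; no edges are stored anywhere.
    fn neighbors(&self) -> Vec<State> {
        let blank = self.tiles.iter().position(|&t| t == 0).unwrap();
        let (r, c) = ((blank / 4) as i32, (blank % 4) as i32);
        let mut out = Vec::new();
        for (dr, dc) in [(-1, 0), (1, 0), (0, -1), (0, 1)] {
            let (nr, nc) = (r + dr, c + dc);
            if (0..4).contains(&nr) && (0..4).contains(&nc) {
                let mut next = self.clone();
                next.tiles.swap(blank, (nr * 4 + nc) as usize);
                out.push(next);
            }
        }
        out
    }
}
```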
It was a very good read.