how to use with sparse/query-only data sources? #53
Hi @derhuerst, could you please elaborate (maybe through an example) on what kind of queries you would like to support? In order to increase scalability, the Linked Connections (LC) server interface has been designed as a simplistic API that only serves documents containing a set of connections departing within a given time window, e.g. https://graph.irail.be/sncb/connections?departureTime=2019-10-04T09:25:00.000Z. On top of that, the server also adds to each LC document some metadata for clients to discover more documents, namely:
```json
"hydra:next": "https://graph.irail.be/sncb/connections?departureTime=2019-10-04T09:43:00.000Z",
"hydra:previous": "https://graph.irail.be/sncb/connections?departureTime=2019-10-04T09:06:00.000Z",
```
The LC server was designed mainly for route planning purposes, with the Connection Scan Algorithm in mind. We follow the idea behind Linked Data Fragments, where a compromise between the workload supported by servers and clients may lead to more scalable servers and more flexible data access. However, our main interest is to investigate the trade-offs of different Web APIs, so I am certainly interested in the use case you want to support.
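For readers unfamiliar with this paging model: a client walks the `hydra:previous`/`hydra:next` links until it has covered the time window it needs, which is exactly what a backwards Connection Scan does. A minimal sketch in JavaScript, assuming a JSON-LD page whose connections sit in a `@graph` array sorted by departure time (the exact framing may differ per server; `fetchPage` is a caller-supplied function):

```javascript
// Extract the connections and pagination links from one LC page.
// The `@graph` key is an assumption based on common JSON-LD framing.
const parsePage = (page) => ({
	connections: page['@graph'] || [],
	next: page['hydra:next'] || null,
	previous: page['hydra:previous'] || null,
})

// Walk backwards through pages until a connection departing at or
// before `minDepartureTime` has been seen. Pages are prepended so
// that the result stays sorted by departure time.
const scanBackwards = async (fetchPage, startUrl, minDepartureTime) => {
	const connections = []
	let url = startUrl
	while (url) {
		const {connections: conns, previous} = parsePage(await fetchPage(url))
		connections.unshift(...conns)
		// assumes each page is sorted ascending by departureTime
		const earliest = conns[0] && new Date(conns[0].departureTime)
		if (earliest && earliest <= minDepartureTime) break
		url = previous
	}
	return connections
}
```

In practice `fetchPage` would be a thin wrapper around `fetch()` with JSON-LD content negotiation; keeping it injectable makes the paging logic testable without a network.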
@derhuerst What do you mean by sparse data exactly? A concrete example would help. This repository is mainly intended to host a Linked Connections server built from GTFS and GTFS-RT files. You can however also host a Linked Connections compliant API built in a totally different way.
Thanks for your explanations. I want to build a Linked Connections endpoint on top of a sparse data source, e.g. an API from which you can fetch departures/connections. This allows me to have experimental support for public transportation networks that don't publish GTFS and/or GTFS-RT feeds. I am aware that this is terribly inefficient (as the API response time will usually be an order of magnitude higher than file/DB access) and wasteful (as one would often need to query a whole lot more information than just the connections and throw it away afterwards), but as an experiment, I'm interested nonetheless. What I essentially ask for is splitting the Linked Connections server from the data retrieval logic. This has several benefits in addition to support for sparse data sources:
I’ve been wondering for a long time whether that would be possible with e.g., HAFAS API responses, but each time I would bump into too many HTTP requests behind a Linked Connections page, as you need to do the matching between the departure and the arrival at the next station. As an experiment it might indeed be interesting nevertheless.
The HTTP view code is quite small though, as @julianrojas87 pointed out above. While we agree that in the future we might support other data sources (most promising: real back-ends from PTOs), we currently have not been able to identify another data source that would be workable. Would mirroring our HTTP output work for you at this moment? The spec is pretty small: https://linkedconnections.org/specification/1-0
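For context on how small the spec is: a page is essentially a JSON-LD document with an ordered list of connections plus the hydra links shown earlier. A single connection looks roughly like this (all identifiers below are made up for illustration; the field names follow the Connection class of the linked spec, but check the spec for the exact `@context` and optional fields such as delays):

```json
{
  "@id": "https://graph.irail.be/sncb/connections#example-connection",
  "@type": "Connection",
  "departureStop": "http://irail.be/stations/NMBS/008812005",
  "arrivalStop": "http://irail.be/stations/NMBS/008813003",
  "departureTime": "2019-10-04T09:25:00.000Z",
  "arrivalTime": "2019-10-04T09:31:00.000Z",
  "gtfs:trip": "http://irail.be/vehicle/example-trip"
}
```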
I've written a HAFAS-based prototype at https://github.com/derhuerst/hafas-linked-connections-server. Can one of you have a look at whether the initial direction makes sense? I tried to run the …
@derhuerst That’s really cool, and it does not run too slowly either! Really enthusiastic about this. lc-client hasn’t been further developed for a while (it was the initial prototype); I’ve updated the repo to reflect this. We are however heavily developing Planner.js. @julianrojas87 @hdelva can we set up, on a test server, a browser build where you can type in your LC server (defaulting to localhost:3000/connections), and which automatically calculates a route from stop A to stop B? I’d say: no prefetching and only transfers based on same stop ID (no downloading of routable tiles). @derhuerst One thing lacking is the list of stops and their geo coordinates (indeed not part of the spec, but necessary if we want to visualize it). I’ll open some issues with ideas on your repo!
Also keep in mind that I need to be able to pick arbitrary locations by myself in order to test this out with my HAFAS-based implementation. |
Of course! That’s the reason I opened derhuerst/hafas-linked-connections-server#1 |
Sorry for the inactivity on this issue. Lots of work travel combined with some holidays now, but I will come back in a couple of weeks to complete the implementations.
Since the posts above, I have built … I now want to build a LC server that uses … In my case, I don't need much of the complexity and dependencies in … What do you think?
I think what you propose totally makes sense. The only reason it is all bundled together is the convenience of having one command that does everything, and because we were not too aware of Docker back then. I guess we would need to define a common interface to read the …
Yeah, something like …
I'll go ahead and try to come up with such an API in a separate repo, and submit a PR once I've reached something I'm happy with. |
Yes indeed, I was thinking the same. I had in mind something like …
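To make the discussion concrete, such a data-source interface might look as follows. This is a hypothetical sketch; the names (`createStaticSource`, `getConnections`, the returned page shape) are illustrative and not part of linked-connections-server. The key idea is that the HTTP layer only ever asks "give me the connections departing within [departureTime, departureTime + duration)", and the hydra links are derived from the adjacent windows:

```javascript
// A data source backed by an in-memory array of connections, as the
// simplest possible implementation of the interface. A sparse/HAFAS
// source would implement the same `getConnections` signature but
// query an upstream API instead.
const createStaticSource = (allConnections) => {
	// keep connections sorted by departure time once, up front
	const sorted = [...allConnections].sort(
		(a, b) => new Date(a.departureTime) - new Date(b.departureTime),
	)
	return {
		// departureTime: ISO string or Date; duration: window size in ms
		async getConnections (departureTime, duration) {
			const from = new Date(departureTime)
			const until = new Date(from.getTime() + duration)
			const connections = sorted.filter((c) => {
				const t = new Date(c.departureTime)
				return t >= from && t < until
			})
			return {
				connections,
				// departure times of the adjacent pages, from which the
				// HTTP layer would build hydra:previous & hydra:next
				previousDepartureTime: new Date(from.getTime() - duration),
				nextDepartureTime: until,
			}
		},
	}
}
```

Because the interface is async, a file-, DB- or API-backed source all look the same to the HTTP layer.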
That sounds great! Thanks for taking it up. I will try to find some time to also start splitting the server into two different modules. But I guess I'll wait for your proposal on the abstract interface before wrapping the data storage half in it.
Yeah, most of the work on my proof-of-concept implementation will be transforming the HTTP/server logic to be data-source-agnostic, so there would be a lot of duplicated work. So if you're fine with that, I'll propose both an API and an …
Sounds good to me. Please go ahead and I'll jump in once we have your proposal to avoid duplicated work. |
Looks like I never gave an update, so I'll do that now, even though I didn't work on the Linked Connections side of things.
I have tweaked …
About a year ago, I built this as …
The …
I'm not sure if I got the TREE stuff right, and I haven't tried consuming it with a linked-data-aware client yet. I still think this should be handled by a generic TREE server lib, into which you would pass metadata as well as data retrieval functions. A random, only somewhat related thought: I don't know Rust very well, but it seems like this generic TREE server lib would fit Rust's trait model very well, given that any other code from any unrelated domain could still easily adopt the TREE HTTP semantics.
@derhuerst Do you want us to validate it somehow and test it with an RDF library? |
That would be a great contribution, yes! We could also conceive the aforementioned TREE HTTP server; I think it would make both …
Can you link me to either an HTTP server that’s publicly reachable, or set-up instructions for running such an HTTP server locally?
```shell
mkdir gtfs-lc-test
cd gtfs-lc-test

# download GTFS
wget --compression auto -r --no-parent --no-directories -R .csv.gz -P vbb-gtfs -N 'https://vbb-gtfs.jannisr.de/2022-09-09/'
rm vbb-gtfs/shapes.csv

# import GTFS
env PGDATABASE=postgres psql -c 'create database vbb_2022_09_09'
export PGDATABASE=vbb_2022_09_09
npx --package=gtfs-via-postgres@4 -- gtfs-to-sql --require-dependencies --trips-without-shape-id --stops-location-index -- vbb-gtfs/*.csv | sponge | psql -b

# serve LC server
npx derhuerst/gtfs-linked-connections-server#1.2.1
```
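Assuming the server from the last step listens on `localhost:3000` and exposes a `/connections?departureTime=…` endpoint as in the LC spec, clients request pages by departure time. A tiny helper for building such request URLs (a sketch; note that `URLSearchParams` percent-encodes the colons in the timestamp, which servers decode transparently):

```javascript
// Build the URL of the LC page covering the given departure time.
// `baseUrl` and the `/connections` path are assumptions matching the
// setup above, not something mandated by a specific server.
const connectionsUrl = (baseUrl, departureTime) => {
	const url = new URL('/connections', baseUrl)
	// normalize to an ISO 8601 UTC timestamp, as used by LC servers
	url.searchParams.set('departureTime', new Date(departureTime).toISOString())
	return url.href
}
```

One could then `fetch(connectionsUrl('http://localhost:3000', Date.now()))` and follow the hydra links from there.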
I want to build Linked Connections endpoints wrapping sparse data sources, which I need to query for connections on demand. This means that:
How would this work with `linked-connections-server`?