Feature request: Facilitate collaborative project work with ScanTailor #102
Comments
Hi, I suspect that the filenames problem is only the very first problem one will face when trying to use ST in such a way. By design, ST can't produce the output step until all pages in the project are processed. And it can't determine the final page sizes at the Page Layout step if the "Align with other page sizes" option is on (it is by default) and not all pages have been processed in the previous steps. I mean that a project's pages can't always be processed in parallel, because at some processing steps the default values depend on the parameters of all pages at the previous step. I also suspect one can't easily commit a partially changed project file even if the filenames are the same. You may experiment with that. Create a project with, for example, 10 pages. Make an initial git commit. Create a new branch. Process the first 3 pages and commit. Switch to the initial branch. Process 3 other pages and commit. Then try to merge these branches. The file paths will be the same, but I doubt that these two branches could be merged automatically.
You're right :/ My wish in practice would be to process all 500 pages completely through automatic batch processes (for deskew, dewarping...), then save the project and assign the 'review and fine tuning' responsibility for dewarping and deskew to 10 different people. As I noticed that the project file is XML, I imagined that by versioning it, it would be possible to cooperate with people in different places who are interested in accelerating the production of the same ebook. Perhaps this would be possible if the editing metadata of each image were in an independent metadata file, ensuring that if a person edits the dewarping or deskew of 1 page, only one metadata file (that page's) would change. By the way, something similar to this 'page-independent metadata files' design exists in ABBYY FineReader, when parsing. But unfortunately, applying this would require a giant change in the ScanTailor software design. Would opening a crowdfunding campaign for this feature be something you would be interested in, to make it viable? Thanks for responding, and I apologize for not 'testing the concept' before posting the suggestion here.
Hi, Alex. I have an idea involving ScanTailor that I personally plan to develop, and I would appreciate your thoughts and advice if it's not too much trouble. I noticed that the project file is structured with the following keys:
Where 'filters' contains the keys:
I noticed that each key listed contains a collection of data for each page, indexed by page id. In order to enable collaborative work with ScanTailor, I'm considering building a converter from a ScanTailor project to a JSON collection, with one JSON file for each page of the project. That way, tweaking 10 pages in the dewarp and deskew steps would only affect 10 JSON files. I considered the possibility of creating:
Thus, versioning could be applied to the folder with the JSON collection, and users would run their import and export on each repository update. I still have in mind the question of the 'Page layout' step and its need to process all the pages of the project. I will consider this in the plan. I program in Python and am motivated to develop this.
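A minimal sketch of the per-page export described above, in Python. It assumes that each child element of 'filters' holds `<page id="...">` entries; the sample element names used with it (`deskew`, `dewarp`, `params`), the function names and the `page_<id>.json` file naming are illustrative assumptions, not the confirmed ScanTailor schema.

```python
import json
import xml.etree.ElementTree as ET
from pathlib import Path
from typing import Dict


def split_filters_by_page(project_xml: str) -> Dict[str, Dict[str, str]]:
    """Group every <page id="..."> entry found under <filters> by page id.

    Assumption: the project root has a <filters> child, and each filter
    element below it contains <page id="..."> elements.
    """
    root = ET.fromstring(project_xml)
    filters = root.find("filters")
    per_page: Dict[str, Dict[str, str]] = {}
    if filters is None:
        return per_page
    for filter_elem in filters:
        for page in filter_elem.iter("page"):
            page_id = page.get("id")
            if page_id is None:
                continue
            # Keep the raw XML of the entry so nothing is lost on re-import.
            raw = ET.tostring(page, encoding="unicode").strip()
            per_page.setdefault(page_id, {})[filter_elem.tag] = raw
    return per_page


def write_page_files(per_page: Dict[str, Dict[str, str]], out_dir: str) -> None:
    """Write one JSON file per page (hypothetical naming: page_<id>.json)."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    for page_id, settings in per_page.items():
        # Sorted keys keep the files stable, so diffs only show real changes.
        text = json.dumps(settings, indent=2, sort_keys=True) + "\n"
        (out / f"page_{page_id}.json").write_text(text, encoding="utf-8")
```

With a layout like this, re-tuning the deskew of one page rewrites only that page's JSON file, which is what keeps merges between collaborators small. The reverse step (rebuilding the project XML from the JSON files) is not shown.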
CONTEXT
A large book scanning project can take tens of hours when the goal is maximum quality.
So a single person digitizing a book in their spare time with ScanTailor Universal could take several days or weeks to complete the project.
At the same time, there are groups of people interested in creating the same ebook and willing to work together to accelerate its completion.
THE PROBLEM
Today ScanTailor Universal saves a project in an XML file containing various information, including the full path of the folder holding the original images and of the folder holding the edited images.
Thus, collaborative work using version control tools such as Git is very arduous, because each team member has a different full path for the image folders in their saved project XML file.
SUGGESTION
Facilitate collaborative work with ScanTailor by keeping only relative file paths in the XML file and storing the absolute folder path in a separate file, to allow project versioning without merge conflicts.
PURELY ILLUSTRATIVE EXAMPLE
When saving a ScanTailor project
RESULT
In this way, members of a team working to finish the entire editorial process of the same book quickly with ScanTailor Universal could add the '.local' file to the '.gitignore' list and easily version the '.scantailor' file with Git without merge conflicts.
Each member will be able to work on a set of book pages in parallel and take advantage of the full potential of versioning systems.
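The split between relative and absolute paths suggested above could be sketched roughly as follows, in Python. This is only an illustration of the idea, not how ScanTailor stores paths: the function names, the `project_root` key and the dictionary layout are all hypothetical.

```python
import os
from typing import Dict, Tuple


def split_paths(project_file: str,
                image_dirs: Dict[str, str]) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Return (shared, local) path data for a project.

    shared: folder paths relative to the project file, meant for the
            versioned '.scantailor' file.
    local:  the machine-specific absolute root, meant for the ignored
            '.local' file (the 'project_root' key is a hypothetical name).
    Note: os.path.relpath raises ValueError on Windows if a folder is on
    a different drive than the project file.
    """
    root = os.path.dirname(os.path.abspath(project_file))
    shared = {
        # Forward slashes keep the shared file identical across platforms.
        name: os.path.relpath(os.path.abspath(path), root).replace(os.sep, "/")
        for name, path in image_dirs.items()
    }
    local = {"project_root": root}
    return shared, local


def resolve_paths(shared: Dict[str, str], local: Dict[str, str]) -> Dict[str, str]:
    """Rebuild absolute folder paths on the current machine."""
    root = local["project_root"]
    return {
        name: os.path.normpath(os.path.join(root, rel))
        for name, rel in shared.items()
    }
```

Each collaborator would keep their own `local` data out of the repository, so the shared part stays byte-identical no matter where the project is checked out.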