difference between a work's children (fileset vs. another work) should be more transparent #2432
Comments
This issue touches on UI/UX and modeling, so since this issue is tagged with
|
Heh, I think I'm saying I'd like to differentiate less, i.e. lump them together more. |
Ha. I removed my editorial remark so as not to confuse the issue, then. Sorry! |
I’d like to read what the modeling-interested folks have to say, but I agree with @HackMasterA that the distinction hinders usability. @HackMasterA gives a relevant use case where an intermingled list of files and works would be desirable, and where the contained works should have the same visibility as the contained files – not be in a Relationships area that may seem secondary. I think users would find it intuitive that Items listed everything contained by the work. The screen shots that are shown in #2243 seem to demonstrate that this could affect not only the listings of child works but of parent works. The Relationships area gives us a place to try to explain what is ‘parent’ and what is ‘child’, or ‘In Works’ and ‘Has child works’. I lean toward the second option, reserving the Relationships area for ‘In works’ and ‘In Collections’. The Items list could possibly have a designator for Work if we find that the difference between a Work and a File needs to be evident in that list, but I’m not sure that it does. Different issue, but do we support a peer relationship, a simple Related Works between two works (not parent or child)? This could be in the Relationships area if we are supporting it. |
Apologies if I cover something already discussed or misunderstand something the Sufia 7 community (or others) has decided. I'm recapping my limited understanding. From the original question:
So in this example, one predicates how to model pages (Work or Fileset) based on whether pages are 'full of text' (so become Filesets) or have 'plates or engravings' (i.e. images, so become Works)...? The way I understand this is that pages (and other parts, when differentiated) are "Works" regardless of primary content type (text or images or other). Any Work (book, page, image, other) can then have member Filesets, and a Fileset contains Files/non-RDF resources (text, image, whatever) that are that Work's digital surrogates from particular digitization efforts or upload activity. This is coming from my very small lurking involvement in Hybox conversations (as well as from working with internally-managed/curated digital collections, not self-deposit or heavily faculty-curated items). See here for more discussion by better modelers than myself on this topic: hybox/models#17 (comment) If that's so (each page has a Work), then you'd always be ordering 'Works'. Re-arranging Items/Files/Filesets would perhaps (maybe?) be more about which digitization subset you prefer and, from that subset, which file (the png versus jpeg, this resolution or that, etc.). So, for Anna's use case, there'd always be a 'Work' abstraction in the relationships area to order. You wouldn't have only a Fileset here for one page, and a Work there for another page - not if you wanted to perform page ordering. (I'm still thinking through the UI ease of loading a bunch of files to one "work", the user not making a bunch of new works, then requiring ordering.) The occasional scare quotes on 'Work' come from the lack of distinction landed on in this discussion re: Work / Part / Object / Fileset: hybox/models#42 (comment) When I say 'Work', I mean a non-Fileset Object that cannot hasFile pcdm:File. I'm sorry though if I entirely missed the point(s). |
@cmh2166 thanks so much for this response. I think your approach here is from a pcdm2 perspective, of which i have only passing understanding. I understand that filesets will become more formalized. Currently, though, an organization could definitely choose to do it the way I've suggested, and there is incentive to do so:
Will this way of modeling a book or other work as having children that are a mixture of works and filesets become disallowed in some way? Or, just as importantly, is there community agreement that this is for some reason not a good way to model, even in pcdm1 which is what we currently have? If so, why not? Besides the UI limitations :) |
BTW, this is on today's Hydra tech call agenda. Please do join! |
I want to second @cmh2166's comment above and say that is my understanding of the outcome of the modeling discussions over the last several months, and characterized as the way that Islandora handles compound objects. But I also take @HackMasterA's point that this is what PCDM 2.0 is all about, and there are real issues in the current (PCDM 1.0) implementation that need to be worked out. I think it's fine to model some pages as FileSets and others as child Works, and that the distinction would revolve around whether the page was seen as useful outside the context of the parent Work. If the drag-and-drop ordering is brought in, then the user could decide what order they went in, including putting all the child Works at the beginning and all the child FileSets at the end if that was the best way to view them. |
@cmh2166, @escowles, please let me know where to find best practices for modeling books and other works if such things have been agreed upon! |
Yes, that's another thing i wanted to say; the idea of a 'page' as a 'work' is conceptually difficult to accept. @cmh2166, @escowles |
@HackMasterA I agree, but I think it's useful to say that all pages/parts/components should be pcdm:Objects, since they might be viewed as a Work in another context (maps/plates are the typical use case). So the idea is that the typical page is a trivial Object, but they can be enriched at any time if it's useful to do so, without having to remodel them. That also makes the ordering question easier, because all the children are Objects, so the Object/FileSet ordering problem never comes up. |
@escowles I hear that. But I don't love the idea of front-loading a lot of work for my catalogers 'just in case'. Making every page a Work is way harder for them than making them filesets. And the work I would need to do to make it easier for them is not on the horizon for me, time-wise. |
isn't a sufia fileset already a pcdm:object? |
This is a very interesting and useful discussion. I think that at the root of the issue lies the 'overuse' of the work class and the entire work extension of pcdm. Conceptually, I think, a 'work' is a specialized pcdm:object carrying a specific intellectual identity (i.e. something for which we can possibly identify independent intellectual responsibility and unity). I have to agree that a general page should not be considered a work in itself (this is not to say that a picture ON a page is not a work, it is!), so we should probably roll back and consider either using a less specified pcdm:object or coming up with a modeling construct for parts (I highly advise against the latter). @escowles is perfectly right in his last comment. So you would have a work entity at the book level, possibly work entities at chapter level, and pcdm:objects at the page level (as an example); then each page would be related to one or more filesets depending on the specific context. |
@HackMasterA yes, the PCDM 1.0 FileSets are also pcdm:Objects. So the PCDM 2.0 proposal is basically to split that in half and have the Object part represent the (possibly minimal) aspect of the page and the FileSet part represent the bundle-of-files aspect of the page. And I completely understand why you wouldn't want to create child Works for every page now. It would be a ton of work, for little benefit. With PCDM 2.0, then you would just upload a file like you do for a FileSet now, but Sufia would always create a child Work and attach a FileSet to it. |
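The upload behavior described in the comment above (one upload yields a minimal child Work with a FileSet attached) could be sketched roughly as follows. This is only an illustration in Python, not Sufia code (Sufia is a Ruby application); the classes `Work` and `FileSet` and the function `upload_file` are hypothetical names invented for this sketch.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FileSet:
    """Bundle of files for one upload (hypothetical stand-in, not Sufia's class)."""
    files: List[str] = field(default_factory=list)


@dataclass
class Work:
    """A non-FileSet object; for a plain page it can stay minimal."""
    title: str
    members: List["Work"] = field(default_factory=list)
    file_sets: List[FileSet] = field(default_factory=list)


def upload_file(parent: Work, filename: str) -> Work:
    """Hypothetical upload hook: always create a child Work and attach a FileSet to it."""
    child = Work(title=filename)
    child.file_sets.append(FileSet(files=[filename]))
    parent.members.append(child)
    return child


book = Work(title="Book")
page = upload_file(book, "page_001.tif")
```

Under this assumption the depositor still performs a single upload step, and the intermediate object is created in the background.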
Also, attaching filesets directly to the work for a digitized book and mixing them at the same level with other works or pcdm:objects at the page level would make the modeling strategy less consistent, and would not allow one, for example, to state that the same page has been digitized twice (two masters) while still being the same page. |
I agree that it is an interesting discussion. But have we strayed from the original thrust of @HackMasterA, which to me was the lack of usability of separating filesets and child works in two lists in Sufia, when both are contained by a work? Even with best practices agreed upon, and clarity in the online help, and modeling distinctions maintained, in regard to what should be a work and what should be a file, wouldn’t there be a usability advantage to @HackMasterA’s original point about intermingling? As a user, do I want to see all objects contained by a work in one list? |
I am +1 to modeling some pages as child Works and others as child FileSets, and showing them in a single list. Bringing over the CC ordering functionality would help make that more usable and support mixing them or keeping them separate (and of course each app can override this based on their local needs). |
@escowles @HackMasterA I see the practical difficulties of the modeling overhead in adding intermediate entities, but I still believe that filesets should keep their functionality and not be "directly" used to model "parts". I think the usability issue can be solved at the application and UI level if the intermediary pcdm:object is created in the background when multiple parts of a work are uploaded (I know this is not there yet, or at least not entirely). |
It might be worth noting that even in the page-is-a-fileset scenario, On Aug 10, 2016 9:55 AM, "Esmé Cowles" [email protected] wrote:
|
Sorry, breaking a few things out to respond to as separate comments. Here are things that we can maybe follow up on elsewhere (with links to elsewhere): 1: Parts As Works
Yes, this dilutes the idea of a "Work" (something already possibly ambiguous regardless). See my original comment on this, which has a link to some really good/pertinent discussions on the topic:
And also see @azaroth42's email to the PCDM listserv about this from a few months ago that came from those discussions: https://groups.google.com/forum/#!topic/pcdm/qymzKAv0uoA However, for the sake of this discussion, I'm thinking of "Works" in the "it's an Object that isn't a Fileset" sense. I'll use just generic PCDM:Object from now on (with explicit "that is not a PCDM:Fileset" where needed). The issue of Works being diluted can perhaps be discussed further on the PCDM listserv (or even in Rob's thread, which got no responses). 2: PCDM 1.0 versus 2.0
The more formalized Filesets definition may fall to PCDM 1 versus PCDM 2, but I'd be a bit hesitant to make that claim (seems more about which ingest path/gem/version you use...?). My understanding was "PCDM 2" is where we're trying to pull some of the Hydra-Works/PCDM-Works models (as is? with improvements?) into PCDM, and let Hydra-Works/PCDM-Works become not about defining new classes, but more about application profile/setting behaviours. There is a really good discussion about pinning down Filesets definitions going on here: https://groups.google.com/forum/#!topic/pcdm/8xVAWuczaxQ And this work would probably be part of a PCDM 2 version, if only because it's happening now on the listserv. Discussions of PCDM 1 versus 2 should probably go to the PCDM listserv as well, especially if we are thinking these distinct enough to cause compatibility issues. For this discussion, I'll try to stick with "PCDM 1" core and Hydra-Works/PCDM-Works understandings (in which, we're still generating those intermediate resources at MPOW). |
Do we want to model some Parts as Filesets and other Parts as "Works"/Objects that aren't Filesets and don't contain/have members Files... mmm. I'm -1 to that, sorry. This conflates models and makes metadata profiling/resource management that much more complicated. I agree with @simosacchi re: keeping Filesets as where you manage non-RDF resources, and I think @barmintor makes a good point that once you bring in ordering, you're making these "intermediate" resources (here, proxies) regardless. I like @escowles comment:
So, in my limited and likely to change opinion, I'd rather aim for generation of all those Parts as "Works"/Objects that aren't Filesets when you're ingesting a digital object/collection that you know may require those distinctions (for ordering, additional metadata, etc.) - i.e. the CC functionalities @escowles mentioned. I'm a bit curious when you'd have Catalogers manually creating all those Parts too (would you have a Cataloger manually create a Book and all its relevant Pages in Sufia, versus finding a way to batch load?) Does this also indicate maybe a need for some batch ingest / basic metadata generation tooling around Parts? Just thinking out loud. If, for functionality's sake, an institution doesn't want to support PCDM:Objects that aren't Filesets for all Parts, then I'd rather the institution make all the Parts (not just some) Filesets and update the metadata profile for those Filesets so that expanded descriptive and other metadata is consistently available (the UI can then be updated consistently for other options). This is me speaking firmly with my metadata munger/migration lackey hat on though, and I'm still thinking this through. Sorry for the long messages. Thanks for the discussion, and hope this helps. |
Thank you for all the links, @cmh2166, I really appreciate that.
Please say more about why you think this is easier? It seems harder to me. If at some point I'm going to migrate to some Works + some PCDM:Objects I would think it would be easier to move from some Works + some FileSets than from all Works (which I would have to sift through somehow). |
@cmh2166 So I should add that when I say 'Work' and 'FileSet' I mean the objects in Curation Concerns / Sufia. It's therefore definitely helpful to use pcdm:object instead of 'work' when talking about a non-Work non-FileSet object! |
Copying from private conversation with @cmh2166 with permission:
|
@HackMasterA : I agree completely with @Cam156's perspective on the modeling issues that might come up when modeling entities "at the same level" differently. That speaks directly to my comment above about not abusing the fileset construct and planning ahead to avoid losing flexibility moving forward (see the example of multiple digitization masters of the "SAME" page). Unless people see performance issues from adding layers of indirection, my suggestion is still to model all pages as pcdm:objects and not filesets. I hope there is a way to make the process automatic/invisible for cataloguers at the application level, and with PCDM 2.0 in Sufia, according to @escowles, it seems the logic is already moving in that direction. |
@simosacchi yes, I appreciated that use case from @cmh2166 (not @Cam156 !). It's definitely food for thought but it still seems like my best option today may be to use Works and FileSets with an eye to migrating when pcdm2 and related infrastructure are available. It seems like it will be easier to move all filesets into new 'object' containers than to sift through Works and figure out what would be made more generic. |
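The migration path mentioned in the comment above (moving each direct FileSet child into a new generic 'object' container) might look roughly like this. It is a minimal Python sketch over in-memory stand-ins, not a real migration against Fedora or Sufia; `PcdmObject`, `FileSet`, and `wrap_file_sets` are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class FileSet:
    label: str


@dataclass
class PcdmObject:
    label: str
    members: List[Union["PcdmObject", FileSet]] = field(default_factory=list)


def wrap_file_sets(parent: PcdmObject) -> int:
    """Replace each direct FileSet child with a generic object that contains it.

    Child order is preserved, and existing object children are left untouched.
    Returns the number of FileSets that were wrapped.
    """
    wrapped = 0
    for i, member in enumerate(parent.members):
        if isinstance(member, FileSet):
            parent.members[i] = PcdmObject(label=member.label, members=[member])
            wrapped += 1
    return wrapped


book = PcdmObject("Book", [FileSet("p. 1"), PcdmObject("Plate I"), FileSet("p. 2")])
count = wrap_file_sets(book)
```

The point of the sketch is that the FileSet children can be found mechanically by type, whereas deciding which existing Works should become generic objects would require a judgment call per item.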
Either way, I think we have to allow for the possibility that today there may be a work with both types of children. Given that, I'm hearing support for integrating their placement in the UI. |
@HackMasterA : Oops, sorry @cmh2166, I was tricked by the autocompletion... I totally see use cases where you have both types of children. One example that would not violate the assumptions above: you have pcdm:objects for pages (with a fileset attached to each of them holding TIFs and derivatives) and also a fileset with a PDF file for the entire work (i.e. the book) directly attached to the parent object (regardless of it being a work or "just" a pcdm:object). |
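The structure in the example above, combined with the earlier "same page digitized twice" case, can be pictured with a small Python sketch. The class names and file names are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FileSet:
    files: List[str]


@dataclass
class PcdmObject:
    label: str
    members: List["PcdmObject"] = field(default_factory=list)
    file_sets: List[FileSet] = field(default_factory=list)


# One page object, two digitization efforts: still the same page.
page1 = PcdmObject("Page 1", file_sets=[
    FileSet(["page1_first_scan.tif", "page1_first_scan.jpg"]),
    FileSet(["page1_second_scan.tif"]),
])
page2 = PcdmObject("Page 2", file_sets=[FileSet(["page2.tif", "page2.jpg"])])

# The whole-book PDF hangs directly off the parent, alongside the page objects.
book = PcdmObject(
    "Book",
    members=[page1, page2],
    file_sets=[FileSet(["book.pdf"])],
)
```

Here the parent has both kinds of children, but every page is an object, so page ordering only ever has to deal with one kind of member.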
thanks to everyone for this very illuminating discussion. it seems to have petered out so I move to close the issue. I've created #2459 to reflect the consensus we were able to reach, ui-wise. |
I'd like to argue that whether a work's child is a fileset or another work is a distinction that hinders usability.
E.g.
As a rare books curator I want to model pages full of text as filesets but plates or engravings as works so that I can give them more metadata. I'd then like to combine all of these pages on a single 'book' work and order them as needed, intermingled. As a researcher using this book I'd like all the pages to be in a single ordered list.
Currently contained works will be in the 'relationships' area but contained files will be in the 'items' area.
If we loosen the distinction in the UI we would be able to include drag-and-drop ordering in the current 'files' tab of the edit form, meaning depositors wouldn't have to click through to a separate screen to do ordering.
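As a rough illustration of the request above, the following Python sketch keeps works and filesets in one ordered list and reorders them the way a drag-and-drop handler might. It is not Sufia's data model; `Work`, `FileSet`, `ordered_members`, `move_member`, and `item_rows` are all hypothetical names.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class FileSet:
    label: str


@dataclass
class Work:
    label: str
    # Assumption: a single ordered list holds both kinds of children.
    ordered_members: List[Union["Work", FileSet]] = field(default_factory=list)


def move_member(work: Work, from_index: int, to_index: int) -> None:
    """Move one child to a new position, as a drag-and-drop handler might."""
    member = work.ordered_members.pop(from_index)
    work.ordered_members.insert(to_index, member)


def item_rows(work: Work) -> List[str]:
    """Render one 'Items' list, tagging each row with the kind of child."""
    return [f"{type(m).__name__}: {m.label}" for m in work.ordered_members]


book = Work("Book")
book.ordered_members = [FileSet("p. 1 (text)"), FileSet("p. 2 (text)"), Work("Plate I")]
move_member(book, 2, 1)  # drag the plate between the two text pages
rows = item_rows(book)
```

The kind tag in each row corresponds to the optional 'designator for Work' suggested in the comments; it could be dropped if the distinction does not need to be visible in the list.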
Related work
See screenshots on #2243