-
Notifications
You must be signed in to change notification settings - Fork 8
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add StatementListChanger #157
base: master
Are you sure you want to change the base?
Conversation
fdfbd55
to
a743d5c
Compare
* @license GPL-2.0+ | ||
* @author Thiemo Mättig | ||
*/ | ||
class StatementListChanger { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It feels to me like this should not be a service, but rather a builder or cursor style object: it should wrap a StatementList it gets in the constructor, and offer specialized operations on that StatementList. Perhaps the name should indicate what kind of extra operations these are.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm very open for suggestions.
I definitely do not want this to be a set of static methods. This is something I would have done a few years ago. I learned. :-)
A builder does not work because I definitely want this to manipulate the original object, not create new ones.
Passing the StatementList via the constructor instead of each method does not change much, on the one hand. On the other hand it makes it much harder to replace this implementation with an other one. You can not simply make this an interface and inject an other implementation. You do not know the StatementList object on construction time. This approach would need a factory.
In conclusion I still think my current approach is the best.
the name should indicate what kind of extra operations these are.
What do you mean with "extra operations"? What is not obvious enough from the method names?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The method names are fine, but the class name is very generic. "Changer" could be anything. The methods mainly relate to groups, so maybe the class should have "Groups" in the name. But I can't think of anything nice offhand. So whatever.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We already have a StatementGrouper
interface with multiple implementations (including a ByPropertyIdStatementGrouper
), as well as a ByPropertyIdArray
(that's what we need to get rid of) and a ByPropertyIdGrouper
. I wanted this here to be different from these.
This is a layer of code that works with a StatementList while being aware of the concept of "grouping by property ID". ByPropertyIdAwareStatementListChanger?
Pinging @Benestar for the sake of completeness, because he spend a lot of time working on this, but we never managed to use one of his approaches. See wmde/WikibaseDataModel#479 and wmde/WikibaseDataModel#507, to name the most relevant ones. |
a743d5c
to
f069c22
Compare
I'd put If you change That just leaves the group related stuff. I remember that how to best implement this functionality was an old open question already two years ago, at which point there was some investigation into introducing some kind of StatementGroup class. ... found it :) Start of that thread: https://lists.wikimedia.org/pipermail/wikidata-tech/2014-October/000660.html Can't find anything about the discussion I mentioned in that mail though; too ancient history... Anyway, something like StatementGroup is what I'd look into for implementing such behaviour. |
We had a longer discussion about At the moment there is not a single Personally I do not like the idea of replacing the wonderful trivial StatementList with something more complex, because this complexity is almost never needed. |
I see your point about clearing entities being weird. That does not automatically mean having such a method in StatementList, which does not know about entities, is also weird, or as weird. [*] I'm wondering what the clear method being added to this changer class is for now (this is a question). And I'm suspecting that if it being in StatementList is weird, that it being in StatementListChanger is going to be just as weird, plus the feature envy. [*] Having such a method in StatementList and that being accessible in entities does not mean we think it often makes sense for people to use this. Something making sense depends on the context, and should typically be judged there. We can't really prevent users of code doing insensible things with it. For clearing statement lists, this can be done regardless of having the clear method, as this changer class demonstrates.
I guess you are talking about the StatementGroup stuff here. Is that correct? If so, I'm not proposing to replace StatementList. It does its job well, and I agree with you that adding something like the group stuff to it increases complexity while most consumers of this code don't care about this functionality. Hence the suggestion for a new thing that holds the group related code. |
Indeed. Can't argue with that. ;-) The only difference is that it is in an other component (Services instead of DataModel itself) and therefore less visible, which makes what I just explained above so much more obvious. For example, nobody will start using this in test setups. PS: I'm fine with removing
Yes.
Isn't this what I'm doing here? A new thing that knows how statements are grouped by property? What do you suggest instead?
Turning the code I'm proposing here into getters would not help much. We are usually passing Entities around, not StatementLists. We would need to call |
What kind of bad test code do you image would be written if this method was readily available? If I'm not mistaken it was used in some really shitty tests, though those where not shitty because of that method, and I really hope no one is still writing tests in that manner.
I was thinking of an approach different in two ways:
Thanks for explaining, I was not considering you'd want to use this for modifying entities.
What explanation are you referring to? Is the explanation that you don't want people to be able to clear the statements by using setStatements? I'm very dubious about this whole Holder stuff. In the scenario you where thinking off, what would the structure of the code be? Something like this? function doSomeStuff( StatementListHolder $holder ) {
// stuff?
$this->statementChanger->groupByProperty( $holder->getStatements() );
// stuff?
} What's the problem with function doSomeStuff( StatementListHolder $holder ) {
// stuff?
$holder->setStatements( $holder->getStatements()->getGroupedByProperty() );
// stuff?
} While that still has the weird Holder stuff, it's clearer what's going on here (in the former case it's not obvious the $holder ends up with a new state), and the property grouping code does not need to suffer from feature envy. |
$changer = new StatementListChanger( $entity->getStatements() );
$changer->groupByProperty(); Done. No setter. No "holder". |
I was asking specifically about the holder scenario that you brought up. You have not answered my questions about that, so now I still don't know what's going on there. I went through the trouble of outlining stuff about the grouper approach and am a disappointed you also did not reply to that. As for the code you provided now, I certainly agree it is more straightforward. That is due to the lack of the holder, not because of the StatementListChanger. My concern about the readability remains, though only partially. You can still do this instead, though the need is no longer as big (readability wise). $entity->setStatements( $entity->getStatements()->getGroupedByProperty() ); However in the way you wrote that code, another issue which I was wondering about manifests itself. If you construct the StatementListChanger directly, you have static binding on it and its methods. It's not really different than calling a bunch of namespaced global functions, apart from it superficially looking OO. You can inject the StatementListChanger, though then the readability (which is already not good) suffers. |
I don't understand what you want to know. You small example looks nice, but does have the same problem as almost every other possible approach (including what you wrote about "groupers"): it calls a setter that should not be there, and is already deprecated. I don't know what else to say about this. I went through multiple stages with the code in this patch:
|
This is still waiting for a review. We want to get rid of the overengineered cruft that is ByPropertyIdArray for a looong, looong time now, and I still believe this here is a wonderful solution. @manicki maybe? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some questions, but looks fine overall.
* @license GPL-2.0+ | ||
* @author Thiemo Mättig | ||
*/ | ||
class StatementListChanger { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The method names are fine, but the class name is very generic. "Changer" could be anything. The methods mainly relate to groups, so maybe the class should have "Groups" in the name. But I can't think of anything nice offhand. So whatever.
* Removes all statements from the $statementList object given in the constructor. | ||
*/ | ||
public function clear() { | ||
foreach ( $this->statementList->toArray() as $statement ) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems to me like clear() should really be part of the container object. And should probably have a much more simple implementations...
public function groupByProperty() { | ||
$byId = []; | ||
|
||
foreach ( $this->statementList->toArray() as $statement ) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why is toArray needed here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is optional because StatementList itself is Traversable. But leaving the ->toArray()
out will require an additional /** @var Statement $statement */
for type safety, and in most places we decided to go with ->toArray()
instead.
/** | ||
* @param Statement $newStatement | ||
* @param int|null $index An absolute index in the list. If the index is not next to a statement | ||
* with the same property, the closest possible position is used instead. Default is null, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what does "possible" mean here? Does "closest" check in both directions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, both directions. This only matters if the list was not grouped by property ID before.
*/ | ||
private $statementList; | ||
|
||
public function __construct( StatementList $statementList ) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
if ( $index === null ) { | ||
$index = $this->getClosestIndexAtGroupBorder( $statements, $validIndex ); | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sigh This makes it really obvious that we should model groups directly as nested arrays, and get rid of the whole index crap completely.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure what you mean with "crap". Can you please avoid using such offensive language?
@Benestar suggested to replace our naive StatementList with one that's aware of PropertyId groups a long time ago. I'm not opposing such a change.
I still believe the code proposed here is a helpful step in a longer migration path that might end with replacing StatementList.
Note that the code I'm proposing here does have a very specific advantage: It allows us to add new statements where they belong, without the need to reorder existing statement lists. Not that this is of much relevance, because users usually do not see diffs on the JSON level. But still. This smooth migration is not possible with a StatementList that enforces grouping on construction time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My comment was not a complaint about this patch. It's just one more place where it becomes obvious that insertion by index into a grouped list is broken by design. It's just bad interface design. And that bad design (aka "crap") is my fault; I remember the session with Denny where we designed this interface. So, sigh.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One of the proposals we discussed had two numeric indexes: one for a relative position in the group, and one for moving the group to an other position. I found this harder to use than the current absolute index.
I'm still very happy with the super-trivial StatementList we have. Only very few places in our application need statements to be grouped, and the requirements are always a bit different. A "baked in" grouping on the DataModel level may help in some cases, but will not solve all use-cases, and will make others more complicated.
private function isWithinGroup( array $statements, PropertyId $id, $index ) { | ||
$count = count( $statements ); | ||
|
||
// Valid if the index either prepends ot appends a statement with the same property |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"precedes or succeeds"?
public function testConstructor() { | ||
$instance = new StatementListChanger( new StatementList() ); | ||
|
||
$this->assertInstanceOf( StatementListChanger::class, $instance ); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why do we need this test?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TDD. This is one of the first tests I wrote. It became obsolete later because all other tests are covering the same feature. I just forgot to remove it.
f069c22
to
0f8d249
Compare
0f8d249
to
7ca9951
Compare
7ca9951
to
4a5c0e0
Compare
No changes since 2016. Please reopen if needed for current goals |
@thiemowmde it is cool you put this effort into getting rid of I looked at this now for 15ish minutes (which means I did not read all comments) and these are my thoughts:
Edit: there is a PR relevant to the StatementGroup topic here: wmde/WikibaseDataModel#507 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think responsibilities are misplaced as per previous comment
Let's do a fresh PR that is more focused and does not contain no longer relevant discussion that scares everyone away if someone want to actually create that. |
This is not meant as "the one" solution to all the year-long "statement (re)ordering" discussions, but merely as a way forward to get rid of ByPropertyIdArray and it's scary capabilities to move statement groups around. Why should I open a new pull request with the same code if it is already here? How are the discussion not relevant? The concern I raised in wmde/WikibaseDataModel#507 (comment) still holds. |
This wraps the last remaining usage of the ByPropertyIdArray class. This is meant as a migration path. First, all relevant tests from ByPropertyIdArrayTest should be carried over. Then the implementation can be changed to not rely on ByPropertyIdArray any more. ByPropertyIdArray can be removed then.