-
Notifications
You must be signed in to change notification settings - Fork 72
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add a 'tags' endpoint to support runtime tagging #185
Conversation
…/task/artifact has been produced
services/metadata_service/api/tag.py
Outdated
if o['operation'] == 'add': | ||
# This is the only error we fail hard on; adding a tag that is | ||
# in system tag | ||
if o['tag'] in obj['system_tags']: | ||
return DBResponse(response_code=405, body=json.dumps( | ||
{"message": "tag %s is already a system tag and can't be added to %s" | ||
% (o['tag'], o['id'])})) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this a necessary guard? the metaflow cli does not guard against adding system tags as user tags with --tag
it seems. Should similar checks be done during creation as well then?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's an excellent point and a bug most likely. This patch was done to allow modification of tags after the fact and the idea is to make sure that system tags stay unique and not affected. They are kept separate but we were trying to make sure that system tags were not overwritten in a way although I will admit that this is maybe not very useful. Let me circle back on that.
services/metadata_service/api/tag.py
Outdated
result = await table.update_row(filter_dict=obj_filter, update_dict={ | ||
'tags': "'%s'" % json.dumps(obj['tags'])}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Worth checking: Previously update_row was only used for heartbeat updating, but using it this way might not be reliable/secure, as tags contain user inputs. Preferred way for psycopg
is to use the cur.execute(sql_template, values)
which should take care of correctly escaping the values.
Might be worth refactoring the update_row function to use the safe execute syntax.
see https://www.psycopg.org/docs/usage.html#passing-parameters-to-sql-queries for usage.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. I will update the update_row function to use a template. That seems the better way.
|
||
|
||
class TagApi(object): | ||
lock = asyncio.Lock() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
seems to be unused.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I copied from other APIs but let me check if there was a purpose behind it.
services/metadata_service/api/tag.py
Outdated
elif o['operation'] == 'remove': | ||
if o['tag'] in obj['tags']: | ||
modified = True | ||
obj['tags'].remove(o['tag']) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Noticed that the Metaflow client and service do not guard against duplicate tags, so --tag test --tag test
would add the test tag twice, due to tags being a simple array. remove
will only get rid of one occurrence of a tag. Should this be the way the update works?
alternatively get rid of all occurences of a tag:
obj['tags'] = [tag for tag in obj['tags] if tag != o['tag']]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are right again on the creation side of things; the update will guard against duplicates but not the add (I will double check on that side; I think we should just convert it to a set and set that and also check against system tags).
for o in body: | ||
try: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The multiple updates are not done inside a transaction, so there is a slight peculiarity with the way things might work in case of errors. For example
- flow/run/step tag updates succeed
- for whatever reason task tag update fails
- API responds with an error, even though some records were updated, and some updates were never reached.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this was a replica of how things were happening inside (in our internal version) and was a conscious choice but maybe something worth reconsidering. @ferras may have some more ideas on that.
This needs #184 to function properly.