-
Notifications
You must be signed in to change notification settings - Fork 11
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[#124] Allow indexing rules to be invoked manually (main) #141
base: main
Are you sure you want to change the base?
Conversation
Attempting to do full text indexing on a data object without appropriate permissions results in a |
To add to that, this is for the case where the rule is invoked as rods (a rodsadmin) on another user's data object via |
Yes, rodsadmins-only seems good
Agreed, 'all' seems good |
I've added logic that allows rodsadmins to invoke full text indexing on other user's data objects without needing explicit permission. I've confirmed the drop to the C API works as intended. This was done manually using |
// Take the max to avoid passing an integer that's less than zero to the | ||
// the string_view constructor. | ||
{"data", std::string_view(buffer.data(), std::max(0, bytes_read))} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add a statement explaining how bytes_read
can result in an integer less than zero.
#define IRODS_IO_TRANSPORT_ENABLE_SERVER_SIDE_API | ||
#include <irods/dstream.hpp> | ||
#include <irods/transport/default_transport.hpp> | ||
#ifdef IRODS_HAS_FEATURE_ADMIN_MODE_FOR_DSTREAM_LIBRARIES |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This feature test macro name is just a placeholder until irods/irods#7530 is resolved.
This PR isn't blocked by that work. Once the irods/irods issue is resolved, these preprocessor macros can be updated to match the real feature test macro.
// irule <text> | ||
if (rule_text.find("@external rule {") != std::string_view::npos) { | ||
const auto start = rule_text.find_first_of('{') + 1; | ||
const auto end = rule_text.rfind(" }"); | ||
|
||
if (end == std::string_view::npos) { | ||
auto msg = fmt::format("Received malformed rule text. " | ||
"Expected closing curly brace following rule text [{}].", | ||
rule_text); | ||
log_re::error(msg); | ||
return ERROR(SYS_INVALID_INPUT_PARAM, std::move(msg)); | ||
} | ||
|
||
rule_text = rule_text.substr(start, end - start); | ||
} | ||
// irule -F <script> | ||
else if (const auto external_pos = rule_text.find("@external\n"); external_pos != std::string_view::npos) { | ||
// If there are opening and closing curly braces following the "@external\n" prefix, then we | ||
// can assume that the rule text most likely represents a JSON string. | ||
if (const auto start = rule_text.find_first_of('{'); start != std::string_view::npos) { | ||
const auto end = rule_text.rfind(" }"); | ||
|
||
if (end == std::string_view::npos) { | ||
auto msg = fmt::format("Received malformed rule text. " | ||
"Expected closing curly brace following rule text [{}].", | ||
rule_text); | ||
log_re::error(msg); | ||
return ERROR(SYS_INVALID_INPUT_PARAM, std::move(msg)); | ||
} | ||
|
||
rule_text = rule_text.substr(start, end - start); | ||
} | ||
// Otherwise, the rule text must represent something else. In this case, simply strip the | ||
// "@external\n" prefix from the rule text and let the JSON parser throw an exception if the | ||
// rule text cannot be parsed. This allows the REP to fail without causing the agent to crash. | ||
else { | ||
rule_text = rule_text.substr(external_pos + 10); | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've used this code in two plugins now. It probably needs to be provided by the irods-dev package so we avoid copying it everywhere.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please make an issue - that seems good.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm starting to think maybe this should be a documentation exercise. That code makes a few assumptions about the input which isn't code for general purpose use.
Will think on it a little more.
|
||
json delay_obj; | ||
delay_obj["rule-engine-operation"] = irods::indexing::policy::indexing; | ||
if (irods::indexing::policy::object::index == op) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
irods::indexing::policy::object::index
expands to the string, "irods_policy_indexing_object_index"
.
This is the rule name that must be used to do full text indexing of a single data object. It shares the same name as the rules which are fired as a result of triggering PEPs.
You can see the rest of the rule names here.
irods_capability_indexing/configuration.hpp
Lines 189 to 205 in 3f3529d
namespace object | |
{ | |
static const std::string index{"irods_policy_indexing_object_index"}; | |
static const std::string purge{"irods_policy_indexing_object_purge"}; | |
} // namespace object | |
namespace metadata | |
{ | |
static const std::string index{"irods_policy_indexing_metadata_index"}; | |
static const std::string purge{"irods_policy_indexing_metadata_purge"}; | |
} // namespace metadata | |
namespace collection | |
{ | |
static const std::string index{"irods_policy_indexing_collection_index"}; | |
static const std::string purge{"irods_policy_indexing_collection_purge"}; | |
} // namespace collection |
Are those the rule names we want admins to use or do we want to change them for manual execution contexts?
I can see value in the names staying as they are. It makes it easy for admins to know what happens because they will start to remember the names. However, admins may not be able to distinguish who/what invoked the rule.
All of that to say, perhaps the names should be changed to something like ...
indexing_index_data_object
indexing_index_collection
indexing_purge_data_object
Note: This PR doesn't support invoking indexing rules via the NREP yet. It should be doable, but that may require changing the the rule names for correct behavior.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
don't feel too strongly either way yet. consistency across our plugins is where i think i'd find the most value/good.
also noting that it's interesting our namespacing is not in the same order as the rule names...
'policy' and 'indexing'... switched places...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree on the consistency thing.
As for the namespacing, that's not surprising to me. If the C++ code used irods::policy
, the possibility of symbol collision rises since other plugins would likely follow suit and define things in the irods::policy
namespace.
I don't know that "policy" is a term that's needed in the rule names since everything iRODS does is about policy.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think that convention started with policy composition... and hasn't really been codified/hardened yet. TBD...
Still need to implement tests.
I've verified at full text indexing on data objects works. I'm pretty sure the other rules work since they were lifted directly from
exec_rule_expression
.The rules can only be invoked via the main indexing plugin, not the elasticsearch plugin.
Some questions ...
As implemented, the rules fire as the user who invoked them. Should any rules result in changing the identity of the RsComm to the user who owns the object(s)? Consider full text indexing being invoked on a data object which the rodsadmin doesn't have permissions on. I'll test this to find out what happens.
Thoughts?