Stable functions do not seem to be honored in RLS in basic form... #9311
Replies: 11 comments 51 replies
-
I've found workarounds for many cases in testing, but at the moment there is this fairly common RLS policy I can't get to optimize...
Putting two policies on select does not help: all bad cases are 200 msec or more, versus <1 msec for the cases that work.
Edit: this could have serious impacts on Storage RLS policies, as they are all on the same table... So far the only thing that seems to work is this:
Edit2: Even worse on ...
-
Hi @GaryAustin1, this is some great investigative work:
This is unfortunately the case, and you've dug up a lot of interesting examples of this. Do you mind pg_dumping the test database you are using and zipping it up to michel @ supabase . io? We are working on some tooling to simplify the writing of "optimal" RLS policies and I'd like to use some of your examples as a basis. It may also help with any workarounds we can suggest for you.
-
Steve, I'll check the explain wrapper again, but just to confirm it was not the test code, I ran a couple of tests going through the REST API with no explain and saw huge differences that were repeatable based on the RLS policies under test and clearly not random network delay.
On Oct 25, 2022, at 5:00 PM, Steve Chavez wrote:
I think the run_explain plpgsql function wrapper is playing tricks on us.
I can somewhat reproduce but when I get the EXPLAIN outside the run_explain I don't get the perf issue.
-- populate the test table (realtest is assumed to already exist with these columns)
insert into realtest(id, name, user_id, jsonb_col)
select x, 'name-' || x, uuid_generate_v4(), jsonb_build_object('key', x)
from generate_series(1, 25000) x;

-- wrapper so EXPLAIN can be run under the authenticated role + a fake JWT from the SQL editor
CREATE OR REPLACE FUNCTION run_explain()
RETURNS SETOF text AS
$$
BEGIN
  -- simulate the role and claims PostgREST would set for a request
  set local role authenticated;
  set local request.jwt.claims to '{"role":"authenticated", "sub":"5950b438-b07c-4012-8190-6ce79e4bd8e5"}';
  RETURN QUERY
  EXPLAIN ANALYZE
  SELECT * FROM realtest WHERE user_id = ((current_setting('request.jwt.claims', true))::jsonb->>'sub')::uuid;
END
$$ LANGUAGE plpgsql;

-- inside the wrapper...
select * from run_explain();
-- ...and the same query outside the wrapper for comparison
explain analyze SELECT * FROM realtest WHERE user_id = auth.uid();
run_explain
Seq Scan on realtest (cost=0.00..996.50 rows=1 width=62) (actual time=27.525..27.526 rows=0 loops=1)
Filter: (user_id = (NULLIF(COALESCE(current_setting('request.jwt.claim.sub'::text, true), ((current_setting('request.jwt.claims'::text, true))::jsonb ->> 'sub'::text)), ''::text))::uuid)
Rows Removed by Filter: 25001
Planning Time: 1.175 ms
Execution Time: 27.639 ms
-
I think the run_explain plpgsql function wrapper is playing tricks on us. I can somewhat reproduce, but when I get the EXPLAIN outside the run_explain I don't get the perf issue.

-- build the 25k-row test table
create table realtest as
select x as id, 'name-' || x as name, uuid_generate_v4() as user_id, jsonb_build_object('key', x) as jsonb_col
from generate_series(1, 25000) x;

-- wrapper to run EXPLAIN under the authenticated role + fake JWT
CREATE OR REPLACE FUNCTION run_explain()
RETURNS SETOF text AS
$$
BEGIN
  set local role authenticated;
  set local request.jwt.claims to '{"role":"authenticated", "sub":"5950b438-b07c-4012-8190-6ce79e4bd8e5"}';
  RETURN QUERY
  EXPLAIN ANALYZE
  SELECT * FROM realtest WHERE user_id = auth.uid();
END
$$ LANGUAGE plpgsql;

select * from run_explain();
explain analyze SELECT * FROM realtest WHERE user_id = auth.uid();
Perhaps this is related to plpgsql functions? Seems similar to this previous discussion #8733
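For reference, one way to get the EXPLAIN outside of the run_explain wrapper is to set the role and claims directly in the session before running it (the same pattern used in the larger test script further down the thread); the uuid is the sample value from above:

begin;
set local role authenticated;
set local request.jwt.claims to '{"role":"authenticated", "sub":"5950b438-b07c-4012-8190-6ce79e4bd8e5"}';
explain analyze SELECT * FROM realtest WHERE user_id = auth.uid();
rollback;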
-
@steve-chavez So it still looks like it's being called 25,000 times, but getting rid of the jsonb and conversion processing makes a relatively big difference. Note that I also get the 30 msec result if I put just the one line in your uid_matches function.
-
@GaryAustin1 There's a workaround to avoid the JSON parsing and the subsequent json arrow operator. Basically it uses a PostgREST pre-request function to convert the JSON setting to a shorter text setting.

create or replace function pre_request()
returns void as $$
select
  set_config(
    'request.jwt.claims.sub'
  , (current_setting('request.jwt.claims',true))::json->>'sub'
  , true
  );
$$ language sql;

alter role authenticator set pgrst.db_pre_request = 'pre_request';
notify pgrst, 'reload config';

Then we can use:

current_setting('request.jwt.claims.sub', true)
-- instead of (current_setting('request.jwt.claims',true))::json->>'sub'

on RLS policies or any other logic. This can also work on more complex JSON in the claims. For a customer it was used like this:

create or replace function pre_request()
returns void as $$
select
set_config(
'request.jwt.claims.app_metadata.workspace'
, (current_setting('request.jwt.claims',true))::json->'app_metadata'->>'workspace'
, true
);
$$ language sql;
-- current_setting('request.jwt.claims.app_metadata.workspace', true) is then available; it can be wrapped by a stable function too
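As an illustration (policy and table names are just placeholders), an RLS policy can then compare against the pre-computed setting directly instead of parsing the JSON claim per row:

create policy "select own rows" on realtest
for select to authenticated
using (
  -- pre_request() already extracted the sub claim into this shorter text setting
  user_id = current_setting('request.jwt.claims.sub', true)::uuid
);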
-
Just found out a solution to this without the need for the pre-request function. Essentially, from this:

create policy "Test select RLS" on realtest
to authenticated
using (auth.uid() = user_id);

To this:

create policy "Test select RLS" on realtest
to authenticated
using (
(
with cached as materialized(
select auth.uid() as val
)
select user_id = val from cached
)
);

There's a reproducible example on https://github.com/PostgREST/postgrest/issues/2590.
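A shorter variant of the same idea, which also shows up in the test script further down the thread, is a plain scalar subselect instead of the materialized CTE; a minimal sketch on the same table:

create policy "Test select RLS (subselect)" on realtest
to authenticated
using (
  -- the uncorrelated scalar subselect lets auth.uid() be evaluated once per query
  -- rather than once per row
  user_id = (select auth.uid())
);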
-
Did some more testing with a more realistic policy: 100K rows, no difference between indexed and not indexed. Offset is adding...
EDIT: adding...
Code:

drop table if exists rlstest;
create table
rlstest as
select x as id, 'name-' || x as name, uuid_generate_v4() as user_id, uuid_generate_v4() as no_index_user_id, 'user' as role
from generate_series(1, 100000) x;
create index userid on rlstest using btree (user_id) tablespace pg_default;
update rlstest set (user_id,no_index_user_id,role) = ('5950b438-b07c-4012-8190-6ce79e4bd8e5','5950b438-b07c-4012-8190-6ce79e4bd8e5','admin') where id = 1;
update rlstest set (user_id,no_index_user_id,role) = ('79530fa3-2e6a-4c26-9356-cecff8148d46','79530fa3-2e6a-4c26-9356-cecff8148d46','user') where id = 2;
-- helper used inside the policy: is the current user flagged as admin?
-- SECURITY DEFINER so the lookup on rlstest is not itself blocked by the table's RLS
CREATE OR REPLACE FUNCTION is_admin()
RETURNS boolean AS
$$
BEGIN
  return 'admin' = (select role from rlstest where user_id = auth.uid());
END
$$ LANGUAGE plpgsql SECURITY DEFINER;
------------ Change RLS here
alter table rlstest ENABLE ROW LEVEL SECURITY;
create policy "rls_test_select" on rlstest
to authenticated
using (
--is_admin() OR auth.uid() = user_id
--is_admin() OR auth.uid() = no_index_user_id
(select is_admin()) OR user_id = (select auth.uid())
--(select is_admin()) OR no_index_user_id = (select auth.uid())
/*
(
with cached as materialized(select is_admin() as val)
select val from cached
)
OR
(
with cached as materialized(select auth.uid() as val)
select user_id = val from cached
)
*/
/*
(
with cached as materialized(select is_admin() as val)
select val from cached
)
OR
(
with cached as materialized(select auth.uid() as val)
select no_index_user_id = val from cached
)
*/
);
---------------
set local role authenticated;
--set request.jwt.claims to '{"role":"authenticated", "sub":"5950b438-b07c-4012-8190-6ce79e4bd8e5"}'; --admin
set request.jwt.claims to '{"sub":"79530fa3-2e6a-4c26-9356-cecff8148d46"}'; -- not admin
explain analyze SELECT count(*) FROM rlstest;
--select count(*) from rlstest
--select * from rlstest order by id limit 10 offset 90000;
--explain analyze select * from rlstest order by id limit 10 offset 90000;
-
Unsure if this belongs here or in PostgREST/postgrest-docs#609 because it is related to both. RLS with ...
-
Reviving this old thread... I am building out a new project and I wanted to verify I can handle large tables to find projects my user is allowed to see. My observations show me that if my function is STABLE it is called exactly twice in a query no matter the number of rows (once for the planner and once for the execution), and indexes are used as per the plans below. I therefore disagree with your conclusion that stable functions are not honored in RLS.
Background: Users are called "makers", and there's a table of their metadata populated by trigger on create into user_metadata.

CREATE TABLE maker_personality_user_role (
user_id UUID NOT NULL REFERENCES user_metadata (user_id) ON DELETE CASCADE,
maker_personality_id UUID NOT NULL REFERENCES maker_personality (maker_personality_id) ON DELETE CASCADE,
permission_role user_role_type NOT NULL,
PRIMARY KEY (user_id,maker_personality_id) -- one per user/team
);
CREATE TABLE project (
project_id UUID NOT NULL DEFAULT gen_random_uuid(),
maker_personality_id UUID NOT NULL REFERENCES maker_personality (maker_personality_id) ON DELETE CASCADE,
created_time TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP,
... more fields...
PRIMARY KEY (project_id),
UNIQUE(maker_personality_id,project_name)
);
-- function to select what I'm allowed to access
CREATE OR REPLACE FUNCTION teams_i_can(min_role user_role_type) RETURNS UUID[] AS $$
BEGIN
RAISE NOTICE 'teams_i_can';
RETURN ARRAY(SELECT maker_personality_id FROM maker_personality_user_role WHERE user_id=(auth.jwt()->>'sub')::UUID AND permission_role >= min_role);
END
$$ LANGUAGE plpgsql STABLE PARALLEL SAFE SECURITY DEFINER SET search_path = public;
CREATE POLICY "view own maker_user_permission" ON maker_personality_user_role
FOR SELECT TO authenticated USING ( (auth.jwt()->>'sub')::UUID = user_id );
CREATE POLICY "view own project" ON project
FOR SELECT TO authenticated USING ( maker_personality_id = ANY (teams_i_can('view')) );

So what I did was create 10000 new users, with 11 projects each and each user having permission to their own projects. One randomly selected user was given permission to ⅓ of all projects, so there was a large list to pick from as well. This leaves us with about 146k rows in the permissions role table. Here's what I see when selecting all projects:
Explanation of that is: the first select found all 110000 records, as it is running as the table owner. Next I simulate logging in as one of the users who owns 11 projects. You can clearly see it is only calling the teams_i_can function twice (the RAISE NOTICE fires once for planning and once for execution).
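A minimal way to reproduce that observation from the SQL editor, reusing the RAISE NOTICE already present in teams_i_can (the uuid below is a placeholder for one of the generated maker users):

begin;
set local role authenticated;
-- replace the sub value with the user_id of one of the makers created above
set local request.jwt.claims to '{"role":"authenticated", "sub":"00000000-0000-0000-0000-000000000000"}';
-- with teams_i_can declared STABLE, the 'teams_i_can' notice should print only a couple of times,
-- not once per row of project
select count(*) from project;
rollback;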
Now, if I change the ...
And for completeness, my version of Supabase running locally on my Mac M1 laptop:
-
I was starting to do some testing on RLS optimization and there appears to be a big issue with using a stable function in RLS.
I've limited the example here to using auth.uid(), a stable sql function.
To test I am running this basic template so that I can get explain on RLS in the SQL editor.
I have a table with 25000 rows and only one uuid match. RLS is (auth.uid() = user_id).
auth.uid() is called 25000 times apparently.
After verifying that stable functions work if added to the actual query in a where clause, I wondered if not having a where clause in the RLS policy was causing the optimizer to not do its thing....
So I changed the policy to:
using (exists(select true where auth.uid() = user_id))
edit:
using (non_index_user_id in (select auth.uid()))
also works. And I get this:
So over 10x improvement (on a very simple function... imagine querying another table in the function) by moving the stable function into a where in the policy.
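Spelled out as full policies on the test table, the two wrapped variants above would look roughly like this (policy names are illustrative, and only one would be enabled at a time while testing):

create policy "rls_wrapped_exists" on realtest
for select to authenticated
using ( exists(select true where auth.uid() = user_id) );

create policy "rls_wrapped_in" on realtest
for select to authenticated
using ( non_index_user_id in (select auth.uid()) );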
I see the same results on my own stable plpgsql boolean function, even when adding a cost hint of 100000. The optimizer does not seem to honor STABLE in an RLS policy with just a bare boolean function call. As soon as I put it in the exists/where wrap it is honored. Interestingly, IMMUTABLE does work, but that is not useful in most cases. The same results occur using just a dummy RLS function doing a multiply and returning true.
I've done a lot of searching on the web and only find examples of using explain for "fake RLS" testing, where they just test the function itself in a where statement rather than as a real policy, which does work as expected.
Hopefully I'm missing something here, or my test environment is not reflective of calls coming in through PostgREST.
Also, with PostgREST 10 I can hopefully do the same test through the API.
Thoughts? @steve-chavez
Edit3:
This also "explains" why auth.role() did not work well. Running the policy auth.role()='authenticated' is over 100 msec here; putting it in the exists/where wrapper brings it to 10 msec. So it seems Postgres does not honor STABLE unless an index is involved (Edit2 below) or the call is inside a fake where statement....
Edit2:
With an index on user_id, the optimizer seems to run the auth.uid() function just once and gets a sub-millisecond result. Not sure if it is because it is an SQL function or not. I need to test with a function doing a select on another table, and with plpgsql, to see if indexing helps. But for sure it looks like you must index any column you are testing auth.uid() against for best performance.
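A sketch of the index referred to here, on the realtest test table used above:

-- indexing the column compared against auth.uid() is what allowed the planner
-- to evaluate the stable function once and use an index scan
create index if not exists realtest_user_id_idx on realtest using btree (user_id);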
Edit4:
LOL. If I create a column called dummy_boolean, populate it with true, add an index on it, and compare a stable boolean function against it in the policy:
my_function() = dummy_boolean
It goes from 193 msec to 15 msec....
Edit1:
With my boolean true-or-false stable plpgsql function I confirmed the multiple calls when used on its own, versus a single call in the exists/where wrap.
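And a sketch of the Edit4 trick above; my_function stands in for the dummy stable boolean function described in the text, and the column/index names are illustrative:

alter table realtest add column if not exists dummy_boolean boolean not null default true;
create index if not exists realtest_dummy_idx on realtest using btree (dummy_boolean);

-- comparing the stable function against an indexed constant column, instead of
-- calling it as a bare boolean in the policy, was what dropped 193 msec to 15 msec
create policy "rls_dummy_boolean" on realtest
for select to authenticated
using ( my_function() = dummy_boolean );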