Redshift Stored Procedure Comma separated string in Argument
RedShift stored procedures are really useful feature and after a long time they introduced it. I tried this stored procedure for many use cases and recently I came up with a complexity in it. You can’t get a comma separated sting in the argument. Lets say I have an employee table and I want to pass the employee names in a variable/argument and you need to select * from table where employee_name in ('your comma separated values)
. It’ll not work directly.
Why? many SQL peoples already know the reason. Because you have multiple values in one string and you need to parse it as multiple quotes string and pass it.
Example:
Input: 'emp1, emp2, emp3'
In SQL you can’t use IN clause like this. You need to split each and every value with single quote. Then it should be
'emp1', 'emp2', 'emp3'
In plpgsql
, you can do this parsing with some temp table snd SQL queries. I have asked the steps in DBA’s stackoverflow and a horse with no name (yeah, its a weird name) answered it.
But in Redshift it did’t work as expected.
What’s wrong in Redshift: #
Before checking why its not working, we’ll see how are we thinking to parsing it.
- Pass the argument into a variable.
- Use trim functions to remove the single quote.
- Use the split function to split these values.
- For each values put single quote in the beginning and the end.
Or as mentioned the stackoverflow answer, we can’t use arrays for this. Thats why its not working and it’ll throw the following error.
ERROR: varchar[] is not a supported parameter type for functions or procedures
Then I searched many blogs and found the useful information.
I almost fixed the issue. But now the other problem came when I do query the Information_schema
tables. My select query will look like below.
SELECT table_schema,
table_name
FROM information_schema.TABLES
WHERE table_type = 'BASE TABLE'
AND table_schema NOT IN ( 'pg_catalog', 'information_schema' )
AND table_name IN (SELECT tname
FROM my_parsed_variables);
While running this, I got the below error.
test=# call my_sp('bhuvi');
INFO: Table "my_tables" does not exist and will be skipped
INFO: Table "my_tokenized_tables" does not exist and will be skipped
INFO: Function "has_table_privilege(oid,text)" not supported.
INFO: Function "has_table_privilege(oid,text)" not supported.
INFO: Function "has_table_privilege(oid,text)" not supported.
INFO: Function "has_table_privilege(oid,text)" not supported.
INFO: Function "has_table_privilege(oid,text)" not supported.
INFO: Function "has_table_privilege(oid,text)" not supported.
INFO: Function "has_table_privilege(oid,text)" not supported.
ERROR: Specified types or functions (one per INFO message) not supported on Redshift tables.
CONTEXT: SQL statement " SELECT table_schema, table_name FROM information_schema.TABLES WHERE table_type='BASE TABLE' AND table_schema NOT IN ('pg_catalog', 'information_schema') AND table_name !='unload_history' AND TABLE_NAME in (select tname from my_tokenized_tables)"
Its saying we don’t have enough permissions. So I need to reached out AWS support team to help on this one piece. Then they got help from RedShift product team to write the correct SQL query.
Stored Procedure with multiple values: #
With my stored procedure, I want to get the list of table names from the argument and print its name along with schema.
DROP PROCEDURE multi_comma_values(varchar);
CREATE OR REPLACE PROCEDURE multi_comma_values(tbl_list IN varchar(400))
LANGUAGE plpgsql
AS $$
DECLARE
rec RECORD;
BEGIN
DROP TABLE IF EXISTS my_tables;
DROP TABLE IF EXISTS my_tokenized_tables;
CREATE TEMP TABLE my_tables (comma_table_names varchar(400));
EXECUTE 'INSERT INTO my_tables VALUES (' || quote_literal(tbl_list) || ')';
with NS AS (
select 1 as n union all
select 2 union all
select 3 union all
select 4 union all
select 5 union all
select 6 union all
select 7 union all
select 8 union all
select 9 union all
select 10
)
SELECT Trim(Split_part(b.comma_table_names, ',', ns.n)) AS tname
INTO temp table my_tokenized_tables
FROM ns
inner join my_tables b
ON ns.n <= regexp_count(b.comma_table_names, ',') + 1;
FOR rec IN
SELECT nspname :: text AS namespace,
relname :: text AS tablename
FROM pg_class pc
join pg_namespace pn
ON pc.relnamespace = pn.oid
WHERE Trim(relname :: text) IN (SELECT tname
FROM my_tokenized_tables)
AND Trim(nspname :: text) NOT IN ( 'pg_catalog', 'information_schema' )
LOOP
raise info 'schema = %, table = %', rec.namespace, rec.tablename;
END LOOP;
DROP TABLE IF EXISTS my_tables;
DROP TABLE IF EXISTS my_tokenized_tables;
END;
$$;
Call the procedure: #
test=# call multi_comma_values ('bhuvi,test');
INFO: Table "my_tables" does not exist and will be skipped
INFO: Table "my_tokenized_tables" does not exist and will be skipped
INFO: schema = public, table = test
INFO: schema = public, table = bhuvi
CALL
Another Scanrio: #
I have a test table called bhuvi
and I have 3 rows in it.
CREATE TABLE bhuvi
(
name VARCHAR(10)
);
INSERT INTO bhuvi
VALUES ('a');
INSERT INTO bhuvi
VALUES ('b');
INSERT INTO bhuvi
VALUES ('c');
Now I want to run a select query where name in b and c
. So in the above stored procure, I just edited the below FOR LOOP part.
FOR rec IN
SELECT name
FROM bhuvi
WHERE name IN (SELECT tname
FROM my_tokenized_tables)
LOOP
raise info 'value is = %',rec.name;
END LOOP;
Call the Procedure
test=# call multi_comma_values ('b,c');
INFO: Table "my_tables" does not exist and will be skipped
INFO: Table "my_tokenized_tables" does not exist and will be skipped
INFO: value is = b
INFO: value is = c
CALL
More Customization: #
- If you would to have a permanent table, then create it in your database and remove the
With NS as
- This will only extract upto 256 I guess, if you inputing more than 10 then refer this.