RedShift Reconstructing SQL from STL_QUERYTEXT
Bhuvanesh
by Bhuvanesh
2 min read

Categories

Tags



If you are managing the RedShift clusters then STL_QUERY and STL_QUERYTEXT tables are not new to you. STL_Query can’t hold the complete SQL query instead we can use STL_QueryText to read the complete query. But there is a challenge, we can’t read that table as it is. Since your queries are saved in multiple rows. So we need to combine all these rows into a single row with LISTAGG function which is well documented here.

What is the real challenge?

From the AWS documentation example, they are showing to reconstruct the SQL for a single query. But in a production system, you may have 50k+ queries where the LISTAGG function will through some errors related to its limitation. Let’s simulate this. I ran the below query on my production RedShift Cluster.

SELECT   Listagg( 
         CASE 
                  WHEN Len(Rtrim(text)) = 0 THEN text 
                  ELSE Rtrim(text) 
         END, '') within GROUP (ORDER BY SEQUENCE) AS text
FROM     stl_querytext limit 10;

ERROR:  Result size exceeds LISTAGG limit
DETAIL:
  -----------------------------------------------
  error:  Result size exceeds LISTAGG limit
  code:      8001
  context:   LISTAGG limit: 65535
  query:     180132
  location:  0.cpp:228
  process:   query0_78_180132 [pid=23735]
  -----------------------------------------------

Why this error?

If you read the LISTAGG function details, If the result set is larger than the maximum VARCHAR size (64K – 1, or 65535), then LISTAGG returns the error. Here I have more than 50k+ rows. So the overall characters will not fit in the result set.

What is the solution?:

If we run the reconstruct query one by one instead of all then the resultset will fit into the LISTAGG’s limitation. Lets say one of my queries split into 2 rows in the STL_QUERYTEXT table.

SELECT text 
FROM   stl_querytext 
WHERE  query = 97729 
ORDER  BY SEQUENCE; 

First, process these two rows and then process another query and then the next one.

How?

We have userid,pid,xid,query columns are common between these two rows. So partition your resultset by query wise. Then do the LISTAGG.

LISTAGG(
      CASE
        WHEN LEN(RTRIM(text)) = 0 THEN text
        ELSE RTRIM(text)
      END,
      ''
    ) within group (
      order by
        sequence
    ) over (PARTITION by userid, xid, pid, query) 

Replace \n and \r characters:

If you are going to run this query from psql client most of the queries starts with \r\nselect * from mytbl or any GUI based tools will return \n characters.

  • \r actually a carriage return
  • \n is not a new line its a sting. So we have to replace twice.
replace(replace(
    LISTAGG(
      CASE
        WHEN LEN(RTRIM(text)) = 0 THEN text
        ELSE RTRIM(text)
      END,
      ''
    ) within group (
      order by
        sequence
    ) over (PARTITION by userid, xid, pid, query),
    '\r','' 
  ), '\\n', '')

Final View:

For east management, Im going to create a view on top of the STL_QUERYTEXT table to return the reconstructed SQL for all the queries.

CREATE VIEW recon_stl_querytext 
AS 
  (SELECT DISTINCT userid, 
                   xid, 
                   pid, 
                   query, 
                   Replace(Replace(Listagg(CASE 
                                             WHEN Len(Rtrim(text)) = 0 THEN text 
                                             ELSE Rtrim(text) 
                                           END, '') 
                                     within GROUP ( ORDER BY SEQUENCE ) over ( 
                                       PARTITION BY userid, xid, pid, query), 
                           '\r' 
                           , ''), '\\n', '') 
                   AS text 
   FROM   stl_querytext 
   ORDER  BY 1, 
             2, 
             3, 
             4); 

Now you can view the proper data from this view.

SELECT * 
FROM   recon_stl_querytext 
WHERE  query = 97729; 

I hope this helps you to get the overall consolidated view on the query history. If you are interested in how to Audit RedShift Historical Queries With pgbadger please click here.


Advertisement


Share this article