I wanted to expand a little on my unwrapping talk from 2006 and also talk about PL/SQL internals BUT more in the direction of the internals and not about unwrapping. I have always been interested in this topic but more so now in some details as i am in the middle of some work currently with PFCLCode our PL/SQL static code analyser for security issues. We have been developing that tool for some time and a new GUI is being added into PFCLScan for release soon. This will allow customers to choose all PL/SQL in the database, a schema of PL/SQL or a single procedure or function or anywhere in between these limits and perform detailed security analysis of that PL/SQL code. We have over 200 check types already built into the engine that does the analysis and more are being added. The GUI allows high level views of all the PL/SQL code or detailed analysis of a single piece of code. We have colour syntax highlighted editor view of each and every issue. Because its built into PFCLScan it uses all of the power of our scanner but also at because of this allows command line running so that this tool; PFCLCode and indeed any of our tools, can be easily integrated into any other tools or workflow such as your favourite database GUI editors.
More on PFCLScan engines, plugins, command line running and integration into other tools soon in a blog post.
I have been collecting details and analysis for years on PL/SQL and I was thinking maybe to write this as an MS Word document and publish as a pdf but i decided to just put it on the blog. I have had an interest in some of these ideas since the mid 90s and some of it early noughties and some more recently as I write and analyse PL/SQL for various reasons and products. So, I decided its time to spend some time and collate and edit and release a succinct version of these ideas and notes.
Enjoy!!
Last week I spoke about the PL/SQL and its machine code in a blog post called "PL/SQL Machine Code Trace - event 10928" and how PL/SQL is a compiled language and it is eventually converted/compiled into machine code and that machine code is executed in a PL/SQL virtual machine. This is a C program that executes PL/SQL machine instructions and this VM is what runs your PL/SQL in the database. This is a concept all of us should be familiar with nowadays because of the rise of virtualisation with tools such as virtualbox or VMware etc. Those are tools that simulate an X86 or X64 or other PC hardware as software allowing copies of Windows or Linux to run in software rather than hardware. These tools are eminently part of the push to cloud with the ideas of provisioning hardware (as software) very quickly and having systems available on demand or moved as needed. Well PL/SQL does the same; it runs in a VM but it runs machine code for PL/SQL.
I did cover some of the details in this post many years ago in my Blackhat talk in 2006 Unwrapping PL/SQL but lets go into some detail again with some new elements and digressions. This post is not about unwrapping but its mention cannot be avoided.
Why am I interested in this low level detail of PL/SQL? well initially I was interested around 20 or more years ago because i loved low level internal details especially of the database. I am a C programmer, I first learned Oracle C and user exits in the mid 90s, then OCI (Oracles C API) and then Pro*C and also in parallel SQL and PL/SQL. I started as a C programmer and always loved low level details. I loved to dig around in the databases of the time (Oracle 6 and 7) and look at the files shipped with the database in the /rdbms/admin directory especially things like the .bsq files, the DIANA, PIDL, DIUTIL, SUBPTXT and more; particularly I liked what was evidently hard core PL/SQL that was written like it was a C programmer writing it; at least to me it did. One example is that I found a procedure SUBPTXT and I looked at it (this below is in 18c XE by the way now, BUT i looked at this originally in Oracle 7):
SQL> select text from dba_source where name='SUBPTXT';
TEXT
--------------------------------------------------------------------------------
procedure subptxt(name varchar2, subname varchar2, usr varchar2,
txt in out varchar2) is
begin
subptxt2(name, subname, usr, null, null, txt);
end;
I then looked at the code of this procedure by selecting it from the database:
SQL> select text from dba_source where name='SUBPTXT2';
TEXT
--------------------------------------------------------------------------------
procedure subptxt2(name varchar2, subname varchar2, usr varchar2,
dbname varchar2, dbowner varchar2,
txt in out varchar2) is
status diutil.ub4;
begin -- main
diutil.subptxt(name, subname, usr, dbname, dbowner, txt, status);
if (status <> diutil.s_ok) then
if (status = diutil.s_subpNotFound) then
txt := '$$$ s_subpNotFound';
elsif (status = diutil.s_stubTooLong) then
TEXT
--------------------------------------------------------------------------------
txt := '$$$ s_stubTooLong';
elsif (status = diutil.s_logic) then
txt := '$$$ s_logic';
elsif (status = diutil.s_notInPackage) then
txt := '$$$ s_notInPackage';
else txt := '$$$ s_other';
end if;
end if;
end subptxt2;
20 rows selected.
SQL>
Look at the comment, "-- main", next to the start of the begin; that's a C programmer, fantastic!!!. As I said I came across these functions in the 90s and also PSTUB as i was involved in C, PL/SQL and some Forms and these were used to generate server side calls for PL/SQL. As i said i loved to dig into the database details. Let's look at a sample PL/SQL function that I have used as a demo for SQL Injection for almost 20 years. This is one iteration of this sample and is of my CUST function; here is the header via a DESCRIBE:
SQL> desc cust
FUNCTION cust RETURNS VARCHAR2
Argument Name Type In/Out Default?
------------------------------ ----------------------- ------ --------
PV_NAME VARCHAR2 IN
SQL>
What if we now run the SYS.SUBPTXT procedure; what does it do? The source code file pistub.sql says its a "subprogram stub generator". What does that mean? it is from (as far as I know) the old days of Oracle Forms where server side calls needed to generate stubs to call procedures in the database; it did this by generating a "stub". First lets describe the procedure:
SQL> desc subptxt
PROCEDURE subptxt
Argument Name Type In/Out Default?
------------------------------ ----------------------- ------ --------
NAME VARCHAR2 IN
SUBNAME VARCHAR2 IN
USR VARCHAR2 IN
TXT VARCHAR2 IN/OUT
Now lets use it to generate the header / describe for my CUST function shown above:
SQL> var a varchar2(2000);
SQL> exec sys.subptxt('CUST',NULL,NULL,:a);
PL/SQL procedure successfully completed.
SQL> print a
A
--------------------------------------------------------------------------------
function CUST (PV_NAME VARCHAR2) return VARCHAR2;
SQL>
This call clearly generated the same function signature as the actual function in the database. But how does it work, what does it do to generate this signature? The SYS.SUBPTXT() procedure is not wrapped and we can read it and we can also read some of the SYS.DIUTIL package procedure that is called from the --main body shown above. I selected some of the DIUTIL package from my 18cXE database but you can also go and review the source code files in the /rdbms/admin directory on the server or view the complete source code yourself as i have done. Here is a section of the SYS.DIUTIL package (some output cut for brevity; see the ... for the gaps):
SQL> select text from dba_source where name='DIUTIL';
TEXT
--------------------------------------------------------------------------------
PACKAGE diutil IS
e_subpnotfound EXCEPTION;
e_notinpackage EXCEPTION;
e_nopriv EXCEPTION;
...
ELSIF (nkind = diana.d_parm_c) THEN
DECLARE seq pidl.ptseqnd := diana.as_list(diana.as_p_ass(n));
BEGIN
etext(last_elt(seq));
END;
-- arglist
ELSIF (nkind = diana.ds_apply) THEN
DECLARE aseq ptnod := diana.as_list(n); BEGIN
rv := rv || '(';
listtext(aseq, ',');
rv := rv || ')';
END;
-- d_f_call
ELSIF (nkind = diana.d_f_call) THEN
DECLARE args ptnod := diana.as_p_ass(n);
BEGIN
IF (pidl.ptkin(args) <> diana.ds_param) THEN
-- ordinary function call
etext(diana.a_name(n));
etext(args);
ELSE -- operator functions, determine if unary or n-ary
DECLARE s pidl.ptseqnd := diana.as_list(args);
namenode ptnod := diana.a_name(n);
BEGIN
IF (pidl.ptslen(s) = 1) THEN -- unary
etext(namenode);
rv := rv || ' ';
etext(pidl.ptgend(s, 0));
ELSE exprtext(namenode, rv); listtext(s, rv);
END IF;
END;
END IF;
END;
...
This is interesting and back in the last 90s when i first looked at this stuff I could see that this is compiler stuff; follow the sample output of DIUTIL above and you can see its reading the code in some way for FUNCTION parameters. When unwrapping of the wrap file became "a thing" in the early noughties and I realised that some of the data in the wrap file was actually DIANA nodes then I remembered all of these packages and code (DIUTIL, PIDIAN, DIANA...) and realised that this is code (in SYS.DIUTIL) that is walking the AST (What is an AST in a minute). You can clearly see that this is the case above and its nice code and interesting to understand it. We will come back to this in more details shortly.
Why walk the AST - the DIANA? well that is because DIANA ('D'escriptive 'I'ntermediate 'A'ttributed 'N'otation for 'A'da - DIANA) was designed for ADA as the language that described the AST for ADA. Why walk the AST in the SYS.DIUTIL package above to generate the function signature? well that's what one of the founding principals of DIANA was. In the days of low memory and low processing power, it was figured that the DIANA would be stored - as IDL ('I'nterface 'D'efinition 'L'anguage) and that DIANA could then be processed and used rather than the source code. This meant that it was designed into DIANA to re-generate source code as in the SUBPTXT() procedure and also in the 9iR2 and earlier wrap file format. The designers of DIANA thought that source, pretty printers, analysers etc could all be driven from ADA. Nice idea.
I have written a number of parsers and lexers for PL/SQL for all of the language and for some of it. For instance our product PFCLObfuscate features a PL/SQL parser so that PL/SQL code can be parsed and protected to remove IPR (Intellectual Property) and also to lock PL/SQL code to a database through dynamic obfuscation so that it will not work if stolen. This is the only product to protect PL/SQL that can lock your code to the database it runs on; think license protection in games from years ago. As i mentioned earlier we have also written a PL/SQL parser for our PL/SQL static code analyser PFCLCode so that your PL/SQL can be analysed for many different types of security bugs including things like SQL Injection vulnerabilities.
In starting to write PL/SQL parsers I often lamented that wouldn't it be great if Oracle exposed their AST to the database users for PL/SQL that exists in the database. Then we could process Oracles AST and analyses it in any way that we needed to - after all that was the principal for DIANA. Unfortunately, even though you can see above that functions like SYS.SUBPTXT can be used to extract the internally stored AST (DIANA) for a function or procedures signature by parsing the internal AST we cannot use the same mechanisms to parse the whole AST for any piece of PL/SQL in a procedure or function body. I will show why shortly.
Oracle for many years have made it clear that the DIANA (AST) is not stored for package bodies and whilst it does seem that DIANA (AST) is stored for standalone procedures; as we will see this is not exposed to us for use even for procedures/functions bodies either. The procedure and function body exposed as DIANA in the database is not what was written and is intended for use in these stub generators.
PL/Scope - seems like it may be a useful tool to analyse PL/SQL code; well it is of course BUT its not perfect to analyse PL/SQL properly as its really a record of identifiers in the PL/SQL code and where those identifiers are created and used (called, referenced and assigned). We can do quite a lot with PL/Scope and it certainly warrants more investigations but as an outsider who audits and investigates other peoples (customers) databases and PL/SQL then we cannot change their databases or compile their code to get the list of identifiers generated in PL/Scope. So we need tools that can analyse code without changing the databases not ones that do BUT clearly PL/Scope is in some way using the AST / DIANA of the code as its parsed and compiled and the identifiers details are written out to tables for analysis; DBA_IDENTIFIERS for instance.
Before the PL/SQL source code gets converted into M-CODE or P-CODE it starts as PL/SQL source code of course. PL/SQL is well known to be based on the ADA language. Oracle based PL/SQL on ADA but removed some of the ADA elements such as parallel processing including tasks and synchronous message passing etc. Oracle of course also added all of the hooks to embed SQL directly into the language. If you are familiar with PL/SQL and look at ADA code you will recognise a lot of the code as its very similar with similar comments, block structure, packages, types and constructs such as IF, FOR and WHILE loops. ADA has support for task based concurrency and wouldn't it be great if Oracle put that back into PL/SQL !!! - More on that in a future post.
Some, but not all of the main components of this puzzle:
- PL/SQL: This is the language of course. This is the source code that you work with BUT Oracle doesn't actually execute your source code it compiles it first into Machine code. The machine code is likely an optimised version of the code and perhaps does not reflect the original PL/SQL exactly. The PL/SQL is your original source code
- AST: Abstract Syntax Tree - This is the tree representation of a PL/SQL program after is it parsed. The parse process will take on a number of phases. The first phase is lexical analysis where each language token is recognised. For instance the keyword PROCEDURE or the keyword FOR or the symbol pair :=. The next phase is the parsing which aims to recognise the language structure or syntax such as PROCEDURE {name} '(' {name} IN|OUT|INOT DATA_TYPE, [...] ')' and also analyse types and context of the code. This analysis / parsing phase constructs a tree called an 'A'bstract 'S'yntax 'T'ree; AST. This tree often in compilers and languages can be two trees or two phases of the same tree. The first phase being the structural tree representation of the language structure BUT as a tree; the second phase is where the tree is populated with symantec or code or lexical attributes - this is where the 'A'ttributed part of the tree comes in. The compiler creates a structured tree to represent the language elements and then adds leaves or tags to the tree. This can be for Lexical elements - i.e. a variable name such as my procedure name 'CUST' or parameter 'PV_NAME'. The attributes can be semantic elements i.e. things that relate to the language, value etc of a node. The main part of the tree are the structural attributes, the elements that point to other structural attributes (other nodes). We will see some examples soon.
- DIANA: The AST in PL/SQL is made up of a tree of 315 different nodes (at least in 18c XE, less in earlier versions). The Abstract Syntax Tree starts at a root node - normally D_COMP_U in most PL/SQL (we will see this in an example in a moment) but may not always be D_COMP_U for all PL/SQL. These are the main components of the tree that is built hanging off the D_COMP_U DIANA node. Each DIANA node can have none (zero, we will see this shortly) or more attributes. These attributes are in some cases pointers to instances of other DIANA nodes
- Attributes: Each DIANA node has a set of none (zero) or many attributes. These attributes can start with L_% and these are lexical attributes; a common one is L_SYMREP which is used for the name of a symbol table entry. Another common lexical attribute is L_SRCPOS which shows the row/column position in the original PL/SQL source. Some attributes start S_% and these are semantic attributes that relate to value or settings such as S_SPEC or S_BODY or SS_PRAGM_L. Then there are also C_% attributes. These are code attributes that relate to the generated code (M-CODE / P-CODE). Examples of code attributes are C_FIXUP, C_ENTRY or C_FRAME. Finally the most important attributes are the structural attributes. An example could be A_D_ attribute that points at the DI_FUNC or DI_PROC DIANA nodes that describe the function or procedure including its name. Some structural DIANA nodes are lists such as AS_LIST and these point at a list of DIANA nodes. In effect this allows DIANA nodes to be added to the tree as a set of nodes at the same level. For instance a list of parameters for a procedure or function would be an AS_LIST as each parameter is at the same level in the tree. Each DIANA node is different and has many different attributes that allow trees to be built that represent the PL/SQL
- IDL: 'I'nterface 'D'efintion 'L'anguage. This is how the DIANA is written down on disk. This is a wat to write at DIANA AST out in a flat way.
- M-Code / P-CODE: As we showed in a previous post PL/SQL is compiled into machine code; M-CODE or P-CODE
How do i know some of the above. Well I started as i said by digging into packages and so on in the database; I also investigated ADA in some depth BUT I also bought a book a very long time ago, 16 or 17 years ago - DIANA An Intermediate Language for ADA - by Goos, Wulf, Evans and Butler:
I even found one of the authors and we engaged in some email correspondence. He told me that DIANA was dropped in ADA quite quickly but he knew Oracle used it originally in PL/SQL. He was very impressed that after all the years since they designed DIANA that someone was interested in it again.
This book is useful as it describes DIANA and its nodes and attributes and sample AST structures for most of ADA (which is also the core and a lot of PL/SQL). This book teaches you a lot of what DIANA and ASTs look like and how they work.
So, now lets see some DIANA and AST in the database for my sample PL/SQL function introduced above. First I needed to create a table so that the PL/SQL code will compile. I then installed my function into the database:
SQL> sho user
USER is "SYS"
SQL> create table credit_card (name_on_card varchar2(30), last_name varchar2(15));
Table created.
SQL> @custa.sql
Function created.
SQL>
Now we can use the DUMPDIANA code from Oracle. This is not installed by default so i need to install this from the ./rdbms/admin directory. This needs to be installed as SYSDBA and i have installed this in a pluggable database in my 18cXE database:
SQL> connect sys/oracle1@//192.168.56.78:1523/xepdb1 as sysdba
Connected.
SQL> @dumpdian
Library created.
Package created.
Package body created.
SQL>
You can describe the package and see its procedures available:
SQL> connect sys/oracle1@//192.168.56.78:1523/xepdb1 as sysdba
Connected.
SQL> desc dumpdiana
PROCEDURE DUMP
Argument Name Type In/Out Default?
------------------------------ ----------------------- ------ --------
ANAME VARCHAR2 IN
LU_TYPE NUMBER IN DEFAULT
PRINT_FORMAT BINARY_INTEGER IN DEFAULT
PROCEDURE DUMP
Argument Name Type In/Out Default?
------------------------------ ----------------------- ------ --------
NOD BINARY_INTEGER IN
PRINT_FORMAT BINARY_INTEGER IN DEFAULT
PROCEDURE NODE_COUNT
Argument Name Type In/Out Default?
------------------------------ ----------------------- ------ --------
ANAME VARCHAR2 IN
LU_TYPE NUMBER IN DEFAULT
PROCEDURE NODE_COUNT
Argument Name Type In/Out Default?
------------------------------ ----------------------- ------ --------
NOD BINARY_INTEGER IN
SQL>
Next we can run the SYS.DUMPDIANA.DUMP procedure for my CUST function and get is DIANA/AST. This is simple to run but it doesn't output anything to the screen. It creates a trace file that shows the AST/DIANA:
SQL> exec dumpdiana.dump(aname => 'CUST');
PL/SQL procedure successfully completed.
SQL>
We can now go to the Linux machine and get the trace file. This trace is the nicest to see, the "structured tree format". You can also play with DUMPDIANA and get a line format and another as well. Here is the DIANA AST for this CUST function - BUT there are some caveats!!
[oracle@oel18cxecrm trace]$ cat XE_ora_15871.trc
Trace file /opt/oracle/diag/rdbms/xe/XE/trace/XE_ora_15871.trc
Oracle Database 18c Express Edition Release 18.0.0.0.0 - Production
Version 18.4.0.0.0
Build label: RDBMS_18.4.0.0.0DBRU_LINUX.X64_181017
ORACLE_HOME: /opt/oracle/product/18c/dbhomeXE
System name: Linux
Node name: oel18cxecrm.localdomain
Release: 4.14.35-1818.3.3.el7uek.x86_64
Version: #2 SMP Mon Sep 24 14:45:01 PDT 2018
Machine: x86_64
Instance name: XE
Redo thread mounted by this instance: 1
Oracle process number: 40
Unix process pid: 15871, image: oracle@oel18cxecrm.localdomain
*** 2020-03-16T15:23:50.371519+00:00 (XEPDB1(3))
*** SESSION ID:(31.24495) 2020-03-16T15:23:50.371543+00:00
*** CLIENT ID:() 2020-03-16T15:23:50.371548+00:00
*** SERVICE NAME:(xepdb1) 2020-03-16T15:23:50.371553+00:00
*** MODULE NAME:(sqlplus.exe) 2020-03-16T15:23:50.371558+00:00
*** ACTION NAME:() 2020-03-16T15:23:50.371563+00:00
*** CLIENT DRIVER:(SQL*PLUS ) 2020-03-16T15:23:50.371567+00:00
*** CONTAINER ID:(3) 2020-03-16T15:23:50.371572+00:00
(Using the structured tree format)
PD1(2):D_COMP_U [
L_SRCPOS : row 1 col 1
A_CONTEX :
PD2(2): D_CONTEX [
L_SRCPOS : row 1 col 1
AS_LIST : < >
]
A_UNIT_B :
PD3(2): D_S_BODY [
L_SRCPOS : row 1 col 1
A_D_ :
PD4(2): DI_FUNCT [
L_SRCPOS : row 1 col 10
L_SYMREP : CUST,
S_SPEC : PD5^(2),
S_BODY : PDD^(2),
S_LOCATI : 0,
S_STUB : void,
S_FIRST : PD4^(2),
C_OFFSET : 0,
C_FIXUP : NOT YET,
C_FRAME_ : 0,
C_ENTRY_ : 1,
S_FRAME : void,
A_UP : PD3^(2),
S_LAYER : 1,
L_RESTRICT_REFERENCES : 47,
A_METH_FLAGS : 0,
SS_PRAGM_L : void,
S_INTRO_VERSION : 0,
A_PARALLEL_SPEC : void,
C_VT_INDEX : 0,
C_ENTRY_PT : 1,
S_FG_POS : 0,
S_FG_SIG : S_PLSC_SIG : AS_WHTLST : void,
]
A_HEADER :
PD5(2): D_F_ [
L_SRCPOS : row 1 col 1
AS_P_ :
PD6(2): DS_PARAM [
L_SRCPOS : row 1 col 14
AS_LIST : <
PD7(2): D_IN [
L_SRCPOS : row 1 col 15
AS_ID :
PD8(2): DS_ID [
L_SRCPOS : row 1 col 15
AS_LIST : <
PD9(2): DI_IN [
L_SRCPOS : row 1 col 15
L_SYMREP : PV_NAME,
S_OBJ_TY : PD65^(3),
S_INIT_E : void,
S_FIRST : PD9^(2),
C_OFFSET : 0,
S_FRAME : PD4^(2),
S_ADDRES : 0,
SS_BINDS : A_UP : PD7^(2),
A_FLAGS : 0,
S_PLSC_SIG : ]
>
]
A_NAME :
PDA(2): DI_U_NAM [
L_SRCPOS : row 1 col 26
L_SYMREP : VARCHAR2,
S_DEFN_PRIVATE : PD64^(3),
SS_BUCKE :
L_DEFAUL : 256
]
A_EXP_VO : void,
A_INDICA : void,
S_INTERF : void,
]
>
]
A_NAME_V :
PDB(2): DI_U_NAM [
L_SRCPOS : row 1 col 43
L_SYMREP : VARCHAR2,
S_DEFN_PRIVATE : PD64^(3),
SS_BUCKE :
L_DEFAUL : 256
]
S_OPERAT : void,
A_UP : PD3^(2)
]
A_BLOCK_ :
PDD(2): D_BLOCK [
L_SRCPOS : row 1 col 1
AS_ITEM :
PDE(2): DS_ITEM [
L_SRCPOS : row 1 col 1
AS_LIST : < >
A_UP : PDD^(2)
]
AS_STM :
PD10(2): DS_STM [
L_SRCPOS : row 1 col 0
AS_LIST : <
PD29(2): D_P_CALL [
L_SRCPOS : row 1 col 1
A_NAME :
PD2E(2): D_S_ED [
L_SRCPOS : row 1 col 1
A_NAME :
PD2C(2): D_S_ED [
L_SRCPOS : row 1 col 1
A_NAME :
PD2A(2): DI_U_NAM [
L_SRCPOS : row 1 col 1
L_SYMREP : SYS,
S_DEFN_PRIVATE : void,
SS_BUCKE :
L_DEFAUL : 0
]
A_D_CHAR :
PD2B(2): DI_U_NAM [
L_SRCPOS : row 1 col 1
L_SYMREP : SYS_STUB_FOR_PURITY_ANALYSIS,
S_DEFN_PRIVATE : PD4^(4),
SS_BUCKE :
L_DEFAUL : 0
]
S_EXP_TY : void,
]
A_D_CHAR :
PD2D(2): DI_U_NAM [
L_SRCPOS : row 1 col 1
L_SYMREP : PWPS,
S_DEFN_PRIVATE : PD29^(4),
SS_BUCKE :
L_DEFAUL : 0
]
S_EXP_TY : void,
]
AS_P_ASS :
PD2F(2): DS_P_ASS [
L_SRCPOS : row 1 col 1
AS_LIST : < >
]
S_NORMARGLIST : <>
C_OFFSET : 0,
A_UP : PD10^(2),
A_FLAGS : 0
]
PD21(2): D_P_CALL [
L_SRCPOS : row 1 col 1
A_NAME :
PD26(2): D_S_ED [
L_SRCPOS : row 1 col 1
A_NAME :
PD24(2): D_S_ED [
L_SRCPOS : row 1 col 1
A_NAME :
PD22(2): DI_U_NAM [
L_SRCPOS : row 1 col 1
L_SYMREP : SYS,
S_DEFN_PRIVATE : void,
SS_BUCKE :
L_DEFAUL : 0
]
A_D_CHAR :
PD23(2): DI_U_NAM [
L_SRCPOS : row 1 col 1
L_SYMREP : SYS_STUB_FOR_PURITY_ANALYSIS,
S_DEFN_PRIVATE : PD4^(4),
SS_BUCKE :
L_DEFAUL : 0
]
S_EXP_TY : void,
]
A_D_CHAR :
PD25(2): DI_U_NAM [
L_SRCPOS : row 1 col 1
L_SYMREP : PRPS,
S_DEFN_PRIVATE : PD1E^(4),
SS_BUCKE :
L_DEFAUL : 0
]
S_EXP_TY : void,
]
AS_P_ASS :
PD27(2): DS_P_ASS [
L_SRCPOS : row 1 col 1
AS_LIST : < >
]
S_NORMARGLIST : <>
C_OFFSET : 0,
A_UP : PD10^(2),
A_FLAGS : 0
]
PD19(2): D_P_CALL [
L_SRCPOS : row 1 col 1
A_NAME :
PD1E(2): D_S_ED [
L_SRCPOS : row 1 col 1
A_NAME :
PD1C(2): D_S_ED [
L_SRCPOS : row 1 col 1
A_NAME :
PD1A(2): DI_U_NAM [
L_SRCPOS : row 1 col 1
L_SYMREP : SYS,
S_DEFN_PRIVATE : void,
SS_BUCKE :
L_DEFAUL : 0
]
A_D_CHAR :
PD1B(2): DI_U_NAM [
L_SRCPOS : row 1 col 1
L_SYMREP : SYS_STUB_FOR_PURITY_ANALYSIS,
S_DEFN_PRIVATE : PD4^(4),
SS_BUCKE :
L_DEFAUL : 0
]
S_EXP_TY : void,
]
A_D_CHAR :
PD1D(2): DI_U_NAM [
L_SRCPOS : row 1 col 1
L_SYMREP : PWDS,
S_DEFN_PRIVATE : PD13^(4),
SS_BUCKE :
L_DEFAUL : 0
]
S_EXP_TY : void,
]
AS_P_ASS :
PD1F(2): DS_P_ASS [
L_SRCPOS : row 1 col 1
AS_LIST : < >
]
S_NORMARGLIST : <>
C_OFFSET : 0,
A_UP : PD10^(2),
A_FLAGS : 0
]
PD11(2): D_P_CALL [
L_SRCPOS : row 1 col 1
A_NAME :
PD16(2): D_S_ED [
L_SRCPOS : row 1 col 1
A_NAME :
PD14(2): D_S_ED [
L_SRCPOS : row 1 col 1
A_NAME :
PD12(2): DI_U_NAM [
L_SRCPOS : row 1 col 1
L_SYMREP : SYS,
S_DEFN_PRIVATE : void,
SS_BUCKE :
L_DEFAUL : 0
]
A_D_CHAR :
PD13(2): DI_U_NAM [
L_SRCPOS : row 1 col 1
L_SYMREP : SYS_STUB_FOR_PURITY_ANALYSIS,
S_DEFN_PRIVATE : PD4^(4),
SS_BUCKE :
L_DEFAUL : 0
]
S_EXP_TY : void,
]
A_D_CHAR :
PD15(2): DI_U_NAM [
L_SRCPOS : row 1 col 1
L_SYMREP : PRDS,
S_DEFN_PRIVATE : PD8^(4),
SS_BUCKE :
L_DEFAUL : 0
]
S_EXP_TY : void,
]
AS_P_ASS :
PD17(2): DS_P_ASS [
L_SRCPOS : row 1 col 1
AS_LIST : < >
]
S_NORMARGLIST : <>
C_OFFSET : 0,
A_UP : PD10^(2),
A_FLAGS : 0
]
>
A_UP : PDD^(2)
]
AS_ALTER :
PDF(2): DS_ALTER [
L_SRCPOS : row 1 col 1
AS_LIST : < >
S_BLOCK : PDD^(2),
S_SCOPE : void,
A_UP : PDD^(2)
]
C_OFFSET : 0,
SS_SQL :
C_FIXUP : NOT YET,
S_BLOCK : void,
S_SCOPE : void,
S_FRAME : PD4^(2),
A_UP : PD3^(2),
S_LAYER : 0,
S_FLAGS : 0,
A_ENDLIN : 0,
A_ENDCOL : 0,
A_BEGLIN : 0,
A_BEGCOL : 0,
A_FLAGS : 0
]
A_UP : PD1^(2),
A_ENDLIN : -1,
A_ENDCOL : -1,
A_BEGLIN : -1,
A_BEGCOL : -1
]
AS_PRAGM :
PDC(2): DS_PRAGM [
L_SRCPOS : row 1 col 1
AS_LIST : < >
A_UP : void,
]
SS_SQL :
SS_EXLST :
SS_BINDS : A_UP : void,
A_AUTHID : 0,
A_SCHEMA : 0,
S_FG_REFS : ]
[oracle@oel18cxecrm trace]$
The caveats!. The DIANA AST shows the main structure for this function BUT it only shows the main items, the D_COMP_U - the compilation unit - this node has attributes A_CONTEX, A_UNIT_B and AS_PRAGM amongst others. The A_UNIT_B points to a DS_BODY DIANA node and this points to the DI_FUNCT DIANA node which contains the name of the function CUST as well as its type from the DIANA node name - this is pointed at via the A_D_ attribute. The DS_BODY has also the A_HEADER and A_BLOCK attributes. The A_HEADER points at a D_F_ DIANA node which is the starting point for all of the DIANA nodes that make up the header - parameters section and return type of the function. The A_BLOCK attribute of the DS_BODY DIANA node points at the D_BLOCK DIANA node that contains the AS_ITEM, AS_STM and AS_ALTER attributes; these are the main elements of the function; the AS_ITEM is where the function variables are added to the AST and the AS_STM is where all of the body statements if the function are added; finally the AS_ALTER is where the nodes for the exception clause are added to the AST.
OK, that's complex to talk about so I have drawn a picture of the core DIANA nodes of this simple function. This is here:
The diagram shows the main AST for the D_COMP_U and how the function DIANA node is added and all of the header DIANA nodes (parameters). What is strange and missing is the main body of the function. The actual PL/SQL is here:
SQL> sho user
USER is "SYS"
SQL> select text from dba_source where name='CUST' and owner='SYS';
TEXT
--------------------------------------------------------------------------------
function cust(pv_name in varchar2) return varchar2 is
lv_stmt varchar2(2000);
type c_ref is ref cursor;
c c_ref;
name credit_card.name_on_card%type;
Begin
lv_stmt:='select name_on_card from credit_card '||
'where last_name = '''||pv_name||'''';
open c for lv_stmt;
loop
fetch c into name;
TEXT
--------------------------------------------------------------------------------
if(c%notfound) then
exit;
end if;
dbms_output.put_line('name:=['||name||']');
end loop;
close c;
return(name);
end;
20 rows selected.
SQL>
The DIANA nodes for the body of the function in the dumpdiana.dump above are incorrect (not really incorrect as clearly Oracle intended this). The real DIANA for the body is not shown but instead we get SYS_STUB_FOR_PURITY_ANALYSIS calls.
This is a shame as it means that not only did Oracle tell us the DIANA is not stored for packages but also it seems that the DIANA for the real procedure and function bodies is also not accessible. This means that use of the Oracle DIANA except to generate headers for procedure and function calls is not possible BUT we can still learn a lot from the above DIANA dump. What's interesting if you get hold of the Goos, Wulf et al book is that the DIANA nodes for Oracle PL/SQL are basically the same as for ADA as described in this book. So we can see the main structure of a code block; the name, the parameters, return, variables declared, body statements and exception clauses. This means that we can learn what the rest of the DIANA might look like for other code elements such as assign, function calls and more.
Interestingly the code files for the diana functions in DIANA, DIUTIL, DUMPDIANA and PIDL hints that the root DIANA node can be found with a debugger. Which debugger? gdb, or oradebug?
As i said earlier the DIANA is written down as IDL. Lets have a quick look at that storage. There are four tables that contain the IDL for DIANA and M-CODE/P-CODE. These are:
SQL> select table_name from dba_tables where owner='SYS' and table_name like 'IDL%';
TABLE_NAME
--------------------------------------------------------------------------------
IDL_CHAR$
IDL_SB4$
IDL_UB1$
IDL_UB2$
SQL>
We can try and read the date from these tables; well, we can read all of the data except the piece columns in three of them. Here is a simple try:
SQL> @print 'select * from idl_sb4$ where obj#=78457'
old 33: lv_str:=translate('&&1','''','''''');
new 33: lv_str:=translate('select * from idl_sb4$ where obj#=78457','''','''''');
Executing Query [select * from idl_sb4$ where obj#=78457]
declare
*
ERROR at line 1:
ORA-00600: internal error code, arguments: [12338], [], [], [], [], [], [], [],
[], [], [], []
ORA-06512: at "SYS.DBMS_SQL", line 1726
ORA-06512: at line 19
ORA-06512: at line 34
SQL>
This one was interesting as the select hung for a few seconds and then gave an ORA-0600, unhandled exception. This perhaps is a bug. Try the others:
SQL> @print 'select * from idl_ub1$ where obj#=78457'
old 33: lv_str:=translate('&&1','''','''''');
new 33: lv_str:=translate('select * from idl_ub1$ where obj#=78457','''','''''');
Executing Query [select * from idl_ub1$ where obj#=78457]
declare
*
ERROR at line 1:
ORA-00932: inconsistent datatypes: expected CHAR got LONG BINARY
ORA-06512: at line 38
ORA-06512: at line 29
ORA-06512: at "SYS.DBMS_SQL", line 1721
ORA-06512: at line 18
ORA-06512: at line 34
SQL>
And try:
SQL> @print 'select * from idl_ub2$ where obj#=78457'
old 33: lv_str:=translate('&&1','''','''''');
new 33: lv_str:=translate('select * from idl_ub2$ where obj#=78457','''','''''');
Executing Query [select * from idl_ub2$ where obj#=78457]
declare
*
ERROR at line 1:
ORA-00932: inconsistent datatypes: expected CHAR got NUMBER
ORA-06512: at line 38
ORA-06512: at line 29
ORA-06512: at "SYS.DBMS_SQL", line 1721
ORA-06512: at line 18
ORA-06512: at line 34
SQL>
The simplest way to read this data; the piece column at least is to find the block address for each table rows that you are interested in and then do a block dump and read the data that way. This is what I did in the BlackHat paper back in 2006. Lets look at the table descriptions:
SQL> desc idl_sb4$
Name Null? Type
----------------------------------------- -------- ----------------------------
OBJ# NOT NULL NUMBER
PART NOT NULL NUMBER
VERSION NUMBER
PIECE# NOT NULL NUMBER
LENGTH NOT NULL NUMBER
PIECE NOT NULL UNDEFINED
SQL> desc idl_ub1$
Name Null? Type
----------------------------------------- -------- ----------------------------
OBJ# NOT NULL NUMBER
PART NOT NULL NUMBER
VERSION NUMBER
PIECE# NOT NULL NUMBER
LENGTH NOT NULL NUMBER
PIECE NOT NULL LONG RAW
SQL> desc idl_ub2$
Name Null? Type
----------------------------------------- -------- ----------------------------
OBJ# NOT NULL NUMBER
PART NOT NULL NUMBER
VERSION NUMBER
PIECE# NOT NULL NUMBER
LENGTH NOT NULL NUMBER
PIECE NOT NULL UNDEFINED
There are two tables that have a PIECE column data type of "LONG RAW" and one that has a data type of "UNDEFINED". This is interesting but how interesting? i.e. how many others are there of "UNDEFINED" or "LONG RAW":
SQL> select count(*) from dba_tab_columns where data_type='UNDEFINED';
COUNT(*)
----------
4
SQL> select count(*) from dba_tab_columns where data_type='LONG RAW';
COUNT(*)
----------
12
Not many at all. Lets see the "UNDEFINED" columns and what tables they are in:
SQL> col owner for a30
SQL> col table_name for a30
SQL> col column_name for a30
SQL> set lines 220
SQL> l
1* select owner,table_name,column_name from dba_tab_columns where data_type='UNDEFINED'
SQL> /
OWNER TABLE_NAME COLUMN_NAME
------------------------------ ------------------------------ ------------------------------
SYS IDL_UB2$ PIECE
SYS IDL_SB4$ PIECE
SYS ORA_KGLR7_IDL_UB2 PIECE
SYS ORA_KGLR7_IDL_SB4 PIECE
SQL>
All of these are related to the IDL and DIANA and PL/SQL. What about the "LONG RAW" columns:
SQL> select owner,table_name,column_name from dba_tab_columns where data_type='LONG RAW';
OWNER TABLE_NAME COLUMN_NAME
------------------------------ ------------------------------ ------------------------------
SYS TRIGGERJAVAF$ FLAGS
SYS IDL_UB1$ PIECE
SYS JAVASNM$ LONGNAME
SYSTEM MVIEW$_ADV_AJG AJGDES
SYSTEM MVIEW$_ADV_FJG FJGDES
SYSTEM MVIEW$_ADV_GC GCDES
SYSTEM MVIEW$_ADV_CLIQUE CLIQUEDES
SYSTEM MVIEW$_ADV_INFO INFO
SYS ORA_KGLR7_IDL_UB1 PIECE
SYS JAVASNM_TMP$ LONGNAME
SYS JAVA$MC$ PIECE
OWNER TABLE_NAME COLUMN_NAME
------------------------------ ------------------------------ ------------------------------
SYS JAVA$METHOD$METADATA PIECE
12 rows selected.
SQL>
All seem to relate to code of some type, PL/SQL, VIEW, TRIGGER and JAVA. What about the CHAR IDL table? lets see the data for my CUST function:
SQL> connect sys/oracle1@//192.168.56.78:1523/xepdb1 as sysdba
Connected.
SQL> desc idl_char$
Name Null? Type
----------------------------------------- -------- ----------------------------
OBJ# NOT NULL NUMBER
PART NOT NULL NUMBER
VERSION NUMBER
PIECE# NOT NULL NUMBER
LENGTH NOT NULL NUMBER
PIECE NOT NULL LONG
SQL> select object_id from dba_objects where object_name='CUST' and owner='SYS';
OBJECT_ID
----------
78457
SQL> @print 'select * from idl_char$ where obj#=78457'
SP2-0310: unable to open file "print.sql"
SQL> @print 'select * from idl_char$ where obj#=78457'
old 33: lv_str:=translate('&&1','''','''''');
new 33: lv_str:=translate('select * from idl_char$ where obj#=78457','''','''''');
Executing Query [select * from idl_char$ where obj#=78457]
OBJ# : 78457
PART : 0
VERSION : 201326592
PIECE# : 3
LENGTH : 110
PIECE :
"CUST"E1678B6AA399D28B9391B3C5D1B351C2"PV_NAME"VARCHAR2"SYS"SYS_STUB_FOR_PURITY_
ANALYSIS"PRDS"PWDS"PRPS"PWPS""
-------------------------------------------
PL/SQL procedure successfully completed.
This is interesting and backs up what we saw for the function in the DUMPDIANA above. This shows the symbols for the function name and the parameter and its data type and then for the added in by Oracle SYS_STUB_FOR_PURITY_ANALYSIS as discussed earlier. If you look at a wrapped file from 9ir2 and earlier then this is very similar to the symbol table section in those wrapped files, except that in those files the whole of the symbol table was visible. Here is a desc of my function again for reference to show that all symbols necessary to re-generate the function signature from DIANA are there:
SQL> desc cust
FUNCTION cust RETURNS VARCHAR2
Argument Name Type In/Out Default?
------------------------------ ----------------------- ------ --------
PV_NAME VARCHAR2 IN
SQL>
Can we learn more about the nodes (DIANA) and attributes - remember from the DIUTIL package earlier where there are calls to the PIDL and DIANA packages and where these are clearly used in the walking of the available DIANA AST. Well we can query the database and find out more about the DIANA and its attributes with these packages.
Here is part of the DIANA package via a DESCRIBE command:
SQL> desc diana
FUNCTION AS_ALTER RETURNS BINARY_INTEGER
Argument Name Type In/Out Default?
------------------------------ ----------------------- ------ --------
NODE BINARY_INTEGER IN
FUNCTION AS_ALTERS RETURNS BINARY_INTEGER
Argument Name Type In/Out Default?
------------------------------ ----------------------- ------ --------
NODE BINARY_INTEGER IN
FUNCTION AS_ALTS RETURNS BINARY_INTEGER
Argument Name Type In/Out Default?
------------------------------ ----------------------- ------ --------
NODE BINARY_INTEGER IN
FUNCTION AS_ALTTYPS RETURNS BINARY_INTEGER
Argument Name Type In/Out Default?
------------------------------ ----------------------- ------ --------
NODE BINARY_INTEGER IN
FUNCTION AS_APPLY RETURNS BINARY_INTEGER
Argument Name Type In/Out Default?
------------------------------ ----------------------- ------ --------
...
We cannot see the variables that are defined in the package via the describe command but we can see them via a select from the source code or by looking at the source code file in the /rdbms/admin directory. Here is the start and end of the DIANA nodes definitions from the DIANA package:
SQL> select text from dba_source where name='DIANA';
TEXT
--------------------------------------------------------------------------------
package diana is
D_ABORT constant pidl.ptnty := 1;
D_ACCEPT constant pidl.ptnty := 2;
D_ACCESS constant pidl.ptnty := 3;
D_ADDRES constant pidl.ptnty := 4;
D_AGGREG constant pidl.ptnty := 5;
D_ALIGNM constant pidl.ptnty := 6;
D_ALL constant pidl.ptnty := 7;
D_ALLOCA constant pidl.ptnty := 8;
D_ALTERN constant pidl.ptnty := 9;
...
DI_ASSEMBLY constant pidl.ptnty := 313;
D_WHTLST constant pidl.ptnty := 314;
DS_WHTLST constant pidl.ptnty := 315;
The DS_WHTLST is clearly the node for the new white listing in packages added in 12c. This is supported in the PL/SQL syntax with the "ACCESSIBLE BY" clause.
We can also look at the start of the PIDL package:
SQL> desc pidl
FUNCTION PTATTANM RETURNS VARCHAR2
Argument Name Type In/Out Default?
------------------------------ ----------------------- ------ --------
ATTR_ENUM BINARY_INTEGER IN
FUNCTION PTATTBTY RETURNS BINARY_INTEGER
Argument Name Type In/Out Default?
------------------------------ ----------------------- ------ --------
NODE_ENUM BINARY_INTEGER IN
ATTR_ENUM BINARY_INTEGER IN
FUNCTION PTATTCNT RETURNS BINARY_INTEGER
Argument Name Type In/Out Default?
------------------------------ ----------------------- ------ --------
NODE_ENUM BINARY_INTEGER IN
FUNCTION PTATTNNM RETURNS VARCHAR2
Argument Name Type In/Out Default?
------------------------------ ----------------------- ------ --------
NODE_ENUM BINARY_INTEGER IN
...
This package can be used to list out all of the attributes for each DIANA node. This can be worked out by studying this package and also by studying the source of scripts in the /rdbms/admin directory that use these functions and procedures.
We can use my attrib.sql script that has been on my website since around 2005. We can use this script to create my attrib procedure and then use that to list attributes for a DIANA node. Here is the first example of the D_COMP_U root DIANA node seen in the DUMPDIANA output above for my CUST procedure:
SQL> get attrib
1 create or replace procedure attrib(nod sys.pidl.ptnty) is
2 len pidl.ub2;
3 i binary_integer;
4 begin
5 dbms_output.put_line('Node Type '||pidl.ptattnnm(nod));
6 dbms_output.put_line('Num Attributes '||pidl.ptattcnt(nod));
7 len := pidl.ptattcnt(nod) - 1;
8 for i in 0..len loop
9 dbms_output.put_line(i||': '||pidl.ptatttyp(nod,i)
10 ||':'||pidl.ptattanm(pidl.ptatttyp(nod,i))
11 ||':'||pidl.ptattbty(nod,pidl.ptatttyp(nod,i))
12 ||': REF '||pidl.ptattrty(nod,pidl.ptatttyp(nod,i)));
13 end loop;
14* end;
SQL>
SQL> @attrib
Procedure created.
SQL> begin
2 attrib(23);
3 end;
4 /
Node Type D_COMP_U
Num Attributes 10
0: 9:A_CONTEX:1: REF 1
1: 40:A_UNIT_B:1: REF 1
2: 62:AS_PRAGM:1: REF 1
3: 114:SS_SQL:30: REF 0
4: 113:SS_EXLST:30: REF 0
5: 111:SS_BINDS:30: REF 0
6: 41:A_UP:1: REF 0
7: 138:A_AUTHID:2: REF 0
8: 142:A_SCHEMA:2: REF 0
9: 197:S_FG_REFS:30: REF 1
PL/SQL procedure successfully completed.
Look at the attributes above for D_COMP_U DIANA node and compare with the tree dump above. As an example we can also look at the D_BLOCK DIANA node as it also appears in then DIANA AST tree above. First find out the D_BLOCK value and then use attrib to get its attributes:
SQL> select text from dba_source where name='DIANA' and upper(text) like '%D_BLOCK%';
TEXT
--------------------------------------------------------------------------------
D_BLOCK constant pidl.ptnty := 17;
SQL>
SQL> begin
2 attrib(17);
3 end;
4 /
Node Type D_BLOCK
Num Attributes 17
0: 56:AS_ITEM:1: REF 1
1: 64:AS_STM:1: REF 1
2: 43:AS_ALTER:1: REF 1
3: 69:C_OFFSET:3: REF 0
4: 114:SS_SQL:30: REF 0
5: 66:C_FIXUP:11: REF 0
6: 79:S_BLOCK:1: REF 0
7: 103:S_SCOPE:1: REF 0
8: 88:S_FRAME:1: REF 0
9: 41:A_UP:1: REF 0
10: 92:S_LAYER:6: REF 0
11: 135:S_FLAGS:4: REF 0
12: 188:A_ENDLIN:3: REF 0
13: 187:A_ENDCOL:3: REF 0
14: 186:A_BEGLIN:3: REF 0
15: 185:A_BEGCOL:3: REF 0
16: 120:A_FLAGS:4: REF 0
PL/SQL procedure successfully completed.
SQL>
Compare again with above. Remember that AS_ITEM is where variables are declared - i.e. the declare block. The AS_STM is the statements for the main body and AS_ALTER is the exception clause. Goos, Wulf et al confirm this in their DIANA book.
I mentioned earlier that some nodes can have zero attributes and that some nodes are from ADA. The D_ABORT is a good example. It has zero attributes:
SQL> select text from dba_source where name='DIANA' and upper(text) like '%ABORT%';
TEXT
--------------------------------------------------------------------------------
D_ABORT constant pidl.ptnty := 1;
SQL>
SQL> begin
2 attrib(1);
3 end;
4 /
Node Type D_ABORT
Num Attributes 0
PL/SQL procedure successfully completed.
SQL>
This is a node from ADA and is most likely not used in PL/SQL as there does not seem to be an ABORT PL/SQL keyword but ABORT does appear in the reserved words list and also in the Oracle documentation as a PL/SQL reserved word - ABORT reserved word.
SQL> select * from v$reserved_words where keyword like '%ABORT%';
KEYWORD
--------------------------------------------------------------------------------
LENGTH R R R R D CON_ID
---------- - - - - - ----------
ABORT
5 N N N N N 0
SQL>
Some more ADA keywords and DIANA nodes can be found. For instance ADA supports tasks and the TASK keyword as well as entries.
SQL> select text from dba_source where name='DIANA' and upper(text) like '%TASK%';
TEXT
--------------------------------------------------------------------------------
DI_TASK_ constant pidl.ptnty := 156;
SQL>
SQL> select text from dba_source where name='DIANA' and upper(text) like '%ENTR%';
TEXT
--------------------------------------------------------------------------------
D_ENTRY constant pidl.ptnty := 39;
D_ENTRY_ constant pidl.ptnty := 40;
DI_ENTRY constant pidl.ptnty := 137;
VTABLE_ENTRY constant pidl.ptnty := 300;
function c_ENTRY_(node pidl.ptnod) return pidl.ub4;
function c_ENTRY_PT(node pidl.ptnod) return pidl.ub4;
function c_ENTRY_(node pidl.ptnod) return pidl.ub4 is
function c_ENTRY_PT(node pidl.ptnod) return pidl.ub4 is
8 rows selected.
SQL> select * from v$reserved_words where keyword like '%ENTR%';
KEYWORD
--------------------------------------------------------------------------------
LENGTH R R R R D CON_ID
---------- - - - - - ----------
ENTRY
5 N N N N N 0
SQL>
we can see the DIANA, we can see part of the AST structure for a procedure or function and we can analyse all nodes, their attribute and so on; wee can see some of a diana tree for the headers of functions and procedures but we cannot get the body DIANA (except via an unwapper).
Unwrapping of pre 10g PL/SQL wrapped code is in effect talking to the DIANA and AST. I wrote unwrap_c.sql 16 years ago and created an unwrapper for PL/SQL in PL/SQL. I also wrote unwrappers for PL/SQL 9i and earlier in C - of course, it is my favourite language. These tools are not available so please don't ask me for them. unwrap_c.sql can create the DIANA AST BUT its old and as we need to use a 9i wrap.exe to get the wrap file so that it can then be processed. This is not practical method or the right thing to do to write tools to analyse PL/SQL. This is a 9i PL/SQL compiler and old and doesn't do 18c, 19c, 20c PL/SQL new features of course.
To do something modern with PL/SQL we need to write our own parsers for PL/SQL as the database software does not expose the whole AST for all PL/SQL to use in our own tools. I suspect that might be a way to get the PL/SQL compiler to spit out the whole AST for debugging and analysis - if I was the writer of the PL/SQL compiler and knowing Oracles penchant for adding trace, instrumentation and tools to spit out internals I would add it, so I would be surprised if there is not a way to do that. If there is an event, parameter, command to tell the PL/SQL compiler to spit out the AST I have not found it and if I did find it it probably is not the right way to parse PL/SQL ourselves. BUT, clearly having access to the PL/SQL AST would be a massive time saver to create PL/SQL based tools if we could access it.
We have delved into some PL/SQL internals and whilst you don't need to know this to write PL/SQL its useful if you want to analyse that PL/SQL for any reason. Its always good to get a better understanding of how something works; Oracle really helps us do this with all of the clues and pieces that they leave exposed for us to look at in the database, code files, tools etc.
I said in my blackhat paper that 10g and above still uses DIANA (clearly we can see this is the case on my examples above written and tested in 18c XE). Some people mis-quoted me after that paper in 2006 that I said the wrap mechanism in 10g and above is the same as 9i and earlier. This is NOT what I said or meant and those that misquoted me and also in some cases cut and pasted my slides into their own papers didn't get it. I said that clearly 10g and above still uses DIANA and the AST in some way. I hope that this post was interesting for you to read.
I enjoyed writing it and its nice to spend some time to share knowledge.
OK, bye from WFH (Working From Home)