Learning SQL Notes #13: Metadata

Siqi Zheng

Last updated on Jun 11, 2021 7 min read SQL

Data About Data
information_schema
Working with Metadata

A database server also needs to store information about all of the database objects (tables, views, indexes, etc.) that were created to store this data in a database. This chapter discusses how and where this information, known as metadata, is stored, how you can access it, and how you can use it to build flexible systems.

Data About Data

Metadata is essentially data about data. Every time you create a database object, the database server needs to record various pieces of information. For example, if you were to create a table with multiple columns, a primary key constraint, three indexes, and a foreign key constraint, the database server would need to store all the following information:

Table name
Table storage information (tablespace, initial size, etc.)
Storage engine
Column names
Column data types
Default column values
not null column constraints
Primary key columns
Primary key name
Name of primary key index
Index names
Index types (B-tree, bitmap)
Indexed columns
Index column sort order (ascending or descending)
Index storage information
Foreign key name
Foreign key columns
Associated table/columns for foreign keys

This data is collectively known as the data dictionary or system catalog. The database server needs to store this data persistently, and it needs to be able to quickly retrieve this data in order to verify and execute SQL statements. Additionally, the database server must safeguard this data so that it can be modified only via an appropriate mechanism, such as the alter table statement.

Every database server uses a different mechanism to publish metadata, such as:

A set of views, such as Oracle Database’s user_tables and all_constraints views
A set of system-stored procedures, such as SQL Server’s sp_tables procedure or Oracle Database’s dbms_metadata package
A special database, such as MySQL’s information_schema database

information_schema

All of the objects available within the information_schema database (or schema, in the case of SQL Server) are views. Unlike the describe utility, the views within information_schema can be queried and, thus, used programmatically.

Table Name	Description	Introduced	Deprecated
`ADMINISTRABLE_ROLE_AUTHORIZATIONS`	Grantable users or roles for current user or role	8.0.19
`APPLICABLE_ROLES`	Applicable roles for current user	8.0.19
`CHARACTER_SETS`	Available character sets
`CHECK_CONSTRAINTS`	Table and column CHECK constraints	8.0.16
`COLLATION_CHARACTER_SET_APPLICABILITY`	Character set applicable to each collation
`COLLATIONS`	Collations for each character set
`COLUMN_PRIVILEGES`	Privileges defined on columns
`COLUMN_STATISTICS`	Histogram statistics for column values
`COLUMNS`	Columns in each table
`COLUMNS_EXTENSIONS`	Column attributes for primary and secondary storage engines	8.0.21
`CONNECTION_CONTROL_FAILED_LOGIN_ATTEMPTS`	Current number of consecutive failed connection attempts per account
`ENABLED_ROLES`	Roles enabled within current session	8.0.19
`ENGINES`	Storage engine properties
`EVENTS`	Event Manager events
`FILES`	Files that store tablespace data
`INNODB_BUFFER_PAGE`	Pages in InnoDB buffer pool
`INNODB_BUFFER_PAGE_LRU`	LRU ordering of pages in InnoDB buffer pool
`INNODB_BUFFER_POOL_STATS`	InnoDB buffer pool statistics
`INNODB_CACHED_INDEXES`	Number of index pages cached per index in InnoDB buffer pool
`INNODB_CMP`	Status for operations related to compressed InnoDB tables
`INNODB_CMP_PER_INDEX`	Status for operations related to compressed InnoDB tables and indexes
`INNODB_CMP_PER_INDEX_RESET`	Status for operations related to compressed InnoDB tables and indexes
`INNODB_CMP_RESET`	Status for operations related to compressed InnoDB tables
`INNODB_CMPMEM`	Status for compressed pages within InnoDB buffer pool
`INNODB_CMPMEM_RESET`	Status for compressed pages within InnoDB buffer pool
`INNODB_COLUMNS`	Columns in each InnoDB table
`INNODB_DATAFILES`	Data file path information for InnoDB file-per-table and general tablespaces
`INNODB_FIELDS`	Key columns of InnoDB indexes
`INNODB_FOREIGN`	InnoDB foreign-key metadata
`INNODB_FOREIGN_COLS`	InnoDB foreign-key column status information
`INNODB_FT_BEING_DELETED`	Snapshot of INNODB_FT_DELETED table
`INNODB_FT_CONFIG`	Metadata for InnoDB table FULLTEXT index and associated processing
`INNODB_FT_DEFAULT_STOPWORD`	Default list of stopwords for InnoDB FULLTEXT indexes
`INNODB_FT_DELETED`	Rows deleted from InnoDB table FULLTEXT index
`INNODB_FT_INDEX_CACHE`	Token information for newly inserted rows in InnoDB FULLTEXT index
`INNODB_FT_INDEX_TABLE`	Inverted index information for processing text searches against InnoDB table FULLTEXT index
`INNODB_INDEXES`	InnoDB index metadata
`INNODB_METRICS`	InnoDB performance information
`INNODB_SESSION_TEMP_TABLESPACES`	Session temporary-tablespace metadata	8.0.13
`INNODB_TABLES`	InnoDB table metadata
`INNODB_TABLESPACES`	InnoDB file-per-table, general, and undo tablespace metadata
`INNODB_TABLESPACES_BRIEF`	Brief file-per-table, general, undo, and system tablespace metadata
`INNODB_TABLESTATS`	InnoDB table low-level status information
`INNODB_TEMP_TABLE_INFO`	Information about active user-created InnoDB temporary tables
`INNODB_TRX`	Active InnoDB transaction information
`INNODB_VIRTUAL`	InnoDB virtual generated column metadata
`KEY_COLUMN_USAGE`	Which key columns have constraints
`KEYWORDS`	MySQL keywords
`MYSQL_FIREWALL_USERS`	Firewall in-memory data for account profiles		8.0.26
`MYSQL_FIREWALL_WHITELIST`	Firewall in-memory data for account profile allowlists		8.0.26
`ndb_transid_mysql_connection_map`	NDB transaction information
`OPTIMIZER_TRACE`	Information produced by optimizer trace activity
`PARAMETERS`	Stored routine parameters and stored function return values
`PARTITIONS`	Table partition information
`PLUGINS`	Plugin information
`PROCESSLIST`	Information about currently executing threads
`PROFILING`	Statement profiling information
`REFERENTIAL_CONSTRAINTS`	Foreign key information
`RESOURCE_GROUPS`	Resource group information
`ROLE_COLUMN_GRANTS`	Column privileges for roles available to or granted by currently enabled roles	8.0.19
`ROLE_ROUTINE_GRANTS`	Routine privileges for roles available to or granted by currently enabled roles	8.0.19
`ROLE_TABLE_GRANTS`	Table privileges for roles available to or granted by currently enabled roles	8.0.19
`ROUTINES`	Stored routine information
`SCHEMA_PRIVILEGES`	Privileges defined on schemas
`SCHEMATA`	Schema information
`SCHEMATA_EXTENSIONS`	Schema options	8.0.22
`ST_GEOMETRY_COLUMNS`	Columns in each table that store spatial data
`ST_SPATIAL_REFERENCE_SYSTEMS`	Available spatial reference systems
`ST_UNITS_OF_MEASURE`	Acceptable units for ST_Distance()	8.0.14
`STATISTICS`	Table index statistics
`TABLE_CONSTRAINTS`	Which tables have constraints
`TABLE_CONSTRAINTS_EXTENSIONS`	Table constraint attributes for primary and secondary storage engines	8.0.21
`TABLE_PRIVILEGES`	Privileges defined on tables
`TABLES`	Table information
`TABLES_EXTENSIONS`	Table attributes for primary and secondary storage engines	8.0.21
`TABLESPACES`	Tablespace information
`TABLESPACES_EXTENSIONS`	Tablespace attributes for primary storage engines	8.0.21
`TP_THREAD_GROUP_STATE`	Thread pool thread group states
`TP_THREAD_GROUP_STATS`	Thread pool thread group statistics
`TP_THREAD_STATE`	Thread pool thread information
`TRIGGERS`	Trigger information
`USER_ATTRIBUTES`	User comments and attributes	8.0.21
`USER_PRIVILEGES`	Privileges defined globally per user
`VIEW_ROUTINE_USAGE`	Stored functions used in views	8.0.13
`VIEW_TABLE_USAGE`	Tables and views used in views	8.0.13
`VIEWS`	View information

Working with Metadata

Schema Generation Scripts

Generate a script that will create the various tables, indexes, views, and so on, that the team has deployed. Build a script that will create the sakila.category table. The following codes can be used to create a template-like SQL script.

SELECT 'CREATE TABLE category (' create_table_statement 
UNION ALL
SELECT cols.txt 
FROM 
(SELECT concat(' ',column_name, ' ', column_type, 
CASE
WHEN is_nullable = 'NO' THEN ' not null' ELSE ''
END, CASE
WHEN extra IS NOT NULL AND extra LIKE 'DEFAULT_GENERATED%' THEN concat(' DEFAULT ',column_default,substr(extra,18)) WHEN extra IS NOT NULL THEN concat(' ', extra)
ELSE '' END, ',') txt
FROM information_schema.columns 
WHERE table_schema = 'sakila' AND table_name = 'category' 
ORDER BY ordinal_position 
) cols 
UNION ALL
SELECT concat(' constraint primary key (') 
FROM information_schema.table_constraints 
WHERE table_schema = 'sakila' AND table_name = 'category' 
AND constraint_type = 'PRIMARY KEY' 
UNION ALL
SELECT cols.txt 
FROM 
(SELECT concat(CASE WHEN ordinal_position > 1 THEN ' ,' 
ELSE ' ' END, column_name) txt
FROM information_schema.key_column_usage 
WHERE table_schema = 'sakila' AND table_name = 'category' 
AND constraint_name = 'PRIMARY'
ORDER BY ordinal_position 
) cols 
UNION ALL 
SELECT ' )' 
UNION ALL
SELECT ')';

Deployment Verification

After the deployment scripts have been run, it’s a good idea to run a verification script to ensure that the new schema objects are in place with the appropriate columns, indexes, primary keys, and so forth. Here’s a query that returns the number of columns, number of indexes, and number of primary key constraints (0 or 1) for each table in the Sakila schema:

SELECT tbl.table_name, 
 (SELECT count(*) 
 FROM information_schema.columns clm
 WHERE clm.table_schema = tbl.table_schema
 AND clm.table_name = tbl.table_name) num_columns, 
 (SELECT count(*) 
 FROM information_schema.statistics sta 
 WHERE sta.table_schema = tbl.table_schema 
 AND sta.table_name = tbl.table_name) num_indexes,
 (SELECT count(*) 
 FROM information_schema.table_constraints tc 
 WHERE tc.table_schema = tbl.table_schema 
 AND tc.table_name = tbl.table_name
 AND tc.constraint_type = 'PRIMARY KEY') num_primary_keys 
 FROM information_schema.tables tbl
 WHERE tbl.table_schema = 'sakila' AND tbl.table_type = 'BASE TABLE'
 ORDER BY 1;

TABLE_NAME	num_columns	num_indexes	num_primary_keys
actor	4	2	1

Dynamic SQL Generation

Most relational database servers, including SQL Server, Oracle Database, and MySQL, allow SQL statements to be submitted to the server as strings. Submit‐ ting strings to a database engine rather than utilizing its SQL interface is generally known as dynamic SQL execution.

Oracle’s PL/SQL language

execute immediate

SQL Server

sp_executesql

MySQL

prepare, execute, deallocate

SET @qry = 'SELECT customer_id, first_name, last_name FROM customer'; 
PREPARE dynsql1 FROM @qry; 
EXECUTE dynsql1;
DEALLOCATE PREPARE dynsql1;

/*conditions can be specified at runtime*/
SET @qry = 'SELECT customer_id, first_name, last_name FROM customer WHERE customer_id = ?';
PREPARE dynsql2 FROM @qry;
SET @custid = 9;
EXECUTE dynsql2 USING @custid;
SET @custid = 145;
EXECUTE dynsql2 USING @custid;
DEALLOCATE PREPARE dynsql2;

Or you can do the following:

SELECT concat('SELECT ', concat_ws(',', cols.col1, cols.col2),
' FROM customer WHERE customer_id = ?') 
INTO @qry 
FROM (SELECT
max(CASE WHEN ordinal_position = 1 THEN column_name 
ELSE NULL END) col1,
max(CASE WHEN ordinal_position = 2 THEN column_name
ELSE NULL END) col2
FROM information_schema.columns 
WHERE table_schema = 'sakila' AND table_name = 'customer' 
GROUP BY table_name
) cols;

PREPARE dynsql3 FROM @qry; 
SET @custid = 45; Query OK, 0 rows affected (0.00 sec)
EXECUTE dynsql3 USING @custid;
DEALLOCATE PREPARE dynsql3;

Note: Generally, it would be better to generate the query using a procedural language that includes looping constructs, such as Java, PL/SQL, Transact-SQL, or MySQL’s Stored Procedure Language.

R SQL notes