xSchema — Documentation

Introduction

xSchema is a command-line SQL Server schema extraction tool written in Python. It connects to a SQL Server database and produces a comprehensive, human-readable Markdown document describing every aspect of the schema — tables, columns, keys, indexes, views, stored procedures, functions, sequences, and user-defined types.

The output is designed to be fed directly into AI assistants, used as living project documentation, or committed to version control alongside your codebase.

SQL Server DB

→

xSchema.py

→

schema.md

→

AI / Docs / Git

Requirements

Python

Python 3.8 or later. No other version constraints.

pyodbc

The only third-party dependency. Install via pip.

Install the dependency

pip install pyodbc

ODBC Driver

A SQL Server ODBC driver must be installed on your system. xSchema automatically selects the newest available driver. The recommended driver is ODBC Driver 17 for SQL Server or later.

Platform	Install method
macOS	`brew install msodbcsql17` — also requires `brew install unixodbc`
Linux (Debian/Ubuntu)	Add Microsoft apt repository and install `msodbcsql17`
Windows	Download the installer from Microsoft Download Center

Quick Start

Step 1 — Set your connection details

Open xSchema.py and edit the configuration block near the top:

SERVER      = "192.168.0.1"   # Your SQL Server IP or hostname
PORT        = 1433              # Default SQL Server port
DATABASE    = "sqlserver"          # Target database name
USERNAME    = "me"              # SQL Server login
PASSWORD    = "letmein"                # SQL Server password
OUTPUT_FILE = "schema.md"       # Output filename (or None for auto)

xSchema — Documentation

Security note: The password is stored in plain text. Do not commit xSchema.py with a real password to any repository. Consider using an environment variable: PASSWORD = os.environ.get("XSCHEMA_PWD", "")

Step 2 — Run the script

python xSchema.py

Step 3 — Find your output

The schema file (schema.md by default) is written to the same directory as the script. xSchema prints progress messages as it connects, queries, and writes, finishing with the file size in bytes.

Connection Settings

All connection parameters live in the configuration block at the top of xSchema.py. There is no external config file — this is intentional for simplicity.

Variable	Type	Default	Description
`SERVER`	str	`"192.168.1.160"`	IP address or hostname of the SQL Server instance
`PORT`	int	`1433`	TCP port — 1433 is the SQL Server default
`DATABASE`	str	`"sqlserver"`	The name of the database to extract
`USERNAME`	str	`"sa"`	SQL Server login (SQL authentication)
`PASSWORD`	str	`""`	Password for the above login
`OUTPUT_FILE`	str / None	`"schema.md"`	Output filename. Set to `None` to auto-name as `<database>_schema.md`

Certificate Trust

The connection string includes TrustServerCertificate=yes, bypassing SSL certificate validation. Appropriate for local/internal servers — remove for production connections with a trusted certificate.

Windows Authentication

To use Windows auth instead of SQL auth, replace the UID/PWD portion in build_connection_string() with Trusted_Connection=yes; and remove the USERNAME/PASSWORD variables.

CLI Arguments

xSchema accepts optional flags to control output content. All are opt-in; the defaults produce the most focused result.

Flag	Type	Default	Description
`--sprocs`	flag	off	Include all non-system stored procedures and their parameters
`--views`	flag	off	Include all non-system views, their columns, and full SQL definition
`--minrecords N`	int	`1`	Exclude tables with fewer than N rows

--minrecords in detail

Filters tables using the approximate row count from sys.partitions. The default of 1 excludes completely empty tables. Set to 0 to include all tables, or a higher value such as 100 to focus on tables with meaningful data. Note that counts are approximate and may lag behind recent bulk operations.

Run python xSchema.py --help at any time to see all flags and their defaults.

Output File

xSchema writes a single UTF-8 Markdown (.md) file, readable in any Markdown renderer including GitHub, VS Code, Obsidian, and AI chat tools.

Document Structure

▶Header — database name, generation timestamp, active options

▶Database information — SQL Server version, edition, collation

▶Table of contents — auto-generated with anchor links to every section and table

▶Schemas — all non-system schemas and their owners

▶Tables — columns, PKs, FKs, unique constraints, check constraints, indexes, triggers per table

▶Views --views — columns and full SQL definition

▶Stored procedures --sprocs — parameters with types and directions

▶Functions — all user-defined scalar and table-valued functions

▶Sequences — if any exist in the database

▶User-defined types — if any exist in the database

Output File Naming

OUTPUT_FILE value	Resulting filename
`"schema.md"` (or any string)	Exactly that filename
`None`	`sqlserver_schema.md` (database name + `_schema.md`)

Usage Examples

Basic extraction — tables with data only

python xSchema.py

Include stored procedures and views

python xSchema.py --sprocs --views

Only tables with 100+ rows, include stored procedures

python xSchema.py --minrecords 100 --sprocs

Include all tables regardless of row count

python xSchema.py --minrecords 0 --sprocs --views

Auto-name the output file

Set OUTPUT_FILE = None in the script. The file will be written as schema.md.

Schema Output — What's Captured

For each table, xSchema extracts the following detail:

Section	Details captured
Table metadata	Approximate row count, created date, modified date, MS_Description extended property
Columns	ID, name, data type (with length/precision/scale), nullability, identity spec, computed column definition, default value, collation, MS_Description
Primary key	Constraint name, column(s), key ordinal, descending flag
Foreign keys	Constraint name, column, referenced table/column, ON DELETE/UPDATE actions, disabled flag
Unique constraints	Constraint name, column(s), ordinal
Check constraints	Constraint name, column (or table-level), definition expression, disabled flag
Indexes	Name, type, uniqueness, key columns, included columns, fill factor, filter definition
Triggers	Name, type, event (INSERT/UPDATE/DELETE), INSTEAD OF flag, disabled flag

Data types are formatted with their full specification — e.g. nvarchar(255), decimal(18,4), nvarchar(MAX). nchar/nvarchar lengths are halved from max_length to display the correct character count.

Architecture

xSchema is a single-file Python script with a clear layered structure. Each layer has a single responsibility.

Layer	Components	Responsibility
CLI	`parse_args()`	argparse-based flag handling
Connection	`build_connection_string()`, `connect()`	ODBC driver detection and database connection
Query helper	`q(cursor, sql, *params)`	Thin wrapper around cursor.execute + fetchall
Extractors	`get_*()` functions	One per schema object type — all return row lists
Formatters	`md_*()`, `format_data_type()`	Markdown string generation
Builder	`build_markdown()`	Orchestrates extraction and assembles the document
Entry point	`main()`	Connects, calls builder, writes file, prints summary

TOC Injection Mechanism

During document assembly a placeholder comment  is inserted at the table of contents position. TOC entries are accumulated as each section is built, then joined and substituted in a single string replacement — avoiding any two-pass build logic.

ODBC Driver Selection

build_connection_string() calls pyodbc.drivers(), filters for entries containing "SQL Server", sorts alphabetically, and picks the last — effectively the highest version number installed. No manual driver configuration is required.

Function Reference

CLI & Connection

Function	Returns	Description
`parse_args()`	Namespace	Returns parsed CLI args: `.sprocs`, `.views`, `.minrecords`
`build_connection_string()`	str	Detects best ODBC driver and builds the pyodbc connection string
`connect()`	Connection	Opens connection with 30s timeout, sets autocommit. Exits on failure.
`q(cursor, sql, *params)`	list[Row]	Executes SQL with optional parameters, returns all rows

Schema Extractors

Function	Parameter	Description
`get_database_info(cursor)`	—	DB name, SQL version, edition, collation via SERVERPROPERTY
`get_schemas(cursor)`	—	All non-system schemas and their owners
`get_tables(cursor, min_records)`	int	Tables with row count ≥ min_records, including extended properties
`get_columns(cursor, object_id)`	int	All columns for a table or view with full metadata
`get_primary_keys(cursor, object_id)`	int	PK constraint, columns, ordinal, sort direction
`get_foreign_keys(cursor, object_id)`	int	FK constraints with referenced table/column and referential actions
`get_check_constraints(cursor, object_id)`	int	CHECK constraints with definition expression and disabled flag
`get_unique_constraints(cursor, object_id)`	int	UNIQUE key constraints
`get_indexes(cursor, object_id)`	int	All indexes with key/included columns via STRING_AGG
`get_triggers(cursor, object_id)`	int	DML triggers on the table
`get_views(cursor)`	—	All non-system views with columns and SQL definition
`get_stored_procedures(cursor)`	—	All non-system stored procedures
`get_proc_parameters(cursor, object_id)`	int	Parameters for a stored procedure
`get_functions(cursor)`	—	All user-defined functions (FN, IF, TF, FS, FT types)
`get_sequences(cursor)`	—	Sequences — wrapped in try/except for pre-2012 compatibility
`get_user_defined_types(cursor)`	—	User-defined non-table types

Markdown Formatters

Function	Description
`md_h1/h2/h3/h4(text)`	Returns a Markdown heading string at the given level
`md_table(headers, rows)`	Generates a GFM pipe table. Returns `_None_` for empty row lists. Pipes and newlines in cell values are escaped.
`format_data_type(row)`	Formats a column's data type with length/precision/scale. Handles nvarchar/varchar MAX, nchar byte-to-char conversion, decimal, float, and all base types.

Core Builder & Entry Point

Function	Description
`build_markdown(conn, database_name, args)`	Main document builder. Opens a cursor, calls all extractors in sequence, assembles the Markdown string, injects the TOC, and returns the complete document.
`main()`	Entry point. Parses args, connects, resolves output filename, calls build_markdown(), writes the file, and prints summary stats.

Extending xSchema

Adding a New Schema Object Type

The pattern is consistent throughout. To add synonyms, for example:

1. Add an extractor function:

def get_synonyms(cursor):
    return q(cursor, """
        SELECT s.name, sc.name AS schema_name,
               s.base_object_name
        FROM sys.synonyms s
        JOIN sys.schemas sc ON s.schema_id = sc.schema_id
        ORDER BY sc.name, s.name
    """)

2. Call it inside build_markdown():

synonyms = get_synonyms(cursor)
if synonyms:
    lines.append(md_h2("Synonyms"))
    toc_entries.append("- [Synonyms](#synonyms)")
    lines.append(md_table(
        ["Schema", "Name", "Base Object"],
        [(r.schema_name, r.name, r.base_object_name) for r in synonyms]
    ))

Adding a New CLI Flag

Add the argument in parse_args(), then gate the relevant section in build_markdown() with if args.your_flag:, following the same pattern as --sprocs and --views.

Troubleshooting

Error / Symptom	Likely cause	Solution
`pyodbc is Not Installed`	pyodbc not in the current Python environment	Run `pip install pyodbc`
`No SQL Server ODBC Driver Found`	No SQL Server ODBC driver installed on the OS	Install ODBC Driver 17 for SQL Server from Microsoft
`Could not Connect to SQL Server`	Wrong IP/port, firewall blocking 1433, or incorrect credentials	Verify SERVER, PORT, USERNAME, PASSWORD and firewall rules
Empty output / few tables	`--minrecords` threshold too high, or tables are empty	Run with `--minrecords 0` to include all tables
Row counts seem wrong	`sys.partitions` counts are approximate	Expected behaviour — counts are fast estimates, not exact SELECTs
Sequences not in output	Sequences require SQL Server 2012 or later	Handled gracefully — `get_sequences()` returns an empty list on failure
Stored procedures not in output	`--sprocs` flag not passed	Run with `--sprocs`
Views not in output	`--views` flag not passed	Run with `--views`
nvarchar shows wrong length	`sys.columns.max_length` is byte length; nvarchar uses 2 bytes/char	Handled automatically — `format_data_type()` halves nchar/nvarchar lengths