Introduction

xSchema is a command-line SQL Server schema extraction tool written in Python. It connects to a SQL Server database and produces a comprehensive, human-readable Markdown document describing every aspect of the schema — tables, columns, keys, indexes, views, stored procedures, functions, sequences, and user-defined types.

The output is designed to be fed directly into AI assistants, used as living project documentation, or committed to version control alongside your codebase.

SQL Server DB
xSchema.py
schema.md
AI / Docs / Git

Requirements

Python

Python 3.8 or later. No other version constraints.

pyodbc

The only third-party dependency. Install via pip.

Install the dependency

pip install pyodbc

ODBC Driver

A SQL Server ODBC driver must be installed on your system. xSchema automatically selects the newest available driver. The recommended driver is ODBC Driver 17 for SQL Server or later.

PlatformInstall method
macOSbrew install msodbcsql17 — also requires brew install unixodbc
Linux (Debian/Ubuntu)Add Microsoft apt repository and install msodbcsql17
WindowsDownload the installer from Microsoft Download Center

Quick Start

Step 1 — Set your connection details

Open xSchema.py and edit the configuration block near the top:

SERVER      = "192.168.0.1"   # Your SQL Server IP or hostname
PORT        = 1433              # Default SQL Server port
DATABASE    = "sqlserver"          # Target database name
USERNAME    = "me"              # SQL Server login
PASSWORD    = "letmein"                # SQL Server password
OUTPUT_FILE = "schema.md"       # Output filename (or None for auto)
xSchema — Documentation
Security note: The password is stored in plain text. Do not commit xSchema.py with a real password to any repository. Consider using an environment variable: PASSWORD = os.environ.get("XSCHEMA_PWD", "")

Step 2 — Run the script

python xSchema.py

Step 3 — Find your output

The schema file (schema.md by default) is written to the same directory as the script. xSchema prints progress messages as it connects, queries, and writes, finishing with the file size in bytes.

Connection Settings

All connection parameters live in the configuration block at the top of xSchema.py. There is no external config file — this is intentional for simplicity.

VariableTypeDefaultDescription
SERVERstr"192.168.1.160"IP address or hostname of the SQL Server instance
PORTint1433TCP port — 1433 is the SQL Server default
DATABASEstr"sqlserver"The name of the database to extract
USERNAMEstr"sa"SQL Server login (SQL authentication)
PASSWORDstr""Password for the above login
OUTPUT_FILEstr / None"schema.md"Output filename. Set to None to auto-name as <database>_schema.md

Certificate Trust

The connection string includes TrustServerCertificate=yes, bypassing SSL certificate validation. Appropriate for local/internal servers — remove for production connections with a trusted certificate.

Windows Authentication

To use Windows auth instead of SQL auth, replace the UID/PWD portion in build_connection_string() with Trusted_Connection=yes; and remove the USERNAME/PASSWORD variables.

CLI Arguments

xSchema accepts optional flags to control output content. All are opt-in; the defaults produce the most focused result.

FlagTypeDefaultDescription
--sprocsflagoffInclude all non-system stored procedures and their parameters
--viewsflagoffInclude all non-system views, their columns, and full SQL definition
--minrecords Nint1Exclude tables with fewer than N rows

--minrecords in detail

Filters tables using the approximate row count from sys.partitions. The default of 1 excludes completely empty tables. Set to 0 to include all tables, or a higher value such as 100 to focus on tables with meaningful data. Note that counts are approximate and may lag behind recent bulk operations.

Run python xSchema.py --help at any time to see all flags and their defaults.

Output File

xSchema writes a single UTF-8 Markdown (.md) file, readable in any Markdown renderer including GitHub, VS Code, Obsidian, and AI chat tools.

Document Structure

Header — database name, generation timestamp, active options
Database information — SQL Server version, edition, collation
Table of contents — auto-generated with anchor links to every section and table
Schemas — all non-system schemas and their owners
Tables — columns, PKs, FKs, unique constraints, check constraints, indexes, triggers per table
Views --views — columns and full SQL definition
Stored procedures --sprocs — parameters with types and directions
Functions — all user-defined scalar and table-valued functions
Sequences — if any exist in the database
User-defined types — if any exist in the database

Output File Naming

OUTPUT_FILE valueResulting filename
"schema.md" (or any string)Exactly that filename
Nonesqlserver_schema.md (database name + _schema.md)

Usage Examples

Basic extraction — tables with data only

python xSchema.py

Include stored procedures and views

python xSchema.py --sprocs --views

Only tables with 100+ rows, include stored procedures

python xSchema.py --minrecords 100 --sprocs

Include all tables regardless of row count

python xSchema.py --minrecords 0 --sprocs --views

Auto-name the output file

Set OUTPUT_FILE = None in the script. The file will be written as schema.md.

Schema Output — What's Captured

For each table, xSchema extracts the following detail:

SectionDetails captured
Table metadataApproximate row count, created date, modified date, MS_Description extended property
ColumnsID, name, data type (with length/precision/scale), nullability, identity spec, computed column definition, default value, collation, MS_Description
Primary keyConstraint name, column(s), key ordinal, descending flag
Foreign keysConstraint name, column, referenced table/column, ON DELETE/UPDATE actions, disabled flag
Unique constraintsConstraint name, column(s), ordinal
Check constraintsConstraint name, column (or table-level), definition expression, disabled flag
IndexesName, type, uniqueness, key columns, included columns, fill factor, filter definition
TriggersName, type, event (INSERT/UPDATE/DELETE), INSTEAD OF flag, disabled flag
Data types are formatted with their full specification — e.g. nvarchar(255), decimal(18,4), nvarchar(MAX). nchar/nvarchar lengths are halved from max_length to display the correct character count.

Architecture

xSchema is a single-file Python script with a clear layered structure. Each layer has a single responsibility.

LayerComponentsResponsibility
CLIparse_args()argparse-based flag handling
Connectionbuild_connection_string(), connect()ODBC driver detection and database connection
Query helperq(cursor, sql, *params)Thin wrapper around cursor.execute + fetchall
Extractorsget_*() functionsOne per schema object type — all return row lists
Formattersmd_*(), format_data_type()Markdown string generation
Builderbuild_markdown()Orchestrates extraction and assembles the document
Entry pointmain()Connects, calls builder, writes file, prints summary

TOC Injection Mechanism

During document assembly a placeholder comment <!-- TOC --> is inserted at the table of contents position. TOC entries are accumulated as each section is built, then joined and substituted in a single string replacement — avoiding any two-pass build logic.

ODBC Driver Selection

build_connection_string() calls pyodbc.drivers(), filters for entries containing "SQL Server", sorts alphabetically, and picks the last — effectively the highest version number installed. No manual driver configuration is required.

Function Reference

CLI & Connection

FunctionReturnsDescription
parse_args()NamespaceReturns parsed CLI args: .sprocs, .views, .minrecords
build_connection_string()strDetects best ODBC driver and builds the pyodbc connection string
connect()ConnectionOpens connection with 30s timeout, sets autocommit. Exits on failure.
q(cursor, sql, *params)list[Row]Executes SQL with optional parameters, returns all rows

Schema Extractors

FunctionParameterDescription
get_database_info(cursor)DB name, SQL version, edition, collation via SERVERPROPERTY
get_schemas(cursor)All non-system schemas and their owners
get_tables(cursor, min_records)intTables with row count ≥ min_records, including extended properties
get_columns(cursor, object_id)intAll columns for a table or view with full metadata
get_primary_keys(cursor, object_id)intPK constraint, columns, ordinal, sort direction
get_foreign_keys(cursor, object_id)intFK constraints with referenced table/column and referential actions
get_check_constraints(cursor, object_id)intCHECK constraints with definition expression and disabled flag
get_unique_constraints(cursor, object_id)intUNIQUE key constraints
get_indexes(cursor, object_id)intAll indexes with key/included columns via STRING_AGG
get_triggers(cursor, object_id)intDML triggers on the table
get_views(cursor)All non-system views with columns and SQL definition
get_stored_procedures(cursor)All non-system stored procedures
get_proc_parameters(cursor, object_id)intParameters for a stored procedure
get_functions(cursor)All user-defined functions (FN, IF, TF, FS, FT types)
get_sequences(cursor)Sequences — wrapped in try/except for pre-2012 compatibility
get_user_defined_types(cursor)User-defined non-table types

Markdown Formatters

FunctionDescription
md_h1/h2/h3/h4(text)Returns a Markdown heading string at the given level
md_table(headers, rows)Generates a GFM pipe table. Returns _None_ for empty row lists. Pipes and newlines in cell values are escaped.
format_data_type(row)Formats a column's data type with length/precision/scale. Handles nvarchar/varchar MAX, nchar byte-to-char conversion, decimal, float, and all base types.

Core Builder & Entry Point

FunctionDescription
build_markdown(conn, database_name, args)Main document builder. Opens a cursor, calls all extractors in sequence, assembles the Markdown string, injects the TOC, and returns the complete document.
main()Entry point. Parses args, connects, resolves output filename, calls build_markdown(), writes the file, and prints summary stats.

Extending xSchema

Adding a New Schema Object Type

The pattern is consistent throughout. To add synonyms, for example:

1. Add an extractor function:

def get_synonyms(cursor):
    return q(cursor, """
        SELECT s.name, sc.name AS schema_name,
               s.base_object_name
        FROM sys.synonyms s
        JOIN sys.schemas sc ON s.schema_id = sc.schema_id
        ORDER BY sc.name, s.name
    """)

2. Call it inside build_markdown():

synonyms = get_synonyms(cursor)
if synonyms:
    lines.append(md_h2("Synonyms"))
    toc_entries.append("- [Synonyms](#synonyms)")
    lines.append(md_table(
        ["Schema", "Name", "Base Object"],
        [(r.schema_name, r.name, r.base_object_name) for r in synonyms]
    ))

Adding a New CLI Flag

Add the argument in parse_args(), then gate the relevant section in build_markdown() with if args.your_flag:, following the same pattern as --sprocs and --views.

Troubleshooting

Error / SymptomLikely causeSolution
pyodbc is Not Installedpyodbc not in the current Python environmentRun pip install pyodbc
No SQL Server ODBC Driver FoundNo SQL Server ODBC driver installed on the OSInstall ODBC Driver 17 for SQL Server from Microsoft
Could not Connect to SQL ServerWrong IP/port, firewall blocking 1433, or incorrect credentialsVerify SERVER, PORT, USERNAME, PASSWORD and firewall rules
Empty output / few tables--minrecords threshold too high, or tables are emptyRun with --minrecords 0 to include all tables
Row counts seem wrongsys.partitions counts are approximateExpected behaviour — counts are fast estimates, not exact SELECTs
Sequences not in outputSequences require SQL Server 2012 or laterHandled gracefully — get_sequences() returns an empty list on failure
Stored procedures not in output--sprocs flag not passedRun with --sprocs
Views not in output--views flag not passedRun with --views
nvarchar shows wrong lengthsys.columns.max_length is byte length; nvarchar uses 2 bytes/charHandled automatically — format_data_type() halves nchar/nvarchar lengths