Introduction
xSchema is a command-line SQL Server schema extraction tool written in Python. It connects to a SQL Server database and produces a comprehensive, human-readable Markdown document describing every aspect of the schema — tables, columns, keys, indexes, views, stored procedures, functions, sequences, and user-defined types.
The output is designed to be fed directly into AI assistants, used as living project documentation, or committed to version control alongside your codebase.
Requirements
Python
Python 3.8 or later. No other version constraints.
pyodbc
The only third-party dependency. Install via pip.
Install the dependency
pip install pyodbc
ODBC Driver
A SQL Server ODBC driver must be installed on your system. xSchema automatically selects the newest available driver. The recommended driver is ODBC Driver 17 for SQL Server or later.
| Platform | Install method |
|---|---|
| macOS | brew install msodbcsql17 — also requires brew install unixodbc |
| Linux (Debian/Ubuntu) | Add Microsoft apt repository and install msodbcsql17 |
| Windows | Download the installer from Microsoft Download Center |
Quick Start
Step 1 — Set your connection details
Open xSchema.py and edit the configuration block near the top:
SERVER = "192.168.0.1" # Your SQL Server IP or hostname
PORT = 1433 # Default SQL Server port
DATABASE = "sqlserver" # Target database name
USERNAME = "me" # SQL Server login
PASSWORD = "letmein" # SQL Server password
OUTPUT_FILE = "schema.md" # Output filename (or None for auto)
xSchema — Documentation xSchema.py with a real password to any repository. Consider using an environment variable: PASSWORD = os.environ.get("XSCHEMA_PWD", "")Step 2 — Run the script
python xSchema.py
Step 3 — Find your output
The schema file (schema.md by default) is written to the same directory as the script. xSchema prints progress messages as it connects, queries, and writes, finishing with the file size in bytes.
Connection Settings
All connection parameters live in the configuration block at the top of xSchema.py. There is no external config file — this is intentional for simplicity.
| Variable | Type | Default | Description |
|---|---|---|---|
SERVER | str | "192.168.1.160" | IP address or hostname of the SQL Server instance |
PORT | int | 1433 | TCP port — 1433 is the SQL Server default |
DATABASE | str | "sqlserver" | The name of the database to extract |
USERNAME | str | "sa" | SQL Server login (SQL authentication) |
PASSWORD | str | "" | Password for the above login |
OUTPUT_FILE | str / None | "schema.md" | Output filename. Set to None to auto-name as <database>_schema.md |
Certificate Trust
The connection string includes TrustServerCertificate=yes, bypassing SSL certificate validation. Appropriate for local/internal servers — remove for production connections with a trusted certificate.
Windows Authentication
To use Windows auth instead of SQL auth, replace the UID/PWD portion in build_connection_string() with Trusted_Connection=yes; and remove the USERNAME/PASSWORD variables.
CLI Arguments
xSchema accepts optional flags to control output content. All are opt-in; the defaults produce the most focused result.
| Flag | Type | Default | Description |
|---|---|---|---|
--sprocs | flag | off | Include all non-system stored procedures and their parameters |
--views | flag | off | Include all non-system views, their columns, and full SQL definition |
--minrecords N | int | 1 | Exclude tables with fewer than N rows |
--minrecords in detail
Filters tables using the approximate row count from sys.partitions. The default of 1 excludes completely empty tables. Set to 0 to include all tables, or a higher value such as 100 to focus on tables with meaningful data. Note that counts are approximate and may lag behind recent bulk operations.
python xSchema.py --help at any time to see all flags and their defaults.Output File
xSchema writes a single UTF-8 Markdown (.md) file, readable in any Markdown renderer including GitHub, VS Code, Obsidian, and AI chat tools.
Document Structure
Output File Naming
| OUTPUT_FILE value | Resulting filename |
|---|---|
"schema.md" (or any string) | Exactly that filename |
None | sqlserver_schema.md (database name + _schema.md) |
Usage Examples
Basic extraction — tables with data only
python xSchema.py
Include stored procedures and views
python xSchema.py --sprocs --views
Only tables with 100+ rows, include stored procedures
python xSchema.py --minrecords 100 --sprocs
Include all tables regardless of row count
python xSchema.py --minrecords 0 --sprocs --views
Auto-name the output file
Set OUTPUT_FILE = None in the script. The file will be written as schema.md.
Schema Output — What's Captured
For each table, xSchema extracts the following detail:
| Section | Details captured |
|---|---|
| Table metadata | Approximate row count, created date, modified date, MS_Description extended property |
| Columns | ID, name, data type (with length/precision/scale), nullability, identity spec, computed column definition, default value, collation, MS_Description |
| Primary key | Constraint name, column(s), key ordinal, descending flag |
| Foreign keys | Constraint name, column, referenced table/column, ON DELETE/UPDATE actions, disabled flag |
| Unique constraints | Constraint name, column(s), ordinal |
| Check constraints | Constraint name, column (or table-level), definition expression, disabled flag |
| Indexes | Name, type, uniqueness, key columns, included columns, fill factor, filter definition |
| Triggers | Name, type, event (INSERT/UPDATE/DELETE), INSTEAD OF flag, disabled flag |
nvarchar(255), decimal(18,4), nvarchar(MAX). nchar/nvarchar lengths are halved from max_length to display the correct character count.Architecture
xSchema is a single-file Python script with a clear layered structure. Each layer has a single responsibility.
| Layer | Components | Responsibility |
|---|---|---|
| CLI | parse_args() | argparse-based flag handling |
| Connection | build_connection_string(), connect() | ODBC driver detection and database connection |
| Query helper | q(cursor, sql, *params) | Thin wrapper around cursor.execute + fetchall |
| Extractors | get_*() functions | One per schema object type — all return row lists |
| Formatters | md_*(), format_data_type() | Markdown string generation |
| Builder | build_markdown() | Orchestrates extraction and assembles the document |
| Entry point | main() | Connects, calls builder, writes file, prints summary |
TOC Injection Mechanism
During document assembly a placeholder comment <!-- TOC --> is inserted at the table of contents position. TOC entries are accumulated as each section is built, then joined and substituted in a single string replacement — avoiding any two-pass build logic.
ODBC Driver Selection
build_connection_string() calls pyodbc.drivers(), filters for entries containing "SQL Server", sorts alphabetically, and picks the last — effectively the highest version number installed. No manual driver configuration is required.
Function Reference
CLI & Connection
| Function | Returns | Description |
|---|---|---|
parse_args() | Namespace | Returns parsed CLI args: .sprocs, .views, .minrecords |
build_connection_string() | str | Detects best ODBC driver and builds the pyodbc connection string |
connect() | Connection | Opens connection with 30s timeout, sets autocommit. Exits on failure. |
q(cursor, sql, *params) | list[Row] | Executes SQL with optional parameters, returns all rows |
Schema Extractors
| Function | Parameter | Description |
|---|---|---|
get_database_info(cursor) | — | DB name, SQL version, edition, collation via SERVERPROPERTY |
get_schemas(cursor) | — | All non-system schemas and their owners |
get_tables(cursor, min_records) | int | Tables with row count ≥ min_records, including extended properties |
get_columns(cursor, object_id) | int | All columns for a table or view with full metadata |
get_primary_keys(cursor, object_id) | int | PK constraint, columns, ordinal, sort direction |
get_foreign_keys(cursor, object_id) | int | FK constraints with referenced table/column and referential actions |
get_check_constraints(cursor, object_id) | int | CHECK constraints with definition expression and disabled flag |
get_unique_constraints(cursor, object_id) | int | UNIQUE key constraints |
get_indexes(cursor, object_id) | int | All indexes with key/included columns via STRING_AGG |
get_triggers(cursor, object_id) | int | DML triggers on the table |
get_views(cursor) | — | All non-system views with columns and SQL definition |
get_stored_procedures(cursor) | — | All non-system stored procedures |
get_proc_parameters(cursor, object_id) | int | Parameters for a stored procedure |
get_functions(cursor) | — | All user-defined functions (FN, IF, TF, FS, FT types) |
get_sequences(cursor) | — | Sequences — wrapped in try/except for pre-2012 compatibility |
get_user_defined_types(cursor) | — | User-defined non-table types |
Markdown Formatters
| Function | Description |
|---|---|
md_h1/h2/h3/h4(text) | Returns a Markdown heading string at the given level |
md_table(headers, rows) | Generates a GFM pipe table. Returns _None_ for empty row lists. Pipes and newlines in cell values are escaped. |
format_data_type(row) | Formats a column's data type with length/precision/scale. Handles nvarchar/varchar MAX, nchar byte-to-char conversion, decimal, float, and all base types. |
Core Builder & Entry Point
| Function | Description |
|---|---|
build_markdown(conn, database_name, args) | Main document builder. Opens a cursor, calls all extractors in sequence, assembles the Markdown string, injects the TOC, and returns the complete document. |
main() | Entry point. Parses args, connects, resolves output filename, calls build_markdown(), writes the file, and prints summary stats. |
Extending xSchema
Adding a New Schema Object Type
The pattern is consistent throughout. To add synonyms, for example:
1. Add an extractor function:
def get_synonyms(cursor):
return q(cursor, """
SELECT s.name, sc.name AS schema_name,
s.base_object_name
FROM sys.synonyms s
JOIN sys.schemas sc ON s.schema_id = sc.schema_id
ORDER BY sc.name, s.name
""")
2. Call it inside build_markdown():
synonyms = get_synonyms(cursor)
if synonyms:
lines.append(md_h2("Synonyms"))
toc_entries.append("- [Synonyms](#synonyms)")
lines.append(md_table(
["Schema", "Name", "Base Object"],
[(r.schema_name, r.name, r.base_object_name) for r in synonyms]
))
Adding a New CLI Flag
Add the argument in parse_args(), then gate the relevant section in build_markdown() with if args.your_flag:, following the same pattern as --sprocs and --views.
Troubleshooting
| Error / Symptom | Likely cause | Solution |
|---|---|---|
pyodbc is Not Installed | pyodbc not in the current Python environment | Run pip install pyodbc |
No SQL Server ODBC Driver Found | No SQL Server ODBC driver installed on the OS | Install ODBC Driver 17 for SQL Server from Microsoft |
Could not Connect to SQL Server | Wrong IP/port, firewall blocking 1433, or incorrect credentials | Verify SERVER, PORT, USERNAME, PASSWORD and firewall rules |
| Empty output / few tables | --minrecords threshold too high, or tables are empty | Run with --minrecords 0 to include all tables |
| Row counts seem wrong | sys.partitions counts are approximate | Expected behaviour — counts are fast estimates, not exact SELECTs |
| Sequences not in output | Sequences require SQL Server 2012 or later | Handled gracefully — get_sequences() returns an empty list on failure |
| Stored procedures not in output | --sprocs flag not passed | Run with --sprocs |
| Views not in output | --views flag not passed | Run with --views |
| nvarchar shows wrong length | sys.columns.max_length is byte length; nvarchar uses 2 bytes/char | Handled automatically — format_data_type() halves nchar/nvarchar lengths |