OpenAI Operator: Reading and Writing COBOL Flat Files

In the “Retrofit Era,” the most valuable data often lives in the simplest format: COBOL Flat Files. These fixed-width text files (often defined by “Copybooks”) are the lifeblood of banking transaction logs, insurance policy exports, and mainframe batch processing.

Connecting OpenAI’s Operator to these files allows autonomous agents to audit transactions, reconcile batch jobs, or generate legacy-compatible data feeds without needing a mainframe terminal.

This guide provides a production-ready Model Context Protocol (MCP) server to bridge OpenAI Operator with fixed-width COBOL data.


We will build a “COBOL Gateway” MCP server. This server acts as a translation layer:

  1. Input: OpenAI Operator requests a file read using a JSON schema (defining field lengths).
  2. Process: The Python server reads the raw bytes, parses the fixed-width columns, and handles encoding (ASCII vs EBCDIC).
  3. Output: Structured JSON data returned to the Agent.

Regex is fragile for fixed-width data, where position matters more than pattern. Our solution slices each record at exact character positions to preserve data integrity, which is critical for financial records.
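As a quick illustration of positional slicing (the field names and widths here are hypothetical, not from a real copybook):

```python
# Hypothetical layout: ACCT-ID PIC X(5), TX-DATE PIC X(8), AMOUNT PIC X(10)
record = "00042" + "20240115" + "    125.50"  # 23 characters total

acct_id = record[0:5].strip()    # positions 0-4
tx_date = record[5:13].strip()   # positions 5-12
amount  = record[13:23].strip()  # positions 13-22
```

No pattern matching is involved: each field is defined purely by where it starts and how wide it is, so a stray digit or space inside one field can never bleed into its neighbor.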


🛠️ Step 1: The Bridge Code (server.py)


This Python script uses fastmcp to expose tools for reading and writing fixed-width records.

Changes from Audit:

  • Removed unused struct import.
  • Removed codecs import (Python’s built-in codecs handle EBCDIC natively via bytes.decode).
  • Streamlined file handling.
```python
import json
import os
from typing import Any, Dict

from fastmcp import FastMCP

# Initialize the FastMCP server
mcp = FastMCP("COBOLGateway")


def _parse_line(line: bytes, schema: Dict[str, int], encoding: str) -> Dict[str, Any]:
    """Parse a single line of bytes into a dict based on field lengths."""
    parsed = {}
    current_pos = 0
    # Python's built-in codecs cover cp037/cp500 (EBCDIC) as well as utf-8
    decoded_line = line.decode(encoding).rstrip("\r\n")
    for field_name, length in schema.items():
        # Slice the field by position, then strip trailing pad spaces
        parsed[field_name] = decoded_line[current_pos : current_pos + length].strip()
        current_pos += length
    return parsed


def _format_record(data: Dict[str, Any], schema: Dict[str, int]) -> str:
    """Format a dictionary back into a fixed-width string."""
    record = ""
    for field_name, length in schema.items():
        val = str(data.get(field_name, ""))
        # Truncate if too long, pad with spaces if too short
        record += val[:length].ljust(length)
    return record


@mcp.tool()
def read_flat_file(filepath: str, schema: Dict[str, int], encoding: str = "utf-8") -> str:
    """
    Read a fixed-width COBOL-style flat file and return JSON.

    Args:
        filepath: Path to the .txt or .dat file.
        schema: Mapping of field names to lengths (e.g., {"id": 5, "name": 20}).
        encoding: File encoding. Use 'cp037' or 'cp500' for EBCDIC, 'utf-8' for standard.
    """
    if not os.path.exists(filepath):
        return f"Error: File {filepath} not found."
    results = []
    try:
        with open(filepath, "rb") as f:
            for line in f:
                if not line.strip():
                    continue
                try:
                    results.append(_parse_line(line, schema, encoding))
                except Exception as e:
                    results.append({"error": f"Failed to parse line: {e}"})
        return json.dumps(results)
    except Exception as e:
        return f"System Error: {e}"


@mcp.tool()
def append_record(filepath: str, data: Dict[str, Any], schema: Dict[str, int], encoding: str = "utf-8") -> str:
    """
    Append a new record to a fixed-width flat file.

    Args:
        filepath: Path to the file.
        data: Dictionary containing the data to write.
        schema: Dictionary defining field lengths (must match the file structure).
        encoding: File encoding. Must match the existing file.
    """
    try:
        record_str = _format_record(data, schema)
        # "\n" suits modern toolchains; mainframe targets may expect other terminators
        with open(filepath, "a", encoding=encoding) as f:
            f.write(record_str + "\n")
        return f"Successfully appended record: {record_str}"
    except Exception as e:
        return f"Write Error: {e}"


if __name__ == "__main__":
    # Serve over SSE on port 8000 to match the Docker setup below (FastMCP 2.x)
    mcp.run(transport="sse", host="0.0.0.0", port=8000)
```
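Before connecting an agent, the pad-and-slice logic is worth sanity-checking in isolation. This standalone sketch mirrors the two helpers above (the schema and row data here are illustrative):

```python
schema = {"id": 5, "name": 10, "status": 2}

def format_record(data, schema):
    # Truncate/pad each field to its fixed width, as _format_record does
    return "".join(str(data.get(f, ""))[:n].ljust(n) for f, n in schema.items())

def parse_line(line, schema):
    # Slice the line by cumulative position, as _parse_line does
    out, pos = {}, 0
    for f, n in schema.items():
        out[f] = line[pos:pos + n].strip()
        pos += n
    return out

row = {"id": "A1", "name": "SMITH", "status": "99"}
encoded = format_record(row, schema)   # 17-character fixed-width record
decoded = parse_line(encoded, schema)  # round-trips back to the original values
```

A round trip like this is a cheap regression test: if a schema edit ever breaks alignment, `decoded` stops matching `row` immediately.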

To deploy this in a modern environment (like Railway, AWS ECS, or Kubernetes) alongside your Agent, use this Dockerfile.

Critical Note: We expose port 8000 to ensure compatibility with standard MCP client configurations over SSE (Server-Sent Events).

```dockerfile
# Use a lightweight Python base
FROM python:3.11-slim

# Set working directory
WORKDIR /app

# Install FastMCP
RUN pip install fastmcp

# Copy the server code
COPY server.py .

# Create a data directory for local testing/volume mounts
RUN mkdir /data

# Expose port 8000 for the SSE server
EXPOSE 8000

# Run the server
CMD ["python", "server.py"]
```

In a production setting, the legacy “Flat Files” usually live on a shared network drive (NFS) or an S3 bucket mounted as a volume.

  • Docker Run: `docker run -v /mnt/legacy_data:/data -p 8000:8000 cobol-mcp`
  • Agent Access: The agent simply calls `read_flat_file(filepath="/data/BATCH01.TXT", ...)`

If you are reading raw files directly from an IBM Mainframe (AS/400, z/OS) without prior conversion, they are likely in EBCDIC encoding.

  • Instruction to Agent: “When calling read_flat_file, set the encoding parameter to cp037 (US EBCDIC) or cp500 (International EBCDIC).”
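To see why the encoding parameter matters, compare the raw bytes the same text produces under each codec:

```python
text = "HELLO"
ebcdic_bytes = text.encode("cp037")  # EBCDIC bytes: b'\xc8\xc5\xd3\xd3\xd6'
ascii_bytes = text.encode("utf-8")   # ASCII bytes:  b'HELLO'

# The byte values are completely different; decoding EBCDIC bytes as UTF-8
# raises UnicodeDecodeError or yields garbage, which is why mainframe
# exports must be read with cp037/cp500.
restored = ebcdic_bytes.decode("cp037")
```

Because every byte value differs, there is no partial overlap to fall back on: the wrong codec corrupts every field, not just the exotic characters.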

Once the server is running and connected to OpenAI Operator:

User: “Check the transactions.dat file for any entries with status code ‘99’ and move them to a new file called errors.dat.”

Agent (Internal Thought):

  1. I need the schema. I’ll assume standard transaction layout or ask the user. Let’s assume {id: 10, date: 8, status: 2, amount: 10}.
  2. Call read_flat_file("/data/transactions.dat", {"id": 10, "date": 8, "status": 2, "amount": 10}).
  3. Filter JSON results where status == ‘99’.
  4. Call append_record for each match on /data/errors.dat.
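Step 3 of that plan is plain JSON manipulation once read_flat_file returns. A sketch with made-up records (the payload shape matches what the tool emits, but the data is illustrative):

```python
import json

# What read_flat_file might return for transactions.dat (illustrative data)
raw = json.dumps([
    {"id": "TX1", "date": "20240101", "status": "00", "amount": "100.00"},
    {"id": "TX2", "date": "20240102", "status": "99", "amount": "250.00"},
    {"id": "TX3", "date": "20240103", "status": "99", "amount": "75.25"},
])

records = json.loads(raw)
errors = [r for r in records if r["status"] == "99"]
# Each entry in `errors` would then be passed to append_record on /data/errors.dat
```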

| Error Code | Context | Solution |
| --- | --- | --- |
| UnicodeDecodeError | Reading EBCDIC as UTF-8 | Change the `encoding` param to `cp037` or `latin-1`. |
| Misaligned Data | Wrong schema lengths | Double-check the COBOL PIC clauses. `PIC X(10)` means length 10. |
| PermissionDenied | Docker volume | Ensure the Docker container has read/write permissions on the mounted host directory. |

  • Status: ✅ Verified
  • Environment: Python 3.11
  • Auditor: AgentRetrofit CI/CD

Transparency: This page may contain affiliate links.