OpenAI Operator for reading and writing COBOL Flat Files
OpenAI Operator: Reading and Writing COBOL Flat Files
Section titled “OpenAI Operator: Reading and Writing COBOL Flat Files”In the “Retrofit Era,” the most valuable data often lives in the simplest format: COBOL Flat Files. These fixed-width text files (often defined by “Copybooks”) are the lifeblood of banking transaction logs, insurance policy exports, and mainframe batch processing.
Connecting OpenAI’s Operator to these files allows autonomous agents to audit transactions, reconcile batch jobs, or generate legacy-compatible data feeds without needing a mainframe terminal.
This guide provides a production-ready Model Context Protocol (MCP) server to bridge OpenAI Operator with fixed-width COBOL data.
🏗️ The Architecture
Section titled “🏗️ The Architecture”We will build a “COBOL Gateway” MCP server. This server acts as a translation layer:
- Input: OpenAI Operator requests a file read using a JSON schema (defining field lengths).
- Process: The Python server reads the raw bytes, parses the fixed-width columns, and handles encoding (ASCII vs EBCDIC).
- Output: Structured JSON data returned to the Agent.
Why not just Regex?
Section titled “Why not just Regex?”Regex is fragile for fixed-width data where position matters more than pattern. Our solution uses precise byte-slicing to ensure data integrity, critical for financial records.
🛠️ Step 1: The Bridge Code (server.py)
Section titled “🛠️ Step 1: The Bridge Code (server.py)”This Python script uses fastmcp to expose tools for reading and writing fixed-width records.
Changes from Audit:
- Removed unused
structimport. - Removed
codecsimport (utilizing native string methods for EBCDIC support). - Streamlined file handling.
import osfrom typing import List, Dict, Anyfrom fastmcp import FastMCP
# Initialize the FastMCP servermcp = FastMCP("COBOLGateway")
def _parse_line(line: bytes, schema: Dict[str, int], encoding: str) -> Dict[str, Any]: """Helper to parse a single line of bytes based on field lengths.""" parsed = {} current_pos = 0 # Native decode handles cp037 (EBCDIC) and utf-8 standardly in Python 3.11+ decoded_line = line.decode(encoding).rstrip('\r\n')
for field_name, length in schema.items(): # Extract the slice val = decoded_line[current_pos : current_pos + length] parsed[field_name] = val.strip() current_pos += length
return parsed
def _format_record(data: Dict[str, Any], schema: Dict[str, int]) -> str: """Helper to format a dictionary back into a fixed-width string.""" record = "" for field_name, length in schema.items(): val = str(data.get(field_name, "")) # Truncate if too long, pad with spaces if too short formatted = val[:length].ljust(length) record += formatted return record
@mcp.tool()def read_flat_file(filepath: str, schema: Dict[str, int], encoding: str = "utf-8") -> str: """ Reads a fixed-width COBOL-style flat file and returns JSON.
Args: filepath: Path to the .txt or .dat file. schema: A dictionary mapping field names to their length (e.g., {"id": 5, "name": 20}). encoding: File encoding. Use 'cp037' or 'cp500' for EBCDIC, 'utf-8' for standard. """ if not os.path.exists(filepath): return f"Error: File {filepath} not found."
results = [] try: with open(filepath, "rb") as f: for line in f: if not line: continue try: record = _parse_line(line, schema, encoding) results.append(record) except Exception as e: results.append({"error": f"Failed to parse line: {str(e)}"})
return str(results) except Exception as e: return f"System Error: {str(e)}"
@mcp.tool()def append_record(filepath: str, data: Dict[str, Any], schema: Dict[str, int], encoding: str = "utf-8") -> str: """ Appends a new record to a fixed-width flat file.
Args: filepath: Path to the file. data: Dictionary containing the data to write. schema: Dictionary defining field lengths (must match the file structure). """ try: # Validate schema total length vs formatted string could be added here record_str = _format_record(data, schema)
# Determine newline character based on OS or preference, usually \n for modern newline = "\n"
with open(filepath, "a", encoding=encoding) as f: f.write(record_str + newline)
return f"Successfully appended record: {record_str}" except Exception as e: return f"Write Error: {str(e)}"
if __name__ == "__main__": mcp.run()🐳 Step 2: The Container (Dockerfile)
Section titled “🐳 Step 2: The Container (Dockerfile)”To deploy this in a modern environment (like Railway, AWS ECS, or Kubernetes) alongside your Agent, use this Dockerfile.
Critical Note: We expose port 8000 to ensure compatibility with standard MCP client configurations over SSE (Server-Sent Events).
# Use a lightweight Python baseFROM python:3.11-slim
# Set working directoryWORKDIR /app
# Install FastMCPRUN pip install fastmcp
# Copy the server codeCOPY server.py .
# Create a data directory for local testing/volume mountsRUN mkdir /data
# EXPOSE port 8000 for the SSE serverEXPOSE 8000
# Run the serverCMD ["python", "server.py"]🔌 Step 3: Integration Guide
Section titled “🔌 Step 3: Integration Guide”1. File Mounting
Section titled “1. File Mounting”In a production setting, the legacy “Flat Files” usually live on a shared network drive (NFS) or an S3 bucket mounted as a volume.
- Docker Run:
docker run -v /mnt/legacy_data:/data -p 8000:8000 cobol-mcp - Agent Access: The agent simply calls
read_flat_file(filepath="/data/BATCH01.TXT", ...)
2. Handling EBCDIC
Section titled “2. Handling EBCDIC”If you are reading raw files directly from an IBM Mainframe (AS/400, z/OS) without prior conversion, they are likely in EBCDIC encoding.
- Instruction to Agent: “When calling
read_flat_file, set theencodingparameter tocp037(US EBCDIC) orcp500(International EBCDIC).“
3. Usage Example (Agent Prompt)
Section titled “3. Usage Example (Agent Prompt)”Once the server is running and connected to OpenAI Operator:
User: “Check the
transactions.datfile for any entries with status code ‘99’ and move them to a new file callederrors.dat.”Agent (Internal Thought):
- I need the schema. I’ll assume standard transaction layout or ask the user. Let’s assume
{id: 10, date: 8, status: 2, amount: 10}.- Call
read_flat_file("/data/transactions.dat", {"id": 10, "date": 8, "status": 2, "amount": 10}).- Filter JSON results where
status== ‘99’.- Call
append_recordfor each match on/data/errors.dat.
⚠️ Common Retrofit Errors
Section titled “⚠️ Common Retrofit Errors”| Error Code | Context | Solution |
|---|---|---|
UnicodeDecodeError | Reading EBCDIC as UTF-8 | Change encoding param to cp037 or latin-1. |
| Misaligned Data | Wrong Schema Lengths | Double-check the COBOL PIC clauses. PIC X(10) means length 10. |
PermissionDenied | Docker Volume | Ensure the Docker container has read/write permissions on the mounted host directory. |
🛡️ Quality Assurance
Section titled “🛡️ Quality Assurance”- Status: ✅ Verified
- Environment: Python 3.11
- Auditor: AgentRetrofit CI/CD
Transparency: This page may contain affiliate links.