Ssis-965 -

// Load schema JSON var schema = JArray.Parse(File.ReadAllText(schemaFile)); foreach (var col in schema) var input = source.InputCollection[0]; var colMeta = input.InputColumnCollection.New(); colMeta.Name = col["ColumnName"].ToString(); colMeta.DataType = DataType.DT_WSTR; // Map to DT_WSTR for nvarchar colMeta.Length = 4000;

$schema = @() foreach($col in $headers) $schema += [pscustomobject]@ ColumnName = $col.Trim() DataType = 'nvarchar(4000)' # default, can be refined later Nullable = $true

// Add OLE DB Destination similarly... pipeline.ReinitializeMetaData(); pkg.Save();

$schema | ConvertTo-Json -Depth 3 | Set-Content -Path "$FilePath.schema.json" Write-Host "Schema written to $FilePath.schema.json" using Microsoft.SqlServer.Dts.Runtime; using Microsoft.SqlServer.Dts.Pipeline.Wrapper; using System.IO; using Newtonsoft.Json.Linq; SSIS-965

| Work‑around | Description | Pros | Cons | |-------------|-------------|------|------| | – set RetainSameConnection = False on the Connection Manager and add a dummy Execute SQL Task that runs SELECT 1 before the Data Flow. | Causes the connection manager to be re‑created at runtime, forcing a new schema read. | Simple; no code changes. | Adds an extra task; may still fail if file is swapped after the dummy task runs. | | B. Use a Staging Table – Load the file into a wide staging table with a varchar(max) column for each field, then perform a set‑based INSERT…SELECT into the final destination after schema validation. | Decouples file schema from the Data Flow; you can validate columns via T‑SQL. | Robust; easy to log errors. | Additional I/O; extra storage; slower for very large files. |

SSIS‑965 – “Data Flow task fails with The data type of the column is unknown ” TL;DR – SSIS‑965 is a long‑standing “metadata‑loss” bug that appears when a Flat File Source (or OLE DB Source ) is used together with dynamic column discovery in a Data Flow that is later reused by a Script Component or Derived Column . The root cause is the way the SSIS runtime caches the metadata of the source at design‑time but discards it at run‑time when the Connection Manager is refreshed with a new schema. The fix is to force a metadata refresh (re‑initialise the component) or, better, to decouple schema discovery from the data flow by using a staging table or Data Flow parameters . Below is a step‑by‑step forensic analysis, a reproducible test case, the official Microsoft KB work‑around, a clean‑room implementation that eliminates the issue, performance considerations, and a checklist for preventing the bug in future projects. 1. Background & Why It Matters SQL Server Integration Services (SSIS) is the ETL engine for the Microsoft data‑platform. A huge proportion of SSIS packages are data‑flow‑centric – they read from a source, perform transformations, and write to a destination.

contains an additional column Region at the end: // Load schema JSON var schema = JArray

// Configure source connection (assume connection manager already exists) var cm = pkg.Connections["FlatFileConn"]; source.RuntimeConnectionCollection[0].ConnectionManager = DtsConvert.GetExtendedInterface(cm); source.RuntimeConnectionCollection[0].ConnectionManagerID = cm.ID;

| Symptom | Business impact | |---------|-----------------| | Package crashes on first row | Batch jobs stop, SLA breach | | Intermittent failures (only when file changes) | Hard to reproduce, support overhead | | Silent data loss (when column is dropped) | Incorrect reporting, audit issues | | Debugging time > 4 h per occurrence | Increased cost, developer fatigue |

// Locate Data Flow Task (by name) var dfTask = (TaskHost)pkg.Executables .Cast<Executable>() .First(e => ((TaskHost)e).Name == "DF_LoadDynamic"); | Simple; no code changes

Error 0xC0202009 at Data Flow Task, OLE DB Source [1]: The data type of column "CustomerID" is unknown. Consequences:

static void Main(string[] args) string pkgPath = args[0]; // Path to master package string schemaFile = args[1]; // JSON schema var pkg = Application.LoadPackage(pkgPath, null);

var pipeline = (MainPipe)dfTask.InnerObject; var source = pipeline.ComponentMetaDataCollection.New(); source.ComponentClassID = "DTSAdapter.FlatFileSource";