CSV First Aid

How to Fix a Broken CSV File — Complete Guide

CSV files break in dozens of ways — garbled characters, shifted columns, phantom blank rows, dates that sort wrong. This guide covers every common failure mode, explains why it happens, and shows you how to fix it (both manually and with CSV First Aid).

1. Garbled characters (encoding issues)

If you see garbled sequences like é instead of readable characters, your file was saved in Windows-1252 or Latin-1 but opened as UTF-8. This is called mojibake.

Manual fix: in Python, open with encoding='cp1252'. In Excel, use the Text Import Wizard and select '65001: Unicode (UTF-8)' or '1252: Western European' depending on the actual encoding.
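
A minimal sketch of the Python route, assuming the file is either UTF-8 or Windows-1252 (any other encoding would need its own fallback). The function names are illustrative, not part of any library.

```python
def decode_csv_bytes(raw: bytes) -> str:
    """Decode raw CSV bytes: try UTF-8 first, then fall back to Windows-1252."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # cp1252 leaves a few byte values undefined; errors="replace"
        # turns those into U+FFFD instead of raising.
        return raw.decode("cp1252", errors="replace")


def repair_mojibake(text: str) -> str:
    """Undo one round of 'UTF-8 read as cp1252' damage, e.g. 'Ã©' -> 'é'."""
    try:
        return text.encode("cp1252").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Not this kind of mojibake; leave the text unchanged.
        return text
```

Trying UTF-8 first is a deliberate choice: valid UTF-8 rarely occurs by accident, whereas almost any byte sequence decodes under cp1252 without complaint.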

Automatic fix: drop the file into CSV First Aid. The encoding detector identifies Windows-1252 patterns and re-decodes to clean UTF-8.


2. First column header starts with garbage (BOM)

If your first header looks like ï»¿id or \ufeffid, the file starts with a UTF-8 BOM (Byte Order Mark): the three bytes EF BB BF at position 0. They are invisible in most editors, and Excel writes them when you save as 'CSV UTF-8'.

Manual fix: open in a hex editor and delete bytes EF BB BF. In Python: open('file.csv', encoding='utf-8-sig').
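
As a rough sketch, both manual routes look like this in Python. The helper names are hypothetical; codecs.BOM_UTF8 and the 'utf-8-sig' codec are standard library.

```python
import codecs
import csv
import io


def strip_utf8_bom(raw: bytes) -> bytes:
    """Byte-level equivalent of deleting EF BB BF in a hex editor."""
    if raw.startswith(codecs.BOM_UTF8):
        return raw[len(codecs.BOM_UTF8):]
    return raw


def read_header(raw: bytes) -> list:
    """Decode with utf-8-sig, which drops a leading BOM if one is present."""
    text = raw.decode("utf-8-sig")
    return next(csv.reader(io.StringIO(text, newline="")))
```

The 'utf-8-sig' codec also reads files that have no BOM, so it is a safe default when you do not know how a UTF-8 file was produced.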

Automatic fix: CSV First Aid strips the BOM as part of the read step.


3. Data shifted into wrong columns (broken quotes)

When a field contains a comma or newline but isn't properly quoted, the CSV parser splits it across multiple columns or rows. One unmatched quote can shift every subsequent field.

Manual fix: find the offending field (look for quotes that don't pair up), add the missing closing quote, and escape inner quotes by doubling them ("").
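
To locate the offending rows, one option is to compare each row's field count with the header's. The sketch below uses Python's csv module; both function names are illustrative. The second helper shows the writer doubling inner quotes on export.

```python
import csv
import io


def find_misaligned_rows(text, delimiter=","):
    """Return (line_number, field_count) for rows wider or narrower than the header."""
    reader = csv.reader(io.StringIO(text, newline=""), delimiter=delimiter)
    header = next(reader)
    bad = []
    for row in reader:
        # Skip fully blank lines; they are a separate problem.
        if row and len(row) != len(header):
            bad.append((reader.line_num, len(row)))
    return bad


def to_csv_text(rows):
    """Serialize rows; csv.writer quotes fields containing commas, quotes or newlines."""
    buf = io.StringIO(newline="")
    csv.writer(buf, quoting=csv.QUOTE_MINIMAL).writerows(rows)
    return buf.getvalue()
```

A field-count check will not catch every quoting error, but it narrows the search to the lines worth inspecting by hand.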

Automatic fix: CSV First Aid's tolerant parser recovers from unmatched quotes and re-quotes all fields correctly on export.


4. Everything in one column (wrong delimiter)

If your data appears in a single column, the file uses a different delimiter than your tool expects. European exports often use semicolons because commas are decimal separators in those locales.

Manual fix: re-import with the correct delimiter. In Excel: Data → Text to Columns → Delimited → select the right character. In Pandas: pd.read_csv('file.csv', sep=';').
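
If you do not know the delimiter in advance, Python's csv.Sniffer can guess it from a sample. This is a sketch under assumptions: the candidate list and the comma fallback are choices made here, and the sniffer can guess wrong on short or irregular samples.

```python
import csv
import io


def detect_delimiter(sample, candidates=",;\t|"):
    """Guess the delimiter from a text sample; fall back to a comma."""
    try:
        return csv.Sniffer().sniff(sample, delimiters=candidates).delimiter
    except csv.Error:
        # Sniffer could not decide; assume a conventional comma.
        return ","


def parse_with_detected_delimiter(text):
    """Parse the whole text using the delimiter guessed from its first 4 KB."""
    delimiter = detect_delimiter(text[:4096])
    return list(csv.reader(io.StringIO(text, newline=""), delimiter=delimiter))
```

For example, a semicolon-separated export with decimal commas such as `Widget;1,50;3` parses into three fields rather than one.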

Automatic fix: CSV First Aid detects the delimiter and lets you convert to any standard format.


5. Dates sorting wrong or misinterpreted

When a column mixes formats (01/03/2024 vs 2024-03-01 vs March 1, 2024), sorting fails and imports misinterpret the dates. Is 01/02/2024 January 2nd or February 1st?

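Manual fix: first decide whether ambiguous values are day-first or month-first (check the source system's locale, or look for a value with a day above 12 in the same column). Then use regex replacement or a script that parses each format and outputs ISO 8601 (YYYY-MM-DD).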
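
A script for this could look like the sketch below. It assumes slashed dates are day-first (DD/MM/YYYY) and that month names are in English; swap the format strings if your data is month-first. The format list and function name are illustrative.

```python
from datetime import datetime

# ASSUMPTION: slashed dates are day-first. Use "%m/%d/%Y" for US-style data.
KNOWN_FORMATS = ("%Y-%m-%d", "%d/%m/%Y", "%B %d, %Y")


def to_iso_date(value, formats=KNOWN_FORMATS):
    """Return the date as YYYY-MM-DD, or None if no known format matches."""
    cleaned = value.strip()
    for fmt in formats:
        try:
            return datetime.strptime(cleaned, fmt).date().isoformat()
        except ValueError:
            continue
    return None
```

Returning None for unrecognized values, rather than guessing, leaves a visible gap you can review instead of a silently wrong date.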

Automatic fix: CSV First Aid detects mixed date patterns per column and normalizes to ISO 8601.


6. Invisible problems (whitespace, NBSP, zero-width chars)

The most frustrating CSV bugs are invisible. Trailing spaces cause VLOOKUP to fail. Non-breaking spaces (NBSP) look like regular spaces but don't match. Zero-width characters from web scraping silently break joins.

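Manual fix: in Python 3, strip() removes leading and trailing whitespace, NBSP included, but it leaves zero-width characters alone and never touches anything inside the value. Handle the rest explicitly: re.sub(r'[\u200b\ufeff]', '', text) deletes zero-width characters, and text.replace('\u00a0', ' ') turns each NBSP into a regular space instead of gluing the surrounding words together.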
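
Put together, a per-cell cleaner might look like this sketch. The set of zero-width characters is an assumption covering the common cases (zero-width space, joiners, word joiner, BOM); extend it if your data contains others.

```python
import re

# Zero-width space, non-joiner, joiner, word joiner, and BOM / zero-width no-break space.
ZERO_WIDTH = re.compile(r"[\u200b\u200c\u200d\u2060\ufeff]")


def clean_cell(value: str) -> str:
    """Remove zero-width characters, normalize NBSP to a space, trim the ends."""
    value = ZERO_WIDTH.sub("", value)
    value = value.replace("\u00a0", " ")
    return value.strip()
```

Zero-width characters are removed before trimming so that a space hidden behind one at the end of a value is still stripped.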

Automatic fix: CSV First Aid's invisible character cleaner plus whitespace trimmer handles all of these in one pass.


7. Blank rows and trailing newlines

Extra blank rows inflate row counts, break import tools that expect dense data, and create phantom NULL records in databases.

Manual fix: open in a text editor and delete blank lines. Careful with the trailing newline — it's valid per RFC 4180 but many parsers create an empty final row from it.
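
For files too large to edit by hand, a short script can do the same. This sketch treats a row as blank when it has no fields or every field is empty or whitespace; that definition is an assumption, so adjust it if all-empty rows are meaningful in your data.

```python
import csv
import io


def drop_blank_rows(text, delimiter=","):
    """Parse CSV text and keep only rows with at least one non-empty field."""
    reader = csv.reader(io.StringIO(text, newline=""), delimiter=delimiter)
    return [row for row in reader if any(cell.strip() for cell in row)]
```

Parsing with csv.reader rather than filtering raw lines keeps blank lines inside quoted multi-line fields intact.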

Automatic fix: enable the 'Empty rows' fix in CSV First Aid.


8. The 'Unnamed: 0' column (Pandas index leak)

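If the first column is a numeric index with the header 'Unnamed: 0', the file was written by pandas' df.to_csv() without index=False. The index column is saved with an empty header, and pandas names it 'Unnamed: 0' when reading the file back.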

Manual fix: pd.read_csv('file.csv', index_col=0) or delete the first column.
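
If pandas is not available, the first column can be dropped with the standard library. The sketch below is one possible heuristic, not how any particular tool does it: it removes the first column only when its header is empty or 'Unnamed: 0' and its values count up from 0, as a default pandas index does.

```python
def drop_index_column(rows):
    """rows: list of lists, header first. Drop a leaked 0..n-1 index column."""
    if not rows or not rows[0]:
        return rows
    header, body = rows[0], rows[1:]
    looks_unnamed = header[0] in ("", "Unnamed: 0")
    # Every data row must start with its own position: "0", "1", "2", ...
    sequential = bool(body) and all(
        row and row[0] == str(i) for i, row in enumerate(body)
    )
    if looks_unnamed and sequential:
        return [row[1:] for row in rows]
    return rows
```

Requiring both conditions avoids deleting a real ID column that merely happens to be numeric.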

Automatic fix: CSV First Aid detects and strips the sequential-integer index column.

Prefer not to fix these by hand? Drop the file into CSV First Aid — it runs these same checks automatically.
