Combining characters stack visual accents, marks, or glyphs on top of base characters. This tool strips them out, returning a clean, base-level version of the text. It uses Unicode-aware normalization to catch all \u0300-style combining marks, whether they render visibly or not.
How to Use:
- Paste text containing diacritics or accents into the input
- Watch the clean output appear live in the right box
- Import .txt, .log, or .csv files if needed
- Use Copy or Export to save the result
- Click Clear All to reset the tool
What It Does:
- Converts input to NFD (decomposed form)
- Removes all combining marks using the regex \p{Mn}
- Recombines with NFC normalization
- Preserves layout, spacing, and base text (see the code sketch below)
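As a concrete illustration of the steps above, here is a minimal sketch in TypeScript; the helper name stripCombiningMarks is illustrative only and not the tool's actual implementation.

```ts
// Minimal sketch: decompose, drop non-spacing marks, recompose.
// The name stripCombiningMarks is illustrative, not the tool's own code.
function stripCombiningMarks(input: string): string {
  return input
    .normalize("NFD")          // split precomposed letters, e.g. "á" -> "a" + U+0301
    .replace(/\p{Mn}/gu, "")   // remove every non-spacing combining mark
    .normalize("NFC");         // recompose whatever base characters remain
}

console.log(stripCombiningMarks("áêĩōu̅")); // -> "aeiou"
```

The NFD pass matters because many accented characters are stored as single precomposed code points; decomposing first ensures their marks become separate \p{Mn} characters that the regex can remove.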
Example:
Input: áêĩōu̅
Output: aeiou
Common Use Cases:
This tool is ideal for cleaning data that contains Unicode accents, diacritical marks, or combining characters of the kind found in Zalgo text, emoji stacking, legacy encodings, or user-generated content. It’s especially useful before search indexing, string comparison, or sanitizing text for plain-text storage.
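For example, an accent-insensitive comparison might fold both strings before comparing them. The sketch below assumes this approach; the names fold and equalsIgnoringMarks are illustrative, not part of the tool.

```ts
// Fold a string for accent-insensitive matching: strip non-spacing marks,
// then lowercase. Names here are illustrative, not part of the tool.
const fold = (s: string): string =>
  s.normalize("NFD").replace(/\p{Mn}/gu, "").normalize("NFC").toLowerCase();

function equalsIgnoringMarks(a: string, b: string): boolean {
  return fold(a) === fold(b);
}

console.log(equalsIgnoringMarks("résumé", "Resume")); // true
console.log(equalsIgnoringMarks("naïve", "naive"));   // true
```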
Useful Tools & Suggestions:
After stripping out combining marks, it’s smart to run the text through Analyze Unicode to confirm the cleanup. And if you’re prepping it for a strict target like ASCII-only encoding, Convert Unicode to ASCII helps flatten things even further.