Sanitizing Text
- Diniz Martins
- May 8
Updated: Jun 13
The Hidden Side of Text: Detecting and Cleaning Invisible Characters
We often assume that what we see is what we get — especially with text. But under the surface, some messages, social posts, and copied content carry invisible Unicode characters that can mess with formatting, analytics, and even security. If you've ever wondered why some copy-pasted text behaves strangely, you're not imagining things.
Recently, I tested two online tools that let you detect and clean these characters, and the results were surprising.
Test Case: Copying Text from ChatGPT
I copied a simple message from ChatGPT and pasted it into https://invisible-characters.com/view.html. The tool instantly flagged 24 characters, all of them U+0020. That code point is the ordinary ASCII space, so nothing exotic was hiding in this particular sample; the tool simply highlights every space as an individual Unicode code point.
This helps you see what's lurking beneath the surface of your text, whether it’s spaces, zero-width joiners, or other non-printable characters used for formatting, obfuscation, or even manipulation.
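If you would rather check text locally, the same idea can be sketched in a few lines of Python. This is only an illustration, not how the online viewer works: the function name `find_hidden` and the choice of Unicode categories to flag are my own assumptions.

```python
import unicodedata

# Categories flagged in this sketch (an assumption, not a standard):
# Cf = format characters (zero-width space/joiner, BOM, bidi controls),
# Cc = control characters, Zs/Zl/Zp = separators other than the plain space.
SUSPICIOUS = ("Cf", "Cc", "Zs", "Zl", "Zp")

def find_hidden(text):
    """Return (index, 'U+XXXX', name) for each suspicious character."""
    hits = []
    for i, ch in enumerate(text):
        if ch in "\n\r\t ":
            continue  # ordinary whitespace is not reported
        if unicodedata.category(ch) in SUSPICIOUS:
            name = unicodedata.name(ch, "<unnamed>")
            hits.append((i, f"U+{ord(ch):04X}", name))
    return hits

sample = "Hello\u200b world\u00a0!"
for hit in find_hidden(sample):
    print(hit)
# (5, 'U+200B', 'ZERO WIDTH SPACE')
# (12, 'U+00A0', 'NO-BREAK SPACE')
```

Unlike the viewer, this sketch skips U+0020 on purpose, so only characters that differ from everyday whitespace show up.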
Sanitizing Text
Then I used https://cleanpaste.site/ — a much simpler-looking tool, but just as powerful. When I pasted the same message here, it automatically cleaned the text, stripping out hidden formatting and leaving me with plain, clean text.
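I don't know what CleanPaste does internally, but a rough local equivalent might look like the sketch below. The function name `clean_text` and the rules it applies (drop format and control characters, turn unusual spaces into a plain space) are assumptions made for illustration.

```python
import unicodedata

def clean_text(text):
    """Drop format/control characters and turn unusual spaces into U+0020."""
    out = []
    for ch in text:
        cat = unicodedata.category(ch)
        if ch in "\n\t":
            out.append(ch)      # keep ordinary line breaks and tabs
        elif cat in ("Cf", "Cc"):
            continue            # zero-width characters, BOM, bidi and control codes
        elif cat in ("Zs", "Zl", "Zp"):
            out.append(" ")     # no-break space, thin space, etc. become a plain space
        else:
            out.append(ch)
    return "".join(out)

print(repr(clean_text("Hello\u200b world\u00a0!\ufeff")))
# 'Hello world !'
```

One caveat: zero-width joiners and non-joiners are legitimate in emoji sequences and in some scripts, so stripping every format character is only safe for text where you don't expect them.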
This is ideal if you're:
Publishing articles and want clean formatting,
Sharing code or logs,
Copy-pasting between systems (especially in cybersecurity, DevOps, or academia),
Trying to avoid tracking characters sometimes hidden in shady copy-pastes.
Why This Matters (Especially for Tech People)
If you run a blog, write technical content, or work with scripts:
These characters can break markdown, HTML, or JSON.
They can manipulate display logic in web apps.
In some rare cases, they’ve even been used in social engineering or to bypass string comparisons in exploits.
For example, imagine a file that looks identical to a trusted one — but has an invisible character in the filename or path. The implications for code reviews or forensic investigations are real.
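Two of these points are easy to reproduce. The sketch below uses a made-up filename, and the helper names `visible_form` and `parses_as_json` are hypothetical. It shows a lookalike name failing a plain comparison, and a stray zero-width space making Python's standard `json` module reject otherwise valid JSON.

```python
import json
import unicodedata

trusted = "report.pdf"
lookalike = "report\u200b.pdf"  # hypothetical name with a zero-width space

print(trusted == lookalike)          # False, although both render the same
print(len(trusted), len(lookalike))  # 10 11

def visible_form(s):
    """Remove format characters (category Cf) before comparing."""
    return "".join(ch for ch in s if unicodedata.category(ch) != "Cf")

print(visible_form(trusted) == visible_form(lookalike))  # True

def parses_as_json(s):
    """Return True if the standard json module accepts the string."""
    try:
        json.loads(s)
        return True
    except json.JSONDecodeError:
        return False

print(parses_as_json('{"ok": true}'))        # True
print(parses_as_json('{"ok": true}\u200b'))  # False: trailing zero-width space
```

Comparing the `visible_form` of two names is a cheap check to add to a review or forensics script, with the same caveat about legitimate uses of zero-width characters.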
Takeaway
Use Invisible Characters Viewer to detect hidden characters in any suspicious text.
Use CleanPaste.site to clean your input before posting, sharing, or saving.
And always remember: text isn’t always what it seems.