Pasting TEXT from browsers - sort of

Submitted by Frederic Marand on Sun, 2005-12-18 09:24

Are you aware of what you're actually pasting from a browser's rendition of a page?

I had this idea while writing the post about Drupal and XML-RPC: what do browsers actually copy from web pages?

In the "A word of warning" section, you'll find a paragraph like this:

This means that you MUST at all costs include security checks in your new server module if you want it to access any data, otherwise you're ready to be exploited.

OK? Now select part of the sentence in the paragraph above, in your browser, including the emphasized text. Paste it into a text editor. Don't you notice something strange?

No? Really? Now paste it into a word processor, like OOo, WP, or MS Word. Compare.

OK then, why does the word "MUST" appear as "must" in the text editor, while it reads "MUST" in the browser and the word processor?

If you answered CSS, you're right, of course. In the page source, the text is written in lowercase, and the inline stylesheet applies a text-transform: uppercase rule to it. This means the plain-text contents of the clipboard, which is what you get in a text editor, are the original, unadulterated lowercase text, while the rich-text contents of the clipboard include the lowercase text plus the CSS transformation to uppercase.
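To make the difference concrete, here is a minimal Python sketch of the two views of the same fragment. It is only an illustration, not what any browser really does internally: the markup in FRAGMENT is a hypothetical stand-in for the paragraph quoted earlier (the real page may use a different element or a class), and the names ClipboardViews and clipboard_views are made up for this example. It treats text-transform: uppercase as inherited by child elements and ignores everything else CSS can do.

```python
from html.parser import HTMLParser


class ClipboardViews(HTMLParser):
    """Collect the text of an HTML fragment twice: as stored and as displayed.

    Simplifications (assumptions of this sketch): only inline style attributes
    are examined, only 'text-transform: uppercase' is recognised, a nested
    'text-transform: none' override is ignored, and void elements written
    without a closing slash (such as <br>) are not handled.
    """

    def __init__(self):
        super().__init__()
        self.plain = []     # text nodes exactly as written in the source
        self.rendered = []  # text nodes after the uppercase rule is applied
        self.stack = []     # one flag per open element: is uppercase in force?

    def handle_starttag(self, tag, attrs):
        style = (dict(attrs).get("style") or "").replace(" ", "").lower()
        inherited = self.stack[-1] if self.stack else False
        self.stack.append(inherited or "text-transform:uppercase" in style)

    def handle_endtag(self, tag):
        if self.stack:
            self.stack.pop()

    def handle_data(self, data):
        self.plain.append(data)
        upper = self.stack[-1] if self.stack else False
        self.rendered.append(data.upper() if upper else data)


def clipboard_views(fragment):
    """Return (plain_text, displayed_text) for an HTML fragment."""
    parser = ClipboardViews()
    parser.feed(fragment)
    parser.close()
    return "".join(parser.plain), "".join(parser.rendered)


# Hypothetical markup, not copied from the actual post.
FRAGMENT = 'you <em style="text-transform: uppercase">must</em> at all costs'

plain, rendered = clipboard_views(FRAGMENT)
print(plain)     # you must at all costs
print(rendered)  # you MUST at all costs
```

The first string corresponds to what lands in the text editor, the second to what you see on screen. A rich-text paste carries the lowercase text together with the style, which is why the word processor can show the uppercase form and still let you undo it.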

You can verify this in MS Word, for instance, by opening Format/Character with the cursor within the word MUST: you'll notice that an uppercase transformation is applied and, if you remove it, the lowercase text reappears.

All very logical, but maybe unexpected.


I generally find myself dealing with overformatted rather than underformatted text when pasting from browsers. For example, pasting a table of web content into Excel can be challenging. Paste > Special > Text seems to provide the cleanest data for me.