How to convert html from emails into text?

Posted: Fri Dec 22, 2023 9:30 pm
by HGAutomator
Hi,

I'm converting a whole bunch of Thunderbird emails into .eml files, then into .csv, then into Excel, then into DBF format.

I've done all this so far, but of course the body of the email is littered with html codes.

How can I strip all the html out, and leave only text? I can probably replace <br> with carriage returns and line feeds, but I'd like to convert each body memo field from html to plain text.

Re: How to convert html from emails into text?

Posted: Fri Dec 22, 2023 10:46 pm
by edk
It is quite simple:

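// Windows only: relies on the "HTMLFile" COM object (MSHTML) to parse the markup.
Function HTMLToPlainText ( cHTML )
   Local oHTMLDoc := CreateObject( "HTMLFile" )
   Local cPlainText
   
   oHTMLDoc:Write ( cHTML )                  // load the HTML into the document
   cPlainText := oHTMLDoc:body:innerText     // text content of the body, tags removed
   oHTMLDoc := Nil                           // release the COM object
RETURN cPlainText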
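
To apply this to every record of the converted table, something like the loop below might do. It is only a sketch: the table name "emails" and the memo field name BODY are made-up placeholders for whatever the real DBF uses, and it assumes the HTMLToPlainText() function above is linked into the same program.

```harbour
// Sketch only: "emails" and the memo field BODY are hypothetical names.
// Uses HTMLToPlainText() from the post above; work on a copy of the DBF first.
PROCEDURE StripBodies()

   USE emails NEW EXCLUSIVE

   DO WHILE ! Eof()
      // Skip records with an empty body to avoid creating the COM object needlessly
      IF ! Empty( FIELD->body )
         REPLACE body WITH HTMLToPlainText( FIELD->body )
      ENDIF
      SKIP
   ENDDO

   USE   // close the table

RETURN
```

The function creates a new HTMLFile object for each record, so this can be slow on a large table.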

Re: How to convert html from emails into text?

Posted: Fri Dec 22, 2023 11:16 pm
by HGAutomator
I'll give it a try, thanks edk.

Re: How to convert html from emails into text?

Posted: Sat Dec 23, 2023 8:20 am
by serge_girard
Thanks Edward, this is what I also needed!

Serge