UTF-16 to DBF ?

HMG en Español

Moderator: Rathinagiri

AUGE_OHR
Posts: 2096
Joined: Sun Aug 25, 2019 3:12 pm
DBs Used: DBF, PostgreSQL, MySQL, SQLite
Location: Hamburg, Germany

UTF-16 to DBF ?

Post by AUGE_OHR »

hi,

I have a *.CSV file that was encoded using UTF-16.
How can I "read" it and "convert" the data for a DBF?

The characters inside the *.CSV are "normal" ASCII, no umlauts,
but between the characters there is a space ...
have fun
Jimmy
franco
Posts: 889
Joined: Sat Nov 02, 2013 5:42 am
DBs Used: DBF
Location: Canada

Re: UTF-16 to DBF ?

Post by franco »

Can you use APPEND FROM, or is UTF-16 too large?
All The Best,
Franco
Canada
edk
Posts: 999
Joined: Thu Oct 16, 2014 11:35 am
Location: Poland

Re: UTF-16 to DBF ?

Post by edk »

If the file is encoded in UTF-16LE you can translate it using hb_Translate(), but if it is in UTF-16BE you must first swap bytes and then translate by hb_Translate().

Code:

* :encoding=UTF-8:	ąćęłńóśżźĄĆĘŁŃÓŚŻŹ

#include <hmg.ch> 
#include "hbextcdp.ch"          //all CodePages

Function Main 

Local cUTF8 := "Zażółć gęśłą jaźń"
Local cUTF16LE, cUTF16BE

HB_CDPSELECT( "UTF8" )

//UTF8 To UTF16LE
cUTF16LE := hb_Translate( cUTF8, , "UTF16LE" )

//UTF8 To UTF16BE
cUTF16BE := ByteSwap ( hb_Translate( cUTF8, , "UTF16LE" ) )

strfile ( cUTF16LE , "utf16le.txt" )
strfile ( cUTF16BE , "utf16be.txt" )

//UTF16LE to UTF8
msginfo ( hb_Translate( filestr ( "utf16le.txt" ), "UTF16LE" ), "UTF16LE TO UTF8" )

//UTF16BE to UTF8
msginfo ( hb_Translate( ByteSwap ( filestr ( "utf16be.txt" ) ), "UTF16LE" ), "UTF16BE TO UTF8" )

RETURN
**********************************************************************
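FUNCTION ByteSwap ( cString )
// Swaps every pair of bytes (UTF-16LE <-> UTF-16BE).
// The hb_B*() functions work on raw bytes, whatever code page is selected.
// A trailing odd byte, if any, is dropped.
Local cSwapped := "", nByte
FOR nByte := 1 TO hb_BLen ( cString ) - 1 STEP 2
    cSwapped += hb_BSubStr( cString, nByte + 1, 1 ) + hb_BSubStr( cString, nByte, 1 )
NEXT
RETURN cSwapped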
***************************

Re: UTF-16 to DBF ?

Post by AUGE_OHR »

hi Edward,

thanks for the answer, I will try it that way
have fun
Jimmy

Re: UTF-16 to DBF ?

Post by franco »

Jimmy,
I do not know what a UTF-16 file looks like, and I do not have Edward's knowledge, but I put this together.
You just have to build it.
Hope this can help you or others when importing.

Code:

#include "hmg.ch"

Function Main()
Local  lsuccess := .T., L:= 1, B:=''
Private ni 
Private aSTRUCT_L  := {} 

*************** Create CSV File.
set printer to 'Try.csv'
set device to printer
@ 1,1 say 'Hello1'+chr(127)+chr(155)+chr(200)+ " "+'Hello1b'+chr(165)+chr(170)+chr(127)
@ 2,1 say 'Hello'+chr(50)+chr(129)+chr(155)+chr(220)+ " "+'Hello2b'+chr(185)+chr(190)+chr(129)

set printer to 
set device to screen
*************** End CSV File Creation

AADD( aSTRUCT_L , { 'z1 '    , 'c' , 100, 0 } )
AADD( aSTRUCT_L , { 'z2 '    , 'c' , 100, 0 } )
DBCREATE( "Try.dbf" , aSTRUCT_L,)
use
use  Try via "dbfntx"
zap
append from 'try.csv' delimited with blank    //&fil sdf  // delimited with ","
go top
if len(alltrim(z1)) = 0
	delete
	pack
endif
go top
 EDIT EXTENDED WORKAREA TRY
go top
* Remove every character with a code above 126 from z1
do while ! eof()
	L:=1
	do while L <= len(alltrim(z1))
		if asc(substr(z1,L,1)) > 126
			* drop the character at position L; L is not advanced
			replace z1 with substr(z1,1,L-1) + substr(z1,L+1,100)
		else
			L := L+1
		endif
	enddo
	skip
enddo

go top
select printer default to lsuccess     // preview    // default
if lsuccess == .T.
	start printdoc
	start printpage
	@ 5,10 PRINT z1  Font "Arial" Size 14
	@ 10,10 PRINT z2 Size 14
	end printpage
	end printdoc
else 
	msgbox('No Print')
endif
 EDIT EXTENDED WORKAREA TRY
 *EDIT  WORKAREA TRY
BR1()
use
Return

Function br1
Local aValue := { Nil , Nil }

	* Grid Column Controls Definitions
	aCtrl_1 := {'TEXTBOX','CHARACTER'}
	aCtrl_2 := {'TEXTBOX','CHARACTER'}


  DEFINE WINDOW Win_1 ;
      AT 0,0 ;
      WIDTH 1000 ;
      HEIGHT 700 ;
      TITLE "Table Try" ;
      MODAL ; // for test, usually MODAL ;
      BACKCOLOR { 230, 230, 230 } 

  
      @ 90, 10 BROWSE Browse_1 ;
         OF Win_1 ;
         WIDTH 900 ;
         HEIGHT 400 ;
         FONT "Arial" ; 
         SIZE 10 ;
         HEADERS { "Z1","Z2"} ;
         WIDTHS { 100,300 } ;
         WORKAREA Try ;
         FIELDS { "z1","Z2" } 
END WINDOW

CENTER WINDOW win_1
ACTIVATE WINDOW win_1
Return

All The Best,
Franco
Canada

Re: UTF-16 to DBF ?

Post by edk »

Hello Franco.
Remember that in UTF-16 each character is encoded in at least two bytes (four for characters outside the Basic Multilingual Plane). This also applies to the string delimiter, the field separator and the line break marker. If you split the file into lines and fields on single bytes, the remaining string to be processed may be corrupted.
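
To make the point above concrete, here is a minimal sketch (a plain Harbour console program, not part of Edward's attachment) that translates a short CSV line to UTF-16LE and dumps its bytes. The sample string is made up for illustration. Every ASCII character, including the ";" separator and the CR/LF pair, should be followed by a zero byte.

```harbour
#include "hbextcdp.ch"          // links all code pages, including UTF16LE

PROCEDURE Main()

   // Sample data is an assumption: two one-letter fields and a CRLF line break
   LOCAL cUtf8  := "A;B" + Chr( 13 ) + Chr( 10 )
   LOCAL cUtf16 := hb_Translate( cUtf8, "UTF8", "UTF16LE" )

   ? "UTF-8 bytes   :", hb_BLen( cUtf8 )
   ? "UTF-16LE bytes:", hb_BLen( cUtf16 )

   // Hex dump with a space between bytes: the separator and the
   // line break each occupy two bytes as well
   ? hb_StrToHex( cUtf16, " " )

   RETURN
```

A byte-oriented reader that looks for a single Chr( 10 ) or ";" therefore cuts each two-byte unit in half, which is why the zero bytes show up as "spaces" between the characters.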

In my opinion, it is better to convert the file to the target code page and then append it.

Take a look at my example with files encoded in UTF16LE (little endian) and UTF16BE (big endian) with and without BOM marker.
csv_UTF16.7z
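
The attachment itself is not reproduced in this thread. As a rough sketch of the approach described above (convert first, then append), something like the following could work. The helper names Utf16FileToCdp() and SwapBytes(), the file names, the two-field structure and the "DEWIN" target code page are all assumptions made for illustration, and a file without a BOM is simply assumed to be little endian.

```harbour
#include "hbextcdp.ch"          // links all code pages (UTF16LE, DEWIN, ...)

PROCEDURE Main()

   // Hypothetical file names and target code page
   LOCAL cText := Utf16FileToCdp( "input.csv", "DEWIN" )

   hb_MemoWrit( "input_ansi.csv", cText )

   // Hypothetical structure: two character fields
   dbCreate( "import.dbf", { { "Z1", "C", 100, 0 }, { "Z2", "C", 100, 0 } } )
   USE import NEW
   APPEND FROM ( "input_ansi.csv" ) DELIMITED
   USE

   RETURN

// Hypothetical helper: reads a UTF-16 file and returns it in cCdpOut
FUNCTION Utf16FileToCdp( cFile, cCdpOut )

   LOCAL cData := hb_MemoRead( cFile )
   LOCAL cBom  := hb_BLeft( cData, 2 )

   IF cBom == hb_BChar( 0xFF ) + hb_BChar( 0xFE )        // UTF-16LE BOM
      cData := hb_BSubStr( cData, 3 )
   ELSEIF cBom == hb_BChar( 0xFE ) + hb_BChar( 0xFF )    // UTF-16BE BOM
      cData := SwapBytes( hb_BSubStr( cData, 3 ) )
   ENDIF
   // No BOM found: the data is assumed to be UTF-16LE already

   RETURN hb_Translate( cData, "UTF16LE", cCdpOut )

// Hypothetical helper: swaps each pair of bytes (BE -> LE)
STATIC FUNCTION SwapBytes( cData )

   LOCAL cOut := "", n

   FOR n := 1 TO hb_BLen( cData ) - 1 STEP 2
      cOut += hb_BSubStr( cData, n + 1, 1 ) + hb_BSubStr( cData, n, 1 )
   NEXT

   RETURN cOut
```

Because the conversion happens before APPEND FROM, the delimiter, separator and line break are single bytes again by the time the file is split into fields.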

Re: UTF-16 to DBF ?

Post by franco »

Thank you Edward.
I am studying your code and trying to understand how it works.
I will carry on testing different things about it.
It works perfectly.
All The Best,
Franco
Canada