In today's world, migrating or integrating data from one system to another plays a vital role for any business.
When loading data from an SAP source into a CSV file, we come across many tables that hold data in various languages. They include text not only in Western languages but also in others, for example Chinese and Japanese.
When such tables are loaded into a CSV file, the text in languages other than English comes out garbled. The reason is that when we open the CSV file in Excel, Excel reads it with Windows-1252 encoding, commonly referred to as ANSI (American National Standards Institute) encoding, which is mostly compatible with English and other Western languages only.
For example, in table TPART the field VTEXT contains Chinese text at the source, but when we load it into a .csv file, garbage characters appear instead.
Fig:1) Source data from table TPART, which contains Chinese text in field VTEXT.
Fig:2) Target data in the .csv file, which contains garbage characters in field VTEXT.
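The garbling can be reproduced outside SAP DS. The sketch below assumes the CSV file is written as UTF-8 and then read as Windows-1252, which is the situation described above. The sample string is a hypothetical stand-in for a VTEXT value, not actual TPART content.

```python
# Hypothetical sample value standing in for a Chinese VTEXT entry.
text = "合作伙伴"

# Bytes as they would be stored in a UTF-8 encoded CSV file.
utf8_bytes = text.encode("utf-8")

# A reader that assumes Windows-1252 maps every single byte to one character.
# errors="replace" is needed because a few byte values are undefined in cp1252.
garbled = utf8_bytes.decode("cp1252", errors="replace")

print("original :", text)
print("bytes    :", utf8_bytes.hex(" "))
print("as cp1252:", garbled)
```

Each Chinese character occupies three bytes in UTF-8, so the Windows-1252 reader shows three unrelated characters for every original one.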
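To overcome this kind of data issue, we need to adjust the code page and Byte Order Mark settings in the flat file format editor in SAP Data Services (SAP DS).
A Byte Order Mark (BOM) is a special marker that can be placed at the beginning of a Unicode-encoded text file (UTF-8, UTF-16, or UTF-32). It tells the software reading the file which encoding the text uses and, for UTF-16 and UTF-32, the byte order (little endian or big endian). In UTF-8 the BOM carries no byte-order information and acts only as an encoding signature, which is what allows Excel to recognize the file as UTF-8.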
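As an illustration of how reading software can use the BOM, here is a minimal Python sketch that inspects the first bytes of a file's content. The function name `detect_bom` is made up for this example, and the BOM constants come from Python's standard `codecs` module.

```python
import codecs

# UTF-32 must be checked before UTF-16: the UTF-32 LE BOM (FF FE 00 00)
# begins with the same two bytes as the UTF-16 LE BOM (FF FE).
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def detect_bom(data: bytes):
    """Return the encoding announced by a leading BOM, or None if there is none."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    return None


print(detect_bom(codecs.BOM_UTF8 + "VTEXT".encode("utf-8")))
print(detect_bom("VTEXT".encode("utf-8")))
```

A file without a BOM returns `None`, and in that case the reader has to guess the encoding, which is where Excel falls back to Windows-1252.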
Fig:3) Snapshot of the file format editor where the above settings are applied.
The screenshot below shows the output data after applying the code page setting:
Fig:4) Snapshot of the .csv output file after applying the above settings in the flat file format editor.
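For comparison, the same kind of file can be produced and checked outside SAP DS. The following is only a sketch in Python, not the SAP DS mechanism: it assumes Python's `utf-8-sig` codec as a stand-in for a UTF-8 code page with a BOM, and the rows and file name are hypothetical sample data.

```python
import codecs
import csv
import os
import tempfile


def write_csv_with_bom(path, rows):
    # "utf-8-sig" writes the 3-byte UTF-8 BOM before the first row.
    with open(path, "w", encoding="utf-8-sig", newline="") as f:
        csv.writer(f).writerows(rows)


# Hypothetical sample rows; real TPART content will differ.
rows = [["PARVW", "VTEXT"], ["AG", "售达方"]]
path = os.path.join(tempfile.mkdtemp(), "tpart_sample.csv")
write_csv_with_bom(path, rows)

# Check 1: the raw file starts with the UTF-8 BOM.
with open(path, "rb") as f:
    has_bom = f.read().startswith(codecs.BOM_UTF8)

# Check 2: reading the file back as UTF-8 returns the original text.
with open(path, "r", encoding="utf-8-sig", newline="") as f:
    round_trip = list(csv.reader(f))

print("BOM present:", has_bom)
print("round trip matches:", round_trip == rows)
```

The first check is a quick way to confirm that a generated .csv file carries the BOM before opening it in Excel.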
See the links below for more details on UTF-8 and file format properties.
What is UTF-8 https://www.freecodecamp.org/news/what-is-utf-8-character-encoding/
Input/Output options in the File Format editor
I hope you found this blog informative.