vim: Convert file encoding to UTF-8 or latin9

Convert a file's encoding to UTF-8 or latin1 (or, preferably, latin9).

:set fileencoding=utf8 (1)
:wq (2)

1	Set UTF-8 as the file encoding
2	Write the file and quit
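The same conversion can be scripted outside the editor. The sketch below is not the vim method described above: it swaps in `iconv`, and the helper name `convert_encoding` is hypothetical. Unlike vim, which detects the encoding when it opens the file, `iconv` has to be told the source encoding, so the sketch assumes you already know it.

```shell
#!/bin/sh
# convert_encoding FROM TO SRC DST
# Hypothetical helper (not from the original memo): re-encodes SRC into DST
# with iconv and leaves SRC untouched. Line endings are not modified.
convert_encoding() {
    # iconv exits with a non-zero status when a character of SRC
    # has no equivalent in the target character set.
    iconv -f "$1" -t "$2" "$3" > "$4"
}

# Example (commented out; assumes a.csv exists and is UTF-8 encoded):
# convert_encoding UTF-8 ISO-8859-15 a.csv a.latin9.csv
```

`ISO-8859-15` is the standard name of latin9. Writing to a separate file keeps the original available if the conversion fails halfway through.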

Switching to the latin9 file encoding and Unix line endings shrinks an ordinary French database dump by about 1.1%.

$ file a.csv
a.csv: UTF-8 Unicode text, with very long lines, with CRLF line terminators
$ ls -l
-rw-r--r-- 1 grim grim 10394653 nov.  11 23:11 a.csv
$ vim a.csv
:set fileencoding=latin9
:wq
$ ls -l
-rw-r--r-- 1 grim grim 10299473 nov.  11 23:11 a.csv (1)
$ vim a.csv
:set ff=unix
:wq
$ ls -l
-rw-r--r-- 1 grim grim 10279957 nov.  11 23:12 a.csv (2)
$ file a.csv
a.csv: Non-ISO extended-ASCII text, with very long lines

1	Writing a French database dump of user records (first names, last names, addresses…) in `latin9` saves about 0.9% of the file size.
2	Writing the file with `unix` line endings (one `LF` character instead of two: `CR+LF`) saves about 0.2% more, for a total of about 1.1%; here that is roughly 115 kB on a 10 MB file.
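The percentages can be checked from the byte counts printed by `ls -l` in the listing. This is a minimal sketch; the function name `saving_pct` is made up for illustration, and it relies only on `awk`.

```shell
#!/bin/sh
# saving_pct BEFORE AFTER
# Hypothetical helper: prints how much smaller AFTER is than BEFORE,
# as a percentage of BEFORE, with two decimals.
saving_pct() {
    # LC_ALL=C forces "." as the decimal separator regardless of locale.
    LC_ALL=C awk -v before="$1" -v after="$2" \
        'BEGIN { printf "%.2f\n", (before - after) / before * 100 }'
}

# Sizes taken from the ls -l output in the listing:
# saving_pct 10394653 10299473   # UTF-8 -> latin9: prints 0.92
# saving_pct 10299473 10279957   # CRLF  -> LF:     prints 0.19
# saving_pct 10394653 10279957   # both steps:      prints 1.10
```

The total is 10394653 - 10279957 = 114696 bytes saved.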

Dedicated compression tools save far more:

$ gzip a.csv
$ ls -l
-rw-r--r-- 1 grim grim 1045070 nov.  11 23:14 a.csv.gz (1)
$ gunzip a.csv.gz
$ tar cJf a.csv.tar.xz a.csv
$ ls -l
-rw-r--r-- 1 grim grim 10279957 nov.  11 23:12 a.csv
-rw-r--r-- 1 grim grim 447556 nov.  11 23:15 a.csv.tar.xz (2)

1	Compressed with gzip, the file is about 10% of its uncompressed size.
2	Compressed with xz, the file is about 4% of its uncompressed size.
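To compare compressors without replacing the file or creating an archive, the compressed stream can be piped into `wc -c`. This is a sketch under assumptions: `compressed_size` is a hypothetical helper, and it expects a compressor that accepts `-c` to write to standard output, as `gzip`, `xz` and `bzip2` do.

```shell
#!/bin/sh
# compressed_size TOOL FILE
# Hypothetical helper: prints the size in bytes FILE would have once
# compressed with TOOL. FILE itself is left untouched.
compressed_size() {
    # -c writes the compressed stream to stdout instead of replacing FILE;
    # tr strips the padding some wc implementations add.
    "$1" -c "$2" | wc -c | tr -d ' '
}

# Example (commented out; assumes a.csv exists and both tools are installed):
# compressed_size gzip a.csv
# compressed_size xz a.csv
```

The figures will differ slightly from the `.tar.xz` size in the listing, since `tar` adds its own headers and padding before compression.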

More information (in French): Memo_8: Archives, compression et décompression de fichiers.

Grimoire-Command.es
