5 Methods to Convert xlsx Format Files to CSV on Linux CLI

XLSX is the file extension of the Office Open XML spreadsheet format used by Microsoft Excel. Converting an Excel sheet to a comma-separated values (CSV) file is easy from the command line. A typical case is when you receive an XLSX file and need to load its data into a database after some formatting. Several command-line tools can do this conversion.

1) Gnumeric spreadsheet program

Gnumeric is a spreadsheet program for Unix and Unix-like operating systems, distributed under the GNU General Public License. It saves its data to files that can be reopened in a later session. It can import and export spreadsheet data in multiple formats, including CSV, Microsoft Excel, HTML, OpenDocument, Quattro Pro, and LaTeX.

Gnumeric is not available in the default CentOS 7 repositories, so you must first install the lux-release package, which enables the lux repository that provides it. First, download it:

# wget http://repo.iotti.biz/CentOS/7/noarch/lux-release-7-1.noarch.rpm
--2017-10-13 23:32:19-- http://repo.iotti.biz/CentOS/7/noarch/lux-release-7-1.noarch.rpm
Resolving repo.iotti.biz (repo.iotti.biz)... 156.54.7.11
Connecting to repo.iotti.biz (repo.iotti.biz)|156.54.7.11|:80... connected.

Now you can install the lux-release package:

# rpm -Uvh lux-release-7-1.noarch.rpm 
warning: lux-release-7-1.noarch.rpm: Header V4 DSA/SHA1 Signature, key ID 53e4e7a9: NOKEY
Preparing... ################################# [100%]
Updating / installing...
 1:lux-release-7-1 ################################# [100%]

With lux-release installed, you can now install gnumeric with yum:

# yum install gnumeric
Loaded plugins: fastestmirror, langpacks
lux | 2.9 kB 00:00:00 
lux/7/primary_db | 1.0 MB 00:00:05 
Loading mirror speeds from cached hostfile
 * base: ftp.hosteurope.de
 * epel: mirror.liquidtelecom.com
 * extras: ftp.hosteurope.de
 * updates: ftp.hosteurope.de
Resolving Dependencies
--> Running transaction check
---> Package gnumeric.x86_64 1:1.10.10-2.el7.lux.1 will be installed

Now you can use the ssconvert command that comes with Gnumeric to convert the file:

# ssconvert book.xlsx file.csv
Using exporter Gnumeric_stf:stf_csv

You can now view the file:

# cat file.csv 
fichier,
paul,
nathan,couvert
couloir,file
road,
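
ssconvert can also write one CSV file per sheet instead of a single file. The following is a minimal sketch, assuming the installed ssconvert supports the -S (--export-file-per-sheet) option and the %n placeholder for the sheet number in the output name. The function name xlsx_sheets_to_csv and the file naming scheme are hypothetical, not part of Gnumeric.

```shell
# Hypothetical helper: export every sheet of a workbook to its own CSV file.
# $1 = workbook (e.g. book.xlsx), $2 = output directory (defaults to .)
xlsx_sheets_to_csv() {
    workbook=$1
    out_dir=${2:-.}
    if ! command -v ssconvert >/dev/null 2>&1; then
        echo "ssconvert not found; install gnumeric first" >&2
        return 127
    fi
    mkdir -p "$out_dir" || return 1
    base=$(basename "$workbook" .xlsx)
    # -S writes one file per sheet; %n in the output template
    # is replaced by the sheet number.
    ssconvert -S "$workbook" "$out_dir/${base}_%n.csv"
}

# Example call (not run here):
#   xlsx_sheets_to_csv book.xlsx conv
```
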

2) xlsx2csv converter

xlsx2csv is a Python application that can convert a batch of XLSX files to CSV format. You can specify exactly which sheets to convert: if the workbook has multiple sheets, xlsx2csv can export all of them at once or one at a time.

To install it, you need Python already installed. Then proceed as below (easy_install is deprecated on current systems, where pip install xlsx2csv is the usual equivalent):

# easy_install xlsx2csv
Searching for xlsx2csv
Reading https://pypi.python.org/simple/xlsx2csv/
Best match: xlsx2csv 0.7.3
Downloading https://pypi.python.org/packages/4c/56/4c7f595525839710ab563c8e5a48226021111c1324b1460e603256f7665c/xlsx2csv-0.7.3.tar.gz#md5=b9cffbbe815259987237135f99658c63
Processing xlsx2csv-0.7.3.tar.gz

Now you can convert your xlsx file:

# xlsx2csv book.xlsx > convert.csv

You can check the content of the file

# cat convert.csv 
fichier,
paul,
nathan,couvert
couloir,file
road,

By default, the xlsx2csv command converts only the first sheet, even if your file contains multiple sheets. Fortunately, it lets you convert all the sheets or choose the one to convert. Here are some useful parameters:

  • -a, --all to export all sheets
  • -d DELIMITER for columns delimiter in csv
  • -p SHEETDELIMITER for sheet delimiter used to separate sheets, pass '' if you do not need delimiter, or 'x07' or '\f' for form feed (default: '--------')
  • -s SHEETID for the sheet number to convert

For example, if you want to convert only a specific sheet

# xlsx2csv class.xlsx -s 2 > sheet2.csv

You can check

# cat sheet2.csv 
sheet2
take
linux
centos
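
The -s and -d options listed earlier can be combined, for example to export one sheet with semicolons instead of commas. This is only a sketch: the helper name sheet_to_delimited and the default delimiter are assumptions made for illustration.

```shell
# Hypothetical helper: export one sheet with a custom column delimiter.
# $1 = workbook, $2 = sheet number, $3 = delimiter (defaults to ';')
sheet_to_delimited() {
    workbook=$1
    sheet=$2
    delim=${3:-";"}
    # -s selects the sheet number, -d sets the column delimiter.
    xlsx2csv -s "$sheet" -d "$delim" "$workbook"
}

# Example call (not run here):
#   sheet_to_delimited class.xlsx 2 > sheet2_semicolon.csv
```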

Now, if you want to convert all the sheets, you can do as below:

# xlsx2csv class.xlsx --all > allsheet.csv

You can check the content as below

# cat allsheet.csv 
-------- 1 - Sheet1
fichier
road
-------- 2 - Sheet2
sheet2
take
linux
centos
-------- 3 - Sheet3
devops
script
lxd

You can see that the default sheet delimiter line shows where each sheet begins, along with its number and name.
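
Because each sheet starts with a delimiter line of the form "-------- N - Name", the combined file can be split back into one CSV per sheet. Below is a minimal sketch using awk. The function name split_sheets is hypothetical, and it assumes the default delimiter and that no data row begins with the same pattern.

```shell
# Hypothetical helper: split the output of `xlsx2csv --all` into one
# file per sheet, named after the sheet (Sheet1.csv, Sheet2.csv, ...).
# $1 = combined CSV, $2 = output directory (defaults to .)
split_sheets() {
    in_file=$1
    out_dir=${2:-.}
    mkdir -p "$out_dir" || return 1
    awk -v dir="$out_dir" '
        # A delimiter line looks like: "-------- 2 - Sheet2"
        /^-------- [0-9]+ - / {
            if (out != "") close(out)
            name = $0
            sub(/^-------- [0-9]+ - /, "", name)
            out = dir "/" name ".csv"
            printf "" > out     # create the file even if the sheet is empty
            next
        }
        # Every other line belongs to the sheet currently being written.
        out != "" { print > out }
    ' "$in_file"
}

# Example call (not run here):
#   split_sheets allsheet.csv sheets
```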

3) csvkit tool

csvkit is a Python library for working with CSV files. It is a nice tool to manipulate, organize and analyze data in CSV format, and it is light and fast. It is used from the terminal; its in2csv command converts a variety of common file formats, including xls, xlsx and fixed-width, into CSV. Install it with pip:

# pip install csvkit
Collecting csvkit
 Using cached csvkit-1.0.2.tar.gz
Collecting agate>=1.6.0 (from csvkit)

Now you can convert as below:

# in2csv Classeur2.xlsx > book3.csv
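
Since in2csv writes to standard output, converting a whole directory takes only a small loop. The sketch below is one way to do it. The function name convert_dir_with_in2csv is hypothetical, and it assumes each CSV should be written next to its source file.

```shell
# Hypothetical helper: convert every .xlsx file in a directory with in2csv.
# Each book.xlsx gets a sibling book.csv.
# $1 = directory to scan (defaults to .)
convert_dir_with_in2csv() {
    dir=${1:-.}
    for f in "$dir"/*.xlsx; do
        # Skip the literal pattern when no .xlsx file matches.
        [ -e "$f" ] || continue
        target="${f%.xlsx}.csv"
        if in2csv "$f" > "$target"; then
            echo "converted: $f"
        else
            # Do not leave a partial file behind on failure.
            rm -f "$target"
            echo "failed: $f" >&2
        fi
    done
}

# Example call (not run here):
#   convert_dir_with_in2csv /home/admin/Desktop
```
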

4) unoconv

unoconv is a command-line program that performs format conversions using OpenOffice or LibreOffice. Depending on your distribution, it is installed along with the office suite or as a separate unoconv package. You can display its built-in help:

# unoconv --help
usage: unoconv [options] file [file2 ..]

Convert from and to any format supported by LibreOffice

unoconv options:
  -c, --connection=string  use a custom connection string
  -d, --doctype=type       specify document type
                             (document, graphics, presentation, spreadsheet)
  -e, --export=name=value  set export filter options
                             eg. -e PageRange=1-2
  -f, --format=format      specify the output format
  -i, --import=string      set import filter option string
                             eg. -i utf8
  -l, --listener           start a permanent listener to use by unoconv clients
  -n, --no-launch          fail if no listener is found (default: launch one)
  -o, --output=name        output basename, filename or directory
      --pipe=name          alternative method of connection using a pipe
  -p, --port=port          specify the port (default: 2002)
                             to be used by client or listener
      --password=string    provide a password to decrypt the document
  -s, --server=server      specify the server address (default: 127.0.0.1)
                             to be used by client or listener
      --show               list the available output formats
      --stdout             write output to stdout
  -t, --template=file      import the styles from template (.ott)
  -T, --timeout=secs       timeout after secs if connection to listener fails
  -v, --verbose            be more and more verbose (-vvv for debugging)

The command can convert between various file formats. By default, it converts to PDF, so you should specify the desired format explicitly to avoid an unwanted one. To convert to CSV with unoconv, you need two main parameters:

  • -f to indicate the format of the output file
  • -o to indicate the name and path of the converted file

# unoconv -f csv -o class2.csv Classeur2.xlsx

You can check the content

# cat class2.csv 
fichier,
,
,couvert
,file
road,

Note that the second row of the original xlsx file is empty, which is why the second line of the csv file contains only a comma.
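
The --stdout option shown in the help output makes it possible to preview a conversion without writing a file. The following is a small sketch. The function name xlsx_head and its default of five rows are assumptions made for illustration.

```shell
# Hypothetical helper: print the first rows of a workbook as CSV.
# $1 = workbook, $2 = number of rows to show (defaults to 5)
xlsx_head() {
    file=$1
    rows=${2:-5}
    # -f csv selects the output format; --stdout sends the result to the
    # terminal instead of a file, so it can be piped to head.
    unoconv -f csv --stdout "$file" | head -n "$rows"
}

# Example call (not run here):
#   xlsx_head Classeur2.xlsx 3
```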

5) LibreOffice headless

When you start LibreOffice from the command line, you can pass various parameters that influence its behavior. One of them is headless mode, which launches LibreOffice without any graphical interface component. In this mode you can convert files to other formats, including xlsx to csv. Indicate the target format (csv) with the --convert-to parameter, followed by the file to convert, as below:

# libreoffice --headless --convert-to csv book.xlsx --outdir conv/
convert /home/admin/Desktop/book.xlsx -> /home/admin/Desktop/conv/book.csv using filter : Text - txt - csv (StarCalc)

Now you can check the file

# cat conv/book.csv 
fichier,
paul,
nathan,couvert
couloir,file
road,

You can also convert several xlsx files at once, as below:

# libreoffice --headless --convert-to csv --outdir conv/ *.xlsx
convert /home/admin/Desktop/book.xlsx -> /home/admin/Desktop/conv//book.csv using filter : Text - txt - csv (StarCalc)
convert /home/admin/Desktop/Classeur1.xlsx -> /home/admin/Desktop/conv//Classeur1.csv using filter : Text - txt - csv (StarCalc)
convert /home/admin/Desktop/Classeur2.xlsx -> /home/admin/Desktop/conv//Classeur2.csv using filter : Text - txt - csv (StarCalc)
convert /home/admin/Desktop/class.xlsx -> /home/admin/Desktop/conv//class.csv using filter : Text - txt - csv (StarCalc)

You can list the converted files as below:

[root@centos7-srv Desktop]# ls conv
book.csv class.csv Classeur1.csv Classeur2.csv

You can check the content of one file

# cat conv/Classeur2.csv 
fichier,
,
,couvert
,file
road,
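
The filter name printed in the output above, Text - txt - csv (StarCalc), can also be given explicitly to --convert-to, followed by filter options that control the separator and encoding. The sketch below assumes the usual token order of field separator, text delimiter, character set and first line. The function name xlsx_to_semicolon_csv is hypothetical.

```shell
# Hypothetical helper: convert a workbook to a semicolon-separated,
# UTF-8 encoded CSV with LibreOffice in headless mode.
# $1 = workbook, $2 = output directory (defaults to .)
xlsx_to_semicolon_csv() {
    workbook=$1
    out_dir=${2:-.}
    # Filter options: 59 = ';' as field separator, 34 = '"' as text
    # delimiter, 76 = UTF-8 character set, 1 = start at the first row.
    libreoffice --headless \
        --convert-to 'csv:Text - txt - csv (StarCalc):59,34,76,1' \
        --outdir "$out_dir" "$workbook"
}

# Example call (not run here):
#   xlsx_to_semicolon_csv book.xlsx conv
```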

We have seen the different tools available on Linux to convert an xlsx file to csv on the command line. With unoconv and LibreOffice headless you can also convert the file to other formats, such as ods or pdf. The Miller tool is also worth trying: it converts between text formats such as CSV, TSV and JSON, and more.
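
To tie the methods together, a script can simply use whichever converter is installed. This is a minimal sketch, not a recommendation of one tool over another. The function name xlsx_to_csv and the order of preference are arbitrary assumptions. LibreOffice headless is left out because it names the output file itself.

```shell
# Hypothetical helper: convert one workbook with the first tool found.
# $1 = source .xlsx file, $2 = destination .csv file
xlsx_to_csv() {
    src=$1
    dst=$2
    if command -v xlsx2csv >/dev/null 2>&1; then
        xlsx2csv "$src" > "$dst"
    elif command -v in2csv >/dev/null 2>&1; then
        in2csv "$src" > "$dst"
    elif command -v ssconvert >/dev/null 2>&1; then
        ssconvert "$src" "$dst"
    elif command -v unoconv >/dev/null 2>&1; then
        unoconv -f csv -o "$dst" "$src"
    else
        echo "no xlsx converter found" >&2
        return 127
    fi
}

# Example call (not run here):
#   xlsx_to_csv book.xlsx book.csv
```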

Ref From: linoxide