SQL to CSV via Python, with Headers
Nifty Python code that converts a SQL database file into a CSV file and includes column headers.
There are quite a few Python scripts available that will convert a SQL database file into a CSV, but I have found the below code to be quite good. It is based on Till Hoffman’s code with modifications to extract column names.
Column names are encoded in the cursor.description
parameter, but this actually contains a list of tuples, of which column_name
is the first. Therefore a colnames
parameter is created to allow just the column name to be added to the output as a column header in the CSV file.
The entire code is as follows:
import sqlite3
import pandas as pd
# Open the file
f = open('/path/to/output.csv', 'w')
# Create a connection and get a cursor
connection = sqlite3.connect('/path/to/database.sql')
cursor = connection.cursor()
# Execute the query
cursor.execute('select * from mydata')
# Get Header Names (without tuples)
colnames = [desc[0] for desc in cursor.description]
# Get data in batches
while True:
# Read the data
df = pd.DataFrame(cursor.fetchall())
# We are done if there are no data
if len(df) == 0:
break
# Let us write to the file
else:
df.to_csv(f, header=colnames)
# Clean up
f.close()
cursor.close()
connection.close()
Rather than hard-coding commands and variables into the script, it’s better to parse them as variables via the command line. To do this, I utilised the argparse utility and modified the script as follows:
import sqlite3
import pandas as pd
import argparse
def options():
parser = argparse.ArgumentParser(description="Convert a SQL file to a CSV file")
parser.add_argument("-i", "--input", help="Input SQL file.", required=True)
parser.add_argument("-o", "--output", help="Output CSV file.", required=False)
parser.add_argument("-c","--command", help="SQL command", required=True)
args = parser.parse_args()
return args
def main():
# Get options
args = options()
# Open the file
f = open(args.output, 'w')
# Create a connection and get a cursor
connection = sqlite3.connect(args.input)
cursor = connection.cursor()
# Execute the query
cursor.execute(args.command)
# Get Header Names (without tuples)
colnames = [desc[0] for desc in cursor.description]
# Get data in batches
while True:
# Read the data
df = pd.DataFrame(cursor.fetchall())
# We are done if there are no data
if len(df) == 0:
break
# Let us write to the file
else:
df.to_csv(f, header=colnames)
# Clean up
f.close()
cursor.close()
connection.close()
if __name__ == '__main__':
main()
Usage on the command line is as follows:
$ python /path/to/script.py -i "/path/to/database.sql" -o "/path/to/output.csv" -c "select * from mydata"
The -c
flag can be used to run advanced SQL commands such as a LEFT JOIN, if desired but must have a basic SQL query at a minimum. Note that the SQL commands must be wrapped in double quotes.
Comments
One response to “SQL to CSV via Python, with Headers”
Super nice and compact.!!Thanks