HOWTO - Import CSV to SQL

by Chris Olstrom

in HOWTO

CSV is a data exchange format that most software can export to, since it is a simple list of values separated by commas (hence the name). Spreadsheet programs like OpenOffice Calc, Microsoft Excel, Google Docs Spreadsheet and others commonly have an ‘Export to CSV’ option, often found under the File menu.

There are scripts that convert CSV to SQL, but most modern database systems support reading CSV data such as:

They all require different syntax to import CSV data, some examples of which are detailed here.

Laying Out the Data

This reference relies on a few assumptions. Firstly, it assumes you have obtained your data in CSV format. Secondly, good sense assumes you would not be doing this if you had a better way. Important metadata can be lost by using CSV as an intermediate format. Details like Foreign Keys, column constraints, and so on.

For the purposes of this guide, we will be using the following dataset.

"id","firstName","lastName"
12345,"Ingrid","Insert"
23456,"Eric","Example"
12543,"Dave","Dataset"

All examples that follow are based on this CSV set.

Creating the Table

A simple example of a table we may be importing the CSV file to.

CREATE TABLE employees (
id INT NOT NULL PRIMARY KEY,
firstName VARCHAR(32) NOT NULL,
lastName VARCHAR(48) NOT NULL
);

Importing the Data

This process isn’t terribly complicated, but it does vary according to the database system you are importing to.

MySQL

LOAD DATA INFILE '/path/to/employees.csv' INTO TABLE employees
FIELDS TERMINATED BY ',' ENCLOSED BY '"'
LINES TERMINATED BY '\n';

Microsoft SQL Server and Sybase ASE

BULK INSERT employees FROM ‘c:\path\to\employees.csv’ WITH (
FIELDTERMINATOR = ‘,’,
ROWTERMINATOR = ‘\n’
);
GO

PostgreSQL

COPY employees FROM '/path/to/employees.csv' DELIMITERS ',' CSV;

Oracle

Offhand I don’t know a way to import a CSV file into Oracle using only SQL or PL/SQL, but there’s a great article on Dev Shed that covers how to do it with the Oracle 10G XE administration interface. Here is a convenient link to the good part, since if you’re reading this, you have likely already exported the CSV data, and want to know how to import that CSV to SQL.

Alternatively, you could convert your CSV file to SQL INSERT statements, and then execute those INSERTS. An simple CSV to SQL Conversion Tool can do this effectively.

If anyone knows how to import CSV files into Oracle without using the admin GUI or an external tool, please share!

References

{ 0 comments }

Sometimes you need to replace text in more than one file.  Often a quick ‘find and replace’ with your favourite text editor (CTRL+\ in nano, for instance) can accomplish this, and when multiple files is a small number, it can be a viable method. When dealing with text replacement for large sets of files, a quick script could save you a considerable amount of time. A few examples of when you might do this are:

  • Changing footer links in a template.
  • Updating copyright information on a static HTML site.
  • Replacing contact information in many files.

One solution is very simple shell script. It requires sed (a classic text replacement utility), which should be available on any Linux or Unix system. We can feed it a very simple regular expression, dump the output to a temporary file, and overwrite the original with it.

Batch text replacement for multiple files with a shell script:

TMPFILE=./tmp.$$

for filename in *.txt; do
  sed 's/text_to_replace/something_to_replace_it_with/g' $filename > $TMPFILE
  mv $TMPFILE $filename # Comment this line out to not overwrite the original file.
fi

As mentioned above, this is just one way to accomplish this. Phil Dufault suggests two alternative methods (non-recursive and recursive)

Replacing one string with another in multiple files using sed:

sed -i 's,text,replacement,g' *.txt

Recursively substituting text in many files with find and sed:

find . -type f -iname '*.txt' -exec sed -i 's,this,that,g' {} +

{ 0 comments }

HOWTO - Resolve 32-Bit Dependencies on 64-Bit Ubuntu (or Debian) with getlibs

by Chris Olstrom

One of the more irritating aspects of running a 64-bit distribution is that there are still a large number of applications that are not compiled as native 64-bit binaries. This leads to dependence on 32-bit libraries, and managing these 32-bit dependencies can be a nightmare. There is no shortage of users who have introduced more [...]

Read the full article →

HOWTO - Setup Parental Controls with DansGuardian and Squid

by Chris Olstrom

Ignoring the dangers implicit with censorship, this guide should provide a simple way to filter what some may consider ‘objectionable’ content from the vast pit of corruption that is the web (this is not meant as a negative thing, mind you. Just a simple observation).
The configuration outlined here may also have the added benefit of [...]

Read the full article →

HOWTO - Fix Canonical Problems with .htaccess and mod_rewrite

by Chris Olstrom

Canonical problems occur when there is confusion over which version of a given document is ‘official’.
If a URI identifies a specific document, two different URIs are often treated as two different documents. If the contents of each are identical, it could be assumed that one is a copy of the other. In fact, if we [...]

Read the full article →

HOWTO - Dual WAN Setup under GNU/Linux

by Chris Olstrom

This article presents a reasonably straightforward approach to using two independent internet connections on one system. While the examples are for multiple ethernet connections, some minor modifications could apply the methods to a mixed ethernet/wireless system.
Note: The timestamp on this article may seem unusual. This was written before Slack/Tux was purchased, and relocated to this [...]

Read the full article →

HOWTO - Build .deb Packages from Source in Ubuntu (or Debian)

by Chris Olstrom

Sometimes, we need to build a package from source. This can be for performance reasons (architecture-specific optimizations), memory reasons (removing features we don’t need), or just for kicks. In Ubuntu (or any other Debian derivative, for that matter), this can often pose a problem, as such hand-built packages are no longer managed by our Package [...]

Read the full article →

HOWTO - Forcing SSL Connections with .htaccess and mod_rewrite

by Chris Olstrom

Warning! Forcing SSL use like this can cause more problems than it may be worth. Anything that is not SSL-aware will be unable to view your site. This can break compatibility with web services (see the example of FeedBurner below, including a workaround), search spiders (many may regard SSL-enabled content as private, and not index [...]

Read the full article →

Under New Management

by Chris Olstrom

Slack/Tux has been bought by Chris Olstrom.
Anyone familiar with the previous site will notice that the ad-farm has been taken down in favour of a quality content (in time, of course. I’m sure it looks rather barren at the moment). Though a large number of the links floating around out there indicate that Slackware-related content [...]

Read the full article →