I use to spend much time playing with data in order to import, export and aggregate it. MySQL is one on my primary source of data because stays behind many popular projects and is usually first choice also for custom solutions.
Recently I discovered some really useful unknown functions which help me to export complex data.
This function returns a string result with the concatenated non-
NULL values from a group. It returns
NULL if there are no non-
GROUP_CONCAT(test_score SEPARATOR ',')
GROUP BY student_name;
SELECT * INTO OUTFILE statement is intended primarily to let you very quickly dump a table to a text file on the server machine. It is really useful to export data as CSV directly from your master server. You can also use
DUMPFILE if you need a raw output.
SELECT a,b,a+b INTO OUTFILE '/tmp/result.txt'
FIELDS TERMINATED BY ','
OPTIONALLY ENCLOSED BY '"'
ESCAPED BY '"'
LINES TERMINATED BY '\n'
If you plan to use it with a standard CSV library you must refer to RFC 4180 for correct format in order to avoid reading errors.
If you database is bigger than you are able to manage you probably need Sqoop. It is is a tool designed for efficiently transferring bulk data between Apache Hadoop and structured datastores such as relational databases.
You can import data from you MySQL database to a CSV file stored on HDFS and access it from anywhere in your cluster.
sqoop import \
--connect jdbc:mysql://mysql.example.com/sqoop \
--username sqoop \
--password sqoop \