
I need to download locally (macOS Big Sur) a large dataset (millions of rows) from an AWS RDS database (MySQL 5.7). Thanks to this great post I am able to connect and download some data to my machine as a CSV file:

mysql --host=$HOST --user $USER --password=$PASSWORD --database=$DATABASE --port=$PORT --batch \
  --quick -e "$QUERY" \
  | sed $'s/\\t/","/g;s/^/"/;s/$/"/;s/\\n//g' > $FILE_PATH
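
For context, the shell variables are set beforehand along these lines (host, table, and query here are placeholders, not my real values):

HOST="mydb.xxxxxxxxxxxx.us-east-1.rds.amazonaws.com"
USER="reader"
PASSWORD="..."
DATABASE="mydb"
PORT=3306
QUERY="SELECT * FROM my_big_table"  # placeholder; the real query returns millions of rows
FILE_PATH="./export.csv"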

However, if I extend my query to thousands of records, the process stops after a few seconds and the CSV ends up truncated (literally, the last written row is cut off halfway), so I assume there is some kind of stream, timeout, or buffer issue.
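
For what it's worth, this is how the truncation shows up when I inspect the file:

wc -l $FILE_PATH        # far fewer rows than the query should return
tail -c 200 $FILE_PATH  # the last row is visibly cut off halfway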

These are the timeout-related variables on the server:

mysql> SHOW VARIABLES LIKE '%timeout';

+-----------------------------+----------+
| Variable_name               | Value    |
+-----------------------------+----------+
| connect_timeout             | 10       |
| delayed_insert_timeout      | 300      |
| have_statement_timeout      | YES      |
| innodb_flush_log_at_timeout | 1        |
| innodb_lock_wait_timeout    | 50       |
| innodb_rollback_on_timeout  | OFF      |
| interactive_timeout         | 28800    |
| lock_wait_timeout           | 31536000 |
| net_read_timeout            | 30       |
| net_write_timeout           | 60       |
| rpl_stop_slave_timeout      | 31536000 |
| slave_net_timeout           | 60       |
| wait_timeout                | 28800    |
+-----------------------------+----------+

Since the command stops after ~10 s, I assume it depends on the connect_timeout value. However, I tried setting it with SET @@GLOBAL.connect_timeout=7200 and I get a permission error. I also tried adding the --connect-timeout=7200 parameter to the command, but it does not help (which puzzles me the most).
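
For completeness, these are the exact attempts (the --connect-timeout variant runs, but the file is truncated just the same):

mysql> SET @@GLOBAL.connect_timeout=7200;
-- rejected with a permission error (RDS does not grant SUPER)

mysql --connect-timeout=7200 --host=$HOST --user $USER --password=$PASSWORD \
  --database=$DATABASE --port=$PORT --batch --quick -e "$QUERY" \
  | sed $'s/\\t/","/g;s/^/"/;s/$/"/;s/\\n//g' > $FILE_PATH
# same result: stops after ~10 s with a truncated CSV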

Running the query (limited to ~250k rows) in my client (Sequel Ace) works fine, so I can exclude issues with the data or the SQL script itself.

Any ideas or suggestions?
Or are there better tools for the job?

Stefano
