For remote files, you can use glob patterns to specify multiple files.

The `FILE_FORMAT` parameter supports different file types, each with specific formatting options. Below are the available options for each supported file format:

| Option | Description | Default |
|--------|-------------|--------|
| MISSING_FIELD_AS | How to handle missing fields | ERROR |

</TabItem>

<TabItem value="orc" label="ORC">

| Option | Description | Default |
|--------|-------------|--------|
| MISSING_FIELD_AS | How to handle missing fields | ERROR |

</TabItem>

<TabItem value="avro" label="AVRO">

| Option | Description | Default |
|--------|-------------|--------|
| MISSING_FIELD_AS | How to handle missing fields | ERROR |

</TabItem>

</Tabs>

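As a quick illustration of these options in use, here is a minimal sketch (it assumes `FIELD_DEFAULT` is accepted alongside the default `ERROR`, and the table and stage names are hypothetical):

```sql
-- Fill in missing fields with the column's default value instead of erroring
COPY INTO mytable
FROM @my_stage
FILE_FORMAT = (TYPE = ORC, MISSING_FIELD_AS = FIELD_DEFAULT);
```
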
## Copy Options

| Parameter | Description | Default |
|-----------|-------------|---------|
| RETURN_FAILED_ONLY | Whether the output lists only the files that failed to load | false |

If `RETURN_FAILED_ONLY` is set to `true`, the output will only contain the files that failed to load.

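As an illustration of combining copy options, here is a minimal sketch using `ON_ERROR` and `RETURN_FAILED_ONLY`, both of which appear on this page (the stage and table names are hypothetical):

```sql
-- Skip problem rows and report only the files that failed to load
COPY INTO mytable
FROM @my_stage
FILE_FORMAT = (TYPE = CSV, SKIP_HEADER = 1)
ON_ERROR = CONTINUE
RETURN_FAILED_ONLY = true;
```
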
## Examples
:::tip Best Practice
For external storage sources, it's recommended to use pre-created connections with the `CONNECTION_NAME` parameter instead of specifying credentials directly in the COPY statement. This approach provides better security, maintainability, and reusability. See [CREATE CONNECTION](../00-ddl/13-connection/create-connection.md) for details on creating connections.
:::
### Example 1: Loading from Stages
These examples showcase data loading into Databend from various types of stages:
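
For instance, a minimal sketch of loading from a named internal stage (the stage and table names are hypothetical; the stage examples in this section follow the same shape):

```sql
-- Create a named stage, upload data to it out of band, then load from it
CREATE STAGE my_internal_stage;

COPY INTO mytable
FROM @my_internal_stage/data.csv
FILE_FORMAT = (TYPE = CSV, SKIP_HEADER = 1);
```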

These examples showcase data loading into Databend from various types of external storage:

<Tabs groupId="external-example">
<TabItem value="Amazon S3" label="Amazon S3">

This example uses a pre-created connection to load 10 rows from a CSV file in Amazon S3:

```sql
-- First create a connection (you only need to do this once)
CREATE CONNECTION my_s3_conn
    STORAGE_TYPE = 's3'
    ACCESS_KEY_ID = '<your-access-key-ID>'
    SECRET_ACCESS_KEY = '<your-secret-access-key>';

-- Use the connection to load data
COPY INTO mytable
FROM 's3://mybucket/data.csv'
CONNECTION = (CONNECTION_NAME = 'my_s3_conn')
FILE_FORMAT = (
    TYPE = CSV,
    FIELD_DELIMITER = ',',
    RECORD_DELIMITER = '\n',
    SKIP_HEADER = 1
)
SIZE_LIMIT = 10;
```

**Using IAM Role (Recommended for Production)**

```sql
-- Create connection using IAM role (more secure, recommended for production)
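-- A sketch of the remainder (assumes the S3 connection accepts ROLE_ARN and
-- EXTERNAL_ID, mirroring the inline-credential form this replaces; the role
-- ARN, external ID, and connection name are placeholders)
CREATE CONNECTION iam_s3_conn
    STORAGE_TYPE = 's3'
    ROLE_ARN = '<your-role-arn>'
    EXTERNAL_ID = '<your-external-id>';

-- Load CSV files matching a pattern from the bucket via the connection
COPY INTO mytable
FROM 's3://mybucket/'
CONNECTION = (CONNECTION_NAME = 'iam_s3_conn')
PATTERN = '.*[.]csv'
FILE_FORMAT = (TYPE = CSV, SKIP_HEADER = 1);
```
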
This example demonstrates how to load CSV files from Amazon S3 using pattern matching with the PATTERN parameter. It filters files with 'sales' in their names and '.csv' extensions:

```sql
-- Create connection for pattern-based file loading
CREATE CONNECTION pattern_s3_conn
    STORAGE_TYPE = 's3'
    ACCESS_KEY_ID = '<your-access-key-ID>'
    SECRET_ACCESS_KEY = '<your-secret-access-key>';

-- Load CSV files with 'sales' in their names using pattern matching
COPY INTO mytable
FROM 's3://mybucket/'
CONNECTION = (CONNECTION_NAME = 'pattern_s3_conn')
PATTERN = '.*sales.*[.]csv'
FILE_FORMAT = (
    TYPE = CSV,
    FIELD_DELIMITER = ',',
    RECORD_DELIMITER = '\n',
    SKIP_HEADER = 1
);
```

Where `.*` is interpreted as zero or more occurrences of any character. The square brackets escape the period character `.` that precedes a file extension.
To load from all the CSV files using a connection:

```sql
COPY INTO mytable
FROM 's3://mybucket/'
CONNECTION = (CONNECTION_NAME = 'pattern_s3_conn')
PATTERN = '.*[.]csv'
FILE_FORMAT = (
    TYPE = CSV,
    FIELD_DELIMITER = ',',
    RECORD_DELIMITER = '\n',
    SKIP_HEADER = 1
);
```
532
463
533
When specifying the pattern for a file path including multiple folders, consider your matching criteria:
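
For example, here is a hedged sketch contrasting a path-segment match with the substring-style patterns above (the bucket layout, connection, and table names are hypothetical):

```sql
-- Match CSV files that sit under a 'sales' folder at any depth,
-- e.g. 2023/sales/data.csv, rather than any path merely containing 'sales'
COPY INTO mytable
FROM 's3://mybucket/'
CONNECTION = (CONNECTION_NAME = 'pattern_s3_conn')
PATTERN = '.*/sales/.*[.]csv'
FILE_FORMAT = (TYPE = CSV, SKIP_HEADER = 1);
```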
An error would occur when attempting to load the data into a table:

```sql
COPY INTO t2 FROM @~/invalid_json_string.parquet FILE_FORMAT = (TYPE = PARQUET) ON_ERROR = CONTINUE;
error: APIError: ResponseError with 1006: EOF while parsing a value, pos 3 while evaluating function `parse_json('[1,')`
```
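
If the goal is to land the raw text anyway, one possible workaround (a sketch only, assuming the Parquet column is a plain string whose name matches the target column) is to load into a VARCHAR column so that no `parse_json` runs during the COPY:

```sql
-- Hypothetical workaround: target a VARCHAR column instead of VARIANT,
-- so the invalid JSON is stored as-is rather than parsed during loading
CREATE OR REPLACE TABLE t3 (a VARCHAR);

COPY INTO t3 FROM @~/invalid_json_string.parquet
FILE_FORMAT = (TYPE = PARQUET) ON_ERROR = CONTINUE;
```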