
I'm running some large jobs in Databricks, which for now include inventorying the data lake. I'm trying to print all blob names within a prefix (sub-folder). There are a lot of files in these sub-folders. I get about 280 rows of file names printed, but then I see this:

*** WARNING: skipped 494256 bytes of output ***

Then I get another 280 rows printed.

I'm guessing there is a control to change this, right? I certainly hope so. This platform is designed to work with BIG data, not ~280 records. I understand that huge data sets can easily crash a browser, but come on, this is basically nothing.
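For context, the listing code is essentially this (a sketch; the mount path is illustrative):

# List everything under a prefix and print each file name (path is illustrative)
files = dbutils.fs.ls("/mnt/datalake/some-prefix/")
for f in files:
    print(f.name)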

ASH

2 Answers


Note: Using the GUI, you can download full results (max 1 million rows).


To download full results (more than 1 million rows), first save the file to DBFS and then copy it to your local machine using the Databricks CLI as follows.

dbfs cp "dbfs:/FileStore/tables/AA.csv" "A:\AzureAnalytics"
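The "save to DBFS" half might look like this (a sketch, assuming df holds the results; /dbfs is the FUSE mount of DBFS on the driver):

# Collect to the driver and write a single CSV file to DBFS (illustrative path)
# Caveat: toPandas() pulls everything to the driver, so this suits result sets
# that fit in driver memory, not the raw data itself
df.toPandas().to_csv("/dbfs/FileStore/tables/AA.csv", index=False)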

Reference: Databricks file system

The DBFS command-line interface (CLI) uses the DBFS API to expose an easy-to-use command-line interface to DBFS. Using this client, you can interact with DBFS with commands similar to those you use on a Unix command line. For example:

# List files in DBFS
dbfs ls
# Put local file ./apple.txt to dbfs:/apple.txt
dbfs cp ./apple.txt dbfs:/apple.txt
# Get dbfs:/apple.txt and save to local file ./apple.txt
dbfs cp dbfs:/apple.txt ./apple.txt
# Recursively put local dir ./banana to dbfs:/banana
dbfs cp -r ./banana dbfs:/banana

Reference: Installing and configuring Azure Databricks CLI
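For completeness, installing and configuring the (legacy) Databricks CLI is roughly:

# Install the CLI, then configure it with your workspace URL and a personal access token
pip install databricks-cli
databricks configure --token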

Hope this helps.

CHEEKATLAPRADEEP-MSFT

After a little more research, I stumbled on something that worked for me: display(). It renders the contents of a DataFrame as an interactive, scrollable table in the notebook,

display(df)

so the results show up as a proper grid instead of raw printed output that gets truncated.
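Applied to the original inventory problem, the file listing can go into a DataFrame and be displayed rather than printed (a sketch; the path and column choices are illustrative):

# Build a DataFrame from the file listing and render it with display()
files = dbutils.fs.ls("/mnt/datalake/some-prefix/")
df = spark.createDataFrame([(f.path, f.name, f.size) for f in files], ["path", "name", "size"])
display(df)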

ASH