I am new to Apache Spark. I created a schema and a DataFrame, and it shows me the result, but the output is messy and I can hardly read the lines. I want to show my result in pandas format instead, but I don't know how. I have attached a screenshot of my DataFrame output.

Here's my code

from pyspark.sql.types import StructType, StructField, StringType, FloatType
from IPython.display import display
import pandas as pd

schema = StructType([StructField("crimeid", StringType(), True), 
                     StructField("Month", StringType(), True), 
                     StructField("Reported_by", StringType(), True),
                     StructField("Falls_within", StringType(), True), 
                     StructField("Longitude", FloatType(), True), 
                     StructField("Latitue", FloatType(), True), 
                     StructField("Location", StringType(), True),
                     StructField("LSOA_code", StringType(), True),
                     StructField("LSOA_name", StringType(), True),
                     StructField("Crime_type", StringType(), True),
                     StructField("Outcome_type", StringType(), True),
                    ])

df = spark.read.csv("crimes.gz",header=False,schema=schema)
df.printSchema()

PATH = "crimes.gz"
csvfile = (
    spark.read.format("csv")
    .option("header", "false")
    .schema(schema)
    .load(PATH)
)
# show() prints the rows and returns None, so there is nothing to assign
csvfile.show()

It shows the result as in the attached screenshot, but I want this data in pandas form.

Thanks

  • Does this answer your question? [Convert a spark DataFrame to pandas DF](https://stackoverflow.com/questions/50958721/convert-a-spark-dataframe-to-pandas-df) – SMaZ Dec 07 '20 at 22:50
  • You can also just paste it into any editor or Excel and it won't wrap. – jayrythium Dec 08 '20 at 13:40
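
Following the linked question, one possible approach is to convert a small slice of the Spark DataFrame to pandas and let the notebook render it as a table. The sketch below is only an illustration: the helper name `preview_as_pandas` and the default of 20 rows are assumptions, not part of the original code. It relies on the `limit()` and `toPandas()` methods of a PySpark DataFrame.

```python
def preview_as_pandas(spark_df, n=20):
    """Return the first n rows of a Spark DataFrame as a pandas DataFrame.

    limit() keeps the result small, because toPandas() collects every
    remaining row into the driver's memory.
    """
    return spark_df.limit(n).toPandas()


# Usage in a notebook, assuming `df` is the Spark DataFrame read above:
#   from IPython.display import display
#   display(preview_as_pandas(df))
```

If staying in Spark is preferred, `df.show(truncate=False, vertical=True)` prints one field per line, which may also avoid the wrapped rows.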
