Cache and Persist in Pyspark

  • Kiran Dalvi
  • 12 Mar, 2023

import pyspark
from pyspark.sql import SparkSession

# Start (or reuse) a local Spark session that runs on a single worker thread
spark = SparkSession.builder.master("local[1]") \
    .appName("SparkByExamples.com") \
    .getOrCreate()

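df = spark.read.csv(r"C:\Users\ankus\PycharmProjects\DecHadoop\Resources\region_country.csv")
print(df.is_cached)  # False: nothing has been cached yet
# cache() is lazy: it only marks the DataFrame for caching; data is stored once an action runs
df.cache()
print(df.is_cached)  # True
# The DataFrame is already cached, so this persist() call keeps the existing cache entry
# (Spark logs a warning); call df.unpersist() first to switch to another storage level
df.persist(pyspark.storagelevel.StorageLevel.MEMORY_AND_DISK)
print(df.is_cached)  # True
df.show()  # show() is an action, so the cached data starts being populated here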