Cache and Persist in Pyspark - Learnomate Technologies

ANKUSH THAVALI
12 Mar, 2023

Cache and Persist in Pyspark

import pyspark
from pyspark.sql import SparkSession

# Start a local Spark session with a single worker thread.
spark = (
    SparkSession.builder.master("local[1]")
    .appName("SparkByExamples.com")
    .getOrCreate()
)

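# Read the CSV into a DataFrame (the path is specific to the author's machine).
df = spark.read.csv(r"C:\Users\ankus\PycharmProjects\DecHadoop\Resources\region_country.csv")
print(df.is_cached)  # False: nothing has been marked for caching yet

# cache() marks the DataFrame for caching at the default storage level.
df.cache()
print(df.is_cached)  # True

# persist() lets you name the storage level explicitly.
df.persist(pyspark.storagelevel.StorageLevel.MEMORY_AND_DISK)
print(df.is_cached)  # True

# Caching is lazy: the data is materialized only when an action such as show() runs.
df.show()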