Cache and Persist in Pyspark
ANKUSH THAVALI · 12 Mar, 2023

Both cache() and persist() mark a DataFrame so that, once an action computes it, the result is kept and reused by later actions instead of being recomputed from the source. cache() uses the default storage level (MEMORY_AND_DISK for DataFrames), while persist() lets you pass a storage level explicitly. The is_cached property reports whether the DataFrame is currently marked for caching.

```python
import pyspark
from pyspark.sql import SparkSession

spark = SparkSession.builder.master("local[1]") \
    .appName("SparkByExamples.com") \
    .getOrCreate()

df = spark.read.csv(r"C:\Users\ankus\PycharmProjects\DecHadoop\Resources\region_country.csv")
print(df.is_cached)   # False -- nothing is cached yet

# cache() marks the DataFrame with the default storage level
df.cache()
print(df.is_cached)   # True

# persist() takes an explicit storage level; use pyspark.StorageLevel,
# not pyspark.storagelevel.StorageLevel as in the original listing.
# Persisting an already-cached DataFrame with the same level is a no-op
# (Spark logs a warning); unpersist() first to change the level.
df.persist(pyspark.StorageLevel.MEMORY_AND_DISK)
print(df.is_cached)   # True

df.show()
```