
PySpark and Amazon S3 Integration

  • 03 Mar, 2023

The snippet below creates a SparkSession, passes AWS credentials to the s3a connector through the Hadoop configuration, and reads a CSV file from an S3 bucket into a DataFrame.

# Create a SparkSession from the builder
import pyspark
from pyspark.sql import SparkSession

spark = SparkSession.builder \
    .master("local[6]") \
    .appName('SparkByExamples.com') \
    .getOrCreate()

# Get the SparkContext that backs the session
sc = spark.sparkContext

# AWS credentials for the bucket (in practice these should come from the
# environment or a credentials provider rather than being hard-coded)
accessKeyId = 'AKIAYH7INJR2EIYR422G'
secretAccessKey = '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'

# Pass the credentials to the Hadoop configuration used by the s3a:// connector
hadoopConf = sc._jsc.hadoopConfiguration()
hadoopConf.set('fs.s3a.access.key', accessKeyId)
hadoopConf.set('fs.s3a.secret.key', secretAccessKey)

# Read a CSV file from S3 into a DataFrame and show the first rows
s3_df = spark.read.csv('s3a://learnomate.spark/partition_zipcodes20.csv', header=True, inferSchema=True)
s3_df.show(5)
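
If you prefer not to go through the private sc._jsc handle, the same fs.s3a.* settings can usually be supplied on the SparkSession builder with the spark.hadoop. prefix. A minimal sketch, reusing the app name, bucket path, and credential variables from the snippet above:

# Alternative: set the s3a credentials on the builder via the spark.hadoop. prefix
from pyspark.sql import SparkSession

spark = SparkSession.builder \
    .master("local[6]") \
    .appName('SparkByExamples.com') \
    .config('spark.hadoop.fs.s3a.access.key', accessKeyId) \
    .config('spark.hadoop.fs.s3a.secret.key', secretAccessKey) \
    .getOrCreate()

s3_df = spark.read.csv('s3a://learnomate.spark/partition_zipcodes20.csv', header=True, inferSchema=True)
s3_df.show(5)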
To run the script, submit it with spark-submit and pass the AWS SDK and hadoop-aws jars through the --jars option (comma-separated, with no space after the comma):

spark-submit --jars C:\Users\Acer\Downloads\aws-java-sdk-1.7.4.jar,C:\Users\Acer\Downloads\hadoop-aws-2.7.7.jar C:\Users\Acer\PycharmProjects\DecHadoopBatch\readS3.py
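
If downloading the jars by hand is inconvenient, spark-submit can also resolve them from Maven Central with the --packages option instead of --jars. A minimal sketch, assuming the org.apache.hadoop:hadoop-aws:2.7.7 and com.amazonaws:aws-java-sdk:1.7.4 coordinates match the jar versions used above:

spark-submit --packages org.apache.hadoop:hadoop-aws:2.7.7,com.amazonaws:aws-java-sdk:1.7.4 C:\Users\Acer\PycharmProjects\DecHadoopBatch\readS3.py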