
PySpark and Amazon S3 Integration

  • 03 Mar, 2023


# Create SparkSession from builder
import pyspark
from pyspark.sql import SparkSession

spark = SparkSession.builder.master("local[6]") \
    .appName('SparkByExamples.com') \
    .getOrCreate()
sc = spark.sparkContext

# AWS credentials for the bucket (shown as placeholders; substitute your own keys)
accessKeyId = 'YOUR_ACCESS_KEY_ID'
secretAccessKey = 'YOUR_SECRET_ACCESS_KEY'

# Hand the keys to the Hadoop S3A connector
hadoopConf = sc._jsc.hadoopConfiguration()
hadoopConf.set('fs.s3a.access.key', accessKeyId)
hadoopConf.set('fs.s3a.secret.key', secretAccessKey)

# Read a CSV file straight from the S3 bucket
s3_df = spark.read.csv('s3a://learnomate.spark/partition_zipcodes20.csv',
                       header=True, inferSchema=True)
s3_df.show(5)
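Hard-coding keys inside the script is risky; a safer pattern is to read them from environment variables at runtime. A minimal sketch (the helper function and its error message are our own illustration, but AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY are the standard variable names AWS tooling uses):

```python
import os

# Pull AWS credentials from the environment instead of hard-coding them
# in the script. The helper name is illustrative, not part of PySpark.
def get_aws_credentials():
    access_key = os.environ.get('AWS_ACCESS_KEY_ID')
    secret_key = os.environ.get('AWS_SECRET_ACCESS_KEY')
    if not access_key or not secret_key:
        raise RuntimeError('AWS credentials not set in the environment')
    return access_key, secret_key
```

The returned pair can then be passed to the same `hadoopConf.set('fs.s3a.access.key', ...)` calls shown above.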
Spark submit (the jar list after --jars must be comma-separated with no spaces):

spark-submit --jars C:\Users\Acer\Downloads\aws-java-sdk-1.7.4.jar,C:\Users\Acer\Downloads\hadoop-aws-2.7.7.jar C:\Users\Acer\PycharmProjects\DecHadoopBatch\readS3.py
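Instead of downloading the connector jars by hand, spark-submit can also resolve them from Maven Central with --packages. A sketch of that alternative (the coordinates shown assume a Hadoop 2.7.x build, matching the jars above; align the versions with your own Hadoop distribution):

```shell
spark-submit --packages org.apache.hadoop:hadoop-aws:2.7.7,com.amazonaws:aws-java-sdk:1.7.4 C:\Users\Acer\PycharmProjects\DecHadoopBatch\readS3.py
```

This lets the dependency resolver fetch matching jars into the local ivy cache, so the script no longer depends on files sitting in a Downloads folder.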