Connect Oracle and Mysql database with Spark | RDBMS to Spark Connection

Breadcrumb Abstract Shape
Breadcrumb Abstract Shape
Breadcrumb Abstract Shape
Breadcrumb Abstract Shape
Breadcrumb Abstract Shape
Breadcrumb Abstract Shape
  • User AvatarANKUSH THAVALI
  • 02 Mar, 2022
  • 0 Comments
  • 1 Min Read

Connect Oracle and Mysql database with Spark | RDBMS to Spark Connection

Connection from Spark to Oracle Database

Download ojdbc jar connector

Code to connect oracle to spark

import os
from pyspark.sql import SparkSession
spark = SparkSession.builder.master("local[1]") \
    .appName('SplitFile') \
    .config("spark.driver.extraClassPath",r'C:\Users\ankus\OneDrive\Desktop\ojdbc8.jar') \
    .getOrCreate()
os.environ['ORACLE_SID'] = "XE"
empDF = spark.read \
    .format("jdbc") \
    .option("url", "jdbc:oracle:thin:system/Oracle1234@//127.0.0.1:1521/XE") \
    .option("dbtable", "system.testtable") \
    .option("user", "system") \
    .option("password", "Oracle1234") \
    .option("driver", "oracle.jdbc.driver.OracleDriver") \
    .load()
empDF.show()

Spark submit with jar file

spark-2.1.0-bin-hadoop2.7/bin/pyspark
--jars "/home/jars/ojdbc8.jar"
--master yarn-client
--num-executors 10
--driver-memory 16g
--executor-memory 8g

Mysql Database with Insert statement

CREATE TABLE employee (
          empno INT,
          ename VARCHAR(100),
          designation VARCHAR(100),
          manager INT,
          hire_date VARCHAR(50),
          sal INT,
          deptno INT
    );
    INSERT INTO employee (empno, ename, designation, manager, hire_date, sal, deptno)
    VALUES (7369,'SMITH','CLERK',7902,'1980-12-17',800.00,20),
    (7499,'ALLEN','SALESMAN',7698,'1981-02-20',1600.00,30),
    (7521,'WARD','SALESMAN',7698,'1981-02-22',1250.00,30),
    (7566,'JONES','MANAGER',7839,'1981-04-02',2975.00,20),
    (7654,'MARTIN','SALESMAN',7698,'1981-09-28',1250.00,30);

Spark Code to connect to mysql database

df = spark.read.format("jdbc").options(
    url="jdbc:mysql://localhost:3306/emp",
    driver = "com.mysql.jdbc.Driver",
    dbtable = "employee",
    user="root",
    password="hortonworks1").load()

Write Spark DF to MYSQL database

df = df.write.format("jdbc").options(
    url="jdbc:mysql://localhost:3306/emp",
    driver = "com.mysql.jdbc.Driver",
    dbtable = "employee",
    user="root",
    password="hortonworks1").mode('append').save()