Jupyter Notebook is an open source web application for Machine Learning and Data Exploration.
In this post I will show you how to connect a Jupyter Notebook to Oracle Autonomous Database and explore the data using Python.
The assumption is we already have a Jupyter notebook sandbox running on Oracle Cloud compute instance.
- A Jupyter notebook sandbox on Oracle Compute Instance. (Check out the documentation)
- Assumption that you have already installed Oracle Instant Client on your instance
- Python 3.6
Connect to your OCI compute instance via SSH.
Start your Jupyter notebook environment and create a new notebook.
or if running as root
#jupyter notebook --allow-root
Access your notebook using the browser > http://<your public IP>:8888
Install the required Python libraries.
# Install required libraries
!pip install sqlalchemy pandas numpy cx_oracle matplotlib
Import the libraries
import sqlalchemy as db import pandas as pd import numpy as np import os import matplotlib.pyplot as plt
Create a new connection with Oracle Autonomous Database:
os.environ['TNS_ADMIN']='<path to your Oracle db wallet on OCI instance>' engine = db.create_engine('oracle://<user>:<password>@<service type>')
os.environ['TNS_ADMIN'] = '/opt/oracle/instantclient_19_3/network/admin/' engine = db.create_engine('oracle://ADMIN:Welcome!@smatATP_TP')
Import this data set to your Oracle Autonomous Database using the Jupyter notebook:
Since you are using OCI compute, you will have to upload the dataset on your compute instance. Then using the Jupyter notebook, load the data to your Database.
df = pd.read_csv('/home/opc/applications/Data/iris.csv') df.to_sql('iris', engine, index=False, if_exists='replace')
Retrieve data from the table you just created.
df = pd.read_sql("select * from iris", con=engine) df.head(10)
total = pd.read_sql("select count(*) from iris", con=engine)
Use Matplotlib to visualize the data.
df.groupby('species') plt.scatter(df.sepal_length, df.sepal_width)