Add the contents of an Excel file to an Oracle NoSQL Database table

As you may gather if you’ve read any of my previous posts, one of my hobbies is collecting retro video games 🕹️.

I’ve recently catalogued my collection of games and put these into an Excel spreadsheet (we all know that Excel is the worlds most popular database!).

What I wanted to do though, is to migrate this to an Oracle NoSQL Database hosted within OCI – this is complete overkill for my needs, but a great use-case/example to help me get to grips with using NoSQL 🧠.

To do this, I needed to figure out how to:

  1. Create an Oracle NoSQL Database table to store the data ✅
  2. Read an Excel file (the one containing my list of retro games) using Python, which is my language of choice ✅
  3. Write this data to an Oracle NoSQL Database table ✅

Step 1 – Creating an Oracle NoSQL Database table

I did this directly from the OCI Console, via Databases > Oracle NoSQL Database > Tables > Create table

On the table creation screen, I selected the following:

  • Simple input – I could then easily define my simple schema within the browser (defining the columns needed within the table).
  • Reserved capacity – Further details on how this works can be found here. I opted for a read/write capacity of 10 units which equates to 10KB of reads/writes per second, I only need this capacity for the initial data load so will reduce to 1 after I’ve loaded the data from Excel. I went with 1GB of storage (which is the minimum), I’m sure I won’t use more than 1MB though!
  • Name – I kept this simple and called the table Games.
  • Primary key – I named this ID of type integer, I’m going to populate this with the epoch time so that I have unique values for each row.
  • Columns – I only actually need two columns, Game and System. For example, an entry could be Game = Super Mario Land and System = Game Boy.

I then hit Save and within a few seconds my table was created ✅.

Step 2 – Reading data from an Excel spreadsheet

The spreadsheet with my game collection in has a separate sheet for each system, with the respective games for that system listed within the sheet.

The example below shows the PS1 games I own, as you can see there are sheets for other systems, such as Wii U and PS3.

After much investigation, I found that the easiest way to read an Excel file using Python was with the pandas and OpenPyXL libraries.

I put together the following Python script which iterates through each sheet in the Excel file, outputting the sheet name (system, such as Game Boy) and the contents of each row within the sheet (which would be a game, such as Super Mario Land).

import pandas as pd
import time

excelfilepath = '/Users/bkgriffi/Downloads/Retro Games Collection.xlsx' # Excel file to read from
excel = pd.ExcelFile(excelfilepath)
sheets = excel.sheet_names # Create a list of the sheets by name (each system has a separate sheet)

for sheet in sheets: # Loop through each of the sheets (systems)
    print("----------")
    print(sheet) # Print the name of the sheet (system)
    print("----------")
    excel = pd.read_excel(excelfilepath,header = None, sheet_name= sheet)
    i = 0
    while i < len(excel[0]) - 1: # Run a while loop that only runs until each row in the sheet has been processed
        print(excel[0][i]) # Print the row (game)
        i += 1 # Increase i so that on the next loop it outputs the next row (game) in the sheet (system)

Here is the script in action, as you can see it lists the system (sheet name) and then the rows within that sheet (game), before then moving on to the next sheet.

Step 3 – Writing data to an Oracle NoSQL Database table

Now that I’d figured out how to read an Excel file with Python, the final piece of the puzzle was to write this to the Oracle NoSQL Database table.

I took the script above and incorporated it into the following:

import pandas as pd
import oci
import time

# Connect to OCI
config = oci.config.from_file()
nosql_client = oci.nosql.NosqlClient(config)

# Read Excel file
excelfilepath = '/Users/bkgriffi/Downloads/Retro Games Collection.xlsx' # Path to Excel file
excel = pd.ExcelFile(excelfilepath)
sheets = excel.sheet_names

# Write the data to the Oracle NoSQL Database table
for sheet in sheets:
    print("----------")
    print(sheet)
    print("----------")
    excel = pd.read_excel(excelfilepath,header = None, sheet_name= sheet)
    i = 0
    while i < len(excel[0]) - 1:
        print(excel[0][i])
        update_row_response = nosql_client.update_row(
        table_name_or_id="GamesTable",
        update_row_details=oci.nosql.models.UpdateRowDetails(
        value={'ID': int((str(time.time()).split(".")[0])), 'Game': excel[0][i], 'System': sheet}, # This is the data to write to the table, the value for ID may look a little scary, all this is doing is passing the UNIX epoch time, I did this to ensure that each row had a unique ID, which is needed as the ID column is the primary key
        compartment_id="Replace with the OCI of the compartment that contains the Oracle NoSQL Database table",
        option="IF_ABSENT",
        is_get_return_row=True))
        i += 1

This uses the OCI Python SDK to connect to the Oracle NoSQL Database table created earlier (Games) and writes the data to it, after running the script I could verify this within the OCI Console by going to Explore data > Execute and running the default SQL statement (which returns everything in the table).

Points to note about the script:

  • You need to update the compartment_id and put in the value for the compartment that contains the Oracle NoSQL Database table to populate.
  • This script requires the OCI SDK for Python with appropriate auth in place, I wrote a quick start on this here.

The script can also be found on GitHub.

One thought on “Add the contents of an Excel file to an Oracle NoSQL Database table

Leave a comment