• Give a user read-only access to an OCI tenant 👨‍🏫

    I was recently asked by a customer to perform a review of their OCI tenancy. Following the principle of least privilege, I stepped them through the process of creating a user account that granted me read-only access to their tenancy, meaning that I can see how everything has been set up, but I cannot change anything.

    Following Scott Hanselman’s guidance on preserving keystrokes, I thought I’d document the process here, as I’ll no doubt need to guide somebody else through this in the future 😀.

    The three steps to do this are below👇

    Step 1 – Create a user account for the user 👩

    Yes, I know that this is obvious however I’ve included it here for completeness 😜. A user can be added to an OCI tenancy via Identity & Security > Domains > (Domain) > Users > Create user.

    Ensure that the email address for the user is valid as this will be used to confirm their account. The user does not need to be added to any groups at this point (we’ll do that in the next step).

    If the user you need to grant read-only access to already exists, this step can be skipped.

    Step 2 – Create a group 👨👩

    OCI policies do not permit assigning permissions directly to a user, so we will create a group which will be assigned read-only permissions to the tenancy.

    A group can be created via Identity & Security > Domains > (Domain) > Groups > Create group. I used the imaginative name of Read-Only for the group in the example below.

    Once the group has been created, add the user that you wish to grant read-only access to the tenancy (in this case Harrison):

    Step 3 – Create a policy to grant read-only access to the tenancy 📃

    We are nearly there; the final step is to create a policy that grants the group named Read-Only read permissions to the tenancy. A policy can be created via Identity & Security > Policies > Create Policy.

    I created a policy within the root compartment of the tenancy (which targets the policy at the entire tenancy).

    I used the following policy statement – allow group Read-Only to read all-resources in tenancy

    One thing to note: if you have multiple domains within the tenancy and the user account you wish to give read-only access doesn’t reside within the default domain, you’ll need to specify the domain within the policy. In the example above, if the user was a member of the domain CorpDomain, the policy statement should be updated to read as follows:

    allow group 'CorpDomain'/'Read-Only' to read all-resources in tenancy
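    The same steps can also be scripted. Below is a sketch using the OCI Python SDK (the policy name and description are my own choices, and the SDK is assumed to be configured with credentials that can create policies):

```python
def readonly_statement(domain=None, group="Read-Only"):
    # Build the policy statement; prefix the group with the domain
    # when the user lives outside the default identity domain
    target = f"'{domain}'/'{group}'" if domain else group
    return f"allow group {target} to read all-resources in tenancy"

def create_readonly_policy(compartment_id, domain=None, group="Read-Only"):
    import oci  # deferred import so readonly_statement works without the SDK
    config = oci.config.from_file()
    identity = oci.identity.IdentityClient(config)
    return identity.create_policy(
        oci.identity.models.CreatePolicyDetails(
            compartment_id=compartment_id,  # use the tenancy OCID for the root compartment
            name="Read-Only-Policy",
            description="Grants the Read-Only group read access to the tenancy",
            statements=[readonly_statement(domain, group)]))
```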

  • Connect to an OCI VM instance in a private subnet 🔒

    I’ve previously written about how I use OCI Bastion and Site-to-Site VPN to connect to my VM instances running within OCI that do not have a public IP address. There is also a third option, which I (rather embarrassingly) only recently found out about.

    It’s possible to use the OCI Cloud Shell (which runs within a web browser) to connect via SSH to a VM instance that is attached to a private subnet (and therefore has no public IP address).

    To do this, launch Cloud Shell from within the OCI Console

    Select the Network drop-down menu and then Ephemeral private network setup

    Select the VCN and Subnet to connect to (the one that contains the instance you wish to connect to) and then click Use as active network

    Wait a minute or two! When the network status updates to Ephemeral, the Cloud Shell is connected directly to the selected VCN and subnet.

    You can then SSH into a VM instance within the subnet using its private IP address.

  • Using OCI API Gateway to Publish an OCI Function 📚

    OCI API Gateway includes native support for publishing OCI Functions. This was especially useful for me as I wanted to make my function available externally without authentication – whilst it’s possible to make an OCI Function available externally without using API Gateway, it’s not possible to make it callable without authentication (e.g. available to anybody on the internet) 🔓.

    I’d run through the process of publishing an OCI Function through OCI API Gateway a couple of months ago and got it to work without too much pain. Earlier this week I had to do this again and ran into a few issues – I was clearly a lot brighter back then! I thought I’d capture these issues and their solutions to help others and my future self 😀.

    A step-by-step guide for publishing an OCI Function through OCI API Gateway can be found here – if only I’d read the documentation, I could have saved an hour of my life. Below are the issues I ran into and the solutions that I found ✅

    Issue 1 – Calls to the Function timeout ⏱️

    Using curl to call the API Gateway endpoint for the Function timed out with the following error:

    curl: (28) Failed to connect to bcmd2sv4corxwehdxx4lzvrj9u.apigateway.uk-london-1.oci.customer-oci.com port 443 after 75019 ms: Couldn’t connect to server

    I’d provisioned a new API Gateway into a public VCN subnet and had forgotten to allow inbound traffic on port 443 to the subnet. To resolve this, I added an ingress rule to the security list associated with the subnet allowing traffic on port 443.

    Issue 2 – Calls to the function generate a 500 error

    Once I’d enabled port 443 inbound to the VCN subnet containing the API Gateway, I started to receive a different error when attempting to call the function using Curl (or a web browser for that matter):

    "Internal Server Error","code":500

    To investigate this further, I enabled Execution Logs for the API Gateway Deployment and sent some further requests; I could then see the following in the logs:

    With the full error being:

    “Error returned by FunctionsInvoke service (404, NotAuthorizedOrNotFound). Check the function exists and that the API gateway has been given permission to invoke the function.”

    Damn… I’d forgotten to give the API Gateway permission to call the Function, hence the not authorized error 🤦‍♂️.

    To resolve this I created a dynamic group that contained the API Gateway – actually, this contains all API Gateways within the specified compartment.

    I then created a policy to permit this dynamic group (API-DG) to call Functions – again, this rule is quite broad as it gives the dynamic group permission to call all functions within the tenancy. Within a production environment, you’d be a little stricter here and restrict this to a specific Function 😀.
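    For reference, based on the description above, the dynamic group matching rule and policy statement looked something like this (compartment OCID redacted; exact names may differ in your tenancy):

```
ALL {resource.type = 'ApiGateway', resource.compartment.id = '<compartment-OCID>'}

allow dynamic-group API-DG to use functions-family in tenancy
```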

    Issue 3 – I have no patience 😀

    After working through issues 1 and 2, I was still running into problems – inspecting the logs yielded the same NotAuthorizedOrNotFound error. It turns out that I needed to wait for the policy I created to take effect; about 30 minutes or so later (during which time I was frantically troubleshooting!) public calls to my function through the API Gateway started to work 👍.

    Above is the output of my “workout generator” 🏋️ Function. If you’d like to learn more about creating a Function in OCI, check out – Creating a Function in the Oracle Cloud (OCI) to help me stay fit 🏃‍♂️

  • Add the contents of an Excel file to an Oracle NoSQL Database table

    As you may gather if you’ve read any of my previous posts, one of my hobbies is collecting retro video games 🕹️.

    I’ve recently catalogued my collection of games and put these into an Excel spreadsheet (we all know that Excel is the world’s most popular database!).

    What I wanted to do though, is to migrate this to an Oracle NoSQL Database hosted within OCI – this is complete overkill for my needs, but a great use-case/example to help me get to grips with using NoSQL 🧠.

    To do this, I needed to figure out how to:

    1. Create an Oracle NoSQL Database table to store the data ✅
    2. Read an Excel file (the one containing my list of retro games) using Python, which is my language of choice ✅
    3. Write this data to an Oracle NoSQL Database table ✅

    Step 1 – Creating an Oracle NoSQL Database table

    I did this directly from the OCI Console, via Databases > Oracle NoSQL Database > Tables > Create table

    On the table creation screen, I selected the following:

    • Simple input – I could then easily define my simple schema within the browser (defining the columns needed within the table).
    • Reserved capacity – Further details on how this works can be found here. I opted for a read/write capacity of 10 units, which equates to 10KB of reads/writes per second; I only need this capacity for the initial data load, so I will reduce it to 1 after I’ve loaded the data from Excel. I went with 1GB of storage (the minimum), although I’m sure I won’t use more than 1MB!
    • Name – I kept this simple and called the table Games.
    • Primary key – I named this ID, of type integer. I’m going to populate this with the epoch time so that I have unique values for each row.
    • Columns – I only actually need two columns, Game and System. For example, an entry could be Game = Super Mario Land and System = Game Boy.

    I then hit Save and within a few seconds my table was created ✅.
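    For reference, the equivalent DDL for the table I defined through the console (as I understand the schema above) would be something like:

```
CREATE TABLE Games (
    ID INTEGER,
    Game STRING,
    System STRING,
    PRIMARY KEY (ID)
)
```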

    Step 2 – Reading data from an Excel spreadsheet

    The spreadsheet containing my game collection has a separate sheet for each system, with the respective games for that system listed within the sheet.

    The example below shows the PS1 games I own, as you can see there are sheets for other systems, such as Wii U and PS3.

    After much investigation, I found that the easiest way to read an Excel file using Python was with the pandas and OpenPyXL libraries.

    I put together the following Python script which iterates through each sheet in the Excel file, outputting the sheet name (system, such as Game Boy) and the contents of each row within the sheet (which would be a game, such as Super Mario Land).

    import pandas as pd
    
    excelfilepath = '/Users/bkgriffi/Downloads/Retro Games Collection.xlsx' # Excel file to read from
    excel = pd.ExcelFile(excelfilepath)
    sheets = excel.sheet_names # Create a list of the sheets by name (each system has a separate sheet)
    
    for sheet in sheets: # Loop through each of the sheets (systems)
        print("----------")
        print(sheet) # Print the name of the sheet (system)
        print("----------")
        df = pd.read_excel(excelfilepath, header=None, sheet_name=sheet) # Use a separate variable so the ExcelFile isn't overwritten mid-loop
        i = 0
        while i < len(df[0]): # Run a while loop until every row in the sheet has been processed
            print(df[0][i]) # Print the row (game)
            i += 1 # Increase i so that on the next loop it outputs the next row (game) in the sheet (system)
    

    Here is the script in action, as you can see it lists the system (sheet name) and then the rows within that sheet (game), before then moving on to the next sheet.
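    As an aside, pandas can also read every sheet in one call by passing sheet_name=None, which returns a dictionary of DataFrames keyed by sheet name. The sketch below shows this; the flatten_games helper is my own illustration and works on any plain {system: [games]} dictionary:

```python
def read_all_sheets(excelfilepath):
    # sheet_name=None returns {sheet name: DataFrame} for every sheet
    import pandas as pd  # deferred import; requires pandas and openpyxl
    frames = pd.read_excel(excelfilepath, header=None, sheet_name=None)
    return {system: list(frame[0]) for system, frame in frames.items()}

def flatten_games(collection):
    # Turn {system: [games]} into a flat list of (system, game) pairs
    return [(system, game)
            for system, games in collection.items()
            for game in games]
```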

    Step 3 – Writing data to an Oracle NoSQL Database table

    Now that I’d figured out how to read an Excel file with Python, the final piece of the puzzle was to write this to the Oracle NoSQL Database table.

    I took the script above and incorporated it into the following:

    import pandas as pd
    import oci
    import time
    
    # Connect to OCI
    config = oci.config.from_file()
    nosql_client = oci.nosql.NosqlClient(config)
    
    # Read Excel file
    excelfilepath = '/Users/bkgriffi/Downloads/Retro Games Collection.xlsx' # Path to Excel file
    excel = pd.ExcelFile(excelfilepath)
    sheets = excel.sheet_names
    
    # Write the data to the Oracle NoSQL Database table
    rowid = int(time.time()) # Seed the primary key from the UNIX epoch time, then increment per row - epoch seconds alone would collide for rows written within the same second
    for sheet in sheets:
        print("----------")
        print(sheet)
        print("----------")
        df = pd.read_excel(excelfilepath, header=None, sheet_name=sheet)
        i = 0
        while i < len(df[0]):
            print(df[0][i])
            update_row_response = nosql_client.update_row(
                table_name_or_id="Games",
                update_row_details=oci.nosql.models.UpdateRowDetails(
                    value={'ID': rowid, 'Game': df[0][i], 'System': sheet},
                    compartment_id="Replace with the OCID of the compartment that contains the Oracle NoSQL Database table",
                    option="IF_ABSENT",
                    is_get_return_row=True))
            rowid += 1
            i += 1
    

    This uses the OCI Python SDK to connect to the Oracle NoSQL Database table created earlier (Games) and writes the data to it. After running the script, I could verify this within the OCI Console by going to Explore data > Execute and running the default SQL statement (which returns everything in the table).

    Points to note about the script:

    • You need to update the compartment_id and put in the value for the compartment that contains the Oracle NoSQL Database table to populate.
    • This script requires the OCI SDK for Python with appropriate auth in place, I wrote a quick start on this here.

    The script can also be found on GitHub.
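    The load could also be verified from code rather than the console. Below is a sketch; select_statement is my own helper, and query_table assumes the authenticated nosql_client from the script above:

```python
def select_statement(table="Games", system=None):
    # Build a simple SQL statement, optionally filtered by system
    statement = "SELECT * FROM " + table
    if system:
        statement += " WHERE System = '" + system + "'"
    return statement

def query_table(nosql_client, compartment_id, statement):
    import oci  # deferred import so select_statement works without the SDK
    response = nosql_client.query(
        query_details=oci.nosql.models.QueryDetails(
            compartment_id=compartment_id,
            statement=statement))
    return response.data.items
```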

  • How to use query strings with OCI Functions

    I’ve been using OCI Functions for a few months now. Having previously used Azure Functions extensively, one thing that I did miss was the ability to take values from the query string of a request and pass these directly to the function, rather than including any input values the function requires within the body of the request and submitting as a POST ✉️.

    As a side note, I did a short video that walks through the process of creating an OCI Function, which can be found here.

    For example, I’d previously written an Azure Function during the pandemic to create a workout for me. With this function I could pass the number of exercises that I wanted in my workout as part of the query string and it would return a workout with the number of exercises requested 🏋️.

    The query string could, for example, be ?exercises=10 to create a workout with 10 exercises (as you can see below, the exercises themselves are defined directly in the function).

    Example Azure Function code (Python)

    import logging
    import random
    import azure.functions as func
    
    def main(req: func.HttpRequest) -> func.HttpResponse:
        logging.info('Python HTTP trigger function processed a request.')
    
        exercises = req.params.get('exercises')
        if not exercises:
            try:
                req_body = req.get_json()
            except ValueError:
                pass
            else:
                exercises = req_body.get('exercises')
    
        if exercises:
            exerciselist = ['50 Star Jumps','20 Crunches','30 Squats','50 Press Ups','1 Min Wall Sit','10 Burpees','20 Arm Circles',\
            '20 Squats','30 Star Jumps','15 Crunches','10 Press Ups','2 Min Wall Sit','20 Burpees','40 Star Jumps','25 Burpees',\
            '15 Arm Circles','30 Crunches','15 Press Ups','30 Burpees','15 Squats','30 Sec Arm Circles','2 Min Wall Sit','20 Burpees',\
            '60 Star Jumps','10 Crunches','25 Press Ups'
            ]
            workout = []
            for i in range(0,int(exercises)):
                randomnumber = random.randint(0,(len(exerciselist)-1))
                workout.append(exerciselist[randomnumber])
            return func.HttpResponse(str(workout))
                
        else:
            return func.HttpResponse(
                 "Please pass the number of exercises required to the query string",
                 status_code=400
            )
    
    

    To do the same in an OCI Function, I discovered that I could inspect ctx.RequestURL(), which provides the URL that was passed to the function, and then do some magic to extract the query string values. Below is the Python code for my OCI Functions variant of my exercise generator function that does this.

    The comments in the script explain how I achieved this.

    import io
    import random
    import json
    from fdk import response
    
    def handler(ctx, data):
        requesturl = ctx.RequestURL() # Get the requested URL
        exercises = requesturl.split("=")[1] # Extract the value passed in the query string - this presumes a single query string parameter (I know you should never presume!)
        exerciselist = ['50 Star Jumps','20 Crunches','30 Squats','50 Press Ups','1 Min Wall Sit','10 Burpees','20 Arm Circles',\
            '20 Squats','30 Star Jumps','15 Crunches','10 Press Ups','2 Min Wall Sit','20 Burpees','40 Star Jumps','25 Burpees',\
            '15 Arm Circles','30 Crunches','15 Press Ups','30 Burpees','15 Squats','30 Sec Arm Circles','2 Min Wall Sit','20 Burpees',\
            '60 Star Jumps','10 Crunches','25 Press Ups'
            ]
        workout = []
        for i in range(0, int(exercises)):
            randomnumber = random.randint(0, len(exerciselist) - 1)
            workout.append(exerciselist[randomnumber])
    
        return response.Response(ctx, response_data=workout)
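
    Splitting on "=" works for a single parameter, but a more robust approach (my own alternative, using only the standard library) is to parse the query string properly:

```python
from urllib.parse import urlparse, parse_qs

def get_exercise_count(request_url, default=5):
    # Parse the query string rather than splitting on "=", so extra
    # parameters or a missing value no longer break the function
    query = parse_qs(urlparse(request_url).query)
    values = query.get("exercises", [])
    return int(values[0]) if values else default
```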
    

    Here is the function in all its glory being called from a browser (published via OCI API Gateway to make it callable without auth):

    The code can also be found on GitHub.

  • Creating a Site to Site VPN in Oracle OCI using a Raspberry Pi 🌏

    In my test lab I wanted to set up a Site-to-Site VPN between a Raspberry Pi on my home network and OCI. The main reason for this was to enable me to quickly access resources in OCI (such as my VMs) without having to use a Bastion (as I’m lazy and impatient) 🔐.

    I’d recently bought a shiny new Raspberry Pi 5 so this was a perfect excuse to have a play with it! Fortunately for me, Johannes Michler created a fantastic guide on how to setup a Raspberry Pi to connect to an OCI IPSec VPN – https://promatis.com/at/en/using-a-raspberry-pi-to-connect-to-oracle-oci-ipsec-vpn/.

    I followed this guide and was able to get this working eventually (which was definitely a case of me not RTFM’ing). I thought I’d share some additional info which others may find helpful 💡.

    1 – Error: No connection named “connection name”

    When running sudo journalctl | grep pluto to check the status of the VPN tunnels to OCI, I could see the tunnels were not connecting and the following errors were being repeated – “no connection named ‘oracle-tunnel-1’” and “no connection named ‘oracle-tunnel-2’”

    These are the names of the two tunnels that I had defined within /etc/ipsec.d/oci-ipsec.conf and are the default names provided by Oracle (you can call them whatever you like).

    The reason for this error is that the two tunnels were being ignored, as they are configured to use IKEv1 (the default when you use the Site-to-Site VPN wizard in OCI). By default Libreswan doesn’t permit IKEv1; to enable it I had to add ikev1-policy=accept to the config setup section of the file /etc/ipsec.conf.
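    For reference, the relevant part of /etc/ipsec.conf ended up looking like this (other settings omitted):

```
config setup
    ikev1-policy=accept
```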

    Once I’d done this and restarted the IPSec service using sudo systemctl restart ipsec, it came to life and the two IPSec tunnels connected to OCI 🎉.

    Output from sudo journalctl | grep pluto

    Obviously in a production environment you wouldn’t use IKEv1 as IKEv2 is far more secure; I was happy using it for my basic lab environment though.

    2 – Starting the VPN tunnels automatically on reboot

    To ensure that the VPN tunnels persist between reboots of the Raspberry Pi I needed to configure the IPSec service to start automatically, which I did using the following command:

    sudo systemctl enable ipsec

    3 – Creating a static route on the Raspberry Pi 🛣️

    I added a route to the Raspberry Pi to route all of the traffic destined for my OCI VCN (10.2.0.0/16) via the VPN using the following command:

    sudo ip route add 10.2.0.0/16 nexthop dev vti1

    The device vti1 is what I defined for the first tunnel within the file /etc/ipsec.d/oci-ipsec.conf

    Once I’d added the static route, I was able to SSH directly from my Pi to one of the VMs within my VCN! The challenge is that this route doesn’t persist between reboots, so I needed some way to make this static.

    There is a myriad of different ways to do this on Linux, however I opted for the low-tech approach and added the following line to /home/pi/.bashrc.

    sudo ip route add 10.2.0.0/16 nexthop dev vti1 > /dev/null 2>&1

    This command runs each time the pi user logs in, and as Raspberry Pi OS logs the pi user in automatically by default, it will execute each time the Pi boots up – definitely not a production solution, but fine for a lab environment 🥼.
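    A slightly more robust alternative (my own sketch; the unit name is hypothetical) would be a systemd oneshot unit that adds the route after the IPSec service starts:

```
# /etc/systemd/system/oci-vpn-route.service
[Unit]
Description=Add static route to the OCI VCN via the VPN tunnel
After=ipsec.service

[Service]
Type=oneshot
ExecStart=/sbin/ip route add 10.2.0.0/16 dev vti1

[Install]
WantedBy=multi-user.target
```

    Enable it with sudo systemctl enable oci-vpn-route.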

    4 – Creating a route on client devices within my network

    I didn’t want to use my Pi directly to connect to resources within OCI; the idea is that the Pi simply spins up the VPN tunnels. I can then route traffic from my network to the Pi for the 10.2.0.0/16 IP address range, which will send it down the VPN tunnel(s) to OCI – the result being that I can connect to my OCI resources from other devices within my network (such as my Mac and Windows PC).

    I ran the following command to add a route to my Mac:

    sudo route -n add 10.2.0.0/16 192.168.1.228

    This tells the Mac to route all requests for the IP range 10.2.0.0/16 (my VCN in OCI) via the Raspberry Pi (which has an IP address of 192.168.1.228).

    The equivalent on Windows would be – route add 10.2.0.0 mask 255.255.0.0 192.168.1.228

    NOTE: Neither of these commands makes the routes persistent (they’ll disappear after a reboot), although this is easy enough to do by adding /p to the command for Windows and using one of the options described here for macOS.

    Hopefully this is of use to somebody 😀.

  • Using OCI AI to convert speech to text 📖

    In my continued journey to play around with the AI services within OCI (you can read about them here), next up on my list is OCI Speech 🎤.

    Here is a high-level overview of what it offers (taken from https://www.oracle.com/uk/artificial-intelligence/speech/):

    OCI Speech is an AI service that applies automatic speech recognition technology to transform audio-based content into text. Developers can easily make API calls to integrate OCI Speech’s pretrained models into their applications. OCI Speech can be used for accurate, text-normalized, time-stamped transcription via the console and REST APIs as well as command-line interfaces or SDKs. You can also use OCI Speech in an OCI Data Science notebook session. With OCI Speech, you can filter profanities, get confidence scores for both single words and complete transcriptions, and more.

    As an experiment I took one of my recent YouTube videos and submitted it to OCI Speech for transcription. The high-level steps to do this were:

    1. Upload the video to an OCI Object Storage bucket
    2. Within the OCI Console, navigate to Analytics & AI > Speech > Create Job, I then entered the following:

    This created a transcription job named IdentityFederationYouTube, configured to use the Object Storage bucket named videos and to store the transcription output in the same bucket – on the next screen we’ll select the video to transcribe from the bucket.

    I left all other settings as default. One really interesting feature is the ability to detect profanity: if you select Add filter, you can configure the transcription job to detect and act upon profanity by either masking, removing or simply tagging it within the output transcription. I didn’t bother using this, although I’m sure that I’ll have a lot of fun playing with it in the future 😆.

    On the next screen I chose the video to transcribe and selected Submit.

    NOTE: As my Object Storage bucket contained a single video file, this was the only option that I had; it is possible to submit multiple videos within a single transcription job.

    In under 2 minutes the transcription job was complete; for reference, the video that was transcribed was 7 minutes long:

    Clicking into the job provides additional details. I did this and scrolled down to the Tasks section, from where it was possible to download the transcription (which is in JSON format) directly via Download JSON. I could also have gone to the Object Storage bucket; it was a nice touch that I could do it here though, far less clicking around 😀.

    I downloaded the JSON file and fired up Visual Studio Code to analyse the JSON using Python.

    First things first, I wanted to see the transcription itself so I could see how good a job it had done with my Yorkshire accent.

    To do this I ran the following Python:

    import json
    # Open the transcription JSON output file
    filepath = "/users/bkgriffi/Downloads/transcript.json" # Location of the transcription output JSON file
    transcriptsource = open(filepath)
    # Read the transcription
    transcriptJSON = json.load(transcriptsource)
    print(transcriptJSON["transcriptions"][0]["transcription"]) # Print the first transcription within the output, denoted by 0. If transcribing multiple videos within a single job, this would need to be updated accordingly.
    
    

    It didn’t do too bad a job at all!

    I opened the JSON directly and could see some other really useful information; in addition to the transcription, it also provides its confidence for each word detected. I put together the following script that outputs all of the words detected with a confidence level of less than 70%, along with the timestamp at which they occurred within the video.

    for word in transcriptJSON["transcriptions"][0]["tokens"]:
        if float(word["confidence"]) < 0.70: # Only display words with a detection confidence of less than 70% (convert to float rather than comparing strings)
            if word["token"] not in ('.',','): # Exclude full stops and commas
                print(word["token"] + " - " + "Confidence: " + word["confidence"] + " at: " + word["startTime"])
    

    Looking at the output, it appears to have mostly been confident! There are only a couple of words there that appear to have been detected incorrectly.
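    To put a number on that impression, a small helper (my own addition, operating on the tokens list from the JSON above) can report what share of words fell below the 70% threshold:

```python
def low_confidence_share(tokens, threshold=0.70):
    # tokens mirror the 'tokens' entries in the transcription JSON:
    # each has a 'token' string and a 'confidence' value stored as a string
    words = [t for t in tokens if t["token"] not in (".", ",")]
    low = [t for t in words if float(t["confidence"]) < threshold]
    return len(low) / len(words) if words else 0.0
```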

    The script I wrote can be found on GitHub here.

  • Cataloging my video game collection using the OCI AI Vision Service 🎮

    One of my hobbies is collecting video games, specifically retro games from the 80s and 90s.

    I’ve previously used Azure Cognitive Services (now Azure AI Services) to catalog my video game collection and wrote about it here 🕹️. As I’ve been playing around with OCI recently, I thought it would be a great idea to try and replicate this approach using the OCI AI Vision service – specifically the OCR capabilities that can extract text from an image.

    Before I go any further, here is a little background to my specific scenario:

    My games collection has grown over the years and it’s difficult for me to track what I own. On more than one occasion I’ve bought a game, only to later realise that I already owned it 🤦‍♂️. I had a brainwave (or more specifically an excuse to tinker)… why don’t I keep a digital list of the games that I have!

    My plan was to take photos of my collection, pass these photos to the OCI AI Vision service to extract the text from the photos and then write this to a file, which I’ll then eventually put into a database or more likely Excel!

    I took a photo of some of my games (PS3, so not exactly retro!) and then set about writing a Python script that used the Python SDK for OCI to submit the photos and extract any detected text. Below is an example image I used for testing 📷.

    The script that I wrote does the following:

    • Connects to the OCI AI Vision Service
    • Submits an image (which is stored within an OCI Object Storage Bucket) to the Vision Service for analysis, requesting OCR (denoted by the featuretype of TEXT_DETECTION) – you could pass a local image instead if needed, further details on how to do this can be found here.
    • Converts the response from the AI Vision Service to JSON (which makes it easier to use)
    • Outputs each line of text detected to the terminal, but only if it is greater than 5 characters in length – this helps to ensure that only game titles are returned, rather than other information on the spine of the box, such as the game classification and ID number.

    This worked really well, with only a couple of small issues (can you spot them 🔎):

    The script I wrote can be found below and also on GitHub.

    import oci
    import json
    
    # Authenticate to the OCI AI Vision Service
    config = oci.config.from_file()
    ai_vision_client = oci.ai_vision.AIServiceVisionClient(config)
    
    # Set the type of analysis to OCR
    featuretype = "TEXT_DETECTION"
    
    # Analyse the image within object storage
    # Analyse the image within object storage
    analyze_image_response = ai_vision_client.analyze_image(
        analyze_image_details=oci.ai_vision.models.AnalyzeImageDetails(
            features=[
                oci.ai_vision.models.ImageTextDetectionFeature( # Use the text detection feature model to match the TEXT_DETECTION feature type
                    feature_type=featuretype)],
            image=oci.ai_vision.models.ObjectStorageImageDetails(
                source="OBJECT_STORAGE",
                namespace_name="Replace with Object Storage Namespace",
                bucket_name="Replace with the bucket name",
                object_name="Replace with the name of image to analyse"),
            compartment_id="Replace with Compartment ID"),
       )
    
    # Convert to JSON (using a variable name that doesn't shadow the json module)
    result = json.loads(str(analyze_image_response.data))
    
    # Print the names of the games identified (each returned line in the response with greater than 5 characters)
    lines = []
    for analysedlines in result["image_text"]["lines"]:
        if len(analysedlines["text"]) > 5:
            print(analysedlines["text"])
            lines.append(analysedlines["text"])
    
  • Tracking OCI spend on a per-user basis 💷

    I had an interesting conversation with my colleagues recently about how we could track the spend on our OCI test tenant on a per-user basis.

    There are several people within my team who have access to this shared tenant and we needed a way to quickly and easily see the spend per-user.

    I looked into this and created a script using the Python SDK for OCI, which does the following:

    • Connects to a tenant specified by the tenant_id variable
    • Calculates the date range of the previous month; for example, as it’s currently February 2024, the script calculates a date range of 1st January 2024 (datefrom) to 1st February 2024 (dateto) – this is used as the reporting period for the usage query.
    • Calls the RequestSummarizedUsageDetails usage API and requests the usage for a given date range (the previous month in this case), returning the cost and grouping this by who created the resource – this uses the inbuilt CreatedBy tag, more details on this can be found here.
    • For each of the users (CreatedBy) in the response from the usage API, print to the console along with the cost attributed to each.

    Here is an example of the script output, which shows cost per user for the previous calendar month (in this case January 2024):

    The script can be found on GitHub and below, the request can be updated to meet your specific needs, using the documentation as a reference:

    import oci
    import datetime
    
    # Authenticate to OCI
    config = oci.config.from_file()
    
    # Initialize the usageapi client service client with the default config file
    usage_api_client = oci.usage_api.UsageapiClient(config)
    
    # Create the from and to dates for the usage query - using the previous calendar month
    dateto = datetime.date.today().replace(day=1) # Get the first day of the current month
    month, year = (dateto.month-1, dateto.year) if dateto.month != 1 else (12, dateto.year-1)
    datefrom = dateto.replace(day=1, month=month, year=year) # Get the first day of the previous month
    
    # Build request
    request_summarized_usages_response = usage_api_client.request_summarized_usages(
        request_summarized_usages_details=oci.usage_api.models.RequestSummarizedUsagesDetails(
            tenant_id="Tenant OCID", # Update with the tenant OCID
            time_usage_started=(datefrom.strftime('%Y-%m-%dT%H:%M:%SZ')),
            time_usage_ended=(dateto.strftime('%Y-%m-%dT%H:%M:%SZ')),
            granularity="MONTHLY",
            is_aggregate_by_time=False,
            query_type="COST",
            group_by_tag=[
                oci.usage_api.models.Tag( # Return results by the CreatedBy tag, which will indicate the user who created the resource (who the usage cost will be attributed to)
                    namespace="Oracle-Tags",
                    key="CreatedBy")],
            compartment_depth=6))
    
    # Store the output of the request
    output = request_summarized_usages_response.data
    
    # Loop through the output and print the usage cost per user
    for item in output.items:
        print("-" + item.tags[0].value + " Cost: £" + str(item.computed_amount))
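    The API returns the per-user figures in no particular order; if you want the biggest spenders first, the items can be post-processed before printing. A minimal sketch, using plain (user, cost) tuples as hypothetical stand-ins for the response items:

    ```python
    def sort_costs_by_spend(costs: list[tuple[str, float]]) -> list[tuple[str, float]]:
        """Sort (user, cost) pairs by cost, highest spender first."""
        return sorted(costs, key=lambda pair: pair[1], reverse=True)

    # Hypothetical per-user costs, as (CreatedBy, computed_amount) pairs
    costs = [("alice@example.com", 12.50), ("bob@example.com", 98.10), ("carol@example.com", 3.75)]
    for user, cost in sort_costs_by_spend(costs):
        print("-" + user + " Cost: £" + str(cost))
    ```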
    
  • Using OCI Language (text analytics) to detect PII 🤫

    OCI Language (text analytics) has the ability to detect PII in a string of text. This is particularly useful for the following use cases:

    Detecting and curating private information in user feedback 🧑

    Many organizations collect user feedback through various channels such as product reviews, return requests, support tickets, and feedback forums. Using the Language PII detection service's automatic detection of PII entities, you can proactively warn users about sharing private data, and applications can implement measures such as anonymizing or masking feedback before storing it.

    Scanning object storage for presence of sensitive data 💾

    Cloud storage solutions such as OCI Object Storage are widely used by employees to store business documents, in locations that are either locally controlled or shared by many teams. Ensuring that such shared locations don't store private information such as employee names, demographics, and payroll information requires automatic scanning of all the documents for the presence of PII. The OCI Language PII model provides a batch API to process many text documents at scale.

    Taken from – https://docs.oracle.com/en-us/iaas/language/using/pii.htm

    I created a simple Python script that uses the OCI Language API to detect PII in a string of text and replace it with a placeholder (the PII type). The script can be found below and on GitHub.

    It analyses the text contained within the texttoanalyse variable; if any PII is detected, it is replaced with the type of data it contains, and the updated string is printed to the console.

    Update texttoanalyse and compartment_id before running.

    import oci
    
    config = oci.config.from_file()
    ai_language_client = oci.ai_language.AIServiceLanguageClient(config)
    texttoanalyse = "my details are brendan@brendg.co.uk, I was born in 1981" # String to analyse for PII
    
    batch_detect_language_pii_entities_response = ai_language_client.batch_detect_language_pii_entities( # Identify PII in the string
        batch_detect_language_pii_entities_details=oci.ai_language.models.BatchDetectLanguagePiiEntitiesDetails(
            documents=[
                oci.ai_language.models.TextDocument(
                    key="String1",
                    text=texttoanalyse,
                    language_code="en")],
            compartment_id="Compartment ID"))
    
    cleansedtext = texttoanalyse # Replace the PII in the string with the type of data it is, for example e-mail address
    for document in batch_detect_language_pii_entities_response.data.documents:
        for entity in document.entities:
            cleansedtext = cleansedtext.replace(entity.text, "*" + entity.type + "*")
    
    print(cleansedtext)
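    One caveat with the str.replace approach: if the same PII text appears more than once, or one entity's text is a substring of another, the replacements can collide. The SDK's PII entity model also exposes offset and length fields, so a more robust sketch replaces each detected span right-to-left, keeping earlier offsets valid. The (offset, length, type) tuples below are hypothetical stand-ins for the response objects, not real detection results:

    ```python
    def mask_entities(text: str, entities: list[tuple[int, int, str]]) -> str:
        """Replace each (offset, length, type) span with *TYPE*,
        working right-to-left so earlier offsets stay valid."""
        for offset, length, entity_type in sorted(entities, key=lambda e: e[0], reverse=True):
            text = text[:offset] + "*" + entity_type + "*" + text[offset + length:]
        return text

    sample = "my details are brendan@brendg.co.uk, I was born in 1981"
    # Hypothetical detections: the e-mail address and the year of birth
    entities = [(15, 20, "EMAIL"), (51, 4, "DATE_TIME")]
    print(mask_entities(sample, entities))  # my details are *EMAIL*, I was born in *DATE_TIME*
    ```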
    

    Here is the script in action: