Google Gemini enabled Workflow Automation

how to speed up your automation

Created: Mar 21, 2024 by Pradeep Gowda · Updated: Aug 01, 2024 · Tagged: llm, gemini, automation, chatgpt

In this blog post, I’ll share how I automated a common set of activities using the power of a well-written “specification” and Google Gemini¹.

Background: I take screenshots using Shottr, and I sometimes use them in my blog posts. I already have a workflow that uploads the file to my web server, but I’d rather have the file go into my https://files.btbytes.com/ bucket, served by Cloudflare R2².

I wrote out the workflow in plain text, as you see below (the “prompt”), and Google Gemini spit out a correct, working version of the script (the “response”) on the very first attempt.

The amount of effort it takes to automate common, tedious tasks has probably fallen to the lowest level possible. The next obvious improvement in this direction would be for me to give my computer the same prompt and have it take care of generating, storing, and invoking the right workflow “intelligently” when I need it. Perhaps that is already feasible with current LLM technologies, and I just have to learn how to do it.

The Prompt

write a script that reads the $HOME/screenshots directory, takes the latest .png file, and converts it into a .webp file and saves it in ~/files.btbytes.com/images/YYYY/MM/FILENAME.webp where YYYY is current year, and MM is current Month, and filename is the first argument passed to the script.

It then prints out this HTML fragment to the console

<figure>
  <img src="https://files.btbytes.com/images/YYYY/MM/FILENAME.webp" alt=" ">
  <figcaption></figcaption>
</figure> [via]()

and then runs the sync.sh script in ~/files.btbytes.com/

Notes on the prompt

.webp files are smaller than .png files.
The figure HTML code goes into my blog posts and notebooks. I prefer this over a bare img tag because I have CSS styling applied that restricts the size of the image and also provides a nice figcaption.
The sync.sh script rsyncs the whole directory to a Cloudflare bucket.
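As a small illustration of the figure fragment described in these notes, here is a sketch that builds it with a here-document. The function name figure_html and the example filename are hypothetical, chosen only for this example; the URL layout is the one from the prompt.

```bash
#!/bin/bash
# Print the <figure> fragment for a given year, month, and base filename.
# figure_html is a hypothetical helper name used only for illustration.
figure_html() {
  local year="$1" month="$2" name="$3"
  cat <<EOF
<figure>
  <img src="https://files.btbytes.com/images/$year/$month/$name.webp" alt=" ">
  <figcaption></figcaption>
</figure> [via]()
EOF
}

# Example call with the current year and month and a made-up filename.
figure_html "$(date +%Y)" "$(date +%m)" "example-shot"
```

A here-document keeps the HTML readable in the script, since the double quotes inside it need no escaping.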

Gemini Response

(Only the code is shown here. I did not copy the accompanying instructions; they were very nice, but the code is quite self-explanatory.)

#!/bin/bash

# Check for provided filename
if [ -z "$1" ]; then
  echo "Error: Please provide a filename as an argument."
  exit 1
fi

# Get current year and month
CURRENT_YEAR=$(date +%Y)
CURRENT_MONTH=$(date +%m)

# Define source and destination paths
SOURCE_DIR="$HOME/screenshots"
DEST_DIR="$HOME/files.btbytes.com/images/$CURRENT_YEAR/$CURRENT_MONTH"
FILENAME="$1"

# Find the latest PNG file
LATEST_PNG=$(ls -t "$SOURCE_DIR"/*.png | head -n 1)

# Check if a PNG file was found
if [ -z "$LATEST_PNG" ]; then
  echo "Error: No PNG files found in the screenshots directory."
  exit 1
fi

# Create destination directory if needed
mkdir -p "$DEST_DIR"

# Convert PNG to WebP
cwebp "$LATEST_PNG" -o "$DEST_DIR/$FILENAME.webp"

# Print HTML fragment
echo "<figure>"
echo "  <img src=\"https://files.btbytes.com/images/$CURRENT_YEAR/$CURRENT_MONTH/$FILENAME.webp\" alt=\" \">"
echo "  <figcaption></figcaption>"
echo "</figure> [via]()"

# Run sync script
cd "$HOME/files.btbytes.com/" && ./sync.sh

Notes on generated code

I did not know about the cwebp program³, even though I have it installed on my macOS machine. The rest of the code looks very good.
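One spot that could be made more defensive is the "latest PNG" lookup, which parses the output of ls and so can misbehave on filenames that contain newlines. The following is a sketch of an alternative, not what Gemini produced: it compares modification times with bash's -nt test instead. The function name latest_png is hypothetical.

```bash
#!/bin/bash
# Print the most recently modified .png file in the given directory.
# Returns non-zero if the directory contains no .png files.
# latest_png is a hypothetical helper name used only for illustration.
latest_png() {
  local dir="$1" newest="" f
  for f in "$dir"/*.png; do
    # If nothing matches, the glob stays literal; skip it.
    [ -e "$f" ] || continue
    if [ -z "$newest" ] || [ "$f" -nt "$newest" ]; then
      newest="$f"
    fi
  done
  [ -n "$newest" ] || return 1
  printf '%s\n' "$newest"
}

if latest=$(latest_png "$HOME/screenshots"); then
  echo "Latest screenshot: $latest"
else
  echo "No PNG files found in $HOME/screenshots" >&2
fi
```

This variant also avoids the error message that ls prints when the directory has no .png files at all.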

I saved it as ~/scripts/latest-screenshot.sh, made it executable with chmod +x, and ran it.

Figure: Google Gemini screenshot automation
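For a sense of how much the conversion saves, here is a small sketch that turns the two sizes quoted in the footnotes into a percentage. The function name percent_saved is hypothetical, and the sizes are assumed to be in bytes, although the ratio does not depend on the unit.

```bash
#!/bin/bash
# Integer percentage by which the "after" size is smaller than "before".
# percent_saved is a hypothetical helper name used only for illustration.
percent_saved() {
  local before="$1" after="$2"
  echo $(( (before - after) * 100 / before ))
}

# Sizes quoted in the footnote for this post's screenshot (PNG, then WebP).
percent_saved 203736 55646   # prints 72
```

For real files, the two sizes could be read with something like `wc -c < file.png`.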

¹ It could have been any other LLM, viz. ChatGPT, Anthropic, etc. Since I also have a subscription to Google Gemini, I wanted to give it a shot.
² Cloudflare has very generous free plans for R2, its S3 equivalent, and many other services. Give them a shot. (This is not a paid promotion ;)
³ PNG image size: 203,736; WebP image size: 55,646.