AI Image Caption Generator with Overlay – n8n Automation Template

October 20, 2025

Alauddin Aladin

This workflow demonstrates how to generate AI-powered captions for images and overlay them directly onto the image. Using Google Gemini’s multimodal vision model, the workflow analyzes an image, creates a descriptive caption (with a punny title), and then applies the caption as an overlay using n8n’s built-in image editing nodes.

It’s ideal for content creators, publishers, and social media managers who want to quickly generate engaging, contextual captions and apply them as overlays for posts, watermarks, or creative assets, all in one automated flow.

Features

  • 📷 Image Import – Pull in images from a URL (e.g., Pexels stock photos) or replace with your own sources (webhooks, file uploads, etc.).
  • 🤖 AI Captioning – Use Google Gemini to generate rich captions that follow a structured format (who, when, where, context, miscellaneous) with a catchy title.
  • 🧩 Smart Layout Calculation – Code node calculates caption placement dynamically based on image size, ensuring readability and proper alignment.
  • 🎨 Overlay on Image – Add caption text with background shading for contrast using n8n’s Edit Image node.
  • 🔄 Fully Customizable – Swap the input source, modify styling (font, colors, positioning), or extend it for watermarks and branding.
  • 🌐 Multimodal AI Demo – Showcases how n8n integrates with modern vision-language models beyond traditional OCR or classification tasks.
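
To make the Smart Layout Calculation step more concrete, here is a minimal sketch of the kind of arithmetic such a Code node might perform. It is not the template's actual code: the function name `compute_caption_layout`, the `CaptionLayout` fields, and all ratios (font size as 3% of image height, padding as 4% of image width, an average character width of 0.6 × font size) are assumptions chosen for illustration. It is written in Python, so it would need adapting if your Code node runs JavaScript.

```python
from dataclasses import dataclass
import textwrap


@dataclass
class CaptionLayout:
    lines: list[str]   # caption wrapped to fit the image width
    font_size: int     # font size in pixels
    box_top: int       # y coordinate of the top of the shaded background
    box_height: int    # height of the shaded background in pixels
    text_x: int        # x coordinate where each line starts
    text_y: int        # y coordinate where the first line starts


def compute_caption_layout(image_width, image_height, caption,
                           font_ratio=0.03, padding_ratio=0.04,
                           char_width_ratio=0.6, line_spacing=1.3):
    """Place a caption box along the bottom edge of an image.

    All ratios are illustrative assumptions, not values from the template.
    """
    # Scale the font with the image height, with a floor so small images stay legible.
    font_size = max(12, round(image_height * font_ratio))
    padding = round(image_width * padding_ratio)

    # Estimate how many characters fit on one line from an average glyph width.
    usable_width = image_width - 2 * padding
    chars_per_line = max(1, int(usable_width / (font_size * char_width_ratio)))
    lines = textwrap.wrap(caption, width=chars_per_line) or [""]

    # The background box grows with the number of wrapped lines.
    line_height = round(font_size * line_spacing)
    box_height = len(lines) * line_height + 2 * padding
    box_top = max(0, image_height - box_height)

    return CaptionLayout(lines, font_size, box_top, box_height,
                         text_x=padding, text_y=box_top + padding)


# Example: a 1000x800 image with a long caption.
layout = compute_caption_layout(1000, 800, "word " * 40)
print(layout.font_size, layout.box_top, layout.box_height, len(layout.lines))
```

The resulting values could then be mapped onto the Edit Image node: the box coordinates for the shaded background and the text position, font size, and wrapped lines for the caption itself. Estimating line width from an average character width is a rough heuristic; a proportional font may need a different `char_width_ratio`.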

About the author

Alauddin Aladin is an AI Automation expert helping businesses streamline operations, boost productivity, and scale effortlessly using tools like Make.com and n8n. With over a decade of experience in digital systems and automation strategy, Alauddin empowers entrepreneurs to save time and grow smarter through intelligent workflows and AI-driven solutions.
