This n8n workflow fetches a document from Google Drive, stores its contents in a Pinecone vector database, and performs question answering (QA) over them using OpenAI’s language models. It splits large files into chunks, embeds them for semantic search, retrieves the most relevant sections at query time, and returns chat-based answers with citations pointing to the source document segments.
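The workflow itself is assembled from n8n nodes rather than custom code, but the following Python sketch illustrates roughly what the ingestion half does, assuming the official `openai` and `pinecone` packages; the index name, chunk size, and embedding model are placeholders you would swap for your own settings.

```python
from openai import OpenAI
from pinecone import Pinecone

openai_client = OpenAI()  # reads OPENAI_API_KEY from the environment
pc = Pinecone(api_key="YOUR_PINECONE_API_KEY")
index = pc.Index("documents")  # placeholder index name


def chunk_text(text: str, size: int = 1000, overlap: int = 100) -> list[str]:
    """Split a long document into overlapping chunks for embedding."""
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + size])
        start += size - overlap
    return chunks


def ingest(document_text: str, source_name: str) -> None:
    """Chunk a document, embed each chunk, and upsert the vectors into Pinecone."""
    chunks = chunk_text(document_text)
    response = openai_client.embeddings.create(
        model="text-embedding-3-small",  # placeholder; any OpenAI embedding model works
        input=chunks,
    )
    vectors = [
        {
            "id": f"{source_name}-{i}",
            "values": item.embedding,
            # Keep the chunk text and its position so answers can cite the source segment
            "metadata": {"text": chunks[i], "source": source_name, "chunk": i},
        }
        for i, item in enumerate(response.data)
    ]
    index.upsert(vectors=vectors)
```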
Features
- Fetch Documents from Google Drive: Downloads any specified file for processing.
- Chunk & Embed Content: Splits large text documents into smaller chunks and creates embeddings via OpenAI for vector similarity search.
- Store in Pinecone Vector Database: Inserts document embeddings into a Pinecone index for fast and scalable semantic search.
- Chat-based Question Answering: Provides a chat interface that retrieves the top relevant chunks, processes them via OpenAI, and generates accurate answers.
- Source Citations Included: Answers include references to the specific file sections used, improving trust and transparency.
- Customizable Setup: Easily change the file source, Pinecone index, chunk size, or embedding model to adapt to various use cases (research papers, legal docs, manuals, etc.).
- Two-Step Workflow:
  - Data Preparation: Fetch, chunk, and store document vectors.
  - Interactive QA: Query the document and receive cited answers in real time (see the retrieval sketch after this list).
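For the interactive half, the sketch below shows one way the retrieval-and-answer step can look: embed the question, query Pinecone for the closest chunks, and ask an OpenAI chat model to answer from those chunks while citing them. The client setup, index name, and model names are again placeholder assumptions, not the workflow's actual configuration.

```python
from openai import OpenAI
from pinecone import Pinecone

openai_client = OpenAI()  # reads OPENAI_API_KEY from the environment
index = Pinecone(api_key="YOUR_PINECONE_API_KEY").Index("documents")  # same placeholder index


def answer_question(question: str, top_k: int = 4) -> str:
    """Embed the question, retrieve the closest chunks, and answer with citations."""
    q_embedding = openai_client.embeddings.create(
        model="text-embedding-3-small",
        input=[question],
    ).data[0].embedding

    results = index.query(vector=q_embedding, top_k=top_k, include_metadata=True)

    # Label each retrieved passage so the model can cite the exact source segment
    context = "\n\n".join(
        f"[{m.metadata['source']} / chunk {m.metadata['chunk']}]\n{m.metadata['text']}"
        for m in results.matches
    )

    completion = openai_client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder chat model
        messages=[
            {
                "role": "system",
                "content": "Answer using only the provided context and cite the chunk labels you relied on.",
            },
            {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"},
        ],
    )
    return completion.choices[0].message.content
```

Labeling each retrieved passage with its source and chunk number is what lets the final answer point back to the specific document segment it drew from.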