You can use LlamaParse with Upstash Vector to parse documents and perform semantic queries on the content. LlamaParse simplifies the extraction of structured information from files, which can then be indexed and queried using Upstash Vector.

Install

pip install llama-index upstash-vector llama-index-vector-stores-upstash python-dotenv

Setup

Create a Vector Index in the Upstash Console. Set the index with:

  • Dimensions: 1536
  • Distance Metric: Cosine

Add the required environment variables to a .env file:

UPSTASH_VECTOR_REST_URL=your_upstash_url
UPSTASH_VECTOR_REST_TOKEN=your_upstash_token
LLAMA_CLOUD_API_KEY=your_llama_cloud_api_key

Usage

Parsing Documents

Use LlamaParse to parse a document. For example:

from llama_parse import LlamaParse
from llama_index.core import SimpleDirectoryReader

# Initialize the parser
parser = LlamaParse(result_type="markdown")

# Parse a document
file_extractor = {".txt": parser}
documents = SimpleDirectoryReader(
    input_files=["./documents/global_warming.txt"],
    file_extractor=file_extractor
).load_data()

Querying the Parsed Content

Once the document is parsed, you can index it using Upstash Vector and query its content:

from llama_index.core import VectorStoreIndex
from llama_index.vector_stores.upstash import UpstashVectorStore
from llama_index.core import StorageContext
from dotenv import load_dotenv
import os

# Load environment variables
load_dotenv()

# Set up Upstash Vector Store
vector_store = UpstashVectorStore(
    url=os.getenv("UPSTASH_VECTOR_REST_URL"),
    token=os.getenv("UPSTASH_VECTOR_REST_TOKEN")
)

# Create storage context and index the parsed document
storage_context = StorageContext.from_defaults(vector_store=vector_store)
index = VectorStoreIndex.from_documents(documents, storage_context=storage_context)

# Perform a query
query_engine = index.as_query_engine()
response = query_engine.query("What is the main topic discussed in the document?")

To learn more, visit the LlamaParse documentation.