Chunking PDFs without breaking sentences and without exhausting the embedding rate limit.
Constraint: HuggingFace Inference rate-limits aggressively, and a 200-page PDF can produce 800+ chunks. Every retried call counts against the same limit. Chunks that are too small lose semantic coherence; chunks that are too large overflow the LLM context window when stitched together.
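The text is split with RecursiveCharacterTextSplitter using chunkSize=1000 and chunkOverlap=200 (both measured in characters by default). Embeddings are requested in batches of 10-20 chunks, with exponential backoff on HTTP 429 (Too Many Requests) responses. Chunking and embedding both run server-side in the API route, never in the browser.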
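A rough back-of-the-envelope check shows how these settings relate to the 800+ chunk figure. This is only a sketch: `estimateChunkCount` and `estimateRequests` are hypothetical helpers, the 3,200 characters per page is an assumed value chosen for illustration, and the formula treats every chunk as exactly `chunkSize` characters. The real splitter prefers separator boundaries, so chunks are usually shorter and the actual count tends to be higher.

```typescript
// Hypothetical helper: approximate chunk count for a fixed-size sliding window.
// Each chunk after the first contributes (chunkSize - chunkOverlap) new characters.
function estimateChunkCount(
  totalChars: number,
  chunkSize = 1000,
  chunkOverlap = 200
): number {
  if (totalChars <= 0) return 0;
  if (totalChars <= chunkSize) return 1;
  const stride = chunkSize - chunkOverlap;
  return Math.ceil((totalChars - chunkOverlap) / stride);
}

// Hypothetical helper: number of embedding requests for a given batch size.
function estimateRequests(chunkCount: number, batchSize: number): number {
  return Math.ceil(chunkCount / batchSize);
}

// Assumption: about 3,200 characters per page, for illustration only.
const estimatedChunks = estimateChunkCount(200 * 3200);
console.log(
  estimatedChunks,                       // 800
  estimateRequests(estimatedChunks, 10), // 80
  estimateRequests(estimatedChunks, 20)  // 40
);
```

Under these assumptions, batching turns roughly 800 single-chunk calls into 40 to 80 requests, which leaves more headroom for retries.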
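// Import path assumes a recent LangChain JS release; older versions export it from "langchain/text_splitter".
import { RecursiveCharacterTextSplitter } from "@langchain/textsplitters";

// Hypothetical wrapper name; the original snippet shows only the function body.
async function chunkText(cleanText: string) {
  // Sizes are in characters (the splitter's default length function).
  const splitter = new RecursiveCharacterTextSplitter({
    chunkSize: 1000,
    chunkOverlap: 200,
  });
  const docs = await splitter.createDocuments([cleanText]);
  // Keep each chunk's position alongside its text.
  return docs.map((doc, index) => ({
    content: doc.pageContent,
    index,
  }));
}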
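The batching and backoff step described above is not shown in the snippet, so here is a minimal sketch of one way it might look. Everything in it is an assumption rather than the project's actual code: `embedInBatches`, the `EmbedBatch` callback type, the default batch size of 16, the retry cap, the base delay, and the convention that a rate-limited call throws an error carrying `status === 429`.

```typescript
// Hypothetical callback: embeds one batch of texts and is assumed to throw
// an error with a numeric `status` field when the provider rejects the call.
type EmbedBatch = (texts: string[]) => Promise<number[][]>;

interface BackoffOptions {
  batchSize?: number;   // the text above uses 10-20
  maxRetries?: number;  // assumed cap on retries per batch
  baseDelayMs?: number; // assumed delay before the first retry
  sleep?: (ms: number) => Promise<void>; // injectable so tests need not wait
}

const defaultSleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

function isRateLimited(err: unknown): boolean {
  return (
    typeof err === "object" &&
    err !== null &&
    (err as { status?: number }).status === 429
  );
}

async function embedInBatches(
  texts: string[],
  embedBatch: EmbedBatch,
  opts: BackoffOptions = {}
): Promise<number[][]> {
  const {
    batchSize = 16,
    maxRetries = 5,
    baseDelayMs = 500,
    sleep = defaultSleep,
  } = opts;
  const vectors: number[][] = [];

  // Batches are sent sequentially so one 429 pauses all further traffic.
  for (let start = 0; start < texts.length; start += batchSize) {
    const batch = texts.slice(start, start + batchSize);
    for (let attempt = 0; ; attempt++) {
      try {
        vectors.push(...(await embedBatch(batch)));
        break; // batch succeeded, move on to the next one
      } catch (err) {
        // Only 429s are retried; other errors and exhausted retries propagate.
        if (!isRateLimited(err) || attempt >= maxRetries) throw err;
        // Delay doubles on each attempt, plus random jitter.
        const jitter = Math.random() * baseDelayMs;
        await sleep(baseDelayMs * 2 ** attempt + jitter);
      }
    }
  }
  return vectors;
}
```

In this sketch the caller would pass the chunk texts (for example `chunks.map((c) => c.content)`) together with a function that wraps the embedding client. Retrying only the failed batch, rather than the whole document, keeps the number of repeated calls small, which matters when each retry counts against the rate limit.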