Document chatbot — multiple files, topics, chat windows and chat history. Powered by GPT.
Updated Jul 21, 2023 - TypeScript
A library supporting NLP and CV research on scientific papers.
Text extraction from multiple and large PDF documents.
A boilerplate solution for processing image and PDF documents for regulated industries, with lineage and pipeline operations metadata services.
Python scripts that convert PDF files to text, split them into chunks, and store their vector representations in a Chroma DB using GPT4All embeddings. A further script queries the Chroma DB with a similarity search based on user input.
An npm package built on top of pdf-lib that provides functionality such as merging, rotating, and splitting PDFs, downloading them to disk, and more.
Built with the pdf-actions npm package.
An all-in-one GUI management toolkit built with PyQt6, offering a suite of tools for file synchronization, media organization, PDF merging, code formatting, and more.
A side project for easily collecting and annotating questions and answers for the PsychometryBot project DB, using computer vision and PDF parsing.
LangGraphRAG: A terminal-based Retrieval-Augmented Generation system using LangGraph. Features include message history caching, query transformation, and vector database retrieval. Ideal for NLP researchers and developers working on advanced conversational AI and information retrieval systems.
A collection of useful mini projects that I worked on while teaching myself Python programming.
A powerful Retrieval Augmented Generation (RAG) application built with NVIDIA AI endpoints and Streamlit. This solution enables intelligent document analysis and question-answering using state-of-the-art language models, featuring multi-PDF processing, FAISS vector store integration, and advanced prompt engineering.
A statistics display and notification app for the COVID-19 pandemic.
Berrylit is a simple chatbot interface that allows users to upload a PDF file and ask a question related to its contents. The chatbot uses the Berri API for processing.
A modern, intelligent invoice processing system with advanced multi-format data extraction capabilities. Process invoices from PDFs, Excel files, and images with smart data recognition.
Azure Document Intelligence Result Processor: A toolset for annotating PDFs based on Azure Document Intelligence analysis results, featuring a React web application and a standalone Python script for processing and visualizing extracted data with confidence indicators.
A powerful Chromium extension that leverages multiple AI APIs to assist with various text operations, image analysis, and PDF processing.
Image Automatic Cropping Watcher: A tool that automatically detects PDF files, converts them to images, corrects perspective distortion, and compiles them back into PDFs.
The Document Summarizer leverages Hugging Face’s facebook/bart-large-cnn model to transform lengthy documents into concise summaries. Built with ReactJS (Vite) for the frontend and Flask for the backend, it supports PDF and text files, offering real-time summarization for researchers, students, and professionals.
The goal of this project is to eliminate the need for paper by digitizing the process of handling client passport information.