Software Development Engineer
Spectraforce
Lake County, Illinois
Job Description
Role: Software Development Engineer
Location: Lake County, IL (Hybrid – 3 days/week onsite)
Duration: 6+ months (possibility of extension)
Job Description:
We are looking for a Software Development Engineer to build and scale an AI-powered document parsing platform that extracts structured data from complex PDFs (pharmaceutical batch records, certificates, regulatory documents) using OCR, LLMs, and RAG. You will work across the full stack — backend AI pipelines, frontend chat interface, and cloud infrastructure.
Roles & Responsibilities:
• Design and develop production-grade RAG (Retrieval-Augmented Generation) pipelines for domain-specific document querying with hybrid search, reranking, and multi-agent answer synthesis
• Build and optimize document processing pipelines using AWS Textract for OCR extraction from tables, handwritten content, and structured forms
• Integrate and orchestrate multiple LLMs (Claude, Gemini) for intent classification, data extraction, validation, and conversational AI
• Develop and maintain the FastAPI backend — REST APIs, streaming endpoints (SSE), authentication, and background task processing
• Build responsive frontend features using Next.js, React, and TypeScript — chat interface, PDF viewer with highlights, real-time progress tracking
• Manage cloud infrastructure on AWS — EC2 deployment, S3 storage, RDS (PostgreSQL), and IAM configuration
• Work with vector databases (Weaviate) and graph databases (Neo4j) for semantic search and structural document querying
• Implement chunking strategies, embedding generation, cross-encoder reranking, and semantic caching for accurate document retrieval
• Deploy and monitor AI models and services in production — model fallback chains, retry mechanisms, error handling
• Write clean, maintainable code with proper logging, error handling, and documentation
Required Skills:
• Python (FastAPI, async programming, pandas)
• TypeScript / React (Next.js)
• RAG systems — vector search, embeddings, chunking, reranking (production-grade)
• LLM integration — prompt engineering, structured output, multi-model orchestration
• AWS — EC2, S3, Textract, RDS
• PostgreSQL
• REST API design with streaming (SSE)
• Git, basic CI/CD, Linux server management
Good to Have:
• Weaviate, Neo4j, or similar vector/graph databases
• Gemini Vision or GPT-4V for document image analysis
• LangChain / LangGraph
• Docker, nginx
• Pharmaceutical/regulated document experience
Experience:
• 3–6 years
At SPECTRAFORCE, we are committed to maintaining a workplace that ensures fair compensation and wage transparency in adherence with all applicable state and local laws. This position’s starting pay is $38.00/hr.