Common Issues¶
Solutions to frequently encountered problems with the New Hires Reporting System.
Service Connection Issues¶
"Backend API is not available"¶
Symptoms: Frontend shows error message, can't upload or validate files
Solutions:
# Check service status
docker-compose -f docker-compose.prod.yml ps
# Check backend health
curl http://localhost:8000/health
# Restart backend
docker-compose -f docker-compose.prod.yml restart backend
# Check backend logs for errors
docker-compose -f docker-compose.prod.yml logs backend --tail=50
Common causes:
- Backend service crashed → Check logs for Python errors
- Database connection failed → Verify database is running
- Port 8000 conflict → Check if another service is using the port
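After a restart, the backend may need a few seconds before its health endpoint responds. The sketch below polls the endpoint until it answers, assuming /health returns a 2xx status once the service is ready. The function name wait_for_health and the HEALTH_POLL_INTERVAL variable are illustrative and not part of the system.

```bash
# Poll a health URL until it responds or the attempts run out.
# $1: URL (defaults to the backend health endpoint used on this page)
# $2: maximum number of attempts (default 30)
wait_for_health() {
  url="${1:-http://localhost:8000/health}"
  tries="${2:-30}"
  i=1
  while [ "$i" -le "$tries" ]; do
    # -f makes curl fail on HTTP errors, -sS hides progress but keeps errors
    if curl -fsS "$url" >/dev/null 2>&1; then
      echo "healthy after $i attempt(s)"
      return 0
    fi
    sleep "${HEALTH_POLL_INTERVAL:-2}"
    i=$((i + 1))
  done
  echo "not healthy after $tries attempts" >&2
  return 1
}

# Example:
#   docker-compose -f docker-compose.prod.yml restart backend && wait_for_health
```

If the function returns non-zero, the backend logs are the next place to look.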
"Workers not processing jobs"¶
Symptoms: Jobs stuck in pending status, no corrections happening
Solutions:
# Check worker status
docker-compose -f docker-compose.prod.yml ps workers
# Check worker logs
docker-compose -f docker-compose.prod.yml logs workers --tail=50
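# Look for errors in worker output, including AWS Bedrock failures
# (2>&1 is needed because docker logs replays the container's stderr on stderr)
docker logs newhires-workers 2>&1 | grep -iE "error|exception"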
# Restart workers
docker-compose -f docker-compose.prod.yml restart workers
Common causes:
- AWS credentials invalid → Check .env file has valid AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY
- Bedrock access denied → Verify IAM policy includes bedrock:InvokeModel permission
- Model access not enabled → Check AWS Bedrock console for model access
- Throttling → Reduce MAX_CONCURRENT_BEDROCK_CALLS in .env
See AWS Bedrock Error Troubleshooting for detailed Bedrock error solutions.
AWS Bedrock Issues¶
"AccessDeniedException" in worker logs¶
Symptoms: Workers show AWS access denied errors
Solutions:
- Verify AWS credentials in .env:
- Check IAM permissions:
  - Go to AWS Console → IAM → Users → Your User → Permissions
  - Ensure policy includes the bedrock:InvokeModel action
  - See AWS Bedrock Setup for correct IAM policy
- Test credentials:
- Restart workers:
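The credential checks in this list can be sketched as follows. The helper loads only the AWS_* entries from .env into a subshell and runs a command with them, so that `aws sts get-caller-identity` (a standard AWS CLI call that reports which identity the keys belong to) can confirm the keys are valid. This assumes the AWS CLI is installed on the host and that .env uses plain KEY=value lines without quotes; with_aws_env is a hypothetical helper name.

```bash
# Run a command with the AWS_* settings from an env file exported.
# The subshell keeps the credentials out of the calling shell.
# $1: env file, remaining arguments: command to run
with_aws_env() {
  (
    env_file="$1"
    shift
    while IFS='=' read -r key value; do
      case "$key" in
        AWS_*) export "$key=$value" ;;  # ignore everything that is not AWS_*
      esac
    done < "$env_file"
    "$@"
  )
}

# Example (prints the account and ARN if the keys are valid):
#   with_aws_env .env aws sts get-caller-identity
#
# Changed credentials reach the workers only when the container is recreated:
#   docker-compose -f docker-compose.prod.yml up -d workers
```

An authentication error from the first command points at the keys themselves, while a success followed by AccessDeniedException in the worker logs points at the IAM policy.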
"Could not resolve foundation model"¶
Symptoms: Workers can't find Claude or Llama model
Solutions:
- Enable model access in AWS Console:
  - Go to: https://console.aws.amazon.com/bedrock/home?region=us-east-1#/modelaccess
  - Click "Manage model access"
  - Enable "Claude Sonnet 4.5" and/or "Llama 4 Scout"
  - Wait for "Access granted" status
- Verify model ID in .env:
- Restart workers:
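To confirm that the configured model ID is one Bedrock recognizes in the region, the ID can be compared against the output of `aws bedrock list-foundation-models`. This is a minimal sketch, assuming the AWS CLI is installed and configured. model_id_listed is a hypothetical helper, and the name of the .env variable that holds the model ID is deployment-specific, so the ID is passed in by hand here. Some models are invoked through an inference profile ID instead of a foundation model ID; this check only covers the latter.

```bash
# Succeeds if the model ID given as $1 appears, as an exact match,
# in the whitespace-separated list of IDs read from stdin.
model_id_listed() {
  tr -s '[:space:]' '\n' | grep -Fxq -- "$1"
}

# Example (replace the placeholder with the ID from your .env):
#   aws bedrock list-foundation-models --region us-east-1 \
#     --query 'modelSummaries[].modelId' --output text \
#     | model_id_listed "<model-id-from-env>" && echo "model ID found"
```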
Database Issues¶
"Database connection failed"¶
Symptoms: Backend can't connect to PostgreSQL
Solutions:
# Check database is running
docker ps | grep newhires-db
# Check database health
docker exec newhires-db pg_isready -U newhires
# Check database logs
docker logs newhires-db --tail=50
# Verify password in .env
grep POSTGRES_PASSWORD .env
# Restart database (WARNING: may cause brief downtime)
docker-compose -f docker-compose.prod.yml restart db
"Database migration failed"¶
Symptoms: Backend won't start, shows Alembic errors
Solutions:
# Check current migration status
docker exec newhires-backend alembic current
# View migration history
docker exec newhires-backend alembic history
# Retry migration
docker exec newhires-backend alembic upgrade head
# If migration is stuck, check logs
docker logs newhires-backend 2>&1 | grep -i alembic
Frontend Issues¶
"Frontend not loading"¶
Symptoms: Browser shows blank page or connection refused
Solutions:
# Check frontend is running
docker ps | grep newhires-frontend
# Check frontend logs
docker logs newhires-frontend --tail=50
# Test frontend locally
curl -I http://localhost:8080
# Check nginx config
docker exec newhires-frontend cat /etc/nginx/conf.d/default.conf
# Restart frontend
docker-compose -f docker-compose.prod.yml restart frontend
"API calls failing from frontend"¶
Symptoms: Frontend loads but shows API errors
Solutions:
- Check VITE_API_URL in .env:
- Rebuild frontend if you changed VITE_API_URL:
- Check backend is reachable from frontend:
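Vite inlines VITE_* variables into the bundle at build time, which is why a changed VITE_API_URL only takes effect after a rebuild. The sketch below reads the configured URL from .env and probes it from the host. It assumes the backend health endpoint is reachable at <VITE_API_URL>/health, which may not hold if the URL carries a path prefix. get_env_value is a hypothetical helper.

```bash
# Print the value of a key from an env file.
# $1: env file, $2: key. The last assignment wins; a trailing CR and
# surrounding double quotes are stripped.
get_env_value() {
  sed -n "s/^$2=//p" "$1" | tail -n 1 | tr -d '\r' | sed 's/^"\(.*\)"$/\1/'
}

# Example:
#   api_url=$(get_env_value .env VITE_API_URL)
#   curl -fsS "$api_url/health"
#
# Rebuild after changing VITE_API_URL. This assumes the compose file
# defines a build for the frontend service rather than a prebuilt image:
#   docker-compose -f docker-compose.prod.yml up -d --build frontend
```

The probe runs from the host, so a URL that only resolves inside the Docker network will fail here even if the browser-facing setup is correct.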
Validation Issues¶
"INVALID_LENGTH" on every line¶
Cause: Line ending issues (Windows CRLF vs Unix LF)
Solution: The system handles this automatically. If the error persists, convert the file to Unix (LF) line endings and upload it again.
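An extra carriage return at the end of every line makes each record one character too long, which matches the symptom of INVALID_LENGTH on every line. The sketch below detects CRLF endings and writes an LF-only copy. The helper names has_crlf and to_lf and the file names in the example are illustrative.

```bash
# Succeeds if any line in the file ends with a carriage return.
has_crlf() {
  cr=$(printf '\r')
  grep -q "${cr}\$" "$1"
}

# Write a copy of $1 to $2 with the trailing CR removed from each line.
# The original file is left untouched.
to_lf() {
  cr=$(printf '\r')
  sed "s/${cr}\$//" "$1" > "$2"
}

# Example:
#   has_crlf newhires.txt && to_lf newhires.txt newhires-lf.txt
```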
"Unknown record type"¶
Cause: File contains header codes not configured in the system
Solution:
- Check which record type is failing in validation results
- Verify file format matches one of the supported state formats
- Contact support if you need a new state format added
Performance Issues¶
"Corrections taking too long"¶
Symptoms: Jobs sit in processing status for extended time
Solutions:
- Check worker load:
- Scale up workers for faster processing:
- Increase concurrency in .env:
  Warning: Higher concurrency = higher AWS costs
- Switch to faster model:
  Llama 4 Scout is ~3x faster than Claude but slightly less accurate
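The load check, scaling, and concurrency steps can be sketched as below. `docker stats` and the `--scale` option of `docker-compose up` are standard, but scaling only works if the workers service does not pin a fixed container_name in docker-compose.prod.yml, which this page does not confirm. set_env_var is a hypothetical helper, and the values 3 and 10 are only examples.

```bash
# Set KEY=VALUE in an env file, replacing any existing assignment.
# $1: env file, $2: key, $3: value
set_env_var() {
  file="$1"
  key="$2"
  value="$3"
  tmp=$(mktemp) || return 1
  # Copy every line except existing assignments of the key
  grep -v "^${key}=" "$file" > "$tmp" || true
  printf '%s=%s\n' "$key" "$value" >> "$tmp"
  # cat instead of mv keeps the original file's permissions and owner
  cat "$tmp" > "$file" && rm -f "$tmp"
}

# Example:
#   docker stats --no-stream newhires-workers          # current CPU and memory
#   docker-compose -f docker-compose.prod.yml up -d --scale workers=3
#   set_env_var .env MAX_CONCURRENT_BEDROCK_CALLS 10
#   docker-compose -f docker-compose.prod.yml up -d workers   # recreate to load .env
```

The same helper can lower MAX_CONCURRENT_BEDROCK_CALLS again if throttling or cost becomes a problem.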
"High AWS Bedrock costs"¶
Symptoms: Unexpected AWS charges
Solutions:
- Check token usage in logs:
- Reduce concurrent calls:
- Reduce retry attempts:
- Switch to cheaper model:
- Set up AWS billing alerts:
  - AWS Console → Billing → Budgets
  - Create alert for Bedrock usage
Docker Issues¶
"Permission denied" errors¶
Solutions:
# Add user to docker group (Linux)
sudo usermod -aG docker $USER
# Log out and back in, then verify
docker ps
"Port already in use"¶
Symptoms: Can't start services, port conflict error
Solutions:
# Find what's using the port
sudo lsof -i :8000 # Backend
sudo lsof -i :8080 # Frontend
# Kill the process or change ports in docker-compose.prod.yml
"Out of disk space"¶
Symptoms: Containers won't start, disk space errors
Solutions:
# Check disk usage
docker system df
# Clean up unused images and containers
docker system prune -a
# Remove old volumes (CAREFUL - deletes data!)
docker volume prune
Environment Variable Issues¶
"Missing required environment variable"¶
Symptoms: Services fail to start with env var errors
Solutions:
- Verify .env file exists:
- Check required variables are set:
- Ensure no placeholder values:
- Verify .env is in same directory as docker-compose.prod.yml:
- Restart services after fixing .env:
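The checks in this list can be combined into one small script. This is a sketch under assumptions: the four variables it tests are the ones named on this page, not necessarily the full required set, and the placeholder patterns are guesses at what a template file might contain. check_env_file is a hypothetical helper. It prints variable names only, never values.

```bash
# Report missing variables and likely placeholder values in an env file.
# $1: env file (default .env). Returns non-zero if anything is wrong.
check_env_file() {
  file="${1:-.env}"
  status=0
  if [ ! -f "$file" ]; then
    echo "missing file: $file"
    return 1
  fi
  # Variables mentioned on this page; extend to match your deployment
  for key in AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY AWS_REGION POSTGRES_PASSWORD; do
    if ! grep -Eq "^${key}=.+" "$file"; then
      echo "unset: $key"
      status=1
    fi
  done
  # Placeholder patterns are assumptions; adjust to your template
  placeholders=$(grep -Ei '^[A-Za-z_]+=.*(changeme|your[-_]|xxx|<[^>]*>)' "$file" | cut -d= -f1 || true)
  for key in $placeholders; do
    echo "placeholder: $key"
    status=1
  done
  return "$status"
}

# Example, run from the directory that holds docker-compose.prod.yml:
#   check_env_file .env && docker-compose -f docker-compose.prod.yml up -d
```

The example uses `up -d` because Compose reads .env when it creates containers, so a plain `restart` keeps the old values.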
Quick Diagnostics¶
Run this comprehensive check to diagnose multiple issues:
#!/bin/bash
# Save as: quick-check.sh
echo "=== Service Status ==="
docker-compose -f docker-compose.prod.yml ps
echo -e "\n=== Backend Health ==="
curl -s http://localhost:8000/health
echo -e "\n=== Frontend Health ==="
curl -sI http://localhost:8080 | head -1
echo -e "\n=== Database Health ==="
docker exec newhires-db pg_isready -U newhires
echo -e "\n=== Worker Status ==="
docker logs newhires-workers --tail=10
echo -e "\n=== Recent Errors ==="
docker logs newhires-workers --tail=50 2>&1 | grep -i error | tail -10
echo -e "\n=== AWS Credentials ==="
docker exec newhires-workers env | grep AWS_REGION
echo -e "\n=== Job Queue Status ==="
docker exec newhires-db psql -U newhires -d newhires -c \
"SELECT status, COUNT(*) FROM correction_jobs GROUP BY status;"
echo -e "\n=== Resource Usage ==="
docker stats --no-stream
Make executable and run:
chmod +x quick-check.sh
./quick-check.sh
Getting More Help¶
Detailed Troubleshooting Guides¶
- AWS Bedrock Errors - Comprehensive Bedrock troubleshooting
- Docker Problems - Docker-specific issues
- Logs & Debugging - How to read and analyze logs
Useful Commands¶
# View all logs
docker-compose -f docker-compose.prod.yml logs -f
# Restart everything
docker-compose -f docker-compose.prod.yml restart
# Fresh start (keeps data)
docker-compose -f docker-compose.prod.yml down
docker-compose -f docker-compose.prod.yml up -d
# Check environment
docker exec newhires-workers env | grep -E "AWS|BEDROCK|POSTGRES"
# Export logs for support
docker logs newhires-backend > backend.log 2>&1
docker logs newhires-workers > workers.log 2>&1
docker logs newhires-frontend > frontend.log 2>&1
Prevention Tips¶
- Monitor worker logs regularly:
- Set up AWS billing alerts to avoid surprise costs
- Backup database regularly:
- Keep .env file secure:
- Update regularly:
  - Check for new IMAGE_TAG from development team
- Rotate AWS credentials every 90 days
- Test after changes:
  - After updating .env, recreate the services with docker-compose -f docker-compose.prod.yml up -d (a plain restart does not reload .env) and verify health endpoints
  - After scaling workers, monitor AWS costs
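The backup and file-permission tips can be sketched as follows, assuming the database and user are both named newhires, as in the psql command under Quick Diagnostics. backup_db is a hypothetical helper. Restore procedures and retention are left out.

```bash
# Dump the database to a timestamped SQL file and print the file name.
# $1: output directory (default: current directory)
backup_db() {
  out_file="${1:-.}/newhires-$(date +%Y%m%d-%H%M%S).sql"
  if docker exec newhires-db pg_dump -U newhires newhires > "$out_file"; then
    echo "$out_file"
  else
    # Do not leave a partial or empty dump behind
    rm -f "$out_file"
    return 1
  fi
}

# Example:
#   backup_db /var/backups
#   chmod 600 .env                      # readable by the owner only
#   docker logs -f --tail 100 newhires-workers   # follow worker logs
```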
Still Having Issues?¶
If problems persist:
- Collect diagnostic information:
- Check documentation:
  - Deployment Overview
  - Environment Variables
- Review recent changes:
  - Did you update .env recently?
  - Did you change IMAGE_TAG?
  - Did you modify docker-compose.prod.yml?
- Contact support with diagnostic logs and details of what changed
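Collecting the diagnostic information can be scripted so that every support request carries the same files. This is a sketch using the container names that appear on this page. collect_diagnostics is a hypothetical helper, and the 500-line limit is arbitrary. It deliberately leaves out .env, since that file holds credentials; review the logs for sensitive data before sending them.

```bash
# Gather recent logs and service status into a directory and archive it.
# $1: output directory name (default: diagnostics-<timestamp>)
collect_diagnostics() {
  out_dir="${1:-diagnostics-$(date +%Y%m%d-%H%M%S)}"
  mkdir -p "$out_dir" || return 1
  for c in newhires-backend newhires-workers newhires-frontend newhires-db; do
    # 2>&1 captures stderr too; a missing container must not stop the loop
    docker logs --tail 500 "$c" > "$out_dir/$c.log" 2>&1 || true
  done
  docker-compose -f docker-compose.prod.yml ps > "$out_dir/services.txt" 2>&1 || true
  tar -czf "$out_dir.tar.gz" "$out_dir" && echo "$out_dir.tar.gz"
}

# Example:
#   collect_diagnostics        # prints the name of the archive it created
```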