# Criteria Manager API - Quality Enhancement Updates

## 🎉 New Endpoints Deployed!

The Criteria Manager API has been updated with **5 new endpoints** for high-quality GPT analysis.

**API URL:** `http://localhost:5002`

---

## ✨ New Quality Analysis Endpoints

### 1. **Structured Output Analysis** (Real-time, High Quality)

**POST** `/api/run-structured-analysis`

Run GPT analysis with guaranteed valid JSON responses and detailed reasoning.

**Request:**
```bash
curl -X POST http://localhost:5002/api/run-structured-analysis
```

**Response:**
```json
{
  "success": true,
  "job_id": "a1b2c3d4",
  "message": "Structured analysis started",
  "features": [
    "100% valid JSON (guaranteed)",
    "Detailed reasoning for each score",
    "60-70% cost savings vs legacy",
    "Farming-specific features"
  ],
  "estimated_time": "5-15 minutes"
}
```

**Benefits:**
- ✅ Zero parsing errors (guaranteed valid JSON)
- ✅ Detailed reasoning for each criterion
- ✅ 60-70% cost reduction
- ✅ Farming-specific feature extraction
- ✅ Better analysis quality

**Cost:** ~$0.0003-0.001 per property

---

### 2. **Batch API - Create** (Prepare Overnight Processing)

**POST** `/api/batch-create`

Create batch input file for overnight processing (50% cost savings).

**Request:**
```bash
curl -X POST http://localhost:5002/api/batch-create
```

**Response:**
```json
{
  "success": true,
  "job_id": "e5f6g7h8",
  "message": "Creating batch input file...",
  "next_step": "Use /api/batch-submit after this completes",
  "estimated_time": "5-10 minutes"
}
```

**What it does:**
- Fetches all properties
- Extracts structured facts
- Creates `batch_analysis_input.jsonl`

---

### 3. **Batch API - Submit** (Send to OpenAI)

**POST** `/api/batch-submit`

Submit batch to OpenAI for overnight processing.

**Request:**
```bash
curl -X POST http://localhost:5002/api/batch-submit
```

**Response:**
```json
{
  "success": true,
  "message": "Batch submitted to OpenAI",
  "next_step": "Check status with /api/batch-status",
  "processing_time": "1-24 hours",
  "cost_savings": "50% vs real-time API"
}
```

**Cost:** ~$0.00015-0.0005 per property (50% savings!)

---

### 4. **Batch API - Status** (Check Progress)

**GET** `/api/batch-status`

Check the status of your batch processing.

**Request:**
```bash
curl http://localhost:5002/api/batch-status
```

**Response:**
```json
{
  "success": true,
  "raw_output": "Batch ID: batch_xyz...\nStatus: completed\nProgress: 100%...",
  "next_step": "If completed, use /api/batch-retrieve"
}
```

**Possible statuses:**
- `validating` - OpenAI is validating the batch
- `in_progress` - Processing (check back later)
- `completed` - Ready to retrieve!
- `failed` - Something went wrong

---

### 5. **Batch API - Retrieve** (Download Results)

**POST** `/api/batch-retrieve`

Download and process completed batch results.

**Request:**
```bash
curl -X POST http://localhost:5002/api/batch-retrieve
```

**Response:**
```json
{
  "success": true,
  "job_id": "i9j0k1l2",
  "message": "Retrieving batch results...",
  "next_step": "Run parse_criteria.py to combine scores"
}
```

**What it does:**
- Downloads results from OpenAI
- Processes into CSV format
- Saves to `analysis_output_batch.csv`

---

## 📊 Updated System Status

**GET** `/api/system-status`

Now includes quality feature detection!

**Response:**
```json
{
  "success": true,
  "api_server": "running",
  "data_freshness": "2025-10-19T12:30:45",
  "data_age_hours": 5.2,
  "active_jobs": 0,
  "auth_status": {
    "logged_in": true,
    "session_age_days": 6.1
  },
  "quality_features": {
    "structured_extraction": true,
    "structured_outputs": true,
    "batch_api": true,
    "csv_sync": true
  },
  "features_enabled": true,
  "timestamp": "2025-10-19T16:55:35"
}
```

**New fields:**
- `quality_features` - Which new features are available
- `features_enabled` - `true` if all quality features detected

---

## 🔄 Complete Workflows

### Workflow 1: Real-Time Structured Analysis

Best for: Immediate results, highest quality

```bash
# 1. Run structured analysis
curl -X POST http://localhost:5002/api/run-structured-analysis
# Response: {"job_id": "abc123", ...}

# 2. Check progress
curl http://localhost:5002/api/job-status/abc123

# 3. When complete, results are in analysis_output_structured.csv
```

**Time:** 5-15 minutes
**Cost:** ~$0.03-0.10 per 100 properties

---

### Workflow 2: Batch API (Overnight Processing)

Best for: Weekly updates, maximum cost savings

```bash
# Sunday evening:

# 1. Create batch input
curl -X POST http://localhost:5002/api/batch-create
# Response: {"job_id": "def456", ...}

# 2. Wait for completion (check job status)
curl http://localhost:5002/api/job-status/def456

# 3. Submit to OpenAI
curl -X POST http://localhost:5002/api/batch-submit

# Monday morning:

# 4. Check if batch is done
curl http://localhost:5002/api/batch-status

# 5. Retrieve results
curl -X POST http://localhost:5002/api/batch-retrieve
# Response: {"job_id": "ghi789", ...}

# 6. Results saved to analysis_output_batch.csv
```

**Time:** 15 min active + 8-24 hours passive
**Cost:** ~$0.015-0.05 per 100 properties (50% savings!)

---

## 🎯 Integration with Full Pipeline

The existing full pipeline endpoint still works:

**POST** `/api/scrape-favorites`

```bash
curl -X POST http://localhost:5002/api/scrape-favorites \
  -H "Content-Type: application/json" \
  -d '{"full_pipeline": true}'
```

This now includes:
- ✅ **Step 2 (NEW!):** Sync CSV to enriched data (fixes new favorites bug)
- ✅ Steps 1-6: Scraping, availability, geocoding
- ✅ Step 7: GPT analysis (still uses legacy by default)
- ✅ Step 8: Parse & combine criteria

**To use structured outputs in the pipeline:**
Replace Step 7 manually or use `/api/run-structured-analysis` separately.

---

## 📈 Cost Comparison

| Method | Endpoint | Cost (100 props) | Time | Quality |
|--------|----------|------------------|------|---------|
| Legacy | `/api/run-analysis` | $0.25-0.30 | 20-30 min | Good |
| Structured | `/api/run-structured-analysis` | $0.03-0.10 | 5-15 min | Excellent |
| Batch API | `/api/batch-*` | $0.015-0.05 | 8-24 hours | Excellent |

**Savings:** Up to 85% with batch API!

---

## 🧪 Test the New Features

### Quick Test:
```bash
# Check if quality features are enabled
curl -s http://localhost:5002/api/system-status | python3 -m json.tool

# Should show:
# "quality_features": {
#   "structured_extraction": true,
#   "structured_outputs": true,
#   "batch_api": true,
#   "csv_sync": true
# },
# "features_enabled": true
```

### Test Structured Analysis:
```bash
# Start a structured analysis
curl -X POST http://localhost:5002/api/run-structured-analysis

# Check progress
curl http://localhost:5002/api/job-status/<job_id>

# View logs
tail -f /tmp/farmmatch_job_<job_id>.log
```

---

## 🔧 Troubleshooting

### "Module not found" errors
The API server requires Flask. It's now configured to use Python 3.9:
```bash
/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.9/Resources/Python.app/Contents/MacOS/Python criteria_api.py
```

### Restart API Server
```bash
# Find PID
ps aux | grep criteria_api | grep -v grep

# Kill old server
kill <PID>

# Start new server
cd scraper
nohup /Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.9/Resources/Python.app/Contents/MacOS/Python criteria_api.py > /tmp/criteria_api.log 2>&1 &

# Check logs
tail -f /tmp/criteria_api.log
```

### Check Server Status
```bash
curl http://localhost:5002/api/system-status
```

---

## 📚 Documentation

- **Full Quality Guide:** [QUALITY_ANALYSIS_README.md](QUALITY_ANALYSIS_README.md)
- **Implementation Summary:** [IMPLEMENTATION_SUMMARY.md](IMPLEMENTATION_SUMMARY.md)
- **Pipeline Diagrams:** [PIPELINE_DIAGRAM.md](PIPELINE_DIAGRAM.md)

---

## 🎉 Summary

**New features deployed:**
✅ 5 new API endpoints
✅ Structured output analysis (100% valid JSON)
✅ Batch API support (50% cost savings)
✅ Quality feature detection in system status
✅ Backward compatible with existing code

**Server running at:** `http://localhost:5002`

**Ready to use!** 🚀

Try the new structured analysis:
```bash
curl -X POST http://localhost:5002/api/run-structured-analysis
```
