# SOP: Research Facility Data (Size, Produce, Employee Count)

**Purpose:** Systematically research and populate cold storage facility data for columns G (Size), H (Produce), and J (Employee Count) in the master prospect database.

**Scope:** All cold storage facilities in the database, prioritizing countries/regions as needed.

**Tools Required:**
- web_search (Brave Search API)
- web_fetch (URL content extraction)
- Google Sheets API access
- Spreadsheet ID: 1uVd-xZFF4TEQGqtvw9z6W8fffeaifPCoLsek83GmEoQ

---

## Column G: Size Classification

### Research Process
1. **Search for facility specifications:**
   - Square meters (sqm)
   - Pallet capacity
   - Cubic meters (m³)
   - Number of cold rooms/CA rooms

2. **Search query examples:**
   - "[Company Name] New Zealand cold storage capacity"
   - "[Company Name] facility size square meters"
   - "[Company Name] pallet capacity"
   - "[Company Name] expansion announcement" (often contains facility specs)

3. **Sources to check (in priority order):**
   - Company website (About, Facilities, Services pages)
   - News articles (expansions, new builds, acquisitions)
   - Industry directories (NZ Cold Storage Association, etc.)
   - LinkedIn company page
   - Government/council permits and planning documents

4. **Classification criteria:**
   - **Small:** <2,000 sqm OR <1,000 pallets
   - **Mid:** 2,000-4,999 sqm OR 1,000-2,499 pallets
   - **Large:** 5,000-9,999 sqm OR 2,500-4,999 pallets
   - **XLarge:** 10,000-19,999 sqm OR 5,000-9,999 pallets
   - **XXLarge:** 20,000+ sqm OR 10,000+ pallets

5. **Data entry:**
   - Enter the classification, followed by the exact figure when one is found: "Mid (3,500 sqm)"
   - Hyperlink the cell to the source of the figure
   - If no data after thorough search: "Unknown - Call"
   - Apply RED background color to "Unknown - Call" cells
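
The classification criteria and cell format above can be sketched in Python. This is a minimal sketch, not part of the existing tooling: `classify_size` and `size_cell_text` are hypothetical names, each band is assumed to include its lower bound (so exactly 5,000 sqm is Large), the larger class is assumed to win when sqm and pallet figures disagree, and the "(6,000 pallets)" suffix is an assumed analogue of the sqm format.

```python
from typing import Optional

# (label, lower bound in sqm, lower bound in pallets), largest class first.
SIZE_BANDS = [
    ("XXLarge", 20_000, 10_000),
    ("XLarge", 10_000, 5_000),
    ("Large", 5_000, 2_500),
    ("Mid", 2_000, 1_000),
    ("Small", 0, 0),
]
ORDER = [label for label, _, _ in reversed(SIZE_BANDS)]  # Small ... XXLarge


def _band(value: float, column: int) -> str:
    """Return the largest band whose lower bound the value reaches."""
    for band in SIZE_BANDS:
        if value >= band[column]:
            return band[0]
    return "Small"


def classify_size(sqm: Optional[float] = None, pallets: Optional[float] = None) -> str:
    """Classify a facility; "Unknown - Call" when no measure was found."""
    labels = []
    if sqm is not None:
        labels.append(_band(sqm, 1))
    if pallets is not None:
        labels.append(_band(pallets, 2))
    if not labels:
        return "Unknown - Call"
    # Assumption: when the two measures disagree, keep the larger class.
    return max(labels, key=ORDER.index)


def size_cell_text(sqm: Optional[float] = None, pallets: Optional[float] = None) -> str:
    """Build the display text for column G, e.g. "Mid (3,500 sqm)"."""
    label = classify_size(sqm, pallets)
    if label == "Unknown - Call":
        return label
    if sqm is not None:
        return f"{label} ({sqm:,.0f} sqm)"
    return f"{label} ({pallets:,.0f} pallets)"
```

Calling `size_cell_text(sqm=3500)` reproduces the "Mid (3,500 sqm)" format from the data entry step; the hyperlink and red background are applied separately when the sheet is updated.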

### Example Searches
```
web_search("Mr Apple New Zealand cold storage capacity")
web_search("Coolpak Nelson facility size")
web_fetch("https://www.mrapple.co.nz/about-us/")
```

---

## Column H: Produce Types

### Research Process
1. **Search for specific produce types stored:**
   - Primary fruits: apples (varieties), citrus, kiwifruit, avocados, cherries, pears
   - Secondary: berries, stone fruit, vegetables, flowers
   - Specific varieties when available (e.g., "Royal Gala, Jazz, Envy apples")

2. **Search query examples:**
   - "[Company Name] apples storage"
   - "[Company Name] produce types"
   - "[Company Name] fruit varieties"
   - "[Company Name] export" (export articles often mention produce)

3. **Sources to check (in priority order):**
   - Company website (Services, What We Do, Products pages)
   - News articles (harvest reports, export stories)
   - Industry reports and case studies
   - LinkedIn company description
   - Customer testimonials/case studies

4. **Format:**
   - Comma-separated list
   - Most prominent to least prominent
   - Be specific when possible: "Apples (Royal Gala, Jazz, Envy), Kiwifruit (Gold, Green), Cherries"
   - Generic acceptable when specific unavailable: "Apples, Citrus, Stone fruit"

5. **Data entry:**
   - Create hyperlink in cell linking to primary source
   - If no data after thorough search: "Unknown - Call"
   - Apply RED background color to "Unknown - Call" cells
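
The produce format above can be expressed as a small helper. This is an illustrative sketch only: `format_produce` is a hypothetical name, and it assumes the researcher has already ordered the produce types from most to least prominent.

```python
from typing import Dict, List


def format_produce(items: Dict[str, List[str]]) -> str:
    """Join produce types in the given order, adding varieties in parentheses.

    `items` maps a produce type to its known varieties (empty list if none).
    Dict insertion order is preserved, so pass the most prominent type first.
    """
    if not items:
        return "Unknown - Call"
    parts = []
    for produce, varieties in items.items():
        if varieties:
            parts.append(f"{produce} ({', '.join(varieties)})")
        else:
            parts.append(produce)
    return ", ".join(parts)
```

For instance, `format_produce({"Apples": ["Royal Gala", "Jazz", "Envy"], "Kiwifruit": ["Gold", "Green"], "Cherries": []})` yields the specific-format string shown in the Format step.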

### Example
```
web_search("LeaderBrand New Zealand vegetables")
web_fetch("https://www.leaderbrand.co.nz/")
Result: "Lettuce, Broccoli, Cauliflower, Leafy greens"
```

---

## Column J: Employee Count

### Research Process
1. **Primary source: LinkedIn company page**
   - Usually the most current public source
   - Shows a current employee count or company-size range
   - Search: "[Company Name] LinkedIn"

2. **Alternative sources:**
   - Company website (About Us, Team pages)
   - Industry directories
   - New Zealand Companies Office (for NZ-registered companies)
   - News articles mentioning staff numbers

3. **Search query examples:**
   - "[Company Name] LinkedIn"
   - "[Company Name] employees"
   - "[Company Name] staff"

4. **Data entry:**
   - Use specific number when available: "45"
   - If a range is given (e.g., "50-100"), enter the midpoint ("75") and note the original range in the Notes column
   - Note source and date if available in Notes: "LinkedIn, Feb 2026"
   - If no reliable data: leave blank (don't guess)
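
The data entry rules for employee counts can be sketched as follows. This is a hedged illustration, not an established helper: `normalise_employee_count` is a hypothetical name, the note wording is assumed, and the midpoint is rounded down to a whole number.

```python
import re
from typing import Optional, Tuple


def normalise_employee_count(raw: str) -> Tuple[Optional[str], str]:
    """Return (cell value, note); the cell value is None when nothing is usable."""
    # Pull out whole numbers, tolerating thousands separators such as "1,200".
    numbers = [int(n.replace(",", "")) for n in re.findall(r"\d[\d,]*", raw)]
    if len(numbers) == 1:
        return str(numbers[0]), ""
    if len(numbers) == 2:
        low, high = sorted(numbers)
        # A range: enter the midpoint and keep the range for the Notes column.
        return str((low + high) // 2), f"Range reported: {low}-{high}"
    # No number, or too many to interpret: leave the cell blank rather than guess.
    return None, ""
```

A `None` cell value corresponds to leaving column J blank; the source and date (e.g. "LinkedIn, Feb 2026") still need to be added to Notes by the researcher.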

### Example
```
web_search("Mr Apple NZ LinkedIn")
web_fetch("https://www.linkedin.com/company/mr-apple-nz/")
Result: "350" (noted as "LinkedIn, Feb 2026" in Notes)
```

---

## Workflow for Batch Research

### Step 1: Prepare
1. Load current sheet data from Google Sheets
2. Identify range to research (e.g., rows 2-51 for first 50 NZ companies)
3. Note any companies already researched (to avoid duplication)
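
One way to find the rows still needing research is to scan the loaded values for empty Size or Produce cells. The sketch below is an assumption about how that check might look: `rows_needing_research` is a hypothetical name, rows are assumed to arrive as lists of cell strings starting at column A, and column J is deliberately ignored because a blank employee count is a valid final state.

```python
from typing import List

# Zero-based indexes for columns G and H in a row read as a list of cells.
COL_G, COL_H = 6, 7


def rows_needing_research(rows: List[List[str]], first_row: int = 2) -> List[int]:
    """Return sheet row numbers where Size (G) or Produce (H) is still empty.

    `first_row` is the sheet row number of rows[0] (2 if row 1 is the header).
    """
    pending = []
    for offset, row in enumerate(rows):
        # Rows read from the Sheets API omit trailing blank cells, so pad first.
        padded = row + [""] * (COL_H + 1 - len(row))
        if not padded[COL_G].strip() or not padded[COL_H].strip():
            pending.append(first_row + offset)
    return pending
```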

### Step 2: Research Loop
For each company:
1. **Size research** (Column G)
   - Run 2-3 targeted searches
   - Fetch promising URLs
   - Classify based on criteria
   - Note source URL

2. **Produce research** (Column H)
   - Run 2-3 targeted searches
   - Fetch promising URLs
   - List produce types (most to least prominent)
   - Note source URL

3. **Employee research** (Column J)
   - Check LinkedIn first
   - Fall back to other sources if needed
   - Record number and source

4. **Update Notes** (Column L)
   - Add any relevant context found during research
   - Note sources for employee count
   - Flag any concerns or interesting findings
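
The "2-3 targeted searches" per column can be generated from the query examples listed earlier in this SOP. A minimal sketch, assuming a hypothetical `build_queries` helper; the resulting strings would then be passed to `web_search`.

```python
from typing import Dict, List

# Query templates copied from the search query examples in this SOP.
QUERY_TEMPLATES: Dict[str, List[str]] = {
    "size": [
        "{name} New Zealand cold storage capacity",
        "{name} facility size square meters",
        "{name} pallet capacity",
        "{name} expansion announcement",
    ],
    "produce": [
        "{name} apples storage",
        "{name} produce types",
        "{name} fruit varieties",
        "{name} export",
    ],
    "employees": [
        "{name} LinkedIn",
        "{name} employees",
        "{name} staff",
    ],
}


def build_queries(company: str, column: str, limit: int = 3) -> List[str]:
    """Return up to `limit` search queries for one company and one column."""
    return [t.format(name=company) for t in QUERY_TEMPLATES[column][:limit]]
```

The country-specific first template would need adjusting when researching facilities outside New Zealand.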

### Step 3: Update Sheet
1. Prepare batch update with all findings
2. Create hyperlinks for Size and Produce columns
3. Apply RED background formatting to any "Unknown - Call" cells
4. Update Notes column with additional context
5. Execute single batch update to Google Sheets
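
The hyperlink and red-background steps can be combined into one request body for the Sheets API `spreadsheets.batchUpdate` method. The sketch below only builds the body and makes no network call; `cell_request` and `batch_body` are hypothetical helpers, and `sheet_id` is the numeric ID of the tab (not the spreadsheet ID listed under Tools Required).

```python
from typing import Any, Dict, List, Optional

RED = {"red": 1.0, "green": 0.0, "blue": 0.0}


def cell_request(sheet_id: int, row_index: int, col_index: int,
                 text: str, url: Optional[str] = None) -> Dict[str, Any]:
    """Build one updateCells request; row and column indexes are zero-based."""
    if url:
        # Double any quotes so they survive inside the formula's string literals.
        safe_url = url.replace('"', '""')
        safe_text = text.replace('"', '""')
        value = {"formulaValue": f'=HYPERLINK("{safe_url}","{safe_text}")'}
    else:
        value = {"stringValue": text}
    cell: Dict[str, Any] = {"userEnteredValue": value}
    if text == "Unknown - Call":
        cell["userEnteredFormat"] = {"backgroundColor": RED}
    return {
        "updateCells": {
            "range": {
                "sheetId": sheet_id,
                "startRowIndex": row_index,
                "endRowIndex": row_index + 1,
                "startColumnIndex": col_index,
                "endColumnIndex": col_index + 1,
            },
            "rows": [{"values": [cell]}],
            # Background is in the mask, so cells without RED have it cleared.
            "fields": "userEnteredValue,userEnteredFormat.backgroundColor",
        }
    }


def batch_body(requests: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Wrap all requests so they are sent as a single batch update."""
    return {"requests": requests}
```

With the google-api-python-client library, the body would typically be sent as `service.spreadsheets().batchUpdate(spreadsheetId=..., body=batch_body(requests)).execute()`, which applies every cell change in one call.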

### Step 4: Review
1. Verify all hyperlinks work
2. Confirm red formatting applied correctly
3. Check for any companies that should be moved to "Non Cool Rooms" sheet
4. Document findings in session notes

---

## Quality Standards

### Required
- **Always hyperlink sources** for Size and Produce
- **Never guess or fabricate data** - use "Unknown - Call" instead
- **Verify company is actually a cold storage facility** - move non-facilities to "Non Cool Rooms"
- **Apply red background** to all "Unknown - Call" entries

### Best Practices
- Cross-reference multiple sources when possible
- Note conflicting information in Notes column
- Flag companies for potential removal (rental companies, packaging-only, etc.)
- Document impressive/notable facilities in Notes (e.g., "NZ's largest integrated apple grower")
- Track common sources for future efficiency (e.g., NZ Cold Storage Association directory)

---

## Common Issues

### Company has no direct website
- **Solution:** Research parent company, add parent URL with explanation in Notes
- **Example:** H1 Packhouse → kaiaponi.co.nz (parent company) with note "H1 is a division of Kaiaponi Farms"

### Company uses external cool storage
- **Action:** Move to "Non Cool Rooms" sheet
- **Example:** CAJ Apples uses T&G coolstores (doesn't own facilities)

### Company only rents/leases cool rooms
- **Action:** Move to "Non Cool Rooms" sheet
- **Example:** Cool Lease (rental company, not operator)

### Company only provides packaging/packhouse services
- **Action:** Move to "Non Cool Rooms" sheet
- **Example:** Mt. Erin Fruit Services (packaging, not storage)

### Facility for sale or uncertain status
- **Action:** Note in Notes column, flag for Jonny to decide on retention
- **Example:** Maleme Coldstorage (property for sale)

---

## Completion Checklist

After completing a batch:
- [ ] All Size entries have hyperlinks or "Unknown - Call"
- [ ] All Produce entries have hyperlinks or "Unknown - Call"
- [ ] All "Unknown - Call" cells have RED background
- [ ] Employee counts recorded where available
- [ ] Non-facilities identified and moved to "Non Cool Rooms" sheet
- [ ] Notable findings documented in Notes
- [ ] Google Sheet updated with all changes
- [ ] Summary provided to Jonny with:
  - Number of companies researched
  - Number of non-facilities moved
  - Count of "Unknown - Call" entries (red flags)
  - Notable prospects identified
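
The summary figures can be tallied from the batch findings rather than counted by hand. This is a sketch under stated assumptions: the `Finding` record and `summarise` function are hypothetical, companies moved to "Non Cool Rooms" are assumed to still count as researched, and "Unknown - Call" entries are counted per cell across the Size and Produce columns.

```python
from dataclasses import dataclass
from typing import Any, Dict, List

UNKNOWN = "Unknown - Call"


@dataclass
class Finding:
    company: str
    size: str = UNKNOWN
    produce: str = UNKNOWN
    moved_to_non_cool_rooms: bool = False
    notable: bool = False


def summarise(findings: List[Finding]) -> Dict[str, Any]:
    """Tally the four figures requested in the completion summary."""
    kept = [f for f in findings if not f.moved_to_non_cool_rooms]
    return {
        "researched": len(findings),
        "moved": len(findings) - len(kept),
        # Count red cells, so a company unknown in both columns counts twice.
        "unknown_call": sum((f.size == UNKNOWN) + (f.produce == UNKNOWN) for f in kept),
        "notable": [f.company for f in kept if f.notable],
    }
```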

---

**Created:** 2026-02-11  
**Last Updated:** 2026-02-11  
**Version:** 1.0
