Ticket: HEL-678
Date: 2026-04-07
next_field_registry (property names don't match columns in next_hubspot_contacts)canonical_keynext_student_profile_data for the 63 matched studentsnext_hubspot_contacts (e.g., act_score__c → column is act_score)next_student_profile_data has only 1 row — all other students fall through to reading HubSpot/legacy on every request. Backfilling freezes a PostgreSQL snapshot and makes the portal the source of truth going forwardnext_student_profile_datanext_hubspot_contacts.student_id)readLegacyProfile, but that's a follow-on)Before writing any SQL, run this to confirm the 11 broken mappings map to the right columns:
SELECT column_name FROM information_schema.columns
WHERE table_name = 'next_hubspot_contacts'
AND column_name IN (
'act_score','sat_score','student_engagement','total_children_college',
'adjusted_gross_income','college_budget_notes','merit_aid_potential',
'total_funds_saved','how_did_you_hear','referral_type','referred_by'
)
ORDER BY column_name;
Expected: 11 rows. If any are missing, stop and investigate before proceeding.
For goals.special_circumstances (property special_circumstances__check_all_that_apply_): no matching column exists in next_hubspot_contacts. Check raw_properties JSONB or remove the HubSpot source mapping entirely. Do not guess.
Update the property field in source_mappings to use the actual column name in next_hubspot_contacts. After this fix, the normalization in canonical-profile.ts becomes identity (column names are already normalized).
-- Template: replace hubspot source entry's property with the correct column name
UPDATE next_field_registry
SET source_mappings = (
SELECT jsonb_agg(
CASE WHEN m->>'source' = 'hubspot'
THEN m || '{"property": "<column_name>"}'
ELSE m
END
)
FROM jsonb_array_elements(source_mappings) m
)
WHERE field_key = '<field_key>';
| field_key | old property | new property (= column name) |
|---|---|---|
academic.act_score |
act_score__c |
act_score |
academic.sat_score |
sat_score__c |
sat_score |
activities.student_engagement |
student_engagement__sports__clubs__work__service_ |
student_engagement |
family.total_children_in_college |
total___of_children_going_to_college |
total_children_college |
financial.agi_range |
to_help_assess_your_potential_for_need_based_financial_aid___what_is_your_family_s_adjusted_gross_income__agi__range_ |
adjusted_gross_income |
financial.college_budget_notes |
college_budget_considerations |
college_budget_notes |
financial.merit_aid_potential |
assess_your_potential_for_merit_based_financial_aid |
merit_aid_potential |
financial.total_funds_saved |
total_funds_saved_for_college |
total_funds_saved |
goals.how_did_you_hear |
how_did_you_hear_about_hellocollege_ |
how_did_you_hear |
goals.referral_type |
referral_ |
referral_type |
goals.referred_by |
referred_by_family_contact |
referred_by |
goals.special_circumstances |
special_circumstances__check_all_that_apply_ |
investigate first |
After running, verify with:
SELECT field_key, source_mappings
FROM next_field_registry
WHERE field_key IN ('academic.act_score','academic.sat_score','financial.total_funds_saved')
Confirm property values match column names.
Also update prisma/seed-field-registry.js to reflect corrected property names.
Insert into next_field_registry (and update prisma/seed-field-registry.js):
| field_key | display_name | category | data_type | vis | is_active | source_mappings |
|---|---|---|---|---|---|---|
academic.target_act_score |
Target ACT Score | academic | number | all | false | none |
academic.target_sat_score |
Target SAT Score | academic | number | all | false | none |
academic.target_gpa |
Target GPA | academic | text | all | false | none |
academic.class_rank |
Class Rank | academic | text | all | false | none |
academic.ap_ib_courses |
AP / IB Courses Taken | academic | textarea | all | false | none |
academic.ap_ib_course_count |
AP / IB Course Count | academic | number | all | false | none |
college_selection.committed_school |
Committed School | college_selection | text | all | false | none |
goals.personal_brand |
Personal Brand / Theme | goals | textarea | all | false | none |
goals.stand_out_factor |
Stand Out Factor | goals | text | all | false | none |
activities.summer_programs |
Summer Programs / Camps | activities | textarea | all | false | none |
All start is_active: false, no source_mappings. Data will accumulate via canonical writes from task forms.
Add canonical_key to field_schema JSONB entries. For fields appearing in 2+ tasks, first confirm exact task names:
SELECT name FROM next_task_types WHERE field_schema @> '[{"id":"<field_id>"}]';
Then apply:
UPDATE next_task_types
SET field_schema = (
SELECT jsonb_agg(
CASE WHEN f->>'id' = '<field_id>'
THEN f || '{"canonical_key": "<registry_key>"}'
ELSE f
END
)
FROM jsonb_array_elements(field_schema) f
)
WHERE name = '<task_name>';
| field_id | → canonical_key | task(s) |
|---|---|---|
rigorAssessment |
academic.academic_rigor |
11th Grade Course Selection |
extracurriculars |
activities.extracurriculars |
Activities & Honors Workshop |
honors |
activities.awards |
Activities & Honors Workshop |
leadershipRoles |
activities.leadership |
Activities & Honors Workshop |
communityService |
activities.community_service |
Develop Resume |
workExperience |
activities.employment |
Develop Resume |
primarySport |
activities.sports |
Develop Resume |
interestAreas |
activities.career_interests |
Complete YouScience & Assessment |
careerMatches |
activities.career_interests |
YouScience Reviewed 1:1 |
careerPaths |
activities.career_interests |
Major and Career Exploration Workshop |
topAptitudes |
youscience.top_aptitudes |
confirm 2 task names |
efc |
financial.efc |
confirm 2 task names |
outOfPocket |
financial.parent_willing_pay |
Private Budget Meeting |
needBasedAid |
financial.need_based_fa_qualify |
Private Budget Meeting |
primaryMajor |
academic.potential_majors |
General Major Identified |
secondaryMajor |
academic.potential_majors |
General Major Identified |
topMajors |
academic.potential_majors |
Major and Career Exploration Workshop |
firstChoiceMajor |
academic.potential_majors |
Major by College Identified |
secondChoiceMajor |
academic.potential_majors |
Major by College Identified |
mathPlacement |
academic.survey_current_math_class |
confirm 2 task names |
areasToImprove |
academic.areas_for_improvement |
Complete My Common App Personal Statement |
improvementAreas |
academic.areas_for_improvement |
BSMD Mock Interview 1, Mock College Interview Meeting |
scripts/backfill-hubspot-canonical.js/**
* Backfill HubSpot data into next_student_profile_data.
*
* For each of the 63 students with a matched HubSpot contact:
* 1. Read their next_hubspot_contacts row
* 2. Map each registry HubSpot field to its value
* 3. Skip nulls, empty strings, and zero scores ("0")
* 4. Merge into next_student_profile_data — NEVER overwrite existing portal values
* 5. Upsert the row
*
* Safe to re-run. Idempotent.
*/
require('dotenv').config({ path: '.env' });
const { Pool } = require('pg');
const pool = new Pool({ connectionString: process.env.DATABASE_URL, max: 3, ssl: { rejectUnauthorized: false } });
const ZERO_SKIP_FIELDS = ['act_score', 'sat_score', 'unweighted_gpa', 'weighted_gpa'];
async function main() {
// 1. Get all active registry fields with hubspot sources
const regResult = await pool.query(`
SELECT field_key, source_mappings FROM next_field_registry
WHERE source_mappings IS NOT NULL
AND source_mappings @> '[{"source":"hubspot"}]'
`);
// Build map: column_name -> field_key
const colToField = new Map();
for (const row of regResult.rows) {
const mapping = row.source_mappings.find(m => m.source === 'hubspot');
if (mapping?.property) colToField.set(mapping.property, row.field_key);
}
// 2. Get all matched students
const studentsResult = await pool.query(`
SELECT student_id FROM next_hubspot_contacts
WHERE match_status = 'matched' AND student_id IS NOT NULL
`);
let inserted = 0, updated = 0, skipped = 0;
for (const { student_id } of studentsResult.rows) {
// 3. Read HubSpot row
const hsResult = await pool.query(
'SELECT * FROM next_hubspot_contacts WHERE student_id = $1 AND match_status = \'matched\' LIMIT 1',
[student_id]
);
if (!hsResult.rows.length) continue;
const hs = hsResult.rows[0];
// 4. Build HubSpot data map
const hsData = {};
const hsSources = {};
for (const [col, fieldKey] of colToField) {
const val = hs[col];
if (val === null || val === undefined || val === '') continue;
if (ZERO_SKIP_FIELDS.includes(col) && (val === '0' || val === 0)) continue;
hsData[fieldKey] = val;
hsSources[fieldKey] = { source_type: 'hubspot', property: col };
}
if (!Object.keys(hsData).length) { skipped++; continue; }
// 5. Read existing canonical row
const existing = await pool.query(
'SELECT data, sources FROM next_student_profile_data WHERE student_id = $1',
[student_id]
);
let data, sources;
if (existing.rows.length) {
// Merge: HubSpot fills in only missing keys
const existingData = existing.rows[0].data || {};
const existingSources = existing.rows[0].sources || {};
data = { ...hsData, ...existingData }; // existingData wins
sources = { ...hsSources, ...existingSources };
await pool.query(
'UPDATE next_student_profile_data SET data = $1, sources = $2, updated_at = NOW() WHERE student_id = $3',
[JSON.stringify(data), JSON.stringify(sources), student_id]
);
updated++;
} else {
data = hsData;
sources = hsSources;
await pool.query(
'INSERT INTO next_student_profile_data (student_id, data, sources, created_at, updated_at) VALUES ($1, $2, $3, NOW(), NOW())',
[student_id, JSON.stringify(data), JSON.stringify(sources)]
);
inserted++;
}
}
console.log(`Done. Inserted: ${inserted}, Updated: ${updated}, Skipped (no data): ${skipped}`);
console.log(`Total next_student_profile_data rows now: ${inserted + updated + (existing count)}`);
await pool.end();
}
main().catch(e => { console.error(e.message); process.exit(1); });
Run with:
node scripts/backfill-hubspot-canonical.js
Expected output: ~62 inserted, 1 updated (the existing row), ~0 skipped.
After running, verify:
SELECT COUNT(*) FROM next_student_profile_data;
-- Expect ~63
SELECT student_id, jsonb_object_keys(data) as key
FROM next_student_profile_data
LIMIT 50;
-- Should show field_key entries like academic.weighted_gpa, academic.potential_majors etc
academic.act_score, academic.sat_score, financial.total_funds_saved — should now show real numbers (source_mappings fixed)extracurriculars should pre-fill from canonical storenext_student_profile_datascripts/backfill-hubspot-canonical.js — new batch backfill scriptprisma/seed-field-registry.js — update 11 property names + add 11 Cat 3A entriesnext_field_registry — UPDATE 11 source_mappings + INSERT 11 new fieldsnext_task_types.field_schema — UPDATE ~25 field entries with canonical_keySELECT COUNT(*) FROM next_field_registry WHERE field_key LIKE 'academic.target%' ... returns 10 new Cat 3A fieldsSELECT source_mappings FROM next_field_registry WHERE field_key='academic.act_score' shows property: "act_score"SELECT COUNT(*) FROM next_student_profile_data is ~63 after backfillcanonical_key count in task field_schemas increases from ~15 to ~37next_student_profile_dataprisma/seed-field-registry.js updated to match DB state