Internal chat interface at (portal)/knowledge/ that lets employees ask questions
about both institutional knowledge (emails, docs, meeting records) and
live operational data (student caseloads, hours, tasks, college lists).
Claude decides which tools to use based on the question, executes them, and
synthesizes the response.
Browser (employee)
→ Google SSO (existing NextAuth, @sayhellocollege.com only)
→ /knowledge page (chat UI)
→ POST /api/knowledge/chat { message, history }
→ Claude API with tool_use enabled
┌─ Tool: search_knowledge
│ Embed query → Supabase pgvector → relevant docs
├─ Tool: query_database
│ Claude writes read-only SQL → Prisma $queryRawUnsafe → results
└─ Claude synthesizes final response from tool results
→ ReadableStream back to client
Claude acts as a router: it reads the question, decides whether it needs the
knowledge base, the database, or both, calls the appropriate tools, and synthesizes.
# ── Knowledge Chat ────────────────────────────────────────
# Supabase hc-knowledge project (separate from main DB)
KNOWLEDGE_SUPABASE_URL=https://appnstdqtjwrcxzwpqnn.supabase.co
KNOWLEDGE_SUPABASE_ANON_KEY=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJzdXBhYmFzZSIsInJlZiI6ImFwcG5zdGRxdGp3cmN4endwcW5uIiwicm9sZSI6ImFub24iLCJpYXQiOjE3NzM4NzkwODQsImV4cCI6MjA4OTQ1NTA4NH0.x475drKnKFrrN7RCQI2R6i_rlpZGRQEK16OfWGCsfw4
# Anthropic API
ANTHROPIC_API_KEY=
# DATABASE_URL already exists in .env — used by Prisma for the operational DB
Already created. One table, one RPC function:
-- Table
corpus_chunks (
id bigserial primary key,
source_path text not null,
source_type text not null, -- 'gdrive', 'email', 'comments', 'transcripts'
counselor_email text,
student_email text,
chunk_index integer not null,
chunk_text text not null,
embedding vector(1024),
role text, -- 'College Counselor', 'Essay Coach', 'Tutor'
created_at timestamptz
)
-- RPC function for vector search
match_chunks(
query_embedding vector(1024),
match_count int default 20,
filter_source_type text default null -- null = all
) returns table (id, source_path, source_type, counselor_email,
student_email, chunk_text, similarity float)
app/(portal)/knowledge/
├── page.tsx # Server component — auth gate + layout
└── chat.tsx # Client component — chat UI
app/api/knowledge/
└── chat/
└── route.ts # POST handler — Claude tool-use loop → stream
lib/
├── knowledge.ts # Supabase client + search helper
├── knowledge-db.ts # Read-only SQL execution against operational DB
└── embed.ts # Local ONNX embedding (mxbai-embed-large)
app/(portal)/knowledge/page.tsx<KnowledgeChat /> client componentapp/(portal)/knowledge/chat.tsxClient component ("use client").
#f8f7fc bg, user bubble uses #281d51messages: Array<{ role: 'user' | 'assistant', content: string }>isStreaming: booleanerror: string | null/api/knowledge/chat with { message, history }app/api/knowledge/chat/route.tsPOST handler. Auth-gated (check NextAuth session, reject if not
@sayhellocollege.com).
{
message: string; // Current user question
history: Array<{ // Previous turns (max 10)
role: 'user' | 'assistant';
content: string;
}>;
}
This route uses Claude's tool_use capability. Claude decides which tools to
call based on the user's question.
import Anthropic from '@anthropic-ai/sdk';
const anthropic = new Anthropic();
// 1. Send message with tools defined
let response = await anthropic.messages.create({
model: 'claude-sonnet-4-6',
max_tokens: 4096,
system: SYSTEM_PROMPT,
tools: [SEARCH_KNOWLEDGE_TOOL, QUERY_DATABASE_TOOL],
messages: [...history, { role: 'user', content: message }],
});
// 2. Tool use loop — Claude may call tools, we execute and feed results back
while (response.stop_reason === 'tool_use') {
const toolUseBlocks = response.content.filter(b => b.type === 'tool_use');
const toolResults = [];
for (const toolUse of toolUseBlocks) {
let result: string;
if (toolUse.name === 'search_knowledge') {
result = await executeKnowledgeSearch(toolUse.input);
} else if (toolUse.name === 'query_database') {
result = await executeDatabaseQuery(toolUse.input);
}
toolResults.push({
type: 'tool_result',
tool_use_id: toolUse.id,
content: result,
});
}
// Feed results back and get next response (may call more tools or give final answer)
response = await anthropic.messages.create({
model: 'claude-sonnet-4-6',
max_tokens: 4096,
system: SYSTEM_PROMPT,
tools: [SEARCH_KNOWLEDGE_TOOL, QUERY_DATABASE_TOOL],
messages: [
...history,
{ role: 'user', content: message },
{ role: 'assistant', content: response.content },
{ role: 'user', content: toolResults },
],
});
}
// 3. Stream the final text response back to the client
// (For streaming, use anthropic.messages.stream() on the final call)
const SEARCH_KNOWLEDGE_TOOL = {
name: 'search_knowledge',
description: `Search the HelloCollege knowledge base for information from
counselor emails, Google Drive documents, Zoom transcripts, and meeting notes.
Use this for questions about procedures, best practices, how counselors handle
specific situations, communication patterns, and institutional knowledge.`,
input_schema: {
type: 'object',
properties: {
query: {
type: 'string',
description: 'Natural language search query',
},
source_type: {
type: 'string',
enum: ['email', 'gdrive', 'comments', 'transcripts'],
description: 'Optional: filter by source type',
},
max_results: {
type: 'number',
description: 'Number of results (default 15, max 30)',
},
},
required: ['query'],
},
};
const QUERY_DATABASE_TOOL = {
name: 'query_database',
description: `Execute a read-only SQL query against the HelloCollege operational
database. Use this for questions about student counts, caseloads, hours,
task status, college lists, meeting schedules, and other structured data.
IMPORTANT: Only SELECT queries are allowed. Never INSERT, UPDATE, DELETE,
DROP, ALTER, or TRUNCATE.`,
input_schema: {
type: 'object',
properties: {
sql: {
type: 'string',
description: 'A read-only SELECT query',
},
explanation: {
type: 'string',
description: 'Brief explanation of what this query does (for logging)',
},
},
required: ['sql', 'explanation'],
},
};
You are a HelloCollege internal assistant. You help counselors, essay coaches,
and tutors with both knowledge questions and data lookups.
You have two tools:
1. search_knowledge — searches historical emails, Drive docs, Zoom transcripts,
and meeting notes. Use for: procedures, best practices, how-to questions,
communication patterns, institutional knowledge.
2. query_database — runs read-only SQL against the operational database.
Use for: student counts, caseloads, hours, task status, college lists,
meeting schedules, and any question that needs current data.
Key tables and relationships:
- students (id, person_id) → people (id, first_name, last_name)
- employees (id, person_id, email) → people (id, first_name, last_name)
- team_members (id, student_id, employee_id, role_id)
role_id: 1=College Counselor, 4=Essay Coach, 6=Tutor
- team_member_roles (id, name)
- assigned_tasks (id, student_id, task_id, status)
status: 0=Not started, 1=Registered, 2=In Progress, 3=Completed, 4=Opted out, 5=N/A
- tasks (id, name, code, task_type) — task_type: '1-1 Meeting' or 'Workshop'
- comments (id, commentable_id, commentable_type, person_id, comment, note_type, comment_date)
- student_college_decisions (student_id, college_id, decision_status)
- selected_colleges (id, student_id, college_id)
- colleges (id, name)
- student_hours_infos (id, student_id, total_hours, used_hours)
- student_hours_transaction_logs (id, student_hours_info_id, hours, description, created_at)
- ycbm_event_bookings (id, student_id, employee_id, starts_at, ends_at, status)
- families (id), family_relations (family_id, person_id, relation_type)
- student_scores (id, student_id, score_type, score_value)
- selected_services (id, student_id, pfa_service_id)
- pfa_services (id, name, service_type)
- programs (id, name)
Key views (use these when they fit — they handle complex joins):
- v_student_health_status — healthy/not healthy by contact recency
- v_student_enrollment_info — enrollment status
- v_student_last_counselor_meeting — last meeting date per student
- v_next_meeting_per_student — next scheduled meeting
- v_consumed_hours — hours used vs total
- v_current_team_members — active team member assignments
- v_current_staff_per_student — current staff with roles
IMPORTANT: Only write SELECT queries. LIMIT results to 100 rows max.
When a user refers to someone by first name or nickname, search the
people table to resolve to a full record before joining.
Guidelines:
- Decide which tool(s) to use based on the question. Use both when needed.
- Be concise and actionable. Counselors are busy.
- For data queries, present results in a clear format (tables, counts, lists).
- Never share student information with other students or external parties.
This is an internal-only tool.
- If a query returns no results, suggest alternative searches or phrasing.
- When citing knowledge base results, mention the source type and counselor.
lib/embed.tsUses @huggingface/transformers to run mxbai-embed-large locally via ONNX.
This is the same model used to build the corpus (via Ollama), so vectors are
guaranteed compatible. No external API needed.
import { pipeline } from '@huggingface/transformers';
// Singleton — model loads once on first request, cached for subsequent calls.
// First request will be slow (~3-5s model download + init).
// Subsequent requests embed in ~200-400ms on CPU.
let embedder: any = null;
async function getEmbedder() {
if (!embedder) {
embedder = await pipeline(
'feature-extraction',
'mixedbread-ai/mxbai-embed-large-v1',
{ dtype: 'fp32' }
);
}
return embedder;
}
export async function embedQuery(text: string): Promise<number[]> {
const embed = await getEmbedder();
const result = await embed(text, { pooling: 'cls', normalize: true });
return Array.from(result.data);
}
/tmpmaxDuration: 30 and Vercel Pro for the first-requestlib/knowledge.tsimport { createClient } from '@supabase/supabase-js';
import { embedQuery } from './embed';
const supabase = createClient(
process.env.KNOWLEDGE_SUPABASE_URL!,
process.env.KNOWLEDGE_SUPABASE_ANON_KEY!
);
export async function executeKnowledgeSearch(input: {
query: string;
source_type?: string;
max_results?: number;
}): Promise<string> {
const embedding = await embedQuery(input.query);
const matchCount = Math.min(input.max_results ?? 15, 30);
const { data, error } = await supabase.rpc('match_chunks', {
query_embedding: JSON.stringify(embedding),
match_count: matchCount,
filter_source_type: input.source_type ?? null,
});
if (error) return `Error: ${error.message}`;
if (!data || data.length === 0) return 'No relevant results found.';
return data
.map((c: any, i: number) => {
const sim = (c.similarity * 100).toFixed(1);
return [
`[${i + 1}] ${sim}% match | ${c.source_type} | ${c.source_path}`,
c.counselor_email ? `Counselor: ${c.counselor_email}` : '',
c.chunk_text,
].filter(Boolean).join('\n');
})
.join('\n\n---\n\n');
}
lib/knowledge-db.tsUses the existing Prisma client to execute read-only SQL. Validates that the
query is a SELECT before executing.
import { prisma } from './prisma';
const FORBIDDEN_PATTERNS = /\b(INSERT|UPDATE|DELETE|DROP|ALTER|TRUNCATE|CREATE|GRANT|REVOKE|EXEC)\b/i;
export async function executeDatabaseQuery(input: {
sql: string;
explanation: string;
}): Promise<string> {
const { sql, explanation } = input;
// Safety: reject anything that isn't a SELECT
if (FORBIDDEN_PATTERNS.test(sql)) {
return 'Error: Only SELECT queries are allowed.';
}
if (!sql.trim().toUpperCase().startsWith('SELECT')) {
return 'Error: Query must start with SELECT.';
}
// Force a row limit if not present
const hasLimit = /\bLIMIT\b/i.test(sql);
const safeSql = hasLimit ? sql : `${sql} LIMIT 100`;
try {
console.log(`[knowledge-db] ${explanation}: ${safeSql}`);
const results = await prisma.$queryRawUnsafe(safeSql);
if (!Array.isArray(results) || results.length === 0) {
return 'Query returned no results.';
}
// Format as a readable table
const columns = Object.keys(results[0] as any);
const header = columns.join(' | ');
const rows = (results as any[]).map(row =>
columns.map(col => String(row[col] ?? '')).join(' | ')
);
return `${results.length} row(s) returned:\n\n${header}\n${'─'.repeat(header.length)}\n${rows.join('\n')}`;
} catch (err: any) {
return `SQL error: ${err.message}`;
}
}
FORBIDDEN_PATTERNS regex rejects any non-SELECT statement$queryRawUnsafe is used intentionally — Claude generates the SQL, but it'snpm install @anthropic-ai/sdk @supabase/supabase-js @huggingface/transformers
(@supabase/supabase-js may already be installed if used elsewhere in hc-portal)
/knowledge