Database Schema

This reference is for advanced self-hosters who need to understand the data model for backup planning, integrations, or troubleshooting. The schema is managed by Prisma migrations -- you do not need to create tables manually. Running npx prisma migrate deploy (or starting the Docker container) applies all pending migrations automatically.

VectorFlow uses PostgreSQL as its sole data store. All state -- pipeline definitions, fleet status, metrics, audit logs, secrets, and user accounts -- lives in the database.


Entity relationship diagram

erDiagram
    Team ||--o{ TeamMember : has
    Team ||--o{ Environment : owns
    Team ||--o{ Template : owns
    Team ||--o{ AlertRule : owns
    Team ||--o{ VrlSnippet : owns

    User ||--o{ TeamMember : belongs_to
    User ||--o{ AuditLog : creates
    User ||--o{ VrlSnippet : creates

    Environment ||--o{ VectorNode : contains
    Environment ||--o{ Pipeline : contains
    Environment ||--o{ Secret : stores
    Environment ||--o{ Certificate : stores
    Environment ||--o{ AlertRule : has
    Environment ||--o{ AlertWebhook : has

    Pipeline ||--o{ PipelineNode : has
    Pipeline ||--o{ PipelineEdge : has
    Pipeline ||--o{ PipelineVersion : tracks
    Pipeline ||--o{ NodePipelineStatus : reports
    Pipeline ||--o{ PipelineMetric : records
    Pipeline ||--o{ PipelineLog : records
    Pipeline ||--o{ AlertRule : monitors
    Pipeline ||--o{ EventSampleRequest : requests
    Pipeline ||--o{ EventSample : stores

    VectorNode ||--o{ NodePipelineStatus : reports
    VectorNode ||--o{ NodeMetric : records
    VectorNode ||--o{ PipelineLog : emits
    VectorNode ||--o{ AlertEvent : triggers

    AlertRule ||--o{ AlertEvent : fires

Core entities

| Table | Description |
| --- | --- |
| User | User accounts with authentication credentials, TOTP secrets, and super admin flag |
| Team | Organizational unit that groups environments, templates, and members |
| TeamMember | Join table linking users to teams with a role (VIEWER, EDITOR, ADMIN) |
| Environment | Logical grouping of nodes and pipelines (e.g., Production, Staging) |
| VectorNode | An agent node registered in an environment |
| Pipeline | A pipeline definition with its visual graph, deployment state, and global config |
| PipelineNode | A single component (source, transform, or sink) within a pipeline graph |
| PipelineEdge | A connection between two pipeline nodes |
| PipelineVersion | An immutable snapshot of a pipeline's generated YAML config at deploy time |
| NodePipelineStatus | Per-node runtime status for a deployed pipeline |
| PipelineMetric | Time-series pipeline throughput data (events, bytes, errors) |
| NodeMetric | Time-series host system metrics (CPU, memory, disk, network) |
| PipelineLog | Log lines from pipeline processes, forwarded by agents |
| Secret | Encrypted secret values scoped to an environment |
| Certificate | Encrypted TLS certificate files scoped to an environment |
| Template | Reusable pipeline template stored as JSON nodes/edges |
| AuditLog | Immutable record of every significant action |
| SystemSettings | Singleton row for global server configuration |
| AlertRule | Alert condition definition (metric, threshold, duration) |
| AlertWebhook | Webhook destination for alert notifications |
| AlertEvent | Record of a fired or resolved alert |
| VrlSnippet | Custom VRL code snippet in the team library |
| EventSampleRequest | Request to sample live events from a running pipeline |
| EventSample | Sampled event data and inferred schema for a pipeline component |
| Account | OAuth/OIDC provider accounts linked to users |

Key table details

Pipeline

The central entity. Stores the pipeline definition and tracks deployment state.

| Field | Type | Description |
| --- | --- | --- |
| id | String (CUID) | Primary key |
| name | String | Display name |
| description | String? | Optional description |
| environmentId | String | FK to Environment |
| globalConfig | Json? | Global Vector config (API settings, enrichment tables, log level) |
| isDraft | Boolean | true = not deployed, false = actively deployed |
| isSystem | Boolean | true = system pipeline (audit log shipping) |
| deployedAt | DateTime? | Timestamp of last deployment (null if never deployed) |
| createdById | String? | FK to User who created the pipeline |
| updatedById | String? | FK to User who last modified the pipeline |
| createdAt | DateTime | Creation timestamp |
| updatedAt | DateTime | Last modification timestamp |

Relationships: nodes, edges, versions, nodeStatuses, metrics, pipelineLogs, alertRules, sampleRequests, eventSamples.

VectorNode

Represents an enrolled agent node.

| Field | Type | Description |
| --- | --- | --- |
| id | String (CUID) | Primary key |
| name | String | Display name (defaults to hostname at enrollment) |
| host | String | Hostname or IP address |
| apiPort | Int | Vector API port (default: 8686) |
| environmentId | String | FK to Environment |
| status | NodeStatus | Current health: HEALTHY, DEGRADED, UNREACHABLE, UNKNOWN |
| lastSeen | DateTime? | Last time the server processed a heartbeat from this node |
| metadata | Json? | Additional node metadata |
| nodeTokenHash | String? | Hashed node authentication token (null = revoked) |
| enrolledAt | DateTime? | When the node first enrolled |
| lastHeartbeat | DateTime? | Timestamp of the last heartbeat |
| agentVersion | String? | Reported agent binary version |
| vectorVersion | String? | Reported Vector binary version |
| os | String? | Operating system and architecture (e.g., linux/amd64) |
| deploymentMode | DeploymentMode | STANDALONE, DOCKER, or UNKNOWN |
| maintenanceMode | Boolean | Whether the node is in maintenance mode (pipelines are stopped) |
| maintenanceModeAt | DateTime? | When the node entered maintenance mode (null if not in maintenance) |
| pendingAction | Json? | Server-initiated action (e.g., self-update command) |
| createdAt | DateTime | Registration timestamp |

Environment

Logical grouping that contains nodes, pipelines, secrets, and certificates.

| Field | Type | Description |
| --- | --- | --- |
| id | String (CUID) | Primary key |
| name | String | Display name (e.g., "Production", "Staging") |
| isSystem | Boolean | true = internal system environment (hidden from UI) |
| teamId | String? | FK to Team (null for system environment) |
| enrollmentTokenHash | String? | Hashed enrollment token for agent registration |
| enrollmentTokenHint | String? | First few characters of the token for display |
| secretBackend | SecretBackend | Secret storage: BUILTIN, VAULT, AWS_SM, EXEC |
| secretBackendConfig | Json? | Configuration for external secret backends |
| gitRepoUrl | String? | HTTPS URL of the Git repository for pipeline audit trail |
| gitBranch | String? | Git branch to commit pipeline YAML to (default: main) |
| gitToken | String? | Encrypted access token for Git repository authentication |
| createdAt | DateTime | Creation timestamp |
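
The enrollmentTokenHash and enrollmentTokenHint fields follow a common pattern: only a one-way hash of the token is stored, so the UI keeps a short prefix to help operators recognize which token is in use. The sketch below illustrates that pattern only. The hash algorithm (SHA-256), the hint length, and all function names are assumptions made for illustration, not VectorFlow's actual implementation.

```python
import hashlib
import hmac
import secrets

HINT_LENGTH = 8  # assumption: number of leading characters kept for display


def hash_token(token: str) -> str:
    # Assumption: SHA-256 hex digest; the hash VectorFlow really uses may differ.
    return hashlib.sha256(token.encode("utf-8")).hexdigest()


def new_enrollment_token() -> tuple[str, str, str]:
    """Return (token, token_hash, token_hint).

    Only the hash and the hint would be persisted; the token itself is
    shown to the operator once and cannot be recovered from the hash.
    """
    token = secrets.token_urlsafe(32)
    return token, hash_token(token), token[:HINT_LENGTH]


def token_matches(presented: str, stored_hash: str) -> bool:
    # Constant-time comparison avoids leaking how much of the digest matched.
    return hmac.compare_digest(hash_token(presented), stored_hash)
```

Because the stored value is a hash, a lost token cannot be read back from the database; under this pattern the practical recovery path is to generate a new token.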

PipelineVersion

Immutable deployment snapshot. Created each time a pipeline is deployed.

| Field | Type | Description |
| --- | --- | --- |
| id | String (CUID) | Primary key |
| pipelineId | String | FK to Pipeline |
| version | Int | Auto-incrementing version number |
| configYaml | String | The generated Vector YAML config |
| configToml | String? | Optional TOML representation |
| logLevel | String? | Vector log level at deploy time |
| globalConfig | Json? | Global config snapshot |
| createdById | String | FK to User who deployed |
| changelog | String? | User-provided deploy message |
| createdAt | DateTime | Deploy timestamp |

Secret

Encrypted secrets scoped to an environment. Referenced in pipeline configs using SECRET[name] syntax.

| Field | Type | Description |
| --- | --- | --- |
| id | String (CUID) | Primary key |
| name | String | Secret identifier (unique per environment) |
| encryptedValue | String | AES-256-GCM encrypted value |
| environmentId | String | FK to Environment |
| createdAt | DateTime | Creation timestamp |
| updatedAt | DateTime | Last update timestamp |

Certificate

Encrypted TLS certificate files. Referenced in pipeline configs using CERT[name] syntax.

| Field | Type | Description |
| --- | --- | --- |
| id | String (CUID) | Primary key |
| name | String | Certificate identifier (unique per environment) |
| filename | String | Original filename (e.g., ca.pem) |
| fileType | String | Type: ca, cert, or key |
| encryptedData | String | AES-256-GCM encrypted PEM content |
| environmentId | String | FK to Environment |
| createdAt | DateTime | Upload timestamp |
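
When auditing an environment before a migration or restore, it can help to list which secrets and certificates a pipeline config actually references. The following sketch scans a config string for SECRET[name] and CERT[name] references and reports names that are not present in the environment. The set of characters allowed in a name and the function names are assumptions; VectorFlow's own resolver may accept a different syntax.

```python
import re

# Assumption: names consist of letters, digits, underscores, dots, and hyphens.
REFERENCE = re.compile(r"\b(SECRET|CERT)\[([A-Za-z0-9_.-]+)\]")


def find_references(config_yaml: str) -> dict[str, set[str]]:
    """Collect the secret and certificate names referenced in a config string."""
    found: dict[str, set[str]] = {"SECRET": set(), "CERT": set()}
    for kind, name in REFERENCE.findall(config_yaml):
        found[kind].add(name)
    return found


def missing_references(config_yaml, secret_names, cert_names):
    """Return referenced names that do not exist in the environment."""
    refs = find_references(config_yaml)
    return {
        "SECRET": refs["SECRET"] - set(secret_names),
        "CERT": refs["CERT"] - set(cert_names),
    }
```

For example, passing the configYaml of a PipelineVersion row together with the Secret.name and Certificate.name values of its environment would list any references that would fail to resolve.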

AuditLog

Immutable audit trail of all significant actions.

| Field | Type | Description |
| --- | --- | --- |
| id | String (CUID) | Primary key |
| userId | String? | FK to User (null for system actions) |
| action | String | Action identifier (e.g., pipeline.created, deploy.agent) |
| entityType | String | Target entity type (e.g., Pipeline, Environment) |
| entityId | String | ID of the affected entity |
| diff | Json? | Before/after field changes |
| metadata | Json? | Additional context |
| ipAddress | String? | Client IP address |
| userEmail | String? | Denormalized email for display |
| userName | String? | Denormalized name for display |
| teamId | String? | Owning team |
| environmentId | String? | Owning environment |
| createdAt | DateTime | Timestamp |

Enums

| Enum | Values | Description |
| --- | --- | --- |
| Role | VIEWER, EDITOR, ADMIN | Team membership role |
| AuthMethod | LOCAL, OIDC | User authentication method |
| NodeStatus | HEALTHY, DEGRADED, UNREACHABLE, UNKNOWN | Agent node health |
| DeploymentMode | STANDALONE, DOCKER, UNKNOWN | How the agent is deployed |
| ComponentKind | SOURCE, TRANSFORM, SINK | Pipeline node category |
| ProcessStatus | RUNNING, STARTING, STOPPED, CRASHED, PENDING | Pipeline process state |
| LogLevel | TRACE, DEBUG, INFO, WARN, ERROR | Log severity |
| SecretBackend | BUILTIN, VAULT, AWS_SM, EXEC | Secret storage provider |
| AlertMetric | node_unreachable, cpu_usage, memory_usage, disk_usage, error_rate, discarded_rate, pipeline_crashed | Metric to evaluate |
| AlertCondition | gt, lt, eq | Comparison operator |
| AlertStatus | firing, resolved | Alert event state |
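
To make the AlertCondition and AlertStatus values concrete, here is a minimal sketch of how a rule's comparison operator could be applied to a metric sample. It only shows the gt/lt/eq comparison; it ignores the rule's duration, and the function names are hypothetical rather than taken from VectorFlow's code.

```python
import operator

# Maps each AlertCondition value to the comparison it stands for.
COMPARATORS = {"gt": operator.gt, "lt": operator.lt, "eq": operator.eq}


def condition_met(condition: str, value: float, threshold: float) -> bool:
    """Return True when `value <condition> threshold` holds."""
    try:
        compare = COMPARATORS[condition]
    except KeyError:
        raise ValueError(f"unknown AlertCondition: {condition!r}") from None
    return compare(value, threshold)


def status_for(condition: str, value: float, threshold: float) -> str:
    """Map a single sample to an AlertStatus value (duration is ignored here)."""
    return "firing" if condition_met(condition, value, threshold) else "resolved"
```

With a cpu_usage rule using gt and a threshold of 90, a sample of 91.5 would map to firing and a sample of 42 to resolved.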

Encryption at rest

Sensitive fields are encrypted with AES-256-GCM before they are stored in the database:

  • Secret values (Secret.encryptedValue) -- pipeline credentials, API keys, passwords
  • Certificate data (Certificate.encryptedData) -- TLS certificates and private keys
  • TOTP secrets (User.totpSecret) -- two-factor authentication secrets
  • TOTP backup codes (User.totpBackupCodes) -- recovery codes

Password hashes (User.passwordHash) are the exception: they are hashed with bcrypt rather than AES-encrypted.

The encryption key is derived from the NEXTAUTH_SECRET environment variable. If this value is lost or changed, existing encrypted data can no longer be decrypted.


Indexes

Key database indexes for query performance:

| Table | Index | Purpose |
| --- | --- | --- |
| PipelineMetric | (pipelineId, timestamp) | Time-range queries for pipeline charts |
| PipelineMetric | (timestamp) | Retention cleanup |
| NodeMetric | (nodeId, timestamp) | Time-range queries for node charts |
| PipelineLog | (pipelineId, timestamp) | Pipeline log pagination |
| PipelineLog | (nodeId, timestamp) | Node log pagination |
| AuditLog | (entityType, entityId) | Entity-specific audit queries |
| AuditLog | (userId) | User activity queries |
| AuditLog | (createdAt) | Time-range audit queries |
| AlertRule | (environmentId) | Environment-scoped alert listing |
| AlertEvent | (alertRuleId) | Alert event history |
| AlertEvent | (firedAt) | Time-range alert queries |

Data retention

VectorFlow automatically prunes time-series data based on system settings:

| Data Type | Default Retention | Setting |
| --- | --- | --- |
| Pipeline metrics | 7 days | metricsRetentionDays |
| Pipeline logs | 3 days | logsRetentionDays |
| Node metrics | 7 days | metricsRetentionDays |
| Audit logs | Indefinite | Not automatically pruned |
| Alert events | Indefinite | Not automatically pruned |

These values are configured in the SystemSettings table via the admin UI.
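
For capacity planning it can be useful to see what the retention settings imply at a given moment. The sketch below computes the cutoff timestamp for each pruned table from the two settings and builds matching parameterized DELETE statements. It is an illustration, not VectorFlow's cleanup job: the function names are hypothetical, and it assumes the physical table and column names match the model names shown in the Indexes table.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def retention_cutoffs(
    now: Optional[datetime] = None,
    metrics_retention_days: int = 7,
    logs_retention_days: int = 3,
) -> dict[str, datetime]:
    """Map each pruned table to the timestamp before which rows expire."""
    now = now or datetime.now(timezone.utc)
    metrics_cutoff = now - timedelta(days=metrics_retention_days)
    logs_cutoff = now - timedelta(days=logs_retention_days)
    return {
        "PipelineMetric": metrics_cutoff,  # governed by metricsRetentionDays
        "NodeMetric": metrics_cutoff,      # governed by metricsRetentionDays
        "PipelineLog": logs_cutoff,        # governed by logsRetentionDays
    }


def prune_statements(cutoffs: dict[str, datetime]) -> list[tuple[str, tuple]]:
    """Build (sql, params) pairs; nothing is executed here."""
    # Assumption: tables and the "timestamp" column are named as in the models.
    return [
        (f'DELETE FROM "{table}" WHERE "timestamp" < %s', (cutoff,))
        for table, cutoff in cutoffs.items()
    ]
```

AuditLog and AlertEvent are deliberately absent from the mapping because they are not pruned automatically.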


Backup considerations

The database is the single source of truth for all VectorFlow state. Losing the database without a backup means losing all pipeline definitions, deployment history, secrets, and audit logs.

For backup and restore procedures, see Backup & Restore.

Key points:

  • Use pg_dump for logical backups or continuous archiving for point-in-time recovery
  • The NEXTAUTH_SECRET environment variable must be the same in the restored deployment as it was when the backup was taken -- the encryption key for secrets, certificates, and TOTP data is derived from it
  • VectorFlow has built-in scheduled backup support (configured via SystemSettings)
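
The first two points can be combined in a small helper script. The sketch below builds a pg_dump command for a custom-format logical backup and computes a one-way fingerprint of NEXTAUTH_SECRET that can be stored next to the dump, so that a restore can check it is using the same secret. The helper names and the fingerprint label are hypothetical and not part of VectorFlow; the fingerprint is only reasonable to store if the secret is a long random value.

```python
import hashlib
import hmac

FINGERPRINT_LABEL = b"vectorflow-backup-fingerprint"  # hypothetical label


def pg_dump_command(database_url: str, output_path: str) -> list[str]:
    """Build a pg_dump invocation that writes a custom-format archive.

    Note: a password embedded in the URL is visible in the process list;
    a .pgpass file or the PGPASSWORD variable avoids that.
    """
    return [
        "pg_dump",
        "--format=custom",
        f"--file={output_path}",
        f"--dbname={database_url}",
    ]


def secret_fingerprint(nextauth_secret: str) -> str:
    """Return an HMAC-SHA256 fingerprint of the secret (not the secret itself)."""
    key = nextauth_secret.encode("utf-8")
    return hmac.new(key, FINGERPRINT_LABEL, hashlib.sha256).hexdigest()


def secret_matches_backup(nextauth_secret: str, recorded_fingerprint: str) -> bool:
    """Check the current secret against the fingerprint saved with a backup."""
    return hmac.compare_digest(secret_fingerprint(nextauth_secret), recorded_fingerprint)
```

The command list can be passed to subprocess.run(..., check=True). Before restoring, calling secret_matches_backup with the target environment's NEXTAUTH_SECRET gives an early warning that encrypted columns would be unreadable.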
