
What Are AWS VPC Flow Logs and How to Enable Them?

The foundation of network cost visibility in AWS: what flow logs capture, where to store them, and how to configure them for cost intelligence.
Introduction

If you've ever stared at your AWS bill wondering why your networking costs keep climbing, you're dealing with one of cloud computing's most common frustrations. Data transfer charges, especially those tied to cross-AZ traffic, NAT Gateways, and inter-region transfers, rarely come with a clear explanation. VPC Flow Logs are where that investigation starts.

VPC Flow Logs record metadata about IP traffic moving through your Virtual Private Cloud. They don't log packet payloads, just connection-level details: source and destination IPs, ports, protocols, byte counts, and whether traffic was accepted or rejected. That's exactly the data you need to trace where bytes are going and what each path costs.

This is Article 1 in a 9-part series on using VPC Flow Logs for network cost intelligence.

What Are AWS VPC Flow Logs?

A VPC Flow Log record represents a single network flow: a connection defined by a 5-tuple (source IP, destination IP, source port, destination port, protocol) and captured during a set aggregation window.

You can create flow logs at three levels:

  • VPC level — monitors traffic across every ENI (Elastic Network Interface) in the VPC
  • Subnet level — monitors traffic for all ENIs within a specific subnet
  • ENI level — monitors a single network interface

Enabling flow logs at the VPC level automatically covers every ENI in every subnet. This is the most practical starting point for most teams.

What Do VPC Flow Logs Capture?

Each flow log record includes:

  • Source and destination IP addresses
  • Source and destination ports
  • Protocol number
  • Packet and byte counts
  • Start and end timestamps for the capture window
  • Traffic action (accepted or rejected)
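To make the record layout concrete, here is a minimal Python sketch that parses one line in the default (version 2) format. The FlowRecord class, the parse_default_record helper, and the sample values used in the usage note are illustrative assumptions, not part of any AWS SDK.

```python
from dataclasses import dataclass
from typing import Optional

# Field order of the default (version 2) format, space-separated.
DEFAULT_FIELDS = [
    "version", "account_id", "interface_id", "srcaddr", "dstaddr",
    "srcport", "dstport", "protocol", "packets", "bytes",
    "start", "end", "action", "log_status",
]

# A few common IANA protocol numbers, for readability only.
PROTOCOL_NAMES = {1: "ICMP", 6: "TCP", 17: "UDP"}


@dataclass
class FlowRecord:
    srcaddr: str
    dstaddr: str
    srcport: int
    dstport: int
    protocol: int
    packets: int
    byte_count: int
    start: int   # Unix seconds, start of the capture window
    end: int     # Unix seconds, end of the capture window
    action: str  # "ACCEPT" or "REJECT"

    @property
    def protocol_name(self) -> str:
        return PROTOCOL_NAMES.get(self.protocol, str(self.protocol))


def parse_default_record(line: str) -> Optional[FlowRecord]:
    """Parse one default-format line; return None when it carries no flow data."""
    tokens = line.split()
    if len(tokens) < len(DEFAULT_FIELDS):
        return None
    values = dict(zip(DEFAULT_FIELDS, tokens))
    if values["log_status"] != "OK":
        return None  # NODATA and SKIPDATA records use "-" placeholders
    return FlowRecord(
        srcaddr=values["srcaddr"],
        dstaddr=values["dstaddr"],
        srcport=int(values["srcport"]),
        dstport=int(values["dstport"]),
        protocol=int(values["protocol"]),
        packets=int(values["packets"]),
        byte_count=int(values["bytes"]),
        start=int(values["start"]),
        end=int(values["end"]),
        action=values["action"],
    )
```

Feeding it a line such as "2 123456789010 eni-1235b8ca123456789 172.31.16.139 172.31.16.21 20641 22 6 20 4249 1418530010 1418530070 ACCEPT OK" yields an accepted TCP flow to port 22 that carried 4,249 bytes in 20 packets.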

With custom log formats (fields from versions 3–5), you get richer metadata built for cost analysis: VPC and subnet identifiers, availability zone IDs, flow direction (ingress/egress), a traffic-path code describing how egress traffic left the interface (for example through an internet gateway, a virtual private gateway, or a VPC Peering connection), and the AWS service name when the source or destination address belongs to S3, DynamoDB, or another managed service.

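The traffic-path and az-id fields are the most actionable for cost work. Combined with the addresses and interface IDs in each record, they help you work out whether traffic crossed availability zones, passed through a NAT Gateway, used VPC Peering or Transit Gateway, or left the region entirely, each of which carries a different cost implication. Note that traffic-path is populated for egress traffic only, and that NAT Gateways and Transit Gateway attachments show up as network interfaces inside the VPC rather than as dedicated path values.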
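In the logs themselves, traffic-path is a numeric code. The sketch below maps the codes to descriptions as I understand them from the AWS field reference. Treat the table as an assumption to verify against the current documentation, and note that it has no dedicated code for NAT Gateways or Transit Gateways. The describe_traffic_path helper is hypothetical.

```python
# Assumed mapping of traffic-path codes; verify against the AWS field reference.
TRAFFIC_PATH = {
    1: "another resource in the same VPC",
    2: "internet gateway or gateway VPC endpoint",
    3: "virtual private gateway",
    4: "intra-region VPC peering connection",
    5: "inter-region VPC peering connection",
    6: "local gateway",
    7: "gateway VPC endpoint (Nitro-based instances only)",
    8: "internet gateway (Nitro-based instances only)",
}


def describe_traffic_path(raw: str) -> str:
    """Translate the raw traffic-path value from a flow log record."""
    if raw == "-":
        # The field is only populated for egress traffic.
        return "not recorded"
    try:
        code = int(raw)
    except ValueError:
        return f"unparseable value {raw!r}"
    return TRAFFIC_PATH.get(code, f"unknown code {code}")
```

Keeping the mapping in one dictionary means a new or changed code only needs a one-line update.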

What Do VPC Flow Logs NOT Capture?

Flow logs do not record:

  • DNS queries to the Amazon-provided resolver
  • DHCP traffic
  • Traffic to the instance metadata service (169.254.169.254)
  • Amazon Time Sync Service traffic
  • Traffic between endpoint network interfaces and Network Load Balancer interfaces
  • Packet contents (only connection metadata is logged)

Common Use Cases of AWS VPC Flow Logs

Most teams enable VPC Flow Logs for security or compliance reasons, then quickly realise the data is useful for much more. Here are the four areas where flow logs consistently deliver value.

Network Cost Analysis

AWS networking charges can be difficult to pin down using billing reports alone, because the bill doesn't tell you why traffic moved the way it did.

Flow log analysis fills that gap. By examining which flows crossed availability zones, passed through NAT or Transit Gateways, or used VPC Peering, engineers can see exactly which traffic patterns are driving costs. The traffic-path and az-id metadata fields make it possible to pinpoint expensive routing decisions down to the workload level, enabling smarter architectural improvements and more accurate cost allocation between teams and departments.
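As a rough illustration of this kind of analysis, the sketch below estimates cross-AZ transfer cost from already-parsed records. It rests on several assumptions: records are dictionaries with the keys shown, ip_to_az is a hypothetical lookup you build yourself (for example from subnet metadata), 1 GB is treated as 1024³ bytes, and the $0.01/GB rate is charged once in each direction.

```python
BYTES_PER_GB = 1024 ** 3
CROSS_AZ_RATE_PER_GB = 0.01  # USD per GB, per direction (assumed rate)


def estimate_cross_az_cost(records, ip_to_az):
    """Return (cross_az_gb, estimated_usd) for accepted egress flows.

    Only egress records are counted so that each byte is counted once,
    even when both ends of a flow are logged.
    """
    cross_az_bytes = 0
    for rec in records:
        if rec["action"] != "ACCEPT" or rec["flow_direction"] != "egress":
            continue
        # az_id is the zone of the logged interface; the peer's zone
        # has to come from the lookup.
        dst_az = ip_to_az.get(rec["pkt_dstaddr"])
        if dst_az is None or dst_az == rec["az_id"]:
            continue  # unknown peer or same-AZ traffic
        cross_az_bytes += rec["bytes"]
    gb = cross_az_bytes / BYTES_PER_GB
    # Charged on the way out of one AZ and on the way into the other.
    return gb, gb * CROSS_AZ_RATE_PER_GB * 2
```

Destinations missing from the lookup are skipped rather than guessed, so the result is a lower bound when the lookup is incomplete.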

Security Monitoring and Threat Detection

Security engineers use VPC Flow Logs to establish baseline traffic behaviour and identify anomalies. Because flow logs record both accepted and rejected traffic, they're particularly useful for spotting unauthorised access attempts, unexpected lateral movement, or unusual communication patterns across your cloud environment.

In regulated industries, flow logs also serve as an auditable record of network activity, invaluable when you need to reconstruct events before or during a security incident.
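One simple way to put rejected traffic to work is to look for sources probing many different ports. The following is a minimal sketch, assuming records are dictionaries with srcaddr, dstport, and action keys. The find_port_scanners name and the default threshold of 20 ports are arbitrary choices for illustration, not a recommended detection rule.

```python
from collections import defaultdict


def find_port_scanners(records, min_ports=20):
    """Map each source IP to the number of distinct rejected destination ports,
    keeping only sources that hit at least min_ports different ports."""
    ports_by_source = defaultdict(set)
    for rec in records:
        if rec["action"] == "REJECT":
            ports_by_source[rec["srcaddr"]].add(rec["dstport"])
    return {
        src: len(ports)
        for src, ports in ports_by_source.items()
        if len(ports) >= min_ports
    }
```

In practice the threshold would be tuned against your own baseline traffic, and the records would be limited to a time window.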

Troubleshooting Connectivity Issues

Diagnosing network problems between distributed AWS resources used to require a lot of guesswork. VPC Flow Logs remove that ambiguity. Engineers can query traffic direction, source and destination IPs, the protocol in use, and whether specific requests were accepted or dropped, all without needing to reproduce the issue or add instrumentation to application code.
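A typical troubleshooting question is "did traffic from A to B on port P get through?". The sketch below answers it over parsed records. The dictionary keys and the flows_between and action_summary helpers are assumptions made for illustration.

```python
from collections import Counter


def flows_between(records, src, dst, dstport=None):
    """Return records from src to dst, optionally limited to one destination port."""
    return [
        rec for rec in records
        if rec["srcaddr"] == src
        and rec["dstaddr"] == dst
        and (dstport is None or rec["dstport"] == dstport)
    ]


def action_summary(records):
    """Count how many records were accepted and how many were rejected."""
    return Counter(rec["action"] for rec in records)
```

If the summary for a path shows only REJECT entries, security group or network ACL rules are the first place to look.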

Cloud Governance and Compliance

Organisations operating under regulatory frameworks often incorporate VPC Flow Logs into their audit and governance processes. A well-maintained flow log archive demonstrates control over cloud network activity and helps compliance teams answer questions about cross-account, cross-region, or inter-environment communication patterns, something that becomes increasingly important as cloud environments grow more complex.

Where Should VPC Flow Logs Be Stored?

AWS supports three destinations for flow log data, each with different cost and latency trade-offs.

Amazon S3 (Recommended for Cost Analysis)

S3 is the right choice for large-scale flow log analysis. It supports Parquet columnar format (which compresses tightly and makes queries dramatically faster), Hive-compatible partitioning by account, region, and date, and automatic tiering to Glacier for long-term, low-cost retention.

Best practice: Store flow logs in Parquet format with Hive-compatible partitions. Compared to plain text, Parquet can reduce storage costs by roughly 60–80% and speed up analytical queries by 5–10x, depending on the data and the queries you run.
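To show what Hive-compatible partitioning looks like on disk, here is a small sketch that builds the S3 key prefix for a given hour. The layout reflects my understanding of the AWS documentation and should be checked against a real bucket. The hive_prefix helper and its parameters are hypothetical.

```python
from datetime import datetime, timezone


def hive_prefix(account_id, region, when, base_prefix="", per_hour=True):
    """Build the assumed Hive-style key prefix for flow logs delivered to S3."""
    parts = [
        "AWSLogs",
        f"aws-account-id={account_id}",
        "aws-service=vpcflowlogs",
        f"aws-region={region}",
        f"year={when:%Y}",
        f"month={when:%m}",
        f"day={when:%d}",
    ]
    if per_hour:
        parts.append(f"hour={when:%H}")
    if base_prefix:
        parts.insert(0, base_prefix.strip("/"))
    return "/".join(parts) + "/"


example = hive_prefix(
    "123456789012", "us-east-1",
    datetime(2024, 3, 5, 7, tzinfo=timezone.utc),
)
```

Because each partition value is encoded as key=value in the path, query engines such as Athena can skip entire accounts, regions, or days without reading the files inside them.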

Amazon CloudWatch Logs

CloudWatch Logs delivers flow data in near-real-time (roughly a 5-minute lag) and integrates with CloudWatch Logs Insights for ad-hoc queries. The trade-off is cost: ingestion starts at $0.50/GB for the standard log class, which adds up quickly at enterprise scale. Use CloudWatch when you need real-time alerting rather than bulk historical analysis.

Amazon Data Firehose (formerly Kinesis Data Firehose)

Firehose is well-suited for streaming flow logs into third-party analytics platforms or custom processing pipelines. It supports inline transformations and can deliver to S3, Redshift, OpenSearch, or custom HTTP endpoints, making it a solid option when you're integrating flow data with an existing observability stack.

How to Enable AWS VPC Flow Logs

VPC Flow Logs can be enabled through the AWS Console, CLI, Terraform, or CloudFormation. The four configuration decisions that matter most are outlined below.

Traffic Filter

Select All to capture both accepted and rejected traffic. For cost analysis, accepted traffic is what counts; that's what generates charges. Rejected traffic is valuable for security purposes, but can always be filtered out at query time.

Aggregation Interval

Choose 1 minute for granular analysis or 10 minutes for lower log volume and storage costs. On Nitro-based instances, the interval is always 1 minute or less regardless of this setting.

Log Format

This is the most consequential configuration decision. The default v2 format includes only 14 basic fields. For cost analysis, you need a custom format with the v3–v5 fields:

vpc-id, subnet-id, instance-id, az-id, region, pkt-srcaddr, pkt-dstaddr, pkt-src-aws-service, pkt-dst-aws-service, flow-direction, traffic-path

Without these fields, you can see that traffic moved between two IP addresses, but you can't determine whether it crossed availability zones, used a NAT Gateway, or reached an AWS managed service. All of those factors determine the actual cost.

File Format

Select Parquet over plain text. Enable Hive-compatible partitions and hourly partitioning for efficient querying at scale.

Recommended Custom Format for Cost Analysis

Core fields: version, account-id, interface-id, srcaddr, dstaddr, srcport, dstport, protocol, packets, bytes, start, end, action, log-status

Instance metadata (v3): vpc-id, subnet-id, instance-id, tcp-flags, type, pkt-srcaddr, pkt-dstaddr

Location (v4): region, az-id

Cost intelligence (v5): pkt-src-aws-service, pkt-dst-aws-service, flow-direction, traffic-path
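The field groups above can be turned into a flow log request programmatically. The sketch below only builds the parameter dictionary. The keys follow the boto3 EC2 create_flow_logs call as I understand it, while the build_flow_log_request helper and the VPC ID and bucket ARN you pass in are placeholders.

```python
CORE = ["version", "account-id", "interface-id", "srcaddr", "dstaddr",
        "srcport", "dstport", "protocol", "packets", "bytes",
        "start", "end", "action", "log-status"]
V3 = ["vpc-id", "subnet-id", "instance-id", "tcp-flags", "type",
      "pkt-srcaddr", "pkt-dstaddr"]
V4 = ["region", "az-id"]
V5 = ["pkt-src-aws-service", "pkt-dst-aws-service",
      "flow-direction", "traffic-path"]


def build_log_format(fields):
    """Render field names in the ${field} syntax used by custom formats."""
    return " ".join(f"${{{name}}}" for name in fields)


def build_flow_log_request(vpc_id, bucket_arn):
    """Parameters for a VPC-level flow log delivered to S3 as Parquet."""
    return {
        "ResourceIds": [vpc_id],
        "ResourceType": "VPC",
        "TrafficType": "ALL",
        "LogDestinationType": "s3",
        "LogDestination": bucket_arn,
        "LogFormat": build_log_format(CORE + V3 + V4 + V5),
        "MaxAggregationInterval": 60,  # seconds; 600 is the alternative
        "DestinationOptions": {
            "FileFormat": "parquet",
            "HiveCompatiblePartitions": True,
            "PerHourPartition": True,
        },
    }

# With boto3 installed and credentials configured, the request could be sent as:
#   boto3.client("ec2").create_flow_logs(**build_flow_log_request(vpc_id, bucket_arn))
```

Keeping the field lists as data makes it easy to drop fields you do not need before the flow log is created, since the format cannot be changed afterwards without creating a new flow log.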

Here's why each cost-analysis field matters:

Field                      Why It Matters
bytes                      The actual data volume driving your transfer costs
az-id                      Distinguishes same-AZ (free) vs. cross-AZ ($0.01/GB each way) traffic
traffic-path               Identifies how egress traffic left the interface: internet gateway, virtual private gateway, VPC Peering, local gateway, etc.
pkt-src/dst-aws-service    Identifies traffic to/from S3, DynamoDB, and other AWS services
flow-direction             Separates ingress from egress for accurate cost attribution

Best Practices for AWS VPC Flow Log Management

Enabling flow logs is only step one. As your infrastructure scales, managing log volume and cost becomes just as important as collecting the data.

Store in Amazon S3 — S3 remains the most cost-efficient destination for flow logs at scale, especially when paired with Amazon Athena for analysis.

Use Parquet instead of plain text — At enterprise scale, VPC Flow Logs can generate terabytes of data per day. Parquet's columnar compression significantly reduces both storage and query costs.

Configure an enhanced log format — The default v2 format is insufficient for cost or traffic intelligence work. Always include az-id, traffic-path, flow-direction, and AWS service identifier fields in your custom format.

Implement proper partitioning — Partition logs by account, region, date, and hour. This limits the data scanned during Athena queries, reducing costs and improving performance.

Apply lifecycle and retention policies — Without lifecycle rules, storage costs compound over time. Transition logs to Amazon S3 Glacier after 90 days to retain historical data at a fraction of the active-tier cost.

Only include fields you actually need — Each additional field increases log volume. Review your use case before finalising the format.
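The lifecycle practice above can be expressed as a small configuration document. This sketch builds one in the shape accepted by the S3 put_bucket_lifecycle_configuration API. The rule ID, the "AWSLogs/" prefix, and the glacier_lifecycle helper are illustrative assumptions.

```python
def glacier_lifecycle(prefix="AWSLogs/", days=90):
    """Lifecycle configuration that moves objects under prefix to Glacier."""
    return {
        "Rules": [
            {
                "ID": f"flow-logs-to-glacier-after-{days}-days",
                "Status": "Enabled",
                "Filter": {"Prefix": prefix},
                "Transitions": [
                    {"Days": days, "StorageClass": "GLACIER"},
                ],
            }
        ]
    }

# With boto3 installed and credentials configured, it could be applied as:
#   boto3.client("s3").put_bucket_lifecycle_configuration(
#       Bucket="example-flow-logs",  # placeholder bucket name
#       LifecycleConfiguration=glacier_lifecycle(),
#   )
```

An expiration rule can be added alongside the transition if your retention policy allows old logs to be deleted outright.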

Cost of Running VPC Flow Logs

Creating flow logs is free. Costs come from log ingestion and storage.

Vended Log Ingestion Pricing

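VPC Flow Logs are classified as "vended logs" with volume-tiered pricing. The rates below are those for delivery to CloudWatch Logs; delivery to S3 is billed on a separate, lower schedule, and rates vary by region, so confirm them against current AWS pricing: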

Monthly Volume    Price per GB
First 10 TB       $0.50
Next 20 TB        $0.25
Next 20 TB        $0.10
Over 50 TB        $0.05
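Tiered pricing is easy to get wrong by hand, so here is a short sketch that applies the tiers above to a monthly volume. It assumes 1 TB = 1,024 GB and uses the rates exactly as listed. The ingestion_cost helper is hypothetical.

```python
GB_PER_TB = 1024  # assumption: binary terabytes

# (tier size in TB, price per GB); None marks the unbounded final tier.
TIERS = [(10, 0.50), (20, 0.25), (20, 0.10), (None, 0.05)]


def ingestion_cost(monthly_gb):
    """Estimated monthly ingestion cost in USD for a volume given in GB."""
    remaining = monthly_gb
    total = 0.0
    for size_tb, price_per_gb in TIERS:
        if remaining <= 0:
            break
        if size_tb is None:
            tier_gb = remaining
        else:
            tier_gb = min(remaining, size_tb * GB_PER_TB)
        total += tier_gb * price_per_gb
        remaining -= tier_gb
    return total
```

For example, 15 TB in a month comes to 10,240 GB at $0.50 plus 5,120 GB at $0.25, or $6,400 in total.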

Cost optimisation tips:

  • Use Parquet format to cut storage by 60–80% vs. plain text
  • Set S3 lifecycle rules to transition logs to Glacier after 90 days
  • Limit fields to what your use case actually requires
  • Use 10-minute aggregation intervals where per-minute granularity isn't needed

Why AWS VPC Flow Logs Matter for FinOps

As cloud bills grow, FinOps teams are under pressure to explain not just how much was spent, but why. Networking costs are one of the hardest line items to interpret from billing data alone; they depend heavily on traffic patterns and architectural decisions that simply aren't visible in aggregate reports.

VPC Flow Logs give FinOps professionals something billing data can't: a direct view into traffic behaviour. Instead of working backwards from summary line items, teams can identify which workloads generate high-volume transfers, which applications route traffic through NAT Gateways unnecessarily, and which services are driving cross-AZ communication costs.

Improving Cost Allocation

Networking costs in AWS are typically shared across environments, making accurate attribution difficult. Flow logs allow teams to correlate traffic patterns with individual workloads or VPCs, enabling proper chargeback and showback across business units.
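A basic showback calculation can be sketched as follows: sum accepted bytes per VPC (or any other key) and convert the totals into shares of a shared bill. The record keys and both helper names are assumptions for illustration.

```python
from collections import Counter


def bytes_by_key(records, key="vpc_id"):
    """Total accepted bytes grouped by the given record key."""
    totals = Counter()
    for rec in records:
        if rec["action"] == "ACCEPT":
            totals[rec[key]] += rec["bytes"]
    return totals


def allocate_cost(totals, shared_cost_usd):
    """Split a shared cost across keys in proportion to their byte totals."""
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {
        key: shared_cost_usd * count / grand_total
        for key, count in totals.items()
    }
```

Proportional allocation by bytes is only one possible policy. Grouping by subnet_id or instance_id instead of vpc_id gives a finer-grained view with the same code.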

Identifying Architectural Inefficiencies

Many AWS networking costs aren't the result of workload scale; they're the result of architectural patterns that generate unnecessary transfer charges in practice. Workloads communicating unnecessarily across availability zones, or routing traffic through NAT Gateways multiple times, can quietly accumulate high costs over months. The traffic-path and az-id fields make these patterns visible so engineering and FinOps teams can act on them together.

Supporting Continuous Optimisation

FinOps isn't a one-time exercise. Because VPC Flow Logs provide ongoing visibility into network traffic, organisations can continuously evaluate how infrastructure changes affect networking spend, creating a real feedback loop between engineering decisions and cost outcomes.

The Challenge: From Raw Data to Cost Intelligence

Enabling flow logs is the straightforward part. Turning terabytes of raw flow records into decisions you can act on is where the real work begins.

The manual path involves building and maintaining analytical tables, writing complex queries that join flow data with EC2 instance metadata and subnet-to-AZ lookups, understanding cost rates for each traffic path, refreshing the analysis on a regular cadence, and repeating this across every VPC and account in your organisation.

For a single VPC, that's a manageable weekend project. For a multi-account enterprise, it's effectively a full-time role.

OneLens automates this pipeline, ingesting your VPC Flow Logs, enriching them with AWS resource metadata, and surfacing ready-to-use cost intelligence dashboards without any query writing or table management. It continuously monitors traffic patterns and surfaces optimisation opportunities with estimated dollar savings, so you can go from "we have flow logs enabled" to "we know exactly where our network dollars go" in minutes rather than weeks.

FAQs

What is a VPC Flow Log in AWS?

A VPC Flow Log is a record of network traffic metadata for a specific IP flow through your Virtual Private Cloud. Each record covers a 5-tuple connection (source IP, destination IP, source port, destination port, protocol) over an aggregation window and includes fields like byte counts, timestamps, and whether the traffic was accepted or rejected.

Are AWS VPC Flow Logs free?

Creating flow logs is free, but you pay for log ingestion and storage. Ingestion is priced as a vended log, starting at $0.50/GB for the first 10 TB per month when delivered to CloudWatch Logs, with volume discounts above that threshold. Storage costs vary by destination; S3 is significantly cheaper than CloudWatch Logs at enterprise scale.

What is the difference between VPC Flow Logs and CloudTrail?

VPC Flow Logs capture network traffic metadata, including which IPs communicated, how many bytes were transferred, and via which network path. AWS CloudTrail captures API activity: who made calls to AWS services, when, and from where. They serve different purposes: flow logs are for network visibility and cost analysis, while CloudTrail is for auditing AWS resource operations.

How do I enable VPC Flow Logs?

You can enable flow logs through the AWS Console (navigate to VPC → select your VPC → Flow Logs tab → Create flow log), via the AWS CLI using aws ec2 create-flow-logs, or through Terraform and CloudFormation. Key decisions include traffic filter (all, accepted, or rejected), aggregation interval (1 or 10 minutes), log destination (S3, CloudWatch, or Firehose), and log format (default v2 or a custom v3–v5 format).

What is the best log format for VPC Flow Logs used in cost analysis?

For cost analysis, use a custom v5 log format that includes az-id, traffic-path, flow-direction, pkt-src-aws-service, and pkt-dst-aws-service. Store logs in Parquet format in S3 with Hive-compatible partitions. This gives you the metadata needed to attribute costs to specific workloads, routing paths, and availability zones, while keeping storage and query costs low.

Can VPC Flow Logs be used to detect security threats?

Yes. Because flow logs capture both accepted and rejected traffic, security teams can use them to detect port scanning, unauthorised access attempts, and unusual communication patterns. They're commonly integrated with SIEM tools for ongoing threat detection and used as an audit trail during incident response.

How long should VPC Flow Logs be retained?

Retention depends on your compliance requirements and use case. A common approach is to keep 90 days of logs in S3 Standard for active analysis, then transition older logs to S3 Glacier for long-term archival at a lower cost. Some regulated industries require retention of 1–7 years.

What does the traffic-path field in VPC Flow Logs mean?

The traffic-path field records the path that egress traffic took when leaving the logged interface, encoded as a number. Documented values cover another resource in the same VPC, an internet gateway, a gateway VPC endpoint, a virtual private gateway, intra-region and inter-region VPC Peering connections, and a local gateway. This field is essential for cost analysis because different network paths carry different per-GB transfer charges.