Introduction
In the previous article, we covered what VPC Flow Logs are and how to enable them. But enabling flow logs is just the starting point. The real value comes from understanding what data is actually available inside those records and which fields matter for your specific use case.
AWS has evolved VPC Flow Logs significantly since their launch in 2015. What started as a straightforward 14-field record in version 2 has grown into a rich metadata stream with 35+ fields through version 7. Each version added fields that solve specific problems: mapping traffic to instances (v3), identifying cross-AZ costs (v4), tracing egress traffic paths and AWS service endpoints (v5), and achieving ECS container-level visibility (v7).
This article walks through every field, explains what it captures, and shows you which ones to prioritise depending on your goal.
The Anatomy of a VPC Flow Log Record
A single flow log record captures one network flow: a unique combination of source IP, destination IP, source port, destination port, and protocol observed during an aggregation interval of either 1 minute or 10 minutes.
A real v5 record might tell you that an EC2 instance in us-east-1, availability zone use1-az1, sent 8,400 bytes to Amazon S3 over TCP port 443, and that the traffic exited through an internet gateway (traffic-path = 8). That single record contains enough information to calculate the exact data transfer cost for that flow.
Why Understanding VPC Flow Log Fields Matters
Enabling VPC Flow Logs is only the first step toward gaining network visibility inside AWS. The real power lies in knowing what each field represents and how to use it to study traffic trends, diagnose connectivity problems, and reduce cloud networking spend.
Over successive versions, AWS introduced a considerable number of metadata fields. What began as basic networking details grew into a rich data set that can be used for cost attribution, security monitoring, workload analysis, and traffic path intelligence.
Understanding these fields lets your engineering team move beyond just seeing traffic between IPs. You can determine which workloads communicated with each other, how they were routed, whether any AWS service was involved, and whether that communication generated a networking charge. In large cloud environments where millions of flow log records are generated every day, knowing which fields actually matter is what separates useful analysis from expensive noise.
Traffic Attribution and Visibility
One of the most practical benefits of understanding VPC Flow Log fields is accurate traffic attribution. Fields like vpc-id, subnet-id, instance-id, and interface-id let organisations tie network traffic directly to the applications or workloads that generated it. This is especially useful when investigating infrastructure costs or tracing communication patterns across distributed AWS architectures. Rather than staring at anonymous IP traffic, you can connect behaviour to the actual services running in your environment.
Cost Intelligence and Optimisation
Several enhanced VPC Flow Log fields feed directly into network cost calculations. The traffic-path, az-id, flow-direction, and AWS service identifier fields together reveal the traffic patterns that drive networking charges. They help teams identify:
- Traffic crossing availability zones
- Unnecessary use of NAT Gateways
- Inter-region data transfers
- Transit Gateway processing charges
- Internet egress volumes
Without these enhanced fields, tracing the root cause of AWS networking costs from a billing report alone is nearly impossible.
Security and Operational Analysis
Beyond cost visibility, VPC Flow Logs are a reliable source for security analysis and network debugging. Monitoring fields like packet direction, traffic acceptance status, and source and destination metadata lets security teams detect unusual communication patterns before they escalate. Engineers can use the same attributes to validate routing rules, debug connectivity problems, and understand how traffic moves between subnets and VPCs.
Version 2 Fields (Default Format)
Version 2 is the default when you enable flow logs without specifying a custom format. It includes 14 fields that cover the essentials.
account-id identifies which AWS account owns the network interface. This field is essential for multi-account cost attribution in organisations running AWS Organizations.
interface-id is the ENI where traffic was captured. Join this with EC2 metadata to attribute costs to specific workloads.
srcaddr and dstaddr are the source and destination IP addresses. An important nuance: for traffic passing through a NAT Gateway, one of these fields can show the NAT Gateway's private IP rather than the true endpoint of the conversation. Use pkt-srcaddr and pkt-dstaddr from version 3 to get the original packet-level addresses.
bytes is the single most important field for cost analysis. Every AWS data transfer charge is calculated per GB, so this field is the foundation of any cost calculation.
action shows whether traffic was accepted or rejected. Only accepted traffic generates charges, so filter to ACCEPT when doing cost analysis.
log-status flags records as SKIPDATA when some flow log records were skipped during the aggregation interval, for example because of an internal capacity constraint. If this appears frequently, your cost calculations may be undercounting actual transfer volumes.
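As a rough illustration of the default format, the sketch below splits a space-separated version 2 record into named fields. The field order follows the default format; the sample line, the helper name parse_v2_record, and the choice to map "-" to None are assumptions made for this example.

```python
# Field order of the default (version 2) flow log format.
V2_FIELDS = [
    "version", "account-id", "interface-id", "srcaddr", "dstaddr",
    "srcport", "dstport", "protocol", "packets", "bytes",
    "start", "end", "action", "log-status",
]
# Fields converted to integers; everything else stays a string.
INT_FIELDS = {"version", "srcport", "dstport", "protocol",
              "packets", "bytes", "start", "end"}


def parse_v2_record(line: str) -> dict:
    """Parse one default-format record (hypothetical helper)."""
    values = line.split()
    if len(values) != len(V2_FIELDS):
        raise ValueError(f"expected {len(V2_FIELDS)} fields, got {len(values)}")
    record = {}
    for name, value in zip(V2_FIELDS, values):
        if value == "-":
            # "-" marks a field with no value, as in NODATA or SKIPDATA records.
            record[name] = None
        elif name in INT_FIELDS:
            record[name] = int(value)
        else:
            record[name] = value
    return record


# Illustrative record; the account ID, ENI, and addresses are made up.
sample = ("2 123456789010 eni-1235b8ca123456789 172.31.16.139 172.31.16.21 "
          "20641 22 6 20 4249 1418530010 1418530070 ACCEPT OK")
record = parse_v2_record(sample)
print(record["bytes"], record["action"], record["log-status"])
```

A custom log format changes the field order, so a real parser would take the field list from the flow log configuration rather than hard-coding it.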
Version 3 Fields (Instance and Packet Level Metadata)
Version 3 introduced fields that tie network flows to specific AWS resources and reveal true packet origins.
vpc-id, subnet-id, and instance-id let you attribute costs to specific VPCs, subnets, and EC2 instances. When combined with resource tags, you can roll up networking costs by team, environment, or application.
pkt-srcaddr and pkt-dstaddr are the breakthrough fields for NAT Gateway analysis. They carry the packet-level (original) source and destination addresses rather than the NAT Gateway IP that can appear in srcaddr or dstaddr. Without these fields, you cannot attribute NAT charges back to the workloads and destinations that caused them.
tcp-flags is a bitmask of TCP connection states: SYN (2) signals a new connection opening, FIN (1) signals a connection closing, and RST (4) signals a reset. This field is useful for identifying connection patterns and diagnosing session behaviour.
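Because tcp-flags is a bitmask, a single value can combine several flags. The sketch below decodes it with bitwise AND. The FIN, SYN, and RST values come from the text above; ACK = 16 is the standard TCP bit, which is why a SYN-ACK shows up as 18. The function name is an assumption for illustration.

```python
# Bit values for the flags discussed above; ACK (16) is the standard TCP bit.
TCP_FLAG_BITS = {1: "FIN", 2: "SYN", 4: "RST", 16: "ACK"}


def decode_tcp_flags(value: int) -> list:
    """Return the names of the flags set in a tcp-flags bitmask."""
    return [name for bit, name in TCP_FLAG_BITS.items() if value & bit]


print(decode_tcp_flags(2))   # new connection opening
print(decode_tcp_flags(18))  # SYN and ACK set together
print(decode_tcp_flags(3))   # SYN and FIN seen in the same aggregation interval
```

Flags observed during one aggregation interval are combined into a single value, so a short-lived connection can show both SYN and FIN in one record.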
Version 4 Fields (Location and Region)
az-id is arguably the most important field after bytes. It provides the Availability Zone ID (for example, use1-az1), which is consistent across AWS accounts, unlike AZ names. Cross AZ traffic costs $0.01/GB in each direction, and research shows that 40 to 60% of internal traffic crosses AZ boundaries unnecessarily.
region identifies the AWS region where traffic was captured. Inter-region transfers are among the more expensive types of AWS-internal data transfer, typically ranging from $0.01 to $0.02/GB depending on the region pair.
Version 5 Fields (Traffic Path and AWS Service Identification)
Version 5 transformed VPC Flow Logs from a networking tool into a genuine cost intelligence data source.
pkt-dst-aws-service identifies the destination AWS service by name, such as S3, DYNAMODB, EC2, or CLOUDFRONT. This is how you catch S3 traffic flowing through a NAT Gateway and incurring unnecessary charges instead of routing through a free Gateway Endpoint.
flow-direction distinguishes ingress from egress traffic. Since AWS charges for egress in most scenarios, this field prevents double-counting and ensures cost calculations reflect outbound volumes only.
traffic-path is the single most valuable field for network cost analysis. It directly identifies how egress traffic is being routed through your AWS infrastructure:
| Value | Path | Cost Implication |
| --- | --- | --- |
| 1 | Same VPC | Free within the same AZ; $0.01/GB cross AZ |
| 2 or 8 | Internet gateway | Free for Gateway Endpoints; $0.09/GB for internet egress |
| 3 | Virtual private gateway (VPN) | VPN data transfer charges apply |
| 4 | Intra-region VPC peering | No peering fee; cross AZ charges may apply |
| 5 | Inter-region VPC peering | $0.01 to $0.02/GB depending on region pair |
| 7 | Gateway VPC endpoint (Nitro) | Free |
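On Nitro-based instances, the difference between traffic-path value 8 (internet gateway) and value 7 (gateway endpoint) can mean the difference between paying NAT Gateway processing fees plus internet egress charges versus paying nothing for S3 or DynamoDB access. Value 2 needs extra care because it can represent either an internet gateway or a gateway endpoint.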
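One way to use the traffic-path values is to total bytes per path before applying any rates. The sketch below does this with an in-memory lookup. The labels mirror the values listed above; the record dictionaries, the function name, and the "Other / not recorded" bucket are assumptions for illustration.

```python
from collections import defaultdict

# Labels for the traffic-path values described in this section.
TRAFFIC_PATH_LABELS = {
    1: "Same VPC",
    2: "Internet gateway",
    3: "Virtual private gateway (VPN)",
    4: "Intra-region VPC peering",
    5: "Inter-region VPC peering",
    7: "Gateway VPC endpoint (Nitro)",
    8: "Internet gateway",
}


def bytes_by_path(records):
    """Sum the bytes field per traffic-path label (hypothetical helper)."""
    totals = defaultdict(int)
    for rec in records:
        label = TRAFFIC_PATH_LABELS.get(rec["traffic-path"], "Other / not recorded")
        totals[label] += rec["bytes"]
    return dict(totals)


# Made-up records for illustration only.
records = [
    {"traffic-path": 8, "bytes": 8400},
    {"traffic-path": 7, "bytes": 1200},
    {"traffic-path": 2, "bytes": 600},
]
totals = bytes_by_path(records)
print(totals)
```

Grouping first and pricing second keeps the rate table separate from the parsing logic, which helps when rates differ by region.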
Version 6 and Version 7 Fields
Version 6 added 18 Transit Gateway fields, including tgw-src-vpc-id and tgw-dst-vpc-id. Transit Gateway charges $0.02/GB processed, and these fields let you identify exactly which VPC pairs are generating those charges.
Version 7 added 10 ECS-specific fields, including cluster name, service name, task ID, and container ID. For teams running microservices on ECS, these fields enable per-service network cost attribution that was previously impossible without significant custom tooling.
Fields That Matter Most for Cost Analysis
Not every field deserves equal attention. Here is how to prioritise based on the depth of analysis you need.
Tier 1: Must Have
bytes (v2), az-id (v4), traffic-path (v5), flow-direction (v5), action (v2)
These five fields form the foundation of any network cost analysis. Without all of them, your cost calculations will have significant blind spots.
Tier 2: Strongly Recommended
pkt-srcaddr and pkt-dstaddr (v3), pkt-dst-aws-service (v5), vpc-id, instance-id, subnet-id (v3), region (v4)
These fields add workload attribution and service identification, turning raw cost numbers into data that engineering teams can act on.
Tier 3: Useful for Deep Analysis
pkt-src-aws-service (v5), interface-id (v2), account-id (v2), tcp-flags (v3), start and end timestamps (v2), protocol (v2), and ECS fields (v7)
Reach for these when you need granular attribution, connection pattern analysis, or container-level cost breakdowns in microservices environments.
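A custom log format is expressed as a space-separated list of ${field} placeholders. The sketch below assembles such a string from the Tier 1 and Tier 2 fields listed above. The helper name and the decision to combine exactly these two tiers are assumptions; the resulting string is what would be supplied as the log format when creating the flow log.

```python
TIER_1 = ["bytes", "az-id", "traffic-path", "flow-direction", "action"]
TIER_2 = ["pkt-srcaddr", "pkt-dstaddr", "pkt-dst-aws-service",
          "vpc-id", "instance-id", "subnet-id", "region"]


def build_log_format(fields):
    """Join field names into a ${field} ${field} ... format string."""
    return " ".join("${" + name + "}" for name in fields)


log_format = build_log_format(TIER_1 + TIER_2)
print(log_format)
```

Field order in the format string determines column order in the delivered records, so any parser has to use the same list.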
Key VPC Flow Log Fields for Cost Analysis
While AWS offers more than 35 fields across all flow log versions, a focused subset drives the majority of cost intelligence work. Here is a closer look at the fields that matter most.
bytes
The bytes field is the foundation of any networking cost analysis. Every AWS data transfer charge, whether it involves internet egress, cross-AZ traffic, NAT Gateway processing, or Transit Gateway communication, is ultimately calculated based on data volume. Without this field, calculating the cost of a traffic flow is not possible.
az-id
The az-id field identifies which Availability Zone processed the traffic. This is critical because AWS charges for inter-AZ traffic even when workloads are in the same region. By analysing az-id values across flow records, you can pinpoint which workloads generate excessive cross-AZ traffic and make informed decisions about resource placement.
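A flow record carries the az-id of the interface that captured it, not of the peer, so detecting cross-AZ traffic needs a lookup for the other side. The sketch below assumes a hypothetical ip_to_az mapping built from your own subnet or ENI inventory, and uses pkt-dstaddr as the peer address for egress records.

```python
from typing import Optional


def is_cross_az(record: dict, ip_to_az: dict) -> Optional[bool]:
    """True if the peer sits in a different AZ, None if the peer AZ is unknown."""
    local_az = record.get("az-id")
    peer_az = ip_to_az.get(record.get("pkt-dstaddr"))
    if local_az is None or peer_az is None:
        return None
    return local_az != peer_az


# Hypothetical inventory mapping private IPs to AZ IDs.
ip_to_az = {"10.0.1.20": "use1-az1", "10.0.2.15": "use1-az2"}

same = is_cross_az({"az-id": "use1-az1", "pkt-dstaddr": "10.0.1.20"}, ip_to_az)
cross = is_cross_az({"az-id": "use1-az1", "pkt-dstaddr": "10.0.2.15"}, ip_to_az)
unknown = is_cross_az({"az-id": "use1-az1", "pkt-dstaddr": "203.0.113.5"}, ip_to_az)
print(same, cross, unknown)
```

Returning None for unknown peers keeps external destinations from being miscounted as cross-AZ traffic.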
traffic-path
Among all the fields in enhanced VPC Flow Logs, traffic-path is the most actionable for cost reduction. It identifies the route taken by egress traffic: an internet gateway, a virtual private gateway, VPC Peering, a Gateway Endpoint, or another resource in the same VPC, which is typically how hops towards NAT Gateway and Transit Gateway network interfaces appear. Each of those paths carries a different cost, and many expensive routes can be eliminated with simple architectural changes once you know they exist.
flow-direction
The flow-direction field separates ingress traffic from egress traffic. Since AWS networking costs are primarily tied to outbound data, this field ensures accurate cost attribution and prevents double-counting in analyses that cover internet egress, inter-region transfers, and Transit Gateway communications.
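The filtering described above can be sketched as a small calculation: keep only accepted egress records, sum their bytes, and apply a per-GB rate. The rate passed in, the 1 GB = 10^9 bytes conversion, and the function name are assumptions for illustration; actual billing depends on the path and region.

```python
BYTES_PER_GB = 1_000_000_000  # assumption: decimal gigabytes


def egress_cost(records, rate_per_gb: float) -> float:
    """Estimate cost from accepted egress records only (hypothetical helper)."""
    billable_bytes = sum(
        rec["bytes"]
        for rec in records
        if rec["action"] == "ACCEPT" and rec["flow-direction"] == "egress"
    )
    return billable_bytes / BYTES_PER_GB * rate_per_gb


# Made-up records: only the first one is both accepted and egress.
records = [
    {"bytes": 2_000_000_000, "action": "ACCEPT", "flow-direction": "egress"},
    {"bytes": 5_000_000_000, "action": "ACCEPT", "flow-direction": "ingress"},
    {"bytes": 1_000_000_000, "action": "REJECT", "flow-direction": "egress"},
]
cost = egress_cost(records, rate_per_gb=0.09)
print(round(cost, 4))
```

Counting only the egress side of a flow avoids pricing the same bytes twice when both endpoints have flow logs enabled.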
pkt-srcaddr and pkt-dstaddr
The pkt-srcaddr and pkt-dstaddr fields provide packet-level source and destination addresses. They are essential for NAT Gateway analysis because the standard srcaddr and dstaddr fields can show the NAT Gateway IP rather than the real endpoint. Used together with srcaddr and dstaddr, they let you trace which EC2 instance or service triggered a given transfer and where it was going, then assign that cost to the right team or application.
Understanding srcaddr vs. pkt-srcaddr
This distinction matters more than most teams realise. When an EC2 instance (10.0.1.5) sends traffic to an internet host through a NAT Gateway (10.0.0.10), the record captured on the NAT Gateway's network interface for the instance-to-gateway leg shows srcaddr = 10.0.1.5 and dstaddr = 10.0.0.10 (the NAT Gateway address), while pkt-dstaddr holds the real internet destination. On the return leg back to the instance, srcaddr = 10.0.0.10 and pkt-srcaddr holds the internet host's address.
If you rely only on srcaddr and dstaddr, you see a large volume of traffic to and from the NAT Gateway with no way to tell which workload was talking to which external endpoint. Adding pkt-srcaddr and pkt-dstaddr to your log format resolves that attribution gap and lets you assign NAT costs to the workloads responsible.
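A simple check for an intermediate layer such as a NAT Gateway is to compare the interface-level addresses with the packet-level ones. The sketch below flags any record where they differ. The function name and the sample addresses are illustrative assumptions.

```python
def has_intermediate_hop(record: dict) -> bool:
    """True when interface-level and packet-level addresses disagree."""
    return (
        record["srcaddr"] != record["pkt-srcaddr"]
        or record["dstaddr"] != record["pkt-dstaddr"]
    )


# Made-up records for illustration.
via_nat = {
    "srcaddr": "10.0.1.5", "dstaddr": "10.0.0.10",
    "pkt-srcaddr": "10.0.1.5", "pkt-dstaddr": "203.0.113.5",
}
direct = {
    "srcaddr": "10.0.1.5", "dstaddr": "10.0.2.15",
    "pkt-srcaddr": "10.0.1.5", "pkt-dstaddr": "10.0.2.15",
}
print(has_intermediate_hop(via_nat), has_intermediate_hop(direct))
```

Records flagged this way are the ones where attribution should use the packet-level fields alongside the interface-level ones.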
Common Challenges When Analysing VPC Flow Log Fields
Understanding the fields is one thing. Turning millions of raw records into meaningful cost intelligence is another challenge entirely.
Mapping Traffic Back to Workloads
Raw flow log data provides networking metadata, but it does not automatically tell you which application or business service generated a given flow. To build a complete picture, teams need to enrich flow records with EC2 instance metadata, subnet mapping, and resource tags. Without that enrichment layer, traffic volumes remain anonymous and difficult to act on.
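The enrichment step can be pictured as a join between flow records and a resource inventory keyed by interface-id. The sketch below rolls bytes up by a tag value. The eni_tags dictionary, the "team" tag, and the "unattributed" bucket are hypothetical; in practice the inventory would come from your own EC2 and tagging data.

```python
from collections import Counter


def bytes_by_tag(records, eni_tags, tag="team"):
    """Sum bytes per tag value, using interface-id as the join key."""
    totals = Counter()
    for rec in records:
        tags = eni_tags.get(rec["interface-id"], {})
        totals[tags.get(tag, "unattributed")] += rec["bytes"]
    return dict(totals)


# Hypothetical inventory: ENI ID -> resource tags.
eni_tags = {
    "eni-aaa": {"team": "payments"},
    "eni-bbb": {"team": "search"},
}
# Made-up flow records.
records = [
    {"interface-id": "eni-aaa", "bytes": 700},
    {"interface-id": "eni-aaa", "bytes": 300},
    {"interface-id": "eni-bbb", "bytes": 50},
    {"interface-id": "eni-zzz", "bytes": 10},
]
totals = bytes_by_tag(records, eni_tags)
print(totals)
```

Keeping an explicit "unattributed" bucket makes gaps in the inventory visible instead of silently dropping traffic.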
Interpreting Traffic Paths Correctly
The traffic-path and flow-direction fields are powerful, but using them correctly requires a solid understanding of how AWS routes traffic through NAT Gateways, Transit Gateways, VPC Peering, and Gateway Endpoints. Teams without that background knowledge often misread the data, leading to incorrect cost attribution and ineffective optimisation efforts.
Managing Analytics and Query Costs
Analysing the volume of data produced by VPC Flow Logs can itself become expensive. Poorly structured partitions, unoptimized storage formats, and inefficient queries can drive up Amazon Athena costs significantly. Scaling flow log analysis requires deliberate storage and querying strategies from the start.
Keeping Metadata Context Current
AWS environments change constantly. Workloads scale up and down, instances are replaced, and subnets get reorganised. Keeping the metadata context for VPC Flow Log analysis up to date is an ongoing challenge. Stale resource mappings lead to incorrect cost attribution, which in turn leads to bad optimisation decisions.
The Gap: Fields Exist, but Insights Do Not Build Themselves
AWS gives you 35+ fields through version 7. But transforming those fields into answers requires joining flow records with EC2, ENI, and subnet metadata, applying accurate cost rates per traffic-path value, aggregating millions of records every day, handling edge cases like NAT traversal and split flows, and correlating everything with actual AWS bill line items.
That gap between raw data and actionable insight is where most manual analysis efforts break down at scale.
OneLens closes that gap automatically. It ingests your VPC Flow Logs, resolves every field against live AWS resource metadata, applies the correct cost rates for each traffic path, and presents the results as ready-to-use dashboards. It maps traffic-path values to real dollar costs, traces pkt-srcaddr through NAT Gateways back to the originating workloads, identifies cross AZ traffic using az-id, and flags optimisation opportunities continuously across every VPC and account in your organisation.
