n8n Batch Processing: Handle Large Datasets Without Crashing
Your n8n workflow works perfectly with 10 items. Run it with 1,000 and watch it crash.
This happens constantly. You build an automation that handles a handful of records flawlessly. Then real production data arrives: thousands of customer records, massive CSV imports, or API responses with hundreds of items. Suddenly you’re staring at “JavaScript heap out of memory” errors, frozen workflows, or API rate limit messages.
The problem isn’t n8n. The problem is how you’re processing data.
The Breaking Point
Most n8n workflows process items one by one or let nodes handle arrays automatically. This works fine for small datasets. But as data volume grows, you hit walls:
- Memory exhaustion: n8n holds all items in memory during execution. 10,000 items with complex JSON structures can consume gigabytes of RAM.
- API throttling: Firing 500 API calls in rapid succession triggers rate limits, resulting in 429 errors and failed workflows.
- Timeout failures: Long-running operations exceed webhook timeouts or execution time limits.
- Editor freezing: Large workflows with many items become unresponsive in the n8n interface.
What You’ll Learn
- How n8n processes data internally and why large datasets cause crashes
- Using the Loop Over Items node to process data in controlled batches
- Choosing optimal batch sizes for different workloads
- Rate limiting patterns with Wait nodes to respect API limits
- Memory management strategies to prevent heap exhaustion
- State tracking across batches with workflowStaticData
- Advanced patterns for parallel processing and error isolation
- Real-world examples with production-ready configurations
Understanding n8n’s Data Flow
Before fixing batch processing problems, you need to understand how n8n handles data.
How Items Flow Through Nodes
Every n8n workflow operates on items. An item is a JSON object containing your data. When a trigger fires or a node executes, it produces one or more items that flow to the next node.
Here’s the key insight: many n8n nodes automatically process all incoming items. The HTTP Request node, for example, executes once per item it receives. Pass it 100 items, and it makes 100 HTTP requests.
This automatic iteration is convenient but dangerous. You have no control over timing, memory usage, or error handling for individual items.
When Automatic Processing Fails
Automatic item processing works well when:
- You have fewer than 100 items
- Each item requires minimal processing
- External APIs have generous rate limits
- Memory isn’t a concern
It fails when:
- Datasets contain thousands of items
- API endpoints have strict rate limits (common with email, CRM, and social media APIs)
- Items contain large payloads (binary files, nested JSON)
- You need progress tracking or partial failure recovery
The Memory Model
n8n runs on Node.js, which caps heap memory by default; the exact ceiling depends on the Node.js version and available RAM. Every item in your workflow consumes memory. When processing large datasets:
- The trigger loads all items into memory
- Each node transformation creates new objects
- Results accumulate until execution completes
- Garbage collection can’t reclaim memory fast enough
The result: your n8n instance crashes with FATAL ERROR: CALL_AND_RETRY_LAST Allocation failed - JavaScript heap out of memory.
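If you're not sure what heap limit your instance actually has, Node.js can report it. Run this on the host or inside the container that runs n8n (a quick diagnostic, not part of any workflow):
node -e "console.log(require('v8').getHeapStatistics().heap_size_limit / 1024 / 1024, 'MB')"
Memory usage approaching that number during an execution means a crash is likely.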
The Loop Over Items Node
The Loop Over Items node (previously called “Split in Batches”) is n8n’s answer to batch processing. It divides your dataset into smaller chunks and processes them sequentially.
How It Works
The Loop Over Items node:
- Receives all items from the previous node
- Outputs a batch of items (the size you specify)
- Waits for downstream processing to complete
- Outputs the next batch
- Repeats until all items are processed
- Signals completion through its “done” output
This sequential processing gives you control over memory usage, timing, and error handling.
Node Configuration
Batch Size
The number of items to process in each iteration. Default is 1.
Batch Size: 10
Lower values reduce memory pressure but increase total execution time. Higher values process faster but consume more resources.
Options: Reset
When enabled, the node discards what it has already processed and treats the incoming items as a brand-new dataset. Leave it disabled in a normal batch loop; enabling it unconditionally can keep the loop restarting. It's mainly useful when the same Loop Over Items node receives a fresh dataset later in the same execution.
Understanding the Two Outputs
The Loop Over Items node has two outputs:
| Output | When It Fires | Use Case |
|---|---|---|
| Loop (output index 1 in the connections JSON) | For each batch of items | Connect your processing nodes here |
| Done (output index 0) | After all batches complete | Connect final actions (summary, notification) |
Important: The last node in the loop branch must connect back to the Loop Over Items node to continue processing. Forgetting this connection causes the workflow to stop after the first batch.
Basic Workflow Structure
Trigger → Get Data → Loop Over Items → Process Items → (back to Loop Over Items)
↓
(When done) → Send Summary
Complete Workflow Example
Here’s a workflow that processes items in batches of 10:
{
"nodes": [
{
"parameters": {},
"name": "Manual Trigger",
"type": "n8n-nodes-base.manualTrigger",
"position": [250, 300]
},
{
"parameters": {
"batchSize": 10,
"options": {}
},
"name": "Loop Over Items",
"type": "n8n-nodes-base.splitInBatches",
"position": [450, 300]
},
{
"parameters": {
"url": "https://api.example.com/process",
"method": "POST",
"body": "={{ $json }}"
},
"name": "HTTP Request",
"type": "n8n-nodes-base.httpRequest",
"position": [650, 300]
}
],
"connections": {
"Manual Trigger": {
"main": [[{"node": "Loop Over Items", "type": "main", "index": 0}]]
},
"Loop Over Items": {
"main": [
null,
[{"node": "HTTP Request", "type": "main", "index": 0}]
]
},
"HTTP Request": {
"main": [[{"node": "Loop Over Items", "type": "main", "index": 0}]]
}
}
}
Notice how the HTTP Request connects back to Loop Over Items. This creates the loop that processes all batches.
Tracking Loop Progress
You can access loop metadata in expressions:
// Current batch index (0-based)
{{ $node["Loop Over Items"].context["currentRunIndex"] }}
// Check if all items processed
{{ $node["Loop Over Items"].context["noItemsLeft"] }}
This is useful for progress logging or conditional logic based on batch position.
Choosing the Right Batch Size
Batch size significantly impacts performance, reliability, and resource usage. There’s no universal “best” value.
Factors That Influence Batch Size
1. API Rate Limits
If your external API allows 100 requests per minute, calculate:
Batch Size × Batches Per Minute ≤ Rate Limit
With a 10-second wait between batches (6 batches/minute), you can safely use batch size 15.
2. Payload Size
Large JSON payloads or binary data consume more memory per item. Reduce batch size when processing:
- Images or PDFs
- Nested JSON structures with many fields
- Base64-encoded content
3. Processing Time
If each item takes 2 seconds to process, a batch of 50 items takes nearly 2 minutes. Consider whether you need faster feedback or can tolerate longer batch times.
4. Error Tolerance
Smaller batches isolate failures better. If a batch of 100 fails, you lose 100 items. A batch of 10 loses only 10.
Batch Size Recommendations
| Use Case | Recommended Size | Reason |
|---|---|---|
| API calls with rate limits | 10-25 | Stay under rate limits with Wait nodes |
| Database operations | 50-100 | Databases handle bulk operations well |
| Email sending | 5-10 | Email APIs have strict limits |
| File processing | 1-5 | Large memory footprint per item |
| Lightweight transformations | 100-200 | Minimal resource usage |
| External webhooks | 10-20 | Avoid overwhelming recipients |
Testing Methodology
- Start small: Begin with batch size 10
- Monitor memory: Watch n8n’s memory usage during execution
- Check rate limits: Verify no 429 errors occur
- Measure time: Record total execution time
- Increase gradually: Double batch size and repeat
- Find the ceiling: Stop when you see errors or degradation
For production workflows processing thousands of items, testing with realistic data volumes is essential before deployment.
Rate Limiting Patterns
Most APIs restrict request frequency. Batch processing alone doesn’t solve rate limiting. You need controlled delays between batches.
Adding Wait Nodes
The Wait node pauses execution for a specified duration. Insert it between your processing node and the loop return:
Loop Over Items → HTTP Request → Wait (10 seconds) → Loop Over Items
Wait Node Configuration:
Wait Time: 10
Unit: Seconds
Calculating Delay Times
Formula:
Delay = (60 Ă· Requests Per Minute Limit) Ă— Batch Size
Example: a conservative Gmail budget of 250 emails/day (≈10/hour; actual limits vary by account type)
Delay = (3600 Ă· 10) Ă— batch_size
Delay = 360 seconds Ă— batch_size
For batch size 5: 1,800 seconds (30 minutes) between batches.
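The same arithmetic as a quick sanity check in JavaScript before you wire in the Wait node (the numbers mirror the example above; swap in your own API's limit):
// Seconds to wait between batches so the hourly budget is respected
const requestsPerHour = 10;   // ~250 emails/day
const batchSize = 5;

const delaySeconds = (3600 / requestsPerHour) * batchSize;
console.log(delaySeconds);    // 1800 seconds = 30 minutes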
Complete Rate-Limited Workflow
{
"nodes": [
{
"parameters": {},
"name": "Manual Trigger",
"type": "n8n-nodes-base.manualTrigger",
"position": [250, 300]
},
{
"parameters": {
"batchSize": 5,
"options": {}
},
"name": "Loop Over Items",
"type": "n8n-nodes-base.splitInBatches",
"position": [450, 300]
},
{
"parameters": {
"resource": "message",
"operation": "send"
},
"name": "Gmail",
"type": "n8n-nodes-base.gmail",
"position": [650, 300]
},
{
"parameters": {
"amount": 60,
"unit": "seconds"
},
"name": "Wait",
"type": "n8n-nodes-base.wait",
"position": [850, 300]
}
],
"connections": {
"Manual Trigger": {
"main": [[{"node": "Loop Over Items", "type": "main", "index": 0}]]
},
"Loop Over Items": {
"main": [
null,
[{"node": "Gmail", "type": "main", "index": 0}]
]
},
"Gmail": {
"main": [[{"node": "Wait", "type": "main", "index": 0}]]
},
"Wait": {
"main": [[{"node": "Loop Over Items", "type": "main", "index": 0}]]
}
}
}
For more comprehensive rate limiting strategies including exponential backoff, see our API rate limits guide.
Handling 429 Responses
Even with delays, occasional rate limit errors occur. Configure the HTTP Request node to retry:
- Open Settings in the HTTP Request node
- Enable Retry On Fail
- Set Max Tries: 3
- Set Wait Between Tries: 5000ms
This handles transient rate limits without failing the entire workflow.
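In an exported workflow, those settings appear as node-level properties rather than inside parameters. Roughly, for the HTTP Request node from the earlier example (placeholder URL):
{
  "parameters": {
    "url": "https://api.example.com/process",
    "method": "POST"
  },
  "name": "HTTP Request",
  "type": "n8n-nodes-base.httpRequest",
  "retryOnFail": true,
  "maxTries": 3,
  "waitBetweenTries": 5000
}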
Memory Management Strategies
When processing tens of thousands of items, memory management becomes critical. n8n workflows that work with 1,000 items might crash with 50,000.
Increasing Node.js Heap Size
By default, Node.js limits heap memory. Increase it with the NODE_OPTIONS environment variable:
# Allow 4GB heap memory
NODE_OPTIONS=--max-old-space-size=4096
For Docker deployments:
environment:
- NODE_OPTIONS=--max-old-space-size=4096
Warning: This raises the ceiling but doesn’t solve the underlying problem. You’re buying time, not fixing architecture.
Reducing Data Between Nodes
Each node receives the full output of previous nodes. Strip unnecessary fields early:
// In a Code node, keep only needed fields
return items.map(item => ({
json: {
id: item.json.id,
email: item.json.email
// Omit large fields like 'fullPayload', 'attachments'
}
}));
Using Sub-Workflows for Memory Isolation
Sub-workflows execute in isolated contexts. When a sub-workflow completes, its memory is released.
Pattern:
Main Workflow: Trigger → Loop Over Items → Execute Workflow (sub) → back to Loop
Sub-Workflow: Execute Workflow Trigger → Heavy Processing → Return Summary
(You can also call the sub-workflow over HTTP with a Webhook / Respond to Webhook pair, as Example 3 below does.)
Each batch is processed in the sub-workflow's own execution context and released when it completes. The main workflow's memory stays roughly constant as long as the sub-workflow returns only a compact result.
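That memory benefit only materializes if the sub-workflow hands back a small result instead of echoing every item. A minimal sketch for the last Code node of the sub-workflow (field names are illustrative):
// Return a compact summary to the parent workflow
// instead of passing every processed item back up.
const processed = $input.all();

return [{
  json: {
    processedCount: processed.length,
    failedCount: processed.filter(i => i.json.error).length
  }
}];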
When to Use Queue Mode
For very large datasets (50,000+ items) or mission-critical processing, consider n8n queue mode. Queue mode:
- Distributes execution across multiple worker processes
- Handles failures without losing entire workflows
- Scales horizontally with infrastructure
Queue mode requires Redis and PostgreSQL but solves memory and reliability problems at scale.
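A minimal environment sketch for queue mode (hostnames and credentials are placeholders; see the n8n docs for the full list of variables):
# Shared by the main instance and every worker
EXECUTIONS_MODE=queue
QUEUE_BULL_REDIS_HOST=redis
DB_TYPE=postgresdb
DB_POSTGRESDB_HOST=postgres
DB_POSTGRESDB_DATABASE=n8n
DB_POSTGRESDB_USER=n8n
DB_POSTGRESDB_PASSWORD=change-me
Each worker is then started with the n8n worker command.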
Binary Data Considerations
Binary data (images, PDFs, files) consumes far more memory than JSON. When batch processing binary files:
- Use external storage: Store files in S3/MinIO, pass URLs instead of file content
- Process one at a time: Set batch size to 1 for heavy file operations
- Clean up immediately: Delete temporary files after processing
For queue mode deployments, configure S3 binary data storage so workers can share files:
N8N_DEFAULT_BINARY_DATA_MODE=s3
N8N_EXTERNAL_STORAGE_S3_BUCKET_NAME=your-bucket
State Management with workflowStaticData
Batch processing often requires tracking state across iterations: which items succeeded, cumulative counts, or data needed for the final batch.
What is workflowStaticData?
The workflowStaticData object is shared across nodes and loop iterations within a workflow execution, which makes it a natural place for batch counters and lookup maps. It's accessible in Code nodes via $getWorkflowStaticData(). (For production-triggered executions it is also persisted between runs; changes made during manual test executions are not saved.)
// Access static data
const staticData = $getWorkflowStaticData('global');
// Set a value
staticData.processedCount = (staticData.processedCount || 0) + items.length;
// Read the value later
console.log(staticData.processedCount);
Tracking Progress
Count processed items across all batches:
const staticData = $getWorkflowStaticData('global');
// Initialize on first batch
if (!staticData.totalProcessed) {
staticData.totalProcessed = 0;
staticData.failed = [];
staticData.startTime = new Date().toISOString();
}
// Update counts
staticData.totalProcessed += items.length;
return items;
In your final “done” branch, read the totals:
const staticData = $getWorkflowStaticData('global');
return [{
json: {
totalProcessed: staticData.totalProcessed,
failedItems: staticData.failed.length,
startTime: staticData.startTime,
endTime: new Date().toISOString()
}
}];
Distinguishing Creates vs Updates
When syncing data, you often need to create new records or update existing ones:
const staticData = $getWorkflowStaticData('global');

// Build the lookup map once, on the first batch.
// 'existingRecords' must come from an earlier node, e.g.:
// const existingRecords = $('Get Existing Records').all().map(i => i.json);
// (A Set lives in memory for this execution only; it is not persisted between runs.)
if (!staticData.existingIds) {
  staticData.existingIds = new Set(existingRecords.map(r => r.id));
}

// Categorize the current batch
const toCreate = [];
const toUpdate = [];

for (const item of items) {
  if (staticData.existingIds.has(item.json.id)) {
    toUpdate.push(item);
  } else {
    toCreate.push(item);
    staticData.existingIds.add(item.json.id);
  }
}

// Tag each item so a downstream IF or Switch node can route it
return [
  ...toCreate.map(i => ({ json: { ...i.json, action: 'create' } })),
  ...toUpdate.map(i => ({ json: { ...i.json, action: 'update' } })),
];
Manual Checkpoint Patterns
n8n doesn’t support automatic resume from failures. But you can implement manual checkpointing:
// After each successful batch, record the last processed ID
const staticData = $getWorkflowStaticData('global');
staticData.lastProcessedId = items[items.length - 1].json.id;
return items;

// In a separate Code node at the start of a new run, skip already-processed items:
// const startFromId = $getWorkflowStaticData('global').lastProcessedId || 0;
// return items.filter(item => item.json.id > startFromId);
This requires storing state externally (database, file) between workflow executions for true resume capability.
Advanced Patterns
Beyond basic batching, several patterns solve complex processing requirements.
Parallel Batch Processing
Process multiple batches simultaneously using sub-workflows:
Trigger → Split Data Into Chunks (Code) → Execute Workflow (parallel) → Aggregate Results
The Code node splits data into chunks:
const chunkSize = 1000;
const chunks = [];
for (let i = 0; i < items.length; i += chunkSize) {
chunks.push({
json: {
chunkIndex: Math.floor(i / chunkSize),
items: items.slice(i, i + chunkSize).map(item => item.json)
}
});
}
return chunks;
Each chunk executes as a separate sub-workflow. Set the Execute Workflow node to run once for each input item. With the default Wait For Sub-Workflow Completion setting, the chunks run one after another and the parent can aggregate their results; disable that option if you want the chunks to run concurrently, in which case the sub-workflows should write their results somewhere the parent (or a later workflow) can collect them.
Caution: Parallel processing multiplies API calls. Ensure your rate limits can handle concurrent requests from all chunks.
Error Isolation with Continue On Fail
Prevent one failed item from stopping the entire batch:
- On your processing node, enable Settings → Continue On Fail
- Add an IF node after to check for errors
- Route errors to a logging/retry path
Loop Over Items → HTTP Request (continue on fail) → IF (has error?)
↓ Yes: Log Error
↓ No: Continue
Check for errors:
{{ $json.error ? true : false }}
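On the "Yes: Log Error" path, you can also record failures in workflowStaticData so the final summary shown earlier has real numbers. A small sketch for a Code node on that branch (field names are illustrative):
const staticData = $getWorkflowStaticData('global');
staticData.failed = staticData.failed || [];

// Keep just enough to report or retry later
for (const item of items) {
  staticData.failed.push({
    id: item.json.id,
    error: item.json.error?.message ?? item.json.error
  });
}

return items;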
Progress Notifications
Send periodic updates during long-running batch jobs:
const staticData = $getWorkflowStaticData('global');
const batchIndex = $node["Loop Over Items"].context["currentRunIndex"];
// Notify every 10 batches
if (batchIndex % 10 === 0 && batchIndex > 0) {
// This item triggers notification
return [{
json: {
notify: true,
progress: `Processed ${staticData.totalProcessed} items`
}
}];
}
return []; // No notification for other batches
Route notification items to Slack, email, or webhook nodes.
Combining with Aggregate Node
After batch processing, the Aggregate node combines results from all iterations:
Loop Over Items → Process → Loop → (Done) → Aggregate → Final Output
Aggregate settings:
- Fields to Aggregate: Select fields to combine
- Aggregate: Choose “All Item Data” or specific fields
- Output: Single item containing all results
This is useful for generating summary reports after processing.
Troubleshooting Common Issues
Batch processing introduces unique failure modes. Here’s how to diagnose and fix them.
Problem-Solution Reference
| Problem | Likely Cause | Solution |
|---|---|---|
| ”Heap out of memory” error | Dataset too large for available RAM | Reduce batch size, increase NODE_OPTIONS, use sub-workflows |
| Loop freezes on large datasets | Internal serialization bottleneck (roughly 40,000+ items) | Split data before Loop Over Items, use pagination |
| Batches stop before completion | Missing loop connection | Ensure processing node connects back to Loop Over Items |
| 429 rate limit errors | API throttling | Add Wait node, increase delay, reduce batch size |
| Infinite loop crash | Loop connected incorrectly | Check connections, ensure “done” output isn’t looped |
| Inconsistent batch sizes | Dynamic data during execution | Pre-fetch all data before Loop Over Items |
| Memory grows with each batch | Items accumulating in workflow | Use sub-workflows, strip unnecessary fields |
Debugging Large Dataset Freezes
If Loop Over Items freezes with large datasets (40,000+ items), the node’s internal serialization may be the bottleneck. Workarounds:
1. Pre-split in Code Node
const batchSize = 100;
const batches = [];
for (let i = 0; i < items.length; i += batchSize) {
batches.push({
json: {
batchIndex: Math.floor(i / batchSize),
items: items.slice(i, i + batchSize)
}
});
}
return batches;
Then process each batch object instead of using Loop Over Items.
2. Paginated Data Fetching
Instead of loading all items, fetch pages from your data source and process each page as a batch. This keeps memory constant regardless of total dataset size.
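One way to drive this inside a loop is to keep the current page number in workflowStaticData and let an HTTP Request node read it. A sketch for a Code node placed before the request (the page and pageSize fields are illustrative; adapt them to your API's pagination scheme):
const staticData = $getWorkflowStaticData('global');
staticData.page = (staticData.page || 0) + 1;

return [{
  json: {
    page: staticData.page,
    pageSize: 500
  }
}];
The HTTP Request node can then use {{ $json.page }} as a query parameter, and an IF node stops the loop when a page comes back empty.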
Monitoring Batch Progress
Without built-in metrics, add manual logging:
const batchIndex = $node["Loop Over Items"].context["currentRunIndex"];
const itemCount = items.length;
console.log(`Processing batch ${batchIndex}, ${itemCount} items`);
// Timestamps for duration tracking
if (batchIndex === 0) {
$getWorkflowStaticData('global').startTime = Date.now();
}
return items;
Check logs via your n8n hosting platform or Docker logs.
For persistent debugging assistance, try our free workflow debugger tool.
Real-World Examples
These examples demonstrate batch processing patterns for common scenarios.
Example 1: CRM Contact Sync (10,000 Contacts)
Scenario: Sync contacts from a CSV export to HubSpot.
Configuration:
- Batch size: 50 (HubSpot allows 100/10sec, we stay conservative)
- Wait time: 2 seconds between batches
- Error handling: Continue on fail, log errors
Workflow Structure:
Read CSV → Loop Over Items (50) → HubSpot Create/Update → Wait (2s) → Loop
↓ (on error)
Log to Google Sheets
Key Code (Pre-processing):
// Normalize CSV data for HubSpot format
return items.map(item => ({
json: {
email: item.json.Email?.toLowerCase().trim(),
firstname: item.json['First Name'],
lastname: item.json['Last Name'],
company: item.json.Company
}
})).filter(item => item.json.email); // Skip rows without email
Example 2: Email Campaign with Gmail
Scenario: Send personalized emails to 500 recipients without hitting Gmail limits.
Configuration:
- Batch size: 5
- Wait time: 60 seconds (Gmail daily limit: ~500 for regular accounts)
- Estimated runtime: ~100 minutes for 500 emails
Rate Limit Math:
500 emails Ă· 5 per batch = 100 batches
100 batches Ă— 60 seconds = 6,000 seconds = ~100 minutes
Workflow Structure:
Get Recipients → Loop Over Items (5) → Gmail Send → Wait (60s) → Loop
↓ (done)
Send Summary Email
Example 3: Large CSV Processing
Scenario: Process a 100,000-row CSV file, transform data, and insert into PostgreSQL.
Configuration:
- Pre-split into 1,000-row chunks
- Process each chunk via sub-workflow
- Sub-workflow batch size: 100 for database inserts
Main Workflow:
// Split large CSV into manageable chunks
const chunkSize = 1000;
const allData = items;
const chunks = [];
for (let i = 0; i < allData.length; i += chunkSize) {
chunks.push({
json: {
chunkId: Math.floor(i / chunkSize),
data: allData.slice(i, i + chunkSize)
}
});
}
return chunks;
Sub-Workflow (called once per chunk, for example by an HTTP Request node posting the chunk to its Webhook):
Webhook (receives chunk) → Loop Over Items (100) → Postgres Insert → Loop → Respond to Webhook
This architecture keeps memory bounded regardless of CSV size and provides natural checkpoints at chunk boundaries.
Frequently Asked Questions
What’s the difference between Loop Over Items and letting nodes process arrays automatically?
Most n8n nodes iterate over all incoming items automatically. If you send 100 items to an HTTP Request node, it makes 100 requests. The difference is control.
Loop Over Items gives you:
- Controlled batch sizes (process 10 at a time instead of all 100)
- Ability to add delays between batches
- Progress tracking across iterations
- Memory management by limiting concurrent processing
Use automatic processing for small datasets or when order and timing don’t matter. Use Loop Over Items when you need control over how and when items are processed.
How do I prevent memory errors when processing 50,000+ items?
Multiple strategies work together:
- Never load all items at once. Paginate your data source and process pages sequentially.
- Use small batch sizes (10-50 items) to limit active memory.
- Strip unnecessary data early in the workflow.
- Use sub-workflows to isolate memory for heavy processing.
- Increase heap size with NODE_OPTIONS=--max-old-space-size=4096 as a stopgap.
- Consider queue mode for reliable processing at scale.
The architectural answer is: don’t hold 50,000 items in memory. Fetch, process, and discard in batches.
Can I resume a batch workflow from where it failed?
n8n doesn’t natively support resume from failure. If a workflow fails mid-batch, it restarts from the beginning.
Workarounds:
- Track progress externally: Store the last processed ID in a database or file. On restart, query from that ID.
- Idempotent operations: Design operations so reprocessing doesn’t create duplicates (use upserts, check existence first).
- Checkpoint to database: After each batch, record progress. On failure, query the checkpoint and skip completed items.
For mission-critical workflows, queue mode provides better failure recovery.
What batch size should I use for API calls with rate limits?
Calculate based on the API’s stated limits:
Safe Batch Size = (Rate Limit Ă· Batches Per Minute) Ă— Safety Factor
Example: API allows 60 requests per minute. With 10-second waits between batches (6 batches per minute):
Batch Size = (60 Ă· 6) Ă— 0.8 = 8 items per batch
The 0.8 safety factor accounts for timing variations. Start conservative and increase if stable.
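The same formula as a few lines of JavaScript, using the numbers from the example above:
const rateLimitPerMinute = 60;
const waitSecondsBetweenBatches = 10;
const safetyFactor = 0.8;

const batchesPerMinute = 60 / waitSecondsBetweenBatches;  // 6
const safeBatchSize = Math.floor(
  (rateLimitPerMinute / batchesPerMinute) * safetyFactor
);
console.log(safeBatchSize);  // 8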
Common API limits and recommended batch sizes:
| API | Typical Limit | Suggested Batch | Wait Time |
|---|---|---|---|
| Gmail | 250/day | 5 | 60s+ |
| HubSpot | 100/10sec | 50 | 5s |
| Stripe | 100/sec | 50 | 1s |
| Slack | 1/sec (certain APIs) | 1 | 1s |
Why does my Loop Over Items node freeze with large datasets?
The Loop Over Items node serializes all items internally before processing. With datasets exceeding 40,000-50,000 items, this serialization becomes a bottleneck, causing the node to appear frozen.
Solutions:
- Paginate at the source: Fetch data in pages of 5,000 items. Process each page as a separate workflow execution or sub-workflow.
- Pre-chunk with Code node: Split items into smaller arrays before Loop Over Items sees them.
- Use Execute Workflow for chunks: Split data into chunks, then call a sub-workflow for each chunk. The sub-workflow uses its own Loop Over Items with a manageable dataset.
This is a known limitation with very large datasets. The workaround is architectural: don’t pass 50,000 items to a single Loop Over Items node.
Next Steps
Batch processing transforms n8n from a tool for simple automations into a platform capable of handling production workloads. Start with small batches, add appropriate delays, and monitor your workflows under real conditions.
For complex data processing requirements, our workflow development services can architect solutions tailored to your specific needs. If you’re hitting scaling limits, our n8n consulting helps optimize existing workflows for performance.
Check out our n8n best practices guide for additional patterns that complement batch processing.